From patchwork Thu Jul 20 18:57:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 123431
Date: Thu, 20 Jul 2023 20:57:53 +0200
Subject: [committed] i386: Double-word sign-extension missed-optimization [PR110717]
To: "gcc-patches@gcc.gnu.org"
From: Uros Bizjak

When sign-extending a value held in a double-word register pair using
an ASHIFT and ASHIFTRT sequence with the same immediate count, where the
count is less than the word width, there is no need to shift the lower
word of the value.  The sign-extension could be limited to the upper
word, but we uselessly shift the lower word along with it:

    movq    %rdi, %rax
    movq    %rsi, %rdx
    shldq   $59, %rdi, %rdx
    salq    $59, %rax
    shrdq   $59, %rdx, %rax
    sarq    $59, %rdx
    ret

for -m64 and

    movl    4(%esp), %eax
    movl    8(%esp), %edx
    shldl   $27, %eax, %edx
    sall    $27, %eax
    shrdl   $27, %edx, %eax
    sarl    $27, %edx
    ret

for -m32.

The patch introduces a new post-reload splitter to provide the combined
ASHIFTRT/ASHIFT instruction pattern.  The instruction is split to a
sequence of SAL and SAR insns with the same count immediate operand:

    movq    %rsi, %rdx
    movq    %rdi, %rax
    salq    $59, %rdx
    sarq    $59, %rdx
    ret

Some extra complexity is required to handle the STV transform properly,
where we emit a sequence with DImode PSLLQ and PSRAQ insns for 32-bit
AVX512VL targets when profitable.

The patch also fixes a small oversight and enables the STV transform of
SImode ASHIFTRT to PSRAD for SSE2 targets as well.

    PR target/110717

gcc/ChangeLog:

    * config/i386/i386-features.cc
    (general_scalar_chain::compute_convert_gain): Calculate gain
    for extend highpart case.
    (general_scalar_chain::convert_op): Handle
    ASHIFTRT/ASHIFT combined RTX.
    (general_scalar_to_vector_candidate_p): Enable ASHIFTRT for
    SImode for SSE2 targets.  Handle ASHIFTRT/ASHIFT combined RTX.
    * config/i386/i386.md (*extend<dwi>2_doubleword_highpart):
    New define_insn_and_split pattern.
    (*extendv2di2_highpart_stv): Ditto.

gcc/testsuite/ChangeLog:

    * gcc.target/i386/pr110717.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.
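The claim that only the upper word needs shifting can be checked with a
small model.  The sketch below is not part of the patch; it treats a
64-bit value as a pair of 32-bit words (the -m32 case) and compares a
full-width sign-extension against one applied to the high word only.
The helper names (sext32, sext64, sext64_highpart_only,
check_equivalence) are made up for illustration, and the arithmetic
right shift is emulated with unsigned operations so that the model does
not depend on implementation-defined behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Emulate (int32_t) (x << n) >> n with an arithmetic right shift,
   for 0 <= n < 32, using only unsigned operations.  */
static uint32_t
sext32 (uint32_t x, unsigned n)
{
  uint32_t shifted = x << n;      /* Drop the top n bits.  */
  uint32_t res = shifted >> n;    /* Logical shift back.  */
  if (n != 0 && (shifted >> 31))  /* Replicate the new sign bit.  */
    res |= ~UINT32_C (0) << (32 - n);
  return res;
}

/* Same operation on the full double word, for 0 <= n < 64.  */
static uint64_t
sext64 (uint64_t x, unsigned n)
{
  uint64_t shifted = x << n;
  uint64_t res = shifted >> n;
  if (n != 0 && (shifted >> 63))
    res |= ~UINT64_C (0) << (64 - n);
  return res;
}

/* Sign-extend by touching the high word only; the low word is copied
   unchanged.  Valid only for n < 32, i.e. less than the word width.  */
static uint64_t
sext64_highpart_only (uint64_t x, unsigned n)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = sext32 ((uint32_t) (x >> 32), n);
  return ((uint64_t) hi << 32) | lo;
}

/* Compare both variants over a few sample values and all counts
   below the word width.  */
static void
check_equivalence (void)
{
  static const uint64_t samples[] = {
    UINT64_C (0), UINT64_C (1), UINT64_C (0x0000001000000000),
    UINT64_C (0x00000000DEADBEEF), UINT64_C (0x123456789ABCDEF0),
    UINT64_C (0xFFFFFFFFFFFFFFFF), UINT64_C (0x8000000080000000)
  };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    for (unsigned n = 0; n < 32; n++)
      assert (sext64 (samples[i], n)
              == sext64_highpart_only (samples[i], n));
}
```

Calling check_equivalence () from a test driver exercises every count
from 0 to 31.  For counts of 32 or more the low word does take part in
the result, which is why the new pattern is restricted to counts below
the word width.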
diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index 4d69251d4f5..f801a8fc94a 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -572,6 +572,9 @@ general_scalar_chain::compute_convert_gain ()
               {
                 if (INTVAL (XEXP (src, 1)) >= 32)
                   igain += ix86_cost->add;
+                /* Gain for extend highpart case. */
+                else if (GET_CODE (XEXP (src, 0)) == ASHIFT)
+                  igain += ix86_cost->shift_const - ix86_cost->sse_op;
                 else
                   igain += ix86_cost->shift_const;
               }
@@ -951,7 +954,8 @@ general_scalar_chain::convert_op (rtx *op, rtx_insn *insn)
 {
   *op = copy_rtx_if_shared (*op);
 
-  if (GET_CODE (*op) == NOT)
+  if (GET_CODE (*op) == NOT
+      || GET_CODE (*op) == ASHIFT)
     {
       convert_op (&XEXP (*op, 0), insn);
       PUT_MODE (*op, vmode);
@@ -2120,7 +2124,7 @@ general_scalar_to_vector_candidate_p (rtx_insn *insn, enum machine_mode mode)
   switch (GET_CODE (src))
     {
     case ASHIFTRT:
-      if (!TARGET_AVX512VL)
+      if (mode == DImode && !TARGET_AVX512VL)
         return false;
       /* FALLTHRU */
 
@@ -2131,6 +2135,14 @@ general_scalar_to_vector_candidate_p (rtx_insn *insn, enum machine_mode mode)
       if (!CONST_INT_P (XEXP (src, 1))
           || !IN_RANGE (INTVAL (XEXP (src, 1)), 0, GET_MODE_BITSIZE (mode)-1))
         return false;
+
+      /* Check for extend highpart case. */
+      if (mode != DImode
+          || GET_CODE (src) != ASHIFTRT
+          || GET_CODE (XEXP (src, 0)) != ASHIFT)
+        break;
+
+      src = XEXP (src, 0);
       break;
 
     case SMAX:
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 8c54aa5e981..4db210cc795 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -15292,6 +15292,41 @@ (define_insn "*qi_ext_2"
         (const_string "0")
         (const_string "*")))
    (set_attr "mode" "QI")])
+
+(define_insn_and_split "*extend<dwi>2_doubleword_highpart"
+  [(set (match_operand:<DWI> 0 "register_operand" "=r")
+        (ashiftrt:<DWI>
+          (ashift:<DWI> (match_operand:<DWI> 1 "nonimmediate_operand" "0")
+                        (match_operand:QI 2 "const_int_operand"))
+          (match_operand:QI 3 "const_int_operand")))
+   (clobber (reg:CC FLAGS_REG))]
+  "INTVAL (operands[2]) == INTVAL (operands[3])
+   && UINTVAL (operands[2]) < <MODE_SIZE> * BITS_PER_UNIT"
+  "#"
+  "&& reload_completed"
+  [(parallel [(set (match_dup 4)
+                   (ashift:DWIH (match_dup 4) (match_dup 2)))
+              (clobber (reg:CC FLAGS_REG))])
+   (parallel [(set (match_dup 4)
+                   (ashiftrt:DWIH (match_dup 4) (match_dup 2)))
+              (clobber (reg:CC FLAGS_REG))])]
+  "split_double_mode (<DWI>mode, &operands[0], 1, &operands[0], &operands[4]);")
+
+(define_insn_and_split "*extendv2di2_highpart_stv"
+  [(set (match_operand:V2DI 0 "register_operand" "=v")
+        (ashiftrt:V2DI
+          (ashift:V2DI (match_operand:V2DI 1 "nonimmediate_operand" "vm")
+                       (match_operand:QI 2 "const_int_operand"))
+          (match_operand:QI 3 "const_int_operand")))]
+  "!TARGET_64BIT && TARGET_STV && TARGET_AVX512VL
+   && INTVAL (operands[2]) == INTVAL (operands[3])
+   && UINTVAL (operands[2]) < 32"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+        (ashift:V2DI (match_dup 1) (match_dup 2)))
+   (set (match_dup 0)
+        (ashiftrt:V2DI (match_dup 0) (match_dup 2)))])
 
 ;; Rotate instructions
 
diff --git a/gcc/testsuite/gcc.target/i386/pr110717.c b/gcc/testsuite/gcc.target/i386/pr110717.c
new file mode 100644
index 00000000000..233f0eae5b5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr110717.c
@@ -0,0 +1,21 @@
+/* PR target/110717 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#ifdef __SIZEOF_INT128__
+unsigned __int128
+foo (unsigned __int128 x)
+{
+  x <<= 59;
+  return ((__int128) x) >> 59;
+}
+#else
+unsigned long long
+foo (unsigned long long x)
+{
+  x <<= 27;
+  return ((long long) x) >> 27;
+}
+#endif
+
+/* { dg-final { scan-assembler-not "sh\[lr\]d" } } */