From patchwork Fri Sep 16 01:06:59 2022
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 1246
From: liuhongt
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] [x86]Don't optimize cmp mem, 0 to load mem, reg + test reg, reg
Date: Fri, 16 Sep 2022 09:06:59 +0800
Message-Id: <20220916010659.37555-1-hongtao.liu@intel.com>

There is a peephole2, added in the 1990s, that splits cmp mem, 0 into
load mem, reg + test reg, reg. I don't know the exact reason why GCC
does this. On recent x86 processors, keeping the CISC form (cmp mem, 0)
should help the processor front end and code size; for the back end the
two forms should be equivalent (they have the same uops).
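For illustration, here is a minimal C sketch of the kind of source that leads to a compare of memory with zero. The function names is_zero and is_zero_selfcheck are made up for this example and are not part of the patch, and the instruction sequences in the comment are only what one might expect on x86-64; the actual output depends on the compiler version, options and surrounding code.

```c
#include <assert.h>

/* Hypothetical example (not from the patch): returns 1 when the int
   stored at *p is zero.  The comparison is a "compare memory with
   zero" candidate.  */
int
is_zero (const int *p)
{
  /* Folded form, which the patch wants to keep (AT&T syntax, assuming
     p arrives in %rdi):
         cmpl  $0, (%rdi)
     Split form that the removed peephole2 would produce:
         movl  (%rdi), %eax
         testl %eax, %eax
     Both are only illustrative; real output may differ.  */
  return *p == 0;
}

/* Small self-check so the sketch can be compiled and exercised.  */
void
is_zero_selfcheck (void)
{
  int a = 0, b = 42;
  assert (is_zero (&a) == 1);
  assert (is_zero (&b) == 0);
}
```

The folded form is one instruction and needs no scratch register, which is where the code size and front-end benefit mentioned above would come from.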
So this patch deletes the peephole2 and also modifies another splitter
to generate more cmp mem, 0 on 32-bit targets. This should help
instruction fetch.

minmax-1.c, minmax-2.c, minmax-10.c and pr96861.c are meant to check
that there is no comparison with 1 or -1, so the testcases are adjusted
accordingly: on 32-bit targets we now generate cmp mem, 0 instead of
load + test. Similarly for pr78035.c.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
No performance impact for SPEC2017 on ICX/Znver3.

Ok for trunk?

gcc/ChangeLog:

	* config/i386/i386.md (*3_1): Replace register_operand with
	nonimmediate_operand for operand 1.  Also force_reg it when
	mode is QImode.
	(define_peephole2): Delete related peephole2.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/minmax-1.c: Scan-assembler-not for cmp with 1
	or -1, also don't scan-assembler test for ia32.
	* gcc.target/i386/minmax-10.c: Ditto.
	* gcc.target/i386/minmax-2.c: Ditto.
	* gcc.target/i386/pr96861.c: Ditto.
	* gcc.target/i386/pr78035.c: Scan either cmp or test 3 times.
---
 gcc/config/i386/i386.md                   | 18 +++++-------------
 gcc/testsuite/gcc.target/i386/minmax-1.c  |  4 ++--
 gcc/testsuite/gcc.target/i386/minmax-10.c |  4 ++--
 gcc/testsuite/gcc.target/i386/minmax-2.c  |  4 ++--
 gcc/testsuite/gcc.target/i386/pr78035.c   |  2 +-
 gcc/testsuite/gcc.target/i386/pr96861.c   |  4 ++--
 6 files changed, 14 insertions(+), 22 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 1be9b669909..93b905beb72 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -21871,7 +21871,7 @@ (define_insn_and_split "*3_doubleword"
 (define_insn_and_split "*3_1"
   [(set (match_operand:SWI 0 "register_operand")
 	(maxmin:SWI
-	  (match_operand:SWI 1 "register_operand")
+	  (match_operand:SWI 1 "nonimmediate_operand")
 	  (match_operand:SWI 2 "general_operand")))
    (clobber (reg:CC FLAGS_REG))]
   "TARGET_CMOVE
@@ -21886,9 +21886,12 @@ (define_insn_and_split "*3_1"
 {
   machine_mode mode = mode;
   rtx cmp_op = operands[2];
-
   operands[2] = force_reg (mode, cmp_op);
 
+  /* movqicc_noc only support register_operand for op1.  */
+  if (mode == QImode)
+    operands[1] = force_reg (mode, operands[1]);
+
   enum rtx_code code = ;
 
   if (cmp_op == const1_rtx)
@@ -22482,17 +22485,6 @@ (define_peephole2
   [(set (match_dup 2) (match_dup 1))
    (set (match_dup 0) (match_dup 2))])
 
-;; Don't compare memory with zero, load and use a test instead.
-(define_peephole2
-  [(set (match_operand 0 "flags_reg_operand")
-	(match_operator 1 "compare_operator"
-	  [(match_operand:SI 2 "memory_operand")
-	   (const_int 0)]))
-   (match_scratch:SI 3 "r")]
-  "optimize_insn_for_speed_p () && ix86_match_ccmode (insn, CCNOmode)"
-  [(set (match_dup 3) (match_dup 2))
-   (set (match_dup 0) (match_op_dup 1 [(match_dup 3) (const_int 0)]))])
-
 ;; NOT is not pairable on Pentium, while XOR is, but one byte longer.
 ;; Don't split NOTs with a displacement operand, because resulting XOR
 ;; will not be pairable anyway.
diff --git a/gcc/testsuite/gcc.target/i386/minmax-1.c b/gcc/testsuite/gcc.target/i386/minmax-1.c
index 0ec35b1c5a1..840b32c5414 100644
--- a/gcc/testsuite/gcc.target/i386/minmax-1.c
+++ b/gcc/testsuite/gcc.target/i386/minmax-1.c
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -march=opteron -mno-stv" } */
-/* { dg-final { scan-assembler "test" } } */
-/* { dg-final { scan-assembler-not "cmp" } } */
+/* { dg-final { scan-assembler "test" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-not {(?n)cmp.*[$]+1} } } */
 #define max(a,b) (((a) > (b))? (a) : (b))
 int
 t(int a)
diff --git a/gcc/testsuite/gcc.target/i386/minmax-10.c b/gcc/testsuite/gcc.target/i386/minmax-10.c
index b044462c5a9..1dd2eedf435 100644
--- a/gcc/testsuite/gcc.target/i386/minmax-10.c
+++ b/gcc/testsuite/gcc.target/i386/minmax-10.c
@@ -34,5 +34,5 @@ unsigned int umin1(unsigned int x)
   return min(x,1);
 }
 
-/* { dg-final { scan-assembler-times "test" 6 } } */
-/* { dg-final { scan-assembler-not "cmp" } } */
+/* { dg-final { scan-assembler-times "test" 6 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-not {(?n)cmp.*1} } } */
diff --git a/gcc/testsuite/gcc.target/i386/minmax-2.c b/gcc/testsuite/gcc.target/i386/minmax-2.c
index af9baeaaf7c..2c82f6cecb9 100644
--- a/gcc/testsuite/gcc.target/i386/minmax-2.c
+++ b/gcc/testsuite/gcc.target/i386/minmax-2.c
@@ -1,7 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -mno-stv" } */
-/* { dg-final { scan-assembler "test" } } */
-/* { dg-final { scan-assembler-not "cmp" } } */
+/* { dg-final { scan-assembler "test" { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-not {(?n)cmp.*[$]1} } } */
 #define max(a,b) (((a) > (b))? (a) : (b))
 unsigned int
 t(unsigned int a)
diff --git a/gcc/testsuite/gcc.target/i386/pr78035.c b/gcc/testsuite/gcc.target/i386/pr78035.c
index 7d3a983b218..d543d3f1d38 100644
--- a/gcc/testsuite/gcc.target/i386/pr78035.c
+++ b/gcc/testsuite/gcc.target/i386/pr78035.c
@@ -22,4 +22,4 @@ int bar ()
 }
 
 /* We should not optimize away either comparison.  */
-/* { dg-final { scan-assembler-times "cmp" 2 } } */
+/* { dg-final { scan-assembler-times "(?:cmp|test)" 3 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr96861.c b/gcc/testsuite/gcc.target/i386/pr96861.c
index 7b7aeccb83c..8c0f0841f7d 100644
--- a/gcc/testsuite/gcc.target/i386/pr96861.c
+++ b/gcc/testsuite/gcc.target/i386/pr96861.c
@@ -34,5 +34,5 @@ unsigned int umin1(unsigned int x)
   return min(x,1);
 }
 
-/* { dg-final { scan-assembler-times "test" 6 } } */
-/* { dg-final { scan-assembler-not "cmp" } } */
+/* { dg-final { scan-assembler-times "test" 6 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-not {(?n)cmp.*[$]+1} } } */
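
As background for the adjusted scan-assembler-not patterns: the minmax testcases check that a max/min against 1 or -1 is not compiled into a comparison with 1 or -1. The C sketch below shows why a comparison with zero is enough in those cases. It only demonstrates the arithmetic equivalence; the helper names (smax1_zero, smin_m1_zero, umax1_zero, umin1_zero, check_minmax_rewrites) are made up for this example and do not appear in the patch or the testsuite, and it makes no claim about how the splitter is implemented.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define max(a,b) (((a) > (b)) ? (a) : (b))
#define min(a,b) (((a) < (b)) ? (a) : (b))

/* Hypothetical helpers: each computes the same value as the max/min
   named in its comment, but compares against 0 instead of 1 or -1.  */
int smax1_zero (int x) { return x > 0 ? x : 1; }              /* max (x, 1)   */
int smin_m1_zero (int x) { return x < 0 ? x : -1; }           /* min (x, -1)  */
unsigned umax1_zero (unsigned x) { return x != 0 ? x : 1u; }  /* max (x, 1u)  */
unsigned umin1_zero (unsigned x) { return x != 0 ? 1u : 0u; } /* min (x, 1u)  */

/* Check the rewrites against the plain macros on boundary values.  */
void
check_minmax_rewrites (void)
{
  static const int s[] = { INT_MIN, -2, -1, 0, 1, 2, INT_MAX };
  static const unsigned u[] = { 0u, 1u, 2u, UINT_MAX };

  for (size_t i = 0; i < sizeof s / sizeof s[0]; i++)
    {
      assert (max (s[i], 1) == smax1_zero (s[i]));
      assert (min (s[i], -1) == smin_m1_zero (s[i]));
    }
  for (size_t j = 0; j < sizeof u / sizeof u[0]; j++)
    {
      assert (max (u[j], 1u) == umax1_zero (u[j]));
      assert (min (u[j], 1u) == umin1_zero (u[j]));
    }
}
```

Since the comparison is against zero, it can be emitted either as test reg, reg or, after this patch on 32-bit targets, as cmp mem, 0. That is why the testcases now only reject a cmp against $1 and no longer require test on ia32.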