From patchwork Tue Jul 11 09:13:49 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 118337
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com
Subject: [PATCH] Add peephole to eliminate redundant comparison after cmpccxadd.
Date: Tue, 11 Jul 2023 17:13:49 +0800
Message-Id: <20230711091349.3376586-1-hongtao.liu@intel.com>
From: liuhongt

Similar to what we did for CMPXCHG, but extended to all
ix86_comparison_int_operator codes, since CMPCCXADD sets EFLAGS
exactly the same way as CMP.

When the operand order in the CMP insn is the same as that in
CMPCCXADD, the CMP insn can be eliminated directly.

When the operand order is swapped in the CMP insn, only optimize
cmpccxadd + cmpl + jcc/setcc to cmpccxadd + jcc/setcc when FLAGS_REG
is dead after the jcc/setcc, and adjust the condition code of the
jcc/setcc accordingly.

gcc/ChangeLog:

        PR target/110591
        * config/i386/sync.md (cmpccxadd_): Adjust the pattern to
        explicitly set FLAGS_REG like *cmp_1.  Also add 3 extra
        define_peephole2 patterns after the pattern.

gcc/testsuite/ChangeLog:

        * gcc.target/i386/pr110591.c: New test.
        * gcc.target/i386/pr110591-2.c: New test.
---
 gcc/config/i386/sync.md                    | 160 ++++++++++++++++++++-
 gcc/testsuite/gcc.target/i386/pr110591-2.c |  90 ++++++++++++
 gcc/testsuite/gcc.target/i386/pr110591.c   |  66 +++++++++
 3 files changed, 315 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr110591-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr110591.c

diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
index e1fa1504deb..e84226cf895 100644
--- a/gcc/config/i386/sync.md
+++ b/gcc/config/i386/sync.md
@@ -1093,7 +1093,9 @@ (define_insn "cmpccxadd_"
            UNSPECV_CMPCCXADD))
    (set (match_dup 1)
         (unspec_volatile:SWI48x [(const_int 0)] UNSPECV_CMPCCXADD))
-   (clobber (reg:CC FLAGS_REG))]
+   (set (reg:CC FLAGS_REG)
+        (compare:CC (match_dup 1)
+                    (match_dup 2)))]
   "TARGET_CMPCCXADD && TARGET_64BIT"
 {
   char buf[128];
@@ -1105,3 +1107,159 @@ (define_insn "cmpccxadd_"
   output_asm_insn (buf, operands);
   return "";
 })
+
+(define_peephole2
+  [(set (match_operand:SWI48x 0 "register_operand")
+        (match_operand:SWI48x 1 "x86_64_general_operand"))
+   (parallel [(set (match_dup 0)
+                   (unspec_volatile:SWI48x
+                     [(match_operand:SWI48x 2 "memory_operand")
+                      (match_dup 0)
+                      (match_operand:SWI48x 3 "register_operand")
+                      (match_operand:SI 4 "const_int_operand")]
+                     UNSPECV_CMPCCXADD))
+              (set (match_dup 2)
+                   (unspec_volatile:SWI48x [(const_int 0)] UNSPECV_CMPCCXADD))
+              (set (reg:CC FLAGS_REG)
+                   (compare:CC (match_dup 2)
+                               (match_dup 0)))])
+   (set (reg FLAGS_REG)
+        (compare (match_operand:SWI48x 5 "register_operand")
+                 (match_operand:SWI48x 6 "x86_64_general_operand")))]
+  "TARGET_CMPCCXADD && TARGET_64BIT
+   && rtx_equal_p (operands[0], operands[5])
+   && rtx_equal_p (operands[1], operands[6])"
+  [(set (match_dup 0)
+        (match_dup 1))
+   (parallel [(set (match_dup 0)
+                   (unspec_volatile:SWI48x
+                     [(match_dup 2)
+                      (match_dup 0)
+                      (match_dup 3)
+                      (match_dup 4)]
+                     UNSPECV_CMPCCXADD))
+              (set (match_dup 2)
+                   (unspec_volatile:SWI48x [(const_int 0)] UNSPECV_CMPCCXADD))
+              (set (reg:CC FLAGS_REG)
+                   (compare:CC (match_dup 2)
+                               (match_dup 0)))])
+   (set (match_dup 7)
+        (match_op_dup 8
+          [(match_dup 9) (const_int 0)]))])
+
+(define_peephole2
+  [(set (match_operand:SWI48x 0 "register_operand")
+        (match_operand:SWI48x 1 "x86_64_general_operand"))
+   (parallel [(set (match_dup 0)
+                   (unspec_volatile:SWI48x
+                     [(match_operand:SWI48x 2 "memory_operand")
+                      (match_dup 0)
+                      (match_operand:SWI48x 3 "register_operand")
+                      (match_operand:SI 4 "const_int_operand")]
+                     UNSPECV_CMPCCXADD))
+              (set (match_dup 2)
+                   (unspec_volatile:SWI48x [(const_int 0)] UNSPECV_CMPCCXADD))
+              (set (reg:CC FLAGS_REG)
+                   (compare:CC (match_dup 2)
+                               (match_dup 0)))])
+   (set (reg FLAGS_REG)
+        (compare (match_operand:SWI48x 5 "register_operand")
+                 (match_operand:SWI48x 6 "x86_64_general_operand")))
+   (set (match_operand:QI 7 "nonimmediate_operand")
+        (match_operator:QI 8 "ix86_comparison_int_operator"
+          [(reg FLAGS_REG) (const_int 0)]))]
+  "TARGET_CMPCCXADD && TARGET_64BIT
+   && rtx_equal_p (operands[0], operands[6])
+   && rtx_equal_p (operands[1], operands[5])
+   && peep2_regno_dead_p (4, FLAGS_REG)"
+  [(set (match_dup 0)
+        (match_dup 1))
+   (parallel [(set (match_dup 0)
+                   (unspec_volatile:SWI48x
+                     [(match_dup 2)
+                      (match_dup 0)
+                      (match_dup 3)
+                      (match_dup 4)]
+                     UNSPECV_CMPCCXADD))
+              (set (match_dup 2)
+                   (unspec_volatile:SWI48x [(const_int 0)] UNSPECV_CMPCCXADD))
+              (set (reg:CC FLAGS_REG)
+                   (compare:CC (match_dup 2)
+                               (match_dup 0)))])
+   (set (match_dup 7)
+        (match_op_dup 8
+          [(match_dup 9) (const_int 0)]))]
+{
+  operands[9] = gen_rtx_REG (GET_MODE (XEXP (operands[8], 0)), FLAGS_REG);
+  if (swap_condition (GET_CODE (operands[8])) != GET_CODE (operands[8]))
+    {
+      operands[8] = shallow_copy_rtx (operands[8]);
+      enum rtx_code ccode = swap_condition (GET_CODE (operands[8]));
+      PUT_CODE (operands[8], ccode);
+      operands[9] = gen_rtx_REG (SELECT_CC_MODE (ccode,
+                                                 operands[6],
+                                                 operands[5]),
+                                 FLAGS_REG);
+    }
+})
+
+(define_peephole2
+  [(set (match_operand:SWI48x 0 "register_operand")
+        (match_operand:SWI48x 1 "x86_64_general_operand"))
+   (parallel [(set (match_dup 0)
+                   (unspec_volatile:SWI48x
+                     [(match_operand:SWI48x 2 "memory_operand")
+                      (match_dup 0)
+                      (match_operand:SWI48x 3 "register_operand")
+                      (match_operand:SI 4 "const_int_operand")]
+                     UNSPECV_CMPCCXADD))
+              (set (match_dup 2)
+                   (unspec_volatile:SWI48x [(const_int 0)] UNSPECV_CMPCCXADD))
+              (set (reg:CC FLAGS_REG)
+                   (compare:CC (match_dup 2)
+                               (match_dup 0)))])
+   (set (reg FLAGS_REG)
+        (compare (match_operand:SWI48x 5 "register_operand")
+                 (match_operand:SWI48x 6 "x86_64_general_operand")))
+   (set (pc)
+        (if_then_else (match_operator 7 "ix86_comparison_int_operator"
+                        [(reg FLAGS_REG) (const_int 0)])
+                      (label_ref (match_operand 8))
+                      (pc)))]
+  "TARGET_CMPCCXADD && TARGET_64BIT
+   && rtx_equal_p (operands[0], operands[6])
+   && rtx_equal_p (operands[1], operands[5])
+   && peep2_regno_dead_p (4, FLAGS_REG)"
+  [(set (match_dup 0)
+        (match_dup 1))
+   (parallel [(set (match_dup 0)
+                   (unspec_volatile:SWI48x
+                     [(match_dup 2)
+                      (match_dup 0)
+                      (match_dup 3)
+                      (match_dup 4)]
+                     UNSPECV_CMPCCXADD))
+              (set (match_dup 2)
+                   (unspec_volatile:SWI48x [(const_int 0)] UNSPECV_CMPCCXADD))
+              (set (reg:CC FLAGS_REG)
+                   (compare:CC (match_dup 2)
+                               (match_dup 0)))])
+   (set (pc)
+        (if_then_else
+          (match_op_dup 7
+            [(match_dup 9) (const_int 0)])
+          (label_ref (match_dup 8))
+          (pc)))]
+{
+  operands[9] = gen_rtx_REG (GET_MODE (XEXP (operands[7], 0)), FLAGS_REG);
+  if (swap_condition (GET_CODE (operands[7])) != GET_CODE (operands[7]))
+    {
+      operands[7] = shallow_copy_rtx (operands[7]);
+      enum rtx_code ccode = swap_condition (GET_CODE (operands[7]));
+      PUT_CODE (operands[7], ccode);
+      operands[9] = gen_rtx_REG (SELECT_CC_MODE (ccode,
+                                                 operands[6],
+                                                 operands[5]),
+                                 FLAGS_REG);
+    }
+})
diff --git a/gcc/testsuite/gcc.target/i386/pr110591-2.c b/gcc/testsuite/gcc.target/i386/pr110591-2.c
new file mode 100644
index 00000000000..92ffdb97d62
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr110591-2.c
@@ -0,0 +1,90 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-mcmpccxadd -O2 -fno-if-conversion -fno-if-conversion2" } */
+/* { dg-final { scan-assembler-not {cmp[lq]?[ \t]+} } } */
+/* { dg-final { scan-assembler-times {cmpoxadd[ \t]+} 12 } } */
+
+#include
+
+int foo_jg (int *ptr, int v)
+{
+  if (_cmpccxadd_epi32(ptr, v, 1, _CMPCCX_O) > v)
+    return 100;
+  return 200;
+}
+
+int foo_jl (int *ptr, int v)
+{
+  if (_cmpccxadd_epi32(ptr, v, 1, _CMPCCX_O) < v)
+    return 300;
+  return 100;
+}
+
+int foo_je(int *ptr, int v)
+{
+  if (_cmpccxadd_epi32(ptr, v, 1, _CMPCCX_O) == v)
+    return 123;
+  return 134;
+}
+
+int foo_jne(int *ptr, int v)
+{
+  if (_cmpccxadd_epi32(ptr, v, 1, _CMPCCX_O) != v)
+    return 111;
+  return 12;
+}
+
+int foo_jge(int *ptr, int v)
+{
+  if (_cmpccxadd_epi32(ptr, v, 1, _CMPCCX_O) >= v)
+    return 413;
+  return 23;
+}
+
+int foo_jle(int *ptr, int v)
+{
+  if (_cmpccxadd_epi32(ptr, v, 1, _CMPCCX_O) <= v)
+    return 3141;
+  return 341;
+}
+
+int fooq_jg (long long *ptr, long long v)
+{
+  if (_cmpccxadd_epi64(ptr, v, 1, _CMPCCX_O) > v)
+    return 123;
+  return 3;
+}
+
+int fooq_jl (long long *ptr, long long v)
+{
+  if (_cmpccxadd_epi64(ptr, v, 1, _CMPCCX_O) < v)
+    return 313;
+  return 5;
+}
+
+int fooq_je(long long *ptr, long long v)
+{
+  if (_cmpccxadd_epi64(ptr, v, 1, _CMPCCX_O) == v)
+    return 1313;
+  return 13;
+}
+
+int fooq_jne(long long *ptr, long long v)
+{
+  if (_cmpccxadd_epi64(ptr, v, 1, _CMPCCX_O) != v)
+    return 1314;
+  return 132;
+}
+
+int fooq_jge(long long *ptr, long long v)
+{
+  if (_cmpccxadd_epi64(ptr, v, 1, _CMPCCX_O) >= v)
+    return 14314;
+  return 434;
+}
+
+int fooq_jle(long long *ptr, long long v)
+{
+  if (_cmpccxadd_epi64(ptr, v, 1, _CMPCCX_O) <= v)
+    return 14414;
+  return 43;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr110591.c b/gcc/testsuite/gcc.target/i386/pr110591.c
new file mode 100644
index 00000000000..32a515b429e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr110591.c
@@ -0,0 +1,66 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-mcmpccxadd -O2" } */
+/* { dg-final { scan-assembler-not {cmp[lq]?[ \t]+} } } */
+/* { dg-final { scan-assembler-times {cmpoxadd[ \t]+} 12 } } */
+
+#include
+
+_Bool foo_setg (int *ptr, int v)
+{
+  return _cmpccxadd_epi32(ptr, v, 1, _CMPCCX_O) > v;
+}
+
+_Bool foo_setl (int *ptr, int v)
+{
+  return _cmpccxadd_epi32(ptr, v, 1, _CMPCCX_O) < v;
+}
+
+_Bool foo_sete(int *ptr, int v)
+{
+  return _cmpccxadd_epi32(ptr, v, 1, _CMPCCX_O) == v;
+}
+
+_Bool foo_setne(int *ptr, int v)
+{
+  return _cmpccxadd_epi32(ptr, v, 1, _CMPCCX_O) != v;
+}
+
+_Bool foo_setge(int *ptr, int v)
+{
+  return _cmpccxadd_epi32(ptr, v, 1, _CMPCCX_O) >= v;
+}
+
+_Bool foo_setle(int *ptr, int v)
+{
+  return _cmpccxadd_epi32(ptr, v, 1, _CMPCCX_O) <= v;
+}
+
+_Bool fooq_setg (long long *ptr, long long v)
+{
+  return _cmpccxadd_epi64(ptr, v, 1, _CMPCCX_O) > v;
+}
+
+_Bool fooq_setl (long long *ptr, long long v)
+{
+  return _cmpccxadd_epi64(ptr, v, 1, _CMPCCX_O) < v;
+}
+
+_Bool fooq_sete(long long *ptr, long long v)
+{
+  return _cmpccxadd_epi64(ptr, v, 1, _CMPCCX_O) == v;
+}
+
+_Bool fooq_setne(long long *ptr, long long v)
+{
+  return _cmpccxadd_epi64(ptr, v, 1, _CMPCCX_O) != v;
+}
+
+_Bool fooq_setge(long long *ptr, long long v)
+{
+  return _cmpccxadd_epi64(ptr, v, 1, _CMPCCX_O) >= v;
+}
+
+_Bool fooq_setle(long long *ptr, long long v)
+{
+  return _cmpccxadd_epi64(ptr, v, 1, _CMPCCX_O) <= v;
+}
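
Note on the swapped-operand case: the flags produced by cmpccxadd correspond to one operand order, so when the eliminated cmp had its operands the other way round, the condition in the following jcc/setcc has to be replaced by its swapped counterpart (GT <-> LT, GE <-> LE, EQ and NE unchanged). That is the role of swap_condition in the peephole preparation code. Below is a minimal standalone sketch of that identity, assuming signed comparisons only. The enum `cond` and the helpers `eval_cond`, `swap_cond` and `check_swap_identity` are hypothetical names used for illustration; this is not GCC code and it does not model CC modes.

```c
/* Hypothetical stand-ins for the signed comparison codes.  */
enum cond { COND_EQ, COND_NE, COND_GT, COND_GE, COND_LT, COND_LE };

/* Evaluate "a <c> b".  */
static int eval_cond (enum cond c, long long a, long long b)
{
  switch (c)
    {
    case COND_EQ: return a == b;
    case COND_NE: return a != b;
    case COND_GT: return a > b;
    case COND_GE: return a >= b;
    case COND_LT: return a < b;
    case COND_LE: return a <= b;
    }
  return 0;
}

/* Condition to use once the two compare operands are exchanged.  */
static enum cond swap_cond (enum cond c)
{
  switch (c)
    {
    case COND_GT: return COND_LT;
    case COND_LT: return COND_GT;
    case COND_GE: return COND_LE;
    case COND_LE: return COND_GE;
    default: return c;  /* EQ and NE are symmetric.  */
    }
}

/* Return 1 if "a <c> b" equals "b <swap_cond (c)> a" for every
   condition on a small grid of values, 0 otherwise.  */
static int check_swap_identity (void)
{
  for (int c = COND_EQ; c <= COND_LE; c++)
    for (long long a = -3; a <= 3; a++)
      for (long long b = -3; b <= 3; b++)
        if (eval_cond ((enum cond) c, a, b)
            != eval_cond (swap_cond ((enum cond) c), b, a))
          return 0;
  return 1;
}
```

Calling check_swap_identity () from a test driver returns 1. Because EQ and NE map to themselves, the preparation code in the patch only rewrites the operator when swap_condition yields a different code.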