From patchwork Mon Jun 5 07:30:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Takayuki 'January June' Suwa X-Patchwork-Id: 103194 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp2559058vqr; Mon, 5 Jun 2023 02:38:28 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7VM+xdLJfwx4Ly0CUw8BcRINJVlAFMz4uuVA4vGPH+UVi8/lwFylrUmd0E5a7ChbUPkDI0 X-Received: by 2002:a17:907:9284:b0:974:5399:c21 with SMTP id bw4-20020a170907928400b0097453990c21mr5010748ejc.24.1685957908252; Mon, 05 Jun 2023 02:38:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685957908; cv=none; d=google.com; s=arc-20160816; b=USJr7bUwkhS0accVuZJThjbC3DOo20RN9JKWKXEgzeLxjmnxDNQMIeCE5Kk1v5PlZq /DQOMd/567+LKKACSONqrH368X3B/5p/9YmEWrWKjwoHPge40oKB2kVBcWquJQ+wUt4k cUDwUgkXFeLAFXBDtkzqRI1ccxuAlVNBeT0oJNwyJ64m74VA9mvHnWqVVJSTiFwHVxrS UIqMgQ7cAz2O8agG1vPsT6RNPpMyF/c8iLCy48kfmpPx3MgyE5o8FwdUiIXdnlM34bPX QIZBTbNDhmDKFSIPnDj+nbX9mDcauee4/Eq+di69BeaKbBkTjH6tccvsGpJcrEcjy5oq z6vQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:references :content-transfer-encoding:subject:cc:to:user-agent:mime-version :date:message-id:dmarc-filter:delivered-to:dkim-signature :dkim-filter; bh=ljpkIPKLkK0yr48pcd/9iKWBMIYuvo0rQ7KtQZTscFI=; b=k3PbK/nuDM5V3JqgXEKlMGm97HQmeGoVT3eErdVDBvbs6/GP5q9KYWUblvd6moNcmz S68jU6uwpwla4p6YJN+pDGsMY1lBqnUGH1xlgqS0hvlDJRGVo4EUkEmwg8E2LZM1+uCu +IrLNr0DPGrfYHQEKzazvxLeThscOIwNQQiPisMWEIonzGpQ/k02Uml7/11Shh7IYRcZ 45iEq4ShTp6s7nnkfjV6Vfl0oYd+HAuvOKuqVIj63BxoZ1T9rRnogUOPz80KMrRsVQoG Hyp/aa16gTpUsBER3DWr6cvyANOlG/mIdDwVEjSZeym+E6uo6ASQmX/hJ19VGX2WomrA o+gQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b="Kj0z/e14"; spf=pass (google.com: domain of 
gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id y15-20020a170906470f00b00950ea4cbd16si4471344ejq.271.2023.06.05.02.38.27 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 05 Jun 2023 02:38:28 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b="Kj0z/e14"; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id BD40D3853D17 for ; Mon, 5 Jun 2023 09:38:26 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org BD40D3853D17 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1685957906; bh=ljpkIPKLkK0yr48pcd/9iKWBMIYuvo0rQ7KtQZTscFI=; h=Date:To:Cc:Subject:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=Kj0z/e142paeNHYBVdfCMucFHXlWQHPkvPr9OUsf4qZw+Ed0zAmVnArARa5PVVe+O JtOnxUpsOOjrc9/MJJp1nIreZKNDWoPNapNTAa8XIlBiaeGFWbrVp/kbW6WEJ7LTc8 qbUWkGRpqPliBOrCLGmlT4PGIKaUN3zDZNnxZroc= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from omggw7003.mail.djm.yahoo.co.jp (omggw7003.mail.djm.yahoo.co.jp [183.79.54.15]) by sourceware.org (Postfix) with ESMTPS id 
0D3CB3854E45 for ; Mon, 5 Jun 2023 09:37:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0D3CB3854E45 X-YMail-OSG: IvB3e7cVM1mooal4jXR7bAvf9AIh1Hpl0HrlSq0DWJKABCeJSYnLy9bveBc1GLX YPjLy2qYzAwMYP54QTtDZ8aXb5gxNuSGYkwXCdrZXAJVdAj2DrSOdLyaaZM9qt2fOLlRnm5D7O5B KSop77h9gL3JhyMprwiWjOaR6YogCSbperGPHZM5_bIB8EZlj9Q5lepfd.pkmSJsVGX8rcMZDxnv Jikkl42Gj4XxjHRYMPCG5y6.cF8HSBKy0JagU2idxWBzuTxZatV77wmk24SA8Ujik5kkbClp0WBq jAEL8sjz35QBQT4CyL6dxGWo_zESg9a507NK4vLpXEhh0D3s9meiQVUP8XmNIWNZ8QBXysl9d2Jh ZMdweQmz.UcW8fXx_Aww8bR9cD9C_GBNETsIm1ro7jbBGOVmBQpkDcTznNgA_0MbCds6StcQJmj1 D3eYVERnoErmG5r_RZvJcdiS1Whqe2hgnLh4MCRI9jNGHt82JfX6t48OYCXyNRie3kJm72Dmzsn7 9NPUeyXdnrzscThae.lQR6b2t01Niqc7h33FHwA8ZydEd3sfExuXRt9pN97DZ_MD3JbU3ow5V7we qFA2T5O5vMCZffCLR5_HSz4k_EzEtiy.HVk4TJ8blRUTyQG3zk.lsTIvQXXMolv_3ir8Y_2PZ_E3 59B5_5QLiqonXkcNCmv.fmG0y6MzcYG198qti4IeszpCj6d.eijga8PqWB.7M_uavNcUvoJZoLvN BmW6Zztb0Hnj0_kYjxm_fSuCLNDr4Qn.gmFDsytqW4YzviOprAY.zf7x9o3aBGBoWo3msWB3Y.wJ 9b3d7xBYDS0MhRD5gxvrWlSaPpNr1O8kYNb9j1_7UoJ08QiYwqzd9H1I4yOr9RidQIjepGbM.zyP SfFQIBWWNRt9TeEuhX4PIaF5ZaGukaTcGzjRxjeTUzSfWLXcnRV7CcsPyBqXWywyNxm_OgjEzXH_ XpoQZro5Ahh4oCslep27BA_J_FV8JI98wV2fdyGao6Pkkrz0KRk49X_DMDw9fS4cd6Os- Received: from sonicgw.mail.yahoo.co.jp by sonicconh5003.mail.kks.yahoo.co.jp with HTTP; Mon, 5 Jun 2023 09:37:19 +0000 Received: by smtphe5004.mail.kks.ynwp.yahoo.co.jp (YJ Hermes SMTP Server) with ESMTPA ID fe3bd80c6bec6d48d9784dd24c19eeb3; Mon, 05 Jun 2023 18:37:17 +0900 (JST) Message-ID: Date: Mon, 5 Jun 2023 16:30:55 +0900 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.11.2 To: GCC Patches Cc: Max Filippov Subject: [PATCH v2] xtensa: Optimize boolean evaluation or branching when EQ/NE to zero in S[IF]mode References: X-Spam-Status: No, score=-13.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FROM, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS, TXREP, 
T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Takayuki 'January June' Suwa via Gcc-patches From: Takayuki 'January June' Suwa Reply-To: Takayuki 'January June' Suwa Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1767854999614408210?= X-GMAIL-MSGID: =?utf-8?q?1767854999614408210?=

This patch optimizes the boolean evaluation of EQ/NE against zero
by adding two insn_and_split patterns similar to SImode conditional
store:

"eq_zero":
        op0 = (op1 == 0) ? 1 : 0;
        op0 = clz(op1) >> 5;  /* optimized (requires TARGET_NSA) */

"movsicc_ne0_reg_zero":
        op0 = (op1 != 0) ? op2 : 0;
        op0 = op2; if (op1 == 0) op0 = op1;  /* optimized */

    /* example #1 */
    int bool_eqSI(int x) {
      return x == 0;
    }
    int bool_neSI(int x) {
      return x != 0;
    }

    ;; after (TARGET_NSA)
    bool_eqSI:
        nsau    a2, a2
        srli    a2, a2, 5
        ret.n
    bool_neSI:
        mov.n   a9, a2
        movi.n  a2, 1
        moveqz  a2, a9, a9
        ret.n

Both patterns also work in SFmode once the sign bit is ignored, and
conditional branches on EQ/NE against zero in SFmode are handled in
the same manner.

The reasons for this optimization in SFmode are:

  - Only zero values (negative or non-negative) have no 1 bits in
    either the exponent or the mantissa.
  - EQ/NE comparisons involving NaNs produce no signal even if the
    NaNs are signaling.
  - Even if the use of the IEEE 754 single-precision floating-point
    coprocessor is configured (TARGET_HARD_FLOAT is true), the
    comparison still takes the following steps:
        1. Load zero value to FP register
        2. Possibly, additional FP move if the comparison target is
           an address register
        3. FP equality check instruction
        4. Read the boolean register containing the result, or
           conditional branch
    As listed above, a considerable number of instructions are still
    generated.

    /* example #2 */
    int bool_eqSF(float x) {
      return x == 0;
    }
    int bool_neSF(float x) {
      return x != 0;
    }
    int bool_ltSF(float x) {
      return x < 0;
    }
    extern void foo(void);
    void cb_eqSF(float x) {
      if(x != 0)
        foo();
    }
    void cb_neSF(float x) {
      if(x == 0)
        foo();
    }
    void cb_geSF(float x) {
      if(x < 0)
        foo();
    }

    ;; after
    ;; (TARGET_NSA, TARGET_BOOLEANS and TARGET_HARD_FLOAT)
    bool_eqSF:
        add.n   a2, a2, a2
        nsau    a2, a2
        srli    a2, a2, 5
        ret.n
    bool_neSF:
        add.n   a9, a2, a2
        movi.n  a2, 1
        moveqz  a2, a9, a9
        ret.n
    bool_ltSF:
        movi.n  a9, 0
        wfr     f0, a2
        wfr     f1, a9
        olt.s   b0, f0, f1
        movi.n  a9, 0
        movi.n  a2, 1
        movf    a2, a9, b0
        ret.n
    cb_eqSF:
        add.n   a2, a2, a2
        beqz.n  a2, .L6
        j.l     foo, a9
    .L6:
        ret.n
    cb_neSF:
        add.n   a2, a2, a2
        bnez.n  a2, .L8
        j.l     foo, a9
    .L8:
        ret.n
    cb_geSF:
        addi    sp, sp, -16
        movi.n  a3, 0
        s32i.n  a12, sp, 8
        s32i.n  a0, sp, 12
        mov.n   a12, a2
        call0   __unordsf2
        bnez.n  a2, .L10
        movi.n  a3, 0
        mov.n   a2, a12
        call0   __gesf2
        bnei    a2, -1, .L10
        l32i.n  a0, sp, 12
        l32i.n  a12, sp, 8
        addi    sp, sp, 16
        j.l     foo, a9
    .L10:
        l32i.n  a0, sp, 12
        l32i.n  a12, sp, 8
        addi    sp, sp, 16
        ret.n

gcc/ChangeLog:

        * config/xtensa/predicates.md (const_float_0_operand):
        Rename from obsolete "const_float_1_operand" and change the
        constant to compare.
        (cstoresf_cbranchsf_operand, cstoresf_cbranchsf_operator):
        New.
        * config/xtensa/xtensa.cc (xtensa_expand_conditional_branch):
        Add code for EQ/NE comparison with constant zero in SFmode.
        (xtensa_expand_scc): Add code to derive boolean evaluation
        of EQ/NE with constant zero for comparison in SFmode.
        (xtensa_rtx_costs): Change cost of CONST_DOUBLE with value
        zero inside "cbranchsf4" to 0.
        * config/xtensa/xtensa.md (cbranchsf4, cstoresf4):
        Change "match_operator" and the third "match_operand" to the
        ones mentioned above.
        (movsicc_ne0_reg_zero, eq_zero): New.
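The bit-level reasoning behind both patterns can be checked on a host machine with a small C sketch. This is only an illustration and not part of the patch: the helper names (clz32, eq_zero_si, ne_zero_si, sf_bits_without_sign, eq_zero_sf, ne_zero_sf) are hypothetical, clz32 is a software stand-in assumed to behave like NSAU by returning 32 for a zero input, and float is assumed to be a 32-bit IEEE 754 binary32 value.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for NSAU: count leading zero bits, and return 32
   when the input is zero (an assumption about the instruction's result).  */
unsigned clz32(uint32_t v)
{
    unsigned n = 0;
    if (v == 0)
        return 32;
    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Mirrors "eq_zero": clz yields 32 only for zero, and 32 >> 5 == 1,
   while every count in 0..31 shifts down to 0.  */
int eq_zero_si(uint32_t op1)
{
    return (int)(clz32(op1) >> 5);
}

/* Mirrors "movsicc_ne0_reg_zero" with op2 == 1:
   op0 = op2; if (op1 == 0) op0 = op1;  */
int ne_zero_si(uint32_t op1)
{
    uint32_t op0 = 1;
    if (op1 == 0)
        op0 = op1;
    return (int)op0;
}

/* Reinterpret the float as an integer and add it to itself, which
   discards the sign bit (the role of "add.n a2, a2, a2").  Assumes
   sizeof(float) == 4.  */
uint32_t sf_bits_without_sign(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits + bits;   /* unsigned wrap-around drops bit 31 */
}

int eq_zero_sf(float f) { return eq_zero_si(sf_bits_without_sign(f)); }
int ne_zero_sf(float f) { return ne_zero_si(sf_bits_without_sign(f)); }
```

Under those assumptions, eq_zero_sf agrees with `x == 0` for both signed zeros and for denormals, since only the two zero encodings have every exponent and mantissa bit clear.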
---
 gcc/config/xtensa/predicates.md | 17 +++++++++--
 gcc/config/xtensa/xtensa.cc     | 45 ++++++++++++++++++++++++++++
 gcc/config/xtensa/xtensa.md     | 53 +++++++++++++++++++++++++++++----
 3 files changed, 106 insertions(+), 9 deletions(-)

diff --git a/gcc/config/xtensa/predicates.md b/gcc/config/xtensa/predicates.md
index a3575a68892..cfac3ad4936 100644
--- a/gcc/config/xtensa/predicates.md
+++ b/gcc/config/xtensa/predicates.md
@@ -155,11 +155,11 @@
                     && CONSTANT_P (op)
                     && GET_MODE_SIZE (mode) % UNITS_PER_WORD == 0")))))
 
-;; Accept the floating point constant 1 in the appropriate mode.
-(define_predicate "const_float_1_operand"
+;; Accept the floating point constant 0 in the appropriate mode.
+(define_predicate "const_float_0_operand"
   (match_code "const_double")
 {
-  return real_equal (CONST_DOUBLE_REAL_VALUE (op), &dconst1);
+  return real_equal (CONST_DOUBLE_REAL_VALUE (op), &dconst0);
 })
 
 (define_predicate "fpmem_offset_operand"
@@ -179,6 +179,11 @@
   return false;
 })
 
+(define_predicate "cstoresf_cbranchsf_operand"
+  (ior (and (match_test "TARGET_HARD_FLOAT")
+            (match_operand 0 "register_operand"))
+       (match_operand 0 "const_float_0_operand")))
+
 (define_predicate "branch_operator"
   (match_code "eq,ne,lt,ge"))
 
@@ -197,6 +202,12 @@
 
 (define_predicate "xtensa_cstoresi_operator"
   (match_code "eq,ne,gt,ge,lt,le"))
 
+(define_predicate "cstoresf_cbranchsf_operator"
+  (ior (and (match_test "TARGET_HARD_FLOAT")
+            (and (match_operand 0 "comparison_operator")
+                 (match_test "register_operand (XEXP (op, 1), SFmode)")))
+       (match_operand 0 "boolean_operator")))
+
 (define_predicate "xtensa_shift_per_byte_operator"
   (match_code "ashift,ashiftrt,lshiftrt"))
 
diff --git a/gcc/config/xtensa/xtensa.cc b/gcc/config/xtensa/xtensa.cc
index 3b5d25b660a..f43f057344c 100644
--- a/gcc/config/xtensa/xtensa.cc
+++ b/gcc/config/xtensa/xtensa.cc
@@ -865,6 +865,16 @@ xtensa_expand_conditional_branch (rtx *operands, machine_mode mode)
   switch (mode)
     {
     case E_SFmode:
+      if ((test_code == EQ || test_code == NE)
+          && const_float_0_operand (cmp1, SFmode))
+        {
+          emit_move_insn (cmp1 = gen_reg_rtx (SImode),
+                          simplify_gen_subreg (SImode, cmp0, SFmode, 0));
+          emit_insn (gen_addsi3 (cmp1, cmp1, cmp1));
+          cmp = gen_int_relational (test_code, cmp1, const0_rtx);
+          break;
+        }
+
       if (TARGET_HARD_FLOAT)
         {
           cmp = gen_float_relational (test_code, cmp0, cmp1);
@@ -996,6 +1006,36 @@ xtensa_expand_scc (rtx operands[4], machine_mode cmp_mode)
   rtx one_tmp, zero_tmp;
   rtx (*gen_fn) (rtx, rtx, rtx, rtx, rtx);
 
+  if (cmp_mode == SFmode)
+    {
+      if (const_float_0_operand (operands[3], SFmode))
+        switch (GET_CODE (operands[1]))
+          {
+          case EQ:
+            emit_move_insn (cmp = gen_reg_rtx (SImode),
+                            simplify_gen_subreg (SImode, operands[2],
+                                                 SFmode, 0));
+            emit_insn (gen_addsi3 (cmp, cmp, cmp));
+            emit_insn (gen_eq_zero (dest, cmp));
+            return 1;
+
+          case NE:
+            emit_move_insn (cmp = gen_reg_rtx (SImode),
+                            simplify_gen_subreg (SImode, operands[2],
+                                                 SFmode, 0));
+            emit_insn (gen_addsi3 (cmp, cmp, cmp));
+            one_tmp = force_reg (SImode, const1_rtx);
+            emit_insn (gen_movsicc_ne0_reg_zero (dest, cmp, one_tmp));
+            return 1;
+
+          default:
+            return 0;
+          }
+
+      if (! register_operand (operands[3], SFmode))
+        return 0;
+    }
+
   if (!(cmp = gen_conditional_move (GET_CODE (operands[1]), cmp_mode,
                                     operands[2], operands[3])))
     return 0;
@@ -4438,6 +4478,11 @@ xtensa_rtx_costs (rtx x, machine_mode mode, int outer_code,
       return true;
 
     case CONST_DOUBLE:
+      if (outer_code == COMPARE && const_float_0_operand (x, SFmode))
+        {
+          *total = 0;
+          return true;
+        }
       if (TARGET_CONST16)
         *total = COSTS_N_INSNS (4);
       else
diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 4b4ab3f5f37..d4b91ef8fd2 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -1906,11 +1906,11 @@
 })
 
 (define_expand "cbranchsf4"
-  [(match_operator 0 "comparison_operator"
+  [(match_operator 0 "cstoresf_cbranchsf_operator"
                     [(match_operand:SF 1 "register_operand")
-                     (match_operand:SF 2 "register_operand")])
+                     (match_operand:SF 2 "cstoresf_cbranchsf_operand")])
    (match_operand 3 "")]
-  "TARGET_HARD_FLOAT"
+  ""
 {
   xtensa_expand_conditional_branch (operands, SFmode);
   DONE;
@@ -2395,10 +2395,10 @@
 
 (define_expand "cstoresf4"
   [(match_operand:SI 0 "register_operand")
-   (match_operator:SI 1 "comparison_operator"
+   (match_operator:SI 1 "cstoresf_cbranchsf_operator"
                        [(match_operand:SF 2 "register_operand")
-                        (match_operand:SF 3 "register_operand")])]
-  "TARGET_HARD_FLOAT"
+                        (match_operand:SF 3 "cstoresf_cbranchsf_operand")])]
+  ""
 {
   if (!xtensa_expand_scc (operands, SFmode))
     FAIL;
@@ -2463,6 +2463,30 @@
    (set_attr "mode" "SI")
    (set_attr "length" "3,3")])
 
+(define_insn_and_split "movsicc_ne0_reg_zero"
+  [(set (match_operand:SI 0 "register_operand" "=a")
+        (if_then_else:SI (ne (match_operand:SI 1 "register_operand" "r")
+                             (const_int 0))
+                         (match_operand:SI 2 "register_operand" "r")
+                         (const_int 0)))]
+  ""
+  "#"
+  ""
+  [(set (match_dup 0)
+        (match_dup 2))
+   (set (match_dup 0)
+        (if_then_else:SI (ne (match_dup 1)
+                             (const_int 0))
+                         (match_dup 0)
+                         (match_dup 1)))]
+  ""
+  [(set_attr "type" "move")
+   (set_attr "mode" "SI")
+   (set (attr "length")
+        (if_then_else (match_test "TARGET_DENSITY")
+                      (const_int 5)
+                      (const_int 6)))])
+
 (define_insn "movsfcc_internal0"
   [(set (match_operand:SF 0 "register_operand" "=a,a,f,f")
         (if_then_else:SF (match_operator 4 "branch_operator"
@@ -3222,6 +3246,23 @@
                       (const_int 5)
                       (const_int 6))))])
 
+(define_insn_and_split "eq_zero"
+  [(set (match_operand:SI 0 "register_operand" "=a")
+        (eq:SI (match_operand:SI 1 "register_operand" "r")
+               (const_int 0)))]
+  "TARGET_NSA"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+        (clz:SI (match_dup 1)))
+   (set (match_dup 0)
+        (lshiftrt:SI (match_dup 0)
+                     (const_int 5)))]
+  ""
+  [(set_attr "type" "move")
+   (set_attr "mode" "SI")
+   (set_attr "length" "6")])
+
 (define_peephole2
   [(set (match_operand:SI 0 "register_operand")
         (match_operand:SI 6 "reload_operand"))