From patchwork Wed Aug 16 13:12:05 2023
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 135758
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: richard.sandiford@arm.com, rguenther@suse.de, Juzhe-Zhong
Subject: [PATCH] gimple_fold: Support COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold
Date: Wed, 16 Aug 2023 21:12:05 +0800
Message-Id: <20230816131205.233568-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

Hi, Richard and Richi.

Currently, GCC supports COND_LEN_FMA for floating-point code compiled
without -ffast-math; this is handled in tree-ssa-math-opts.cc.
However, GCC fails to generate COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS.

Consider the following case:

#define TEST_TYPE(TYPE) \
  __attribute__ ((noipa)) void ternop_##TYPE (TYPE *__restrict dst, \
                                              TYPE *__restrict a, \
                                              TYPE *__restrict b, int n) \
  { \
    for (int i = 0; i < n; i++) \
      dst[i] -= a[i] * b[i]; \
  }

#define TEST_ALL() \
  TEST_TYPE (float)

TEST_ALL ()

Gimple IR for RVV:

  ...
  _39 = -vect__8.14_26;
  vect__10.16_21 = .COND_LEN_FMA ({ -1, ... }, vect__6.11_30, _39, vect__4.8_34, vect__4.8_34, _46, 0);
  ...

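To make the IR above concrete, here is a minimal scalar sketch of what the
two internal functions compute per lane.  It is an illustration only: the
names Vec, Mask, lane_active, cond_len_fma, cond_len_fnma and negate are
hypothetical and not GCC code.  The sketch assumes that a lane is active
when its mask bit is set and its index is below len + bias, and that
inactive lanes take the else value.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;
using Mask = std::vector<bool>;

// Hypothetical scalar model, not GCC code: lane i is active when mask[i] is
// set and i < len + bias; inactive lanes take else_value[i].
static bool
lane_active (const Mask &mask, std::size_t i, long len, long bias)
{
  return mask[i] && static_cast<long> (i) < len + bias;
}

// Models .COND_LEN_FMA (mask, a, b, c, else_value, len, bias): a * b + c.
Vec
cond_len_fma (const Mask &mask, const Vec &a, const Vec &b, const Vec &c,
              const Vec &else_value, long len, long bias)
{
  assert (a.size () == b.size () && a.size () == c.size ());
  Vec r (a.size ());
  for (std::size_t i = 0; i < a.size (); ++i)
    r[i] = lane_active (mask, i, len, bias) ? std::fma (a[i], b[i], c[i])
                                            : else_value[i];
  return r;
}

// Models .COND_LEN_FNMA (mask, a, b, c, else_value, len, bias): -(a * b) + c.
Vec
cond_len_fnma (const Mask &mask, const Vec &a, const Vec &b, const Vec &c,
               const Vec &else_value, long len, long bias)
{
  assert (a.size () == b.size () && a.size () == c.size ());
  Vec r (a.size ());
  for (std::size_t i = 0; i < a.size (); ++i)
    r[i] = lane_active (mask, i, len, bias) ? std::fma (-a[i], b[i], c[i])
                                            : else_value[i];
  return r;
}

// Element-wise negation, standing in for the separate NEGATE_EXPR statement.
Vec
negate (const Vec &v)
{
  Vec r (v.size ());
  for (std::size_t i = 0; i < v.size (); ++i)
    r[i] = -v[i];
  return r;
}
```

In this model, cond_len_fma (mask, a, negate (b), dst, dst, len, 0) and
cond_len_fnma (mask, a, b, dst, dst, len, 0) yield the same vector, which
is the equivalence the fold relies on.
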
The negation survives as a separate statement because of the following
code in tree-ssa-math-opts.cc:

      if (len)
        fma_stmt
          = gimple_build_call_internal (IFN_COND_LEN_FMA, 7, cond, mulop1, op2,
                                        addop, else_value, len, bias);
      else if (cond)
        fma_stmt = gimple_build_call_internal (IFN_COND_FMA, 5, cond, mulop1,
                                               op2, addop, else_value);
      else
        fma_stmt = gimple_build_call_internal (IFN_FMA, 3, mulop1, op2, addop);
      gimple_set_lhs (fma_stmt, gimple_get_lhs (use_stmt));
      gimple_call_set_nothrow (fma_stmt, !stmt_can_throw_internal (cfun,
                                                                   use_stmt));
      gsi_replace (&gsi, fma_stmt, true);
      /* Follow all SSA edges so that we generate FMS, FNMA and FNMS
         regardless of where the negation occurs.  */
      gimple *orig_stmt = gsi_stmt (gsi);
      if (fold_stmt (&gsi, follow_all_ssa_edges))
        {
          if (maybe_clean_or_replace_eh_stmt (orig_stmt, gsi_stmt (gsi)))
            gcc_unreachable ();
          update_stmt (gsi_stmt (gsi));
        }

'fold_stmt' fails to fold NEGATE_EXPR + COND_LEN_FMA into COND_LEN_FNMA.

With this patch, the statement is folded into:

  vect__10.16_21 = .COND_LEN_FNMA ({ -1, ... }, vect__8.14_26, vect__6.11_30, vect__4.8_34, { 0.0, ... }, _46, 0);

Note that COND_LEN_FNMA has 7 arguments and COND_LEN_ADD has 6 arguments,
so the maximum number of operands is extended:

-  static const unsigned int MAX_NUM_OPS = 5;
+  static const unsigned int MAX_NUM_OPS = 7;

Bootstrapped and regression-tested on x86.
COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS are fully tested on the RISC-V
backend.  Testing on aarch64 is in progress.

gcc/ChangeLog:

	* genmatch.cc (decision_tree::gen): Support
	COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold.
	* gimple-match-exports.cc (gimple_simplify): Ditto.
	(gimple_resimplify6): New function.
	(gimple_resimplify7): New function.
	(gimple_match_op::resimplify): Support
	COND_LEN_FNMA/COND_LEN_FMS/COND_LEN_FNMS gimple fold.
	(convert_conditional_op): Ditto.
	(build_call_internal): Ditto.
	(try_conditional_simplification): Ditto.
	(gimple_extract): Ditto.
	* gimple-match.h (gimple_match_cond::gimple_match_cond): Ditto.
	* internal-fn.cc (CASE): Ditto.

---
 gcc/genmatch.cc             |   2 +-
 gcc/gimple-match-exports.cc | 124 ++++++++++++++++++++++++++++++++++--
 gcc/gimple-match.h          |  19 +++++-
 gcc/internal-fn.cc          |  11 ++--
 4 files changed, 144 insertions(+), 12 deletions(-)

diff --git a/gcc/genmatch.cc b/gcc/genmatch.cc
index f46d2e1520d..a1925a747a7 100644
--- a/gcc/genmatch.cc
+++ b/gcc/genmatch.cc
@@ -4052,7 +4052,7 @@ decision_tree::gen (vec &files, bool gimple)
     }
   fprintf (stderr, "removed %u duplicate tails\n", rcnt);

-  for (unsigned n = 1; n <= 5; ++n)
+  for (unsigned n = 1; n <= 7; ++n)
     {
       bool has_kids_p = false;

diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc
index 7aeb4ddb152..895950309b7 100644
--- a/gcc/gimple-match-exports.cc
+++ b/gcc/gimple-match-exports.cc
@@ -60,6 +60,12 @@ extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree),
                              code_helper, tree, tree, tree, tree, tree);
 extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree),
                              code_helper, tree, tree, tree, tree, tree, tree);
+extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree),
+                             code_helper, tree, tree, tree, tree, tree, tree,
+                             tree);
+extern bool gimple_simplify (gimple_match_op *, gimple_seq *, tree (*)(tree),
+                             code_helper, tree, tree, tree, tree, tree, tree,
+                             tree, tree);

 /* Functions that are needed by gimple-match but that are exported and used in
    other places in the compiler.  */
@@ -89,6 +95,8 @@ static bool gimple_resimplify2 (gimple_seq *, gimple_match_op *, tree (*)(tree))
 static bool gimple_resimplify3 (gimple_seq *, gimple_match_op *, tree (*)(tree));
 static bool gimple_resimplify4 (gimple_seq *, gimple_match_op *, tree (*)(tree));
 static bool gimple_resimplify5 (gimple_seq *, gimple_match_op *, tree (*)(tree));
+static bool gimple_resimplify6 (gimple_seq *, gimple_match_op *, tree (*)(tree));
+static bool gimple_resimplify7 (gimple_seq *, gimple_match_op *, tree (*)(tree));

 /* Match and simplify the toplevel valueized operation THIS.
    Replaces THIS with a simplified and/or canonicalized result and
@@ -109,6 +117,10 @@ gimple_match_op::resimplify (gimple_seq *seq, tree (*valueize)(tree))
       return gimple_resimplify4 (seq, this, valueize);
     case 5:
       return gimple_resimplify5 (seq, this, valueize);
+    case 6:
+      return gimple_resimplify6 (seq, this, valueize);
+    case 7:
+      return gimple_resimplify7 (seq, this, valueize);
     default:
       gcc_unreachable ();
     }
@@ -146,7 +158,14 @@ convert_conditional_op (gimple_match_op *orig_op,
   if (ifn == IFN_LAST)
     return false;
   unsigned int num_ops = orig_op->num_ops;
-  new_op->set_op (as_combined_fn (ifn), orig_op->type, num_ops + 2);
+  unsigned int num_cond_ops = 2;
+  if (orig_op->cond.len)
+    {
+      /* Convert COND_FNMA to COND_LEN_FNMA if it has LEN and BIAS.  */
+      ifn = get_len_internal_fn (ifn);
+      num_cond_ops = 4;
+    }
+  new_op->set_op (as_combined_fn (ifn), orig_op->type, num_ops + num_cond_ops);
   new_op->ops[0] = orig_op->cond.cond;
   for (unsigned int i = 0; i < num_ops; ++i)
     new_op->ops[i + 1] = orig_op->ops[i];
@@ -155,6 +174,11 @@ convert_conditional_op (gimple_match_op *orig_op,
     else_value = targetm.preferred_else_value (ifn, orig_op->type,
                                                num_ops, orig_op->ops);
   new_op->ops[num_ops + 1] = else_value;
+  if (orig_op->cond.len)
+    {
+      new_op->ops[num_ops + 2] = orig_op->cond.len;
+      new_op->ops[num_ops + 3] = orig_op->cond.bias;
+    }
   return true;
 }
 /* Helper for gimple_simplify valueizing OP using VALUEIZE and setting
@@ -219,7 +243,9 @@ build_call_internal (internal_fn fn, gimple_match_op *res_op)
                                      res_op->op_or_null (1),
                                      res_op->op_or_null (2),
                                      res_op->op_or_null (3),
-                                     res_op->op_or_null (4));
+                                     res_op->op_or_null (4),
+                                     res_op->op_or_null (5),
+                                     res_op->op_or_null (6));
 }

 /* RES_OP is the result of a simplification.  If it is conditional,
@@ -319,6 +345,7 @@ try_conditional_simplification (internal_fn ifn, gimple_match_op *res_op,
 {
   code_helper op;
   tree_code code = conditional_internal_fn_code (ifn);
+  int len_index = internal_fn_len_index (ifn);
   if (code != ERROR_MARK)
     op = code;
   else
@@ -330,12 +357,20 @@ try_conditional_simplification (internal_fn ifn, gimple_match_op *res_op,
     }

   unsigned int num_ops = res_op->num_ops;
+  /* num_cond_ops = 2 for COND_ADD (MASK and ELSE)
+     whereas num_cond_ops = 4 for COND_LEN_ADD (MASK, ELSE, LEN and BIAS).  */
+  unsigned int num_cond_ops = len_index < 0 ? 2 : 4;
+  tree else_value
+    = len_index < 0 ? res_op->ops[num_ops - 1] : res_op->ops[num_ops - 3];
+  tree len = len_index < 0 ? NULL_TREE : res_op->ops[num_ops - 2];
+  tree bias = len_index < 0 ? NULL_TREE : res_op->ops[num_ops - 1];
+  unsigned int last_op_index = len_index < 0 ? num_ops - 1 : num_ops - 3;
   gimple_match_op cond_op (gimple_match_cond (res_op->ops[0],
-                                              res_op->ops[num_ops - 1]),
-                           op, res_op->type, num_ops - 2);
+                                              else_value, len, bias),
+                           op, res_op->type, num_ops - num_cond_ops);

   memcpy (cond_op.ops, res_op->ops + 1, (num_ops - 1) * sizeof *cond_op.ops);
-  switch (num_ops - 2)
+  switch (num_ops - num_cond_ops)
     {
     case 1:
       if (!gimple_resimplify1 (seq, &cond_op, valueize))
@@ -717,7 +752,7 @@ gimple_extract (gimple *stmt, gimple_match_op *res_op,
       /* ??? This way we can't simplify calls with side-effects.  */
       if (gimple_call_lhs (stmt) != NULL_TREE
           && gimple_call_num_args (stmt) >= 1
-          && gimple_call_num_args (stmt) <= 5)
+          && gimple_call_num_args (stmt) <= 7)
         {
           combined_fn cfn;
           if (gimple_call_internal_p (stmt))
@@ -1145,6 +1180,83 @@ gimple_resimplify5 (gimple_seq *seq, gimple_match_op *res_op,
   return canonicalized;
 }

+/* Helper that matches and simplifies the toplevel result from
+   a gimple_simplify run (where we don't want to build
+   a stmt in case it's used in in-place folding).  Replaces
+   RES_OP with a simplified and/or canonicalized result and
+   returns whether any change was made.  */
+
+static bool
+gimple_resimplify6 (gimple_seq *seq, gimple_match_op *res_op,
+                    tree (*valueize)(tree))
+{
+  /* No constant folding is defined for six-operand functions.  */
+
+  /* Canonicalize operand order.  */
+  bool canonicalized = false;
+  int argno = first_commutative_argument (res_op->code, res_op->type);
+  if (argno >= 0
+      && tree_swap_operands_p (res_op->ops[argno], res_op->ops[argno + 1]))
+    {
+      std::swap (res_op->ops[argno], res_op->ops[argno + 1]);
+      canonicalized = true;
+    }
+
+  gimple_match_op res_op2 (*res_op);
+  if (gimple_simplify (&res_op2, seq, valueize,
+                       res_op->code, res_op->type,
+                       res_op->ops[0], res_op->ops[1], res_op->ops[2],
+                       res_op->ops[3], res_op->ops[4], res_op->ops[5]))
+    {
+      *res_op = res_op2;
+      return true;
+    }
+
+  if (maybe_resimplify_conditional_op (seq, res_op, valueize))
+    return true;
+
+  return canonicalized;
+}
+
+/* Helper that matches and simplifies the toplevel result from
+   a gimple_simplify run (where we don't want to build
+   a stmt in case it's used in in-place folding).  Replaces
+   RES_OP with a simplified and/or canonicalized result and
+   returns whether any change was made.  */
+
+static bool
+gimple_resimplify7 (gimple_seq *seq, gimple_match_op *res_op,
+                    tree (*valueize)(tree))
+{
+  /* No constant folding is defined for seven-operand functions.  */
+
+  /* Canonicalize operand order.  */
+  bool canonicalized = false;
+  int argno = first_commutative_argument (res_op->code, res_op->type);
+  if (argno >= 0
+      && tree_swap_operands_p (res_op->ops[argno], res_op->ops[argno + 1]))
+    {
+      std::swap (res_op->ops[argno], res_op->ops[argno + 1]);
+      canonicalized = true;
+    }
+
+  gimple_match_op res_op2 (*res_op);
+  if (gimple_simplify (&res_op2, seq, valueize,
+                       res_op->code, res_op->type,
+                       res_op->ops[0], res_op->ops[1], res_op->ops[2],
+                       res_op->ops[3], res_op->ops[4], res_op->ops[5],
+                       res_op->ops[6]))
+    {
+      *res_op = res_op2;
+      return true;
+    }
+
+  if (maybe_resimplify_conditional_op (seq, res_op, valueize))
+    return true;
+
+  return canonicalized;
+}
+
 /* Return a canonical form for CODE when operating on TYPE.  The idea
    is to remove redundant ways of representing the same operation so
    that code_helpers can be hashed and compared for equality.
diff --git a/gcc/gimple-match.h b/gcc/gimple-match.h
index b20585dca4b..c16cfede826 100644
--- a/gcc/gimple-match.h
+++ b/gcc/gimple-match.h
@@ -34,6 +34,7 @@ public:
   /* Build an unconditional op.  */
   gimple_match_cond (uncond) : cond (NULL_TREE), else_value (NULL_TREE) {}
   gimple_match_cond (tree, tree);
+  gimple_match_cond (tree, tree, tree, tree);

   gimple_match_cond any_else () const;

@@ -44,6 +45,16 @@ public:
   /* The value to use when the condition is false.  This is NULL_TREE if
      the operation is unconditional or if the value doesn't matter.  */
   tree else_value;
+
+  /* The length value for a conditional len operation, which is performed
+     when element index < len + bias.  This is NULL_TREE if the operation
+     is a non-LEN operation.  */
+  tree len;
+
+  /* The bias value for a conditional len operation, which is performed
+     when element index < len + bias.  This is NULL_TREE if the operation
+     is a non-LEN operation.  */
+  tree bias;
 };

 inline
@@ -52,6 +63,12 @@ gimple_match_cond::gimple_match_cond (tree cond_in, tree else_value_in)
 {
 }

+inline
+gimple_match_cond::gimple_match_cond (tree cond_in, tree else_value_in,
+                                      tree len_in, tree bias_in)
+  : cond (cond_in), else_value (else_value_in), len (len_in), bias (bias_in)
+{}
+
 /* Return a gimple_match_cond with the same condition but with an
    arbitrary ELSE_VALUE.  */

@@ -93,7 +110,7 @@ public:
   bool resimplify (gimple_seq *, tree (*)(tree));

   /* The maximum value of NUM_OPS.  */
-  static const unsigned int MAX_NUM_OPS = 5;
+  static const unsigned int MAX_NUM_OPS = 7;

   /* The conditions under which the operation is performed, and the value to
      use as a fallback.  */
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index cc1ede58799..be9b93e9f71 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -4477,11 +4477,14 @@ get_unconditional_internal_fn (internal_fn ifn)
 {
   switch (ifn)
     {
-#define CASE(NAME) case IFN_COND_##NAME: return IFN_##NAME;
-      FOR_EACH_COND_FN_PAIR(CASE)
+#define CASE(NAME) \
+    case IFN_COND_##NAME: \
+    case IFN_COND_LEN_##NAME: \
+      return IFN_##NAME;
+      FOR_EACH_COND_FN_PAIR (CASE)
 #undef CASE
-    default:
-      return IFN_LAST;
+  default:
+    return IFN_LAST;
     }
 }

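
The operand bookkeeping in convert_conditional_op and
try_conditional_simplification can be summarised with a small standalone
sketch.  The struct cond_layout and the function describe_cond_ops are
hypothetical names used only for illustration and are not part of GCC;
the indices follow the patch (ELSE at num_ops - 3, LEN at num_ops - 2 and
BIAS at num_ops - 1 when a length is present, ELSE at num_ops - 1
otherwise).

```cpp
#include <cassert>

// Hypothetical helper, not part of GCC: where the trailing operands of a
// conditional internal function call sit in its flat operand list.
struct cond_layout
{
  unsigned num_cond_ops;  // 2 (MASK, ELSE) or 4 (MASK, ELSE, LEN, BIAS)
  unsigned num_inner_ops; // operands left for the unconditional operation
  unsigned else_index;    // position of the ELSE operand
  int len_index;          // -1 when the call has no LEN operand
  int bias_index;         // -1 when the call has no BIAS operand
};

// NUM_OPS counts every operand of the call, including MASK at index 0.
constexpr cond_layout
describe_cond_ops (unsigned num_ops, bool has_len)
{
  return has_len
    ? cond_layout { 4, num_ops - 4, num_ops - 3,
                    int (num_ops) - 2, int (num_ops) - 1 }
    : cond_layout { 2, num_ops - 2, num_ops - 1, -1, -1 };
}

// COND_LEN_FNMA has 7 operands and wraps a 3-operand FNMA.
static_assert (describe_cond_ops (7, true).num_inner_ops == 3,
               "COND_LEN_FNMA wraps a three-operand operation");
// COND_LEN_ADD has 6 operands and wraps a 2-operand ADD.
static_assert (describe_cond_ops (6, true).num_inner_ops == 2,
               "COND_LEN_ADD wraps a two-operand operation");
```

For example, describe_cond_ops (7, true) places ELSE at index 4, LEN at
index 5 and BIAS at index 6, which is why MAX_NUM_OPS has to grow from 5
to 7.
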