From patchwork Fri Sep 8 05:24:15 2023
X-Patchwork-Submitter: Lehua Ding
X-Patchwork-Id: 137696
From: Lehua Ding
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Support folding min(poly,poly) to const
Date: Fri, 8 Sep 2023 13:24:15 +0800
Message-Id: <20230908052415.3307098-1-lehua.ding@rivai.ai>
Cc: richard.sandiford@arm.com, lehua.ding@rivai.ai, juzhe.zhong@rivai.ai

Hi,

This patch adds support for folding `MIN (poly, poly)` to a constant when
one operand is known to be less than or equal to the other.
Consider the following C code:

```
void foo2 (int* restrict a, int* restrict b, int n)
{
  for (int i = 0; i < 3; i += 1)
    a[i] += b[i];
}
```

Before this patch:

```
void foo2 (int * restrict a, int * restrict b, int n)
{
  vector([4,4]) int vect__7.27;
  vector([4,4]) int vect__6.26;
  vector([4,4]) int vect__4.23;
  unsigned long _32;

  [local count: 268435456]:
  _32 = MIN_EXPR <3, POLY_INT_CST [4, 4]>;
  vect__4.23_20 = .MASK_LEN_LOAD (a_11(D), 32B, { -1, ... }, _32, 0);
  vect__6.26_15 = .MASK_LEN_LOAD (b_12(D), 32B, { -1, ... }, _32, 0);
  vect__7.27_9 = vect__6.26_15 + vect__4.23_20;
  .MASK_LEN_STORE (a_11(D), 32B, { -1, ...
}, _32, 0, vect__7.27_9); [tail call]
  return;

}
```

After this patch:

```
void foo2 (int * restrict a, int * restrict b, int n)
{
  vector([4,4]) int vect__7.27;
  vector([4,4]) int vect__6.26;
  vector([4,4]) int vect__4.23;

  [local count: 268435456]:
  vect__4.23_20 = .MASK_LEN_LOAD (a_11(D), 32B, { -1, ... }, 3, 0);
  vect__6.26_15 = .MASK_LEN_LOAD (b_12(D), 32B, { -1, ... }, 3, 0);
  vect__7.27_9 = vect__6.26_15 + vect__4.23_20;
  .MASK_LEN_STORE (a_11(D), 32B, { -1, ... }, 3, 0, vect__7.27_9); [tail call]
  return;

}
```

For RISC-V RVV, this removes the branch and the `vlenb` read that computed
the minimum at run time:

Before this patch:

```
foo2:
        csrr    a4,vlenb
        srli    a4,a4,2
        li      a5,3
        bleu    a5,a4,.L5
        mv      a5,a4
.L5:
        vsetvli zero,a5,e32,m1,ta,ma
        ...
```

After this patch:

```
foo2:
        vsetivli        zero,3,e32,m1,ta,ma
        ...
```

Best,
Lehua

gcc/ChangeLog:

        * fold-const.cc (can_min_p): New function.
        (poly_int_binop): Try to fold MIN_EXPR.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/vls/div-1.c: Adjust.
        * gcc.target/riscv/rvv/autovec/vls/shift-3.c: Adjust.
        * gcc.target/riscv/rvv/autovec/fold-min-poly.c: New test.

---
 gcc/fold-const.cc                             | 33 +++++++++++++++++++
 .../riscv/rvv/autovec/fold-min-poly.c         | 24 ++++++++++++++
 .../gcc.target/riscv/rvv/autovec/vls/div-1.c  |  2 +-
 .../riscv/rvv/autovec/vls/shift-3.c           |  2 +-
 4 files changed, 59 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/fold-min-poly.c

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 1da498a3152..f7f793cc326 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -1213,6 +1213,34 @@ wide_int_binop (wide_int &res,
   return true;
 }
 
+/* Return true if we know which of ARG1 and ARG2 is smaller or equal, and set
+   RES to the min value.  */
+bool
+can_min_p (const_tree arg1, const_tree arg2, poly_wide_int &res)
+{
+  if (tree_fits_poly_int64_p (arg1) && tree_fits_poly_int64_p (arg2))
+    {
+      if (known_le (tree_to_poly_int64 (arg1), tree_to_poly_int64 (arg2)))
+        res = wi::to_poly_wide (arg1);
+      else if (known_le (tree_to_poly_int64 (arg2), tree_to_poly_int64 (arg1)))
+        res = wi::to_poly_wide (arg2);
+      else
+        return false;
+    }
+  else if (tree_fits_poly_uint64_p (arg1) && tree_fits_poly_uint64_p (arg2))
+    {
+      if (known_le (tree_to_poly_uint64 (arg1), tree_to_poly_uint64 (arg2)))
+        res = wi::to_poly_wide (arg1);
+      else if (known_le (tree_to_poly_uint64 (arg2), tree_to_poly_uint64 (arg1)))
+        res = wi::to_poly_wide (arg2);
+      else
+        return false;
+    }
+  else
+    return false;
+  return true;
+}
+
 /* Combine two poly int's ARG1 and ARG2 under operation CODE to produce
    a new constant in RES.  Return FALSE if we don't know how to
    evaluate CODE at compile-time.  */
@@ -1261,6 +1289,11 @@ poly_int_binop (poly_wide_int &res, enum tree_code code,
         return false;
       break;
 
+    case MIN_EXPR:
+      if (!can_min_p (arg1, arg2, res))
+        return false;
+      break;
+
     default:
       return false;
     }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/fold-min-poly.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/fold-min-poly.c
new file mode 100644
index 00000000000..de4c472c76e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/fold-min-poly.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options " -march=rv64gcv_zvl128b -mabi=lp64d -O3 --param riscv-autovec-preference=scalable --param riscv-autovec-lmul=m1 -fno-vect-cost-model" } */
+
+void foo1 (int* restrict a, int* restrict b, int n)
+{
+  for (int i = 0; i < 4; i += 1)
+    a[i] += b[i];
+}
+
+void foo2 (int* restrict a, int* restrict b, int n)
+{
+  for (int i = 0; i < 3; i += 1)
+    a[i] += b[i];
+}
+
+void foo3 (int* restrict a, int* restrict b, int n)
+{
+  for (int i = 0; i < 5; i += 1)
+    a[i] += b[i];
+}
+
+/* { dg-final { scan-assembler-not {\tcsrr\t} } } */
+/* { dg-final { scan-assembler {\tvsetivli\tzero,4,e32,m1,t[au],m[au]} } } */
+/* { dg-final { scan-assembler {\tvsetivli\tzero,3,e32,m1,t[au],m[au]} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/div-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/div-1.c
index f3388a86e38..40224c69458 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/div-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/div-1.c
@@ -55,4 +55,4 @@ DEF_OP_VV (div, 512, int64_t, /)
 
 /* { dg-final { scan-assembler-times {vdivu?\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 42 } } */
 /* TODO: Ideally, we should make sure there is no "csrr vlenb". However, we still have 'csrr vlenb' for some cases since we don't support VLS mode conversion which are needed by division.  */
-/* { dg-final { scan-assembler-times {csrr} 19 } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c
index 98822b15657..b34a349949b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c
@@ -55,4 +55,4 @@ DEF_OP_VV (shift, 512, int64_t, <<)
 
 /* { dg-final { scan-assembler-times {vsll\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 41 } } */
 /* TODO: Ideally, we should make sure there is no "csrr vlenb". However, we still have 'csrr vlenb' for some cases since we don't support VLS mode conversion which are needed by division.  */
-/* { dg-final { scan-assembler-times {csrr} 18 } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
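
Note on the ordering test: the check that `can_min_p` relies on can be sketched in standalone C++. This is an illustration under stated assumptions, not GCC code. `Poly`, `known_le_sketch`, `fold_min_sketch` and `self_check` are hypothetical names, and a poly value is modelled as `c0 + c1 * x` for an unknown integer `x >= 0`, which is how `POLY_INT_CST [4, 4]` is read here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a two-coefficient poly_int64 (not GCC's type):
// the runtime value is c0 + c1 * x for some unknown integer x >= 0.
struct Poly {
  int64_t c0;
  int64_t c1;
};

// True if a <= b for every x >= 0.  Under the model above this holds
// exactly when each coefficient of a is <= the matching coefficient of b.
bool known_le_sketch(const Poly &a, const Poly &b) {
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// Mirrors the shape of can_min_p: set RES and return true only when one
// operand is known to be the minimum for all x; otherwise leave RES alone.
bool fold_min_sketch(const Poly &a, const Poly &b, Poly &res) {
  if (known_le_sketch(a, b))
    res = a;
  else if (known_le_sketch(b, a))
    res = b;
  else
    return false;
  return true;
}

// Usage: the three trip counts from fold-min-poly.c against a vector
// length modelled as [4, 4].
void self_check() {
  const Poly vf = {4, 4};
  Poly res = {0, 0};
  assert(fold_min_sketch({3, 0}, vf, res) && res.c0 == 3 && res.c1 == 0);
  assert(fold_min_sketch({4, 0}, vf, res) && res.c0 == 4 && res.c1 == 0);
  assert(!fold_min_sketch({5, 0}, vf, res));  // 5 > 4 at x = 0, 5 < 8 at x = 1
}
```

In this model a trip count of 3 or 4 folds to a constant, while 5 does not, because 5 exceeds the modelled vector length when `x` is 0 but not when `x` is 1.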