From patchwork Fri Sep  8 08:54:19 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lehua Ding <lehua.ding@rivai.ai>
X-Patchwork-Id: 137708
From: Lehua Ding <lehua.ding@rivai.ai>
To: gcc-patches@gcc.gnu.org,
	richard.sandiford@arm.com
Subject: [PATCH V3] Support folding min(poly,poly) to const
Date: Fri,  8 Sep 2023 16:54:19 +0800
Message-Id: <20230908085419.494384-1-lehua.ding@rivai.ai>
X-Mailer: git-send-email 2.36.3
MIME-Version: 1.0
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Cc: lehua.ding@rivai.ai, juzhe.zhong@rivai.ai

V3 change: Address Richard's comments.

Hi,

This patch adds support for folding `MIN (poly, poly)` to a constant
when one operand is known to be less than or equal to the other for
every runtime vector length. Consider the following C code:

```
void foo2 (int* restrict a, int* restrict b, int n)
{
    for (int i = 0; i < 3; i += 1)
      a[i] += b[i];
}
```

Before this patch:

```
void foo2 (int * restrict a, int * restrict b, int n)
{
  vector([4,4]) int vect__7.27;
  vector([4,4]) int vect__6.26;
  vector([4,4]) int vect__4.23;
  unsigned long _32;

  <bb 2> [local count: 268435456]:
  _32 = MIN_EXPR <3, POLY_INT_CST [4, 4]>;
  vect__4.23_20 = .MASK_LEN_LOAD (a_11(D), 32B, { -1, ... }, _32, 0);
  vect__6.26_15 = .MASK_LEN_LOAD (b_12(D), 32B, { -1, ... }, _32, 0);
  vect__7.27_9 = vect__6.26_15 + vect__4.23_20;
  .MASK_LEN_STORE (a_11(D), 32B, { -1, ... }, _32, 0, vect__7.27_9); [tail call]
  return;

}
```

After this patch:

```
void foo2 (int * restrict a, int * restrict b, int n)
{
  vector([4,4]) int vect__7.27;
  vector([4,4]) int vect__6.26;
  vector([4,4]) int vect__4.23;

  <bb 2> [local count: 268435456]:
  vect__4.23_20 = .MASK_LEN_LOAD (a_11(D), 32B, { -1, ... }, 3, 0);
  vect__6.26_15 = .MASK_LEN_LOAD (b_12(D), 32B, { -1, ... }, 3, 0);
  vect__7.27_9 = vect__6.26_15 + vect__4.23_20;
  .MASK_LEN_STORE (a_11(D), 32B, { -1, ... }, 3, 0, vect__7.27_9); [tail call]
  return;

}
```
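
Why `MIN_EXPR <3, POLY_INT_CST [4, 4]>` can be folded to 3 may be easier to see with a small model. The sketch below is not GCC's `poly_int` implementation; `Poly2`, `known_le_model` and `can_min_model` are hypothetical names used only for illustration. It assumes a poly value has the form `c0 + c1 * x` with a runtime indeterminate `x >= 0`, so that one value is known to be less than or equal to another exactly when both coefficients are.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for a two-coefficient poly_int (an assumption for
// illustration, not GCC's class): value = c0 + c1 * x, with x >= 0 only
// known at run time.
struct Poly2 {
  int64_t c0;  // constant term
  int64_t c1;  // coefficient of the runtime indeterminate x
};

// True if a <= b holds for every x >= 0.  With x unbounded above, this is
// the case exactly when both coefficients of a are <= those of b.
static bool known_le_model(Poly2 a, Poly2 b) {
  return a.c0 <= b.c0 && a.c1 <= b.c1;
}

// Same shape as can_min_p in the patch: succeed only when the ordering is
// known for every x, and store the smaller operand in res.
static bool can_min_model(Poly2 a, Poly2 b, Poly2 &res) {
  if (known_le_model(a, b)) {
    res = a;
    return true;
  }
  if (known_le_model(b, a)) {
    res = b;
    return true;
  }
  return false;  // ordering depends on x, so nothing can be folded
}
```

Under this model, `can_min_model ({3, 0}, {4, 4}, res)` succeeds with `res` equal to 3, because 3 <= 4 + 4x for every x >= 0. `can_min_model ({5, 0}, {4, 4}, res)` fails: 5 > 4 when x is 0 but 5 < 8 when x is 1, so the minimum is not known at compile time.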

For RISC-V RVV, this removes the csrr and branch instructions:

Before this patch:

```
foo2:
        csrr    a4,vlenb
        srli    a4,a4,2
        li      a5,3
        bleu    a5,a4,.L5
        mv      a5,a4
.L5:
        vsetvli zero,a5,e32,m1,ta,ma
        ...
```

After this patch:

```
foo2:
        vsetivli zero,3,e32,m1,ta,ma
        ...
```
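
For reference, the scalar sequence removed above can be read as the following computation. This is a sketch of what the instructions do, not compiler output; `vl_before_patch` is a hypothetical name, and the claim that the result is always 3 assumes VLEN is at least 128 bits (`vlenb >= 16`), which is what `zvl128b` in the new test's `-march` string implies.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Models the pre-patch sequence.  vlenb is the vector register length in
// bytes, as read by "csrr a4,vlenb".
static uint64_t vl_before_patch(uint64_t vlenb) {
  uint64_t elems = vlenb >> 2;           // srli a4,a4,2: 32-bit elements per register
  return std::min<uint64_t>(3, elems);   // li/bleu/mv: a5 = min (3, elems)
}
```

With `vlenb >= 16` the element count is at least 4, so the function returns 3 for every legal vector length and the length can be passed to `vsetivli` as an immediate.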

Best,
Lehua

gcc/ChangeLog:

	* fold-const.cc (can_min_p): New function.
	(poly_int_binop): Try to fold MIN_EXPR.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vls/div-1.c: Adjust.
	* gcc.target/riscv/rvv/autovec/vls/shift-3.c: Adjust.
	* gcc.target/riscv/rvv/autovec/fold-min-poly.c: New test.
---
 gcc/fold-const.cc                             | 24 +++++++++++++++++++
 .../riscv/rvv/autovec/fold-min-poly.c         | 24 +++++++++++++++++++
 .../gcc.target/riscv/rvv/autovec/vls/div-1.c  |  2 +-
 .../riscv/rvv/autovec/vls/shift-3.c           |  2 +-
 4 files changed, 50 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/fold-min-poly.c

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 1da498a3152..d19b4666c65 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -1213,6 +1213,25 @@ wide_int_binop (wide_int &res,
   return true;
 }
 
+/* Return true if ARG1 or ARG2 is known to be less than or equal to the other,
+   and if so store the minimum in RES.  */
+bool
+can_min_p (const_tree arg1, const_tree arg2, poly_wide_int &res)
+{
+  if (known_le (wi::to_poly_widest (arg1), wi::to_poly_widest (arg2)))
+    {
+      res = wi::to_poly_wide (arg1);
+      return true;
+    }
+  else if (known_le (wi::to_poly_widest (arg2), wi::to_poly_widest (arg1)))
+    {
+      res = wi::to_poly_wide (arg2);
+      return true;
+    }
+
+  return false;
+}
+
 /* Combine two poly int's ARG1 and ARG2 under operation CODE to
    produce a new constant in RES.  Return FALSE if we don't know how
    to evaluate CODE at compile-time.  */
@@ -1261,6 +1280,11 @@ poly_int_binop (poly_wide_int &res, enum tree_code code,
 	return false;
       break;
 
+    case MIN_EXPR:
+      if (!can_min_p (arg1, arg2, res))
+	return false;
+      break;
+
     default:
       return false;
     }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/fold-min-poly.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/fold-min-poly.c
new file mode 100644
index 00000000000..de4c472c76e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/fold-min-poly.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options " -march=rv64gcv_zvl128b -mabi=lp64d -O3 --param riscv-autovec-preference=scalable --param riscv-autovec-lmul=m1 -fno-vect-cost-model" } */
+
+void foo1 (int* restrict a, int* restrict b, int n)
+{
+    for (int i = 0; i < 4; i += 1)
+      a[i] += b[i];
+}
+
+void foo2 (int* restrict a, int* restrict b, int n)
+{
+    for (int i = 0; i < 3; i += 1)
+      a[i] += b[i];
+}
+
+void foo3 (int* restrict a, int* restrict b, int n)
+{
+    for (int i = 0; i < 5; i += 1)
+      a[i] += b[i];
+}
+
+/* { dg-final { scan-assembler-not {\tcsrr\t} } } */
+/* { dg-final { scan-assembler {\tvsetivli\tzero,4,e32,m1,t[au],m[au]} } } */
+/* { dg-final { scan-assembler {\tvsetivli\tzero,3,e32,m1,t[au],m[au]} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/div-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/div-1.c
index f3388a86e38..40224c69458 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/div-1.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/div-1.c
@@ -55,4 +55,4 @@ DEF_OP_VV (div, 512, int64_t, /)
 
 /* { dg-final { scan-assembler-times {vdivu?\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 42 } } */
 /* TODO: Ideally, we should make sure there is no "csrr vlenb". However, we still have 'csrr vlenb' for some cases since we don't support VLS mode conversion which are needed by division.  */
-/* { dg-final { scan-assembler-times {csrr} 19 } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c
index 98822b15657..b34a349949b 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/shift-3.c
@@ -55,4 +55,4 @@ DEF_OP_VV (shift, 512, int64_t, <<)
 
 /* { dg-final { scan-assembler-times {vsll\.vv\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 41 } } */
 /* TODO: Ideally, we should make sure there is no "csrr vlenb". However, we still have 'csrr vlenb' for some cases since we don't support VLS mode conversion which are needed by division.  */
-/* { dg-final { scan-assembler-times {csrr} 18 } } */
+/* { dg-final { scan-assembler-not {csrr} } } */