From patchwork Thu Jun 1 03:48:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "juzhe.zhong@rivai.ai" X-Patchwork-Id: 101693 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp28637vqr; Wed, 31 May 2023 20:49:20 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4A4xNq3lVoRv/lLzt+BAJDzRxqa0hIvkrzHSR+A2511m9b9JD3N4fAmKC5684Z1vrtxgHt X-Received: by 2002:a17:907:983:b0:973:92a8:f611 with SMTP id bf3-20020a170907098300b0097392a8f611mr604677ejc.31.1685591360320; Wed, 31 May 2023 20:49:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685591360; cv=none; d=google.com; s=arc-20160816; b=SRPVbPvwiJjuBV9hYy3DUq29W7rQfVgke4AOkyAeDd67R7Z2fVTNzt8d/nLOv3KQED abN02WcyPhI/EXV9K9BSDlHZqQYlBGHIvFMtyNlrbcMZ/l6U1N1wlM+myhHUMocdbEWG vR1Nsm/p7IwRJlCTpdfc3+Mt+gVZiTjJ+6FTrP0Xu/HxPUyfMLLzQWc1vdFG8IqVmFpx YThTjTWfNG6KLxNCnQ+sugY3PpWE3AWy3mOKV+vPF1Q6ej8nb725HfJnyFmbDjSgsSQk e1mkTOmT43pbk46xNMJ2pd6y+Lk3m+clU0Z8RrTcBtF+j61zzCVHOL7pC+HcF1NxArPW YhUw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:feedback-id :content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:dmarc-filter:delivered-to; bh=DnJLYGB8QJKSOO7aiOY1dVWWaZUOKG+ayLlWm4x9KJU=; b=M5a5QR6vs2y7P7Af3CmQfmPDNwxeCGhfTlUh81O2NfHtYEUQGwS+I+cjkSt6SU5LaF XtcIfogfE3b9RiyKdL8TKk4Q8XjvO5jY5wDjYrNy/4FEGCmr7vyeK8oteq/K7xL+5HcO b3MKPbbpZIZuTB8uP09juTNOuuxTAxo3rfSLRXEW/J2rNws/+T7KBozbO97iL5CNyOoK J55HhsFC4sA3gJzUDxLE4MfcUvvjlD1TI7NfmrHxzi0Y/Xc3PMyl4AHxXwmaLQ7KGXMi xZFYGecGAjcl7ZuN854EbZ1Io3UHR2BexQbFqrEwZ+elKoPeotNYnXQ5WyMXzghSK3rW V9eg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id jr18-20020a170906515200b0096f80e0a97esi773038ejc.559.2023.05.31.20.49.19 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 31 May 2023 20:49:20 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A28A03857738 for ; Thu, 1 Jun 2023 03:49:08 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbgbr1.qq.com (smtpbgbr1.qq.com [54.207.19.206]) by sourceware.org (Postfix) with ESMTPS id F2D793858D20 for ; Thu, 1 Jun 2023 03:48:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org F2D793858D20 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp66t1685591305tcydvq8i Received: from server1.localdomain ( [58.60.1.22]) by bizesmtp.qq.com (ESMTP) with id ; Thu, 01 Jun 2023 11:48:24 +0800 (CST) X-QQ-SSF: 01400000000000F0R000000A0000000 X-QQ-FEAT: swJM8EmkKsnpV3rCyj4m2LXwNKguOQtQtbW1rRsRJaqYI+syAtz7LYO4haFsr zq2BqI2Q44uFk2JbAMFtkny1YB70+wxtaO/Iph9+wKT1Eq9gfDgm077IuASPYG2xDwu3uZn OHjM3NEaXlDFWyI04CfRXk+wlQzAqZyrxISMdwktreYnT3NO5BVp1ZSx88ebxWN5sdgSFLY 9i+rL/tKE3Kxew6CSfiRkQHt5DjOyvu8vlfRW1wkyalm50oh4Qh5BY2g1dsraN7nOeZ6efN Qh6hJIOwmB98rwfykGJUlmr7AXrccP9563vlbvEwGyIvruiGL87+afaPo3DXOHPgVRySLOg IDQNC4VWQNYHFTwVnRLXcBhkMiopuS0IaD4XjphIWgNglMPoY8DZS3nWqHK0sc7Qq/nWIpP o42j40vVLE4= X-QQ-GoodBg: 2 X-BIZMAIL-ID: 17032503745498053710 From: juzhe.zhong@rivai.ai To: gcc-patches@gcc.gnu.org Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, palmer@dabbelt.com, palmer@rivosinc.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong Subject: [PATCH] RISC-V: Add vwadd.wv/vwsub.wv auto-vectorization lowering optimization Date: Thu, 1 Jun 2023 11:48:23 +0800 Message-Id: <20230601034823.235258-1-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.1 MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybglogicsvrgz:qybglogicsvrgz7a-one-0 X-Spam-Status: No, score=-10.3 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_BARRACUDACENTRAL, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H5, RCVD_IN_MSPIKE_WL, SPF_HELO_PASS, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1767470646367033452?= X-GMAIL-MSGID: =?utf-8?q?1767470646367033452?= From: Juzhe-Zhong 1. This patch optimize the codegen of the following auto-vectorization codes: void foo (int32_t * __restrict a, int64_t * __restrict b, int64_t * __restrict c, int n) { for (int i = 0; i < n; i++) c[i] = (int64_t)a[i] + b[i]; } Combine instruction from: ... vsext.vf2 vadd.vv ... into: ... vwadd.wv ... Since for PLUS operation, GCC prefer the following RTL operand order when combining: (plus: (sign_extend:..) (reg:) instead of (plus: (reg:..) (sign_extend:) which is different from MINUS pattern. I split patterns of vwadd/vwsub, and add dedicated patterns for them. 2. This patch not only optimize the case as above (1) mentioned, also enhance vwadd.vv/vwsub.vv optimization for complicate PLUS/MINUS codes, consider this following codes: __attribute__ ((noipa)) void vwadd_int16_t_int8_t (int16_t *__restrict dst, int16_t *__restrict dst2, int16_t *__restrict dst3, int8_t *__restrict a, int8_t *__restrict b, int8_t *__restrict a2, int8_t *__restrict b2, int n) { for (int i = 0; i < n; i++) { dst[i] = (int16_t) a[i] + (int16_t) b[i]; dst2[i] = (int16_t) a2[i] + (int16_t) b[i]; dst3[i] = (int16_t) a2[i] + (int16_t) a[i]; } } Before this patch: ... vsetvli zero,a6,e8,mf2,ta,ma vle8.v v2,0(a3) vle8.v v1,0(a4) vsetvli t1,zero,e16,m1,ta,ma vsext.vf2 v3,v2 vsext.vf2 v2,v1 vadd.vv v1,v2,v3 vsetvli zero,a6,e16,m1,ta,ma vse16.v v1,0(a0) vle8.v v4,0(a5) vsetvli t1,zero,e16,m1,ta,ma vsext.vf2 v1,v4 vadd.vv v2,v1,v2 ... After this patch: ... vsetvli zero,a6,e8,mf2,ta,ma vle8.v v3,0(a4) vle8.v v1,0(a3) vsetvli t4,zero,e8,mf2,ta,ma vwadd.vv v2,v1,v3 vsetvli zero,a6,e16,m1,ta,ma vse16.v v2,0(a0) vle8.v v2,0(a5) vsetvli t4,zero,e8,mf2,ta,ma vwadd.vv v4,v3,v2 vsetvli zero,a6,e16,m1,ta,ma vse16.v v4,0(a1) vsetvli t4,zero,e8,mf2,ta,ma sub a7,a7,a6 vwadd.vv v3,v2,v1 vsetvli zero,a6,e16,m1,ta,ma vse16.v v3,0(a2) ... The reason why current upstream GCC can not optimize codes using vwadd thoroughly is combine PASS needs intermediate RTL IR (extend one of the operand pattern (vwadd.wv)), then base on this intermediate RTL IR, extend the other operand to generate vwadd.vv. So vwadd.wv/vwsub.wv definitely helps to vwadd.vv/vwsub.vv code optimizations. gcc/ChangeLog: * config/riscv/riscv-vector-builtins-bases.cc: Change vwadd.wv/vwsub.wv intrinsic API expander * config/riscv/vector.md (@pred_single_widen_): Remove it. (@pred_single_widen_sub): New pattern. (@pred_single_widen_add): New pattern. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/widen/widen-5.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen-6.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen-complicate-1.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen-complicate-2.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen_run-5.c: New test. * gcc.target/riscv/rvv/autovec/widen/widen_run-6.c: New test. --- .../riscv/riscv-vector-builtins-bases.cc | 8 +++-- gcc/config/riscv/vector.md | 29 +++++++++++++--- .../riscv/rvv/autovec/widen/widen-5.c | 27 +++++++++++++++ .../riscv/rvv/autovec/widen/widen-6.c | 27 +++++++++++++++ .../rvv/autovec/widen/widen-complicate-1.c | 31 +++++++++++++++++ .../rvv/autovec/widen/widen-complicate-2.c | 31 +++++++++++++++++ .../riscv/rvv/autovec/widen/widen_run-5.c | 34 +++++++++++++++++++ .../riscv/rvv/autovec/widen/widen_run-6.c | 34 +++++++++++++++++++ 8 files changed, 215 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-6.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-5.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-6.c diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc index a8113f6602b..3f92084929d 100644 --- a/gcc/config/riscv/riscv-vector-builtins-bases.cc +++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc @@ -361,8 +361,12 @@ public: return e.use_exact_insn ( code_for_pred_dual_widen_scalar (CODE1, CODE2, e.vector_mode ())); case OP_TYPE_wv: - return e.use_exact_insn ( - code_for_pred_single_widen (CODE1, CODE2, e.vector_mode ())); + if (CODE1 == PLUS) + return e.use_exact_insn ( + code_for_pred_single_widen_add (CODE2, e.vector_mode ())); + else + return e.use_exact_insn ( + code_for_pred_single_widen_sub (CODE2, e.vector_mode ())); case OP_TYPE_wx: return e.use_exact_insn ( code_for_pred_single_widen_scalar (CODE1, CODE2, e.vector_mode ())); diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index cd41ebbb24f..c74dce89db6 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -3131,7 +3131,7 @@ [(set_attr "type" "vi") (set_attr "mode" "")]) -(define_insn "@pred_single_widen_" +(define_insn "@pred_single_widen_sub" [(set (match_operand:VWEXTI 0 "register_operand" "=&vr,&vr") (if_then_else:VWEXTI (unspec: @@ -3142,14 +3142,35 @@ (match_operand 8 "const_int_operand" " i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) - (plus_minus:VWEXTI + (minus:VWEXTI (match_operand:VWEXTI 3 "register_operand" " vr, vr") (any_extend:VWEXTI (match_operand: 4 "register_operand" " vr, vr"))) (match_operand:VWEXTI 2 "vector_merge_operand" " vu, 0")))] "TARGET_VECTOR" - "vw.wv\t%0,%3,%4%p1" - [(set_attr "type" "vi") + "vwsub.wv\t%0,%3,%4%p1" + [(set_attr "type" "viwalu") + (set_attr "mode" "")]) + +(define_insn "@pred_single_widen_add" + [(set (match_operand:VWEXTI 0 "register_operand" "=&vr,&vr") + (if_then_else:VWEXTI + (unspec: + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") + (match_operand 5 "vector_length_operand" " rK, rK") + (match_operand 6 "const_int_operand" " i, i") + (match_operand 7 "const_int_operand" " i, i") + (match_operand 8 "const_int_operand" " i, i") + (reg:SI VL_REGNUM) + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) + (plus:VWEXTI + (any_extend:VWEXTI + (match_operand: 4 "register_operand" " vr, vr")) + (match_operand:VWEXTI 3 "register_operand" " vr, vr")) + (match_operand:VWEXTI 2 "vector_merge_operand" " vu, 0")))] + "TARGET_VECTOR" + "vwadd.wv\t%0,%3,%4%p1" + [(set_attr "type" "viwalu") (set_attr "mode" "")]) (define_insn "@pred_single_widen__scalar" diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-5.c new file mode 100644 index 00000000000..7f8909272ab --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-5.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */ + +#include + +#define TEST_TYPE(TYPE1, TYPE2) \ + __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 (TYPE1 *__restrict dst, \ + TYPE2 *__restrict a, \ + TYPE1 *__restrict b, \ + int n) \ + { \ + for (int i = 0; i < n; i++) \ + dst[i] = (TYPE1) a[i] + b[i]; \ + } + +#define TEST_ALL() \ + TEST_TYPE (int16_t, int8_t) \ + TEST_TYPE (uint16_t, uint8_t) \ + TEST_TYPE (int32_t, int16_t) \ + TEST_TYPE (uint32_t, uint16_t) \ + TEST_TYPE (int64_t, int32_t) \ + TEST_TYPE (uint64_t, uint32_t) + +TEST_ALL () + +/* { dg-final { scan-assembler-times {\tvwadd\.wv} 3 } } */ +/* { dg-final { scan-assembler-times {\tvwaddu\.wv} 3 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-6.c new file mode 100644 index 00000000000..f9542d7601e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-6.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */ + +#include + +#define TEST_TYPE(TYPE1, TYPE2) \ + __attribute__ ((noipa)) void vwsub_##TYPE1_##TYPE2 (TYPE1 *__restrict dst, \ + TYPE1 *__restrict a, \ + TYPE2 *__restrict b, \ + int n) \ + { \ + for (int i = 0; i < n; i++) \ + dst[i] = a[i] - (TYPE1) b[i]; \ + } + +#define TEST_ALL() \ + TEST_TYPE (int16_t, int8_t) \ + TEST_TYPE (uint16_t, uint8_t) \ + TEST_TYPE (int32_t, int16_t) \ + TEST_TYPE (uint32_t, uint16_t) \ + TEST_TYPE (int64_t, int32_t) \ + TEST_TYPE (uint64_t, uint32_t) + +TEST_ALL () + +/* { dg-final { scan-assembler-times {\tvwsub\.wv} 3 } } */ +/* { dg-final { scan-assembler-times {\tvwsubu\.wv} 3 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-1.c new file mode 100644 index 00000000000..baf91b72d60 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-1.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */ + +#include + +#define TEST_TYPE(TYPE1, TYPE2) \ + __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 ( \ + TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3, \ + TYPE2 *__restrict a, TYPE2 *__restrict b, TYPE2 *__restrict a2, \ + TYPE2 *__restrict b2, int n) \ + { \ + for (int i = 0; i < n; i++) \ + { \ + dst[i] = (TYPE1) a[i] + (TYPE1) b[i]; \ + dst2[i] = (TYPE1) a2[i] + (TYPE1) b[i]; \ + dst3[i] = (TYPE1) a2[i] + (TYPE1) a[i]; \ + } \ + } + +#define TEST_ALL() \ + TEST_TYPE (int16_t, int8_t) \ + TEST_TYPE (uint16_t, uint8_t) \ + TEST_TYPE (int32_t, int16_t) \ + TEST_TYPE (uint32_t, uint16_t) \ + TEST_TYPE (int64_t, int32_t) \ + TEST_TYPE (uint64_t, uint32_t) + +TEST_ALL () + +/* { dg-final { scan-assembler-times {\tvwadd\.vv} 9 } } */ +/* { dg-final { scan-assembler-times {\tvwaddu\.vv} 9 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-2.c new file mode 100644 index 00000000000..6d1709bf2f6 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-2.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable" } */ + +#include + +#define TEST_TYPE(TYPE1, TYPE2) \ + __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 ( \ + TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3, \ + TYPE2 *__restrict a, TYPE2 *__restrict b, TYPE2 *__restrict a2, \ + TYPE2 *__restrict b2, int n) \ + { \ + for (int i = 0; i < n; i++) \ + { \ + dst[i] = (TYPE1) a[i] - (TYPE1) b[i]; \ + dst2[i] = (TYPE1) a2[i] - (TYPE1) b[i]; \ + dst3[i] = (TYPE1) a2[i] - (TYPE1) a[i]; \ + } \ + } + +#define TEST_ALL() \ + TEST_TYPE (int16_t, int8_t) \ + TEST_TYPE (uint16_t, uint8_t) \ + TEST_TYPE (int32_t, int16_t) \ + TEST_TYPE (uint32_t, uint16_t) \ + TEST_TYPE (int64_t, int32_t) \ + TEST_TYPE (uint64_t, uint32_t) + +TEST_ALL () + +/* { dg-final { scan-assembler-times {\tvwsub\.vv} 9 } } */ +/* { dg-final { scan-assembler-times {\tvwsubu\.vv} 9 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-5.c new file mode 100644 index 00000000000..ca16585a945 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-5.c @@ -0,0 +1,34 @@ +/* { dg-do run { target { riscv_vector } } } */ +/* { dg-additional-options "--param=riscv-autovec-preference=scalable" } */ + +#include +#include "widen-5.c" + +#define SZ 512 + +#define RUN(TYPE1, TYPE2, LIMIT) \ + TYPE2 a##TYPE2[SZ]; \ + TYPE1 b##TYPE1[SZ]; \ + TYPE1 dst##TYPE1[SZ]; \ + for (int i = 0; i < SZ; i++) \ + { \ + a##TYPE2[i] = LIMIT + i % 8723; \ + b##TYPE1[i] = LIMIT + i & 1964; \ + } \ + vwadd_##TYPE1_##TYPE2 (dst##TYPE1, a##TYPE2, b##TYPE1, SZ); \ + for (int i = 0; i < SZ; i++) \ + assert (dst##TYPE1[i] == ((TYPE1) a##TYPE2[i] + (TYPE1) b##TYPE1[i])); + +#define RUN_ALL() \ + RUN (int16_t, int8_t, -128) \ + RUN (uint16_t, uint8_t, 255) \ + RUN (int32_t, int16_t, -32768) \ + RUN (uint32_t, uint16_t, 65535) \ + RUN (int64_t, int32_t, -2147483648) \ + RUN (uint64_t, uint32_t, 4294967295) + +int +main () +{ + RUN_ALL () +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-6.c new file mode 100644 index 00000000000..5b69c2ab0c6 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-6.c @@ -0,0 +1,34 @@ +/* { dg-do run { target { riscv_vector } } } */ +/* { dg-additional-options "--param=riscv-autovec-preference=scalable" } */ + +#include +#include "widen-6.c" + +#define SZ 512 + +#define RUN(TYPE1, TYPE2, LIMIT) \ + TYPE1 a##TYPE1[SZ]; \ + TYPE2 b##TYPE2[SZ]; \ + TYPE1 dst##TYPE1[SZ]; \ + for (int i = 0; i < SZ; i++) \ + { \ + a##TYPE1[i] = LIMIT + i % 8723; \ + b##TYPE2[i] = LIMIT + i & 1964; \ + } \ + vwsub_##TYPE1_##TYPE2 (dst##TYPE1, a##TYPE1, b##TYPE2, SZ); \ + for (int i = 0; i < SZ; i++) \ + assert (dst##TYPE1[i] == ((TYPE1) a##TYPE1[i] - (TYPE1) b##TYPE2[i])); + +#define RUN_ALL() \ + RUN (int16_t, int8_t, -128) \ + RUN (uint16_t, uint8_t, 255) \ + RUN (int32_t, int16_t, -32768) \ + RUN (uint32_t, uint16_t, 65535) \ + RUN (int64_t, int32_t, -2147483648) \ + RUN (uint64_t, uint32_t, 4294967295) + +int +main () +{ + RUN_ALL () +}