From patchwork Fri Sep 22 03:00:26 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lehua Ding
X-Patchwork-Id: 143124
From: Lehua Ding
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, rdapp.gcc@gmail.com,
 palmer@rivosinc.com, jeffreyalaw@gmail.com, lehua.ding@rivai.ai
Subject: [COMMITTED V4] RISC-V: Support combine cond extend and reduce sum to widen reduce sum
Date: Fri, 22 Sep 2023 11:00:26 +0800
Message-Id: <20230922030026.197559-1-lehua.ding@rivai.ai>
X-Mailer: git-send-email 2.36.3
List-Id: Gcc-patches mailing list

This patch supports combining a conditional extend and a reduce_sum into
a conditional widening reduce_sum. For example, it combines the following
three insns:

(set (reg:RVVM2HI 149)
     (if_then_else:RVVM2HI
       (unspec:RVVMF8BI [
         (const_vector:RVVMF8BI repeat [
           (const_int 1 [0x1])
         ])
         (reg:DI 146)
         (const_int 2 [0x2]) repeated x2
         (const_int 1 [0x1])
         (reg:SI 66 vl)
         (reg:SI 67 vtype)
       ] UNSPEC_VPREDICATE)
       (const_vector:RVVM2HI repeat [
         (const_int 0 [0])
       ])
       (unspec:RVVM2HI [
         (reg:SI 0 zero)
       ] UNSPEC_VUNDEF)))
(set (reg:RVVM2HI 138)
     (if_then_else:RVVM2HI
       (reg:RVVMF8BI 135)
       (reg:RVVM2HI 148)
       (reg:RVVM2HI 149)))
(set (reg:HI 150)
     (unspec:HI [
       (reg:RVVM2HI 138)
     ] UNSPEC_REDUC_SUM))

into one insn:

(set (reg:SI 147)
     (unspec:SI [
       (if_then_else:RVVM2SI
         (reg:RVVMF16BI 135)
         (sign_extend:RVVM2SI (reg:RVVM1HI 136))
         (if_then_else:RVVM2HI
           (unspec:RVVMF8BI [
             (const_vector:RVVMF8BI repeat [
               (const_int 1 [0x1])
             ])
             (reg:DI 146)
             (const_int 2 [0x2]) repeated x2
             (const_int 1 [0x1])
             (reg:SI 66 vl)
             (reg:SI 67 vtype)
           ] UNSPEC_VPREDICATE)
           (const_vector:RVVM2HI repeat [
             (const_int 0 [0])
           ])
           (unspec:RVVM2HI [
             (reg:SI 0 zero)
           ] UNSPEC_VUNDEF)))
     ] UNSPEC_REDUC_SUM))

Consider the following C code:

int16_t foo (int8_t *restrict a, int8_t *restrict pred)
{
  int16_t sum = 0;
  for (int i = 0; i < 16; i += 1)
    if (pred[i])
      sum += a[i];
  return sum;
}

Assembly before this patch:

foo:
        vsetivli        zero,16,e16,m2,ta,ma
        li      a5,0
        vmv.v.i v2,0
        vsetvli zero,zero,e8,m1,ta,ma
        vl1re8.v        v0,0(a1)
        vmsne.vi        v0,v0,0
        vsetvli zero,zero,e16,m2,ta,mu
        vle8.v  v4,0(a0),v0.t
        vmv.s.x v1,a5
        vsext.vf2       v2,v4,v0.t
        vredsum.vs      v2,v2,v1
        vmv.x.s a0,v2
        slliw   a0,a0,16
        sraiw   a0,a0,16
        ret

Assembly after this patch:

foo:
        li      a5,0
        vsetivli        zero,16,e16,m1,ta,ma
        vmv.s.x v3,a5
        vsetivli        zero,16,e8,m1,ta,ma
        vl1re8.v        v0,0(a1)
        vmsne.vi        v0,v0,0
        vle8.v  v2,0(a0),v0.t
        vwredsum.vs     v1,v2,v3,v0.t
        vsetivli        zero,0,e16,m1,ta,ma
        vmv.x.s a0,v1
        slliw   a0,a0,16
        sraiw   a0,a0,16
        ret

gcc/ChangeLog:

	* config/riscv/autovec-opt.md (*cond_widen_reduc_plus_scal_):
	New combine patterns.
	* config/riscv/riscv-protos.h (enum insn_type): New insn_type.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-2.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-2.c: New test.
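
For readers less familiar with RVV, the effect of the fused masked
vwredsum.vs on the int8_t example above can be modeled in portable scalar
C. This is only an illustrative sketch: model_masked_wredsum_i8 and its
one-byte-per-element mask are hypothetical simplifications introduced
here (the real v0.t mask is one bit per element), and none of this is
part of the patch or of GCC.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a masked widening reduction sum for
   int8_t sources and an int16_t accumulator.  Elements whose mask entry
   is zero are skipped, which has the same effect as first extending
   under the mask with a zero merge value and then summing everything.  */
static int16_t
model_masked_wredsum_i8 (const int8_t *src, const uint8_t *mask, size_t vl,
                         int16_t init)
{
  /* Accumulate as unsigned so 16-bit wraparound is well defined.  */
  uint16_t acc = (uint16_t) init;
  for (size_t i = 0; i < vl; i++)
    if (mask[i])
      /* Sign-extend the 8-bit element to 16 bits before adding.  */
      acc = (uint16_t) (acc + (uint16_t) (int16_t) src[i]);
  /* Converting back to int16_t wraps modulo 2^16 with GCC.  */
  return (int16_t) acc;
}

/* The loop from the commit message, kept for comparison.  */
static int16_t
foo (int8_t *restrict a, int8_t *restrict pred)
{
  int16_t sum = 0;
  for (int i = 0; i < 16; i += 1)
    if (pred[i])
      sum += a[i];
  return sum;
}
```

When mask[i] is set exactly where pred[i] is nonzero, both functions
should return the same value. Inactive elements contribute nothing to
the sum, which is why the pattern requires the merge operand of the
masked extend to be a zero vector.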
---
 gcc/config/riscv/autovec-opt.md               | 72 +++++++++++++++++++
 gcc/config/riscv/riscv-protos.h               |  1 +
 .../rvv/autovec/cond/cond_widen_reduc-1.c     | 30 ++++++++
 .../rvv/autovec/cond/cond_widen_reduc-2.c     | 30 ++++++++
 .../rvv/autovec/cond/cond_widen_reduc_run-1.c | 28 ++++++++
 .../rvv/autovec/cond/cond_widen_reduc_run-2.c | 28 ++++++++
 6 files changed, 189 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-2.c

--
2.36.3

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index a97a095691c..ed9c0777eb9 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -1119,6 +1119,78 @@
   }
   [(set_attr "type" "vfwmuladd")])
 
+;; Combine mask_extend + vredsum to mask_vwredsum[u]
+;; where the merge of mask_extend is vector const 0
+(define_insn_and_split "*cond_widen_reduc_plus_scal_"
+  [(set (match_operand: 0 "register_operand")
+        (unspec: [
+          (if_then_else:
+            (match_operand: 1 "register_operand")
+            (any_extend:
+              (match_operand:VI_QHS_NO_M8 2 "register_operand"))
+            (if_then_else:
+              (unspec: [
+                (match_operand: 3 "vector_all_trues_mask_operand")
+                (match_operand 6 "vector_length_operand")
+                (match_operand 7 "const_int_operand")
+                (match_operand 8 "const_int_operand")
+                (match_operand 9 "const_1_or_2_operand")
+                (reg:SI VL_REGNUM)
+                (reg:SI VTYPE_REGNUM)
+              ] UNSPEC_VPREDICATE)
+              (match_operand: 5 "vector_const_0_operand")
+              (match_operand: 4 "vector_merge_operand")))
+        ] UNSPEC_REDUC_SUM))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  rtx ops[] = {operands[0], operands[2], operands[1],
+               gen_int_mode (GET_MODE_NUNITS (mode), Pmode)};
+  riscv_vector::expand_reduction (,
+                                  riscv_vector::REDUCE_OP_M,
+                                  ops, CONST0_RTX (mode));
+  DONE;
+}
+[(set_attr "type" "vector")])
+
+;; Combine mask_extend + vfredsum to mask_vfwredusum
+;; where the merge of mask_extend is vector const 0
+(define_insn_and_split "*cond_widen_reduc_plus_scal_"
+  [(set (match_operand: 0 "register_operand")
+        (unspec: [
+          (if_then_else:
+            (match_operand: 1 "register_operand")
+            (float_extend:
+              (match_operand:VF_HS_NO_M8 2 "register_operand"))
+            (if_then_else:
+              (unspec: [
+                (match_operand: 3 "vector_all_trues_mask_operand")
+                (match_operand 6 "vector_length_operand")
+                (match_operand 7 "const_int_operand")
+                (match_operand 8 "const_int_operand")
+                (match_operand 9 "const_1_or_2_operand")
+                (reg:SI VL_REGNUM)
+                (reg:SI VTYPE_REGNUM)
+              ] UNSPEC_VPREDICATE)
+              (match_operand: 5 "vector_const_0_operand")
+              (match_operand: 4 "vector_merge_operand")))
+        ] UNSPEC_REDUC_SUM_UNORDERED))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  rtx ops[] = {operands[0], operands[2], operands[1],
+               gen_int_mode (GET_MODE_NUNITS (mode), Pmode)};
+  riscv_vector::expand_reduction (UNSPEC_WREDUC_SUM_UNORDERED,
+                                  riscv_vector::REDUCE_OP_M_FRM_DYN,
+                                  ops, CONST0_RTX (mode));
+  DONE;
+}
+[(set_attr "type" "vector")])
+
 ;; =============================================================================
 ;; Misc combine patterns
 ;; =============================================================================
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index d8372a7886f..3eec72b6703 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -341,6 +341,7 @@ enum insn_type : unsigned int
 
   /* For vreduce, no mask policy operand.  */
   REDUCE_OP = __NORMAL_OP_TA | BINARY_OP_P | VTYPE_MODE_FROM_OP1_P,
+  REDUCE_OP_M = __MASK_OP_TA | BINARY_OP_P | VTYPE_MODE_FROM_OP1_P,
   REDUCE_OP_FRM_DYN = REDUCE_OP | FRM_DYN_P | VTYPE_MODE_FROM_OP1_P,
   REDUCE_OP_M_FRM_DYN
   = __MASK_OP_TA | BINARY_OP_P | FRM_DYN_P | VTYPE_MODE_FROM_OP1_P,
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c
new file mode 100644
index 00000000000..22a71048684
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh_zvl128b -mabi=lp64d --param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m2 -fno-vect-cost-model -ffast-math" } */
+#include
+
+#define TEST_TYPE(TYPE1, TYPE2, N) \
+  __attribute__ ((noipa)) \
+  TYPE1 reduc_##TYPE1##_##TYPE2 (TYPE2 *restrict a, TYPE2 *restrict pred) \
+  { \
+    TYPE1 sum = 0; \
+    for (int i = 0; i < N; i += 1) \
+      if (pred[i]) \
+        sum += a[i]; \
+    return sum; \
+  }
+
+#define TEST_ALL(TEST) \
+  TEST (int16_t, int8_t, 16) \
+  TEST (int32_t, int16_t, 8) \
+  TEST (int64_t, int32_t, 4) \
+  TEST (uint16_t, uint8_t, 16) \
+  TEST (uint32_t, uint16_t, 8) \
+  TEST (uint64_t, uint32_t, 4) \
+  TEST (float, _Float16, 8) \
+  TEST (double, float, 4)
+
+TEST_ALL (TEST_TYPE)
+
+/* { dg-final { scan-assembler-times {\tvfwredusum\.vs\tv[0-9]+,v[0-9]+,v[0-9]+,v0\.t} 2 } } */
+/* { dg-final { scan-assembler-times {\tvwredsum\.vs\tv[0-9]+,v[0-9]+,v[0-9]+,v0\.t} 3 } } */
+/* { dg-final { scan-assembler-times {\tvwredsumu\.vs\tv[0-9]+,v[0-9]+,v[0-9]+,v0\.t} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-2.c
new file mode 100644
index 00000000000..7c8fedd072b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-2.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv64gcv_zvfh_zvl128b -mabi=lp64d --param riscv-autovec-preference=scalable --param riscv-autovec-lmul=m2 -fno-vect-cost-model -ffast-math" } */
+#include
+
+#define TEST_TYPE(TYPE1, TYPE2, N) \
+  __attribute__ ((noipa)) \
+  TYPE1 reduc_##TYPE1##_##TYPE2 (TYPE2 *restrict a, TYPE2 *restrict pred) \
+  { \
+    TYPE1 sum = 0; \
+    for (int i = 0; i < N; i += 1) \
+      if (pred[i]) \
+        sum += a[i]; \
+    return sum; \
+  }
+
+#define TEST_ALL(TEST) \
+  TEST (int16_t, int8_t, 16) \
+  TEST (int32_t, int16_t, 8) \
+  TEST (int64_t, int32_t, 4) \
+  TEST (uint16_t, uint8_t, 16) \
+  TEST (uint32_t, uint16_t, 8) \
+  TEST (uint64_t, uint32_t, 4) \
+  TEST (float, _Float16, 8) \
+  TEST (double, float, 4)
+
+TEST_ALL (TEST_TYPE)
+
+/* { dg-final { scan-assembler-times {\tvfwredusum\.vs\tv[0-9]+,v[0-9]+,v[0-9]+,v0\.t} 2 } } */
+/* { dg-final { scan-assembler-times {\tvwredsum\.vs\tv[0-9]+,v[0-9]+,v[0-9]+,v0\.t} 3 } } */
+/* { dg-final { scan-assembler-times {\tvwredsumu\.vs\tv[0-9]+,v[0-9]+,v[0-9]+,v0\.t} 3 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c
new file mode 100644
index 00000000000..228df0959b6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c
@@ -0,0 +1,28 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m2 -fno-vect-cost-model -ffast-math" } */
+
+#include "cond_widen_reduc-1.c"
+
+#define RUN(TYPE1, TYPE2, N) \
+  { \
+    TYPE2 a[N]; \
+    TYPE2 pred[N]; \
+    TYPE1 r = 0; \
+    for (int i = 0; i < N; i++) \
+      { \
+        a[i] = (i * 0.1) * (i & 1 ? 1 : -1); \
+        pred[i] = i % 3; \
+        if (pred[i]) \
+          r += a[i]; \
+        asm volatile ("" ::: "memory"); \
+      } \
+    if (r != reduc_##TYPE1##_##TYPE2 (a, pred)) \
+      __builtin_abort (); \
+  }
+
+int __attribute__ ((optimize (1)))
+main ()
+{
+  TEST_ALL (RUN)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-2.c
new file mode 100644
index 00000000000..2bf0f5fffda
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-2.c
@@ -0,0 +1,28 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param riscv-autovec-preference=scalable --param riscv-autovec-lmul=m2 -fno-vect-cost-model -ffast-math" } */
+
+#include "cond_widen_reduc-2.c"
+
+#define RUN(TYPE1, TYPE2, N) \
+  { \
+    TYPE2 a[N]; \
+    TYPE2 pred[N]; \
+    TYPE1 r = 0; \
+    for (int i = 0; i < N; i++) \
+      { \
+        a[i] = (i * 0.1) * (i & 1 ? 1 : -1); \
+        pred[i] = i % 3; \
+        if (pred[i]) \
+          r += a[i]; \
+        asm volatile ("" ::: "memory"); \
+      } \
+    if (r != reduc_##TYPE1##_##TYPE2 (a, pred)) \
+      __builtin_abort (); \
+  }
+
+int __attribute__ ((optimize (1)))
+main ()
+{
+  TEST_ALL (RUN)
+  return 0;
+}