From patchwork Wed Sep 20 07:57:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Lehua Ding X-Patchwork-Id: 142274 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id h50csp3954664vqi; Wed, 20 Sep 2023 00:58:34 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEVA8vYMhyAhR0FajA1RPR6xrqxaomqvib4nnA62E2iGCY4ZMEii0SdVp30GbFf/8PIM8Zt X-Received: by 2002:a17:907:7611:b0:9ae:4ead:6c06 with SMTP id jx17-20020a170907761100b009ae4ead6c06mr492220ejc.30.1695196714206; Wed, 20 Sep 2023 00:58:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695196714; cv=none; d=google.com; s=arc-20160816; b=v0tCgHPYPOknGRzN7JoCdGyLSYRdEer1McrbnL4T799sXE0/BLqfC9Dmxke14Vi89z +5teUIqL+LEIhnJSRGW6nq/AGr7GWfdQwZGXj9Dj1gqnsmvgmofQUGm6Q0DYq7XpSSR1 JkniYD7/s390QcncuIZ0xWR3xAXo6blWDsbp8WDUqpzIEIyssyKWRi0CmMRly7F0ghtH I562FZUjLcTGTEvGpO4KosTZm38PZp5mQu/70Z3Ua71oLBFbc6V17jRvRVPmA58rknKc D1E2iqHdQP9rfUV3F76ZEaen4AdfTX2nT1yzlx7qbd/e5nBZD+I1npPovCNFio9Dl3FV uJbA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:feedback-id :content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:dmarc-filter:delivered-to; bh=kFM90izw4qyf1tj7yyEqQB9f/4GLbxayyiySozGVpp8=; fh=x/Q0OlwHuvCZ3FpkiZPiUSvevOYVxUAi4aNnf76mUPQ=; b=eC5mOU7NCYhmznc4slL4suz2aoO/aAbDXGzFdq/mNEeUS61BqG08eiFWKWFEKiGsM7 S1uITY7Dc6r3JHy3WDuvwN0R3Piona329qI+uefPpmF4fZ9Jo1tvhRR91zIl9e02ejGW Bz2yxTM6LNWJtrYyEZ+uRl/7uVMq/GMi0LkfqjtYXhAwfORxigAQA5Z+/J5R/Q/kHtxr zZSnL33XpLNQGZBgfnQAACSLor4kXIK15aznhLKwQI0zfYuyPjKwZF05kT1ZRSZMwO3k Kb483aggmtw9nbX0ufQpDZpjemkemenGfvPBkekoEJupcusjW5dkx6UoQktsUDRtshEW aPpQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id i21-20020a1709063c5500b0099bcde09ca3si12008537ejg.89.2023.09.20.00.58.33 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 20 Sep 2023 00:58:34 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 3639C3856DD0 for ; Wed, 20 Sep 2023 07:58:29 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpbg153.qq.com (smtpbg153.qq.com [13.245.218.24]) by sourceware.org (Postfix) with ESMTPS id B217C3858D20 for ; Wed, 20 Sep 2023 07:57:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B217C3858D20 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp79t1695196670twkdflce Received: from rios-cad121.hadoop.rioslab.org ( [58.60.1.9]) by bizesmtp.qq.com (ESMTP) with id ; Wed, 20 Sep 2023 15:57:49 +0800 (CST) X-QQ-SSF: 01400000000000C0F000000A0000000 X-QQ-FEAT: 3M0okmaRx3jHrL8PGlJbAVinYJCkkySfg7zZM96KpmkRiMFV5Vvl06ze2uy/Q CS/F3zraAMPfmd9kdpiApSb63CQwJkmNM/SJPHjdENuunmR3v+asCWAyZqzErUhRbUVGKMF SlGOYp1xKwwftF6BGnonyY48tfy4/2pGZstA2b6QppZcQESrubmQ7Hi3Kg0u5vflgfecjG7 6bLkR8gC2GUo+Mp+BkAprz8zkjxbVSoLooXf99IIhC8CpQFikJnf3rp4LMQKgerrjIAeDhq RLTQj9qQiqX22Asz4LYVzgYxXLmdP9RN0gYbp/Jl0dRS45CklFmwKHPuwWsqNAQo8Mdjzpv cH4D+wkmaXL/+dAVElyrtNapO5SD+Uph/kvvM8bZTFPxcX5QYeLkuV7nswqsXy0NuIySxDh MVa0zp8ciYo= X-QQ-GoodBg: 2 X-BIZMAIL-ID: 17637019314259142546 From: Lehua Ding To: gcc-patches@gcc.gnu.org Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, rdapp.gcc@gmail.com, palmer@rivosinc.com, jeffreyalaw@gmail.com, lehua.ding@rivai.ai Subject: [PATCH V2] RISC-V: Support combine cond extend and reduce sum to widen reduce sum Date: Wed, 20 Sep 2023 15:57:44 +0800 Message-Id: <20230920075744.3514469-1-lehua.ding@rivai.ai> X-Mailer: git-send-email 2.36.3 MIME-Version: 1.0 X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybglogicsvrgz:qybglogicsvrgz6a-0 X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_ASCII_DIVIDERS, KAM_DMARC_STATUS, KAM_SHORT, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777542589588374146 X-GMAIL-MSGID: 1777542589588374146 V2 Change: Use new method to simple move const 0 to vector. This patch support combining cond extend and reduce_sum to cond widen reduce_sum like combine the following three insns: (set (reg:RVVM2HI 149) (const_vector:RVVM2HI repeat [ (const_int 0) ])) (set (reg:RVVM2HI 138) (if_then_else:RVVM2HI (reg:RVVMF8BI 135) (reg:RVVM2HI 148) (reg:RVVM2HI 149))) (set (reg:HI 150) (unspec:HI [ (reg:RVVM2HI 138) ] UNSPEC_REDUC_SUM)) into one insn: (set (reg:SI 147) (unspec:SI [ (if_then_else:RVVM2SI (reg:RVVMF16BI 135) (sign_extend:RVVM2SI (reg:RVVM1HI 136)) (const_vector:RVVM2SI repeat [ (const_int 0) ])) ] UNSPEC_REDUC_SUM)) Consider the following C code: int16_t foo (int8_t *restrict a, int8_t *restrict pred) { int16_t sum = 0; for (int i = 0; i < 16; i += 1) if (pred[i]) sum += a[i]; return sum; } assembly before this patch: foo: vsetivli zero,16,e16,m2,ta,ma li a5,0 vmv.v.i v2,0 vsetvli zero,zero,e8,m1,ta,ma vl1re8.v v0,0(a1) vmsne.vi v0,v0,0 vsetvli zero,zero,e16,m2,ta,mu vle8.v v4,0(a0),v0.t vmv.s.x v1,a5 vsext.vf2 v2,v4,v0.t vredsum.vs v2,v2,v1 vmv.x.s a0,v2 slliw a0,a0,16 sraiw a0,a0,16 ret assembly after this patch: foo: li a5,0 vsetivli zero,16,e16,m1,ta,ma vmv.s.x v3,a5 vsetivli zero,16,e8,m1,ta,ma vl1re8.v v0,0(a1) vmsne.vi v0,v0,0 vle8.v v2,0(a0),v0.t vwredsum.vs v1,v2,v3,v0.t vsetivli zero,0,e16,m1,ta,ma vmv.x.s a0,v1 slliw a0,a0,16 sraiw a0,a0,16 ret gcc/ChangeLog: * config/riscv/autovec-opt.md (@mov_vec_const_0): New helper pattern. (*cond_widen_reduc_plus_scal_): New combine pattern. * config/riscv/riscv-protos.h (enum insn_type): Ditto. * config/riscv/riscv-v.cc (expand_const_vector): Gen new pattern. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c: New test. * gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c: New test. --- gcc/config/riscv/autovec-opt.md | 64 +++++++++++++++++++ gcc/config/riscv/riscv-protos.h | 1 + gcc/config/riscv/riscv-v.cc | 7 +- .../rvv/autovec/cond/cond_widen_reduc-1.c | 30 +++++++++ .../rvv/autovec/cond/cond_widen_reduc_run-1.c | 28 ++++++++ 5 files changed, 129 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c -- 2.36.3 diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md index 66c77ad6ebb..5cc13c85fe5 100644 --- a/gcc/config/riscv/autovec-opt.md +++ b/gcc/config/riscv/autovec-opt.md @@ -185,6 +185,22 @@ [(set_attr "type" "vimovvx") (set_attr "mode" "")]) +;; Let the mov pattern move 0 to vector remain simple pattern before split1. +;; This simple pattern will let more patterns be made to combine successfully. +(define_insn_and_split "@mov_vec_const_0" + [(set (match_operand:V_VLS 0 "register_operand") + (match_operand:V_VLS 1 "vector_const_0_operand"))] + "TARGET_VECTOR && can_create_pseudo_p ()" + "#" + "&& 1" + [(const_int 0)] + { + riscv_vector::emit_vlmax_insn (code_for_pred_mov (mode), + riscv_vector::UNARY_OP, operands); + DONE; + } + [(set_attr "type" "vimov")]) + ;; ============================================================================= ;; All combine patterns for combine pass. ;; ============================================================================= @@ -1175,6 +1191,54 @@ } [(set_attr "type" "vfwmuladd")]) +;; Combine mask_extend + vredsum to mask_vwredsum[u] +(define_insn_and_split "*cond_widen_reduc_plus_scal_" + [(set (match_operand: 0 "register_operand") + (unspec: [ + (if_then_else: + (match_operand: 1 "register_operand") + (any_extend: + (match_operand:VI_QHS_NO_M8 2 "register_operand")) + (match_operand: 3 "vector_const_0_operand")) + ] UNSPEC_REDUC_SUM))] + "TARGET_VECTOR && can_create_pseudo_p ()" + "#" + "&& 1" + [(const_int 0)] +{ + rtx ops[] = {operands[0], operands[2], operands[1], + gen_int_mode (GET_MODE_NUNITS (mode), Pmode)}; + riscv_vector::expand_reduction (, + riscv_vector::REDUCE_OP_M, + ops, CONST0_RTX (mode)); + DONE; +} +[(set_attr "type" "vector")]) + +;; Combine mask_extend + vfredsum to mask_vfwredusum +(define_insn_and_split "*cond_widen_reduc_plus_scal_" + [(set (match_operand: 0 "register_operand") + (unspec: [ + (if_then_else: + (match_operand: 1 "register_operand") + (float_extend: + (match_operand:VF_HS_NO_M8 2 "register_operand")) + (match_operand: 3 "vector_const_0_operand")) + ] UNSPEC_REDUC_SUM_UNORDERED))] + "TARGET_VECTOR && can_create_pseudo_p ()" + "#" + "&& 1" + [(const_int 0)] +{ + rtx ops[] = {operands[0], operands[2], operands[1], + gen_int_mode (GET_MODE_NUNITS (mode), Pmode)}; + riscv_vector::expand_reduction (UNSPEC_WREDUC_SUM_UNORDERED, + riscv_vector::REDUCE_OP_M_FRM_DYN, + ops, CONST0_RTX (mode)); + DONE; +} +[(set_attr "type" "vector")]) + ;; ============================================================================= ;; Misc combine patterns ;; ============================================================================= diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 9ea0bcf15d3..a75b0b485b4 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -337,6 +337,7 @@ enum insn_type : unsigned int /* For vreduce, no mask policy operand. */ REDUCE_OP = __NORMAL_OP_TA | BINARY_OP_P | VTYPE_MODE_FROM_OP1_P, + REDUCE_OP_M = __MASK_OP_TA | BINARY_OP_P | VTYPE_MODE_FROM_OP1_P, REDUCE_OP_FRM_DYN = REDUCE_OP | FRM_DYN_P | VTYPE_MODE_FROM_OP1_P, REDUCE_OP_M_FRM_DYN = __MASK_OP_TA | BINARY_OP_P | FRM_DYN_P | VTYPE_MODE_FROM_OP1_P, diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index 64a71a128d4..d2687969997 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -973,7 +973,12 @@ expand_const_vector (rtx target, rtx src) rtx tmp = register_operand (target, mode) ? target : gen_reg_rtx (mode); /* Element in range -16 ~ 15 integer or 0.0 floating-point, we use vmv.v.i instruction. */ - if (satisfies_constraint_vi (src) || satisfies_constraint_Wc0 (src)) + /* For const int or float 0, we keep the simple pattern before split1 + pass. */ + if ((can_create_pseudo_p () && !lra_in_progress) + && satisfies_constraint_Wc0 (src)) + emit_insn (gen_mov_vec_const_0 (mode, tmp, src)); + else if (satisfies_constraint_vi (src)) { rtx ops[] = {tmp, src}; emit_vlmax_insn (code_for_pred_mov (mode), UNARY_OP, ops); diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c new file mode 100644 index 00000000000..22a71048684 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc-1.c @@ -0,0 +1,30 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=rv64gcv_zvfh_zvl128b -mabi=lp64d --param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m2 -fno-vect-cost-model -ffast-math" } */ +#include + +#define TEST_TYPE(TYPE1, TYPE2, N) \ + __attribute__ ((noipa)) \ + TYPE1 reduc_##TYPE1##_##TYPE2 (TYPE2 *restrict a, TYPE2 *restrict pred) \ + { \ + TYPE1 sum = 0; \ + for (int i = 0; i < N; i += 1) \ + if (pred[i]) \ + sum += a[i]; \ + return sum; \ + } + +#define TEST_ALL(TEST) \ + TEST (int16_t, int8_t, 16) \ + TEST (int32_t, int16_t, 8) \ + TEST (int64_t, int32_t, 4) \ + TEST (uint16_t, uint8_t, 16) \ + TEST (uint32_t, uint16_t, 8) \ + TEST (uint64_t, uint32_t, 4) \ + TEST (float, _Float16, 8) \ + TEST (double, float, 4) + +TEST_ALL (TEST_TYPE) + +/* { dg-final { scan-assembler-times {\tvfwredusum\.vs\tv[0-9]+,v[0-9]+,v[0-9]+,v0\.t} 2 } } */ +/* { dg-final { scan-assembler-times {\tvwredsum\.vs\tv[0-9]+,v[0-9]+,v[0-9]+,v0\.t} 3 } } */ +/* { dg-final { scan-assembler-times {\tvwredsumu\.vs\tv[0-9]+,v[0-9]+,v[0-9]+,v0\.t} 3 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c new file mode 100644 index 00000000000..fdb7e5249ee --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/cond/cond_widen_reduc_run-1.c @@ -0,0 +1,28 @@ +/* { dg-do run { target { riscv_vector } } } */ +/* { dg-additional-options "--param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */ + +#include "cond_widen_reduc-1.c" + +#define RUN(TYPE1, TYPE2, N) \ + { \ + TYPE2 a[N]; \ + TYPE2 pred[N]; \ + TYPE1 r = 0; \ + for (int i = 0; i < N; i++) \ + { \ + a[i] = (i * 0.1) * (i & 1 ? 1 : -1); \ + pred[i] = i % 3; \ + if (pred[i]) \ + r += a[i]; \ + asm volatile ("" ::: "memory"); \ + } \ + if (r != reduc_##TYPE1##_##TYPE2 (a, pred)) \ + __builtin_abort (); \ + } + +int __attribute__ ((optimize (1))) +main () +{ + TEST_ALL (RUN) + return 0; +}