From patchwork Tue Jun 27 06:47:37 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 113252
From: juzhe.zhong@rivai.ai
To: gcc-patches@gcc.gnu.org
Cc: richard.sandiford@arm.com, rguenther@suse.de, pan2.li@intel.com, Ju-Zhe Zhong
Subject: [PATCH V4] SCCVN: Add LEN_MASK_STORE and fix LEN_STORE
Date: Tue, 27 Jun 2023 14:47:37 +0800
Message-Id: <20230627064737.16257-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.1

From: Ju-Zhe Zhong

Hi, Richi.

I tried to follow your last email and refactored the do-while loop to use
VECTOR_CST_NELTS.  With this patch, VN handles LEN_MASK_STORE and the
compiler can CSE the redundant store.  I have added a testcase to this
patch to test VN for LEN_MASK_STORE.

I am not sure whether I am on the same page as you.  Feel free to correct
me.

Thanks.

gcc/ChangeLog:

	* tree-ssa-sccvn.cc (vn_reference_lookup_3): Add LEN_MASK_STORE and
	fix LEN_STORE.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/partial/len_maskstore_vn-1.c: New test.

---
 .../rvv/autovec/partial/len_maskstore_vn-1.c  | 30 +++++++++++++++++++
 gcc/tree-ssa-sccvn.cc                         | 24 +++++++++++----
 2 files changed, 49 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/len_maskstore_vn-1.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/len_maskstore_vn-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/len_maskstore_vn-1.c
new file mode 100644
index 00000000000..0b2d03693dc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/partial/len_maskstore_vn-1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv_zvl256b -mabi=ilp32d --param riscv-autovec-preference=fixed-vlmax -O3 -fdump-tree-fre5" } */
+
+void __attribute__((noinline,noclone))
+foo (int *out, int *res)
+{
+  int mask[] = { 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1 };
+  int i;
+  for (i = 0; i < 16; ++i)
+    {
+      if (mask[i])
+        out[i] = i;
+    }
+  int o0 = out[0];
+  int o7 = out[7];
+  int o14 = out[14];
+  int o15 = out[15];
+  res[0] = o0;
+  res[2] = o7;
+  res[4] = o14;
+  res[6] = o15;
+}
+
+/* Vectorization produces .LEN_MASK_STORE, unrolling will unroll the two
+   vector iterations.  FRE5 after that should be able to CSE
+   out[7] and out[15], but leave out[0] and out[14] alone.  */
+/* { dg-final { scan-tree-dump " = o0_\[0-9\]+;" "fre5" } } */
+/* { dg-final { scan-tree-dump " = 7;" "fre5" } } */
+/* { dg-final { scan-tree-dump " = o14_\[0-9\]+;" "fre5" } } */
+/* { dg-final { scan-tree-dump " = 15;" "fre5" } } */
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 11061a374a2..242d82d6274 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -3304,6 +3304,16 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
           if (!tree_fits_uhwi_p (len) || !tree_fits_shwi_p (bias))
             return (void *)-1;
           break;
+        case IFN_LEN_MASK_STORE:
+          len = gimple_call_arg (call, 2);
+          bias = gimple_call_arg (call, 5);
+          if (!tree_fits_uhwi_p (len) || !tree_fits_shwi_p (bias))
+            return (void *)-1;
+          mask = gimple_call_arg (call, internal_fn_mask_index (fn));
+          mask = vn_valueize (mask);
+          if (TREE_CODE (mask) != VECTOR_CST)
+            return (void *)-1;
+          break;
         default:
           return (void *)-1;
         }
@@ -3344,11 +3354,17 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
           tree vectype = TREE_TYPE (def_rhs);
           unsigned HOST_WIDE_INT elsz
             = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (vectype)));
+          /* Set initial len value is the UINT_MAX, so mask_idx < actual_len
+             is always true for MASK_STORE.  */
+          unsigned actual_len = UINT_MAX;
+          if (len)
+            actual_len = tree_to_uhwi (len) + tree_to_shwi (bias);
+          unsigned nunits
+            = MIN (actual_len, VECTOR_CST_NELTS (mask).coeffs[0]);
           if (mask)
             {
               HOST_WIDE_INT start = 0, length = 0;
-              unsigned mask_idx = 0;
-              do
+              for (unsigned mask_idx = 0; mask_idx < nunits; mask_idx++)
                 {
                   if (integer_zerop (VECTOR_CST_ELT (mask, mask_idx)))
                     {
@@ -3371,9 +3387,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
                     }
                   else
                     length += elsz;
-                  mask_idx++;
                 }
-              while (known_lt (mask_idx, TYPE_VECTOR_SUBPARTS (vectype)));
               if (length != 0)
                 {
                   pd.rhs_off = start;
@@ -3389,7 +3403,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
             {
               pd.offset = offset2i;
               pd.size = (tree_to_uhwi (len)
-                         + -tree_to_shwi (bias)) * BITS_PER_UNIT;
+                         + tree_to_shwi (bias)) * BITS_PER_UNIT;
               if (BYTES_BIG_ENDIAN)
                 pd.rhs_off = pd.size - tree_to_uhwi (TYPE_SIZE (vectype));
               else
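
For readers who want to see what the rewritten mask loop computes, here is a minimal standalone C sketch of the same run-scanning idea. It is not the GCC code: `struct lane_run` and `collect_active_runs` are hypothetical names introduced only for illustration, a plain `int` array stands in for the VECTOR_CST mask, and the sketch assumes the length counts lanes. It walks the first min (len + bias, nelts) lanes and records each maximal run of active lanes as a bit offset and bit size, roughly what the patch feeds into pd.rhs_off and pd.size.

```c
#include <assert.h>
#include <stddef.h>

/* One contiguous run of stored lanes, in bits from the start of the
   stored vector.  Hypothetical type, used only in this sketch.  */
struct lane_run
{
  unsigned long offset_bits;
  unsigned long size_bits;
};

/* Scan MASK and record every maximal run of active (non-zero) lanes among
   the first min (LEN + BIAS, NELTS) lanes.  ELSZ_BITS is the element size
   in bits.  Returns the number of runs written to RUNS; runs beyond
   MAX_RUNS are dropped.  */
size_t
collect_active_runs (const int *mask, unsigned nelts, unsigned len,
                     int bias, unsigned long elsz_bits,
                     struct lane_run *runs, size_t max_runs)
{
  assert (mask != NULL && runs != NULL);

  /* Clamp the effective length to [0, NELTS].  */
  long actual_len = (long) len + bias;
  if (actual_len < 0)
    actual_len = 0;
  unsigned nunits = actual_len < (long) nelts ? (unsigned) actual_len : nelts;

  unsigned long start = 0, length = 0;
  size_t n = 0;
  for (unsigned i = 0; i < nunits; i++)
    {
      if (mask[i] == 0)
        {
          /* An inactive lane ends the current run, if there is one.  */
          if (length != 0 && n < max_runs)
            runs[n++] = (struct lane_run) { start, length };
          start = (unsigned long) (i + 1) * elsz_bits;
          length = 0;
        }
      else
        length += elsz_bits;
    }

  /* Flush a run that extends to the last scanned lane.  */
  if (length != 0 && n < max_runs)
    runs[n++] = (struct lane_run) { start, length };
  return n;
}
```

With the testcase's alternating mask, 32-bit elements, and eight lanes scanned, this sketch would report four one-lane runs at bit offsets 32, 96, 160 and 224. That is consistent with the test expecting loads from odd indices such as out[7] to be forwarded while out[0] is left alone.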