From patchwork Wed Aug 9 06:36:22 2023
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 133071
From: juzhe.zhong@rivai.ai
To: gcc-patches@gcc.gnu.org
Cc: richard.sandiford@arm.com, rguenther@suse.de, Ju-Zhe Zhong
Subject: [PATCH] VECT: Support loop len control on EXTRACT_LAST vectorization
Date: Wed, 9 Aug 2023 14:36:22 +0800
Message-Id: <20230809063622.316743-1-juzhe.zhong@rivai.ai>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Ju-Zhe Zhong

Hi, this patch adds loop len control to extract_last autovectorization.

Consider the following case:

#include <stdint.h>

#define EXTRACT_LAST(TYPE)			\
  TYPE __attribute__ ((noinline, noclone))	\
  test_##TYPE (TYPE *x, int n, TYPE value)	\
  {						\
    TYPE last;					\
    for (int j = 0; j < n; ++j)			\
      {						\
	last = x[j];				\
	x[j] = last * value;			\
      }						\
    return last;				\
  }

#define TEST_ALL(T)				\
  T (uint8_t)

TEST_ALL (EXTRACT_LAST)

ARM SVE IR:

Preheader:
  max_mask_34 = .WHILE_ULT (0, bnd.5_6, { 0, ... });

Loop:
  ...
  # loop_mask_22 = PHI ...
  vect_last_12.8_23 = .MASK_LOAD (_7, 8B, loop_mask_22);
  vect__4.9_27 = vect_last_12.8_23 * vect_cst__26;
  .MASK_STORE (_7, 8B, loop_mask_22, vect__4.9_27);
  ...
  next_mask_35 = .WHILE_ULT (_1, bnd.5_6, { 0, ... });
  ...

Epilogue:
  _25 = .EXTRACT_LAST (loop_mask_22, vect_last_12.8_23);

RVV prefers length-based loop control, so after this patch the IR for RVV is:

Loop:
  ...
  loop_len_22 = SELECT_VL;
  vect_last_12.8_23 = .MASK_LOAD (_7, 8B, loop_len_22);
  vect__4.9_27 = vect_last_12.8_23 * vect_cst__26;
  .MASK_STORE (_7, 8B, loop_len_22, vect__4.9_27);
  ...

Epilogue:
  _25 = .EXTRACT_LAST (loop_len_22, vect_last_12.8_23);

This patch does not add a new pattern for length loop control of
extract_last.  Instead it reuses the existing extract_last pattern.

Here is the code:

Step 1 - Enable length and record length for extract_last:

+	      machine_mode vec_mode = TYPE_MODE (vectype);
+	      if (get_len_load_store_mode (vec_mode, true).exists (&vec_mode))
+		vect_record_loop_len (loop_vinfo,
+				      &LOOP_VINFO_LENS (loop_vinfo), 1,
+				      vectype, 1);
+	      else
+		vect_record_loop_mask (loop_vinfo,
+				       &LOOP_VINFO_MASKS (loop_vinfo), 1,
+				       vectype, NULL);

'get_len_load_store_mode' checks whether the target supports length-based
loop control.  If it does, a loop len is recorded; otherwise a loop mask is
recorded as before.

Step 2 - Build EXTRACT_LAST with len:

-	  tree mask = vect_get_loop_mask (loop_vinfo, gsi,
-					  &LOOP_VINFO_MASKS (loop_vinfo),
-					  1, vectype, 0);
+	  tree control;
+	  if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
+	    control = vect_get_loop_len (loop_vinfo, gsi,
+					 &LOOP_VINFO_LENS (loop_vinfo), 1,
+					 vectype, 0, 0);
+	  else
+	    control = vect_get_loop_mask (loop_vinfo, gsi,
+					  &LOOP_VINFO_MASKS (loop_vinfo), 1,
+					  vectype, 0);
 	  tree scalar_res = gimple_build (&stmts, CFN_EXTRACT_LAST, scalar_type,
-					  mask, vec_lhs_phi);
+					  control, vec_lhs_phi);

This reuses the existing code that builds EXTRACT_LAST with a mask, and
passes the loop length instead when 'LOOP_VINFO_FULLY_WITH_LENGTH_P' is true.

This patch has been fully tested on the RISC-V port.

Bootstrap and regression testing on x86 passed.

OK for trunk?
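To illustrate why the same EXTRACT_LAST call can take either kind of control,
here is a minimal scalar sketch in C.  It is not GCC code.  The function names
extract_last_mask and extract_last_len are hypothetical, and the model assumes
that a length operand means "the first LEN lanes are active" and that at least
one lane is active in the final iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of EXTRACT_LAST under mask control:
   return the element of VEC in the last lane whose mask bit is set.  */
uint8_t
extract_last_mask (const uint8_t *mask, const uint8_t *vec, size_t nlanes)
{
  size_t last = nlanes;		/* Sentinel: no active lane seen yet.  */
  for (size_t i = 0; i < nlanes; ++i)
    if (mask[i])
      last = i;
  assert (last < nlanes);	/* Assumption: at least one active lane.  */
  return vec[last];
}

/* Hypothetical scalar model of EXTRACT_LAST under length control:
   lanes 0 .. LEN-1 are assumed active, so the last active lane is LEN-1.  */
uint8_t
extract_last_len (size_t len, const uint8_t *vec, size_t nlanes)
{
  assert (len >= 1 && len <= nlanes);
  return vec[len - 1];
}
```

Under these assumptions, a mask with its first three lanes set and a length
of 3 select the same element, which is the property the patch relies on when
it passes either control to EXTRACT_LAST.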

gcc/ChangeLog:

	* tree-vect-loop.cc (vectorizable_live_operation): Add length control.

---
 gcc/tree-vect-loop.cc | 40 ++++++++++++++++++++++++++++------------
 1 file changed, 28 insertions(+), 12 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 00058c3c13e..fde098cafde 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -10311,9 +10311,15 @@ vectorizable_live_operation (vec_info *vinfo,
 	  else
 	    {
 	      gcc_assert (ncopies == 1 && !slp_node);
-	      vect_record_loop_mask (loop_vinfo,
-				     &LOOP_VINFO_MASKS (loop_vinfo),
-				     1, vectype, NULL);
+	      machine_mode vec_mode = TYPE_MODE (vectype);
+	      if (get_len_load_store_mode (vec_mode, true).exists (&vec_mode))
+		vect_record_loop_len (loop_vinfo,
+				      &LOOP_VINFO_LENS (loop_vinfo), 1,
+				      vectype, 1);
+	      else
+		vect_record_loop_mask (loop_vinfo,
+				       &LOOP_VINFO_MASKS (loop_vinfo), 1,
+				       vectype, NULL);
 	    }
 	}
       /* ???  Enable for loop costing as well.  */
@@ -10339,7 +10345,9 @@ vectorizable_live_operation (vec_info *vinfo,
   gimple *vec_stmt;
   if (slp_node)
     {
-      gcc_assert (!loop_vinfo || !LOOP_VINFO_FULLY_MASKED_P (loop_vinfo));
+      gcc_assert (!loop_vinfo
+		  || !LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
+		  || !LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo));

       /* Get the correct slp vectorized stmt.  */
       vec_lhs = SLP_TREE_VEC_DEFS (slp_node)[vec_entry];
@@ -10383,21 +10391,29 @@ vectorizable_live_operation (vec_info *vinfo,

       gimple_seq stmts = NULL;
       tree new_tree;
-      if (LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
+      if (LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
+	  || LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
 	{
 	  /* Emit:

-	       SCALAR_RES = EXTRACT_LAST <VEC_LHS, MASK>
+	       SCALAR_RES = EXTRACT_LAST <VEC_LHS, CONTROL>

-	     where VEC_LHS is the vectorized live-out result and MASK is
-	     the loop mask for the final iteration.  */
+	     where VEC_LHS is the vectorized live-out result and CONTROL can
+	     be either the loop mask for the final iteration or the loop len
+	     for the final iteration.  */
 	  gcc_assert (ncopies == 1 && !slp_node);
 	  tree scalar_type = TREE_TYPE (STMT_VINFO_VECTYPE (stmt_info));
-	  tree mask = vect_get_loop_mask (loop_vinfo, gsi,
-					  &LOOP_VINFO_MASKS (loop_vinfo),
-					  1, vectype, 0);
+	  tree control;
+	  if (LOOP_VINFO_FULLY_WITH_LENGTH_P (loop_vinfo))
+	    control = vect_get_loop_len (loop_vinfo, gsi,
+					 &LOOP_VINFO_LENS (loop_vinfo), 1,
+					 vectype, 0, 0);
+	  else
+	    control = vect_get_loop_mask (loop_vinfo, gsi,
+					  &LOOP_VINFO_MASKS (loop_vinfo), 1,
+					  vectype, 0);
 	  tree scalar_res = gimple_build (&stmts, CFN_EXTRACT_LAST, scalar_type,
-					  mask, vec_lhs_phi);
+					  control, vec_lhs_phi);

 	  /* Convert the extracted vector element to the scalar type.  */
 	  new_tree = gimple_convert (&stmts, lhs_type, scalar_res);
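
The test case from the description can be exercised on its own with a small
harness such as the sketch below.  The helper check_extract_last_uint8 and its
fill pattern are hypothetical and are not part of this patch or of the GCC
testsuite.  It only checks the scalar semantics that any vectorized version,
mask-controlled or length-controlled, has to preserve: the returned value is
the original last element and every element is scaled.

```c
#include <assert.h>
#include <stdint.h>

/* Same loop shape as the test case in the description.  The noinline and
   noclone attributes are GCC-specific.  */
#define EXTRACT_LAST(TYPE)			\
  TYPE __attribute__ ((noinline, noclone))	\
  test_##TYPE (TYPE *x, int n, TYPE value)	\
  {						\
    TYPE last;					\
    for (int j = 0; j < n; ++j)			\
      {						\
	last = x[j];				\
	x[j] = last * value;			\
      }						\
    return last;				\
  }

EXTRACT_LAST (uint8_t)

/* Hypothetical helper: run test_uint8_t on N elements and return 1 if the
   result equals the original x[N-1] and each x[i] was multiplied by VALUE
   (modulo 256), else 0.  Requires 1 <= N <= 64.  */
int
check_extract_last_uint8 (int n, uint8_t value)
{
  uint8_t x[64], ref[64];
  assert (n >= 1 && n <= 64);
  for (int i = 0; i < n; ++i)
    x[i] = ref[i] = (uint8_t) (3 * i + 1);

  uint8_t last = test_uint8_t (x, n, value);
  if (last != ref[n - 1])
    return 0;
  for (int i = 0; i < n; ++i)
    if (x[i] != (uint8_t) (ref[i] * value))
      return 0;
  return 1;
}
```

Trip counts that are not a multiple of the vector length, for example
check_extract_last_uint8 (37, 5), are the interesting ones, since the final
iteration is then only partially active.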