From patchwork Wed Nov 8 10:53:17 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "juzhe.zhong@rivai.ai"
X-Patchwork-Id: 162947
From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: richard.sandiford@arm.com, rguenther@suse.de, kito.cheng@gmail.com, kito.cheng@sifive.com, Juzhe-Zhong
Subject: [PATCH] Middle-end: Fix bug of induction variable vectorization for RVV
Date: Wed, 8 Nov 2023 18:53:17 +0800
Message-Id: <20231108105317.1786716-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3
List-Id: Gcc-patches mailing list

PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112438

The SELECT_VL result is not necessarily VF in non-final iterations, so
the current GIMPLE IR is wrong:

  # vect_vec_iv_.21_25 = PHI <_24(4), { 0, 1, 2, ... }(3)>
  ...
  _24 = vect_vec_iv_.21_25 + { POLY_INT_CST [4, 4], ... };

After this patch, the IR is correct for SELECT_VL:

  # vect_vec_iv_.8_22 = PHI <_21(4), { 0, 1, 2, ... }(3)>
  ...
  _35 = .SELECT_VL (ivtmp_33, POLY_INT_CST [4, 4]);
  _21 = vect_vec_iv_.8_22 + { POLY_INT_CST [4, 4], ... };

kito, could you give more explanation?

        PR middle/112438

gcc/ChangeLog:

        * tree-vect-loop.cc (vectorizable_induction): Fix bug.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/pr112438.c: New test.

---
 .../gcc.target/riscv/rvv/autovec/pr112438.c   | 35 +++++++++++++++++
 gcc/tree-vect-loop.cc                         | 39 +++++++++++++++----
 2 files changed, 67 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c
new file mode 100644
index 00000000000..b326d56a52c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
+
+void
+foo (int n, int *__restrict in, int *__restrict out)
+{
+  for (int i = 0; i < n; i += 1)
+    {
+      out[i] = in[i] + i;
+    }
+}
+
+void
+foo2 (int n, float * __restrict in,
+float * __restrict out)
+{
+  for (int i = 0; i < n; i += 1)
+    {
+      out[i] = in[i] + i;
+    }
+}
+
+void
+foo3 (int n, float * __restrict in,
+float * __restrict out, float x)
+{
+  for (int i = 0; i < n; i += 1)
+    {
+      out[i] = in[i] + i* i;
+    }
+}
+
+/* We don't want to see vect_vec_iv_.21_25 + { POLY_INT_CST [4, 4], ... }.  */
+/* { dg-final { scan-tree-dump-not "\\+ \{ POLY_INT_CST" "optimized" } } */
+
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index a544bc9b059..3e103946168 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -10309,10 +10309,30 @@ vectorizable_induction (loop_vec_info loop_vinfo,
         new_name = step_expr;
       else
         {
+          gimple_seq seq = NULL;
+          if (LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo))
+            {
+              /* When we're using loop_len produced by SELECT_VL, the non-final
+                 iterations are not always processing VF elements.  So when
+                 vectorizing the induction variable, instead of
+
+                   _21 = vect_vec_iv_.6_22 + { VF, ... };
+
+                 we should generate:
+
+                   _35 = .SELECT_VL (ivtmp_33, VF);
+                   vect_cst__22 = [vec_duplicate_expr] _35;
+                   _21 = vect_vec_iv_.6_22 + vect_cst__22;  */
+              vec_loop_lens *lens = &LOOP_VINFO_LENS (loop_vinfo);
+              tree len
+                = vect_get_loop_len (loop_vinfo, NULL, lens, 1, vectype, 0, 0);
+              expr = force_gimple_operand (fold_convert (TREE_TYPE (step_expr),
+                                                         unshare_expr (len)),
+                                           &seq, true, NULL_TREE);
+            }
           /* iv_loop is the loop to be vectorized. Generate:
               vec_step = [VF*S, VF*S, VF*S, VF*S]  */
-          gimple_seq seq = NULL;
-          if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (step_expr)))
+          else if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (step_expr)))
             {
               expr = build_int_cst (integer_type_node, vf);
               expr = gimple_build (&seq, FLOAT_EXPR, TREE_TYPE (step_expr), expr);
@@ -10323,8 +10343,13 @@ vectorizable_induction (loop_vec_info loop_vinfo,
                                    expr, step_expr);
           if (seq)
             {
-              new_bb = gsi_insert_seq_on_edge_immediate (pe, seq);
-              gcc_assert (!new_bb);
+              if (LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo))
+                gsi_insert_seq_before (&si, seq, GSI_SAME_STMT);
+              else
+                {
+                  new_bb = gsi_insert_seq_on_edge_immediate (pe, seq);
+                  gcc_assert (!new_bb);
+                }
             }
         }
 
@@ -10332,9 +10357,9 @@ vectorizable_induction (loop_vec_info loop_vinfo,
       gcc_assert (CONSTANT_CLASS_P (new_name)
                   || TREE_CODE (new_name) == SSA_NAME);
       new_vec = build_vector_from_val (step_vectype, t);
-      vec_step = vect_init_vector (loop_vinfo, stmt_info,
-                                   new_vec, step_vectype, NULL);
-
+      vec_step
+        = vect_init_vector (loop_vinfo, stmt_info, new_vec, step_vectype,
+                            LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo) ? &si : NULL);
 
       /* Create the following def-use cycle:
          loop prolog: