From patchwork Tue Sep 20 02:21:33 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: liuhongt <hongtao.liu@intel.com>
X-Patchwork-Id: 1309
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Fix incorrect handle in vectorizable_induction for mixed
 induction type.
Date: Tue, 20 Sep 2022 10:21:33 +0800
Message-Id: <20220920022133.64778-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.18.1
From: liuhongt <hongtao.liu@intel.com>

The SLP code in vectorizable_induction assumes that every phi_info has
the same induction type (vect_step_op_add). Since nonlinear induction
is now supported, that assumption no longer holds and such nodes can be
handled incorrectly. This patch makes vectorizable_induction return
false when the slp_node has mixed induction types.
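
As a rough illustration of why the add-only assumption breaks, the
scalar model below compares the loop from pr103144-mix-1.c with a
version that advances both lanes by addition. The function names, the
MAX_N bound and the step1 parameter are hypothetical and exist only for
this sketch; it does not reproduce the code the vectorizer emitted.

```c
#define MAX_N 32  /* hypothetical bound for the scratch buffers */

/* Reference semantics of the loop in pr103144-mix-1.c: lane 0 is an
   additive induction (i += 1), lane 1 is a negating induction (c = -c).  */
void
mixed_iv_reference (int *p, int c, int n)
{
  for (int i = 0; i != n; i++)
    {
      p[2 * i] = i;
      p[2 * i + 1] = c;
      c = -c;
    }
}

/* Assumed model of the mishandling: both lanes are advanced by adding a
   per-lane step, as if lane 1 were also vect_step_op_add.  */
void
mixed_iv_as_add (int *p, int c, int n, int step1)
{
  int lane0 = 0, lane1 = c;
  for (int i = 0; i != n; i++)
    {
      p[2 * i] = lane0;
      p[2 * i + 1] = lane1;
      lane0 += 1;
      lane1 += step1;
    }
}

/* Return 1 when both models produce the same 2 * n outputs, else 0.  */
int
models_agree (int c, int n, int step1)
{
  int ref[2 * MAX_N], add[2 * MAX_N];
  if (n < 0 || n > MAX_N)
    return 0;
  mixed_iv_reference (ref, c, n);
  mixed_iv_as_add (add, c, n, step1);
  for (int i = 0; i < 2 * n; i++)
    if (ref[i] != add[i])
      return 0;
  return 1;
}
```

With c = 8, an additive step of -16 matches the reference for two
iterations (8, -8) and then diverges (-24 instead of 8), so
models_agree (8, 3, -16) returns 0. No constant step reproduces the
sequence c, -c, c, ... for three or more iterations when c is nonzero.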

Note that code elsewhere still vectorizes the induction, using
separate iv updates and a vec_perm. The slp_node path in
vectorizable_induction is more efficient when all induction types are
the same, because it updates the ivs with one operation instead of
separate iv updates plus a permutation.
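
The "separate iv updates and vec_perm" fallback can be sketched in plain
C, with small arrays standing in for vector registers. LANES,
interleave_half and fill_with_separate_ivs are hypothetical names chosen
for this sketch, and the lane layout is an assumption, not the
vectorizer's actual output.

```c
#define LANES 4  /* hypothetical vector width, chosen for illustration */

/* Interleave lanes [first, first + LANES / 2) of A and B into OUT,
   modelling a single vec_perm of two vectors.  */
void
interleave_half (const int *a, const int *b, int first, int *out)
{
  for (int k = 0; k < LANES / 2; k++)
    {
      out[2 * k] = a[first + k];
      out[2 * k + 1] = b[first + k];
    }
}

/* Model of the loop in pr103144-mix-1.c with one vector per induction
   variable.  N must be a multiple of LANES; P must hold 2 * N ints.  */
void
fill_with_separate_ivs (int *p, int c, int n)
{
  int iv_add[LANES], iv_neg[LANES];
  for (int k = 0; k < LANES; k++)
    {
      iv_add[k] = k;                      /* i, i + 1, i + 2, ...  */
      iv_neg[k] = (k % 2 == 0) ? c : -c;  /* c, -c, c, -c  */
    }

  for (int i = 0; i < n; i += LANES)
    {
      /* Two permutes combine the two iv vectors into the store order.  */
      interleave_half (iv_add, iv_neg, 0, p + 2 * i);
      interleave_half (iv_add, iv_neg, LANES / 2, p + 2 * i + LANES);

      /* Additive iv: each lane advances by LANES.  */
      for (int k = 0; k < LANES; k++)
	iv_add[k] += LANES;
      /* Negating iv: LANES is even, so negating every lane LANES times
	 leaves iv_neg unchanged in this model.  */
    }
}
```

Calling fill_with_separate_ivs (p, 8, 8) on a 16-element buffer yields
0, 8, 1, -8, 2, 8, 3, -8, 4, 8, ..., the same pattern that
pr103144-mix-2.c builds in epi32_exp.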

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

	PR tree-optimization/103144
	* tree-vect-loop.cc (vectorizable_induction): Return false for
	slp_node with mixed induction type.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr103144-mix-1.c: New test.
	* gcc.target/i386/pr103144-mix-2.c: New test.
---
 .../gcc.target/i386/pr103144-mix-1.c          | 17 +++++++++
 .../gcc.target/i386/pr103144-mix-2.c          | 35 +++++++++++++++++++
 gcc/tree-vect-loop.cc                         | 34 ++++++++++++++----
 3 files changed, 79 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr103144-mix-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr103144-mix-2.c

diff --git a/gcc/testsuite/gcc.target/i386/pr103144-mix-1.c b/gcc/testsuite/gcc.target/i386/pr103144-mix-1.c
new file mode 100644
index 00000000000..b292d66ef71
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr103144-mix-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 2 "optimized" } } */
+/* For induction variables with different induction types (vect_step_op_add,
+   vect_step_op_neg), vectorizable_induction must not handle the SLP node with
+   a single iv update (addition); separate iv updates and vec_perm are needed.  */
+int
+__attribute__((noipa))
+foo (int* p, int c, int n)
+{
+  for (int i = 0; i != n; i++)
+    {
+      p[2 * i] = i;
+      p[2 * i + 1] = c;
+      c = -c;
+    }
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr103144-mix-2.c b/gcc/testsuite/gcc.target/i386/pr103144-mix-2.c
new file mode 100644
index 00000000000..b7043d59aec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr103144-mix-2.c
@@ -0,0 +1,35 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx2 -ftree-vectorize -fvect-cost-model=unlimited -mprefer-vector-width=256" } */
+/* { dg-require-effective-target avx2 } */
+
+#include "avx2-check.h"
+#include <string.h>
+#include "pr103144-mix-1.c"
+
+typedef int v8si __attribute__((vector_size(32)));
+
+#define N 34
+void
+avx2_test (void)
+{
+  int* epi32_exp = (int*) malloc (N * sizeof (int));
+  int* epi32_dst = (int*) malloc (N * sizeof (int));
+
+  __builtin_memset (epi32_exp, 0, N * sizeof (int));
+  int b = 8;
+  v8si init1 = __extension__(v8si) { 0, b, 1, -b, 2, b, 3, -b };
+  v8si init2 = __extension__(v8si) { 4, b, 5, -b, 6, b, 7, -b };
+  v8si init3 = __extension__(v8si) { 8, b, 9, -b, 10, b, 11, -b };
+  v8si init4 = __extension__(v8si) { 12, b, 13, -b, 14, b, 15, -b };
+  memcpy (epi32_exp, &init1, 32);
+  memcpy (epi32_exp + 8, &init2, 32);
+  memcpy (epi32_exp + 16, &init3, 32);
+  memcpy (epi32_exp + 24, &init4, 32);
+  epi32_exp[32] = 16;
+  epi32_exp[33] = b;
+  foo (epi32_dst, b, N / 2);
+  if (__builtin_memcmp (epi32_dst, epi32_exp, N * sizeof (int)) != 0)
+    __builtin_abort ();
+
+  return;
+}
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 9c434b66c5b..c7050a47c1c 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -9007,14 +9007,34 @@ vectorizable_induction (loop_vec_info loop_vinfo,
     iv_loop = loop;
   gcc_assert (iv_loop == (gimple_bb (phi))->loop_father);
 
-  if (slp_node && !nunits.is_constant ())
+  if (slp_node)
     {
-      /* The current SLP code creates the step value element-by-element.  */
-      if (dump_enabled_p ())
-	dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-			 "SLP induction not supported for variable-length"
-			 " vectors.\n");
-      return false;
+      if (!nunits.is_constant ())
+	{
+	  /* The current SLP code creates the step value element-by-element.  */
+	  if (dump_enabled_p ())
+	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+			     "SLP induction not supported for variable-length"
+			     " vectors.\n");
+	  return false;
+	}
+
+      stmt_vec_info phi_info;
+      FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (slp_node), i, phi_info)
+	{
+	  if (STMT_VINFO_LOOP_PHI_EVOLUTION_TYPE (phi_info) != vect_step_op_add)
+	    {
+	      /* The SLP code below assumes all induction types are the same.
+		 SLP elsewhere will still vectorize the loop by updating each
+		 iv separately and using vec_perm, just not via the code below.  */
+	      if (dump_enabled_p ())
+		dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+				 "SLP induction not supported for mixed induction"
+				 " types.\n");
+	      return false;
+	    }
+	}
+
     }
 
   if (FLOAT_TYPE_P (vectype) && !param_vect_induction_float)