From patchwork Tue Oct 17 12:27:11 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 154250
Date: Tue, 17 Oct 2023 12:27:11 +0000 (UTC)
From: Richard Biener <rguenther@suse.de>
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] tree-optimization/111846 - put simd-clone-info into SLP tree
Message-Id: <20231017122742.0FFAB385C6DB@sourceware.org>

The following avoids bogously re-using the simd-clone-info we
currently hang off stmt_info from two different SLP contexts where
a different number of lanes should have chosen a different best
simdclone.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

	PR tree-optimization/111846
	* tree-vectorizer.h (_slp_tree::simd_clone_info): Add.
	(SLP_TREE_SIMD_CLONE_INFO): New.
	* tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize
	SLP_TREE_SIMD_CLONE_INFO.
	(_slp_tree::~_slp_tree): Release it.
	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Use
	SLP_TREE_SIMD_CLONE_INFO or STMT_VINFO_SIMD_CLONE_INFO
	dependent on if we're doing SLP.

	* gcc.dg/vect/pr111846.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr111846.c | 12 ++++++++++
 gcc/tree-vect-slp.cc                 |  2 ++
 gcc/tree-vect-stmts.cc               | 35 +++++++++++++---------------
 gcc/tree-vectorizer.h                |  6 +++++
 4 files changed, 36 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr111846.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr111846.c b/gcc/testsuite/gcc.dg/vect/pr111846.c
new file mode 100644
index 00000000000..d283882f261
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr111846.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -ffast-math" } */
+/* { dg-additional-options "-mavx2" { target { x86_64-*-* i?86-*-* } } } */
+
+extern __attribute__((__simd__)) float powf(float, float);
+float gv[0][10];
+float eq_set_bands_real_adj[0];
+void eq_set_bands_real() {
+  for (int c = 0; c < 10; c++)
+    for (int i = 0; i < 10; i++)
+      gv[c][i] = powf(0, eq_set_bands_real_adj[i]) - 1;
+}
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index af8f5031bd2..d081999a763 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -117,6 +117,7 @@ _slp_tree::_slp_tree ()
   SLP_TREE_CHILDREN (this) = vNULL;
   SLP_TREE_LOAD_PERMUTATION (this) = vNULL;
   SLP_TREE_LANE_PERMUTATION (this) = vNULL;
+  SLP_TREE_SIMD_CLONE_INFO (this) = vNULL;
   SLP_TREE_DEF_TYPE (this) = vect_uninitialized_def;
   SLP_TREE_CODE (this) = ERROR_MARK;
   SLP_TREE_VECTYPE (this) = NULL_TREE;
@@ -143,6 +144,7 @@ _slp_tree::~_slp_tree ()
   SLP_TREE_VEC_DEFS (this).release ();
   SLP_TREE_LOAD_PERMUTATION (this).release ();
   SLP_TREE_LANE_PERMUTATION (this).release ();
+  SLP_TREE_SIMD_CLONE_INFO (this).release ();
   if (this->failed)
     free (failed);
 }
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index b3a56498595..9bb43e98f56 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4215,6 +4215,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
   if (nargs == 0)
     return false;
 
+  vec<tree>& simd_clone_info = (slp_node ? SLP_TREE_SIMD_CLONE_INFO (slp_node)
+                                : STMT_VINFO_SIMD_CLONE_INFO (stmt_info));
   arginfo.reserve (nargs, true);
   auto_vec<slp_tree> slp_op;
   slp_op.safe_grow_cleared (nargs);
@@ -4256,25 +4258,22 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
           gcc_assert (thisarginfo.vectype != NULL_TREE);
 
           /* For linear arguments, the analyze phase should have saved
-             the base and step in STMT_VINFO_SIMD_CLONE_INFO. */
-          if (i * 3 + 4 <= STMT_VINFO_SIMD_CLONE_INFO (stmt_info).length ()
-              && STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 3 + 2])
+             the base and step in {STMT_VINFO,SLP_TREE}_SIMD_CLONE_INFO. */
+          if (i * 3 + 4 <= simd_clone_info.length ()
+              && simd_clone_info[i * 3 + 2])
             {
               gcc_assert (vec_stmt);
-              thisarginfo.linear_step
-                = tree_to_shwi (STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 3 + 2]);
-              thisarginfo.op
-                = STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 3 + 1];
+              thisarginfo.linear_step = tree_to_shwi (simd_clone_info[i * 3 + 2]);
+              thisarginfo.op = simd_clone_info[i * 3 + 1];
               thisarginfo.simd_lane_linear
-                = (STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 3 + 3]
-                   == boolean_true_node);
+                = (simd_clone_info[i * 3 + 3] == boolean_true_node);
               /* If loop has been peeled for alignment, we need to adjust it. */
               tree n1 = LOOP_VINFO_NITERS_UNCHANGED (loop_vinfo);
               tree n2 = LOOP_VINFO_NITERS (loop_vinfo);
               if (n1 != n2 && !thisarginfo.simd_lane_linear)
                 {
                   tree bias = fold_build2 (MINUS_EXPR, TREE_TYPE (n1), n1, n2);
-                  tree step = STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[i * 3 + 2];
+                  tree step = simd_clone_info[i * 3 + 2];
                   tree opt = TREE_TYPE (thisarginfo.op);
                   bias = fold_convert (TREE_TYPE (step), bias);
                   bias = fold_build2 (MULT_EXPR, TREE_TYPE (step), bias, step);
@@ -4328,8 +4327,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
   unsigned group_size = slp_node ? SLP_TREE_LANES (slp_node) : 1;
   unsigned int badness = 0;
   struct cgraph_node *bestn = NULL;
-  if (STMT_VINFO_SIMD_CLONE_INFO (stmt_info).exists ())
-    bestn = cgraph_node::get (STMT_VINFO_SIMD_CLONE_INFO (stmt_info)[0]);
+  if (simd_clone_info.exists ())
+    bestn = cgraph_node::get (simd_clone_info[0]);
   else
     for (struct cgraph_node *n = node->simd_clones; n != NULL;
          n = n->simdclone->next_clone)
@@ -4532,24 +4531,22 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
              so automagic virtual operand updating doesn't work. */
           if (gimple_vuse (stmt) && slp_node)
             vinfo->any_known_not_updated_vssa = true;
-          STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (bestn->decl);
+          simd_clone_info.safe_push (bestn->decl);
           for (i = 0; i < nargs; i++)
             if ((bestn->simdclone->args[i].arg_type
                  == SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP)
                 || (bestn->simdclone->args[i].arg_type
                     == SIMD_CLONE_ARG_TYPE_LINEAR_REF_CONSTANT_STEP))
               {
-                STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_grow_cleared (i * 3
-                                                                          + 1,
-                                                                          true);
-                STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (arginfo[i].op);
+                simd_clone_info.safe_grow_cleared (i * 3 + 1, true);
+                simd_clone_info.safe_push (arginfo[i].op);
                 tree lst = POINTER_TYPE_P (TREE_TYPE (arginfo[i].op))
                            ? size_type_node : TREE_TYPE (arginfo[i].op);
                 tree ls = build_int_cst (lst, arginfo[i].linear_step);
-                STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (ls);
+                simd_clone_info.safe_push (ls);
                 tree sll = arginfo[i].simd_lane_linear
                            ? boolean_true_node : boolean_false_node;
-                STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (sll);
+                simd_clone_info.safe_push (sll);
               }
           STMT_VINFO_TYPE (stmt_info) = call_simd_clone_vec_info_type;
           DUMP_VECT_SCOPE ("vectorizable_simd_clone_call");
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index f1d0cd79961..f3152927e2d 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -196,6 +196,11 @@ struct _slp_tree {
      denotes the number of output lanes. */
   lane_permutation_t lane_permutation;
 
+  /* Selected SIMD clone's function info. First vector element
+     is SIMD clone's function decl, followed by a pair of trees (base + step)
+     for linear arguments (pair of NULLs for other arguments). */
+  vec<tree> simd_clone_info;
+
   tree vectype;
   /* Vectorized defs. */
   vec<tree> vec_defs;
@@ -300,6 +305,7 @@ public:
 #define SLP_TREE_NUMBER_OF_VEC_STMTS(S)          (S)->vec_stmts_size
 #define SLP_TREE_LOAD_PERMUTATION(S)             (S)->load_permutation
 #define SLP_TREE_LANE_PERMUTATION(S)             (S)->lane_permutation
+#define SLP_TREE_SIMD_CLONE_INFO(S)              (S)->simd_clone_info
 #define SLP_TREE_DEF_TYPE(S)                     (S)->def_type
 #define SLP_TREE_VECTYPE(S)                      (S)->vectype
 #define SLP_TREE_REPRESENTATIVE(S)               (S)->representative
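
Illustration only, not part of the patch: a minimal C++ sketch of the
reuse problem the commit message describes. Every name in it (Stmt,
SlpNode, pick_simdlen, select_per_stmt, select_per_node) is hypothetical
and does not exist in GCC, and the selection rule is a toy stand-in for
the badness computation in vectorizable_simd_clone_call. It only shows
why a choice cached on a statement shared by two SLP nodes can be stale
for the second node, while a choice cached on the node itself is not.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-ins for illustration; these are not GCC types.
struct Stmt {
  std::optional<int> cached_simdlen;  // old scheme: cache lives on the statement
};

struct SlpNode {
  Stmt *stmt;                         // several SLP nodes may share one statement
  unsigned lanes;                     // number of lanes in this SLP context
  std::optional<int> cached_simdlen;  // new scheme: cache lives on the node
};

// Toy selection rule (an assumption, not GCC's): take the smallest clone
// whose simdlen covers the lanes, otherwise fall back to the widest one.
// simdlens must be non-empty and sorted in ascending order.
int pick_simdlen(const std::vector<int> &simdlens, unsigned lanes) {
  for (int len : simdlens)
    if (static_cast<unsigned>(len) >= lanes)
      return len;
  return simdlens.back();
}

// Old scheme: the first caller fixes the choice for every later SLP context
// that reaches the same statement, whatever its lane count.
int select_per_stmt(Stmt &stmt, const std::vector<int> &simdlens,
                    unsigned lanes) {
  if (!stmt.cached_simdlen)
    stmt.cached_simdlen = pick_simdlen(simdlens, lanes);
  return *stmt.cached_simdlen;
}

// New scheme: each SLP node remembers its own choice.
int select_per_node(SlpNode &node, const std::vector<int> &simdlens) {
  if (!node.cached_simdlen)
    node.cached_simdlen = pick_simdlen(simdlens, node.lanes);
  return *node.cached_simdlen;
}
```

With clones of simdlen 4 and 8 and two nodes of 4 and 8 lanes sharing one
statement, select_per_stmt returns 4 for both nodes, whereas
select_per_node returns 4 and 8 respectively.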