From patchwork Mon Jun 26 12:17:28 2023
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 112898
Date: Mon, 26 Jun 2023 12:17:28 +0000 (UTC)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
cc: richard.sandiford@arm.com
Subject: [PATCH] tree-optimization/110381 - preserve SLP permutation with in-order reductions
Message-Id: <20230626121826.8030D385772D@sourceware.org>

The following fixes a bug that manifests itself during the fold-left
reduction transform as picking not the last scalar def to replace and
thus double-counting some elements.  But the underlying issue is that
we merge a load permutation into the in-order reduction, which is of
course wrong.
Now, reduction analysis has not yet been performed when optimizing
permutations, so we have to resort to checking that ourselves.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

	PR tree-optimization/110381
	* tree-vect-slp.cc (vect_optimize_slp_pass::start_choosing_layouts):
	Materialize permutes before fold-left reductions.

	* gcc.dg/vect/pr110381.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr110381.c | 40 ++++++++++++++++++++++++++++
 gcc/tree-vect-slp.cc                 | 18 +++++++++++--
 2 files changed, 56 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr110381.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr110381.c b/gcc/testsuite/gcc.dg/vect/pr110381.c
new file mode 100644
index 00000000000..2313dbf11ca
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr110381.c
@@ -0,0 +1,40 @@
+/* { dg-do run } */
+
+struct FOO {
+  double a;
+  double b;
+  double c;
+};
+
+double __attribute__((noipa))
+sum_8_foos(const struct FOO* foos)
+{
+  double sum = 0;
+
+  for (int i = 0; i < 8; ++i)
+    {
+      struct FOO foo = foos[i];
+
+      /* Need to use an in-order reduction here, preserving
+         the load permutation.  */
+      sum += foo.a;
+      sum += foo.c;
+      sum += foo.b;
+    }
+
+  return sum;
+}
+
+int main()
+{
+  struct FOO foos[8];
+
+  __builtin_memset (foos, 0, sizeof (foos));
+  foos[0].a = __DBL_MAX__;
+  foos[0].b = 5;
+  foos[0].c = -__DBL_MAX__;
+
+  if (sum_8_foos (foos) != 5)
+    __builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 4481d43e3d7..8cb1ac1f319 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -4682,14 +4682,28 @@ vect_optimize_slp_pass::start_choosing_layouts ()
   m_partition_layout_costs.safe_grow_cleared (m_partitions.length ()
					      * m_perms.length ());
 
-  /* We have to mark outgoing permutations facing non-reduction graph
-     entries that are not represented as to be materialized.  */
+  /* We have to mark outgoing permutations facing non-associating-reduction
+     graph entries that are not represented as to be materialized.
+     slp_inst_kind_bb_reduc currently only covers associatable reductions.  */
   for (slp_instance instance : m_vinfo->slp_instances)
     if (SLP_INSTANCE_KIND (instance) == slp_inst_kind_ctor)
       {
	unsigned int node_i = SLP_INSTANCE_TREE (instance)->vertex;
	m_partitions[m_vertices[node_i].partition].layout = 0;
       }
+    else if (SLP_INSTANCE_KIND (instance) == slp_inst_kind_reduc_chain)
+      {
+	stmt_vec_info stmt_info
+	  = SLP_TREE_REPRESENTATIVE (SLP_INSTANCE_TREE (instance));
+	stmt_vec_info reduc_info = info_for_reduction (m_vinfo, stmt_info);
+	if (needs_fold_left_reduction_p (TREE_TYPE
+					   (gimple_get_lhs (stmt_info->stmt)),
+					 STMT_VINFO_REDUC_CODE (reduc_info)))
+	  {
+	    unsigned int node_i = SLP_INSTANCE_TREE (instance)->vertex;
+	    m_partitions[m_vertices[node_i].partition].layout = 0;
+	  }
+      }
 
   /* Check which layouts each node and partition can handle.  Calculate the
      weights associated with inserting layout changes on edges.  */