From patchwork Fri Oct 14 09:51:20 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 2629
Date: Fri, 14 Oct 2022 11:51:20 +0200 (CEST)
From: Richard Biener
Reply-To: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] tree-optimization/107254 - check and support live lanes from permutes
Message-Id: <20221014095120.D070313A4A@imap2.suse-dmz.suse.de>
List-Id: Gcc-patches mailing list

The following fixes an omission from adding SLP permute nodes
which is live lanes originating from those.  We have to check
that we can extract the lane and have to actually code generate them.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

	PR tree-optimization/107254
	* tree-vect-slp.cc (vect_slp_analyze_node_operations_1):
	For permutes also analyze live lanes.
	(vect_schedule_slp_node): For permutes also code generate
	live lane extracts.

	* gfortran.dg/vect/pr107254.f90: New testcase.
---
 gcc/testsuite/gfortran.dg/vect/pr107254.f90 | 49 +++++++++++++++++++++
 gcc/tree-vect-slp.cc                        | 33 +++++++++++---
 2 files changed, 77 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/vect/pr107254.f90

diff --git a/gcc/testsuite/gfortran.dg/vect/pr107254.f90 b/gcc/testsuite/gfortran.dg/vect/pr107254.f90
new file mode 100644
index 00000000000..85bcb5f3fa2
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/vect/pr107254.f90
@@ -0,0 +1,49 @@
+! { dg-do run }
+
+subroutine dlartg( f, g, s, r )
+  implicit none
+  double precision :: f, g, r, s
+  double precision :: d, p
+
+  d = sqrt( f*f + g*g )
+  p = 1.d0 / d
+  if( abs( f ) > 1 ) then
+     s = g*sign( p, f )
+     r = sign( d, f )
+  else
+     s = g*sign( p, f )
+     r = sign( d, f )
+  end if
+end subroutine
+
+subroutine dhgeqz( n, h, t )
+  implicit none
+  integer n
+  double precision h( n, * ), t( n, * )
+  integer jc
+  double precision c, s, temp, temp2, tempr
+  temp2 = 10d0
+  call dlartg( 10d0, temp2, s, tempr )
+  c = 0.9d0
+  s = 1.d0
+  do jc = 1, n
+     temp = c*h( 1, jc ) + s*h( 2, jc )
+     h( 2, jc ) = -s*h( 1, jc ) + c*h( 2, jc )
+     h( 1, jc ) = temp
+     temp2 = c*t( 1, jc ) + s*t( 2, jc )
+     t( 2, jc ) = -s*t( 1, jc ) + c*t( 2, jc )
+     t( 1, jc ) = temp2
+  enddo
+end subroutine dhgeqz
+
+program test
+  implicit none
+  double precision h(2,2), t(2,2)
+  h = 0
+  t(1,1) = 1
+  t(2,1) = 0
+  t(1,2) = 0
+  t(2,2) = 0
+  call dhgeqz( 2, h, t )
+  if (t(2,2).ne.0) STOP 1
+end program test
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index cea5d50da92..e54414f6bef 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -5933,7 +5933,23 @@ vect_slp_analyze_node_operations_1 (vec_info *vinfo, slp_tree node,
 
   /* Handle purely internal nodes.  */
   if (SLP_TREE_CODE (node) == VEC_PERM_EXPR)
-    return vectorizable_slp_permutation (vinfo, NULL, node, cost_vec);
+    {
+      if (!vectorizable_slp_permutation (vinfo, NULL, node, cost_vec))
+        return false;
+
+      stmt_vec_info slp_stmt_info;
+      unsigned int i;
+      FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, slp_stmt_info)
+        {
+          if (STMT_VINFO_LIVE_P (slp_stmt_info)
+              && !vectorizable_live_operation (vinfo,
+                                               slp_stmt_info, NULL, node,
+                                               node_instance, i,
+                                               false, cost_vec))
+            return false;
+        }
+      return true;
+    }
 
   gcc_assert (STMT_SLP_TYPE (stmt_info) != loop_vect);
 
@@ -8900,8 +8916,6 @@ vect_schedule_slp_node (vec_info *vinfo,
         }
     }
 
-  bool done_p = false;
-
   /* Handle purely internal nodes.  */
   if (SLP_TREE_CODE (node) == VEC_PERM_EXPR)
     {
@@ -8912,9 +8926,18 @@ vect_schedule_slp_node (vec_info *vinfo,
          but open-code it here (partly).  */
       bool done = vectorizable_slp_permutation (vinfo, &si, node, NULL);
       gcc_assert (done);
-      done_p = true;
+      stmt_vec_info slp_stmt_info;
+      unsigned int i;
+      FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, slp_stmt_info)
+        if (STMT_VINFO_LIVE_P (slp_stmt_info))
+          {
+            done = vectorizable_live_operation (vinfo,
+                                                slp_stmt_info, &si, node,
+                                                instance, i, true, NULL);
+            gcc_assert (done);
+          }
     }
-  if (!done_p)
+  else
     vect_transform_stmt (vinfo, stmt_info, &si, node, instance);
 }
 
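
For readers unfamiliar with the term, a "live lane" is a scalar value that is still used after the vectorized code, so it has to be read back out of the vector that replaced it. The following is a toy model only, not GCC code: `Vec2`, `LaneSel`, `permute` and `extract_live_lane` are hypothetical names invented for illustration, and the fixed two-lane width is an assumption. It sketches why a permute node needs the same extraction step as any other node whose scalar results are live.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy model only: none of these names exist in GCC.
using Vec2 = std::array<double, 2>;

// Selects one lane: (index of the input vector, lane within that vector).
using LaneSel = std::pair<std::size_t, std::size_t>;

// Build the permuted vector by picking every output lane from one input.
Vec2 permute(const std::vector<Vec2> &inputs,
             const std::array<LaneSel, 2> &sel)
{
  Vec2 out{};
  for (std::size_t lane = 0; lane < out.size(); ++lane)
    out[lane] = inputs[sel[lane].first][sel[lane].second];
  return out;
}

// Read back the scalar for a lane that is still used after the vector code.
double extract_live_lane(const Vec2 &v, std::size_t lane)
{
  assert(lane < v.size());
  return v[lane];
}
```

In this model, skipping `extract_live_lane` for the result of `permute` would leave the later scalar use without a value, which is loosely the kind of omission the patch addresses in both the analysis and the code generation phase.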