From patchwork Mon Nov 6 13:08:53 2023
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 162009
Date: Mon, 6 Nov 2023 13:08:53 +0000 (UTC)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] tree-optimization/112404 - two issues with SLP of .MASK_LOAD
Message-Id: <20231106130916.CCDC43875DF2@sourceware.org>

The following fixes an oversight in vect_check_scalar_mask when the
mask is external or constant.  When doing BB vectorization we need to
provide a group_size, best via an overload accepting the SLP node as
argument.

With that fixed we then run into the issue that the alignment of the
.MASK_LOADs has not been analyzed, because vect_gather_slp_loads did
not identify them as loads.  This is fixed by reworking the detection.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

Richard.

	PR tree-optimization/112404
	* tree-vectorizer.h (get_mask_type_for_scalar_type): Declare
	overload with SLP node argument.
	* tree-vect-stmts.cc (get_mask_type_for_scalar_type): Implement it.
	(vect_check_scalar_mask): Use it.
	* tree-vect-slp.cc (vect_gather_slp_loads): Properly identify
	loads also for nodes with children, like .MASK_LOAD.
	* tree-vect-loop.cc (vect_analyze_loop_2): Look at the
	representative for load nodes and check whether it is a grouped
	access before looking for load-lanes support.

	* gfortran.dg/pr112404.f90: New testcase.
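
For readers unfamiliar with why the group size matters, here is a toy model of the first issue. It is not GCC code: ToyVecInfo, ToySlpNode, floor_pow2 and toy_vector_lanes are hypothetical names, and the rule "in BB mode, shrink to the largest power of two that fits the group" is a simplifying assumption about how the vector type is chosen. The sketch only shows that a query made without the SLP node can pick a different lane count than a query made with it, which is the kind of mismatch between mask and data vector types the new overload avoids.

```cpp
#include <cassert>

// Hypothetical toy model; none of these names exist in GCC.
struct ToyVecInfo {
  bool is_bb;              // true for basic-block (SLP-only) vectorization
  unsigned natural_lanes;  // lanes of the target's preferred vector
};

struct ToySlpNode {
  unsigned lanes;  // number of scalar lanes grouped in this node
};

// Largest power of two that is <= n, for n >= 1.
unsigned floor_pow2(unsigned n) {
  unsigned p = 1;
  while (p * 2 <= n)
    p *= 2;
  return p;
}

// Lane count of the chosen vector type.  A group_size of 0 means
// "unknown", in which case the natural choice is returned.
unsigned toy_vector_lanes(const ToyVecInfo &vinfo, unsigned group_size = 0) {
  unsigned lanes = vinfo.natural_lanes;
  // Assumption: only BB vectorization restricts the choice by group size.
  if (vinfo.is_bb && group_size > 1 && lanes >= group_size)
    lanes = floor_pow2(group_size);
  return lanes;
}

// Overload taking the SLP node, forwarding its lane count as group size.
unsigned toy_vector_lanes(const ToyVecInfo &vinfo, const ToySlpNode *node) {
  return toy_vector_lanes(vinfo, node ? node->lanes : 0);
}
```

In this model, with an 8-lane natural vector and a two-lane SLP group, the node-less query answers 8 while the node-aware query answers 2.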
---
 gcc/testsuite/gfortran.dg/pr112404.f90 | 23 +++++++++++++
 gcc/tree-vect-loop.cc                  | 47 ++++++++++++++------------
 gcc/tree-vect-slp.cc                   | 23 ++++++-------
 gcc/tree-vect-stmts.cc                 | 22 +++++++++++-
 gcc/tree-vectorizer.h                  |  1 +
 5 files changed, 82 insertions(+), 34 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr112404.f90

diff --git a/gcc/testsuite/gfortran.dg/pr112404.f90 b/gcc/testsuite/gfortran.dg/pr112404.f90
new file mode 100644
index 00000000000..573fa28164a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr112404.f90
@@ -0,0 +1,23 @@
+! { dg-do compile }
+! { dg-options "-Ofast" }
+! { dg-additional-options "-mavx2" { target avx2 } }
+   SUBROUTINE sfddagd( regime, znt, ite, jte )
+   REAL, DIMENSION( ime, IN) :: regime, znt
+   REAL, DIMENSION( ite, jte) :: wndcor_u
+   LOGICAL wrf_dm_on_monitor
+   IF( int4 == 1 ) THEN
+     DO j=jts,jtf
+     DO i=itsu,itf
+       reg = regime(i-1, j)
+       IF( reg > 10.0 ) THEN
+         znt0 = znt(i-1, j) + znt(i, j)
+         IF( znt0 <= 0.2) THEN
+           wndcor_u(i,j) = 0.2
+         ENDIF
+       ENDIF
+     ENDDO
+     ENDDO
+     IF ( wrf_dm_on_monitor()) THEN
+     ENDIF
+   ENDIF
+   END
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 362856a6507..5213aa0169c 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -2943,17 +2943,19 @@ start_over:
                   != IFN_LAST)
                 {
                   FOR_EACH_VEC_ELT (SLP_INSTANCE_LOADS (instance), i, load_node)
-                    {
-                      stmt_vec_info stmt_vinfo = DR_GROUP_FIRST_ELEMENT
-                          (SLP_TREE_SCALAR_STMTS (load_node)[0]);
-                      /* Use SLP for strided accesses (or if we can't
-                         load-lanes).  */
-                      if (STMT_VINFO_STRIDED_P (stmt_vinfo)
-                          || vect_load_lanes_supported
-                               (STMT_VINFO_VECTYPE (stmt_vinfo),
-                                DR_GROUP_SIZE (stmt_vinfo), false) == IFN_LAST)
-                        break;
-                    }
+                    if (STMT_VINFO_GROUPED_ACCESS
+                          (SLP_TREE_REPRESENTATIVE (load_node)))
+                      {
+                        stmt_vec_info stmt_vinfo = DR_GROUP_FIRST_ELEMENT
+                            (SLP_TREE_REPRESENTATIVE (load_node));
+                        /* Use SLP for strided accesses (or if we can't
+                           load-lanes).  */
+                        if (STMT_VINFO_STRIDED_P (stmt_vinfo)
+                            || vect_load_lanes_supported
+                                 (STMT_VINFO_VECTYPE (stmt_vinfo),
+                                  DR_GROUP_SIZE (stmt_vinfo), false) == IFN_LAST)
+                          break;
+                      }
 
                   can_use_lanes
                     = can_use_lanes && i == SLP_INSTANCE_LOADS (instance).length ();
@@ -3261,16 +3263,19 @@ again:
                                            "unsupported grouped store\n");
           FOR_EACH_VEC_ELT (SLP_INSTANCE_LOADS (instance), j, node)
             {
-              vinfo = SLP_TREE_SCALAR_STMTS (node)[0];
-              vinfo = DR_GROUP_FIRST_ELEMENT (vinfo);
-              bool single_element_p = !DR_GROUP_NEXT_ELEMENT (vinfo);
-              size = DR_GROUP_SIZE (vinfo);
-              vectype = STMT_VINFO_VECTYPE (vinfo);
-              if (vect_load_lanes_supported (vectype, size, false) == IFN_LAST
-                  && ! vect_grouped_load_supported (vectype, single_element_p,
-                                                    size))
-                return opt_result::failure_at (vinfo->stmt,
-                                               "unsupported grouped load\n");
+              vinfo = SLP_TREE_REPRESENTATIVE (node);
+              if (STMT_VINFO_GROUPED_ACCESS (vinfo))
+                {
+                  vinfo = DR_GROUP_FIRST_ELEMENT (vinfo);
+                  bool single_element_p = !DR_GROUP_NEXT_ELEMENT (vinfo);
+                  size = DR_GROUP_SIZE (vinfo);
+                  vectype = STMT_VINFO_VECTYPE (vinfo);
+                  if (vect_load_lanes_supported (vectype, size, false) == IFN_LAST
+                      && ! vect_grouped_load_supported (vectype, single_element_p,
+                                                        size))
+                    return opt_result::failure_at (vinfo->stmt,
+                                                   "unsupported grouped load\n");
+                }
             }
         }
 
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 6b8a7b628b6..13137ede8d4 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -2898,22 +2898,21 @@ vect_gather_slp_loads (vec<slp_tree> &loads, slp_tree node,
   if (!node || visited.add (node))
     return;
 
-  if (SLP_TREE_CHILDREN (node).length () == 0)
+  if (SLP_TREE_DEF_TYPE (node) != vect_internal_def)
+    return;
+
+  if (SLP_TREE_CODE (node) != VEC_PERM_EXPR)
     {
-      if (SLP_TREE_DEF_TYPE (node) != vect_internal_def)
-        return;
-      stmt_vec_info stmt_info = SLP_TREE_SCALAR_STMTS (node)[0];
-      if (STMT_VINFO_GROUPED_ACCESS (stmt_info)
+      stmt_vec_info stmt_info = SLP_TREE_REPRESENTATIVE (node);
+      if (STMT_VINFO_DATA_REF (stmt_info)
           && DR_IS_READ (STMT_VINFO_DATA_REF (stmt_info)))
         loads.safe_push (node);
     }
-  else
-    {
-      unsigned i;
-      slp_tree child;
-      FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
-        vect_gather_slp_loads (loads, child, visited);
-    }
+
+  unsigned i;
+  slp_tree child;
+  FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
+    vect_gather_slp_loads (loads, child, visited);
 }
 
 
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index f895aaf3083..eefb1eec1ef 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2456,7 +2456,8 @@ vect_check_scalar_mask (vec_info *vinfo, stmt_vec_info stmt_info,
 
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
   if (!mask_vectype)
-    mask_vectype = get_mask_type_for_scalar_type (vinfo, TREE_TYPE (vectype));
+    mask_vectype = get_mask_type_for_scalar_type (vinfo, TREE_TYPE (vectype),
+                                                  mask_node_1);
 
   if (!mask_vectype || !VECTOR_BOOLEAN_TYPE_P (mask_vectype))
     {
@@ -13386,6 +13387,25 @@ get_mask_type_for_scalar_type (vec_info *vinfo, tree scalar_type,
   return truth_type_for (vectype);
 }
 
+/* Function get_mask_type_for_scalar_type.
+
+   Returns the mask type corresponding to a result of comparison
+   of vectors of specified SCALAR_TYPE as supported by target.
+   NODE, if nonnull, is the SLP tree node that will use the returned
+   vector type.  */
+
+tree
+get_mask_type_for_scalar_type (vec_info *vinfo, tree scalar_type,
+                               slp_tree node)
+{
+  tree vectype = get_vectype_for_scalar_type (vinfo, scalar_type, node);
+
+  if (!vectype)
+    return NULL;
+
+  return truth_type_for (vectype);
+}
+
 /* Function get_same_sized_vectype
 
    Returns a vector type corresponding to SCALAR_TYPE of size
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 254d172231d..d2ddc2e4ad5 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -2207,6 +2207,7 @@ extern tree get_related_vectype_for_scalar_type (machine_mode, tree,
 extern tree get_vectype_for_scalar_type (vec_info *, tree, unsigned int = 0);
 extern tree get_vectype_for_scalar_type (vec_info *, tree, slp_tree);
 extern tree get_mask_type_for_scalar_type (vec_info *, tree, unsigned int = 0);
+extern tree get_mask_type_for_scalar_type (vec_info *, tree, slp_tree);
 extern tree get_same_sized_vectype (tree, tree);
 extern bool vect_chooses_same_modes_p (vec_info *, machine_mode);
 extern bool vect_get_loop_mask_type (loop_vec_info);
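
The effect of the vect_gather_slp_loads rework can be illustrated with a small standalone model. This is a sketch under stated assumptions, not the GCC implementation: Node, StmtInfo, gather_loads_leaves_only, gather_loads and count_loads_in_masked_load_example are hypothetical names, and the old scheme is simplified (the real one additionally required a grouped access). The example tree is a store whose child is a masked load, which in turn has an external mask as its child.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-ins that only mirror the shape of the SLP tree.
enum class DefType { Internal, External };
enum class NodeCode { Stmt, VecPerm };

struct StmtInfo {
  bool has_data_ref = false;  // models a non-null data reference
  bool is_read = false;       // models a read (load) data reference
};

struct Node {
  DefType def_type = DefType::Internal;
  NodeCode code = NodeCode::Stmt;
  const StmtInfo *representative = nullptr;  // unset for external nodes
  std::vector<Node *> children;
};

// Old scheme (simplified): only leaf nodes are ever considered loads.
void gather_loads_leaves_only(std::vector<Node *> &loads, Node *node,
                              std::set<Node *> &visited) {
  if (!node || !visited.insert(node).second)
    return;
  if (node->children.empty()) {
    if (node->def_type != DefType::Internal)
      return;
    assert(node->representative);
    if (node->representative->has_data_ref && node->representative->is_read)
      loads.push_back(node);
  } else {
    for (Node *child : node->children)
      gather_loads_leaves_only(loads, child, visited);
  }
}

// New scheme: test every internal non-permute node, then always recurse.
void gather_loads(std::vector<Node *> &loads, Node *node,
                  std::set<Node *> &visited) {
  if (!node || !visited.insert(node).second)
    return;
  if (node->def_type != DefType::Internal)
    return;
  if (node->code != NodeCode::VecPerm) {
    assert(node->representative);
    if (node->representative->has_data_ref && node->representative->is_read)
      loads.push_back(node);
  }
  for (Node *child : node->children)
    gather_loads(loads, child, visited);
}

// Builds store -> masked load -> external mask and counts detected loads.
unsigned count_loads_in_masked_load_example(bool use_new_scheme) {
  StmtInfo store_info{true, false};
  StmtInfo load_info{true, true};
  Node mask;
  mask.def_type = DefType::External;
  Node masked_load;
  masked_load.representative = &load_info;
  masked_load.children.push_back(&mask);
  Node store;
  store.representative = &store_info;
  store.children.push_back(&masked_load);

  std::vector<Node *> loads;
  std::set<Node *> visited;
  if (use_new_scheme)
    gather_loads(loads, &store, visited);
  else
    gather_loads_leaves_only(loads, &store, visited);
  return static_cast<unsigned>(loads.size());
}
```

In this model the leaves-only walk never inspects the masked load because it has a child, whereas the reworked walk records it and still stops at the external mask.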