From patchwork Mon Jan 8 11:29:24 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 185917
Date: Mon, 8 Jan 2024 12:29:24 +0100 (CET)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
cc: tamar.christina@arm.com
Subject: [PATCH] tree-optimization/113026 - avoid vector epilog in more cases
List-Id: Gcc-patches mailing list
Message-Id: <20240108113513.554483858CDB@sourceware.org>

The following avoids creating a niter peeling epilog more consistently,
matching what peeling later uses for the skip_vector condition.  This
applies in particular when versioning is required, which then also
ensures the vector loop is entered unless the epilog is vectorized.
This should ideally match LOOP_VINFO_VERSIONING_THRESHOLD, but that is
only computed later; some refactoring could make the two match better.

The patch also makes sure to adjust the upper bound of the epilogues
when we do not have a skip edge around the vector loop.

Bootstrapped and tested on x86_64-unknown-linux-gnu.

Tamar, does that look OK wrt early-breaks?

Thanks,
Richard.

        PR tree-optimization/113026
        * tree-vect-loop.cc (vect_need_peeling_or_partial_vectors_p):
        Avoid an epilog in more cases.
        * tree-vect-loop-manip.cc (vect_do_peeling): Adjust the
        epilogues niter upper bounds and estimates.

        * gcc.dg/torture/pr113026-1.c: New testcase.
        * gcc.dg/torture/pr113026-2.c: Likewise.
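
To illustrate the changed comparison in
vect_need_peeling_or_partial_vectors_p, here is a minimal stand-alone
sketch of just the final "max_niter > bound" clause.  It is not GCC
code: the function names and the uhwi alias are hypothetical, and the
sample values below (a VF of 16, a threshold of 4) are assumptions
chosen to resemble the 16-byte dst[] testcase, not numbers taken from
an actual compile.

```cpp
#include <algorithm>
#include <cassert>

// Stand-in for unsigned HOST_WIDE_INT (illustrative only).
using uhwi = unsigned long long;

// Bound used before the patch: th rounded down to a multiple of vf.
// For th < vf this is 0, so any max_niter > 0 compares greater.
inline uhwi
old_bound (uhwi th, uhwi vf)
{
  assert (vf != 0);
  return (th / vf) * vf;
}

// Bound used after the patch: th is first raised to at least vf,
// so the result is never below one full vector iteration.
inline uhwi
new_bound (uhwi th, uhwi vf)
{
  assert (vf != 0);
  return (std::max (th, vf) / vf) * vf;
}

// Models only the versioning clause "max_niter > bound"; the other
// conditions of the real predicate are deliberately left out.
inline bool
epilog_needed (uhwi max_niter, uhwi bound)
{
  return max_niter > bound;
}
```

With max_niter = 16, th = 4 and vf = 16 the old bound is 0, so the
clause asks for an epilog, while the new bound is 16 and it does not.
Once th is at least vf (for example th = 40) both bounds agree.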
---
 gcc/testsuite/gcc.dg/torture/pr113026-1.c | 11 ++++++++
 gcc/testsuite/gcc.dg/torture/pr113026-2.c | 18 +++++++++++++
 gcc/tree-vect-loop-manip.cc               | 32 +++++++++++++++++++++++
 gcc/tree-vect-loop.cc                     |  6 ++++-
 4 files changed, 66 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr113026-1.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr113026-2.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr113026-1.c b/gcc/testsuite/gcc.dg/torture/pr113026-1.c
new file mode 100644
index 00000000000..56dfef3b36c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr113026-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Wall" } */
+
+char dst[16];
+
+void
+foo (char *src, long n)
+{
+  for (long i = 0; i < n; i++)
+    dst[i] = src[i]; /* { dg-bogus "" } */
+}
diff --git a/gcc/testsuite/gcc.dg/torture/pr113026-2.c b/gcc/testsuite/gcc.dg/torture/pr113026-2.c
new file mode 100644
index 00000000000..b9d5857a403
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr113026-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-Wall" } */
+
+char dst1[17];
+void
+foo1 (char *src, long n)
+{
+  for (long i = 0; i < n; i++)
+    dst1[i] = src[i]; /* { dg-bogus "" } */
+}
+
+char dst2[18];
+void
+foo2 (char *src, long n)
+{
+  for (long i = 0; i < n; i++)
+    dst2[i] = src[i]; /* { dg-bogus "" } */
+}
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 9330183bfb9..927f76a0947 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -3364,6 +3364,38 @@ vect_do_peeling (loop_vec_info loop_vinfo, tree niters, tree nitersm1,
             bb_before_epilog->count = single_pred_edge (bb_before_epilog)->count ();
           bb_before_epilog = loop_preheader_edge (epilog)->src;
         }
+      else
+        {
+          /* When we do not have a loop-around edge to the epilog we know
+             the vector loop covered at least VF scalar iterations unless
+             we have early breaks and the epilog will cover at most
+             VF - 1 + gap peeling iterations.
+             Update any known upper bound with this knowledge.  */
+          if (! LOOP_VINFO_EARLY_BREAKS (loop_vinfo))
+            {
+              if (epilog->any_upper_bound)
+                epilog->nb_iterations_upper_bound -= lowest_vf;
+              if (epilog->any_likely_upper_bound)
+                epilog->nb_iterations_likely_upper_bound -= lowest_vf;
+              if (epilog->any_estimate)
+                epilog->nb_iterations_estimate -= lowest_vf;
+            }
+          unsigned HOST_WIDE_INT const_vf;
+          if (vf.is_constant (&const_vf))
+            {
+              const_vf += LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo) - 1;
+              if (epilog->any_upper_bound)
+                epilog->nb_iterations_upper_bound
+                  = wi::umin (epilog->nb_iterations_upper_bound, const_vf);
+              if (epilog->any_likely_upper_bound)
+                epilog->nb_iterations_likely_upper_bound
+                  = wi::umin (epilog->nb_iterations_likely_upper_bound,
+                              const_vf);
+              if (epilog->any_estimate)
+                epilog->nb_iterations_estimate
+                  = wi::umin (epilog->nb_iterations_estimate, const_vf);
+            }
+        }
 
       /* If loop is peeled for non-zero constant times, now niters refers to
          orig_niters - prolog_peeling, it won't overflow even the orig_niters
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index a06771611ac..9dd573ef125 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -1261,7 +1261,11 @@ vect_need_peeling_or_partial_vectors_p (loop_vec_info loop_vinfo)
               the epilogue is unnecessary.  */
            && (!LOOP_REQUIRES_VERSIONING (loop_vinfo)
                || ((unsigned HOST_WIDE_INT) max_niter
-                   > (th / const_vf) * const_vf))))
+                   /* We'd like to use LOOP_VINFO_VERSIONING_THRESHOLD
+                      but that's only computed later based on our result.
+                      The following is the most conservative approximation.  */
+                   > (std::max ((unsigned HOST_WIDE_INT) th,
+                                const_vf) / const_vf) * const_vf))))
     return true;
 
   return false;
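
For readers who want to check the arithmetic of the new else branch in
vect_do_peeling, the following is a rough stand-alone model of its two
steps applied to a single bound.  Everything here is a simplification
and an assumption rather than GCC code: EpilogBound and
adjust_epilog_bound are hypothetical names, plain 64-bit integers
replace the widest_int fields, LOOP_VINFO_PEELING_FOR_GAPS is treated
as 0 or 1, and the sketch asserts that the inherited bound is at least
lowest_vf so that the subtraction cannot wrap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model of one any_* / nb_iterations_* pair on the epilog.
struct EpilogBound
{
  bool any_upper_bound;
  std::uint64_t nb_iterations_upper_bound;
};

// Applies the same two steps, in the same order, as the patch does when
// there is no skip edge around the vector loop.
inline void
adjust_epilog_bound (EpilogBound &epilog, std::uint64_t lowest_vf,
                     bool early_breaks, bool vf_is_constant,
                     std::uint64_t const_vf, unsigned peeling_for_gaps)
{
  if (!epilog.any_upper_bound)
    return;

  // Step 1: without early breaks the vector loop already covered at
  // least lowest_vf scalar iterations, so remove them from the bound.
  if (!early_breaks)
    {
      // Sketch-only precondition to keep the unsigned math from wrapping.
      assert (epilog.nb_iterations_upper_bound >= lowest_vf);
      epilog.nb_iterations_upper_bound -= lowest_vf;
    }

  // Step 2: with a constant VF, cap the bound at VF - 1 + gap peeling.
  if (vf_is_constant)
    {
      assert (const_vf >= 1);
      std::uint64_t cap = const_vf + peeling_for_gaps - 1;
      epilog.nb_iterations_upper_bound
        = std::min (epilog.nb_iterations_upper_bound, cap);
    }
}
```

Starting from a bound of 100 with lowest_vf = const_vf = 16 and no
early breaks, step 1 gives 84 and step 2 caps it at 15 (16 when peeling
for gaps).  With early breaks and a non-constant VF the bound is left
unchanged.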