From patchwork Wed Jan 24 14:40:24 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener
X-Patchwork-Id: 191595
Date: Wed, 24 Jan 2024 15:40:24 +0100 (CET)
From: Richard Biener
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] tree-optimization/113576 - non-empty latch and may_be_zero vectorization
List-Id: Gcc-patches mailing list
Message-Id: <20240124144212.A1CAA3858401@sourceware.org>

We can't support niters with may_be_zero when we end up with a
non-empty latch due to early exit peeling, at least not in the
simplistic way the vectorizer handles this now.  Disallow it
again for exits that are not the last one.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

	PR tree-optimization/113576
	* tree-vect-loop.cc (vec_init_loop_exit_info): Only allow
	exits with may_be_zero niters when it's the last one.

	* gcc.dg/vect/pr113576.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr113576.c | 157 +++++++++++++++++++++++++++
 gcc/tree-vect-loop.cc                |   9 +-
 2 files changed, 164 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr113576.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr113576.c b/gcc/testsuite/gcc.dg/vect/pr113576.c
new file mode 100644
index 00000000000..da5ddb09e33
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr113576.c
@@ -0,0 +1,157 @@
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+/* { dg-additional-options "-march=skylake-avx512" } */
+
+#include "tree-vect.h"
+
+#include
+#include
+#include
+#include
+
+#define SBITMAP_ELT_BITS ((unsigned) 64)
+#define SBITMAP_ELT_TYPE unsigned long long
+#define SBITMAP_SIZE_BYTES(BITMAP) ((BITMAP)->size * sizeof (SBITMAP_ELT_TYPE))
+#define do_popcount(x) __builtin_popcountll(x)
+
+typedef struct simple_bitmap_def
+{
+  unsigned char *popcount;      /* Population count. */
+  unsigned int n_bits;          /* Number of bits. */
+  unsigned int size;            /* Size in elements. */
+  SBITMAP_ELT_TYPE elms[1];     /* The elements. */
+} *sbitmap;
+typedef const struct simple_bitmap_def *const_sbitmap;
+
+/* The iterator for sbitmap. */
+typedef struct {
+  /* The pointer to the first word of the bitmap. */
+  const SBITMAP_ELT_TYPE *ptr;
+
+  /* The size of the bitmap. */
+  unsigned int size;
+
+  /* The current word index. */
+  unsigned int word_num;
+
+  /* The current bit index (not modulo SBITMAP_ELT_BITS). */
+  unsigned int bit_num;
+
+  /* The words currently visited. */
+  SBITMAP_ELT_TYPE word;
+} sbitmap_iterator;
+
+static inline void
+sbitmap_iter_init (sbitmap_iterator *i, const_sbitmap bmp, unsigned int min)
+{
+  i->word_num = min / (unsigned int) SBITMAP_ELT_BITS;
+  i->bit_num = min;
+  i->size = bmp->size;
+  i->ptr = bmp->elms;
+
+  if (i->word_num >= i->size)
+    i->word = 0;
+  else
+    i->word = (i->ptr[i->word_num]
+               >> (i->bit_num % (unsigned int) SBITMAP_ELT_BITS));
+}
+
+/* Return true if we have more bits to visit, in which case *N is set
+   to the index of the bit to be visited. Otherwise, return
+   false. */
+
+static inline bool
+sbitmap_iter_cond (sbitmap_iterator *i, unsigned int *n)
+{
+  /* Skip words that are zeros. */
+  for (; i->word == 0; i->word = i->ptr[i->word_num])
+    {
+      i->word_num++;
+
+      /* If we have reached the end, break. */
+      if (i->word_num >= i->size)
+        return false;
+
+      i->bit_num = i->word_num * SBITMAP_ELT_BITS;
+    }
+
+  /* Skip bits that are zero. */
+  for (; (i->word & 1) == 0; i->word >>= 1)
+    i->bit_num++;
+
+  *n = i->bit_num;
+
+  return true;
+}
+
+/* Advance to the next bit. */
+
+static inline void
+sbitmap_iter_next (sbitmap_iterator *i)
+{
+  i->word >>= 1;
+  i->bit_num++;
+}
+
+#define SBITMAP_SET_SIZE(N) (((N) + SBITMAP_ELT_BITS - 1) / SBITMAP_ELT_BITS)
+/* Allocate a simple bitmap of N_ELMS bits. */
+
+sbitmap
+sbitmap_alloc (unsigned int n_elms)
+{
+  unsigned int bytes, size, amt;
+  sbitmap bmap;
+
+  size = SBITMAP_SET_SIZE (n_elms);
+  bytes = size * sizeof (SBITMAP_ELT_TYPE);
+  amt = (sizeof (struct simple_bitmap_def)
+         + bytes - sizeof (SBITMAP_ELT_TYPE));
+  bmap = (sbitmap) malloc (amt);
+  bmap->n_bits = n_elms;
+  bmap->size = size;
+  bmap->popcount = NULL;
+  return bmap;
+}
+
+#define sbitmap_free(MAP) (free((MAP)->popcount), free((MAP)))
+/* Loop over all elements of SBITMAP, starting with MIN. In each
+   iteration, N is set to the index of the bit being visited. ITER is
+   an instance of sbitmap_iterator used to iterate the bitmap. */
+
+#define EXECUTE_IF_SET_IN_SBITMAP(SBITMAP, MIN, N, ITER) \
+  for (sbitmap_iter_init (&(ITER), (SBITMAP), (MIN)); \
+       sbitmap_iter_cond (&(ITER), &(N)); \
+       sbitmap_iter_next (&(ITER)))
+
+int
+__attribute__((noinline))
+sbitmap_first_set_bit (const_sbitmap bmap)
+{
+  unsigned int n = 0;
+  sbitmap_iterator sbi;
+
+  EXECUTE_IF_SET_IN_SBITMAP (bmap, 0, n, sbi)
+    return n;
+  return -1;
+}
+
+void
+sbitmap_zero (sbitmap bmap)
+{
+  memset (bmap->elms, 0, SBITMAP_SIZE_BYTES (bmap));
+  if (bmap->popcount)
+    memset (bmap->popcount, 0, bmap->size * sizeof (unsigned char));
+}
+
+int main ()
+{
+  check_vect ();
+
+  sbitmap tmp = sbitmap_alloc(1856);
+  sbitmap_zero (tmp);
+  int res = sbitmap_first_set_bit (tmp);
+  if (res != -1)
+    abort ();
+  sbitmap_free (tmp);
+  return 0;
+}
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index fe631252dc2..e0d02e09d7d 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -991,8 +991,13 @@ vec_init_loop_exit_info (class loop *loop)
         {
           tree may_be_zero = niter_desc.may_be_zero;
           if ((integer_zerop (may_be_zero)
-               || integer_nonzerop (may_be_zero)
-               || COMPARISON_CLASS_P (may_be_zero))
+               /* As we are handling may_be_zero that's not false by
+                  rewriting niter to may_be_zero ? 0 : niter we require
+                  an empty latch. */
+               || (single_pred_p (loop->latch)
+                   && exit->src == single_pred (loop->latch)
+                   && (integer_nonzerop (may_be_zero)
+                       || COMPARISON_CLASS_P (may_be_zero))))
               && (!candidate
                   || dominated_by_p (CDI_DOMINATORS, exit->src,
                                      candidate->src)))
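
For readers unfamiliar with the "may_be_zero ? 0 : niter" rewrite named in
the new comment, here is a minimal standalone sketch of that select.  It is
not GCC code and does not reproduce the PR: struct niter_sketch,
describe_loop, effective_niter, count_latch_runs and niter_sketch_selfcheck
are hypothetical names, and the loop shape (a single exit test in the
header, so the latch is empty) is an assumption chosen to show only the
case where the rewrite is straightforward.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a niter descriptor.  Only the two field
   names are taken from the patch; the layout is an assumption.  */
struct niter_sketch
{
  unsigned niter;    /* Latch executions, meaningful only if !may_be_zero.  */
  bool may_be_zero;  /* True if the loop exits before the latch ever runs.  */
};

/* Describe "for (i = start; i < end; i++)" with the test in the header.  */
static struct niter_sketch
describe_loop (unsigned start, unsigned end)
{
  struct niter_sketch d;
  d.niter = end - start;        /* Wraps around when end < start.  */
  d.may_be_zero = end < start;  /* Guards exactly that wrapping case.  */
  return d;
}

/* The select from the patch comment: may_be_zero ? 0 : niter.  */
static unsigned
effective_niter (struct niter_sketch d)
{
  return d.may_be_zero ? 0 : d.niter;
}

/* Reference value: count latch executions by actually running the loop.  */
static unsigned
count_latch_runs (unsigned start, unsigned end)
{
  unsigned latch = 0;
  for (unsigned i = start; i < end; i++)
    latch++;
  return latch;
}

/* Check that the select agrees with the reference for one input.  */
static void
niter_sketch_selfcheck (unsigned start, unsigned end)
{
  assert (effective_niter (describe_loop (start, end))
          == count_latch_runs (start, end));
}
```

Calling niter_sketch_selfcheck (3, 10) exercises the ordinary case, while
niter_sketch_selfcheck (10, 3) exercises the case where the raw niter
expression has wrapped and only the may_be_zero guard gives the right
count.  The patch restricts this rewrite to the exit whose source block is
the single predecessor of the latch; the sketch does not model the
non-empty-latch situation the patch rejects.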