Message ID | n083282q-0soo-43p0-4qr7-0qos15o313p8@fhfr.qr |
---|---|
Headers |
Date: Wed, 13 Dec 2023 13:30:43 +0100 (CET) From: Richard Biener <rguenther@suse.de> To: gcc-patches@gcc.gnu.org cc: richard.sandiford@arm.com Subject: [PATCH][0/6][RFC] Relax single-vector-size restriction Message-ID: <n083282q-0soo-43p0-4qr7-0qos15o313p8@fhfr.qr> List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> |
Series |
Relax single-vector-size restriction
|
|
Message
Richard Biener
Dec. 13, 2023, 12:30 p.m. UTC
I've been asked to look into how to best relax the current restriction of the vectorizer that it prefers to use a single vector size throughout loop vectorization. That size is determined by the preferred_simd_mode and the autovectorize_vector_modes hook for other-than-first iterations. The target does have some leeway with its related_mode hook, which you can see in the aarch64 backend, which has a "hack" preferring "1 128-bit vector instead of 2 64-bit vectors" (for ADVSIMD). Incidentally that hack allows it to vectorize gcc.dg/vect/pr65947-7.c, which uses a condition reduction that is generally unhappy about the ncopies > 1 case.

The first roadblock you hit when trying to relax things is that we are assigning vector types very early - during data reference analysis and during pattern matching, and then for the rest of the stmts as part of determining the vectorization factor. The patch series starts pushing that back (with some exceptions - it's a proof-of-concept), trying to get us to the point of determining the vectorization factor to use and only after that assigning vector types (with that VF as one of the constraints).

In particular the patch tries to avoid altering the VF choice, as we're still iterating over the SIMD modes (possibly iterating over { VF, mode } pairs, where 'mode' would be VLA or VLS, might be a future improvement).

Apart from gcc.dg/vect/pr65947-7.c, which I'd like to see vectorized on x86_64, there is a motivational testcase like

    double x[1024];
    char y[1024];
    void foo ()
    {
      for (int i = 0 ; i < 16; ++i)
        {
          x[i] = i;
          y[i] = i;
        }
    }

where the iteration domain constrains the VF and we currently end up vectorizing this with SSE vectors, causing 8 vector stores to x[] even when AVX2 or AVX512 widths would be available there.
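To make the store count in the testcase concrete, here is a small sketch of the arithmetic, assuming a VF of 16 (the full iteration count) and vector sizes of 16, 32 and 64 bytes for SSE, AVX2 and AVX512. The helper name copies_per_iteration is hypothetical and is not a GCC function; it only illustrates why the double stores need several copies with narrow vectors.

```c
#include <assert.h>

/* Number of vector statements needed to cover one vectorized iteration:
   VF scalar elements of ELEM_BYTES each, packed into vectors of
   VECTOR_BYTES.  Hypothetical helper for illustration only, not part
   of GCC.  */
static unsigned
copies_per_iteration (unsigned vf, unsigned elem_bytes, unsigned vector_bytes)
{
  unsigned total_bytes = vf * elem_bytes;

  /* A usable vector must not hold more elements than the VF provides,
     and must divide the data evenly.  */
  assert (vector_bytes != 0 && total_bytes % vector_bytes == 0);

  return total_bytes / vector_bytes;
}
```

Under these assumptions, copies_per_iteration (16, 8, 16) gives the 8 SSE stores to x[] mentioned above, while 32-byte and 64-byte vectors would need 4 and 2 stores. The char accesses to y[] fit in a single 16-byte vector.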
After a lot of different experiments I finally settled on the following minimal-impact solution - after determining the VF we assign vector types, but allow vector modes larger than the currently active one, up to the size of the preferred_simd_mode, when that stays within the constraint of the VF.

For the second example above on x86 with -march=znver4 we then fail vectorizing with V64QImode (AVX512, the preferred_simd_mode) and with V32QImode (AVX2) because of the low iteration count, but we succeed with V16QImode (SSE, as with current GCC) and are able to choose V8DFmode for the accesses to x[] (AVX512, the preferred_simd_mode).

The condition reduction case works in a similar way - with just SSE we succeed with V4HImode but use V4SImode for the condition, keeping ncopies == 1 and making the vectorizer happy.

The patch series prototypes this for non-SLP loop vectorization (because the testcases above do not use SLP) - the prototype doesn't pass testing and I won't pursue this further until we get rid of the non-SLP path.

The series starts with some cleanups that might still be applicable though, reducing calls to get_vectype_for_scalar_type where the vector types should be known already (all of the constant/external def kinds will go away with SLP-only anyway).

Then, as I first tried to vary the VF, it makes LOOP_VINFO_VECT_FACTOR an rvalue to make sure we nowhere rely on its value before it's really final.

Gathers/scatters also complicate matters right now since we're analyzing them very early (and that analysis needs a vector type), but the actual offset def we need to mark relevant is tightly coupled with the vector type chosen for it (and what the target actually supports). That's going to be tricky. I also noticed that we might no longer need the gather/scatter pattern support as SLP can handle them without the IFNs(?)

Then there is some general API cleanup wrt unsigned vs. poly-uint, and finally the last patch in the series defers setting STMT_VINFO_VECTYPE (with exceptions as I said) and has a cobbled-up loop to assign vector types after the VF is determined with the scheme described above. There are complications around mask types, so the patch goes one step further and makes vectorizable_operation determine the vector type of the def from the vector types of the operands.

I think that in the end we want to "force" as few vector types as possible and perform upward/downward propagation from within vectorizable_*, which would need a new mode of operation for this (figure either output or input vector types from what is present, possibly signaling DEFER and queuing either uses of the output or the fixed inputs for further processing).

I'd like to get some feedback on the way I chose to wire the new flexibility into the existing mode iteration and on whether that's sound for both SVE and NEON, or whether any of you have concerns around that or ideas on how to exploit such flexibility instead.

Thanks,
Richard.