From patchwork Fri Aug 5 12:53:07 2022
X-Patchwork-Submitter: "Andre Vieira (lists)"
X-Patchwork-Id: 400
Message-ID: <6bdb70e9-8c02-2c91-9ec3-33004a67c3ed@arm.com>
Date: Fri, 5 Aug 2022 13:53:07 +0100
Subject: [PATCH 1/4] aarch64: encourage use of GPR input for SIMD inserts
To: gcc-patches@gcc.gnu.org
From: "Andre Vieira (lists)"
Cc: Richard Sandiford

Hi,

This patch makes it more likely that the compiler will use a GPR input for SIMD inserts, by removing the '?' modifier that penalised the GPR alternative. I believe that modifier is an outdated hack we used to prevent costly GPR<->SIMD register file transfers.
This patch is required for better codegen in situations like the test case 'int8_3' in the next patch in this series.

Bootstrapped and regression tested together with the next patch on aarch64-none-linux-gnu.

gcc/ChangeLog:
2022-08-05  Andre Vieira

	* config/aarch64/aarch64-simd.md (aarch64_simd_vec_set): Remove
	'?' modifier.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 587a45d77721e1b39accbad7dbeca4d741eccb10..51eab5a872ade7b70268676346e8be7c9c6c8e3a 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1038,7 +1038,7 @@
   [(set (match_operand:VALL_F16 0 "register_operand" "=w,w,w")
 	(vec_merge:VALL_F16
 	    (vec_duplicate:VALL_F16
-	      (match_operand: 1 "aarch64_simd_nonimmediate_operand" "w,?r,Utv"))
+	      (match_operand: 1 "aarch64_simd_nonimmediate_operand" "w,r,Utv"))
 	    (match_operand:VALL_F16 3 "register_operand" "0,0,0")
 	    (match_operand:SI 2 "immediate_operand" "i,i,i")))]
   "TARGET_SIMD"

From patchwork Fri Aug 5 12:55:02 2022
X-Patchwork-Submitter: "Andre Vieira (lists)"
X-Patchwork-Id: 401
Message-ID: <317d0d74-e7e1-05e8-45d3-98bbc929a922@arm.com>
Date: Fri, 5 Aug 2022 13:55:02 +0100
Subject: [PATCH 2/4] aarch64: Change aarch64_expand_vector_init to use rtx_vector_builder
To: gcc-patches@gcc.gnu.org
From: "Andre Vieira (lists)"
Cc: Richard Sandiford

Hi,

This patch changes aarch64_expand_vector_init to use rtx_vector_builder, exploiting its internal pattern detection to find 'dup' patterns.

Bootstrapped and regression tested on aarch64-none-linux-gnu.

Is this OK for trunk, or should we wait for the rest of the series?

gcc/ChangeLog:
2022-08-05  Andre Vieira

	* config/aarch64/aarch64.cc (aarch64_vec_duplicate): New.
(aarch64_expand_vector_init): Make the existing variant construct          a rtx_vector_builder from the list of elements and use this to detect          duplicate patterns. gcc/testesuite/ChangeLog: 2022-08-05  Andre Vieira          * gcc.target/aarch64/ldp_stp_16.c: Modify to reflect code change. diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 4b486aeea90ea2afb9cdd96a4dbe15c5bb2abd7a..a08043e18d609e258ebfe033875201163d129aba 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -305,6 +305,7 @@ static machine_mode aarch64_simd_container_mode (scalar_mode, poly_int64); static bool aarch64_print_address_internal (FILE*, machine_mode, rtx, aarch64_addr_query_type); static HOST_WIDE_INT aarch64_clamp_to_uimm12_shift (HOST_WIDE_INT val); +static void aarch64_expand_vector_init (rtx, rtx_vector_builder&); /* The processor for which instructions should be scheduled. */ enum aarch64_processor aarch64_tune = cortexa53; @@ -21804,55 +21805,96 @@ aarch64_simd_make_constant (rtx vals) return NULL_RTX; } +static void +aarch64_vec_duplicate (rtx target, machine_mode mode, machine_mode element_mode, + int narrow_n_elts) +{ + poly_uint64 size = narrow_n_elts * GET_MODE_BITSIZE (element_mode); + scalar_mode i_mode = int_mode_for_size (size, 0).require (); + machine_mode o_mode; + if (aarch64_sve_mode_p (mode)) + o_mode = aarch64_full_sve_mode (i_mode).require (); + else + o_mode + = aarch64_simd_container_mode (i_mode, + GET_MODE_BITSIZE (mode)); + rtx input = simplify_gen_subreg (i_mode, target, mode, 0); + rtx output = simplify_gen_subreg (o_mode, target, mode, 0); + aarch64_emit_move (output, gen_vec_duplicate (o_mode, input)); +} + + /* Expand a vector initialisation sequence, such that TARGET is initialised to contain VALS. */ void aarch64_expand_vector_init (rtx target, rtx vals) { - machine_mode mode = GET_MODE (target); - scalar_mode inner_mode = GET_MODE_INNER (mode); /* The number of vector elements. 
*/ int n_elts = XVECLEN (vals, 0); - /* The number of vector elements which are not constant. */ - int n_var = 0; - rtx any_const = NULL_RTX; + machine_mode mode = GET_MODE (target); + scalar_mode inner_mode = GET_MODE_INNER (mode); /* The first element of vals. */ rtx v0 = XVECEXP (vals, 0, 0); - bool all_same = true; /* This is a special vec_init where N is not an element mode but a vector mode with half the elements of M. We expect to find two entries of mode N in VALS and we must put their concatentation into TARGET. */ - if (XVECLEN (vals, 0) == 2 && VECTOR_MODE_P (GET_MODE (XVECEXP (vals, 0, 0)))) + if (n_elts == 2 + && VECTOR_MODE_P (GET_MODE (v0))) { - machine_mode narrow_mode = GET_MODE (XVECEXP (vals, 0, 0)); + machine_mode narrow_mode = GET_MODE (v0); gcc_assert (GET_MODE_INNER (narrow_mode) == inner_mode && known_eq (GET_MODE_SIZE (mode), 2 * GET_MODE_SIZE (narrow_mode))); - emit_insn (gen_aarch64_vec_concat (narrow_mode, target, - XVECEXP (vals, 0, 0), + emit_insn (gen_aarch64_vec_concat (narrow_mode, target, v0, XVECEXP (vals, 0, 1))); return; } - /* Count the number of variable elements to initialise. */ + rtx_vector_builder builder (mode, n_elts, 1); for (int i = 0; i < n_elts; ++i) + builder.quick_push (XVECEXP (vals, 0, i)); + builder.finalize (); + + aarch64_expand_vector_init (target, builder); +} + +static void +aarch64_expand_vector_init (rtx target, rtx_vector_builder &v) +{ + machine_mode mode = GET_MODE (target); + scalar_mode inner_mode = GET_MODE_INNER (mode); + /* The number of vector elements which are not constant. */ + unsigned n_var = 0; + rtx any_const = NULL_RTX; + /* The first element of vals. */ + rtx v0 = v.elt (0); + /* Get the number of elements to insert into an Advanced SIMD vector. + If we have more than one element per pattern then we use the constant + number of elements in a full vector. 
+ If we only have one element per pattern we use the number of patterns as + this may be lower than the number of elements in a full vector, which + means they repeat and we should use a duplicate of the smaller vector. */ + unsigned n_elts + = v.nelts_per_pattern () == 1 ? v.npatterns () + : v.full_nelts ().coeffs[0]; + + /* Count the number of variable elements to initialise. */ + for (unsigned i = 0; i < n_elts ; ++i) { - rtx x = XVECEXP (vals, 0, i); + rtx x = v.elt (i); if (!(CONST_INT_P (x) || CONST_DOUBLE_P (x))) ++n_var; else any_const = x; - - all_same &= rtx_equal_p (x, v0); } /* No variable elements, hand off to aarch64_simd_make_constant which knows how best to handle this. */ if (n_var == 0) { - rtx constant = aarch64_simd_make_constant (vals); + rtx constant = aarch64_simd_make_constant (v.build ()); if (constant != NULL_RTX) { emit_move_insn (target, constant); @@ -21861,7 +21903,7 @@ aarch64_expand_vector_init (rtx target, rtx vals) } /* Splat a single non-constant element if we can. */ - if (all_same) + if (n_elts == 1) { rtx x = copy_to_mode_reg (inner_mode, v0); aarch64_emit_move (target, gen_vec_duplicate (mode, x)); @@ -21879,14 +21921,15 @@ aarch64_expand_vector_init (rtx target, rtx vals) and matches[X][1] with the count of duplicate elements (if X is the earliest element which has duplicates). 
*/ - if (n_var == n_elts && n_elts <= 16) + if (n_var == n_elts) { - int matches[16][2] = {0}; - for (int i = 0; i < n_elts; i++) + gcc_assert (n_elts <= 16); + unsigned matches[16][2] = {0}; + for (unsigned i = 0; i < n_elts; i++) { - for (int j = 0; j <= i; j++) + for (unsigned j = 0; j <= i; j++) { - if (rtx_equal_p (XVECEXP (vals, 0, i), XVECEXP (vals, 0, j))) + if (rtx_equal_p (v.elt (i), v.elt (j))) { matches[i][0] = j; matches[j][1]++; @@ -21894,9 +21937,9 @@ aarch64_expand_vector_init (rtx target, rtx vals) } } } - int maxelement = 0; - int maxv = 0; - for (int i = 0; i < n_elts; i++) + unsigned maxelement = 0; + unsigned maxv = 0; + for (unsigned i = 0; i < n_elts; i++) if (matches[i][1] > maxv) { maxelement = i; @@ -21915,8 +21958,8 @@ aarch64_expand_vector_init (rtx target, rtx vals) || inner_mode == E_DFmode)) { - rtx x0 = XVECEXP (vals, 0, 0); - rtx x1 = XVECEXP (vals, 0, 1); + rtx x0 = v.elt (0); + rtx x1 = v.elt (1); /* Combine can pick up this case, but handling it directly here leaves clearer RTL. @@ -21939,24 +21982,26 @@ aarch64_expand_vector_init (rtx target, rtx vals) vector register. For big-endian we want that position to hold the last element of VALS. */ maxelement = BYTES_BIG_ENDIAN ? n_elts - 1 : 0; - rtx x = copy_to_mode_reg (inner_mode, XVECEXP (vals, 0, maxelement)); + rtx x = copy_to_mode_reg (inner_mode, v.elt (maxelement)); aarch64_emit_move (target, lowpart_subreg (mode, x, inner_mode)); } else { - rtx x = copy_to_mode_reg (inner_mode, XVECEXP (vals, 0, maxelement)); + rtx x = copy_to_mode_reg (inner_mode, v.elt (maxelement)); aarch64_emit_move (target, gen_vec_duplicate (mode, x)); } /* Insert the rest. 
*/ - for (int i = 0; i < n_elts; i++) + for (unsigned i = 0; i < n_elts; i++) { - rtx x = XVECEXP (vals, 0, i); + rtx x = v.elt (i); if (matches[i][0] == maxelement) continue; x = copy_to_mode_reg (inner_mode, x); emit_insn (GEN_FCN (icode) (target, x, GEN_INT (i))); } + if (!known_eq (v.full_nelts (), n_elts)) + aarch64_vec_duplicate (target, mode, GET_MODE (v0), n_elts); return; } @@ -21965,19 +22010,19 @@ aarch64_expand_vector_init (rtx target, rtx vals) can. */ if (n_var != n_elts) { - rtx copy = copy_rtx (vals); + rtx copy = v.build (); /* Load constant part of vector. We really don't care what goes into the parts we will overwrite, but we're more likely to be able to load the constant efficiently if it has fewer, larger, repeating parts (see aarch64_simd_valid_immediate). */ - for (int i = 0; i < n_elts; i++) + for (unsigned i = 0; i < n_elts; i++) { - rtx x = XVECEXP (vals, 0, i); + rtx x = XVECEXP (copy, 0, i); if (CONST_INT_P (x) || CONST_DOUBLE_P (x)) continue; rtx subst = any_const; - for (int bit = n_elts / 2; bit > 0; bit /= 2) + for (unsigned bit = n_elts / 2; bit > 0; bit /= 2) { /* Look in the copied vector, as more elements are const. */ rtx test = XVECEXP (copy, 0, i ^ bit); @@ -21989,18 +22034,21 @@ aarch64_expand_vector_init (rtx target, rtx vals) } XVECEXP (copy, 0, i) = subst; } + gcc_assert (GET_MODE (target) == GET_MODE (copy)); aarch64_expand_vector_init (target, copy); } /* Insert the variable lanes directly. 
*/ - for (int i = 0; i < n_elts; i++) + for (unsigned i = 0; i < n_elts; i++) { - rtx x = XVECEXP (vals, 0, i); + rtx x = v.elt (i); if (CONST_INT_P (x) || CONST_DOUBLE_P (x)) continue; x = copy_to_mode_reg (inner_mode, x); emit_insn (GEN_FCN (icode) (target, x, GEN_INT (i))); } + if (!known_eq (v.full_nelts (), n_elts)) + aarch64_vec_duplicate (target, mode, inner_mode, n_elts); } /* Emit RTL corresponding to: diff --git a/gcc/testsuite/gcc.target/aarch64/ldp_stp_16.c b/gcc/testsuite/gcc.target/aarch64/ldp_stp_16.c index 8ab117c4dcd7a731abc7e1b039e1faf0dfa09a5d..b307d2791824dd9c30200931452b2636708b5035 100644 --- a/gcc/testsuite/gcc.target/aarch64/ldp_stp_16.c +++ b/gcc/testsuite/gcc.target/aarch64/ldp_stp_16.c @@ -96,8 +96,8 @@ CONS2_FN (4, float); /* ** cons2_8_float: -** dup v([0-9]+)\.4s, .* -** ... +** ins v0\.s\[1\], v1\.s\[0\] +** dup v([0-9]+)\.2d, v0\.d\[0\] ** stp q\1, q\1, \[x0\] ** stp q\1, q\1, \[x0, #?32\] ** ret diff --git a/gcc/testsuite/gcc.target/aarch64/vect_init.c b/gcc/testsuite/gcc.target/aarch64/vect_init.c new file mode 100644 index 0000000000000000000000000000000000000000..546e44e96f4db60d289b4bc0ebfecbe18c81b4cc --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/vect_init.c @@ -0,0 +1,144 @@ +#include + +/* +** int32_0: +** fmov s0, w0 +** ins v0.s\[1\], w1 +** dup v0.2d, v0.d\[0\] +** ret +*/ + +int32x4_t int32_0 (int a, int b) +{ + int32x4_t v = {a, b, a, b}; + return v; +} +/* +** int32_1: +** dup v0.4s, w0 +** ret +*/ + +int32x4_t int32_1 (int a) +{ + int32x4_t v = {a, a, a, a}; + return v; +} + +/* +** int16_0: +** sxth w0, w0 +** fmov s0, w0 +** ins v0.h\[1\], w1 +** ins v0.h\[2\], w2 +** ins v0.h\[3\], w3 +** dup v0.2d, v0.d\[0\] +** ret +*/ + +int16x8_t int16_0 (int16_t a, int16_t b, int16_t c, int16_t d) +{ + int16x8_t v = {a, b, c, d, + a, b, c, d}; + return v; +} + +/* +** int16_1: +** sxth w0, w0 +** fmov s0, w0 +** ins v0.h\[1\], w1 +** dup v0.4s, v0.s\[0\] +** ret +*/ + +int16x8_t int16_1 (int16_t a, int16_t b) +{ + 
int16x8_t v = {a, b, a, b, + a, b, a, b}; + return v; +} + +/* +** int16_2: +** dup v0.8h, w0 +** ret +*/ + +int16x8_t int16_2 (int16_t a) +{ + int16x8_t v = {a, a, a, a, + a, a, a, a}; + return v; +} + +/* +** int8_0: +** sxtb w0, w0 +** fmov s0, w0 +** ins v0.b\[1\], w1 +** ins v0.b\[2\], w2 +** ins v0.b\[3\], w3 +** ins v0.b\[4\], w4 +** ins v0.b\[5\], w5 +** ins v0.b\[6\], w6 +** ins v0.b\[7\], w7 +** dup v0.2d, v0.d\[0\] +** ret +*/ + +int8x16_t int8_0 (int8_t a, int8_t b, int8_t c, int8_t d, int8_t e, int8_t f, + int8_t g, int8_t h) +{ + int8x16_t v = {a, b, c, d, e, f, g, h, + a, b, c, d, e, f, g, h}; + return v; +} + +/* +** int8_1: +** sxtb w0, w0 +** fmov s0, w0 +** ins v0.b\[1\], w1 +** ins v0.b\[2\], w2 +** ins v0.b\[3\], w3 +** dup v0.4s, v0.s\[0\] +** ret +*/ + +int8x16_t int8_1 (int8_t a, int8_t b, int8_t c, int8_t d) +{ + int8x16_t v = {a, b, c, d, a, b, c, d, + a, b, c, d, a, b, c, d}; + return v; +} + +/* +** int8_2: +** sxtb w0, w0 +** fmov s0, w0 +** ins v0.b\[1\], w1 +** dup v0.8h, v0.h\[0\] +** ret +*/ + +int8x16_t int8_2 (int8_t a, int8_t b) +{ + int8x16_t v = {a, b, a, b, a, b, a, b, + a, b, a, b, a, b, a, b}; + return v; +} + +/* +** int8_3: +** dup v0.16b, w0 +** ret +*/ + +int8x16_t int8_3 (int8_t a) +{ + int8x16_t v = {a, a, a, a, a, a, a, a, + a, a, a, a, a, a, a, a}; + return v; +} + From patchwork Fri Aug 5 12:56:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 402 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:6a10:20da:b0:2d3:3019:e567 with SMTP id n26csp175952pxc; Fri, 5 Aug 2022 05:57:42 -0700 (PDT) X-Google-Smtp-Source: AA6agR6/eOlECqoxzn7Uo21DhDwDHf+lzJccbuKbN2dBczNUGx8pGsf/43QvRM8jEJLH/xypUfXT X-Received: by 2002:aa7:d60b:0:b0:43c:f7ab:3c8f with SMTP id c11-20020aa7d60b000000b0043cf7ab3c8fmr6567268edr.6.1659704262075; Fri, 05 Aug 2022 05:57:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; 
Message-ID: <69a0dc52-4125-1d25-fa2b-4acf6cc3b80f@arm.com>
Date: Fri, 5 Aug 2022 13:56:48 +0100
Subject: [PATCH 3/4] match.pd: Teach forwprop to handle VLA VEC_PERM_EXPRs with VLS CONSTRUCTORs as arguments
To: gcc-patches@gcc.gnu.org
From: "Andre Vieira (lists)"
Cc: Richard Sandiford

Hi,

This patch is part of the WIP patch that follows in this series. Its goal is to teach forwprop to handle VLA VEC_PERM_EXPRs with VLS CONSTRUCTORs as arguments, as preparation for the 'VLA constructor' hook approach.
Kind Regards,
Andre

diff --git a/gcc/match.pd b/gcc/match.pd
index 9736393061aac61d4d53aaad6cf6b2c97a7d4679..3c3c0c6a88b35a6e42c506f6c4603680fe6e4318 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -7852,14 +7852,24 @@ and,
       if (!tree_to_vec_perm_builder (&builder, op2))
 	return NULL_TREE;
 
+      /* FIXME: disable folding of a VEC_PERM_EXPR with a VLA mask and VLS
+	 CONSTRUCTORS, since that would yield a VLA CONSTRUCTOR which we
+	 currently do not support.  */
+      if (!TYPE_VECTOR_SUBPARTS (type).is_constant ()
+	  && (TYPE_VECTOR_SUBPARTS (TREE_TYPE (op0)).is_constant ()
+	      || TYPE_VECTOR_SUBPARTS (TREE_TYPE (op1)).is_constant ()))
+	return NULL_TREE;
+
       /* Create a vec_perm_indices for the integer vector.  */
       poly_uint64 nelts = TYPE_VECTOR_SUBPARTS (type);
       bool single_arg = (op0 == op1);
       vec_perm_indices sel (builder, single_arg ? 1 : 2, nelts);
     }
-    (if (sel.series_p (0, 1, 0, 1))
+    (if (sel.series_p (0, 1, 0, 1)
+	 && useless_type_conversion_p (type, TREE_TYPE (op0)))
      { op0; }
-     (if (sel.series_p (0, 1, nelts, 1))
+     (if (sel.series_p (0, 1, nelts, 1)
+	  && useless_type_conversion_p (type, TREE_TYPE (op1)))
       { op1; }
       (with
        {
diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
index fdc4bc8909d2763876550e53277ff2b3dcca796a..cda91c21c476ea8611e12c593bfa64e1d71dd29e 100644
--- a/gcc/tree-ssa-forwprop.cc
+++ b/gcc/tree-ssa-forwprop.cc
@@ -2661,7 +2661,7 @@ simplify_permutation (gimple_stmt_iterator *gsi)
 	  /* Shuffle of a constructor.  */
 	  bool ret = false;
-	  tree res_type = TREE_TYPE (arg0);
+	  tree res_type = TREE_TYPE (gimple_get_lhs (stmt));
 	  tree opt = fold_ternary (VEC_PERM_EXPR, res_type, arg0, arg1, op2);
 	  if (!opt
 	      || (TREE_CODE (opt) != CONSTRUCTOR && TREE_CODE (opt) != VECTOR_CST))

From patchwork Fri Aug 5 12:58:16 2022
X-Patchwork-Submitter: "Andre Vieira (lists)"
X-Patchwork-Id: 403
Message-ID: <3f90f079-8c12-2547-c925-a28779fdb267@arm.com>
Date: Fri, 5 Aug 2022 13:58:16 +0100
From: "Andre Vieira (lists)"
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [PATCH 4/4][RFC] VLA Constructor
In-Reply-To: <95d2de77-5b68-6d0b-ac99-ac1ca28835e2@arm.com>
This isn't really a 'PATCH' yet; it's something I was working on but had to
put on hold.  Feel free to re-use any bits or trash all of it if you'd like.

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 82f9eba5c397af04924bdebdc684a1d77682d3fd..08625aad7b1a8dc9c9f8c491cb13d8af0b46a946 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -842,13 +842,45 @@ public:
     for (unsigned int i = 0; i < nargs; ++i)
       {
 	tree elt = gimple_call_arg (f.call, i);
-	if (!CONSTANT_CLASS_P (elt))
-	  return NULL;
 	builder.quick_push (elt);
 	for (unsigned int j = 1; j < factor; ++j)
 	  builder.quick_push (build_zero_cst (TREE_TYPE (vec_type)));
       }
-    return gimple_build_assign (f.lhs, builder.build ());
+    builder.finalize ();
+    unsigned int n_elts
+      = builder.nelts_per_pattern () == 1 ? builder.npatterns ()
+					  : builder.full_nelts ().coeffs[0];
+
+    if (n_elts == 1)
+      return gimple_build_assign (f.lhs, build1 (VEC_DUPLICATE_EXPR, vec_type,
+						 builder.elt (0)));
+    tree list = NULL_TREE;
+    tree *pp = &list;
+    for (unsigned int i = 0; i < n_elts; ++i)
+      {
+	*pp = build_tree_list (NULL, builder.elt (i) PASS_MEM_STAT);
+	pp = &TREE_CHAIN (*pp);
+      }
+
+    poly_uint64 vec_len = TYPE_VECTOR_SUBPARTS (vec_type);
+    vec_perm_builder sel (vec_len, n_elts, 1);
+    for (unsigned int i = 0; i < n_elts; i++)
+      sel.quick_push (i);
+    vec_perm_indices indices (sel, 1, n_elts);
+
+    tree elt_type = TREE_TYPE (vec_type);
+
+    tree ctor_type = build_vector_type (elt_type, n_elts);
+    tree ctor = make_ssa_name_fn (cfun, ctor_type, 0);
+    gimple *ctor_stmt
+      = gimple_build_assign (ctor,
+			     build_constructor_from_list (ctor_type, list));
+    gsi_insert_before (f.gsi, ctor_stmt, GSI_SAME_STMT);
+
+    tree mask_type = build_vector_type (ssizetype, vec_len);
+    tree mask = vec_perm_indices_to_tree (mask_type, indices);
+    return gimple_build_assign (f.lhs, fold_build3 (VEC_PERM_EXPR, vec_type,
+						    ctor, ctor, mask));
   }

   rtx
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index bd60e65b0c3f05f1c931f03807170f3b9d699de5..dec935211e5a064239c858880a696e6ca3fe1ae2 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -2544,6 +2544,17 @@
   }
 )

+;; Duplicate an Advanced SIMD vector to fill an SVE vector (LE version).
+(define_insn "*aarch64_vec_duplicate_reg_le"
+  [(set (match_operand:SVE_FULL 0 "register_operand" "=w,w")
+	(vec_duplicate:SVE_FULL
+	  (match_operand: 1 "register_operand" "w,r")))]
+  "TARGET_SVE && !BYTES_BIG_ENDIAN"
+  "@
+   mov\t%0., %1
+   mov\t%0., %1"
+)
+
 ;; Duplicate an Advanced SIMD vector to fill an SVE vector (BE version).
 ;; The SVE register layout puts memory lane N into (architectural)
 ;; register lane N, whereas the Advanced SIMD layout puts the memory
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index a08043e18d609e258ebfe033875201163d129aba..9b118e4101d0a5995a833769433be49321ab2151 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -6033,7 +6033,6 @@ rtx
 aarch64_expand_sve_dupq (rtx target, machine_mode mode, rtx src)
 {
   machine_mode src_mode = GET_MODE (src);
-  gcc_assert (GET_MODE_INNER (mode) == GET_MODE_INNER (src_mode));
   insn_code icode = (BYTES_BIG_ENDIAN
		      ? code_for_aarch64_vec_duplicate_vq_be (mode)
		      : code_for_aarch64_vec_duplicate_vq_le (mode));
@@ -21806,20 +21805,29 @@ aarch64_simd_make_constant (rtx vals)
 }

 static void
-aarch64_vec_duplicate (rtx target, machine_mode mode, machine_mode element_mode,
+aarch64_vec_duplicate (rtx target, rtx op, machine_mode mode, machine_mode element_mode,
		       int narrow_n_elts)
 {
   poly_uint64 size = narrow_n_elts * GET_MODE_BITSIZE (element_mode);
-  scalar_mode i_mode = int_mode_for_size (size, 0).require ();
   machine_mode o_mode;
-  if (aarch64_sve_mode_p (mode))
-    o_mode = aarch64_full_sve_mode (i_mode).require ();
+  rtx input, output;
+  bool sve = aarch64_sve_mode_p (mode);
+  if (sve && known_eq (size, 128U))
+    {
+      o_mode = mode;
+      output = target;
+      input = op;
+    }
   else
-    o_mode
-      = aarch64_simd_container_mode (i_mode,
-				     GET_MODE_BITSIZE (mode));
-  rtx input = simplify_gen_subreg (i_mode, target, mode, 0);
-  rtx output = simplify_gen_subreg (o_mode, target, mode, 0);
+    {
+      scalar_mode i_mode = int_mode_for_size (size, 0).require ();
+      o_mode
+	= sve ? aarch64_full_sve_mode (i_mode).require ()
+	      : aarch64_simd_container_mode (i_mode,
+					     GET_MODE_BITSIZE (mode));
+      input = simplify_gen_subreg (i_mode, op, GET_MODE (op), 0);
+      output = simplify_gen_subreg (o_mode, target, mode, 0);
+    }
   aarch64_emit_move (output, gen_vec_duplicate (o_mode, input));
 }

@@ -21910,6 +21918,16 @@ aarch64_expand_vector_init (rtx target, rtx_vector_builder &v)
       return;
     }

+  /* We are constructing a VLS vector that we may later duplicate into a VLA
+     one.  Actually maybe split this into one for ASIMD and one for SVE? */
+  machine_mode real_mode = mode;
+  rtx real_target = target;
+  if (aarch64_sve_mode_p (real_mode))
+    {
+      mode = aarch64_vq_mode (GET_MODE_INNER (real_mode)).require ();
+      target = simplify_gen_subreg (mode, target, real_mode, 0);
+    }
+
   enum insn_code icode = optab_handler (vec_set_optab, mode);
   gcc_assert (icode != CODE_FOR_nothing);

@@ -22000,8 +22018,8 @@ aarch64_expand_vector_init (rtx target, rtx_vector_builder &v)
	    x = copy_to_mode_reg (inner_mode, x);
	  emit_insn (GEN_FCN (icode) (target, x, GEN_INT (i)));
	}
-      if (!known_eq (v.full_nelts (), n_elts))
-	aarch64_vec_duplicate (target, mode, GET_MODE (v0), n_elts);
+      if (!known_eq (v.full_nelts (), n_elts))
+	aarch64_vec_duplicate (real_target, target, real_mode, GET_MODE (v0), n_elts);
       return;
     }

@@ -22048,7 +22066,7 @@ aarch64_expand_vector_init (rtx target, rtx_vector_builder &v)
       emit_insn (GEN_FCN (icode) (target, x, GEN_INT (i)));
     }
   if (!known_eq (v.full_nelts (), n_elts))
-    aarch64_vec_duplicate (target, mode, inner_mode, n_elts);
+    aarch64_vec_duplicate (real_target, target, real_mode, inner_mode, n_elts);
 }

 /* Emit RTL corresponding to:
@@ -23947,11 +23965,7 @@ aarch64_evpc_sve_dup (struct expand_vec_perm_d *d)
   if (BYTES_BIG_ENDIAN
       || !d->one_vector_p
       || d->vec_flags != VEC_SVE_DATA
-      || d->op_vec_flags != VEC_ADVSIMD
-      || d->perm.encoding ().nelts_per_pattern () != 1
-      || !known_eq (d->perm.encoding ().npatterns (),
-		    GET_MODE_NUNITS (d->op_mode))
-      || !known_eq (GET_MODE_BITSIZE (d->op_mode), 128))
+      || d->perm.encoding ().nelts_per_pattern () != 1)
     return false;

   int npatterns = d->perm.encoding ().npatterns ();
@@ -23962,7 +23976,10 @@ aarch64_evpc_sve_dup (struct expand_vec_perm_d *d)
   if (d->testing_p)
     return true;

-  aarch64_expand_sve_dupq (d->target, GET_MODE (d->target), d->op0);
+  machine_mode mode = GET_MODE (d->target);
+  machine_mode element_mode = GET_MODE_INNER (mode);
+  aarch64_vec_duplicate (d->target, d->op0, mode, element_mode,
+			 d->perm.encoding ().npatterns ());
   return true;
 }

@@ -24194,6 +24211,15 @@ aarch64_vectorize_vec_perm_const (machine_mode vmode, machine_mode op_mode,
   return ret;
 }

+/* Implement TARGET_VECTORIZE_VLA_CONSTRUCTOR.  */
+
+static bool
+aarch64_vectorize_vla_constructor (rtx target, rtx_vector_builder &builder)
+{
+  aarch64_expand_vector_init (target, builder);
+  return true;
+}
+
 /* Generate a byte permute mask for a register of mode MODE,
    which has NUNITS units.  */

@@ -27667,6 +27693,10 @@ aarch64_libgcc_floating_mode_supported_p
 #define TARGET_VECTORIZE_VEC_PERM_CONST \
   aarch64_vectorize_vec_perm_const

+#undef TARGET_VECTORIZE_VLA_CONSTRUCTOR
+#define TARGET_VECTORIZE_VLA_CONSTRUCTOR \
+  aarch64_vectorize_vla_constructor
+
 #undef TARGET_VECTORIZE_RELATED_MODE
 #define TARGET_VECTORIZE_RELATED_MODE aarch64_vectorize_related_mode
 #undef TARGET_VECTORIZE_GET_MASK_MODE
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index b0ea39884aa3ced5c0ccc1e792088aa66997ec3b..eda3f014984f62d96d7fe0b3c0c439905375f25a 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6112,6 +6112,11 @@ instruction pattern.  There is no need for the hook to handle these two
 implementation approaches itself.
 @end deftypefn

+@deftypefn {Target Hook} bool TARGET_VECTORIZE_VLA_CONSTRUCTOR (rtx @var{target}, rtx_vector_builder @var{&builder})
+This hook is used to expand a vla constructor into @var{target}
+using the rtx_vector_builder @var{builder}.
+@end deftypefn
+
 @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION (unsigned @var{code}, tree @var{vec_type_out}, tree @var{vec_type_in})
 This hook should return the decl of a function that implements the
 vectorized variant of the function with the @code{combined_fn} code
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index f869ddd5e5b8b7acbd8e9765fb103af24a1085b6..07f4f77877b18a23f6fd205a8dd8daf1a03c2923 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4164,6 +4164,8 @@ address;  but often a machine-dependent strategy can generate better code.
 @hook TARGET_VECTORIZE_VEC_PERM_CONST

+@hook TARGET_VECTORIZE_VLA_CONSTRUCTOR
+
 @hook TARGET_VECTORIZE_BUILTIN_VECTORIZED_FUNCTION

 @hook TARGET_VECTORIZE_BUILTIN_MD_VECTORIZED_FUNCTION
diff --git a/gcc/expr.cc b/gcc/expr.cc
index f9753d48245d56039206647be8576246a3b25ed3..b9eb550cac4c68464c95cffa8da19b3984b80782 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -10264,6 +10264,44 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,

     case VEC_PERM_EXPR:
       {
+	if (TREE_CODE (treeop2) == VECTOR_CST
+	    && targetm.vectorize.vla_constructor)
+	  {
+	    tree ctor0, ctor1;
+	    if (TREE_CODE (treeop0) == SSA_NAME
+		&& is_gimple_assign (SSA_NAME_DEF_STMT (treeop0)))
+	      ctor0 = gimple_assign_rhs1 (SSA_NAME_DEF_STMT (treeop0));
+	    else
+	      ctor0 = treeop0;
+	    if (TREE_CODE (treeop1) == SSA_NAME
+		&& is_gimple_assign (SSA_NAME_DEF_STMT (treeop1)))
+	      ctor1 = gimple_assign_rhs1 (SSA_NAME_DEF_STMT (treeop1));
+	    else
+	      ctor1 = treeop1;
+
+	    if (TREE_CODE (ctor0) == CONSTRUCTOR
+		&& TREE_CODE (ctor1) == CONSTRUCTOR)
+	      {
+
+		unsigned int nelts = vector_cst_encoded_nelts (treeop2);
+		unsigned int ctor_nelts = CONSTRUCTOR_NELTS (ctor0);
+		machine_mode mode = GET_MODE (target);
+		rtx_vector_builder builder (mode, nelts, 1);
+		for (unsigned int i = 0; i < nelts; ++i)
+		  {
+		    unsigned HOST_WIDE_INT index
+		      = tree_to_uhwi (VECTOR_CST_ENCODED_ELT (treeop2, i));
+		    tree op
+		      = index >= ctor_nelts
+			? CONSTRUCTOR_ELT (ctor1, index - ctor_nelts)->value
+			: CONSTRUCTOR_ELT (ctor0, index)->value;
+		    builder.quick_push (expand_normal (op));
+		  }
+		builder.finalize ();
+		if (targetm.vectorize.vla_constructor (target, builder))
+		  return target;
+	      }
+	  }
	expand_operands (treeop0, treeop1, target, &op0, &op1, EXPAND_NORMAL);
	vec_perm_builder sel;
	if (TREE_CODE (treeop2) == VECTOR_CST
diff --git a/gcc/target.def b/gcc/target.def
index 2a7fa68f83dd15dcdd2c332e8431e6142ec7d305..3c219b6a90d9cc1a6393a3ebc24e54fcf14c6377 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1902,6 +1902,13 @@ implementation approaches itself.",
  const vec_perm_indices &sel),
  NULL)

+DEFHOOK
+(vla_constructor,
+ "This hook is used to expand a vla constructor into @var{target}\n\
+using the rtx_vector_builder @var{builder}.",
+ bool, (rtx target, rtx_vector_builder &builder),
+ NULL)
+
 /* Return true if the target supports misaligned store/load of a
    specific factor denoted in the third parameter.  The last parameter
    is true if the access is defined in a packed struct.  */
diff --git a/gcc/target.h b/gcc/target.h
index d6fa6931499d15edff3e5af3e429540d001c7058..b46b8f0d7a9c52f6efe6acf10f589703cec3bd08 100644
--- a/gcc/target.h
+++ b/gcc/target.h
@@ -262,6 +262,8 @@ enum poly_value_estimate_kind
 extern bool verify_type_context (location_t, type_context_kind, const_tree,
				 bool = false);

+class rtx_vector_builder;
+
 /* The target structure.  This holds all the backend hooks.  */
 #define DEFHOOKPOD(NAME, DOC, TYPE, INIT) TYPE NAME;
 #define DEFHOOK(NAME, DOC, TYPE, PARAMS, INIT) TYPE (* NAME) PARAMS;
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/dupq_opt_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/dupq_opt_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..01f652931555534f43e0487766c568c72a5df686
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/dupq_opt_1.c
@@ -0,0 +1,134 @@
+/* { dg-options { "-O2" } } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+#include <arm_sve.h>
+
+/*
+** test0:
+**	ins	v0.s\[1\], v1.s\[0\]
+**	mov	z0.d, d0
+**	ret
+*/
+svfloat32_t test0(float x, float y) {
+  return svdupq_n_f32(x, y, x, y);
+}
+
+/*
+** test1:
+**	mov	z0.s, s0
+**	ret
+*/
+svfloat32_t test1(float x) {
+  return svdupq_n_f32(x, x, x, x);
+}
+
+/*
+** test2:
+**	mov	z0.s, w0
+**	ret
+*/
+svint32_t test2(int x) {
+  return svdupq_n_s32(x, x, x, x);
+}
+
+/*
+** test3:
+**	sxth	w0, w0
+**	fmov	d0, x0
+**	ins	v0.h\[1\], w1
+**	ins	v0.h\[2\], w2
+**	ins	v0.h\[3\], w3
+**	mov	z0.d, d0
+**	ret
+*/
+svint16_t test3(short a, short b, short c, short d)
+{
+  return svdupq_n_s16(a, b, c, d, a, b, c, d);
+}
+
+/*
+** test4:
+**	dup	v0.4h, w0
+**	ins	v0.h\[1\], w1
+**	ins	v0.h\[3\], w1
+**	mov	z0.d, d0
+**	ret
+*/
+svint16_t test4(short a, short b)
+{
+  return svdupq_n_s16(a, b, a, b, a, b, a, b);
+}
+
+/*
+** test5:
+**	mov	z0.h, w0
+**	ret
+*/
+svint16_t test5(short a)
+{
+  return svdupq_n_s16(a, a, a, a, a, a, a, a);
+}
+
+/*
+** test6:
+**	sxtb	w0, w0
+**	fmov	d0, x0
+**	ins	v0.b\[1\], w1
+**	ins	v0.b\[2\], w2
+**	ins	v0.b\[3\], w3
+**	ins	v0.b\[4\], w4
+**	ins	v0.b\[5\], w5
+**	ins	v0.b\[6\], w6
+**	ins	v0.b\[7\], w7
+**	mov	z0.d, d0
+**	ret
+*/
+svint8_t test6(char a, char b, char c, char d, char e, char f, char g, char h)
+{
+  return svdupq_n_s8(a, b, c, d, e, f, g, h, a, b, c, d, e, f, g, h);
+}
+
+/*
+** test7:
+**	dup	v0.8b, w0
+**	ins	v0.b\[1\], w1
+**	ins	v0.b\[2\], w2
+**	ins	v0.b\[3\], w3
+**	mov	z0.s, s0
+**	ret
+*/
+svint8_t test7(char a, char b, char c, char d)
+{
+  return svdupq_n_s8(a, b, c, d, a, b, c, d, a, b, c, d, a, b, c, d);
+}
+
+
+// We can do better than this
+/*
+**	sxtb	w0, w0
+**	fmov	d0, x0
+**	ins	v0.d\[1\], x1
+**	ins	v0.b\[1\], w1
+**	mov	z0.h, h0
+**	ret
+*/
+svint8_t test8(char a, char b)
+{
+  return svdupq_n_s8(a, b, a, b, a, b, a, b, a, b, a, b, a, b, a, b);
+}
+
+/*
+** test9:
+**	mov	z0.b, w0
+**	ret
+*/
+svint8_t test9(char a)
+{
+  return svdupq_n_s8(a, a, a, a, a, a, a, a, a, a, a, a, a, a, a, a);
+}
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index 350129555a0c71c0896c4f1003163f3b3557c11b..eaae1eefe02af3f51073310e7d17c33286b2bead 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -1513,6 +1513,11 @@ lower_vec_perm (gimple_stmt_iterator *gsi)
   if (!TYPE_VECTOR_SUBPARTS (vect_type).is_constant (&elements))
     return;

+  /* It is possible to have a VEC_PERM_EXPR with a VLA mask and a VLS
+     CONSTRUCTOR, this should return a VLA type, so we can't lower it.  */
+  if (!TYPE_VECTOR_SUBPARTS (mask_type).is_constant ())
+    return;
+
   if (TREE_CODE (mask) == SSA_NAME)
     {
       gimple *def_stmt = SSA_NAME_DEF_STMT (mask);
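As a closing illustration of the tree-vect-generic.cc guard in the patch above, here is a toy model — not GCC's poly_int API; `poly_len` and `can_lower_vec_perm` are invented names for this sketch.  A vector length is modelled as c0 + c1*x, with x the unknown runtime vector-length multiplier, and generic element-by-element lowering is only possible when both lengths are compile-time constants:

```cpp
// Toy stand-in for GCC's poly_uint64 (illustration only): the number of
// elements is c0 + c1 * x, where x is the runtime vector-length multiplier.
// c1 == 0 means the count is fixed (VLS); c1 != 0 means it is VLA.
struct poly_len
{
  unsigned c0, c1;
  bool is_constant () const { return c1 == 0; }
};

// Models the early-outs in lower_vec_perm: the pass open-codes the permute
// one result element at a time, so both the vector type and the mask need a
// compile-time-constant element count.  The mask check is the one the patch
// adds: a VLA mask applied to VLS constructors implies a VLA result, which
// cannot be enumerated at compile time.
static bool can_lower_vec_perm (poly_len vect_len, poly_len mask_len)
{
  return vect_len.is_constant () && mask_len.is_constant ();
}
```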