From patchwork Wed Aug 30 09:06:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 137153 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a7d1:0:b0:3f2:4152:657d with SMTP id p17csp4415243vqm; Wed, 30 Aug 2023 02:07:17 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGfNGAm7FghHMcbvU6JHbTocKbKM365AaCraL0o9Z/+Wig5PjtU+j3AfLaovk7Tc2uCDwU/ X-Received: by 2002:a17:906:9746:b0:9a5:962c:cb6c with SMTP id o6-20020a170906974600b009a5962ccb6cmr5866976ejy.31.1693386437071; Wed, 30 Aug 2023 02:07:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693386437; cv=none; d=google.com; s=arc-20160816; b=hLLJxMdLZ4FeH9WJhMHjby5pKlGgctuzPwQHrvO0nC2MeC7kfg9unWkr2cfX8VAhRa pyILTkCklCCbQlpwc5iIGiCyk+jz/+DJrplPSpqTRvIWJmqwA2DMTqBLCmFzFxn3SdU1 uLlBB1buXXxv81QcIH2LpToDhrJAX1HIgDGqL/CNA5imHeQP9gEwn5UXmDVoIbvCqL11 8xvzBq1UFELpwyMQeCKgNTIm4jQGN5t09FQQkLD1o50nAk1hzutC2ZRcxbJFAzW0aXwo ij6ewUlXcst2fDMdzIVkhtNfM1q2nDlvrsckCAsT7bLNYD8mhTc0JP54HOCFLJw7vCi/ wH8w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:in-reply-to:cc :references:to:content-language:subject:user-agent:mime-version:date :message-id:dmarc-filter:delivered-to:dkim-signature:dkim-filter; bh=ajWrULaZE4v63VHRKU44ZPN3PJMcX+ajCOkgfb5Ma+M=; fh=Z+8/kjPbzboGhTQMFYyyjN/U00RNQh1yaEWpyQnXrVg=; b=UTKPdcD2EfDUJ/meCxnuf/Jxd0+W9L8nLzZlKh6d6oEHSbYEItViTJ9G9ZNMY2SMyh gyIUEhBujvAhIRhoSx0rzVKrB3G7mBeOGJDJcDbOVobXz3SmZZHZmbxu05pViGKioSUL 0SHSUv0de9buBdZHJ5vWyZoNCUkqwBmOEEHAPp/BpyaOmzZoAxeHkFQ/PqlR9mR3QJPc 44AxCsvSzO/yG4iHhBVLWpR7qpcTy8G2//zHpoUIFh5Wu1Vh6fK4oSZj0rzfMH5Z0/3Z K+2Ph2ActpANcjrLaZtmSJZ84Wvv97c/WBnxW5a0c7EIzfTbM2ADuzYhTldorHkB+nYh MIkw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b="I/+Vogss"; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id y19-20020aa7d513000000b00523372ace05si5301859edq.530.2023.08.30.02.07.16 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 30 Aug 2023 02:07:17 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b="I/+Vogss"; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0111E385800C for ; Wed, 30 Aug 2023 09:07:13 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0111E385800C DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693386433; bh=ajWrULaZE4v63VHRKU44ZPN3PJMcX+ajCOkgfb5Ma+M=; h=Date:Subject:To:References:Cc:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=I/+VogssNMes5X4LZmoQqmeexM3dWRbVjiI3bUa9BAz0lWnDT7ITxgMmtDoVxoeSC adcxPY4snbqQF8rp6aQXO4JCtKZAsVCRwgx3EiZQza4tmjG9s9RSpbrj5eQvmzCu2e VKNsAP1bXPpwmGG7MeklC8g+QlMy14PDwLvyiDfg= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id E901A3858C30 for ; Wed, 30 Aug 2023 09:06:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E901A3858C30 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1DACE2F4; Wed, 30 Aug 2023 02:07:03 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id DEA573F64C; Wed, 30 Aug 2023 02:06:22 -0700 (PDT) Message-ID: <9c15446b-1f4d-62d7-9427-a19eb07ac8ee@arm.com> Date: Wed, 30 Aug 2023 10:06:17 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [PATCH 1/8] parloops: Copy target and optimizations when creating a function clone Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> Cc: Richard Sandiford , Richard Biener , "jakub@redhat.com" In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-14.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1775644376349426323 X-GMAIL-MSGID: 1775644376349426323 SVE simd clones require to be compiled with a SVE target enabled or the argument types will not be created properly. To achieve this we need to copy DECL_FUNCTION_SPECIFIC_TARGET from the original function declaration to the clones. I decided it was probably also a good idea to copy DECL_FUNCTION_SPECIFIC_OPTIMIZATION in case the original function is meant to be compiled with specific optimization options. gcc/ChangeLog: * tree-parloops.cc (create_loop_fn): Copy specific target and optimization options to clone. diff --git a/gcc/tree-parloops.cc b/gcc/tree-parloops.cc index e495bbd65270bdf90bae2c4a2b52777522352a77..a35f3d5023b06e5ef96eb4222488fcb34dd7bd45 100644 --- a/gcc/tree-parloops.cc +++ b/gcc/tree-parloops.cc @@ -2203,6 +2203,11 @@ create_loop_fn (location_t loc) DECL_CONTEXT (t) = decl; TREE_USED (t) = 1; DECL_ARGUMENTS (decl) = t; + DECL_FUNCTION_SPECIFIC_TARGET (decl) + = DECL_FUNCTION_SPECIFIC_TARGET (act_cfun->decl); + DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl) + = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (act_cfun->decl); + allocate_struct_function (decl, false); From patchwork Wed Aug 30 09:08:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 137154 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a7d1:0:b0:3f2:4152:657d with SMTP id p17csp4416085vqm; Wed, 30 Aug 2023 02:09:19 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGSNg/rmwCDJCZvicxLeG+VZ3ib1l5uFZrz64IbK9kA5yLHd29xCAyJGQPuHsPy5TrmmVeP X-Received: by 2002:a05:6402:1856:b0:522:4cd7:efb0 with SMTP id v22-20020a056402185600b005224cd7efb0mr1310715edy.17.1693386558985; Wed, 30 Aug 2023 02:09:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693386558; cv=none; d=google.com; s=arc-20160816; b=PCm1NBloViECEvRfq+IfZ32m5079hFVqCpC/ZGFQOVHBaCQStmhX2kFCNCpWLZE/yV zG4WBzi1klXzSaW/btbOu67S28xANaFB3i7Oes29oxQhVsdEROLDCEWLpZVzUhcLCBdi DMAvDdMh0roH2Kh8WbtAKO3985KzsjO4ISzOhK4LVUKwDm3I1eBkBlvsfSQCMih8SX+3 p6zSmyYQtU2wbBqthsauAR3wEgbkwueCDvn6CSrgYPHMa+UL+M46tuG9AN2MJLlVVKLO /5HvUohSC/qBIprnmS+7J+1TwT4BCL/2ZOYWp6I6edMD71rYMzK5ZskiN1diAqOHiMqi MEXw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:in-reply-to:cc :references:to:content-language:subject:user-agent:mime-version:date :message-id:dmarc-filter:delivered-to:dkim-signature:dkim-filter; bh=j8676XM4anbmPESOHlcFpK/LcZFIt14s9aOs4HYdSO4=; fh=SfXA5kMnhKP25nehHv7TPtU7ahJe7ujymrCbYOE+rsU=; b=R7zLvEZ6PgROKsMWlXyx+bk0+WI1MTPTTKZqfecX4tMrzyYwyuZEJt0m3Qg8/XqAoV UUi5xoBxsb7tTybFp75qhehstRer6RMecxIXi0tWTLNlHl/6hbaq1b5cxwu6CnqW1PJa SOnlL0ujEMqyU6rAAGPM4czLOSyY82xm1NwtZBJYIuaODPeeZGyjxWetIqkP3UB6I+Zh Dzn8Z0yYFFQBKrj76f+lcdCaPNUXmrLjiZCkAZkVXsK6eSKCcf0a5ujEGDozw3+KXOKv StxAB2ZHIDQ5QBfqXZbVJVt7MjFN0q5MSwzpDrxi42l28xWYC+lPN1o46xWiLmBps1iM 40jw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b="M/gvlo0O"; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id n7-20020aa7db47000000b0052257d9655bsi5190665edt.304.2023.08.30.02.09.18 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 30 Aug 2023 02:09:18 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b="M/gvlo0O"; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A2D073858438 for ; Wed, 30 Aug 2023 09:09:17 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A2D073858438 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693386557; bh=j8676XM4anbmPESOHlcFpK/LcZFIt14s9aOs4HYdSO4=; h=Date:Subject:To:References:Cc:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=M/gvlo0OuO/sm9iqfnJM2kLKn9uRc3RoUnz8PRtMFa9yHFdqATBAJrGQgyeFO6u8S 2bdEa9dwIoc6lZuy4pluASiXGLbGTfbPVjCpSMtYhmAfKpTH8rU3/MM9aTYM+5Gesv vDM5Ow/+Nep6WwKQzdVvwvKjKZLh6Yf8AQxBQYmQ= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id C84953858C30 for ; Wed, 30 Aug 2023 09:08:33 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C84953858C30 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D067C2F4; Wed, 30 Aug 2023 02:09:12 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id AE2CC3F64C; Wed, 30 Aug 2023 02:08:32 -0700 (PDT) Message-ID: <0942baa7-f186-4d0b-f556-3b8f926a24ad@arm.com> Date: Wed, 30 Aug 2023 10:08:27 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [Patch 2/8] parloops: Allow poly nit and bound Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> Cc: Richard Sandiford , Richard Biener In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-14.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1775644504418195437 X-GMAIL-MSGID: 1775644504418195437 Teach parloops how to handle a poly nit and bound e ahead of the changes to enable non-constant simdlen. gcc/ChangeLog: * tree-parloops.cc (try_to_transform_to_exit_first_loop_alt): Accept poly NIT and ALT_BOUND. diff --git a/gcc/tree-parloops.cc b/gcc/tree-parloops.cc index a35f3d5023b06e5ef96eb4222488fcb34dd7bd45..cf713e53d712fb5ad050e274f373adba5a90c5a7 100644 --- a/gcc/tree-parloops.cc +++ b/gcc/tree-parloops.cc @@ -2531,14 +2531,16 @@ try_transform_to_exit_first_loop_alt (class loop *loop, tree nit_type = TREE_TYPE (nit); /* Figure out whether nit + 1 overflows. */ - if (TREE_CODE (nit) == INTEGER_CST) + if (TREE_CODE (nit) == INTEGER_CST + || TREE_CODE (nit) == POLY_INT_CST) { if (!tree_int_cst_equal (nit, TYPE_MAX_VALUE (nit_type))) { alt_bound = fold_build2_loc (UNKNOWN_LOCATION, PLUS_EXPR, nit_type, nit, build_one_cst (nit_type)); - gcc_assert (TREE_CODE (alt_bound) == INTEGER_CST); + gcc_assert (TREE_CODE (alt_bound) == INTEGER_CST + || TREE_CODE (alt_bound) == POLY_INT_CST); transform_to_exit_first_loop_alt (loop, reduction_list, alt_bound); return true; } From patchwork Wed Aug 30 09:10:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 137155 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a7d1:0:b0:3f2:4152:657d with SMTP id p17csp4416842vqm; Wed, 30 Aug 2023 02:11:03 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHBWkqK313emCfVtEVwadCMLb5qw8xwma3m3tsBnyVc5hkgT5utMKr6wzvNGjGoWRiPRBJN X-Received: by 2002:aa7:c414:0:b0:51e:ed6:df38 with SMTP id j20-20020aa7c414000000b0051e0ed6df38mr1396689edq.13.1693386663496; Wed, 30 Aug 2023 02:11:03 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693386663; cv=none; d=google.com; s=arc-20160816; b=cCOCFQy7PNO/bXa7+z9CWfg2ec1pv45dhC/axrgulcOnLJoLqezNjCvFGRtBwJEwOc 6e1nI0UX7CHQRx8UZkr8GRwIaimOndaiqcqcpuK7mjqmiPawcTiQ+uhSiQJaoNqrJTxM gF2SpyV/aMM3hpTbR70kGs88AtxSeJ9zr3rv1siPI/Uj397LfbpgiEmPjweS3mHsH8nD kw1bYJtlZ6RpWhtM9+6nS7/fRgiLrQ+0kLYoqQOCM2tyhHbXt8jmJEKgOP6RtTXno8Cv BQ5Sueak29wSNOyFGb+Muvu8Pmp/aGPeTNqxKI6dpDM1Fkxqcyt1dlRNNDN9H+J+kQl9 x0jg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:in-reply-to:cc :references:to:content-language:subject:user-agent:mime-version:date :message-id:dmarc-filter:delivered-to:dkim-signature:dkim-filter; bh=bM4ssMgK8S2KBdw7ymicPyUe/e/bvNa/kAFB8eEJh1Q=; fh=Z+8/kjPbzboGhTQMFYyyjN/U00RNQh1yaEWpyQnXrVg=; b=0Z0qFOpxviSx1aKKcun7LrbuS88LGzz28E8Ef6wlYsEDiyeLMxPAX2dB6auuH8qOXP sRphxKam2w1U5wjdTnn1aNZx2LOpUg3W3oSlQUnz3MpKfnEsmXgdQk7Ijnl60IdgDaaR HH5fkS3/MFc2boARTRBd2XwhEAZTN1hUEHoU56HGw7hWDgGzckskBSQy/JOnW85TlJfx iVRw6RAeqgpXklbjcJysEKHGhllJdk7gOOKqExsKxw13E81FodauUXPhFkRS0mh96TQS RigCDkwjzn/BZguWX6MLC4w7i2xu0Ud78sY2CbFXkjDXCLWk+STLEsvdu7MiEGk6CsPw ZLXQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=ZlssaRGN; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id n22-20020a05640206d600b005254cf5c284si7130738edy.526.2023.08.30.02.11.03 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 30 Aug 2023 02:11:03 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=ZlssaRGN; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2E8293858407 for ; Wed, 30 Aug 2023 09:11:02 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 2E8293858407 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693386662; bh=bM4ssMgK8S2KBdw7ymicPyUe/e/bvNa/kAFB8eEJh1Q=; h=Date:Subject:To:References:Cc:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=ZlssaRGNQ3gNrxOHK6PNiYF2MPzuC30h3d6D5NcGMgLKUT4m7DxG68XaDnkN18/CH h1ZKqoM+NWus64/O0U8gHG1uk5oWODrE4ni7qoujNTjLK/i5+qTFTmllAKiOwG/ovo QazYcEJHhyD/pSZRk7CSFcXzx2nPIy+a0jC9m0NY= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 0F7773858C30 for ; Wed, 30 Aug 2023 09:10:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0F7773858C30 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id DC0AB2F4; Wed, 30 Aug 2023 02:10:56 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 966773F64C; Wed, 30 Aug 2023 02:10:16 -0700 (PDT) Message-ID: <6adafeff-e026-aec9-2b1a-8a5f736f813d@arm.com> Date: Wed, 30 Aug 2023 10:10:10 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [Patch 3/8] vect: Fix vect_get_smallest_scalar_type for simd clones Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> Cc: Richard Sandiford , Richard Biener , "jakub@redhat.com" In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-13.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_LOTSOFHASH, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1775644614332575137 X-GMAIL-MSGID: 1775644614332575137 The vect_get_smallest_scalar_type helper function was using any argument to a simd clone call when trying to determine the smallest scalar type that would be vectorized. This included the function pointer type in a MASK_CALL for instance, and would result in the wrong type being selected. Instead this patch special cases simd_clone_call's and uses only scalar types of the original function that get transformed into vector types. gcc/ChangeLog: * tree-vect-data-refs.cci (vect_get_smallest_scalar_type): Special case simd clone calls and only use types that are mapped to vectors. * tree-vect-stmts.cc (simd_clone_call_p): New helper function. * tree-vectorizer.h (simd_clone_call_p): Declare new function. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-simd-clone-16f.c: Remove unnecessary differentation between targets with different pointer sizes. * gcc.dg/vect/vect-simd-clone-17f.c: Likewise. * gcc.dg/vect/vect-simd-clone-18f.c: Likewise. diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16f.c b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16f.c index 574698d3e133ecb8700e698fa42a6b05dd6b8a18..7cd29e894d0502a59fadfe67db2db383133022d3 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16f.c +++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16f.c @@ -7,9 +7,8 @@ #include "vect-simd-clone-16.c" /* Ensure the the in-branch simd clones are used on targets that support them. - Some targets use pairs of vectors and do twice the calls. */ -/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" { target { ! { { i?86-*-* x86_64-*-* } && { ! lp64 } } } } } } */ -/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 4 "vect" { target { { i?86*-*-* x86_64-*-* } && { ! lp64 } } } } } */ + */ +/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" } } */ /* The LTO test produces two dump files and we scan the wrong one. */ /* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17f.c b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17f.c index 8bb6d19301a67a3eebce522daaf7d54d88f708d7..177521dc44531479fca1f1a1a0f2010f30fa3fb5 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17f.c +++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17f.c @@ -7,9 +7,8 @@ #include "vect-simd-clone-17.c" /* Ensure the the in-branch simd clones are used on targets that support them. - Some targets use pairs of vectors and do twice the calls. */ -/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" { target { ! { { i?86-*-* x86_64-*-* } && { ! lp64 } } } } } } */ -/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 4 "vect" { target { { i?86*-*-* x86_64-*-* } && { ! lp64 } } } } } */ + */ +/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" } } */ /* The LTO test produces two dump files and we scan the wrong one. */ /* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-18f.c b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-18f.c index d34f23f4db8e9c237558cc22fe66b7e02b9e6c20..4dd51381d73c0c7c8ec812f24e5054df038059c5 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-18f.c +++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-18f.c @@ -7,9 +7,8 @@ #include "vect-simd-clone-18.c" /* Ensure the the in-branch simd clones are used on targets that support them. - Some targets use pairs of vectors and do twice the calls. */ -/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" { target { ! { { i?86-*-* x86_64-*-* } && { ! lp64 } } } } } } */ -/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 4 "vect" { target { { i?86*-*-* x86_64-*-* } && { ! lp64 } } } } } */ + */ +/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" } } */ /* The LTO test produces two dump files and we scan the wrong one. */ /* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index a3570c45b5209281ac18c1220c3b95398487f389..1bdbea232afc6facddac23269ee3da033eb1ed50 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -119,6 +119,7 @@ tree vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type) { HOST_WIDE_INT lhs, rhs; + cgraph_node *node; /* During the analysis phase, this function is called on arbitrary statements that might not have scalar results. */ @@ -145,6 +146,23 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type) scalar_type = rhs_type; } } + else if (simd_clone_call_p (stmt_info->stmt, &node)) + { + auto clone = node->simd_clones->simdclone; + for (unsigned int i = 0; i < clone->nargs; ++i) + { + if (clone->args[i].arg_type == SIMD_CLONE_ARG_TYPE_VECTOR) + { + tree arg_scalar_type = TREE_TYPE (clone->args[i].vector_type); + rhs = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (arg_scalar_type)); + if (rhs < lhs) + { + scalar_type = arg_scalar_type; + lhs = rhs; + } + } + } + } else if (gcall *call = dyn_cast (stmt_info->stmt)) { unsigned int i = 0; diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 0fe5d0594abc095d3770b5ce4b9f2bad5205ab2f..35207de7acb410358220dbe8d1af82215b5091bf 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -3965,6 +3965,29 @@ vect_simd_lane_linear (tree op, class loop *loop, } } +bool +simd_clone_call_p (gimple *stmt, cgraph_node **out_node) +{ + gcall *call = dyn_cast (stmt); + if (!call) + return false; + + tree fndecl = NULL_TREE; + if (gimple_call_internal_p (call, IFN_MASK_CALL)) + fndecl = TREE_OPERAND (gimple_call_arg (stmt, 0), 0); + else + fndecl = gimple_call_fndecl (stmt); + + if (fndecl == NULL_TREE) + return false; + + cgraph_node *node = cgraph_node::get (fndecl); + if (out_node) + *out_node = node; + + return node != NULL && node->simd_clones != NULL; +} + /* Function vectorizable_simd_clone_call. Check if STMT_INFO performs a function call that can be vectorized diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index a65161499ea13f200aa745ca396db663a217b081..69634f7a6032696b394a62fb7ca8986bc78987c8 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -2165,6 +2165,7 @@ extern bool vect_can_advance_ivs_p (loop_vec_info); extern void vect_update_inits_of_drs (loop_vec_info, tree, tree_code); /* In tree-vect-stmts.cc. */ +extern bool simd_clone_call_p (gimple *, struct cgraph_node **node = NULL); extern tree get_related_vectype_for_scalar_type (machine_mode, tree, poly_uint64 = 0); extern tree get_vectype_for_scalar_type (vec_info *, tree, unsigned int = 0); From patchwork Wed Aug 30 09:11:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 137156 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a7d1:0:b0:3f2:4152:657d with SMTP id p17csp4417646vqm; Wed, 30 Aug 2023 02:13:13 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGnUwBe8z0cI+ZrSeNc1P7GQ7s7bNeYElVWxv548s+312JT/tbobD0LA4guvOrEitu2WfV5 X-Received: by 2002:a2e:9ed3:0:b0:2bc:f756:341 with SMTP id h19-20020a2e9ed3000000b002bcf7560341mr1308061ljk.35.1693386793506; Wed, 30 Aug 2023 02:13:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693386793; cv=none; d=google.com; s=arc-20160816; b=bPC9okC1rx8GuAhcCaFlITQ08Vsdek32H17usAS08C2vRLHSEkoy7aP4Xmdahpc9YB 4tV+BnP58bjPrG47RtgrLk4ReWJ7tqgslLnBzCItq0j39UQjgYKC1HtjrBTcTVn7/gr2 QwhZExQnQt18NEqFaUT7u+V9svWVSgoS5v8abKxuawFFFN7PE7l/a93DPrVW9SZVC7pa 4OSXhiWhw7x4TOGvr9Efyb/hD9kCsTc3tlqsb5qWczUNRidoAdrRzfrjmrPhA7cC1u5M BUsR6vmVhKcAA43zuo1aGGiy9RdSeAoVlWEjDGK0INqQ+vYwl3YLzKp2O3JmrddagWlg srZQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:in-reply-to:cc :references:to:content-language:subject:user-agent:mime-version:date :message-id:dmarc-filter:delivered-to:dkim-signature:dkim-filter; bh=ukL/whRxULUdNHYfPhCY2xeRI3OxTnMTJfPaTCgj4Rs=; fh=o9i0XK2rF1g+e40uA5tpRZiZW6rUHhQJ/DoI6LlA7IY=; b=W2fqgEzSrnNazzn2hIM15BNT7Qs4lG3L8sAEsO48Ncr9mrWBl6rFqyao8CO7Idkxjx jEl/5BeRneSs+1B/n9T07NFSEbw1G9F8lmtvfOKfLXWv/MhtG8cepe1OVb+N62Sr04fm +QL61i/IFO/IBEQFi7WhLiiCPb0ne6KlwMT/9lytszbR7sSi7n9OSWqV2LH4c+CWb/qe fx4xwuwFDrVxN5Lunk2aNEymav2YJlMZGbM8Z1kqSK694oE/oomvCU/Jsouqlu+sNT4C vPznzPn6x2tspmED4WNU9Un9rOYxf5S9ne2h4vV0lCzVylaaYGLXZqiEQQsIKkB/zB09 vIQw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=bQVHn4ps; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id a7-20020a17090680c700b009937e7c4e50si5023375ejx.546.2023.08.30.02.13.13 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 30 Aug 2023 02:13:13 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=bQVHn4ps; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DE1163856263 for ; Wed, 30 Aug 2023 09:12:46 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DE1163856263 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693386766; bh=ukL/whRxULUdNHYfPhCY2xeRI3OxTnMTJfPaTCgj4Rs=; h=Date:Subject:To:References:Cc:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=bQVHn4pscPI1ehNuqI8F5VKg2O8bpoy3huHsympSOCtB16xfVusPNbFXHr70mUDx3 q/AuywEI2pPd17sfiICBcw5von7fop/3lTy91l6KwSa+n+vhu2Zde/394bbEY7doHO DCeCXUj9F9KbwoLlwjomE1gKoZI8Yx5WhsycRZpQ= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id B16AC38582A3 for ; Wed, 30 Aug 2023 09:12:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B16AC38582A3 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id AADC62F4; Wed, 30 Aug 2023 02:12:40 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id C60BB3F64C; Wed, 30 Aug 2023 02:12:00 -0700 (PDT) Message-ID: <49eca251-630e-b26c-5d66-4f8b322ee801@arm.com> Date: Wed, 30 Aug 2023 10:11:55 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [PATCH 4/8] vect: don't allow fully masked loops with non-masked simd clones [PR 110485] Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> Cc: Richard Biener In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-14.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1775644750439004867 X-GMAIL-MSGID: 1775644750439004867 When analyzing a loop and choosing a simdclone to use it is possible to choose a simdclone that cannot be used 'inbranch' for a loop that can use partial vectors. This may lead to the vectorizer deciding to use partial vectors which are not supported for notinbranch simd clones. This patch fixes that by disabling the use of partial vectors once a notinbranch simd clone has been selected. gcc/ChangeLog: PR tree-optimization/110485 * tree-vect-stmts.cc (vectorizable_simd_clone_call): Disable partial vectors usage if a notinbranch simdclone has been selected. gcc/testsuite/ChangeLog: * gcc.dg/gomp/pr110485.c: New test. diff --git a/gcc/testsuite/gcc.dg/gomp/pr110485.c b/gcc/testsuite/gcc.dg/gomp/pr110485.c new file mode 100644 index 0000000000000000000000000000000000000000..ba6817a127f40246071e32ccebf692cc4d121d15 --- /dev/null +++ b/gcc/testsuite/gcc.dg/gomp/pr110485.c @@ -0,0 +1,19 @@ +/* PR 110485 */ +/* { dg-do compile } */ +/* { dg-additional-options "-Ofast -fdump-tree-vect-details" } */ +/* { dg-additional-options "-march=znver4 --param=vect-partial-vector-usage=1" { target x86_64-*-* } } */ +#pragma omp declare simd notinbranch uniform(p) +extern double __attribute__ ((const)) bar (double a, double p); + +double a[1024]; +double b[1024]; + +void foo (int n) +{ + #pragma omp simd + for (int i = 0; i < n; ++i) + a[i] = bar (b[i], 71.2); +} + +/* { dg-final { scan-tree-dump-not "MASK_LOAD" "vect" } } */ +/* { dg-final { scan-tree-dump "can't use a fully-masked loop because a non-masked simd clone was selected." "vect" { target x86_64-*-* } } } */ diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 35207de7acb410358220dbe8d1af82215b5091bf..664c3b5f7ca48fdb49383fb8a97f407465574479 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4349,6 +4349,17 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, ? boolean_true_node : boolean_false_node; STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (sll); } + + if (!bestn->simdclone->inbranch) + { + if (dump_enabled_p () + && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)) + dump_printf_loc (MSG_NOTE, vect_location, + "can't use a fully-masked loop because a" + " non-masked simd clone was selected.\n"); + LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false; + } + STMT_VINFO_TYPE (stmt_info) = call_simd_clone_vec_info_type; DUMP_VECT_SCOPE ("vectorizable_simd_clone_call"); /* vect_model_simple_cost (vinfo, stmt_info, ncopies, From patchwork Wed Aug 30 09:13:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 137157 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a7d1:0:b0:3f2:4152:657d with SMTP id p17csp4418074vqm; Wed, 30 Aug 2023 02:14:22 -0700 (PDT) X-Google-Smtp-Source: AGHT+IG+k2jLbMLQbQyxvyQAHODWstW4zbMQAq6MLeW2mEjMS5DBe9xLI23qGDI0lLw/CYxaAUTe X-Received: by 2002:a17:907:75d4:b0:9a1:b144:30f4 with SMTP id jl20-20020a17090775d400b009a1b14430f4mr2022813ejc.14.1693386862237; Wed, 30 Aug 2023 02:14:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693386862; cv=none; d=google.com; s=arc-20160816; b=Pvkwj+enuMU7jLsn7JzGTn1x5GUm2/jWA/A1EuAqCvGkbh55vdi94v4F0CE5OOadkR JTxmJVRvgS/jBTINTaGhghHGSRFPfgYd1o0j+6qiTc6+2FhjikefoPZfDZRuOZRnDQM8 O/4+LDhe2CzzKA8mcSTpjfsxOWvQfPJH+MsfM/iBIT7Sxr9fNUdTIrx1Xwci7PTlWUAr FUYaDm2bT2gOhKdVOV3GjlS/T3cYSFnB5sXRkpOrQv2x98t73HEuIiqrG4rFon61iID+ eX1GBLpamLCbUJx3QD/oUo0v52X56j1QsVblQkvL8JO0vt0GVGUouocSIQSJbeofWzhx 0eeg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:in-reply-to:cc :references:to:content-language:subject:user-agent:mime-version:date :message-id:dmarc-filter:delivered-to:dkim-signature:dkim-filter; bh=go/t5BhZ+sgQiGLM6SuykLLeoVMZUQuS/iqy8MzK9I0=; fh=sgurieNRT1F47MCuVM52yu0r4xmlRVSNh8p7V7FQz38=; b=bQBY8p/74naiLFI6xrfEapSoxu4kKbR5xqcOH9DuZltYfh8UJAf4uPTAGKzB8lJ1GC dwmuIYnp/T/PQOYNGGWl21hla9UxYGOZ1GJb3DX0tsIJ4YofK9+kW9yAOOdrwGgPUNRs lHyNcgZkx0BLNYYT4mQnKpeV8Kx+TjPsAlsoYqKeTU/huUrgf2sQbGhxvY8+EqVXSgbn Lpta5jGCNq4VF9L13Qu/XMJOD1dv4phAIrkM4MdjBmVmEXWtnkQuySslAb/QimXFi7d3 zvvgfVjPTV/oST+pKreNwpCz/HTaaBWVH8CvVq47uZ+iQd6OgBMnDDrD7HUAMGxBLvzF eHIg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=bPK8E3nV; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id lm17-20020a170906981100b009a196ce3530si6795858ejb.932.2023.08.30.02.14.21 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 30 Aug 2023 02:14:22 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=bPK8E3nV; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C3E053858431 for ; Wed, 30 Aug 2023 09:14:20 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org C3E053858431 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693386860; bh=go/t5BhZ+sgQiGLM6SuykLLeoVMZUQuS/iqy8MzK9I0=; h=Date:Subject:To:References:Cc:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=bPK8E3nVJtnzBfwHnLt/xuXwABNh3uuyyfM2lQ1Mkv5HTPUhoqZv3G+eYlPfJmNkT qmcolxcaMiCPkTtqVBXmhNVqgGe/dx4X4AlkZfEBe1Fs7rAwGI+jpPTbB2B3R/nA9W /iJgFLyds6wnxm0vUThOijYNLyPb9qmYxtUDvWnA= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id AD0C23858C30 for ; Wed, 30 Aug 2023 09:13:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AD0C23858C30 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C6D102F4; Wed, 30 Aug 2023 02:14:13 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 74CE23F64C; Wed, 30 Aug 2023 02:13:33 -0700 (PDT) Message-ID: Date: Wed, 30 Aug 2023 10:13:27 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [PATCH 5/8] vect: Use inbranch simdclones in masked loops Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> Cc: Richard Biener , Richard Sandiford , "jakub@redhat.com" In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-14.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1775644822742615180 X-GMAIL-MSGID: 1775644822742615180 This patch enables the compiler to use inbranch simdclones when generating masked loops in autovectorization. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_adjust_argument_types): Make function compatible with mask parameters in clone. * tree-vect-stmts.cc (vect_convert): New helper function. (vect_build_all_ones_mask): Allow vector boolean typed masks. (vectorizable_simd_clone_call): Enable the use of masked clones in fully masked loops. diff --git a/gcc/omp-simd-clone.cc b/gcc/omp-simd-clone.cc index a42643400ddcf10961633448b49d4caafb999f12..ef0b9b48c7212900023bc0eaebca5e1f9389db77 100644 --- a/gcc/omp-simd-clone.cc +++ b/gcc/omp-simd-clone.cc @@ -807,8 +807,14 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) { ipa_adjusted_param adj; memset (&adj, 0, sizeof (adj)); - tree parm = args[i]; - tree parm_type = node->definition ? TREE_TYPE (parm) : parm; + tree parm = NULL_TREE; + tree parm_type = NULL_TREE; + if(i < args.length()) + { + parm = args[i]; + parm_type = node->definition ? TREE_TYPE (parm) : parm; + } + adj.base_index = i; adj.prev_clone_index = i; @@ -1547,7 +1553,7 @@ simd_clone_adjust (struct cgraph_node *node) mask = gimple_assign_lhs (g); g = gimple_build_assign (make_ssa_name (TREE_TYPE (mask)), BIT_AND_EXPR, mask, - build_int_cst (TREE_TYPE (mask), 1)); + build_one_cst (TREE_TYPE (mask))); gsi_insert_after (&gsi, g, GSI_CONTINUE_LINKING); mask = gimple_assign_lhs (g); } diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 664c3b5f7ca48fdb49383fb8a97f407465574479..7217f36a250d549b955c874d7c7644d94982b0b5 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -1723,6 +1723,20 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype, } } +/* Return SSA name of the result of the conversion of OPERAND into type TYPE. + The conversion statement is inserted at GSI. */ + +static tree +vect_convert (vec_info *vinfo, stmt_vec_info stmt_info, tree type, tree operand, + gimple_stmt_iterator *gsi) +{ + operand = build1 (VIEW_CONVERT_EXPR, type, operand); + gassign *new_stmt = gimple_build_assign (make_ssa_name (type), + operand); + vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); + return gimple_get_lhs (new_stmt); +} + /* Return the mask input to a masked load or store. VEC_MASK is the vectorized form of the scalar mask condition and LOOP_MASK, if nonnull, is the mask that needs to be applied to all loads and stores in a vectorized loop. @@ -2666,7 +2680,8 @@ vect_build_all_ones_mask (vec_info *vinfo, { if (TREE_CODE (masktype) == INTEGER_TYPE) return build_int_cst (masktype, -1); - else if (TREE_CODE (TREE_TYPE (masktype)) == INTEGER_TYPE) + else if (VECTOR_BOOLEAN_TYPE_P (masktype) + || TREE_CODE (TREE_TYPE (masktype)) == INTEGER_TYPE) { tree mask = build_int_cst (TREE_TYPE (masktype), -1); mask = build_vector_from_val (masktype, mask); @@ -4018,7 +4033,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, size_t i, nargs; tree lhs, rtype, ratype; vec *ret_ctor_elts = NULL; - int arg_offset = 0; + int masked_call_offset = 0; /* Is STMT a vectorizable call? */ gcall *stmt = dyn_cast (stmt_info->stmt); @@ -4033,7 +4048,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, gcc_checking_assert (TREE_CODE (fndecl) == ADDR_EXPR); fndecl = TREE_OPERAND (fndecl, 0); gcc_checking_assert (TREE_CODE (fndecl) == FUNCTION_DECL); - arg_offset = 1; + masked_call_offset = 1; } if (fndecl == NULL_TREE) return false; @@ -4065,7 +4080,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, return false; /* Process function arguments. */ - nargs = gimple_call_num_args (stmt) - arg_offset; + nargs = gimple_call_num_args (stmt) - masked_call_offset; /* Bail out if the function has zero arguments. */ if (nargs == 0) @@ -4083,7 +4098,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, thisarginfo.op = NULL_TREE; thisarginfo.simd_lane_linear = false; - op = gimple_call_arg (stmt, i + arg_offset); + op = gimple_call_arg (stmt, i + masked_call_offset); if (!vect_is_simple_use (op, vinfo, &thisarginfo.dt, &thisarginfo.vectype) || thisarginfo.dt == vect_uninitialized_def) @@ -4161,14 +4176,6 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, } poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo); - if (!vf.is_constant ()) - { - if (dump_enabled_p ()) - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, - "not considering SIMD clones; not yet supported" - " for variable-width vectors.\n"); - return false; - } unsigned int badness = 0; struct cgraph_node *bestn = NULL; @@ -4181,7 +4188,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, unsigned int this_badness = 0; unsigned int num_calls; if (!constant_multiple_p (vf, n->simdclone->simdlen, &num_calls) - || n->simdclone->nargs != nargs) + || (!n->simdclone->inbranch && (masked_call_offset > 0)) + || nargs != n->simdclone->nargs) continue; if (num_calls != 1) this_badness += exact_log2 (num_calls) * 4096; @@ -4198,7 +4206,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, case SIMD_CLONE_ARG_TYPE_VECTOR: if (!useless_type_conversion_p (n->simdclone->args[i].orig_type, - TREE_TYPE (gimple_call_arg (stmt, i + arg_offset)))) + TREE_TYPE (gimple_call_arg (stmt, + i + masked_call_offset)))) i = -1; else if (arginfo[i].dt == vect_constant_def || arginfo[i].dt == vect_external_def @@ -4243,6 +4252,17 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, } if (i == (size_t) -1) continue; + if (masked_call_offset == 0 + && n->simdclone->inbranch + && n->simdclone->nargs > nargs) + { + gcc_assert (n->simdclone->args[n->simdclone->nargs - 1].arg_type == + SIMD_CLONE_ARG_TYPE_MASK); + /* Penalize using a masked SIMD clone in a non-masked loop, that is + not in a branch, as we'd have to construct an all-true mask. */ + if (!loop_vinfo || !LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)) + this_badness += 64; + } if (bestn == NULL || this_badness < badness) { bestn = n; @@ -4259,7 +4279,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, || arginfo[i].dt == vect_external_def) && bestn->simdclone->args[i].arg_type == SIMD_CLONE_ARG_TYPE_VECTOR) { - tree arg_type = TREE_TYPE (gimple_call_arg (stmt, i + arg_offset)); + tree arg_type = TREE_TYPE (gimple_call_arg (stmt, + i + masked_call_offset)); arginfo[i].vectype = get_vectype_for_scalar_type (vinfo, arg_type, slp_node); if (arginfo[i].vectype == NULL @@ -4331,24 +4352,38 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, && TREE_CODE (TREE_TYPE (TREE_TYPE (bestn->decl))) == ARRAY_TYPE) vinfo->any_known_not_updated_vssa = true; STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (bestn->decl); - for (i = 0; i < nargs; i++) - if ((bestn->simdclone->args[i].arg_type - == SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP) - || (bestn->simdclone->args[i].arg_type - == SIMD_CLONE_ARG_TYPE_LINEAR_REF_CONSTANT_STEP)) - { - STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_grow_cleared (i * 3 - + 1, - true); - STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (arginfo[i].op); - tree lst = POINTER_TYPE_P (TREE_TYPE (arginfo[i].op)) - ? size_type_node : TREE_TYPE (arginfo[i].op); - tree ls = build_int_cst (lst, arginfo[i].linear_step); - STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (ls); - tree sll = arginfo[i].simd_lane_linear - ? boolean_true_node : boolean_false_node; - STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (sll); - } + + for (i = 0; i < bestn->simdclone->nargs; i++) + { + switch (bestn->simdclone->args[i].arg_type) + { + default: + continue; + case SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP: + case SIMD_CLONE_ARG_TYPE_LINEAR_REF_CONSTANT_STEP: + { + auto &clone_info = STMT_VINFO_SIMD_CLONE_INFO (stmt_info); + clone_info.safe_grow_cleared (i * 3 + 1, true); + clone_info.safe_push (arginfo[i].op); + tree lst = POINTER_TYPE_P (TREE_TYPE (arginfo[i].op)) + ? size_type_node : TREE_TYPE (arginfo[i].op); + tree ls = build_int_cst (lst, arginfo[i].linear_step); + clone_info.safe_push (ls); + tree sll = arginfo[i].simd_lane_linear + ? boolean_true_node : boolean_false_node; + clone_info.safe_push (sll); + } + break; + case SIMD_CLONE_ARG_TYPE_MASK: + if (loop_vinfo + && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)) + vect_record_loop_mask (loop_vinfo, + &LOOP_VINFO_MASKS (loop_vinfo), + ncopies, vectype, op); + + break; + } + } if (!bestn->simdclone->inbranch) { @@ -4394,6 +4429,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, vec_oprnds_i.safe_grow_cleared (nargs, true); for (j = 0; j < ncopies; ++j) { + poly_uint64 callee_nelements; + poly_uint64 caller_nelements; /* Build argument list for the vectorized call. */ if (j == 0) vargs.create (nargs); @@ -4404,8 +4441,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, { unsigned int k, l, m, o; tree atype; - poly_uint64 callee_nelements, caller_nelements; - op = gimple_call_arg (stmt, i + arg_offset); + op = gimple_call_arg (stmt, i + masked_call_offset); switch (bestn->simdclone->args[i].arg_type) { case SIMD_CLONE_ARG_TYPE_VECTOR: @@ -4482,16 +4518,9 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, if (k == 1) if (!useless_type_conversion_p (TREE_TYPE (vec_oprnd0), atype)) - { - vec_oprnd0 - = build1 (VIEW_CONVERT_EXPR, atype, vec_oprnd0); - gassign *new_stmt - = gimple_build_assign (make_ssa_name (atype), - vec_oprnd0); - vect_finish_stmt_generation (vinfo, stmt_info, - new_stmt, gsi); - vargs.safe_push (gimple_assign_lhs (new_stmt)); - } + vargs.safe_push (vect_convert (vinfo, stmt_info, + atype, vec_oprnd0, + gsi)); else vargs.safe_push (vec_oprnd0); else @@ -4544,6 +4573,24 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, vec_oprnds_i[i] = 0; } vec_oprnd0 = vec_oprnds[i][vec_oprnds_i[i]++]; + if (loop_vinfo + && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)) + { + vec_loop_masks *loop_masks + = &LOOP_VINFO_MASKS (loop_vinfo); + tree loop_mask + = vect_get_loop_mask (loop_vinfo, gsi, + loop_masks, ncopies, + vectype, j); + vec_oprnd0 + = prepare_vec_mask (loop_vinfo, + TREE_TYPE (loop_mask), + loop_mask, vec_oprnd0, + gsi); + loop_vinfo->vec_cond_masked_set.add ({ vec_oprnd0, + loop_mask }); + + } vec_oprnd0 = build3 (VEC_COND_EXPR, atype, vec_oprnd0, build_vector_from_val (atype, one), @@ -4641,6 +4688,64 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, } } + if (masked_call_offset == 0 + && bestn->simdclone->inbranch + && bestn->simdclone->nargs > nargs) + { + unsigned long m, o; + size_t mask_i = bestn->simdclone->nargs - 1; + tree mask; + gcc_assert (bestn->simdclone->args[mask_i].arg_type == + SIMD_CLONE_ARG_TYPE_MASK); + + tree masktype = bestn->simdclone->args[mask_i].vector_type; + callee_nelements = TYPE_VECTOR_SUBPARTS (masktype); + o = vector_unroll_factor (nunits, callee_nelements); + for (m = j * o; m < (j + 1) * o; m++) + { + if (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)) + { + vec_loop_masks *loop_masks = &LOOP_VINFO_MASKS (loop_vinfo); + mask = vect_get_loop_mask (loop_vinfo, gsi, loop_masks, + ncopies, vectype, j); + } + else + mask = vect_build_all_ones_mask (vinfo, stmt_info, masktype); + + if (!useless_type_conversion_p (TREE_TYPE (mask), masktype)) + { + gassign *new_stmt; + if (bestn->simdclone->mask_mode != VOIDmode) + { + /* This means we are dealing with integer mask modes. + First convert to an integer type with the same size as + the current vector type. */ + unsigned HOST_WIDE_INT intermediate_size + = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (mask))); + tree mid_int_type = + build_nonstandard_integer_type (intermediate_size, 1); + mask = build1 (VIEW_CONVERT_EXPR, mid_int_type, mask); + new_stmt + = gimple_build_assign (make_ssa_name (mid_int_type), + mask); + gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); + /* Then zero-extend to the mask mode. */ + mask = fold_build1 (NOP_EXPR, masktype, + gimple_get_lhs (new_stmt)); + } + else + mask = build1 (VIEW_CONVERT_EXPR, masktype, mask); + + new_stmt = gimple_build_assign (make_ssa_name (masktype), + mask); + vect_finish_stmt_generation (vinfo, stmt_info, + new_stmt, gsi); + mask = gimple_assign_lhs (new_stmt); + } + vargs.safe_push (mask); + } + } + gcall *new_call = gimple_build_call_vec (fndecl, vargs); if (vec_dest) { @@ -4659,13 +4764,13 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, if (vec_dest) { - if (!multiple_p (TYPE_VECTOR_SUBPARTS (vectype), nunits)) + caller_nelements = TYPE_VECTOR_SUBPARTS (vectype); + if (!multiple_p (caller_nelements, nunits)) { unsigned int k, l; poly_uint64 prec = GET_MODE_BITSIZE (TYPE_MODE (vectype)); poly_uint64 bytes = GET_MODE_SIZE (TYPE_MODE (vectype)); - k = vector_unroll_factor (nunits, - TYPE_VECTOR_SUBPARTS (vectype)); + k = vector_unroll_factor (nunits, caller_nelements); gcc_assert ((k & (k - 1)) == 0); for (l = 0; l < k; l++) { @@ -4691,11 +4796,11 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, vect_clobber_variable (vinfo, stmt_info, gsi, new_temp); continue; } - else if (!multiple_p (nunits, TYPE_VECTOR_SUBPARTS (vectype))) + else if (!multiple_p (nunits, caller_nelements)) { unsigned int k; - if (!constant_multiple_p (TYPE_VECTOR_SUBPARTS (rtype), - TYPE_VECTOR_SUBPARTS (vectype), &k)) + if (!constant_multiple_p (caller_nelements, + TYPE_VECTOR_SUBPARTS (rtype), &k)) gcc_unreachable (); gcc_assert ((k & (k - 1)) == 0); if ((j & (k - 1)) == 0) From patchwork Wed Aug 30 09:14:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 137158 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a7d1:0:b0:3f2:4152:657d with SMTP id p17csp4418564vqm; Wed, 30 Aug 2023 02:15:34 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEG3Bmwd5/3NXH+r/yPFeVAv8h93Rk1DT5tiGhfTxBa5JmreAOOJuT6j8y35D7FXfnjN8Ri X-Received: by 2002:aa7:c1c5:0:b0:522:18b6:c01f with SMTP id d5-20020aa7c1c5000000b0052218b6c01fmr1838603edp.3.1693386934357; Wed, 30 Aug 2023 02:15:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693386934; cv=none; d=google.com; s=arc-20160816; b=l6cNAWw18z2xtNYlKvZZuwN/dO+V4iEpMIaXYQlfPmQ63hTr/e5LudJv+RTsnfyt59 ipRRXTUcAbzaDWDMony0JcC5GyBC/C7jTd6VSJBoecFBxLwrU9ZhRfo3k3GF+igvcZiF Iodj12ajZePzM2gO4r427Vk7ZOIP5iHasB0VFwWDP6MFyNF9g5kYkXicT4Kdh5PNPmMt vZ8dZctKxWD8JIFVjTj6bzkUCc35jiP62iAv1kx4mbzSuQ8GAlSosi8BW8Lci5XQNDCR ItJqEqHlDi6IR6sxy8Wu5dK9PIN7hlYE8UP5Fd1/u/me8ioRX3xrMlSkZf3gHaW5EXY9 ZhmQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:in-reply-to :references:to:content-language:subject:user-agent:mime-version:date :message-id:dmarc-filter:delivered-to:dkim-signature:dkim-filter; bh=OUXloNw3aREKP+/BRbRaqMhrdbzQgU1BRQnIVkM1654=; fh=hPrbWPhweUx4V0GV9uXJqbyAzg2ABmTz7kczrAQqMmM=; b=XSYZnjHLsCUG+hXWLkTQRD4qo2ZvEsG0QIFaCAJKUGu+ewp+PJ0iuMyoJlyfBfo0hh PGHaBGD7wRUBC2H4rAoKKtNvJ5rPXCusrf7V6av6pP50iRzsCITZv0c1YQsIlmoSVsz7 7pwWXkstjdz3QT36vQX5Wnfr6uaXd/uNaTF7m9Nk3+DGQxUL6g2FUenbyYqDxnZpsddR /FqsgD8YED0VOumBKWiFNPS8zx6qz6hnD9A+lRRMy52g/2oKvH+4yHv40d7ZXN5TpsWL 6iN0PsmayIGhQDmY887irBztfcYEu4cbokXiGJsd2pHfWG7ymQD6rngrXDOJUjfo2rej fpsA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=sWcTZzd4; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id d22-20020a05640208d600b00522bc3f1effsi7490271edz.433.2023.08.30.02.15.33 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 30 Aug 2023 02:15:34 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=sWcTZzd4; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4174F3857342 for ; Wed, 30 Aug 2023 09:15:25 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 4174F3857342 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693386925; bh=OUXloNw3aREKP+/BRbRaqMhrdbzQgU1BRQnIVkM1654=; h=Date:Subject:To:References:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=sWcTZzd4Es6F67Two4ryxRLxyKaKbpf+pN/WW/DXUOyahhJMhRmE+Bzmh7k+3S0oN LM8OTUSGC/BV55XK+v/CTU/U/3rPWCRMemsxKcp2QIuHVwEJ+wka3OzDrouVMSgnSP rilNePbzEo/k0HRRB90dPkqPsRNGquX9Rz/PLdnc= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id B33DF3858C30 for ; Wed, 30 Aug 2023 09:14:40 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B33DF3858C30 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8B87EFEC for ; Wed, 30 Aug 2023 02:15:19 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id CBE003F64C for ; Wed, 30 Aug 2023 02:14:39 -0700 (PDT) Message-ID: <4eda2924-2fe1-63ed-d6c5-2bdea8fd34d3@arm.com> Date: Wed, 30 Aug 2023 10:14:38 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [PATCH 6/8] vect: Add vector_mode paramater to simd_clone_usable Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-13.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_LOTSOFHASH, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1775644898296805153 X-GMAIL-MSGID: 1775644898296805153 This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE hook to enable rejecting SVE modes when the target architecture does not support SVE. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_simd_clone_usable): Add mode parameter and use to to reject SVE modes when target architecture does not support SVE. * config/gcn/gcn.cc (gcn_simd_clone_usable): Add unused mode parameter. * config/i386/i386.cc (ix86_simd_clone_usable): Likewise. * doc/tm.texi (TARGET_SIMD_CLONE_USABLE): Document new parameter. * target.def (usable): Add new parameter. * tree-vect-stmts.cc (vectorizable_simd_clone_call): Pass vector mode to TARGET_SIMD_CLONE_CALL hook. diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 5fb4c863d875871d6de865e72ce360506a3694d2..a13d3fba05f9f9d2989b36c681bc77d71e943e0d 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -27498,12 +27498,18 @@ aarch64_simd_clone_adjust (struct cgraph_node *node) /* Implement TARGET_SIMD_CLONE_USABLE. */ static int -aarch64_simd_clone_usable (struct cgraph_node *node) +aarch64_simd_clone_usable (struct cgraph_node *node, machine_mode vector_mode) { switch (node->simdclone->vecsize_mangle) { case 'n': - if (!TARGET_SIMD) + if (!TARGET_SIMD + || aarch64_sve_mode_p (vector_mode)) + return -1; + return 0; + case 's': + if (!TARGET_SVE + || !aarch64_sve_mode_p (vector_mode)) return -1; return 0; default: diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc index 02f4dedec4214b1eea9e6f5057ed57d7e0db316a..252676273f06500c99df6ae251f0406c618df891 100644 --- a/gcc/config/gcn/gcn.cc +++ b/gcc/config/gcn/gcn.cc @@ -5599,7 +5599,8 @@ gcn_simd_clone_adjust (struct cgraph_node *ARG_UNUSED (node)) /* Implement TARGET_SIMD_CLONE_USABLE. */ static int -gcn_simd_clone_usable (struct cgraph_node *ARG_UNUSED (node)) +gcn_simd_clone_usable (struct cgraph_node *ARG_UNUSED (node), + machine_mode ARG_UNUSED (mode)) { /* We don't need to do anything here because gcn_simd_clone_compute_vecsize_and_simdlen currently only returns one diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 5d57726e22cea8bcaa8ac8b1b25ac420193f39bb..84f0d5a7cb679e6be92001f59802276635506e97 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -24379,7 +24379,8 @@ ix86_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node, slightly less desirable, etc.). */ static int -ix86_simd_clone_usable (struct cgraph_node *node) +ix86_simd_clone_usable (struct cgraph_node *node, + machine_mode mode ATTRIBUTE_UNUSED) { switch (node->simdclone->vecsize_mangle) { diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 95ba56e05ae4a0f11639cc4a21d6736c53ad5ef1..bde22e562ebb9069122eb3b142ab8f4a4ae56a3a 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -6336,11 +6336,13 @@ This hook should add implicit @code{attribute(target("..."))} attribute to SIMD clone @var{node} if needed. @end deftypefn -@deftypefn {Target Hook} int TARGET_SIMD_CLONE_USABLE (struct cgraph_node *@var{}) +@deftypefn {Target Hook} int TARGET_SIMD_CLONE_USABLE (struct cgraph_node *@var{}, @var{machine_mode}) This hook should return -1 if SIMD clone @var{node} shouldn't be used -in vectorized loops in current function, or non-negative number if it is -usable. In that case, the smaller the number is, the more desirable it is -to use it. +in vectorized loops being vectorized with mode @var{m} in current function, or +non-negative number if it is usable. In that case, the smaller the number is, +the more desirable it is to use it. +@end deftypefn + @end deftypefn @deftypefn {Target Hook} int TARGET_SIMT_VF (void) diff --git a/gcc/target.def b/gcc/target.def index 7d684296c17897b4ceecb31c5de1ae8665a8228e..6a0cbc454526ee29011451b570354bf234a4eabd 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -1645,10 +1645,11 @@ void, (struct cgraph_node *), NULL) DEFHOOK (usable, "This hook should return -1 if SIMD clone @var{node} shouldn't be used\n\ -in vectorized loops in current function, or non-negative number if it is\n\ -usable. In that case, the smaller the number is, the more desirable it is\n\ -to use it.", -int, (struct cgraph_node *), NULL) +in vectorized loops being vectorized with mode @var{m} in current function, or\n\ +non-negative number if it is usable. In that case, the smaller the number is,\n\ +the more desirable it is to use it.", +int, (struct cgraph_node *, machine_mode), NULL) + HOOK_VECTOR_END (simd_clone) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 7217f36a250d549b955c874d7c7644d94982b0b5..dc2fc20ef9fe777132308c9e33f7731d62717466 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4195,7 +4195,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, this_badness += exact_log2 (num_calls) * 4096; if (n->simdclone->inbranch) this_badness += 8192; - int target_badness = targetm.simd_clone.usable (n); + int target_badness = targetm.simd_clone.usable (n, vinfo->vector_mode); if (target_badness < 0) continue; this_badness += target_badness * 512; From patchwork Wed Aug 30 09:17:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 137159 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a7d1:0:b0:3f2:4152:657d with SMTP id p17csp4419823vqm; Wed, 30 Aug 2023 02:19:00 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHt+b09oxhC2VZGBprfwh60rBTx+VxPMBkAKomgauW0zgnwQMGXsFFTH3XZXdPkW0DhrT0m X-Received: by 2002:a17:906:1bb1:b0:99d:fd65:dbb2 with SMTP id r17-20020a1709061bb100b0099dfd65dbb2mr1221222ejg.33.1693387139803; Wed, 30 Aug 2023 02:18:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693387139; cv=none; d=google.com; s=arc-20160816; b=jUz1aIxbq6lk2eSVBM1uaWwnZNjiDpb0kr+qtgJCbZyAishrHZ63lSIBnLIsdrWcm5 W+x+eJYWQxA0S2+I9hv9zYYBWOdA3orZ+I1mxNqVVhwsCgjNDMKU/SCvMHLiUbAZWCy1 4tp7LJnrWVugMZv2gXjlJHn94tLUZQCju+tOPLKmCc8dyLYW4Rsgl4qFo90S2FHUzgse iYGnj0IKp9m6qLy6fzvzc5WAeKUXtBAftMGcWKgx9hBjgO1pvAzL4ESTyvYj/m0TPmPt KuRBsOnFlQcqUId1rLDmFgQlR2gyDlelFbhSgYMiX4AN+a1Po/Cm5jM7wRdjlaxOvEps dJsg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:reply-to:from:list-subscribe:list-help:list-post :list-archive:list-unsubscribe:list-id:precedence:in-reply-to:cc :references:to:content-language:subject:user-agent:mime-version:date :message-id:dmarc-filter:delivered-to:dkim-signature:dkim-filter; bh=7gjs5Kq+Wwa+VNFeI2QPhrB+oXDd0GatUWD4bpqXrZc=; fh=sgurieNRT1F47MCuVM52yu0r4xmlRVSNh8p7V7FQz38=; b=F0Fl5JOcGoq0DMyt0Avh8BSGYxrWv/dB17wPYEh3iLukhvePv4I2nnvvtDareWQsk8 fO+7Whu+ToHmQeU9MdC348oXncjfh2v8ooJLsIJqT6bM13KhpnitVc2l/WajTnKZF8CL Paz66QeVzegCqy2xHctom4j9GeYqdSGPaFE48i9lnfEbsYl1TFhsy/15Q/cLcvhwVNrV uq1woeKvmD2aQZ0yPFm/ITOHznOfhcij7yrhayMQtpUQruMKbfHqmMf/FNlzbATRKa68 fHY1hheVASnFU7mEPmhTrAFGRQPmOp6IVNDyTJNEA7Ob8HUh7Bln6ZWMEe7RojefL5wG A/Jw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=sK2cA77s; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id m2-20020a170906258200b009930c0374basi5433957ejb.632.2023.08.30.02.18.59 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 30 Aug 2023 02:18:59 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=sK2cA77s; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A49CF385609B for ; Wed, 30 Aug 2023 09:18:28 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A49CF385609B DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693387108; bh=7gjs5Kq+Wwa+VNFeI2QPhrB+oXDd0GatUWD4bpqXrZc=; h=Date:Subject:To:References:Cc:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=sK2cA77s5/SADAE32vOMQyVTAajQaG3f1WJzfoXItr1cQC9SjZzQh2HFA9jDYbMgO Gxeh4cG6xonr2GECtkFshNrDF+wVAdWuha+HK3gKHxh/RPhKEsm4yXIPr3jSoCRqmr WPQJniWpUpgYNFdLbM2ncw/d4iF1tZosF78juTVE= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 1784F3857714 for ; Wed, 30 Aug 2023 09:17:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1784F3857714 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 3F60E2F4; Wed, 30 Aug 2023 02:18:21 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 0BF033F64C; Wed, 30 Aug 2023 02:17:40 -0700 (PDT) Message-ID: Date: Wed, 30 Aug 2023 10:17:39 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [PATCH7/8] vect: Add TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> Cc: Richard Biener , Richard Sandiford , "jakub@redhat.com" In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-13.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_LOTSOFHASH, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1775645113467852881 X-GMAIL-MSGID: 1775645113467852881 This patch adds a new target hook to enable us to adapt the types of return and parameters of simd clones. We use this in two ways, the first one is to make sure we can create valid SVE types, including the SVE type attribute, when creating a SVE simd clone, even when the target options do not support SVE. We are following the same behaviour seen with x86 that creates simd clones according to the ABI rules when no simdlen is provided, even if that simdlen is not supported by the current target options. Note that this doesn't mean the simd clone will be used in auto-vectorization. gcc/ChangeLog: (TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM): Define. * doc/tm.texi (TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM): Document. * doc/tm.texi.in (TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM): New. * omp-simd-clone.cc (simd_adjust_return_type): Call new hook. (simd_clone_adjust_argument_types): Likewise. * target.def (adjust_ret_or_param): New hook. * targhooks.cc (default_simd_clone_adjust_ret_or_param): New. * targhooks.h (default_simd_clone_adjust_ret_or_param): New. diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index bde22e562ebb9069122eb3b142ab8f4a4ae56a3a..b80c09ec36d51f1bb55b14229f46207fb4457223 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -6343,6 +6343,9 @@ non-negative number if it is usable. In that case, the smaller the number is, the more desirable it is to use it. @end deftypefn +@deftypefn {Target Hook} tree TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM (struct cgraph_node *@var{}, @var{tree}, @var{bool}) +If defined, this hook should adjust the type of the return or parameter +@var{type} to be used by the simd clone @var{node}. @end deftypefn @deftypefn {Target Hook} int TARGET_SIMT_VF (void) diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index 4ac96dc357d35e0e57bb43a41d1b1a4f66d05946..7496a32d84f7c422fe7ea88215ee72f3c354a3f4 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -4211,6 +4211,8 @@ address; but often a machine-dependent strategy can generate better code. @hook TARGET_SIMD_CLONE_USABLE +@hook TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM + @hook TARGET_SIMT_VF @hook TARGET_OMP_DEVICE_KIND_ARCH_ISA diff --git a/gcc/omp-simd-clone.cc b/gcc/omp-simd-clone.cc index ef0b9b48c7212900023bc0eaebca5e1f9389db77..c2fd4d3be878e56b6394e34097d2de826a0ba1ff 100644 --- a/gcc/omp-simd-clone.cc +++ b/gcc/omp-simd-clone.cc @@ -736,6 +736,7 @@ simd_clone_adjust_return_type (struct cgraph_node *node) t = build_array_type_nelts (t, exact_div (node->simdclone->simdlen, veclen)); } + t = targetm.simd_clone.adjust_ret_or_param (node, t, false); TREE_TYPE (TREE_TYPE (fndecl)) = t; if (!node->definition) return NULL_TREE; @@ -748,6 +749,7 @@ simd_clone_adjust_return_type (struct cgraph_node *node) tree atype = build_array_type_nelts (orig_rettype, node->simdclone->simdlen); + atype = targetm.simd_clone.adjust_ret_or_param (node, atype, false); if (maybe_ne (veclen, node->simdclone->simdlen)) return build1 (VIEW_CONVERT_EXPR, atype, t); @@ -880,6 +882,8 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) ? IDENTIFIER_POINTER (DECL_NAME (parm)) : NULL, parm_type, sc->simdlen); } + adj.type = targetm.simd_clone.adjust_ret_or_param (node, adj.type, + false); vec_safe_push (new_params, adj); } @@ -912,6 +916,8 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) adj.type = build_vector_type (pointer_sized_int_node, veclen); else adj.type = build_vector_type (base_type, veclen); + adj.type = targetm.simd_clone.adjust_ret_or_param (node, adj.type, + true); vec_safe_push (new_params, adj); k = vector_unroll_factor (sc->simdlen, veclen); @@ -937,6 +943,7 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) sc->args[i].simd_array = NULL_TREE; } sc->args[i].orig_type = base_type; + sc->args[i].vector_type = adj.type; sc->args[i].arg_type = SIMD_CLONE_ARG_TYPE_MASK; sc->args[i].vector_type = adj.type; } diff --git a/gcc/target.def b/gcc/target.def index 6a0cbc454526ee29011451b570354bf234a4eabd..665083ce035da03b40b15f23684ccdacce33c9d3 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -1650,6 +1650,13 @@ non-negative number if it is usable. In that case, the smaller the number is,\n the more desirable it is to use it.", int, (struct cgraph_node *, machine_mode), NULL) +DEFHOOK +(adjust_ret_or_param, +"If defined, this hook should adjust the type of the return or parameter\n\ +@var{type} to be used by the simd clone @var{node}.", +tree, (struct cgraph_node *, tree, bool), +default_simd_clone_adjust_ret_or_param) + HOOK_VECTOR_END (simd_clone) diff --git a/gcc/targhooks.h b/gcc/targhooks.h index 1a0db8dddd594d9b1fb04ae0d9a66ad6b7a396dc..558157514814228ef2ed41ae0861e1c088eea9ef 100644 --- a/gcc/targhooks.h +++ b/gcc/targhooks.h @@ -75,6 +75,9 @@ extern void default_print_operand (FILE *, rtx, int); extern void default_print_operand_address (FILE *, machine_mode, rtx); extern bool default_print_operand_punct_valid_p (unsigned char); extern tree default_mangle_assembler_name (const char *); +extern tree default_simd_clone_adjust_ret_or_param + (struct cgraph_node *,tree , bool); + extern machine_mode default_translate_mode_attribute (machine_mode); extern bool default_scalar_mode_supported_p (scalar_mode); diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc index e190369f87a92e6a92372dc348d9374c3a965c0a..6b6f6132c6dc62b92ad8d448d63ca6041386788f 100644 --- a/gcc/targhooks.cc +++ b/gcc/targhooks.cc @@ -399,6 +399,16 @@ default_mangle_assembler_name (const char *name ATTRIBUTE_UNUSED) return get_identifier (stripped); } +/* The default implementation of TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM. */ + +tree +default_simd_clone_adjust_ret_or_param (struct cgraph_node *node ATTRIBUTE_UNUSED, + tree type, + bool is_return ATTRIBUTE_UNUSED) +{ + return type; +} + /* The default implementation of TARGET_TRANSLATE_MODE_ATTRIBUTE. */ machine_mode From patchwork Wed Aug 30 09:19:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 137160 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a7d1:0:b0:3f2:4152:657d with SMTP id p17csp4420531vqm; Wed, 30 Aug 2023 02:20:57 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFm8zUbB3QQssKmzvLqfXGNLo0PG9FVizH1XcOltKyMht/aMx9zNkyz04RRXWDoibMDyhR5 X-Received: by 2002:a17:906:8a7b:b0:9a1:c221:465a with SMTP id hy27-20020a1709068a7b00b009a1c221465amr1273892ejc.9.1693387257481; Wed, 30 Aug 2023 02:20:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693387257; cv=none; d=google.com; s=arc-20160816; b=faH8go0xF1jdmYMCfrMWN+3gZ7cEc/K85w66NkNuizMElid4kfBOi+Isyuj8pKeJXu gH4afBFssCRHnqgqgxitKhmAEOz0NnBD+Rxny2yA8QVbJovJIrQo/yEZgWNz6AWl2ohu G0Rnsu6ejBWOR2JxhSEMT8i6iArUIu/4RW7XTflKCl2VZxxvWTDqZb35y+tUKeRzJURp l6iHyPCIDYZR0m+Xzy4JY213xG3TSbNf72UHqlES71zcM+DcdEM0cvhRaLZh3FNw+4Fy AAlkSsxZOiqYRTgfycKWl/Etmy9SZuuxS8/8EO130YCeKDl2aqe8/38DsX/QflQ1ONDQ wOng== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:reply-to:from:list-subscribe:list-help :list-post:list-archive:list-unsubscribe:list-id:precedence :in-reply-to:references:to:content-language:subject:user-agent :mime-version:date:message-id:dmarc-filter:delivered-to :dkim-signature:dkim-filter; bh=UiF55yh1WGL1MumlWOZqY6dEZYIK1OW3QecIm2mkRfs=; fh=C4nEn4uRKApr1WsFtLyJD8L5BeRuRc+JFyqoopFjd9M=; b=JignHE3uxvtFSed9UydWpagK5EbVEhvnN/VfVz5wMkDk9VRg/QeGW9BF9ywxlk4m6m 90VAKJ0B668cgKimV7Fg2kZExVxqnvQJAcYslx6Lm6tHLW167N3lbLGB9IGeHunIFnNA vNQvF3lfrqW7aLsJxdEzGWzS+U5ieGC8YGQyGz5bZNBryJAEVzD9/sqnGkv92r0vpz/H faWslSOXVSUAbicfJ9A9rVi+Mdeq+rIp2r+unSeR/U3flkdJ6P/PhIYI8spUkUIDSptY y3lzIx0snt2g0qLpnziZ/x/GNE5LxtUapH4d/AcEjxnGEsZCvEDU3m1THrHldDSMtQeb YQAQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=BRYVwS1i; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id a7-20020a17090680c700b009a190ce8511si7444533ejx.111.2023.08.30.02.20.57 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 30 Aug 2023 02:20:57 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=BRYVwS1i; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9EB203857714 for ; Wed, 30 Aug 2023 09:20:45 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 9EB203857714 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693387245; bh=UiF55yh1WGL1MumlWOZqY6dEZYIK1OW3QecIm2mkRfs=; h=Date:Subject:To:References:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=BRYVwS1idvBrKA+cbdPauCeC4tm1JWYQK8dORre6uMNqQkVv5T3DtwQiJFOWG/JWb jAqJCaTCZKqW4S0i8VmQry7RMTz2lQsHKKYAvJELIDkhqhY7rzaDW08684m7iP2jze DeQcAEprU9kDuttmgHkCOxqVC511nlVjycfTehB8= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id A43B73858402 for ; Wed, 30 Aug 2023 09:19:54 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A43B73858402 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A56632F4; Wed, 30 Aug 2023 02:20:33 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 79C043F64C; Wed, 30 Aug 2023 02:19:53 -0700 (PDT) Message-ID: <25cccf6c-1b3c-a032-7930-aba25a311dca@arm.com> Date: Wed, 30 Aug 2023 10:19:46 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [PATCH 8/8] aarch64: Add SVE support for simd clones [PR 96342] Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-13.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_LOTSOFHASH, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Cc: Richard Sandiford Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1775645237210252767 X-GMAIL-MSGID: 1775645237210252767 This patch finalizes adding support for the generation of SVE simd clones when no simdlen is provided, following the ABI rules where the widest data type determines the minimum amount of elements in a length agnostic vector. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (add_sve_type_attribute): Declare. * config/aarch64/aarch64-sve-builtins.cc (add_sve_type_attribute): Make visibility global. * config/aarch64/aarch64.cc (aarch64_fntype_abi): Ensure SVE ABI is chosen over SIMD ABI if a SVE type is used in return or arguments. (aarch64_simd_clone_compute_vecsize_and_simdlen): Create VLA simd clone when no simdlen is provided, according to ABI rules. (aarch64_simd_clone_adjust): Add '+sve' attribute to SVE simd clones. (aarch64_simd_clone_adjust_ret_or_param): New. (TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM): Define. * omp-simd-clone.cc (simd_clone_mangle): Print 'x' for VLA simdlen. (simd_clone_adjust): Adapt safelen check to be compatible with VLA simdlen. gcc/testsuite/ChangeLog: * c-c++-common/gomp/declare-variant-14.c: Adapt aarch64 scan. * gfortran.dg/gomp/declare-variant-14.f90: Likewise. * gcc.target/aarch64/declare-simd-1.c: Remove warning checks where no longer necessary. * gcc.target/aarch64/declare-simd-2.c: Add SVE clone scan. diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 70303d6fd953e0c397b9138ede8858c2db2e53db..d7888c95a4999fad1a4c55d5cd2287c2040302c8 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -1001,6 +1001,8 @@ namespace aarch64_sve { #ifdef GCC_TARGET_H bool verify_type_context (location_t, type_context_kind, const_tree, bool); #endif + void add_sve_type_attribute (tree, unsigned int, unsigned int, + const char *, const char *); } extern void aarch64_split_combinev16qi (rtx operands[3]); diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 161a14edde7c9fb1b13b146cf50463e2d78db264..6f99c438d10daa91b7e3b623c995489f1a8a0f4c 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -569,14 +569,16 @@ static bool reported_missing_registers_p; /* Record that TYPE is an ABI-defined SVE type that contains NUM_ZR SVE vectors and NUM_PR SVE predicates. MANGLED_NAME, if nonnull, is the ABI-defined mangling of the type. ACLE_NAME is the name of the type. */ -static void +void add_sve_type_attribute (tree type, unsigned int num_zr, unsigned int num_pr, const char *mangled_name, const char *acle_name) { tree mangled_name_tree = (mangled_name ? get_identifier (mangled_name) : NULL_TREE); + tree acle_name_tree + = (acle_name ? get_identifier (acle_name) : NULL_TREE); - tree value = tree_cons (NULL_TREE, get_identifier (acle_name), NULL_TREE); + tree value = tree_cons (NULL_TREE, acle_name_tree, NULL_TREE); value = tree_cons (NULL_TREE, mangled_name_tree, value); value = tree_cons (NULL_TREE, size_int (num_pr), value); value = tree_cons (NULL_TREE, size_int (num_zr), value); diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index a13d3fba05f9f9d2989b36c681bc77d71e943e0d..492acb9ce081866162faa8dfca777e4cb943797f 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -4034,13 +4034,13 @@ aarch64_takes_arguments_in_sve_regs_p (const_tree fntype) static const predefined_function_abi & aarch64_fntype_abi (const_tree fntype) { - if (lookup_attribute ("aarch64_vector_pcs", TYPE_ATTRIBUTES (fntype))) - return aarch64_simd_abi (); - if (aarch64_returns_value_in_sve_regs_p (fntype) || aarch64_takes_arguments_in_sve_regs_p (fntype)) return aarch64_sve_abi (); + if (lookup_attribute ("aarch64_vector_pcs", TYPE_ATTRIBUTES (fntype))) + return aarch64_simd_abi (); + return default_function_abi; } @@ -27327,7 +27327,7 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node, int num, bool explicit_p) { tree t, ret_type; - unsigned int nds_elt_bits; + unsigned int nds_elt_bits, wds_elt_bits; int count; unsigned HOST_WIDE_INT const_simdlen; poly_uint64 vec_bits; @@ -27374,10 +27374,14 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node, if (TREE_CODE (ret_type) != VOID_TYPE) { nds_elt_bits = lane_size (SIMD_CLONE_ARG_TYPE_VECTOR, ret_type); + wds_elt_bits = nds_elt_bits; vec_elts.safe_push (std::make_pair (ret_type, nds_elt_bits)); } else - nds_elt_bits = POINTER_SIZE; + { + nds_elt_bits = POINTER_SIZE; + wds_elt_bits = 0; + } int i; tree type_arg_types = TYPE_ARG_TYPES (TREE_TYPE (node->decl)); @@ -27385,30 +27389,36 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node, for (t = (decl_arg_p ? DECL_ARGUMENTS (node->decl) : type_arg_types), i = 0; t && t != void_list_node; t = TREE_CHAIN (t), i++) { - tree arg_type = decl_arg_p ? TREE_TYPE (t) : TREE_VALUE (t); + tree type = decl_arg_p ? TREE_TYPE (t) : TREE_VALUE (t); if (clonei->args[i].arg_type != SIMD_CLONE_ARG_TYPE_UNIFORM - && !supported_simd_type (arg_type)) + && !supported_simd_type (type)) { if (!explicit_p) ; - else if (COMPLEX_FLOAT_TYPE_P (ret_type)) + else if (COMPLEX_FLOAT_TYPE_P (type)) warning_at (DECL_SOURCE_LOCATION (node->decl), 0, "GCC does not currently support argument type %qT " - "for simd", arg_type); + "for simd", type); else warning_at (DECL_SOURCE_LOCATION (node->decl), 0, "unsupported argument type %qT for simd", - arg_type); + type); return 0; } - unsigned lane_bits = lane_size (clonei->args[i].arg_type, arg_type); + unsigned lane_bits = lane_size (clonei->args[i].arg_type, type); if (clonei->args[i].arg_type == SIMD_CLONE_ARG_TYPE_VECTOR) - vec_elts.safe_push (std::make_pair (arg_type, lane_bits)); + vec_elts.safe_push (std::make_pair (type, lane_bits)); if (nds_elt_bits > lane_bits) nds_elt_bits = lane_bits; + else if (wds_elt_bits < lane_bits) + wds_elt_bits = lane_bits; } - clonei->vecsize_mangle = 'n'; + /* If we could not determine the WDS type from available parameters/return, + then fallback to using uintptr_t. */ + if (wds_elt_bits == 0) + wds_elt_bits = POINTER_SIZE; + clonei->mask_mode = VOIDmode; poly_uint64 simdlen; auto_vec simdlens (2); @@ -27419,6 +27429,11 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node, simdlen = exact_div (poly_uint64 (64), nds_elt_bits); simdlens.safe_push (simdlen); simdlens.safe_push (simdlen * 2); + /* Only create a SVE simd clone if we aren't dealing with an unprototyped + function. */ + if (DECL_ARGUMENTS (node->decl) != 0 + || type_arg_types != 0) + simdlens.safe_push (exact_div (poly_uint64 (128, 128), wds_elt_bits)); } else simdlens.safe_push (clonei->simdlen); @@ -27439,19 +27454,20 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node, while (j < count && !simdlens.is_empty ()) { bool remove_simdlen = false; - for (auto elt : vec_elts) - if (known_gt (simdlens[j] * elt.second, 128U)) - { - /* Don't issue a warning for every simdclone when there is no - specific simdlen clause. */ - if (explicit_p && known_ne (clonei->simdlen, 0U)) - warning_at (DECL_SOURCE_LOCATION (node->decl), 0, - "GCC does not currently support simdlen %wd for " - "type %qT", - constant_lower_bound (simdlens[j]), elt.first); - remove_simdlen = true; - break; - } + if (simdlens[j].is_constant ()) + for (auto elt : vec_elts) + if (known_gt (simdlens[j] * elt.second, 128U)) + { + /* Don't issue a warning for every simdclone when there is no + specific simdlen clause. */ + if (explicit_p && known_ne (clonei->simdlen, 0U)) + warning_at (DECL_SOURCE_LOCATION (node->decl), 0, + "GCC does not currently support simdlen %wd for " + "type %qT", + constant_lower_bound (simdlens[j]), elt.first); + remove_simdlen = true; + break; + } if (remove_simdlen) { count--; @@ -27479,6 +27495,13 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node, gcc_assert (num < count); clonei->simdlen = simdlens[num]; + if (clonei->simdlen.is_constant ()) + clonei->vecsize_mangle = 'n'; + else + { + clonei->vecsize_mangle = 's'; + clonei->inbranch = 1; + } return count; } @@ -27493,6 +27516,11 @@ aarch64_simd_clone_adjust (struct cgraph_node *node) tree t = TREE_TYPE (node->decl); TYPE_ATTRIBUTES (t) = make_attribute ("aarch64_vector_pcs", "default", TYPE_ATTRIBUTES (t)); + if (node->simdclone->vecsize_mangle == 's') + { + tree target = build_string (strlen ("+sve"), "+sve"); + aarch64_option_valid_attribute_p (node->decl, NULL_TREE, target, 0); + } } /* Implement TARGET_SIMD_CLONE_USABLE. */ @@ -27517,6 +27545,57 @@ aarch64_simd_clone_usable (struct cgraph_node *node, machine_mode vector_mode) } } +/* Implement TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM. */ + +static tree +aarch64_simd_clone_adjust_ret_or_param (cgraph_node *node, tree type, + bool is_mask) +{ + if (type + && VECTOR_TYPE_P (type) + && node->simdclone->vecsize_mangle == 's') + { + cl_target_option cur_target; + cl_target_option_save (&cur_target, &global_options, &global_options_set); + tree new_target = DECL_FUNCTION_SPECIFIC_TARGET (node->decl); + cl_target_option_restore (&global_options, &global_options_set, + TREE_TARGET_OPTION (new_target)); + aarch64_override_options_internal (&global_options); + bool m_old_have_regs_of_mode[MAX_MACHINE_MODE]; + memcpy (m_old_have_regs_of_mode, have_regs_of_mode, + sizeof (have_regs_of_mode)); + for (int i = 0; i < NUM_MACHINE_MODES; ++i) + if (aarch64_sve_mode_p ((machine_mode) i)) + have_regs_of_mode[i] = true; + poly_uint16 old_sve_vg = aarch64_sve_vg; + if (!node->simdclone->simdlen.is_constant ()) + aarch64_sve_vg = poly_uint16 (2, 2); + unsigned int num_zr = 0; + unsigned int num_pr = 0; + type = TREE_TYPE (type); + type = build_vector_type (type, node->simdclone->simdlen); + if (is_mask) + { + type = truth_type_for (type); + num_pr = 1; + } + else + num_zr = 1; + + aarch64_sve::add_sve_type_attribute (type, num_zr, num_pr, NULL, NULL); + cl_target_option_restore (&global_options, &global_options_set, &cur_target); + aarch64_override_options_internal (&global_options); + memcpy (have_regs_of_mode, m_old_have_regs_of_mode, + sizeof (have_regs_of_mode)); + aarch64_sve_vg = old_sve_vg; + } + else if (type + && VECTOR_TYPE_P (type) + && is_mask) + type = truth_type_for (type); + return type; +} + /* Implement TARGET_COMP_TYPE_ATTRIBUTES */ static int @@ -28590,6 +28669,10 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_SIMD_CLONE_ADJUST #define TARGET_SIMD_CLONE_ADJUST aarch64_simd_clone_adjust +#undef TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM +#define TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM \ + aarch64_simd_clone_adjust_ret_or_param + #undef TARGET_SIMD_CLONE_USABLE #define TARGET_SIMD_CLONE_USABLE aarch64_simd_clone_usable diff --git a/gcc/omp-simd-clone.cc b/gcc/omp-simd-clone.cc index c2fd4d3be878e56b6394e34097d2de826a0ba1ff..091f194f1829fb9f70827d8674fd4dae44282d55 100644 --- a/gcc/omp-simd-clone.cc +++ b/gcc/omp-simd-clone.cc @@ -541,9 +541,12 @@ simd_clone_mangle (struct cgraph_node *node, pp_string (&pp, "_ZGV"); pp_character (&pp, vecsize_mangle); pp_character (&pp, mask); - /* For now, simdlen is always constant, while variable simdlen pp 'n'. */ - unsigned int len = simdlen.to_constant (); - pp_decimal_int (&pp, (len)); + + unsigned long long len = 0; + if (simdlen.is_constant (&len)) + pp_decimal_int (&pp, (int) (len)); + else + pp_character (&pp, 'x'); for (n = 0; n < clone_info->nargs; ++n) { @@ -1499,8 +1502,8 @@ simd_clone_adjust (struct cgraph_node *node) below). */ loop = alloc_loop (); cfun->has_force_vectorize_loops = true; - /* For now, simlen is always constant. */ - loop->safelen = node->simdclone->simdlen.to_constant (); + /* We can assert that safelen is the 'minimum' simdlen. */ + loop->safelen = constant_lower_bound (node->simdclone->simdlen); loop->force_vectorize = true; loop->header = body_bb; } diff --git a/gcc/testsuite/c-c++-common/gomp/declare-variant-14.c b/gcc/testsuite/c-c++-common/gomp/declare-variant-14.c index e3668893afe33a58c029cddd433d9bf43cce2bfa..12f8b3b839b7f3ff9e4f99768e59c0e1c5339062 100644 --- a/gcc/testsuite/c-c++-common/gomp/declare-variant-14.c +++ b/gcc/testsuite/c-c++-common/gomp/declare-variant-14.c @@ -21,7 +21,7 @@ test1 (int x) shall call f01 with score 8. */ /* { dg-final { scan-tree-dump-not "f04 \\\(x" "optimized" } } */ /* { dg-final { scan-tree-dump-times "f03 \\\(x" 14 "optimized" { target { !aarch64*-*-* } } } } */ - /* { dg-final { scan-tree-dump-times "f03 \\\(x" 10 "optimized" { target { aarch64*-*-* } } } } */ + /* { dg-final { scan-tree-dump-times "f03 \\\(x" 12 "optimized" { target { aarch64*-*-* } } } } */ /* { dg-final { scan-tree-dump-times "f01 \\\(x" 4 "optimized" { target { !aarch64*-*-* } } } } */ /* { dg-final { scan-tree-dump-times "f01 \\\(x" 0 "optimized" { target { aarch64*-*-* } } } } */ int a = f04 (x); diff --git a/gcc/testsuite/gcc.target/aarch64/declare-simd-1.c b/gcc/testsuite/gcc.target/aarch64/declare-simd-1.c index aab8c17f0c442a7cda4dce23cc18162a0b7f676e..add6e7c93019834fbd5bed5ead18b52d4cdd0a37 100644 --- a/gcc/testsuite/gcc.target/aarch64/declare-simd-1.c +++ b/gcc/testsuite/gcc.target/aarch64/declare-simd-1.c @@ -4,28 +4,39 @@ extern "C" { #endif #pragma omp declare simd -int __attribute__ ((const)) f00 (int a , char b) /* { dg-warning {GCC does not currently support a simdclone with simdlens 8 and 16 for these types.} } */ +int __attribute__ ((const)) f00 (int a , char b) { return a + b; } +/* { dg-final { scan-assembler-not {_ZGVn[a-z0-9]+_f00} } } */ +/* { dg-final { scan-assembler {_ZGVsMxvv_f00} } } */ + #pragma omp declare simd -long long __attribute__ ((const)) f01 (int a , short b) /* { dg-warning {GCC does not currently support a simdclone with simdlens 4 and 8 for these types.} } */ +long long __attribute__ ((const)) f01 (int a , short b) { return a + b; } +/* { dg-final { scan-assembler-not {_ZGVn[a-z0-9]+_f01} } } */ +/* { dg-final { scan-assembler {_ZGVsMxvv_f01} } } */ #pragma omp declare simd linear(b) -long long __attribute__ ((const)) f02 (short *b, int a) /* { dg-warning {GCC does not currently support a simdclone with simdlens 4 and 8 for these types.} } */ +long long __attribute__ ((const)) f02 (short *b, int a) { return a + *b; } +/* { dg-final { scan-assembler-not {_ZGVn[a-z0-9]+_f02} } } */ +/* { dg-final { scan-assembler {_ZGVsMxl2v_f02} } } */ + #pragma omp declare simd uniform(b) -void f03 (char b, int a) /* { dg-warning {GCC does not currently support a simdclone with simdlens 8 and 16 for these types.} } */ +void f03 (char b, int a) { } +/* { dg-final { scan-assembler-not {_ZGVn[a-z0-9]+_f03} } } */ +/* { dg-final { scan-assembler {_ZGVsMxuv_f03} } } */ + #pragma omp declare simd simdlen(4) double f04 (void) /* { dg-warning {GCC does not currently support simdlen 4 for type 'double'} } */ { diff --git a/gcc/testsuite/gcc.target/aarch64/declare-simd-2.c b/gcc/testsuite/gcc.target/aarch64/declare-simd-2.c index abb128ffc9cd2c1353b99eb38aae72377746e6d6..604869a30456e4db988bba86e059a27f19dda589 100644 --- a/gcc/testsuite/gcc.target/aarch64/declare-simd-2.c +++ b/gcc/testsuite/gcc.target/aarch64/declare-simd-2.c @@ -10,6 +10,7 @@ short __attribute__ ((const)) f00 (short a , char b) } /* { dg-final { scan-assembler {_ZGVnN8vv_f00:} } } */ /* { dg-final { scan-assembler {_ZGVnM8vv_f00:} } } */ +/* { dg-final { scan-assembler {_ZGVsMxvv_f00:} } } */ #pragma omp declare simd notinbranch short __attribute__ ((const)) f01 (int a , short b) @@ -17,6 +18,7 @@ short __attribute__ ((const)) f01 (int a , short b) return a + b; } /* { dg-final { scan-assembler {_ZGVnN4vv_f01:} } } */ +/* { dg-final { scan-assembler {_ZGVsMxvv_f01:} } } */ #pragma omp declare simd linear(b) inbranch int __attribute__ ((const)) f02 (int a, short *b) @@ -24,6 +26,7 @@ int __attribute__ ((const)) f02 (int a, short *b) return a + *b; } /* { dg-final { scan-assembler {_ZGVnM4vl2_f02:} } } */ +/* { dg-final { scan-assembler {_ZGVsMxvl2_f02:} } } */ #pragma omp declare simd uniform(a) notinbranch void f03 (char b, int a) @@ -31,6 +34,7 @@ void f03 (char b, int a) } /* { dg-final { scan-assembler {_ZGVnN8vu_f03:} } } */ /* { dg-final { scan-assembler {_ZGVnN16vu_f03:} } } */ +/* { dg-final { scan-assembler {_ZGVsMxvu_f03:} } } */ #pragma omp declare simd simdlen(2) float f04 (double a) @@ -39,6 +43,7 @@ float f04 (double a) } /* { dg-final { scan-assembler {_ZGVnN2v_f04:} } } */ /* { dg-final { scan-assembler {_ZGVnM2v_f04:} } } */ +/* { dg-final { scan-assembler-not {_ZGVs[0-9a-z]*_f04:} } } */ #pragma omp declare simd uniform(a) linear (b) void f05 (short a, short *b, short c) @@ -50,6 +55,7 @@ void f05 (short a, short *b, short c) /* { dg-final { scan-assembler {_ZGVnN4ul2v_f05:} } } */ /* { dg-final { scan-assembler {_ZGVnM8ul2v_f05:} } } */ /* { dg-final { scan-assembler {_ZGVnM8ul2v_f05:} } } */ +/* { dg-final { scan-assembler {_ZGVsMxul2v_f05:} } } */ #ifdef __cplusplus } #endif diff --git a/gcc/testsuite/gfortran.dg/gomp/declare-variant-14.f90 b/gcc/testsuite/gfortran.dg/gomp/declare-variant-14.f90 index 6319df0558f37b95f1b2eb17374bdb4ecbc33295..38677b8f7a76b960ce9363b1c0cabf6fc5086ab6 100644 --- a/gcc/testsuite/gfortran.dg/gomp/declare-variant-14.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/declare-variant-14.f90 @@ -41,7 +41,7 @@ contains ! shall call f01 with score 8. ! { dg-final { scan-tree-dump-not "f04 \\\(x" "optimized" } } ! { dg-final { scan-tree-dump-times "f03 \\\(x" 14 "optimized" { target { !aarch64*-*-* } } } } - ! { dg-final { scan-tree-dump-times "f03 \\\(x" 6 "optimized" { target { aarch64*-*-* } } } } + ! { dg-final { scan-tree-dump-times "f03 \\\(x" 8 "optimized" { target { aarch64*-*-* } } } } ! { dg-final { scan-tree-dump-times "f01 \\\(x" 4 "optimized" { target { !aarch64*-*-* } } } } ! { dg-final { scan-tree-dump-times "f01 \\\(x" 0 "optimized" { target { aarch64*-*-* } } } } a = f04 (x)