From patchwork Thu Jul 13 10:22:19 2023
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 119758
To: gcc-patches@gcc.gnu.org, Kyrylo.Tkachov@arm.com, richard.earnshaw@arm.com,
	richard.sandiford@arm.com
Cc: Christophe Lyon
Subject: [PATCH 1/6] arm: [MVE intrinsics] Factorize vcaddq vhcaddq
Date: Thu, 13 Jul 2023 10:22:19 +0000
Message-Id: <20230713102224.1161596-1-christophe.lyon@linaro.org>
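For context on what the intrinsics being factorized actually compute: vcaddq_rot90/vcaddq_rot270 treat even lanes as real parts and odd lanes as imaginary parts, rotate the second operand by +90 or -90 degrees in the complex plane, and add; vhcaddq additionally halves each result lane. The following is a minimal scalar sketch of those semantics (the helper names are made up for illustration, not the GCC or ACLE spellings):

```c
#include <assert.h>

/* Illustrative scalar model of the MVE complex-add operations.
   Even lanes hold real parts, odd lanes imaginary parts.  */

void cadd_rot90 (const int *a, const int *b, int *r, int n)
{
  for (int i = 0; i < n; i += 2)
    {
      r[i]     = a[i]     - b[i + 1];   /* re = re(a) - im(b): b * j  */
      r[i + 1] = a[i + 1] + b[i];       /* im = im(a) + re(b)         */
    }
}

void cadd_rot270 (const int *a, const int *b, int *r, int n)
{
  for (int i = 0; i < n; i += 2)
    {
      r[i]     = a[i]     + b[i + 1];   /* re = re(a) + im(b): b * -j */
      r[i + 1] = a[i + 1] - b[i];       /* im = im(a) - re(b)         */
    }
}

/* vhcaddq additionally halves each result lane, like vhaddq.  */
void hcadd_rot90 (const int *a, const int *b, int *r, int n)
{
  cadd_rot90 (a, b, r, n);
  for (int i = 0; i < n; i++)
    r[i] >>= 1;
}
```

The rot90 and rot270 variants share the same lane pairing and differ only in the sign pattern, which is why a single parameterized pattern with a rotation attribute (as this patch introduces) can cover all of them.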
Factorize vcaddq, vhcaddq so that they use the same parameterized names.

To be able to use the same patterns, we add a suffix to vcaddq.

Note that vcadd uses UNSPEC_VCADDxx for builtins without predication,
and VCADDQ_ROTxx_M_x (that is, not starting with "UNSPEC_") for the
predicated versions.  The UNSPEC_* names are also used by neon.md.

2023-07-13  Christophe Lyon

gcc/
	* config/arm/arm_mve_builtins.def (vcaddq_rot90_, vcaddq_rot270_)
	(vcaddq_rot90_f, vcaddq_rot270_f): Add "_" or "_f" suffix.
	* config/arm/iterators.md (mve_insn): Add vcadd, vhcadd.
	(isu): Add UNSPEC_VCADD90, UNSPEC_VCADD270, VCADDQ_ROT270_M_U,
	VCADDQ_ROT270_M_S, VCADDQ_ROT90_M_U, VCADDQ_ROT90_M_S,
	VHCADDQ_ROT90_M_S, VHCADDQ_ROT270_M_S, VHCADDQ_ROT90_S,
	VHCADDQ_ROT270_S.
	(rot): Add VCADDQ_ROT90_M_F, VCADDQ_ROT90_M_S, VCADDQ_ROT90_M_U,
	VCADDQ_ROT270_M_F, VCADDQ_ROT270_M_S, VCADDQ_ROT270_M_U,
	VHCADDQ_ROT90_S, VHCADDQ_ROT270_S, VHCADDQ_ROT90_M_S,
	VHCADDQ_ROT270_M_S.
	(mve_rot): Add VCADDQ_ROT90_M_F, VCADDQ_ROT90_M_S,
	VCADDQ_ROT90_M_U, VCADDQ_ROT270_M_F, VCADDQ_ROT270_M_S,
	VCADDQ_ROT270_M_U, VHCADDQ_ROT90_S, VHCADDQ_ROT270_S,
	VHCADDQ_ROT90_M_S, VHCADDQ_ROT270_M_S.
	(supf): Add VHCADDQ_ROT90_M_S, VHCADDQ_ROT270_M_S,
	VHCADDQ_ROT90_S, VHCADDQ_ROT270_S, UNSPEC_VCADD90,
	UNSPEC_VCADD270.
	(VCADDQ_ROT270_M): Delete.
	(VCADDQ_M_F, VxCADDQ, VxCADDQ_M): New.
	(VCADDQ_ROT90_M): Delete.
	* config/arm/mve.md (mve_vcaddq)
	(mve_vhcaddq_rot270_s, mve_vhcaddq_rot90_s): Merge into ...
	(@mve_q_): ... this.
	(mve_vcaddq): Rename into ...
	(@mve_q_f): ... this.
	(mve_vcaddq_rot270_m_)
	(mve_vcaddq_rot90_m_, mve_vhcaddq_rot270_m_s)
	(mve_vhcaddq_rot90_m_s): Merge into ...
	(@mve_q_m_): ... this.
	(mve_vcaddq_rot270_m_f, mve_vcaddq_rot90_m_f): Merge into ...
	(@mve_q_m_f): ... this.
---
 gcc/config/arm/arm_mve_builtins.def |   6 +-
 gcc/config/arm/iterators.md         |  38 +++++++-
 gcc/config/arm/mve.md               | 135 +++++-----------------
 3 files changed, 62 insertions(+), 117 deletions(-)

diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def
index 8de765de3b0..63ad1845593 100644
--- a/gcc/config/arm/arm_mve_builtins.def
+++ b/gcc/config/arm/arm_mve_builtins.def
@@ -187,6 +187,10 @@ VAR3 (BINOP_NONE_NONE_NONE, vmaxvq_s, v16qi, v8hi, v4si)
 VAR3 (BINOP_NONE_NONE_NONE, vmaxq_s, v16qi, v8hi, v4si)
 VAR3 (BINOP_NONE_NONE_NONE, vhsubq_s, v16qi, v8hi, v4si)
 VAR3 (BINOP_NONE_NONE_NONE, vhsubq_n_s, v16qi, v8hi, v4si)
+VAR3 (BINOP_NONE_NONE_NONE, vcaddq_rot90_, v16qi, v8hi, v4si)
+VAR3 (BINOP_NONE_NONE_NONE, vcaddq_rot270_, v16qi, v8hi, v4si)
+VAR2 (BINOP_NONE_NONE_NONE, vcaddq_rot90_f, v8hf, v4sf)
+VAR2 (BINOP_NONE_NONE_NONE, vcaddq_rot270_f, v8hf, v4sf)
 VAR3 (BINOP_NONE_NONE_NONE, vhcaddq_rot90_s, v16qi, v8hi, v4si)
 VAR3 (BINOP_NONE_NONE_NONE, vhcaddq_rot270_s, v16qi, v8hi, v4si)
 VAR3 (BINOP_NONE_NONE_NONE, vhaddq_s, v16qi, v8hi, v4si)
@@ -870,8 +874,6 @@ VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_vec_u, v16qi, v8hi, v4si)
 VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_carry_u, v16qi, v8hi, v4si)
 /* optabs without any suffixes.
 */
-VAR5 (BINOP_NONE_NONE_NONE, vcaddq_rot90, v16qi, v8hi, v4si, v8hf, v4sf)
-VAR5 (BINOP_NONE_NONE_NONE, vcaddq_rot270, v16qi, v8hi, v4si, v8hf, v4sf)
 VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot90, v8hf, v4sf)
 VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot270, v8hf, v4sf)
 VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot180, v8hf, v4sf)

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 9e77af55d60..da1ead34e58 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -902,6 +902,7 @@
 ])

 (define_int_attr mve_insn [
+		 (UNSPEC_VCADD90 "vcadd") (UNSPEC_VCADD270 "vcadd")
 		 (VABAVQ_P_S "vabav") (VABAVQ_P_U "vabav")
 		 (VABAVQ_S "vabav") (VABAVQ_U "vabav")
 		 (VABDQ_M_S "vabd") (VABDQ_M_U "vabd") (VABDQ_M_F "vabd")
@@ -925,6 +926,8 @@
 		 (VBICQ_N_S "vbic") (VBICQ_N_U "vbic")
 		 (VBRSRQ_M_N_S "vbrsr") (VBRSRQ_M_N_U "vbrsr") (VBRSRQ_M_N_F "vbrsr")
 		 (VBRSRQ_N_S "vbrsr") (VBRSRQ_N_U "vbrsr") (VBRSRQ_N_F "vbrsr")
+		 (VCADDQ_ROT270_M_U "vcadd") (VCADDQ_ROT270_M_S "vcadd") (VCADDQ_ROT270_M_F "vcadd")
+		 (VCADDQ_ROT90_M_U "vcadd") (VCADDQ_ROT90_M_S "vcadd") (VCADDQ_ROT90_M_F "vcadd")
 		 (VCLSQ_M_S "vcls") (VCLSQ_S "vcls")
 		 (VCLZQ_M_S "vclz") (VCLZQ_M_U "vclz")
@@ -944,6 +947,8 @@
 		 (VHADDQ_M_S "vhadd") (VHADDQ_M_U "vhadd")
 		 (VHADDQ_N_S "vhadd") (VHADDQ_N_U "vhadd")
 		 (VHADDQ_S "vhadd") (VHADDQ_U "vhadd")
+		 (VHCADDQ_ROT90_M_S "vhcadd") (VHCADDQ_ROT270_M_S "vhcadd")
+		 (VHCADDQ_ROT90_S "vhcadd") (VHCADDQ_ROT270_S "vhcadd")
 		 (VHSUBQ_M_N_S "vhsub") (VHSUBQ_M_N_U "vhsub")
 		 (VHSUBQ_M_S "vhsub") (VHSUBQ_M_U "vhsub")
 		 (VHSUBQ_N_S "vhsub") (VHSUBQ_N_U "vhsub")
@@ -1190,7 +1195,10 @@
 ])

 (define_int_attr isu [
+		    (UNSPEC_VCADD90 "i") (UNSPEC_VCADD270 "i")
 		    (VABSQ_M_S "s")
+		    (VCADDQ_ROT270_M_U "i") (VCADDQ_ROT270_M_S "i")
+		    (VCADDQ_ROT90_M_U "i") (VCADDQ_ROT90_M_S "i")
 		    (VCLSQ_M_S "s")
 		    (VCLZQ_M_S "i") (VCLZQ_M_U "i")
@@ -1214,6 +1222,8 @@
 		    (VCMPNEQ_M_N_U "i")
 		    (VCMPNEQ_M_S "i")
 		    (VCMPNEQ_M_U "i")
+		    (VHCADDQ_ROT90_M_S "s") (VHCADDQ_ROT270_M_S "s")
+		    (VHCADDQ_ROT90_S "s") (VHCADDQ_ROT270_S "s")
 		    (VMOVNBQ_M_S "i") (VMOVNBQ_M_U "i")
 		    (VMOVNBQ_S "i") (VMOVNBQ_U "i")
 		    (VMOVNTQ_M_S "i") (VMOVNTQ_M_U "i")
@@ -2155,6 +2165,16 @@
 (define_int_attr rot [(UNSPEC_VCADD90 "90")
 		      (UNSPEC_VCADD270 "270")
+		      (VCADDQ_ROT90_M_F "90")
+		      (VCADDQ_ROT90_M_S "90")
+		      (VCADDQ_ROT90_M_U "90")
+		      (VCADDQ_ROT270_M_F "270")
+		      (VCADDQ_ROT270_M_S "270")
+		      (VCADDQ_ROT270_M_U "270")
+		      (VHCADDQ_ROT90_S "90")
+		      (VHCADDQ_ROT270_S "270")
+		      (VHCADDQ_ROT90_M_S "90")
+		      (VHCADDQ_ROT270_M_S "270")
 		      (UNSPEC_VCMUL "0")
 		      (UNSPEC_VCMUL90 "90")
 		      (UNSPEC_VCMUL180 "180")
@@ -2193,6 +2213,16 @@
 (define_int_attr mve_rot [(UNSPEC_VCADD90 "_rot90")
 			  (UNSPEC_VCADD270 "_rot270")
+			  (VCADDQ_ROT90_M_F "_rot90")
+			  (VCADDQ_ROT90_M_S "_rot90")
+			  (VCADDQ_ROT90_M_U "_rot90")
+			  (VCADDQ_ROT270_M_F "_rot270")
+			  (VCADDQ_ROT270_M_S "_rot270")
+			  (VCADDQ_ROT270_M_U "_rot270")
+			  (VHCADDQ_ROT90_S "_rot90")
+			  (VHCADDQ_ROT270_S "_rot270")
+			  (VHCADDQ_ROT90_M_S "_rot90")
+			  (VHCADDQ_ROT270_M_S "_rot270")
 			  (UNSPEC_VCMLA "")
 			  (UNSPEC_VCMLA90 "_rot90")
 			  (UNSPEC_VCMLA180 "_rot180")
@@ -2535,6 +2565,9 @@
 		       (VRMLALDAVHAQ_P_S "s") (VRMLALDAVHAQ_P_U "u")
 		       (VQSHLUQ_M_N_S "s") (VQSHLUQ_N_S "s")
+		       (VHCADDQ_ROT90_M_S "s") (VHCADDQ_ROT270_M_S "s")
+		       (VHCADDQ_ROT90_S "s") (VHCADDQ_ROT270_S "s")
+		       (UNSPEC_VCADD90 "") (UNSPEC_VCADD270 "")
 ])

 ;; Both kinds of return insn.
@@ -2767,7 +2800,9 @@
 (define_int_iterator VANDQ_M [VANDQ_M_U VANDQ_M_S])
 (define_int_iterator VBICQ_M [VBICQ_M_U VBICQ_M_S])
 (define_int_iterator VSHLQ_M_N [VSHLQ_M_N_S VSHLQ_M_N_U])
-(define_int_iterator VCADDQ_ROT270_M [VCADDQ_ROT270_M_U VCADDQ_ROT270_M_S])
+(define_int_iterator VCADDQ_M_F [VCADDQ_ROT90_M_F VCADDQ_ROT270_M_F])
+(define_int_iterator VxCADDQ [UNSPEC_VCADD90 UNSPEC_VCADD270 VHCADDQ_ROT90_S VHCADDQ_ROT270_S])
+(define_int_iterator VxCADDQ_M [VHCADDQ_ROT90_M_S VHCADDQ_ROT270_M_S VCADDQ_ROT90_M_U VCADDQ_ROT90_M_S VCADDQ_ROT270_M_U VCADDQ_ROT270_M_S])
 (define_int_iterator VQRSHLQ_M [VQRSHLQ_M_U VQRSHLQ_M_S])
 (define_int_iterator VQADDQ_M_N [VQADDQ_M_N_U VQADDQ_M_N_S])
 (define_int_iterator VADDQ_M_N [VADDQ_M_N_S VADDQ_M_N_U])
@@ -2777,7 +2812,6 @@
 (define_int_iterator VMLADAVAQ_P [VMLADAVAQ_P_U VMLADAVAQ_P_S])
 (define_int_iterator VBRSRQ_M_N [VBRSRQ_M_N_U VBRSRQ_M_N_S])
 (define_int_iterator VMULQ_M_N [VMULQ_M_N_U VMULQ_M_N_S])
-(define_int_iterator VCADDQ_ROT90_M [VCADDQ_ROT90_M_U VCADDQ_ROT90_M_S])
 (define_int_iterator VMULLTQ_INT_M [VMULLTQ_INT_M_S VMULLTQ_INT_M_U])
 (define_int_iterator VEORQ_M [VEORQ_M_S VEORQ_M_U])
 (define_int_iterator VSHRQ_M_N [VSHRQ_M_N_S VSHRQ_M_N_U])

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 74909ce47e1..a6db6d1b81d 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -839,17 +839,20 @@
 ])

 ;;
-;; [vcaddq, vcaddq_rot90, vcadd_rot180, vcadd_rot270])
+;; [vcaddq_rot90_s, vcadd_rot90_u]
+;; [vcaddq_rot270_s, vcadd_rot270_u]
+;; [vhcaddq_rot90_s]
+;; [vhcaddq_rot270_s]
 ;;
-(define_insn "mve_vcaddq"
+(define_insn "@mve_q_"
   [
    (set (match_operand:MVE_2 0 "s_register_operand" "")
	(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")
		       (match_operand:MVE_2 2 "s_register_operand" "w")]
-	 VCADD))
+	 VxCADDQ))
   ]
   "TARGET_HAVE_MVE"
-  "vcadd.i%# %q0, %q1, %q2, #"
+  ".%#\t%q0, %q1, %q2, #"
   [(set_attr "type" "mve_move")
 ])

@@ -904,36 +907,6 @@
   [(set_attr "type" "mve_move")
 ])

-;;
-;; [vhcaddq_rot270_s])
-;;
-(define_insn "mve_vhcaddq_rot270_s"
-  [
-   (set (match_operand:MVE_2 0 "s_register_operand" "")
-	(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")
-		       (match_operand:MVE_2 2 "s_register_operand" "w")]
-	 VHCADDQ_ROT270_S))
-  ]
-  "TARGET_HAVE_MVE"
-  "vhcadd.s%#\t%q0, %q1, %q2, #270"
-  [(set_attr "type" "mve_move")
-])
-
-;;
-;; [vhcaddq_rot90_s])
-;;
-(define_insn "mve_vhcaddq_rot90_s"
-  [
-   (set (match_operand:MVE_2 0 "s_register_operand" "")
-	(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "w")
-		       (match_operand:MVE_2 2 "s_register_operand" "w")]
-	 VHCADDQ_ROT90_S))
-  ]
-  "TARGET_HAVE_MVE"
-  "vhcadd.s%#\t%q0, %q1, %q2, #90"
-  [(set_attr "type" "mve_move")
-])
-
 ;;
 ;; [vmaxaq_s]
 ;; [vminaq_s]
@@ -1238,9 +1211,9 @@
 ])

 ;;
-;; [vcaddq, vcaddq_rot90, vcadd_rot180, vcadd_rot270])
+;; [vcaddq_rot90_f, vcaddq_rot270_f]
 ;;
-(define_insn "mve_vcaddq"
+(define_insn "@mve_q_f"
   [
    (set (match_operand:MVE_0 0 "s_register_operand" "")
	(unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w")
		       (match_operand:MVE_0 2 "s_register_operand" "w")]
	 VCADD))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vcadd.f%# %q0, %q1, %q2, #"
+  ".f%#\t%q0, %q1, %q2, #"
   [(set_attr "type" "mve_move")
 ])

@@ -2788,36 +2761,22 @@
  (set_attr "length""8")])

 ;;
-;; [vcaddq_rot270_m_u, vcaddq_rot270_m_s])
+;; [vcaddq_rot90_m_u, vcaddq_rot90_m_s]
+;; [vcaddq_rot270_m_u, vcaddq_rot270_m_s]
+;; [vhcaddq_rot90_m_s]
+;; [vhcaddq_rot270_m_s]
 ;;
-(define_insn "mve_vcaddq_rot270_m_"
+(define_insn "@mve_q_m_"
   [
    (set (match_operand:MVE_2 0 "s_register_operand" "")
	(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0")
		       (match_operand:MVE_2 2 "s_register_operand" "w")
		       (match_operand:MVE_2 3 "s_register_operand" "w")
		       (match_operand: 4 "vpr_register_operand" "Up")]
-	 VCADDQ_ROT270_M))
+	 VxCADDQ_M))
   ]
   "TARGET_HAVE_MVE"
-  "vpst\;vcaddt.i%# %q0, %q2, %q3, #270"
-  [(set_attr "type" "mve_move")
-   (set_attr "length""8")])
-
-;;
-;; [vcaddq_rot90_m_u, vcaddq_rot90_m_s])
-;;
-(define_insn "mve_vcaddq_rot90_m_"
-  [
-   (set (match_operand:MVE_2 0 "s_register_operand" "")
-	(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0")
-		       (match_operand:MVE_2 2 "s_register_operand" "w")
-		       (match_operand:MVE_2 3 "s_register_operand" "w")
-		       (match_operand: 4 "vpr_register_operand" "Up")]
-	 VCADDQ_ROT90_M))
-  ]
-  "TARGET_HAVE_MVE"
-  "vpst\;vcaddt.i%# %q0, %q2, %q3, #90"
+  "vpst\;t.%#\t%q0, %q2, %q3, #"
   [(set_attr "type" "mve_move")
    (set_attr "length""8")])

@@ -2974,40 +2933,6 @@
   [(set_attr "type" "mve_move")
    (set_attr "length""8")])

-;;
-;; [vhcaddq_rot270_m_s])
-;;
-(define_insn "mve_vhcaddq_rot270_m_s"
-  [
-   (set (match_operand:MVE_2 0 "s_register_operand" "")
-	(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0")
-		       (match_operand:MVE_2 2 "s_register_operand" "w")
-		       (match_operand:MVE_2 3 "s_register_operand" "w")
-		       (match_operand: 4 "vpr_register_operand" "Up")]
-	 VHCADDQ_ROT270_M_S))
-  ]
-  "TARGET_HAVE_MVE"
-  "vpst\;vhcaddt.s%#\t%q0, %q2, %q3, #270"
-  [(set_attr "type" "mve_move")
-   (set_attr "length""8")])
-
-;;
-;; [vhcaddq_rot90_m_s])
-;;
-(define_insn "mve_vhcaddq_rot90_m_s"
-  [
-   (set (match_operand:MVE_2 0 "s_register_operand" "")
-	(unspec:MVE_2 [(match_operand:MVE_2 1 "s_register_operand" "0")
-		       (match_operand:MVE_2 2 "s_register_operand" "w")
-		       (match_operand:MVE_2 3 "s_register_operand" "w")
-		       (match_operand: 4 "vpr_register_operand" "Up")]
-	 VHCADDQ_ROT90_M_S))
-  ]
-  "TARGET_HAVE_MVE"
-  "vpst\;vhcaddt.s%#\t%q0, %q2, %q3, #90"
-  [(set_attr "type" "mve_move")
-   (set_attr "length""8")])
-
 ;;
 ;; [vmlaldavaq_p_u, vmlaldavaq_p_s]
 ;; [vmlaldavaxq_p_s]
@@ -3247,36 +3172,20 @@
  (set_attr "length""8")])

 ;;
-;; [vcaddq_rot270_m_f])
-;;
-(define_insn "mve_vcaddq_rot270_m_f"
-  [
-   (set (match_operand:MVE_0 0 "s_register_operand" "")
-	(unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0")
-		       (match_operand:MVE_0 2 "s_register_operand" "w")
-		       (match_operand:MVE_0 3 "s_register_operand" "w")
-		       (match_operand: 4 "vpr_register_operand" "Up")]
-	 VCADDQ_ROT270_M_F))
-  ]
-  "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vpst\;vcaddt.f%# %q0, %q2, %q3, #270"
-  [(set_attr "type" "mve_move")
-   (set_attr "length""8")])
-
-;;
-;; [vcaddq_rot90_m_f])
+;; [vcaddq_rot90_m_f]
+;; [vcaddq_rot270_m_f]
 ;;
-(define_insn "mve_vcaddq_rot90_m_f"
+(define_insn "@mve_q_m_f"
   [
    (set (match_operand:MVE_0 0 "s_register_operand" "")
	(unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0")
		       (match_operand:MVE_0 2 "s_register_operand" "w")
		       (match_operand:MVE_0 3 "s_register_operand" "w")
		       (match_operand: 4 "vpr_register_operand" "Up")]
-	 VCADDQ_ROT90_M_F))
+	 VCADDQ_M_F))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vpst\;vcaddt.f%# %q0, %q2, %q3, #90"
+  "vpst\;t.f%#\t%q0, %q2, %q3, #"
   [(set_attr "type" "mve_move")
    (set_attr "length""8")])

From patchwork Thu Jul 13 10:22:20 2023
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 119764
To: gcc-patches@gcc.gnu.org, Kyrylo.Tkachov@arm.com, richard.earnshaw@arm.com,
	richard.sandiford@arm.com
Cc: Christophe Lyon
Subject: [PATCH 2/6] arm: [MVE intrinsics] rework vcaddq vhcaddq
Date: Thu, 13 Jul 2023 10:22:20 +0000
Message-Id: <20230713102224.1161596-2-christophe.lyon@linaro.org>
In-Reply-To: <20230713102224.1161596-1-christophe.lyon@linaro.org>
References: <20230713102224.1161596-1-christophe.lyon@linaro.org>
Implement vcaddq, vhcaddq using the new MVE builtins framework.

2023-07-13  Christophe Lyon

gcc/
	* config/arm/arm-mve-builtins-base.cc (vcaddq_rot90)
	(vcaddq_rot270, vhcaddq_rot90, vhcaddq_rot270): New.
	* config/arm/arm-mve-builtins-base.def (vcaddq_rot90)
	(vcaddq_rot270, vhcaddq_rot90, vhcaddq_rot270): New.
	* config/arm/arm-mve-builtins-base.h (vcaddq_rot90)
	(vcaddq_rot270, vhcaddq_rot90, vhcaddq_rot270): New.
	* config/arm/arm-mve-builtins-functions.h (class
	unspec_mve_function_exact_insn_rot): New.
	* config/arm/arm_mve.h (vcaddq_rot90, vcaddq_rot270)
	(vhcaddq_rot90, vhcaddq_rot270, vcaddq_rot270_m, vcaddq_rot90_m)
	(vhcaddq_rot270_m, vhcaddq_rot90_m, vcaddq_rot90_x)
	(vcaddq_rot270_x, vhcaddq_rot90_x, vhcaddq_rot270_x)
	(vcaddq_rot90_u8, vcaddq_rot270_u8, vhcaddq_rot90_s8)
	(vhcaddq_rot270_s8, vcaddq_rot90_s8, vcaddq_rot270_s8)
	(vcaddq_rot90_u16, vcaddq_rot270_u16, vhcaddq_rot90_s16)
	(vhcaddq_rot270_s16, vcaddq_rot90_s16, vcaddq_rot270_s16)
	(vcaddq_rot90_u32, vcaddq_rot270_u32, vhcaddq_rot90_s32)
	(vhcaddq_rot270_s32, vcaddq_rot90_s32, vcaddq_rot270_s32)
	(vcaddq_rot90_f16, vcaddq_rot270_f16, vcaddq_rot90_f32)
	(vcaddq_rot270_f32, vcaddq_rot270_m_s8)
	(vcaddq_rot270_m_s32): Delete.
	(vcaddq_rot270_m_s16, vcaddq_rot270_m_u8, vcaddq_rot270_m_u32)
	(vcaddq_rot270_m_u16, vcaddq_rot90_m_s8, vcaddq_rot90_m_s32)
	(vcaddq_rot90_m_s16, vcaddq_rot90_m_u8, vcaddq_rot90_m_u32)
	(vcaddq_rot90_m_u16, vhcaddq_rot270_m_s8, vhcaddq_rot270_m_s32)
	(vhcaddq_rot270_m_s16, vhcaddq_rot90_m_s8, vhcaddq_rot90_m_s32)
	(vhcaddq_rot90_m_s16, vcaddq_rot270_m_f32, vcaddq_rot270_m_f16)
	(vcaddq_rot90_m_f32, vcaddq_rot90_m_f16, vcaddq_rot90_x_s8)
	(vcaddq_rot90_x_s16, vcaddq_rot90_x_s32, vcaddq_rot90_x_u8)
	(vcaddq_rot90_x_u16, vcaddq_rot90_x_u32, vcaddq_rot270_x_s8)
	(vcaddq_rot270_x_s16, vcaddq_rot270_x_s32, vcaddq_rot270_x_u8)
	(vcaddq_rot270_x_u16, vcaddq_rot270_x_u32, vhcaddq_rot90_x_s8)
	(vhcaddq_rot90_x_s16, vhcaddq_rot90_x_s32, vhcaddq_rot270_x_s8)
	(vhcaddq_rot270_x_s16, vhcaddq_rot270_x_s32, vcaddq_rot90_x_f16)
	(vcaddq_rot90_x_f32, vcaddq_rot270_x_f16, vcaddq_rot270_x_f32)
	(__arm_vcaddq_rot90_u8, __arm_vcaddq_rot270_u8)
	(__arm_vhcaddq_rot90_s8, __arm_vhcaddq_rot270_s8)
	(__arm_vcaddq_rot90_s8, __arm_vcaddq_rot270_s8)
	(__arm_vcaddq_rot90_u16, __arm_vcaddq_rot270_u16)
	(__arm_vhcaddq_rot90_s16, __arm_vhcaddq_rot270_s16)
	(__arm_vcaddq_rot90_s16, __arm_vcaddq_rot270_s16)
	(__arm_vcaddq_rot90_u32, __arm_vcaddq_rot270_u32)
	(__arm_vhcaddq_rot90_s32, __arm_vhcaddq_rot270_s32)
	(__arm_vcaddq_rot90_s32, __arm_vcaddq_rot270_s32)
	(__arm_vcaddq_rot270_m_s8, __arm_vcaddq_rot270_m_s32): Delete.
	(__arm_vcaddq_rot270_m_s16, __arm_vcaddq_rot270_m_u8)
	(__arm_vcaddq_rot270_m_u32, __arm_vcaddq_rot270_m_u16)
	(__arm_vcaddq_rot90_m_s8, __arm_vcaddq_rot90_m_s32)
	(__arm_vcaddq_rot90_m_s16, __arm_vcaddq_rot90_m_u8)
	(__arm_vcaddq_rot90_m_u32, __arm_vcaddq_rot90_m_u16)
	(__arm_vhcaddq_rot270_m_s8, __arm_vhcaddq_rot270_m_s32)
	(__arm_vhcaddq_rot270_m_s16, __arm_vhcaddq_rot90_m_s8)
	(__arm_vhcaddq_rot90_m_s32, __arm_vhcaddq_rot90_m_s16)
	(__arm_vcaddq_rot90_x_s8, __arm_vcaddq_rot90_x_s16)
	(__arm_vcaddq_rot90_x_s32, __arm_vcaddq_rot90_x_u8)
	(__arm_vcaddq_rot90_x_u16, __arm_vcaddq_rot90_x_u32)
	(__arm_vcaddq_rot270_x_s8, __arm_vcaddq_rot270_x_s16)
	(__arm_vcaddq_rot270_x_s32, __arm_vcaddq_rot270_x_u8)
	(__arm_vcaddq_rot270_x_u16, __arm_vcaddq_rot270_x_u32)
	(__arm_vhcaddq_rot90_x_s8, __arm_vhcaddq_rot90_x_s16)
	(__arm_vhcaddq_rot90_x_s32, __arm_vhcaddq_rot270_x_s8)
	(__arm_vhcaddq_rot270_x_s16, __arm_vhcaddq_rot270_x_s32)
	(__arm_vcaddq_rot90_f16, __arm_vcaddq_rot270_f16)
	(__arm_vcaddq_rot90_f32, __arm_vcaddq_rot270_f32)
	(__arm_vcaddq_rot270_m_f32, __arm_vcaddq_rot270_m_f16)
	(__arm_vcaddq_rot90_m_f32, __arm_vcaddq_rot90_m_f16)
	(__arm_vcaddq_rot90_x_f16, __arm_vcaddq_rot90_x_f32)
	(__arm_vcaddq_rot270_x_f16, __arm_vcaddq_rot270_x_f32)
	(__arm_vcaddq_rot90, __arm_vcaddq_rot270)
	(__arm_vhcaddq_rot90, __arm_vhcaddq_rot270)
	(__arm_vcaddq_rot270_m, __arm_vcaddq_rot90_m)
	(__arm_vhcaddq_rot270_m, __arm_vhcaddq_rot90_m)
	(__arm_vcaddq_rot90_x, __arm_vcaddq_rot270_x): Delete.
	(__arm_vhcaddq_rot90_x, __arm_vhcaddq_rot270_x): Delete.
---
 gcc/config/arm/arm-mve-builtins-base.cc     |    4 +
 gcc/config/arm/arm-mve-builtins-base.def    |    6 +
 gcc/config/arm/arm-mve-builtins-base.h      |    4 +
 gcc/config/arm/arm-mve-builtins-functions.h |  103 ++
 gcc/config/arm/arm_mve.h                    | 1398 ++-----------------
 5 files changed, 215 insertions(+), 1300 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index af02397f1c4..f15bb926147 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -260,6 +260,10 @@ FUNCTION_PRED_P_S_U (vaddvq, VADDVQ)
 FUNCTION_PRED_P_S_U (vaddvaq, VADDVAQ)
 FUNCTION_WITH_RTX_M (vandq, AND, VANDQ)
 FUNCTION_ONLY_N (vbrsrq, VBRSRQ)
+FUNCTION (vcaddq_rot90, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD90, UNSPEC_VCADD90, UNSPEC_VCADD90, VCADDQ_ROT90_M_S, VCADDQ_ROT90_M_U, VCADDQ_ROT90_M_F))
+FUNCTION (vcaddq_rot270, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD270, UNSPEC_VCADD270, UNSPEC_VCADD270, VCADDQ_ROT270_M_S, VCADDQ_ROT270_M_U, VCADDQ_ROT270_M_F))
+FUNCTION (vhcaddq_rot90, unspec_mve_function_exact_insn_rot, (VHCADDQ_ROT90_S, -1, -1, VHCADDQ_ROT90_M_S, -1, -1))
+FUNCTION (vhcaddq_rot270, unspec_mve_function_exact_insn_rot, (VHCADDQ_ROT270_S, -1, -1, VHCADDQ_ROT270_M_S, -1, -1))
 FUNCTION_WITHOUT_N_NO_U_F (vclsq, VCLSQ)
 FUNCTION (vclzq, unspec_based_mve_function_exact_insn, (CLZ, CLZ, CLZ, -1, -1, -1, VCLZQ_M_S, VCLZQ_M_U, -1, -1, -1 ,-1))
 FUNCTION (vcmpeqq, unspec_based_mve_function_exact_insn_vcmp, (EQ, EQ, EQ, VCMPEQQ_M_S, VCMPEQQ_M_U, VCMPEQQ_M_F, VCMPEQQ_M_N_S, VCMPEQQ_M_N_U, VCMPEQQ_M_N_F))

diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def
index ee08d063407..9a793147960 100644
--- a/gcc/config/arm/arm-mve-builtins-base.def
+++ b/gcc/config/arm/arm-mve-builtins-base.def
@@ -28,6 +28,8 @@ DEF_MVE_FUNCTION (vaddvaq, unary_int32_acc, all_integer, p_or_none)
 DEF_MVE_FUNCTION (vaddvq,
unary_int32, all_integer, p_or_none) DEF_MVE_FUNCTION (vandq, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vbrsrq, binary_imm32, all_integer, mx_or_none) +DEF_MVE_FUNCTION (vcaddq_rot90, binary, all_integer, mx_or_none) +DEF_MVE_FUNCTION (vcaddq_rot270, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vclsq, unary, all_signed, mx_or_none) DEF_MVE_FUNCTION (vclzq, unary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vcmpcsq, cmp, all_unsigned, m_or_none) @@ -42,6 +44,8 @@ DEF_MVE_FUNCTION (vcreateq, create, all_integer_with_64, none) DEF_MVE_FUNCTION (vdupq, unary_n, all_integer, mx_or_none) DEF_MVE_FUNCTION (veorq, binary, all_integer, mx_or_none) DEF_MVE_FUNCTION (vhaddq, binary_opt_n, all_integer, mx_or_none) +DEF_MVE_FUNCTION (vhcaddq_rot90, binary, all_signed, mx_or_none) +DEF_MVE_FUNCTION (vhcaddq_rot270, binary, all_signed, mx_or_none) DEF_MVE_FUNCTION (vhsubq, binary_opt_n, all_integer, mx_or_none) DEF_MVE_FUNCTION (vmaxaq, binary_maxamina, all_signed, m_or_none) DEF_MVE_FUNCTION (vmaxavq, binary_maxavminav, all_signed, p_or_none) @@ -152,6 +156,8 @@ DEF_MVE_FUNCTION (vabsq, unary, all_float, mx_or_none) DEF_MVE_FUNCTION (vaddq, binary_opt_n, all_float, mx_or_none) DEF_MVE_FUNCTION (vandq, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vbrsrq, binary_imm32, all_float, mx_or_none) +DEF_MVE_FUNCTION (vcaddq_rot90, binary, all_float, mx_or_none) +DEF_MVE_FUNCTION (vcaddq_rot270, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcmpeqq, cmp, all_float, m_or_none) DEF_MVE_FUNCTION (vcmpgeq, cmp, all_float, m_or_none) DEF_MVE_FUNCTION (vcmpgtq, cmp, all_float, m_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index 942c8587446..8dac72970e0 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -33,6 +33,8 @@ extern const function_base *const vaddvaq; extern const function_base *const vaddvq; extern const function_base *const vandq; extern const function_base 
*const vbrsrq; +extern const function_base *const vcaddq_rot90; +extern const function_base *const vcaddq_rot270; extern const function_base *const vclsq; extern const function_base *const vclzq; extern const function_base *const vcmpcsq; @@ -50,6 +52,8 @@ extern const function_base *const vfmaq; extern const function_base *const vfmasq; extern const function_base *const vfmsq; extern const function_base *const vhaddq; +extern const function_base *const vhcaddq_rot90; +extern const function_base *const vhcaddq_rot270; extern const function_base *const vhsubq; extern const function_base *const vmaxaq; extern const function_base *const vmaxavq; diff --git a/gcc/config/arm/arm-mve-builtins-functions.h b/gcc/config/arm/arm-mve-builtins-functions.h index 8f3bae4b7da..a6573844319 100644 --- a/gcc/config/arm/arm-mve-builtins-functions.h +++ b/gcc/config/arm/arm-mve-builtins-functions.h @@ -735,6 +735,109 @@ public: } }; +/* Map the function directly to CODE (UNSPEC, UNSPEC, UNSPEC, M) where + M is the vector mode associated with type suffix 0. Used for the + operations where there is a "rot90" or "rot270" suffix, depending + on the UNSPEC. */ +class unspec_mve_function_exact_insn_rot : public function_base +{ +public: + CONSTEXPR unspec_mve_function_exact_insn_rot (int unspec_for_sint, + int unspec_for_uint, + int unspec_for_fp, + int unspec_for_m_sint, + int unspec_for_m_uint, + int unspec_for_m_fp) + : m_unspec_for_sint (unspec_for_sint), + m_unspec_for_uint (unspec_for_uint), + m_unspec_for_fp (unspec_for_fp), + m_unspec_for_m_sint (unspec_for_m_sint), + m_unspec_for_m_uint (unspec_for_m_uint), + m_unspec_for_m_fp (unspec_for_m_fp) + {} + + /* The unspec code associated with signed-integer, unsigned-integer + and floating-point operations respectively. It covers the cases + with and without the _m predicate. 
*/ + int m_unspec_for_sint; + int m_unspec_for_uint; + int m_unspec_for_fp; + int m_unspec_for_m_sint; + int m_unspec_for_m_uint; + int m_unspec_for_m_fp; + + rtx + expand (function_expander &e) const override + { + insn_code code; + + switch (e.pred) + { + case PRED_none: + switch (e.mode_suffix_id) + { + case MODE_none: + /* No predicate, no suffix. */ + if (e.type_suffix (0).integer_p) + if (e.type_suffix (0).unsigned_p) + code = code_for_mve_q (m_unspec_for_uint, m_unspec_for_uint, m_unspec_for_uint, e.vector_mode (0)); + else + code = code_for_mve_q (m_unspec_for_sint, m_unspec_for_sint, m_unspec_for_sint, e.vector_mode (0)); + else + code = code_for_mve_q_f (m_unspec_for_fp, m_unspec_for_fp, e.vector_mode (0)); + break; + + default: + gcc_unreachable (); + } + return e.use_exact_insn (code); + + case PRED_m: + switch (e.mode_suffix_id) + { + case MODE_none: + /* No suffix, "m" predicate. */ + if (e.type_suffix (0).integer_p) + if (e.type_suffix (0).unsigned_p) + code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); + else + code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); + else + code = code_for_mve_q_m_f (m_unspec_for_m_fp, m_unspec_for_m_fp, e.vector_mode (0)); + break; + + default: + gcc_unreachable (); + } + return e.use_cond_insn (code, 0); + + case PRED_x: + switch (e.mode_suffix_id) + { + case MODE_none: + /* No suffix, "x" predicate. 
*/ + if (e.type_suffix (0).integer_p) + if (e.type_suffix (0).unsigned_p) + code = code_for_mve_q_m (m_unspec_for_m_uint, m_unspec_for_m_uint, m_unspec_for_m_uint, e.vector_mode (0)); + else + code = code_for_mve_q_m (m_unspec_for_m_sint, m_unspec_for_m_sint, m_unspec_for_m_sint, e.vector_mode (0)); + else + code = code_for_mve_q_m_f (m_unspec_for_m_fp, m_unspec_for_m_fp, e.vector_mode (0)); + break; + + default: + gcc_unreachable (); + } + return e.use_pred_x_insn (code); + + default: + gcc_unreachable (); + } + + gcc_unreachable (); + } +}; + } /* end namespace arm_mve */ /* Declare the global function base NAME, creating it from an instance diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 30485e74cdd..9297ad757ab 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -45,20 +45,12 @@ #define vornq(__a, __b) __arm_vornq(__a, __b) #define vmulltq_int(__a, __b) __arm_vmulltq_int(__a, __b) #define vmullbq_int(__a, __b) __arm_vmullbq_int(__a, __b) -#define vcaddq_rot90(__a, __b) __arm_vcaddq_rot90(__a, __b) -#define vcaddq_rot270(__a, __b) __arm_vcaddq_rot270(__a, __b) #define vbicq(__a, __b) __arm_vbicq(__a, __b) -#define vhcaddq_rot90(__a, __b) __arm_vhcaddq_rot90(__a, __b) -#define vhcaddq_rot270(__a, __b) __arm_vhcaddq_rot270(__a, __b) #define vmulltq_poly(__a, __b) __arm_vmulltq_poly(__a, __b) #define vmullbq_poly(__a, __b) __arm_vmullbq_poly(__a, __b) #define vbicq_m_n(__a, __imm, __p) __arm_vbicq_m_n(__a, __imm, __p) #define vshlcq(__a, __b, __imm) __arm_vshlcq(__a, __b, __imm) #define vbicq_m(__inactive, __a, __b, __p) __arm_vbicq_m(__inactive, __a, __b, __p) -#define vcaddq_rot270_m(__inactive, __a, __b, __p) __arm_vcaddq_rot270_m(__inactive, __a, __b, __p) -#define vcaddq_rot90_m(__inactive, __a, __b, __p) __arm_vcaddq_rot90_m(__inactive, __a, __b, __p) -#define vhcaddq_rot270_m(__inactive, __a, __b, __p) __arm_vhcaddq_rot270_m(__inactive, __a, __b, __p) -#define vhcaddq_rot90_m(__inactive, __a, __b, __p) 
__arm_vhcaddq_rot90_m(__inactive, __a, __b, __p) #define vmullbq_int_m(__inactive, __a, __b, __p) __arm_vmullbq_int_m(__inactive, __a, __b, __p) #define vmulltq_int_m(__inactive, __a, __b, __p) __arm_vmulltq_int_m(__inactive, __a, __b, __p) #define vornq_m(__inactive, __a, __b, __p) __arm_vornq_m(__inactive, __a, __b, __p) @@ -141,10 +133,6 @@ #define vmullbq_int_x(__a, __b, __p) __arm_vmullbq_int_x(__a, __b, __p) #define vmulltq_poly_x(__a, __b, __p) __arm_vmulltq_poly_x(__a, __b, __p) #define vmulltq_int_x(__a, __b, __p) __arm_vmulltq_int_x(__a, __b, __p) -#define vcaddq_rot90_x(__a, __b, __p) __arm_vcaddq_rot90_x(__a, __b, __p) -#define vcaddq_rot270_x(__a, __b, __p) __arm_vcaddq_rot270_x(__a, __b, __p) -#define vhcaddq_rot90_x(__a, __b, __p) __arm_vhcaddq_rot90_x(__a, __b, __p) -#define vhcaddq_rot270_x(__a, __b, __p) __arm_vhcaddq_rot270_x(__a, __b, __p) #define vbicq_x(__a, __b, __p) __arm_vbicq_x(__a, __b, __p) #define vornq_x(__a, __b, __p) __arm_vornq_x(__a, __b, __p) #define vadciq(__a, __b, __carry_out) __arm_vadciq(__a, __b, __carry_out) @@ -249,44 +237,26 @@ #define vornq_u8(__a, __b) __arm_vornq_u8(__a, __b) #define vmulltq_int_u8(__a, __b) __arm_vmulltq_int_u8(__a, __b) #define vmullbq_int_u8(__a, __b) __arm_vmullbq_int_u8(__a, __b) -#define vcaddq_rot90_u8(__a, __b) __arm_vcaddq_rot90_u8(__a, __b) -#define vcaddq_rot270_u8(__a, __b) __arm_vcaddq_rot270_u8(__a, __b) #define vbicq_u8(__a, __b) __arm_vbicq_u8(__a, __b) #define vornq_s8(__a, __b) __arm_vornq_s8(__a, __b) #define vmulltq_int_s8(__a, __b) __arm_vmulltq_int_s8(__a, __b) #define vmullbq_int_s8(__a, __b) __arm_vmullbq_int_s8(__a, __b) -#define vhcaddq_rot90_s8(__a, __b) __arm_vhcaddq_rot90_s8(__a, __b) -#define vhcaddq_rot270_s8(__a, __b) __arm_vhcaddq_rot270_s8(__a, __b) -#define vcaddq_rot90_s8(__a, __b) __arm_vcaddq_rot90_s8(__a, __b) -#define vcaddq_rot270_s8(__a, __b) __arm_vcaddq_rot270_s8(__a, __b) #define vbicq_s8(__a, __b) __arm_vbicq_s8(__a, __b) #define vornq_u16(__a, __b) 
__arm_vornq_u16(__a, __b) #define vmulltq_int_u16(__a, __b) __arm_vmulltq_int_u16(__a, __b) #define vmullbq_int_u16(__a, __b) __arm_vmullbq_int_u16(__a, __b) -#define vcaddq_rot90_u16(__a, __b) __arm_vcaddq_rot90_u16(__a, __b) -#define vcaddq_rot270_u16(__a, __b) __arm_vcaddq_rot270_u16(__a, __b) #define vbicq_u16(__a, __b) __arm_vbicq_u16(__a, __b) #define vornq_s16(__a, __b) __arm_vornq_s16(__a, __b) #define vmulltq_int_s16(__a, __b) __arm_vmulltq_int_s16(__a, __b) #define vmullbq_int_s16(__a, __b) __arm_vmullbq_int_s16(__a, __b) -#define vhcaddq_rot90_s16(__a, __b) __arm_vhcaddq_rot90_s16(__a, __b) -#define vhcaddq_rot270_s16(__a, __b) __arm_vhcaddq_rot270_s16(__a, __b) -#define vcaddq_rot90_s16(__a, __b) __arm_vcaddq_rot90_s16(__a, __b) -#define vcaddq_rot270_s16(__a, __b) __arm_vcaddq_rot270_s16(__a, __b) #define vbicq_s16(__a, __b) __arm_vbicq_s16(__a, __b) #define vornq_u32(__a, __b) __arm_vornq_u32(__a, __b) #define vmulltq_int_u32(__a, __b) __arm_vmulltq_int_u32(__a, __b) #define vmullbq_int_u32(__a, __b) __arm_vmullbq_int_u32(__a, __b) -#define vcaddq_rot90_u32(__a, __b) __arm_vcaddq_rot90_u32(__a, __b) -#define vcaddq_rot270_u32(__a, __b) __arm_vcaddq_rot270_u32(__a, __b) #define vbicq_u32(__a, __b) __arm_vbicq_u32(__a, __b) #define vornq_s32(__a, __b) __arm_vornq_s32(__a, __b) #define vmulltq_int_s32(__a, __b) __arm_vmulltq_int_s32(__a, __b) #define vmullbq_int_s32(__a, __b) __arm_vmullbq_int_s32(__a, __b) -#define vhcaddq_rot90_s32(__a, __b) __arm_vhcaddq_rot90_s32(__a, __b) -#define vhcaddq_rot270_s32(__a, __b) __arm_vhcaddq_rot270_s32(__a, __b) -#define vcaddq_rot90_s32(__a, __b) __arm_vcaddq_rot90_s32(__a, __b) -#define vcaddq_rot270_s32(__a, __b) __arm_vcaddq_rot270_s32(__a, __b) #define vbicq_s32(__a, __b) __arm_vbicq_s32(__a, __b) #define vmulltq_poly_p8(__a, __b) __arm_vmulltq_poly_p8(__a, __b) #define vmullbq_poly_p8(__a, __b) __arm_vmullbq_poly_p8(__a, __b) @@ -296,8 +266,6 @@ #define vcmulq_rot270_f16(__a, __b) __arm_vcmulq_rot270_f16(__a, 
__b) #define vcmulq_rot180_f16(__a, __b) __arm_vcmulq_rot180_f16(__a, __b) #define vcmulq_f16(__a, __b) __arm_vcmulq_f16(__a, __b) -#define vcaddq_rot90_f16(__a, __b) __arm_vcaddq_rot90_f16(__a, __b) -#define vcaddq_rot270_f16(__a, __b) __arm_vcaddq_rot270_f16(__a, __b) #define vbicq_f16(__a, __b) __arm_vbicq_f16(__a, __b) #define vbicq_n_s16(__a, __imm) __arm_vbicq_n_s16(__a, __imm) #define vmulltq_poly_p16(__a, __b) __arm_vmulltq_poly_p16(__a, __b) @@ -308,8 +276,6 @@ #define vcmulq_rot270_f32(__a, __b) __arm_vcmulq_rot270_f32(__a, __b) #define vcmulq_rot180_f32(__a, __b) __arm_vcmulq_rot180_f32(__a, __b) #define vcmulq_f32(__a, __b) __arm_vcmulq_f32(__a, __b) -#define vcaddq_rot90_f32(__a, __b) __arm_vcaddq_rot90_f32(__a, __b) -#define vcaddq_rot270_f32(__a, __b) __arm_vcaddq_rot270_f32(__a, __b) #define vbicq_f32(__a, __b) __arm_vbicq_f32(__a, __b) #define vbicq_n_s32(__a, __imm) __arm_vbicq_n_s32(__a, __imm) #define vctp8q_m(__a, __p) __arm_vctp8q_m(__a, __p) @@ -374,24 +340,6 @@ #define vbicq_m_u8(__inactive, __a, __b, __p) __arm_vbicq_m_u8(__inactive, __a, __b, __p) #define vbicq_m_u32(__inactive, __a, __b, __p) __arm_vbicq_m_u32(__inactive, __a, __b, __p) #define vbicq_m_u16(__inactive, __a, __b, __p) __arm_vbicq_m_u16(__inactive, __a, __b, __p) -#define vcaddq_rot270_m_s8(__inactive, __a, __b, __p) __arm_vcaddq_rot270_m_s8(__inactive, __a, __b, __p) -#define vcaddq_rot270_m_s32(__inactive, __a, __b, __p) __arm_vcaddq_rot270_m_s32(__inactive, __a, __b, __p) -#define vcaddq_rot270_m_s16(__inactive, __a, __b, __p) __arm_vcaddq_rot270_m_s16(__inactive, __a, __b, __p) -#define vcaddq_rot270_m_u8(__inactive, __a, __b, __p) __arm_vcaddq_rot270_m_u8(__inactive, __a, __b, __p) -#define vcaddq_rot270_m_u32(__inactive, __a, __b, __p) __arm_vcaddq_rot270_m_u32(__inactive, __a, __b, __p) -#define vcaddq_rot270_m_u16(__inactive, __a, __b, __p) __arm_vcaddq_rot270_m_u16(__inactive, __a, __b, __p) -#define vcaddq_rot90_m_s8(__inactive, __a, __b, __p) 
__arm_vcaddq_rot90_m_s8(__inactive, __a, __b, __p) -#define vcaddq_rot90_m_s32(__inactive, __a, __b, __p) __arm_vcaddq_rot90_m_s32(__inactive, __a, __b, __p) -#define vcaddq_rot90_m_s16(__inactive, __a, __b, __p) __arm_vcaddq_rot90_m_s16(__inactive, __a, __b, __p) -#define vcaddq_rot90_m_u8(__inactive, __a, __b, __p) __arm_vcaddq_rot90_m_u8(__inactive, __a, __b, __p) -#define vcaddq_rot90_m_u32(__inactive, __a, __b, __p) __arm_vcaddq_rot90_m_u32(__inactive, __a, __b, __p) -#define vcaddq_rot90_m_u16(__inactive, __a, __b, __p) __arm_vcaddq_rot90_m_u16(__inactive, __a, __b, __p) -#define vhcaddq_rot270_m_s8(__inactive, __a, __b, __p) __arm_vhcaddq_rot270_m_s8(__inactive, __a, __b, __p) -#define vhcaddq_rot270_m_s32(__inactive, __a, __b, __p) __arm_vhcaddq_rot270_m_s32(__inactive, __a, __b, __p) -#define vhcaddq_rot270_m_s16(__inactive, __a, __b, __p) __arm_vhcaddq_rot270_m_s16(__inactive, __a, __b, __p) -#define vhcaddq_rot90_m_s8(__inactive, __a, __b, __p) __arm_vhcaddq_rot90_m_s8(__inactive, __a, __b, __p) -#define vhcaddq_rot90_m_s32(__inactive, __a, __b, __p) __arm_vhcaddq_rot90_m_s32(__inactive, __a, __b, __p) -#define vhcaddq_rot90_m_s16(__inactive, __a, __b, __p) __arm_vhcaddq_rot90_m_s16(__inactive, __a, __b, __p) #define vmullbq_int_m_s8(__inactive, __a, __b, __p) __arm_vmullbq_int_m_s8(__inactive, __a, __b, __p) #define vmullbq_int_m_s32(__inactive, __a, __b, __p) __arm_vmullbq_int_m_s32(__inactive, __a, __b, __p) #define vmullbq_int_m_s16(__inactive, __a, __b, __p) __arm_vmullbq_int_m_s16(__inactive, __a, __b, __p) @@ -416,10 +364,6 @@ #define vmulltq_poly_m_p16(__inactive, __a, __b, __p) __arm_vmulltq_poly_m_p16(__inactive, __a, __b, __p) #define vbicq_m_f32(__inactive, __a, __b, __p) __arm_vbicq_m_f32(__inactive, __a, __b, __p) #define vbicq_m_f16(__inactive, __a, __b, __p) __arm_vbicq_m_f16(__inactive, __a, __b, __p) -#define vcaddq_rot270_m_f32(__inactive, __a, __b, __p) __arm_vcaddq_rot270_m_f32(__inactive, __a, __b, __p) -#define 
vcaddq_rot270_m_f16(__inactive, __a, __b, __p) __arm_vcaddq_rot270_m_f16(__inactive, __a, __b, __p) -#define vcaddq_rot90_m_f32(__inactive, __a, __b, __p) __arm_vcaddq_rot90_m_f32(__inactive, __a, __b, __p) -#define vcaddq_rot90_m_f16(__inactive, __a, __b, __p) __arm_vcaddq_rot90_m_f16(__inactive, __a, __b, __p) #define vcmlaq_m_f32(__a, __b, __c, __p) __arm_vcmlaq_m_f32(__a, __b, __c, __p) #define vcmlaq_m_f16(__a, __b, __c, __p) __arm_vcmlaq_m_f16(__a, __b, __c, __p) #define vcmlaq_rot180_m_f32(__a, __b, __c, __p) __arm_vcmlaq_rot180_m_f32(__a, __b, __c, __p) @@ -756,24 +700,6 @@ #define vmulltq_int_x_u8(__a, __b, __p) __arm_vmulltq_int_x_u8(__a, __b, __p) #define vmulltq_int_x_u16(__a, __b, __p) __arm_vmulltq_int_x_u16(__a, __b, __p) #define vmulltq_int_x_u32(__a, __b, __p) __arm_vmulltq_int_x_u32(__a, __b, __p) -#define vcaddq_rot90_x_s8(__a, __b, __p) __arm_vcaddq_rot90_x_s8(__a, __b, __p) -#define vcaddq_rot90_x_s16(__a, __b, __p) __arm_vcaddq_rot90_x_s16(__a, __b, __p) -#define vcaddq_rot90_x_s32(__a, __b, __p) __arm_vcaddq_rot90_x_s32(__a, __b, __p) -#define vcaddq_rot90_x_u8(__a, __b, __p) __arm_vcaddq_rot90_x_u8(__a, __b, __p) -#define vcaddq_rot90_x_u16(__a, __b, __p) __arm_vcaddq_rot90_x_u16(__a, __b, __p) -#define vcaddq_rot90_x_u32(__a, __b, __p) __arm_vcaddq_rot90_x_u32(__a, __b, __p) -#define vcaddq_rot270_x_s8(__a, __b, __p) __arm_vcaddq_rot270_x_s8(__a, __b, __p) -#define vcaddq_rot270_x_s16(__a, __b, __p) __arm_vcaddq_rot270_x_s16(__a, __b, __p) -#define vcaddq_rot270_x_s32(__a, __b, __p) __arm_vcaddq_rot270_x_s32(__a, __b, __p) -#define vcaddq_rot270_x_u8(__a, __b, __p) __arm_vcaddq_rot270_x_u8(__a, __b, __p) -#define vcaddq_rot270_x_u16(__a, __b, __p) __arm_vcaddq_rot270_x_u16(__a, __b, __p) -#define vcaddq_rot270_x_u32(__a, __b, __p) __arm_vcaddq_rot270_x_u32(__a, __b, __p) -#define vhcaddq_rot90_x_s8(__a, __b, __p) __arm_vhcaddq_rot90_x_s8(__a, __b, __p) -#define vhcaddq_rot90_x_s16(__a, __b, __p) __arm_vhcaddq_rot90_x_s16(__a, __b, __p) 
-#define vhcaddq_rot90_x_s32(__a, __b, __p) __arm_vhcaddq_rot90_x_s32(__a, __b, __p) -#define vhcaddq_rot270_x_s8(__a, __b, __p) __arm_vhcaddq_rot270_x_s8(__a, __b, __p) -#define vhcaddq_rot270_x_s16(__a, __b, __p) __arm_vhcaddq_rot270_x_s16(__a, __b, __p) -#define vhcaddq_rot270_x_s32(__a, __b, __p) __arm_vhcaddq_rot270_x_s32(__a, __b, __p) #define vbicq_x_s8(__a, __b, __p) __arm_vbicq_x_s8(__a, __b, __p) #define vbicq_x_s16(__a, __b, __p) __arm_vbicq_x_s16(__a, __b, __p) #define vbicq_x_s32(__a, __b, __p) __arm_vbicq_x_s32(__a, __b, __p) @@ -786,10 +712,6 @@ #define vornq_x_u8(__a, __b, __p) __arm_vornq_x_u8(__a, __b, __p) #define vornq_x_u16(__a, __b, __p) __arm_vornq_x_u16(__a, __b, __p) #define vornq_x_u32(__a, __b, __p) __arm_vornq_x_u32(__a, __b, __p) -#define vcaddq_rot90_x_f16(__a, __b, __p) __arm_vcaddq_rot90_x_f16(__a, __b, __p) -#define vcaddq_rot90_x_f32(__a, __b, __p) __arm_vcaddq_rot90_x_f32(__a, __b, __p) -#define vcaddq_rot270_x_f16(__a, __b, __p) __arm_vcaddq_rot270_x_f16(__a, __b, __p) -#define vcaddq_rot270_x_f32(__a, __b, __p) __arm_vcaddq_rot270_x_f32(__a, __b, __p) #define vcmulq_x_f16(__a, __b, __p) __arm_vcmulq_x_f16(__a, __b, __p) #define vcmulq_x_f32(__a, __b, __p) __arm_vcmulq_x_f32(__a, __b, __p) #define vcmulq_rot90_x_f16(__a, __b, __p) __arm_vcmulq_rot90_x_f16(__a, __b, __p) @@ -1058,22 +980,6 @@ __arm_vmullbq_int_u8 (uint8x16_t __a, uint8x16_t __b) return __builtin_mve_vmullbq_int_uv16qi (__a, __b); } -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_u8 (uint8x16_t __a, uint8x16_t __b) -{ - return (uint8x16_t) - __builtin_mve_vcaddq_rot90v16qi ((int8x16_t)__a, (int8x16_t)__b); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_u8 (uint8x16_t __a, uint8x16_t __b) -{ - return (uint8x16_t) - __builtin_mve_vcaddq_rot270v16qi ((int8x16_t)__a, (int8x16_t)__b); -} - 
__extension__ extern __inline uint8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_u8 (uint8x16_t __a, uint8x16_t __b) @@ -1102,34 +1008,6 @@ __arm_vmullbq_int_s8 (int8x16_t __a, int8x16_t __b) return __builtin_mve_vmullbq_int_sv16qi (__a, __b); } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot90_s8 (int8x16_t __a, int8x16_t __b) -{ - return __builtin_mve_vhcaddq_rot90_sv16qi (__a, __b); -} - -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot270_s8 (int8x16_t __a, int8x16_t __b) -{ - return __builtin_mve_vhcaddq_rot270_sv16qi (__a, __b); -} - -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_s8 (int8x16_t __a, int8x16_t __b) -{ - return __builtin_mve_vcaddq_rot90v16qi (__a, __b); -} - -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_s8 (int8x16_t __a, int8x16_t __b) -{ - return __builtin_mve_vcaddq_rot270v16qi (__a, __b); -} - __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_s8 (int8x16_t __a, int8x16_t __b) @@ -1158,22 +1036,6 @@ __arm_vmullbq_int_u16 (uint16x8_t __a, uint16x8_t __b) return __builtin_mve_vmullbq_int_uv8hi (__a, __b); } -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_u16 (uint16x8_t __a, uint16x8_t __b) -{ - return (uint16x8_t) - __builtin_mve_vcaddq_rot90v8hi ((int16x8_t)__a, (int16x8_t)__b); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_u16 (uint16x8_t __a, uint16x8_t __b) -{ - return (uint16x8_t) - __builtin_mve_vcaddq_rot270v8hi ((int16x8_t)__a, 
(int16x8_t)__b); -} - __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_u16 (uint16x8_t __a, uint16x8_t __b) @@ -1202,34 +1064,6 @@ __arm_vmullbq_int_s16 (int16x8_t __a, int16x8_t __b) return __builtin_mve_vmullbq_int_sv8hi (__a, __b); } -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot90_s16 (int16x8_t __a, int16x8_t __b) -{ - return __builtin_mve_vhcaddq_rot90_sv8hi (__a, __b); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot270_s16 (int16x8_t __a, int16x8_t __b) -{ - return __builtin_mve_vhcaddq_rot270_sv8hi (__a, __b); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_s16 (int16x8_t __a, int16x8_t __b) -{ - return __builtin_mve_vcaddq_rot90v8hi (__a, __b); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_s16 (int16x8_t __a, int16x8_t __b) -{ - return __builtin_mve_vcaddq_rot270v8hi (__a, __b); -} - __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_s16 (int16x8_t __a, int16x8_t __b) @@ -1258,22 +1092,6 @@ __arm_vmullbq_int_u32 (uint32x4_t __a, uint32x4_t __b) return __builtin_mve_vmullbq_int_uv4si (__a, __b); } -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_u32 (uint32x4_t __a, uint32x4_t __b) -{ - return (uint32x4_t) - __builtin_mve_vcaddq_rot90v4si ((int32x4_t)__a, (int32x4_t)__b); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_u32 (uint32x4_t __a, uint32x4_t __b) -{ - return (uint32x4_t) - __builtin_mve_vcaddq_rot270v4si 
((int32x4_t)__a, (int32x4_t)__b); -} - __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_u32 (uint32x4_t __a, uint32x4_t __b) @@ -1302,34 +1120,6 @@ __arm_vmullbq_int_s32 (int32x4_t __a, int32x4_t __b) return __builtin_mve_vmullbq_int_sv4si (__a, __b); } -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot90_s32 (int32x4_t __a, int32x4_t __b) -{ - return __builtin_mve_vhcaddq_rot90_sv4si (__a, __b); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot270_s32 (int32x4_t __a, int32x4_t __b) -{ - return __builtin_mve_vhcaddq_rot270_sv4si (__a, __b); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_s32 (int32x4_t __a, int32x4_t __b) -{ - return __builtin_mve_vcaddq_rot90v4si (__a, __b); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_s32 (int32x4_t __a, int32x4_t __b) -{ - return __builtin_mve_vcaddq_rot270v4si (__a, __b); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_s32 (int32x4_t __a, int32x4_t __b) @@ -1545,132 +1335,6 @@ __arm_vbicq_m_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pr return __builtin_mve_vbicq_m_uv8hi (__inactive, __a, __b, __p); } -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_m_s8 (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcaddq_rot270_m_sv16qi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_m_s32 
(int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcaddq_rot270_m_sv4si (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_m_s16 (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcaddq_rot270_m_sv8hi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_m_u8 (uint8x16_t __inactive, uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcaddq_rot270_m_uv16qi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcaddq_rot270_m_uv4si (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_m_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcaddq_rot270_m_uv8hi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_m_s8 (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcaddq_rot90_m_sv16qi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcaddq_rot90_m_sv4si (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, 
__artificial__)) -__arm_vcaddq_rot90_m_s16 (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcaddq_rot90_m_sv8hi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_m_u8 (uint8x16_t __inactive, uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcaddq_rot90_m_uv16qi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcaddq_rot90_m_uv4si (__inactive, __a, __b, __p); -} - -__extension__ extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_m_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcaddq_rot90_m_uv8hi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot270_m_s8 (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vhcaddq_rot270_m_sv16qi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot270_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vhcaddq_rot270_m_sv4si (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot270_m_s16 (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vhcaddq_rot270_m_sv8hi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int8x16_t 
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot90_m_s8 (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vhcaddq_rot90_m_sv16qi (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot90_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vhcaddq_rot90_m_sv4si (__inactive, __a, __b, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot90_m_s16 (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vhcaddq_rot90_m_sv8hi (__inactive, __a, __b, __p); -} - __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vmullbq_int_m_s8 (int16x8_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p) @@ -3850,281 +3514,155 @@ __arm_vmulltq_int_x_u32 (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p) __extension__ extern __inline int8x16_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p) +__arm_vbicq_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p) { - return __builtin_mve_vcaddq_rot90_m_sv16qi (__arm_vuninitializedq_s8 (), __a, __b, __p); + return __builtin_mve_vbicq_m_sv16qi (__arm_vuninitializedq_s8 (), __a, __b, __p); } __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_x_s16 (int16x8_t __a, int16x8_t __b, mve_pred16_t __p) +__arm_vbicq_x_s16 (int16x8_t __a, int16x8_t __b, mve_pred16_t __p) { - return __builtin_mve_vcaddq_rot90_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __b, __p); + return __builtin_mve_vbicq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __b, __p); } 
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_x_s32 (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
+__arm_vbicq_x_s32 (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
 {
-  return __builtin_mve_vcaddq_rot90_m_sv4si (__arm_vuninitializedq_s32 (), __a, __b, __p);
+  return __builtin_mve_vbicq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __b, __p);
 }
 
 __extension__ extern __inline uint8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_x_u8 (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
+__arm_vbicq_x_u8 (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
 {
-  return __builtin_mve_vcaddq_rot90_m_uv16qi (__arm_vuninitializedq_u8 (), __a, __b, __p);
+  return __builtin_mve_vbicq_m_uv16qi (__arm_vuninitializedq_u8 (), __a, __b, __p);
 }
 
 __extension__ extern __inline uint16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_x_u16 (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
+__arm_vbicq_x_u16 (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
 {
-  return __builtin_mve_vcaddq_rot90_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __b, __p);
+  return __builtin_mve_vbicq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __b, __p);
 }
 
 __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_x_u32 (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
+__arm_vbicq_x_u32 (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
 {
-  return __builtin_mve_vcaddq_rot90_m_uv4si (__arm_vuninitializedq_u32 (), __a, __b, __p);
+  return __builtin_mve_vbicq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __b, __p);
 }
 
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
+__arm_vornq_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
 {
-  return __builtin_mve_vcaddq_rot270_m_sv16qi (__arm_vuninitializedq_s8 (), __a, __b, __p);
+  return __builtin_mve_vornq_m_sv16qi (__arm_vuninitializedq_s8 (), __a, __b, __p);
 }
 
 __extension__ extern __inline int16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_x_s16 (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
+__arm_vornq_x_s16 (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
 {
-  return __builtin_mve_vcaddq_rot270_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __b, __p);
+  return __builtin_mve_vornq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __b, __p);
 }
 
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_x_s32 (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
+__arm_vornq_x_s32 (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
 {
-  return __builtin_mve_vcaddq_rot270_m_sv4si (__arm_vuninitializedq_s32 (), __a, __b, __p);
+  return __builtin_mve_vornq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __b, __p);
 }
 
 __extension__ extern __inline uint8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_x_u8 (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
+__arm_vornq_x_u8 (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
 {
-  return __builtin_mve_vcaddq_rot270_m_uv16qi (__arm_vuninitializedq_u8 (), __a, __b, __p);
+  return __builtin_mve_vornq_m_uv16qi (__arm_vuninitializedq_u8 (), __a, __b, __p);
 }
 
 __extension__ extern __inline uint16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_x_u16 (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
+__arm_vornq_x_u16 (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
 {
-  return __builtin_mve_vcaddq_rot270_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __b, __p);
+  return __builtin_mve_vornq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __b, __p);
 }
 
 __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_x_u32 (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
+__arm_vornq_x_u32 (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
 {
-  return __builtin_mve_vcaddq_rot270_m_uv4si (__arm_vuninitializedq_u32 (), __a, __b, __p);
+  return __builtin_mve_vornq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __b, __p);
 }
 
-__extension__ extern __inline int8x16_t
+__extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot90_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
+__arm_vadciq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry_out)
 {
-  return __builtin_mve_vhcaddq_rot90_m_sv16qi (__arm_vuninitializedq_s8 (), __a, __b, __p);
+  int32x4_t __res = __builtin_mve_vadciq_sv4si (__a, __b);
+  *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
+  return __res;
 }
 
-__extension__ extern __inline int16x8_t
+__extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot90_x_s16 (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
+__arm_vadciq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out)
 {
-  return __builtin_mve_vhcaddq_rot90_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __b, __p);
+  uint32x4_t __res = __builtin_mve_vadciq_uv4si (__a, __b);
+  *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
+  return __res;
 }
 
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot90_x_s32 (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vhcaddq_rot90_m_sv4si (__arm_vuninitializedq_s32 (), __a, __b, __p);
-}
-
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot270_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
+__arm_vadciq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry_out, mve_pred16_t __p)
 {
-  return __builtin_mve_vhcaddq_rot270_m_sv16qi (__arm_vuninitializedq_s8 (), __a, __b, __p);
+  int32x4_t __res = __builtin_mve_vadciq_m_sv4si (__inactive, __a, __b, __p);
+  *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
+  return __res;
 }
 
-__extension__ extern __inline int16x8_t
+__extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot270_x_s16 (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
+__arm_vadciq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out, mve_pred16_t __p)
 {
-  return __builtin_mve_vhcaddq_rot270_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __b, __p);
+  uint32x4_t __res = __builtin_mve_vadciq_m_uv4si (__inactive, __a, __b, __p);
+  *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
+  return __res;
 }
 
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot270_x_s32 (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
+__arm_vadcq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry)
 {
-  return __builtin_mve_vhcaddq_rot270_m_sv4si (__arm_vuninitializedq_s32 (), __a, __b, __p);
+  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & ~0x20000000u) | ((*__carry & 0x1u) << 29));
+  int32x4_t __res = __builtin_mve_vadcq_sv4si (__a, __b);
+  *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
+  return __res;
 }
 
-__extension__ extern __inline int8x16_t
+__extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
+__arm_vadcq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry)
 {
-  return __builtin_mve_vbicq_m_sv16qi (__arm_vuninitializedq_s8 (), __a, __b, __p);
+  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & ~0x20000000u) | ((*__carry & 0x1u) << 29));
+  uint32x4_t __res = __builtin_mve_vadcq_uv4si (__a, __b);
+  *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
+  return __res;
 }
 
-__extension__ extern __inline int16x8_t
+__extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x_s16 (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
+__arm_vadcq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry, mve_pred16_t __p)
 {
-  return __builtin_mve_vbicq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __b, __p);
+  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & ~0x20000000u) | ((*__carry & 0x1u) << 29));
+  int32x4_t __res = __builtin_mve_vadcq_m_sv4si (__inactive, __a, __b, __p);
+  *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
+  return __res;
 }
 
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x_s32 (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __b, __p);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x_u8 (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_uv16qi (__arm_vuninitializedq_u8 (), __a, __b, __p);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x_u16 (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __b, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_x_u32 (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vbicq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __b, __p);
-}
-
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x_s8 (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_sv16qi (__arm_vuninitializedq_s8 (), __a, __b, __p);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x_s16 (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_sv8hi (__arm_vuninitializedq_s16 (), __a, __b, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x_s32 (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_sv4si (__arm_vuninitializedq_s32 (), __a, __b, __p);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x_u8 (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_uv16qi (__arm_vuninitializedq_u8 (), __a, __b, __p);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x_u16 (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_uv8hi (__arm_vuninitializedq_u16 (), __a, __b, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vornq_x_u32 (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vornq_m_uv4si (__arm_vuninitializedq_u32 (), __a, __b, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vadciq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry_out)
-{
-  int32x4_t __res = __builtin_mve_vadciq_sv4si (__a, __b);
-  *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
-  return __res;
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vadciq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out)
-{
-  uint32x4_t __res = __builtin_mve_vadciq_uv4si (__a, __b);
-  *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
-  return __res;
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vadciq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry_out, mve_pred16_t __p)
-{
-  int32x4_t __res = __builtin_mve_vadciq_m_sv4si (__inactive, __a, __b, __p);
-  *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
-  return __res;
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vadciq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out, mve_pred16_t __p)
-{
-  uint32x4_t __res = __builtin_mve_vadciq_m_uv4si (__inactive, __a, __b, __p);
-  *__carry_out = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
-  return __res;
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vadcq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry)
-{
-  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & ~0x20000000u) | ((*__carry & 0x1u) << 29));
-  int32x4_t __res = __builtin_mve_vadcq_sv4si (__a, __b);
-  *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
-  return __res;
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vadcq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry)
-{
-  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & ~0x20000000u) | ((*__carry & 0x1u) << 29));
-  uint32x4_t __res = __builtin_mve_vadcq_uv4si (__a, __b);
-  *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
-  return __res;
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vadcq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry, mve_pred16_t __p)
-{
-  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & ~0x20000000u) | ((*__carry & 0x1u) << 29));
-  int32x4_t __res = __builtin_mve_vadcq_m_sv4si (__inactive, __a, __b, __p);
-  *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
-  return __res;
-}
-
-__extension__ extern __inline uint32x4_t
+__extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vadcq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry, mve_pred16_t __p)
 {
@@ -5051,20 +4589,6 @@ __arm_vcmulq_f16 (float16x8_t __a, float16x8_t __b)
   return __builtin_mve_vcmulqv8hf (__a, __b);
 }
 
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_f16 (float16x8_t __a, float16x8_t __b)
-{
-  return __builtin_mve_vcaddq_rot90v8hf (__a, __b);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_f16 (float16x8_t __a, float16x8_t __b)
-{
-  return __builtin_mve_vcaddq_rot270v8hf (__a, __b);
-}
-
 __extension__ extern __inline float16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vbicq_f16 (float16x8_t __a, float16x8_t __b)
@@ -5107,20 +4631,6 @@ __arm_vcmulq_f32 (float32x4_t __a, float32x4_t __b)
   return __builtin_mve_vcmulqv4sf (__a, __b);
 }
 
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_f32 (float32x4_t __a, float32x4_t __b)
-{
-  return __builtin_mve_vcaddq_rot90v4sf (__a, __b);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_f32 (float32x4_t __a, float32x4_t __b)
-{
-  return __builtin_mve_vcaddq_rot270v4sf (__a, __b);
-}
-
 __extension__ extern __inline float32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vbicq_f32 (float32x4_t __a, float32x4_t __b)
@@ -5437,34 +4947,6 @@ __arm_vbicq_m_f16 (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve
   return __builtin_mve_vbicq_m_fv8hf (__inactive, __a, __b, __p);
 }
 
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vcaddq_rot270_m_fv4sf (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_m_f16 (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vcaddq_rot270_m_fv8hf (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vcaddq_rot90_m_fv4sf (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_m_f16 (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vcaddq_rot90_m_fv8hf (__inactive, __a, __b, __p);
-}
-
 __extension__ extern __inline float32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vcmlaq_m_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c, mve_pred16_t __p)
@@ -5877,34 +5359,6 @@ __arm_vstrwq_scatter_base_wb_p_f32 (uint32x4_t * __addr, const int __offset, flo
   *__addr = __builtin_mve_vstrwq_scatter_base_wb_p_fv4sf (*__addr, __offset, __value, __p);
 }
 
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vcaddq_rot90_m_fv8hf (__arm_vuninitializedq_f16 (), __a, __b, __p);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_x_f32 (float32x4_t __a, float32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vcaddq_rot90_m_fv4sf (__arm_vuninitializedq_f32 (), __a, __b, __p);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vcaddq_rot270_m_fv8hf (__arm_vuninitializedq_f16 (), __a, __b, __p);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_x_f32 (float32x4_t __a, float32x4_t __b, mve_pred16_t __p)
-{
-  return __builtin_mve_vcaddq_rot270_m_fv4sf (__arm_vuninitializedq_f32 (), __a, __b, __p);
-}
-
 __extension__ extern __inline float16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vcmulq_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p)
@@ -6408,20 +5862,6 @@ __arm_vmullbq_int (uint8x16_t __a, uint8x16_t __b)
   return __arm_vmullbq_int_u8 (__a, __b);
 }
 
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90 (uint8x16_t __a, uint8x16_t __b)
-{
-  return __arm_vcaddq_rot90_u8 (__a, __b);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270 (uint8x16_t __a, uint8x16_t __b)
-{
-  return __arm_vcaddq_rot270_u8 (__a, __b);
-}
-
 __extension__ extern __inline uint8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vbicq (uint8x16_t __a, uint8x16_t __b)
@@ -6450,34 +5890,6 @@ __arm_vmullbq_int (int8x16_t __a, int8x16_t __b)
   return __arm_vmullbq_int_s8 (__a, __b);
 }
 
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot90 (int8x16_t __a, int8x16_t __b)
-{
-  return __arm_vhcaddq_rot90_s8 (__a, __b);
-}
-
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot270 (int8x16_t __a, int8x16_t __b)
-{
-  return __arm_vhcaddq_rot270_s8 (__a, __b);
-}
-
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90 (int8x16_t __a, int8x16_t __b)
-{
-  return __arm_vcaddq_rot90_s8 (__a, __b);
-}
-
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270 (int8x16_t __a, int8x16_t __b)
-{
-  return __arm_vcaddq_rot270_s8 (__a, __b);
-}
-
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vbicq (int8x16_t __a, int8x16_t __b)
@@ -6506,20 +5918,6 @@ __arm_vmullbq_int (uint16x8_t __a, uint16x8_t __b)
   return __arm_vmullbq_int_u16 (__a, __b);
 }
 
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90 (uint16x8_t __a, uint16x8_t __b)
-{
-  return __arm_vcaddq_rot90_u16 (__a, __b);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270 (uint16x8_t __a, uint16x8_t __b)
-{
-  return __arm_vcaddq_rot270_u16 (__a, __b);
-}
-
 __extension__ extern __inline uint16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vbicq (uint16x8_t __a, uint16x8_t __b)
@@ -6548,34 +5946,6 @@ __arm_vmullbq_int (int16x8_t __a, int16x8_t __b)
   return __arm_vmullbq_int_s16 (__a, __b);
 }
 
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot90 (int16x8_t __a, int16x8_t __b)
-{
-  return __arm_vhcaddq_rot90_s16 (__a, __b);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot270 (int16x8_t __a, int16x8_t __b)
-{
-  return __arm_vhcaddq_rot270_s16 (__a, __b);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90 (int16x8_t __a, int16x8_t __b)
-{
-  return __arm_vcaddq_rot90_s16 (__a, __b);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270 (int16x8_t __a, int16x8_t __b)
-{
-  return __arm_vcaddq_rot270_s16 (__a, __b);
-}
-
 __extension__ extern __inline int16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vbicq (int16x8_t __a, int16x8_t __b)
@@ -6604,20 +5974,6 @@ __arm_vmullbq_int (uint32x4_t __a, uint32x4_t __b)
   return __arm_vmullbq_int_u32 (__a, __b);
 }
 
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90 (uint32x4_t __a, uint32x4_t __b)
-{
-  return __arm_vcaddq_rot90_u32 (__a, __b);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270 (uint32x4_t __a, uint32x4_t __b)
-{
-  return __arm_vcaddq_rot270_u32 (__a, __b);
-}
-
 __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vbicq (uint32x4_t __a, uint32x4_t __b)
@@ -6646,34 +6002,6 @@ __arm_vmullbq_int (int32x4_t __a, int32x4_t __b)
   return __arm_vmullbq_int_s32 (__a, __b);
 }
 
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot90 (int32x4_t __a, int32x4_t __b)
-{
-  return __arm_vhcaddq_rot90_s32 (__a, __b);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot270 (int32x4_t __a, int32x4_t __b)
-{
-  return __arm_vhcaddq_rot270_s32 (__a, __b);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90 (int32x4_t __a, int32x4_t __b)
-{
-  return __arm_vcaddq_rot90_s32 (__a, __b);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270 (int32x4_t __a, int32x4_t __b)
-{
-  return __arm_vcaddq_rot270_s32 (__a, __b);
-}
-
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vbicq (int32x4_t __a, int32x4_t __b)
@@ -6751,228 +6079,102 @@ __arm_vbicq_m_n (int32x4_t __a, const int __imm, mve_pred16_t __p)
   return __arm_vbicq_m_n_s32 (__a, __imm, __p);
 }
 
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_n (uint16x8_t __a, const int __imm, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_n_u16 (__a, __imm, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m_n (uint32x4_t __a, const int __imm, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_n_u32 (__a, __imm, __p);
-}
-
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vshlcq (int8x16_t __a, uint32_t * __b, const int __imm)
-{
-  return __arm_vshlcq_s8 (__a, __b, __imm);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vshlcq (uint8x16_t __a, uint32_t * __b, const int __imm)
-{
-  return __arm_vshlcq_u8 (__a, __b, __imm);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vshlcq (int16x8_t __a, uint32_t * __b, const int __imm)
-{
-  return __arm_vshlcq_s16 (__a, __b, __imm);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vshlcq (uint16x8_t __a, uint32_t * __b, const int __imm)
-{
-  return __arm_vshlcq_u16 (__a, __b, __imm);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vshlcq (int32x4_t __a, uint32_t * __b, const int __imm)
-{
-  return __arm_vshlcq_s32 (__a, __b, __imm);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vshlcq (uint32x4_t __a, uint32_t * __b, const int __imm)
-{
-  return __arm_vshlcq_u32 (__a, __b, __imm);
-}
-
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_s8 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_s32 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_s16 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m (uint8x16_t __inactive, uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_u8 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_u32 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vbicq_m (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vbicq_m_u16 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vcaddq_rot270_m_s8 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vcaddq_rot270_m_s32 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_m (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vcaddq_rot270_m_s16 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_m (uint8x16_t __inactive, uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vcaddq_rot270_m_u8 (__inactive, __a, __b, __p);
-}
-
-__extension__ extern __inline uint32x4_t
+__extension__ extern __inline uint16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_m (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
+__arm_vbicq_m_n (uint16x8_t __a, const int __imm, mve_pred16_t __p)
 {
-  return __arm_vcaddq_rot270_m_u32 (__inactive, __a, __b, __p);
+  return __arm_vbicq_m_n_u16 (__a, __imm, __p);
 }
 
-__extension__ extern __inline uint16x8_t
+__extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_m (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
+__arm_vbicq_m_n (uint32x4_t __a, const int __imm, mve_pred16_t __p)
 {
-  return __arm_vcaddq_rot270_m_u16 (__inactive, __a, __b, __p);
+  return __arm_vbicq_m_n_u32 (__a, __imm, __p);
 }
 
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
+__arm_vshlcq (int8x16_t __a, uint32_t * __b, const int __imm)
 {
-  return __arm_vcaddq_rot90_m_s8 (__inactive, __a, __b, __p);
+  return __arm_vshlcq_s8 (__a, __b, __imm);
 }
 
-__extension__ extern __inline int32x4_t
+__extension__ extern __inline uint8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
+__arm_vshlcq (uint8x16_t __a, uint32_t * __b, const int __imm)
 {
-  return __arm_vcaddq_rot90_m_s32 (__inactive, __a, __b, __p);
+  return __arm_vshlcq_u8 (__a, __b, __imm);
 }
 
 __extension__ extern __inline int16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_m (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
+__arm_vshlcq (int16x8_t __a, uint32_t * __b, const int __imm)
 {
-  return __arm_vcaddq_rot90_m_s16 (__inactive, __a, __b, __p);
+  return __arm_vshlcq_s16 (__a, __b, __imm);
 }
 
-__extension__ extern __inline uint8x16_t
+__extension__ extern __inline uint16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_m (uint8x16_t __inactive, uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
+__arm_vshlcq (uint16x8_t __a, uint32_t * __b, const int __imm)
 {
-  return __arm_vcaddq_rot90_m_u8 (__inactive, __a, __b, __p);
+  return __arm_vshlcq_u16 (__a, __b, __imm);
 }
 
-__extension__ extern __inline uint32x4_t
+__extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_m (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
+__arm_vshlcq (int32x4_t __a, uint32_t * __b, const int __imm)
 {
-  return __arm_vcaddq_rot90_m_u32 (__inactive, __a, __b, __p);
+  return __arm_vshlcq_s32 (__a, __b, __imm);
 }
 
-__extension__ extern __inline uint16x8_t
+__extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_m (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
+__arm_vshlcq (uint32x4_t __a, uint32_t * __b, const int __imm)
 {
-  return __arm_vcaddq_rot90_m_u16 (__inactive, __a, __b, __p);
+  return __arm_vshlcq_u32 (__a, __b, __imm);
 }
 
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot270_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
+__arm_vbicq_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
 {
-  return __arm_vhcaddq_rot270_m_s8 (__inactive, __a, __b, __p);
+  return __arm_vbicq_m_s8 (__inactive, __a, __b, __p);
 }
 
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot270_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
+__arm_vbicq_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
 {
-  return __arm_vhcaddq_rot270_m_s32 (__inactive, __a, __b, __p);
+  return __arm_vbicq_m_s32 (__inactive, __a, __b, __p);
 }
 
 __extension__ extern __inline int16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot270_m (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
+__arm_vbicq_m (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
 {
-  return __arm_vhcaddq_rot270_m_s16 (__inactive, __a, __b, __p);
+  return __arm_vbicq_m_s16 (__inactive, __a, __b, __p);
 }
 
-__extension__ extern __inline int8x16_t
+__extension__ extern __inline uint8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot90_m (int8x16_t __inactive, int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
+__arm_vbicq_m (uint8x16_t __inactive, uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
 {
-  return __arm_vhcaddq_rot90_m_s8 (__inactive, __a, __b, __p);
+  return __arm_vbicq_m_u8 (__inactive, __a, __b, __p);
 }
 
-__extension__ extern __inline int32x4_t
+__extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot90_m (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
+__arm_vbicq_m (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
 {
-  return __arm_vhcaddq_rot90_m_s32 (__inactive, __a, __b, __p);
+  return __arm_vbicq_m_u32 (__inactive, __a, __b, __p);
 }
 
-__extension__ extern __inline int16x8_t
+__extension__ extern __inline uint16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vhcaddq_rot90_m (int16x8_t __inactive, int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
+__arm_vbicq_m (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
 {
-  return __arm_vhcaddq_rot90_m_s16 (__inactive, __a, __b, __p);
+  return __arm_vbicq_m_u16 (__inactive, __a, __b, __p);
 }
 
 __extension__ extern __inline int16x8_t
@@ -8725,132 +7927,6 @@ __arm_vmulltq_int_x (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
   return __arm_vmulltq_int_x_u32 (__a, __b, __p);
 }
 
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_x (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vcaddq_rot90_x_s8 (__a, __b, __p);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_x (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vcaddq_rot90_x_s16 (__a, __b, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_x (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vcaddq_rot90_x_s32 (__a, __b, __p);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_x (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vcaddq_rot90_x_u8 (__a, __b, __p);
-}
-
-__extension__ extern __inline uint16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_x (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vcaddq_rot90_x_u16 (__a, __b, __p);
-}
-
-__extension__ extern __inline uint32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot90_x (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vcaddq_rot90_x_u32 (__a, __b, __p);
-}
-
-__extension__ extern __inline int8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_x (int8x16_t __a, int8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vcaddq_rot270_x_s8 (__a, __b, __p);
-}
-
-__extension__ extern __inline int16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_x (int16x8_t __a, int16x8_t __b, mve_pred16_t __p)
-{
-  return __arm_vcaddq_rot270_x_s16 (__a, __b, __p);
-}
-
-__extension__ extern __inline int32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_x (int32x4_t __a, int32x4_t __b, mve_pred16_t __p)
-{
-  return __arm_vcaddq_rot270_x_s32 (__a, __b, __p);
-}
-
-__extension__ extern __inline uint8x16_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcaddq_rot270_x (uint8x16_t __a, uint8x16_t __b, mve_pred16_t __p)
-{
-  return __arm_vcaddq_rot270_x_u8 (__a, __b, __p);
-}
-
-__extension__
extern __inline uint16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_x (uint16x8_t __a, uint16x8_t __b, mve_pred16_t __p) -{ - return __arm_vcaddq_rot270_x_u16 (__a, __b, __p); -} - -__extension__ extern __inline uint32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_x (uint32x4_t __a, uint32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcaddq_rot270_x_u32 (__a, __b, __p); -} - -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot90_x (int8x16_t __a, int8x16_t __b, mve_pred16_t __p) -{ - return __arm_vhcaddq_rot90_x_s8 (__a, __b, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot90_x (int16x8_t __a, int16x8_t __b, mve_pred16_t __p) -{ - return __arm_vhcaddq_rot90_x_s16 (__a, __b, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot90_x (int32x4_t __a, int32x4_t __b, mve_pred16_t __p) -{ - return __arm_vhcaddq_rot90_x_s32 (__a, __b, __p); -} - -__extension__ extern __inline int8x16_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot270_x (int8x16_t __a, int8x16_t __b, mve_pred16_t __p) -{ - return __arm_vhcaddq_rot270_x_s8 (__a, __b, __p); -} - -__extension__ extern __inline int16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot270_x (int16x8_t __a, int16x8_t __b, mve_pred16_t __p) -{ - return __arm_vhcaddq_rot270_x_s16 (__a, __b, __p); -} - -__extension__ extern __inline int32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vhcaddq_rot270_x (int32x4_t __a, int32x4_t __b, mve_pred16_t __p) -{ - return __arm_vhcaddq_rot270_x_s32 (__a, __b, __p); -} - __extension__ extern __inline int8x16_t __attribute__ 
((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_x (int8x16_t __a, int8x16_t __b, mve_pred16_t __p) @@ -9532,20 +8608,6 @@ __arm_vcmulq (float16x8_t __a, float16x8_t __b) return __arm_vcmulq_f16 (__a, __b); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90 (float16x8_t __a, float16x8_t __b) -{ - return __arm_vcaddq_rot90_f16 (__a, __b); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270 (float16x8_t __a, float16x8_t __b) -{ - return __arm_vcaddq_rot270_f16 (__a, __b); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq (float16x8_t __a, float16x8_t __b) @@ -9588,20 +8650,6 @@ __arm_vcmulq (float32x4_t __a, float32x4_t __b) return __arm_vcmulq_f32 (__a, __b); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90 (float32x4_t __a, float32x4_t __b) -{ - return __arm_vcaddq_rot90_f32 (__a, __b); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270 (float32x4_t __a, float32x4_t __b) -{ - return __arm_vcaddq_rot270_f32 (__a, __b); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq (float32x4_t __a, float32x4_t __b) @@ -9903,34 +8951,6 @@ __arm_vbicq_m (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pre return __arm_vbicq_m_f16 (__inactive, __a, __b, __p); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcaddq_rot270_m_f32 (__inactive, __a, __b, __p); -} - -__extension__ extern 
__inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_m (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vcaddq_rot270_m_f16 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcaddq_rot90_m_f32 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_m (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vcaddq_rot90_m_f16 (__inactive, __a, __b, __p); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmlaq_m (float32x4_t __a, float32x4_t __b, float32x4_t __c, mve_pred16_t __p) @@ -10281,34 +9301,6 @@ __arm_vstrwq_scatter_base_wb_p (uint32x4_t * __addr, const int __offset, float32 __arm_vstrwq_scatter_base_wb_p_f32 (__addr, __offset, __value, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_x (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vcaddq_rot90_x_f16 (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot90_x (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcaddq_rot90_x_f32 (__a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_x (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vcaddq_rot270_x_f16 (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ 
((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcaddq_rot270_x (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcaddq_rot270_x_f32 (__a, __b, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcmulq_x (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) @@ -10920,30 +9912,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vornq_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \ int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vornq_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)));}) -#define __arm_vcaddq_rot270(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vcaddq_rot270_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcaddq_rot270_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcaddq_rot270_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vcaddq_rot270_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t)), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcaddq_rot270_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcaddq_rot270_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcaddq_rot270_f16 (__ARM_mve_coerce(__p0, 
float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcaddq_rot270_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)));}) - -#define __arm_vcaddq_rot90(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vcaddq_rot90_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcaddq_rot90_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcaddq_rot90_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vcaddq_rot90_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t)), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcaddq_rot90_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcaddq_rot90_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcaddq_rot90_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcaddq_rot90_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)));}) - #define __arm_vcmulq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -10990,20 +9958,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vmulltq_int_u16 
(__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \ int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vmulltq_int_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)));}) -#define __arm_vhcaddq_rot270(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vhcaddq_rot270_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vhcaddq_rot270_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vhcaddq_rot270_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)));}) - -#define __arm_vhcaddq_rot90(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vhcaddq_rot90_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vhcaddq_rot90_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vhcaddq_rot90_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)));}) - #define __arm_vmullbq_int(p0,p1) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -11139,32 +10093,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vbicq_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ 
int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vbicq_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) -#define __arm_vcaddq_rot270_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vcaddq_rot270_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcaddq_rot270_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcaddq_rot270_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vcaddq_rot270_m_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcaddq_rot270_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcaddq_rot270_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcaddq_rot270_m_f16 (__ARM_mve_coerce(__p0, 
float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcaddq_rot270_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - -#define __arm_vcaddq_rot90_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vcaddq_rot90_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcaddq_rot90_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcaddq_rot90_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vcaddq_rot90_m_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcaddq_rot90_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcaddq_rot90_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3), \ - int 
(*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcaddq_rot90_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcaddq_rot90_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - #define __arm_vcmlaq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ @@ -11558,30 +10486,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vbicq_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vbicq_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) -#define __arm_vcaddq_rot270_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vcaddq_rot270_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcaddq_rot270_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcaddq_rot270_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vcaddq_rot270_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcaddq_rot270_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - 
int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcaddq_rot270_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcaddq_rot270_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcaddq_rot270_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - -#define __arm_vcaddq_rot90_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vcaddq_rot90_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcaddq_rot90_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcaddq_rot90_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vcaddq_rot90_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcaddq_rot90_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcaddq_rot90_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3), \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcaddq_rot90_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcaddq_rot90_x_f32 (__ARM_mve_coerce(__p1, 
float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - #define __arm_vcmulq_rot180_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ @@ -11710,40 +10614,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vmullbq_int_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \ int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vmullbq_int_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)));}) -#define __arm_vhcaddq_rot90(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vhcaddq_rot90_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vhcaddq_rot90_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vhcaddq_rot90_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)));}) - -#define __arm_vhcaddq_rot270(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vhcaddq_rot270_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vhcaddq_rot270_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vhcaddq_rot270_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)));}) - -#define __arm_vcaddq_rot90(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - 
__typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vcaddq_rot90_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcaddq_rot90_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcaddq_rot90_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vcaddq_rot90_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t)), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcaddq_rot90_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcaddq_rot90_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)));}) - -#define __arm_vcaddq_rot270(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vcaddq_rot270_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcaddq_rot270_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcaddq_rot270_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vcaddq_rot270_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t)), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcaddq_rot270_u16 
(__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t)), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcaddq_rot270_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t)));}) - #define __arm_vbicq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -11797,28 +10667,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vbicq_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vbicq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));}) -#define __arm_vcaddq_rot270_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vcaddq_rot270_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcaddq_rot270_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcaddq_rot270_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vcaddq_rot270_m_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, 
uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcaddq_rot270_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcaddq_rot270_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));}) - -#define __arm_vcaddq_rot90_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vcaddq_rot90_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcaddq_rot90_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcaddq_rot90_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vcaddq_rot90_m_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcaddq_rot90_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: 
__arm_vcaddq_rot90_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));}) - #define __arm_vornq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ @@ -12067,26 +10915,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint32x4_t]: __arm_vuninitializedq_u32 (), \ int (*)[__ARM_mve_type_uint64x2_t]: __arm_vuninitializedq_u64 ());}) -#define __arm_vcaddq_rot270_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vcaddq_rot270_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcaddq_rot270_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcaddq_rot270_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vcaddq_rot270_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcaddq_rot270_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcaddq_rot270_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));}) - -#define __arm_vcaddq_rot90_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vcaddq_rot90_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int 
(*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vcaddq_rot90_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vcaddq_rot90_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3), \ - int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: __arm_vcaddq_rot90_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, uint8x16_t), p3), \ - int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcaddq_rot90_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, uint16x8_t), p3), \ - int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcaddq_rot90_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3));}) - #define __arm_vmullbq_int_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ @@ -12251,20 +11079,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_int_n]: __arm_vddupq_x_n_u32 ((uint32_t) __p1, p2, p3), \ int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vddupq_x_wb_u32 (__ARM_mve_coerce_u32_ptr(__p1, uint32_t *), p2, p3));}) -#define __arm_vhcaddq_rot270_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vhcaddq_rot270_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vhcaddq_rot270_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vhcaddq_rot270_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3));}) - -#define __arm_vhcaddq_rot90_x(p1,p2,p3) ({ __typeof(p1) 
__p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vhcaddq_rot90_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vhcaddq_rot90_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vhcaddq_rot90_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3));}) - #define __arm_vadciq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -12358,22 +11172,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint8_t_ptr][__ARM_mve_type_uint16x8_t]: __arm_vldrbq_gather_offset_z_u16 (__ARM_mve_coerce_u8_ptr(p0, uint8_t *), __ARM_mve_coerce(__p1, uint16x8_t), p2), \ int (*)[__ARM_mve_type_uint8_t_ptr][__ARM_mve_type_uint32x4_t]: __arm_vldrbq_gather_offset_z_u32 (__ARM_mve_coerce_u8_ptr(p0, uint8_t *), __ARM_mve_coerce(__p1, uint32x4_t), p2));}) -#define __arm_vhcaddq_rot270_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vhcaddq_rot270_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vhcaddq_rot270_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: 
__arm_vhcaddq_rot270_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3));}) - -#define __arm_vhcaddq_rot90_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vhcaddq_rot90_m_s8 (__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, int8x16_t), p3), \ - int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vhcaddq_rot90_m_s16 (__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, int16x8_t), p3), \ - int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vhcaddq_rot90_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3));}) - #define __arm_vmullbq_int_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \

From patchwork Thu Jul 13 10:22:19 2023
To: gcc-patches@gcc.gnu.org, Kyrylo.Tkachov@arm.com, richard.earnshaw@arm.com, richard.sandiford@arm.com
Cc: Christophe Lyon
Subject: [PATCH 3/6] arm: [MVE intrinsics] factorize vcmulq
Date: Thu, 13 Jul 2023 10:22:21 +0000
Message-Id: <20230713102224.1161596-3-christophe.lyon@linaro.org>
In-Reply-To: <20230713102224.1161596-1-christophe.lyon@linaro.org>
References: <20230713102224.1161596-1-christophe.lyon@linaro.org>
From: Christophe Lyon

Factorize the vcmulq builtins so that they use parameterized names. We can merge them with vcadd.

2023-07-13  Christophe Lyon

gcc/:
	* config/arm/arm_mve_builtins.def (vcmulq_rot90_f)
	(vcmulq_rot270_f, vcmulq_rot180_f, vcmulq_f): Add "_f" suffix.
	* config/arm/iterators.md (MVE_VCADDQ_VCMULQ)
	(MVE_VCADDQ_VCMULQ_M): New.
	(mve_insn): Add vcmul.
	(rot): Add VCMULQ_M_F, VCMULQ_ROT90_M_F, VCMULQ_ROT180_M_F,
	VCMULQ_ROT270_M_F.
	(VCMUL): Delete.
	(mve_rot): Add VCMULQ_M_F, VCMULQ_ROT90_M_F, VCMULQ_ROT180_M_F,
	VCMULQ_ROT270_M_F.
	* config/arm/mve.md (mve_vcmulq): Merge into @mve_q_f.
	(mve_vcmulq_m_f, mve_vcmulq_rot180_m_f)
	(mve_vcmulq_rot270_m_f, mve_vcmulq_rot90_m_f): Merge into
	@mve_q_m_f.
--- gcc/config/arm/arm_mve_builtins.def | 8 +-- gcc/config/arm/iterators.md | 27 +++++++-- gcc/config/arm/mve.md | 92 +++-------------------------- 3 files changed, 33 insertions(+), 94 deletions(-) diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def index 63ad1845593..56358c0bd02 100644 --- a/gcc/config/arm/arm_mve_builtins.def +++ b/gcc/config/arm/arm_mve_builtins.def @@ -191,6 +191,10 @@ VAR3 (BINOP_NONE_NONE_NONE, vcaddq_rot90_, v16qi, v8hi, v4si) VAR3 (BINOP_NONE_NONE_NONE, vcaddq_rot270_, v16qi, v8hi, v4si) VAR2 (BINOP_NONE_NONE_NONE, vcaddq_rot90_f, v8hf, v4sf) VAR2 (BINOP_NONE_NONE_NONE, vcaddq_rot270_f, v8hf, v4sf) +VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot90_f, v8hf, v4sf) +VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot270_f, v8hf, v4sf) +VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot180_f, v8hf, v4sf) +VAR2 (BINOP_NONE_NONE_NONE, vcmulq_f, v8hf, v4sf) VAR3 (BINOP_NONE_NONE_NONE, vhcaddq_rot90_s, v16qi, v8hi, v4si) VAR3 (BINOP_NONE_NONE_NONE, vhcaddq_rot270_s, v16qi, v8hi, v4si) VAR3 (BINOP_NONE_NONE_NONE, vhaddq_s, v16qi, v8hi, v4si) @@ -874,10 +878,6 @@ VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_vec_u, v16qi, v8hi, v4si) VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_carry_u, v16qi, v8hi, v4si) /* optabs without any suffixes. 
*/ -VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot90, v8hf, v4sf) -VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot270, v8hf, v4sf) -VAR2 (BINOP_NONE_NONE_NONE, vcmulq_rot180, v8hf, v4sf) -VAR2 (BINOP_NONE_NONE_NONE, vcmulq, v8hf, v4sf) VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot90, v8hf, v4sf) VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot270, v8hf, v4sf) VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot180, v8hf, v4sf) diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index da1ead34e58..9f71404e26c 100644 --- a/gcc/config/arm/iterators.md +++ b/gcc/config/arm/iterators.md @@ -901,8 +901,19 @@ VPSELQ_F ]) +(define_int_iterator MVE_VCADDQ_VCMULQ [ + UNSPEC_VCADD90 UNSPEC_VCADD270 + UNSPEC_VCMUL UNSPEC_VCMUL90 UNSPEC_VCMUL180 UNSPEC_VCMUL270 + ]) + +(define_int_iterator MVE_VCADDQ_VCMULQ_M [ + VCADDQ_ROT90_M_F VCADDQ_ROT270_M_F + VCMULQ_M_F VCMULQ_ROT90_M_F VCMULQ_ROT180_M_F VCMULQ_ROT270_M_F + ]) + (define_int_attr mve_insn [ (UNSPEC_VCADD90 "vcadd") (UNSPEC_VCADD270 "vcadd") + (UNSPEC_VCMUL "vcmul") (UNSPEC_VCMUL90 "vcmul") (UNSPEC_VCMUL180 "vcmul") (UNSPEC_VCMUL270 "vcmul") (VABAVQ_P_S "vabav") (VABAVQ_P_U "vabav") (VABAVQ_S "vabav") (VABAVQ_U "vabav") (VABDQ_M_S "vabd") (VABDQ_M_U "vabd") (VABDQ_M_F "vabd") @@ -931,6 +942,7 @@ (VCLSQ_M_S "vcls") (VCLSQ_S "vcls") (VCLZQ_M_S "vclz") (VCLZQ_M_U "vclz") + (VCMULQ_M_F "vcmul") (VCMULQ_ROT90_M_F "vcmul") (VCMULQ_ROT180_M_F "vcmul") (VCMULQ_ROT270_M_F "vcmul") (VCREATEQ_S "vcreate") (VCREATEQ_U "vcreate") (VCREATEQ_F "vcreate") (VDUPQ_M_N_S "vdup") (VDUPQ_M_N_U "vdup") (VDUPQ_M_N_F "vdup") (VDUPQ_N_S "vdup") (VDUPQ_N_U "vdup") (VDUPQ_N_F "vdup") @@ -2182,7 +2194,11 @@ (UNSPEC_VCMLA "0") (UNSPEC_VCMLA90 "90") (UNSPEC_VCMLA180 "180") - (UNSPEC_VCMLA270 "270")]) + (UNSPEC_VCMLA270 "270") + (VCMULQ_M_F "0") + (VCMULQ_ROT90_M_F "90") + (VCMULQ_ROT180_M_F "180") + (VCMULQ_ROT270_M_F "270")]) ;; The complex operations when performed on a real complex number require two ;; instructions to perform the operation. e.g. 
complex multiplication requires @@ -2230,10 +2246,11 @@ (UNSPEC_VCMUL "") (UNSPEC_VCMUL90 "_rot90") (UNSPEC_VCMUL180 "_rot180") - (UNSPEC_VCMUL270 "_rot270")]) - -(define_int_iterator VCMUL [UNSPEC_VCMUL UNSPEC_VCMUL90 - UNSPEC_VCMUL180 UNSPEC_VCMUL270]) + (UNSPEC_VCMUL270 "_rot270") + (VCMULQ_M_F "") + (VCMULQ_ROT90_M_F "_rot90") + (VCMULQ_ROT180_M_F "_rot180") + (VCMULQ_ROT270_M_F "_rot270")]) (define_int_attr fcmac1 [(UNSPEC_VCMLA "a") (UNSPEC_VCMLA_CONJ "a") (UNSPEC_VCMLA180 "s") (UNSPEC_VCMLA180_CONJ "s")]) diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index a6db6d1b81d..0b99bf017dc 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -1212,13 +1212,14 @@ ;; ;; [vcaddq_rot90_f, vcaddq_rot270_f] +;; [vcmulq, vcmulq_rot90, vcmulq_rot180, vcmulq_rot270] ;; (define_insn "@mve_q_f" [ (set (match_operand:MVE_0 0 "s_register_operand" "") (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w") (match_operand:MVE_0 2 "s_register_operand" "w")] - VCADD)) + MVE_VCADDQ_VCMULQ)) ] "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" ".f%#\t%q0, %q1, %q2, #" @@ -1254,21 +1255,6 @@ [(set_attr "type" "mve_move") ]) -;; -;; [vcmulq, vcmulq_rot90, vcmulq_rot180, vcmulq_rot270]) -;; -(define_insn "mve_vcmulq" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "") - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "w") - (match_operand:MVE_0 2 "s_register_operand" "w")] - VCMUL)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vcmul.f%# %q0, %q1, %q2, #" - [(set_attr "type" "mve_move") -]) - ;; ;; [vctp8q_m vctp16q_m vctp32q_m vctp64q_m]) ;; @@ -3174,6 +3160,10 @@ ;; ;; [vcaddq_rot90_m_f] ;; [vcaddq_rot270_m_f] +;; [vcmulq_m_f] +;; [vcmulq_rot90_m_f] +;; [vcmulq_rot180_m_f] +;; [vcmulq_rot270_m_f] ;; (define_insn "@mve_q_m_f" [ @@ -3182,7 +3172,7 @@ (match_operand:MVE_0 2 "s_register_operand" "w") (match_operand:MVE_0 3 "s_register_operand" "w") (match_operand: 4 "vpr_register_operand" "Up")] - VCADDQ_M_F)) + MVE_VCADDQ_VCMULQ_M)) ] 
"TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" "vpst\;t.f%#\t%q0, %q2, %q3, #" @@ -3257,74 +3247,6 @@ [(set_attr "type" "mve_move") (set_attr "length""8")]) -;; -;; [vcmulq_m_f]) -;; -(define_insn "mve_vcmulq_m_f" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "") - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0") - (match_operand:MVE_0 2 "s_register_operand" "w") - (match_operand:MVE_0 3 "s_register_operand" "w") - (match_operand: 4 "vpr_register_operand" "Up")] - VCMULQ_M_F)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcmult.f%# %q0, %q2, %q3, #0" - [(set_attr "type" "mve_move") - (set_attr "length""8")]) - -;; -;; [vcmulq_rot180_m_f]) -;; -(define_insn "mve_vcmulq_rot180_m_f" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "") - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0") - (match_operand:MVE_0 2 "s_register_operand" "w") - (match_operand:MVE_0 3 "s_register_operand" "w") - (match_operand: 4 "vpr_register_operand" "Up")] - VCMULQ_ROT180_M_F)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcmult.f%# %q0, %q2, %q3, #180" - [(set_attr "type" "mve_move") - (set_attr "length""8")]) - -;; -;; [vcmulq_rot270_m_f]) -;; -(define_insn "mve_vcmulq_rot270_m_f" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "") - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0") - (match_operand:MVE_0 2 "s_register_operand" "w") - (match_operand:MVE_0 3 "s_register_operand" "w") - (match_operand: 4 "vpr_register_operand" "Up")] - VCMULQ_ROT270_M_F)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcmult.f%# %q0, %q2, %q3, #270" - [(set_attr "type" "mve_move") - (set_attr "length""8")]) - -;; -;; [vcmulq_rot90_m_f]) -;; -(define_insn "mve_vcmulq_rot90_m_f" - [ - (set (match_operand:MVE_0 0 "s_register_operand" "") - (unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0") - (match_operand:MVE_0 2 "s_register_operand" "w") - (match_operand:MVE_0 3 "s_register_operand" "w") - 
(match_operand: 4 "vpr_register_operand" "Up")] - VCMULQ_ROT90_M_F)) - ] - "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" - "vpst\;vcmult.f%# %q0, %q2, %q3, #90" - [(set_attr "type" "mve_move") - (set_attr "length""8")]) - ;; ;; [vornq_m_f]) ;;

From patchwork Thu Jul 13 10:22:22 2023
To: gcc-patches@gcc.gnu.org, Kyrylo.Tkachov@arm.com, richard.earnshaw@arm.com, richard.sandiford@arm.com
Cc: Christophe Lyon
Subject: [PATCH 4/6] arm: [MVE intrinsics] rework vcmulq
Date: Thu, 13 Jul 2023 10:22:22 +0000
Message-Id:
<20230713102224.1161596-4-christophe.lyon@linaro.org>
In-Reply-To: <20230713102224.1161596-1-christophe.lyon@linaro.org>
References: <20230713102224.1161596-1-christophe.lyon@linaro.org>
From: Christophe Lyon

Implement vcmulq using the new MVE builtins framework.

2023-07-13  Christophe Lyon

gcc/
	* config/arm/arm-mve-builtins-base.cc (vcmulq, vcmulq_rot90)
	(vcmulq_rot180, vcmulq_rot270): New.
	* config/arm/arm-mve-builtins-base.def (vcmulq, vcmulq_rot90)
	(vcmulq_rot180, vcmulq_rot270): New.
	* config/arm/arm-mve-builtins-base.h (vcmulq, vcmulq_rot90)
	(vcmulq_rot180, vcmulq_rot270): New.
	* config/arm/arm_mve.h (vcmulq_rot90, vcmulq_rot270)
	(vcmulq_rot180, vcmulq, vcmulq_m, vcmulq_rot180_m)
	(vcmulq_rot270_m, vcmulq_rot90_m, vcmulq_x, vcmulq_rot90_x)
	(vcmulq_rot180_x, vcmulq_rot270_x): Delete.
	(vcmulq_rot90_f16, vcmulq_rot270_f16, vcmulq_rot180_f16)
	(vcmulq_f16, vcmulq_rot90_f32, vcmulq_rot270_f32)
	(vcmulq_rot180_f32, vcmulq_f32): Delete.
	(vcmulq_m_f32, vcmulq_m_f16, vcmulq_rot180_m_f32)
	(vcmulq_rot180_m_f16, vcmulq_rot270_m_f32, vcmulq_rot270_m_f16)
	(vcmulq_rot90_m_f32, vcmulq_rot90_m_f16): Delete.
	(vcmulq_x_f16, vcmulq_x_f32, vcmulq_rot90_x_f16)
	(vcmulq_rot90_x_f32, vcmulq_rot180_x_f16, vcmulq_rot180_x_f32)
	(vcmulq_rot270_x_f16, vcmulq_rot270_x_f32): Delete.
	(__arm_vcmulq_rot90_f16, __arm_vcmulq_rot270_f16)
	(__arm_vcmulq_rot180_f16, __arm_vcmulq_f16)
	(__arm_vcmulq_rot90_f32, __arm_vcmulq_rot270_f32)
	(__arm_vcmulq_rot180_f32, __arm_vcmulq_f32): Delete.
	(__arm_vcmulq_m_f32, __arm_vcmulq_m_f16)
	(__arm_vcmulq_rot180_m_f32, __arm_vcmulq_rot180_m_f16)
	(__arm_vcmulq_rot270_m_f32, __arm_vcmulq_rot270_m_f16)
	(__arm_vcmulq_rot90_m_f32, __arm_vcmulq_rot90_m_f16): Delete.
	(__arm_vcmulq_x_f16, __arm_vcmulq_x_f32)
	(__arm_vcmulq_rot90_x_f16, __arm_vcmulq_rot90_x_f32)
	(__arm_vcmulq_rot180_x_f16, __arm_vcmulq_rot180_x_f32)
	(__arm_vcmulq_rot270_x_f16, __arm_vcmulq_rot270_x_f32): Delete.
	(__arm_vcmulq_rot90, __arm_vcmulq_rot270, __arm_vcmulq_rot180)
	(__arm_vcmulq, __arm_vcmulq_m, __arm_vcmulq_rot180_m)
	(__arm_vcmulq_rot270_m, __arm_vcmulq_rot90_m, __arm_vcmulq_x)
	(__arm_vcmulq_rot90_x, __arm_vcmulq_rot180_x)
	(__arm_vcmulq_rot270_x): Delete.
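As background for what these patterns compute: vcmulq and its _rot90/_rot180/_rot270 variants each produce half of a rotated complex product, so each pair of intrinsics together forms a full complex multiply. A scalar model of one lane (my reading of the ACLE rotation convention, not code from this patch) can be written in a few lines:

```c
#include <complex.h>

/* Scalar models of the four vcmulq rotations on one complex lane.
   Convention assumed here: rot 0 computes (a.re*b.re, a.re*b.im),
   and rot 90 computes (-a.im*b.im, a.im*b.re); rot 180 and rot 270
   are their negations.  */
static double complex
cmul_rot0 (double complex a, double complex b)
{
  return creal (a) * creal (b) + creal (a) * cimag (b) * I;
}

static double complex
cmul_rot90 (double complex a, double complex b)
{
  return -cimag (a) * cimag (b) + cimag (a) * creal (b) * I;
}

static double complex
cmul_rot180 (double complex a, double complex b)
{
  return -cmul_rot0 (a, b);
}

static double complex
cmul_rot270 (double complex a, double complex b)
{
  return -cmul_rot90 (a, b);
}
```

Under that convention, cmul_rot0(a, b) + cmul_rot90(a, b) equals the ordinary complex product a*b (and rot180 + rot270 gives its negation), which is why the rotated intrinsics are typically issued in pairs.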
--- gcc/config/arm/arm-mve-builtins-base.cc | 4 + gcc/config/arm/arm-mve-builtins-base.def | 4 + gcc/config/arm/arm-mve-builtins-base.h | 4 + gcc/config/arm/arm_mve.h | 448 ----------------------- 4 files changed, 12 insertions(+), 448 deletions(-) diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc index f15bb926147..3ad8df304e8 100644 --- a/gcc/config/arm/arm-mve-builtins-base.cc +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -262,6 +262,10 @@ FUNCTION_WITH_RTX_M (vandq, AND, VANDQ) FUNCTION_ONLY_N (vbrsrq, VBRSRQ) FUNCTION (vcaddq_rot90, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD90, UNSPEC_VCADD90, UNSPEC_VCADD90, VCADDQ_ROT90_M_S, VCADDQ_ROT90_M_U, VCADDQ_ROT90_M_F)) FUNCTION (vcaddq_rot270, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD270, UNSPEC_VCADD270, UNSPEC_VCADD270, VCADDQ_ROT270_M_S, VCADDQ_ROT270_M_U, VCADDQ_ROT270_M_F)) +FUNCTION (vcmulq, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMUL, -1, -1, VCMULQ_M_F)) +FUNCTION (vcmulq_rot90, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMUL90, -1, -1, VCMULQ_ROT90_M_F)) +FUNCTION (vcmulq_rot180, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMUL180, -1, -1, VCMULQ_ROT180_M_F)) +FUNCTION (vcmulq_rot270, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMUL270, -1, -1, VCMULQ_ROT270_M_F)) FUNCTION (vhcaddq_rot90, unspec_mve_function_exact_insn_rot, (VHCADDQ_ROT90_S, -1, -1, VHCADDQ_ROT90_M_S, -1, -1)) FUNCTION (vhcaddq_rot270, unspec_mve_function_exact_insn_rot, (VHCADDQ_ROT270_S, -1, -1, VHCADDQ_ROT270_M_S, -1, -1)) FUNCTION_WITHOUT_N_NO_U_F (vclsq, VCLSQ) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index 9a793147960..cbcf0d296cd 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -158,6 +158,10 @@ DEF_MVE_FUNCTION (vandq, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vbrsrq, binary_imm32, all_float, mx_or_none) 
DEF_MVE_FUNCTION (vcaddq_rot90, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcaddq_rot270, binary, all_float, mx_or_none) +DEF_MVE_FUNCTION (vcmulq, binary, all_float, mx_or_none) +DEF_MVE_FUNCTION (vcmulq_rot90, binary, all_float, mx_or_none) +DEF_MVE_FUNCTION (vcmulq_rot180, binary, all_float, mx_or_none) +DEF_MVE_FUNCTION (vcmulq_rot270, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcmpeqq, cmp, all_float, m_or_none) DEF_MVE_FUNCTION (vcmpgeq, cmp, all_float, m_or_none) DEF_MVE_FUNCTION (vcmpgtq, cmp, all_float, m_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index 8dac72970e0..875b333ebef 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -35,6 +35,10 @@ extern const function_base *const vandq; extern const function_base *const vbrsrq; extern const function_base *const vcaddq_rot90; extern const function_base *const vcaddq_rot270; +extern const function_base *const vcmulq; +extern const function_base *const vcmulq_rot90; +extern const function_base *const vcmulq_rot180; +extern const function_base *const vcmulq_rot270; extern const function_base *const vclsq; extern const function_base *const vclzq; extern const function_base *const vcmpcsq; diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 9297ad757ab..b9d3a876369 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -155,10 +155,6 @@ #define vcvtbq_f32(__a) __arm_vcvtbq_f32(__a) #define vcvtq(__a) __arm_vcvtq(__a) #define vcvtq_n(__a, __imm6) __arm_vcvtq_n(__a, __imm6) -#define vcmulq_rot90(__a, __b) __arm_vcmulq_rot90(__a, __b) -#define vcmulq_rot270(__a, __b) __arm_vcmulq_rot270(__a, __b) -#define vcmulq_rot180(__a, __b) __arm_vcmulq_rot180(__a, __b) -#define vcmulq(__a, __b) __arm_vcmulq(__a, __b) #define vcvtaq_m(__inactive, __a, __p) __arm_vcvtaq_m(__inactive, __a, __p) #define vcvtq_m(__inactive, __a, __p) __arm_vcvtq_m(__inactive, __a, __p) #define 
vcvtbq_m(__a, __b, __p) __arm_vcvtbq_m(__a, __b, __p) @@ -175,14 +171,6 @@ #define vcmlaq_rot180_m(__a, __b, __c, __p) __arm_vcmlaq_rot180_m(__a, __b, __c, __p) #define vcmlaq_rot270_m(__a, __b, __c, __p) __arm_vcmlaq_rot270_m(__a, __b, __c, __p) #define vcmlaq_rot90_m(__a, __b, __c, __p) __arm_vcmlaq_rot90_m(__a, __b, __c, __p) -#define vcmulq_m(__inactive, __a, __b, __p) __arm_vcmulq_m(__inactive, __a, __b, __p) -#define vcmulq_rot180_m(__inactive, __a, __b, __p) __arm_vcmulq_rot180_m(__inactive, __a, __b, __p) -#define vcmulq_rot270_m(__inactive, __a, __b, __p) __arm_vcmulq_rot270_m(__inactive, __a, __b, __p) -#define vcmulq_rot90_m(__inactive, __a, __b, __p) __arm_vcmulq_rot90_m(__inactive, __a, __b, __p) -#define vcmulq_x(__a, __b, __p) __arm_vcmulq_x(__a, __b, __p) -#define vcmulq_rot90_x(__a, __b, __p) __arm_vcmulq_rot90_x(__a, __b, __p) -#define vcmulq_rot180_x(__a, __b, __p) __arm_vcmulq_rot180_x(__a, __b, __p) -#define vcmulq_rot270_x(__a, __b, __p) __arm_vcmulq_rot270_x(__a, __b, __p) #define vcvtq_x(__a, __p) __arm_vcvtq_x(__a, __p) #define vcvtq_x_n(__a, __imm6, __p) __arm_vcvtq_x_n(__a, __imm6, __p) @@ -262,20 +250,12 @@ #define vmullbq_poly_p8(__a, __b) __arm_vmullbq_poly_p8(__a, __b) #define vbicq_n_u16(__a, __imm) __arm_vbicq_n_u16(__a, __imm) #define vornq_f16(__a, __b) __arm_vornq_f16(__a, __b) -#define vcmulq_rot90_f16(__a, __b) __arm_vcmulq_rot90_f16(__a, __b) -#define vcmulq_rot270_f16(__a, __b) __arm_vcmulq_rot270_f16(__a, __b) -#define vcmulq_rot180_f16(__a, __b) __arm_vcmulq_rot180_f16(__a, __b) -#define vcmulq_f16(__a, __b) __arm_vcmulq_f16(__a, __b) #define vbicq_f16(__a, __b) __arm_vbicq_f16(__a, __b) #define vbicq_n_s16(__a, __imm) __arm_vbicq_n_s16(__a, __imm) #define vmulltq_poly_p16(__a, __b) __arm_vmulltq_poly_p16(__a, __b) #define vmullbq_poly_p16(__a, __b) __arm_vmullbq_poly_p16(__a, __b) #define vbicq_n_u32(__a, __imm) __arm_vbicq_n_u32(__a, __imm) #define vornq_f32(__a, __b) __arm_vornq_f32(__a, __b) -#define 
vcmulq_rot90_f32(__a, __b) __arm_vcmulq_rot90_f32(__a, __b) -#define vcmulq_rot270_f32(__a, __b) __arm_vcmulq_rot270_f32(__a, __b) -#define vcmulq_rot180_f32(__a, __b) __arm_vcmulq_rot180_f32(__a, __b) -#define vcmulq_f32(__a, __b) __arm_vcmulq_f32(__a, __b) #define vbicq_f32(__a, __b) __arm_vbicq_f32(__a, __b) #define vbicq_n_s32(__a, __imm) __arm_vbicq_n_s32(__a, __imm) #define vctp8q_m(__a, __p) __arm_vctp8q_m(__a, __p) @@ -372,14 +352,6 @@ #define vcmlaq_rot270_m_f16(__a, __b, __c, __p) __arm_vcmlaq_rot270_m_f16(__a, __b, __c, __p) #define vcmlaq_rot90_m_f32(__a, __b, __c, __p) __arm_vcmlaq_rot90_m_f32(__a, __b, __c, __p) #define vcmlaq_rot90_m_f16(__a, __b, __c, __p) __arm_vcmlaq_rot90_m_f16(__a, __b, __c, __p) -#define vcmulq_m_f32(__inactive, __a, __b, __p) __arm_vcmulq_m_f32(__inactive, __a, __b, __p) -#define vcmulq_m_f16(__inactive, __a, __b, __p) __arm_vcmulq_m_f16(__inactive, __a, __b, __p) -#define vcmulq_rot180_m_f32(__inactive, __a, __b, __p) __arm_vcmulq_rot180_m_f32(__inactive, __a, __b, __p) -#define vcmulq_rot180_m_f16(__inactive, __a, __b, __p) __arm_vcmulq_rot180_m_f16(__inactive, __a, __b, __p) -#define vcmulq_rot270_m_f32(__inactive, __a, __b, __p) __arm_vcmulq_rot270_m_f32(__inactive, __a, __b, __p) -#define vcmulq_rot270_m_f16(__inactive, __a, __b, __p) __arm_vcmulq_rot270_m_f16(__inactive, __a, __b, __p) -#define vcmulq_rot90_m_f32(__inactive, __a, __b, __p) __arm_vcmulq_rot90_m_f32(__inactive, __a, __b, __p) -#define vcmulq_rot90_m_f16(__inactive, __a, __b, __p) __arm_vcmulq_rot90_m_f16(__inactive, __a, __b, __p) #define vcvtq_m_n_s32_f32(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_s32_f32(__inactive, __a, __imm6, __p) #define vcvtq_m_n_s16_f16(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_s16_f16(__inactive, __a, __imm6, __p) #define vcvtq_m_n_u32_f32(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_u32_f32(__inactive, __a, __imm6, __p) @@ -712,14 +684,6 @@ #define vornq_x_u8(__a, __b, __p) __arm_vornq_x_u8(__a, __b, __p) #define 
vornq_x_u16(__a, __b, __p) __arm_vornq_x_u16(__a, __b, __p) #define vornq_x_u32(__a, __b, __p) __arm_vornq_x_u32(__a, __b, __p) -#define vcmulq_x_f16(__a, __b, __p) __arm_vcmulq_x_f16(__a, __b, __p) -#define vcmulq_x_f32(__a, __b, __p) __arm_vcmulq_x_f32(__a, __b, __p) -#define vcmulq_rot90_x_f16(__a, __b, __p) __arm_vcmulq_rot90_x_f16(__a, __b, __p) -#define vcmulq_rot90_x_f32(__a, __b, __p) __arm_vcmulq_rot90_x_f32(__a, __b, __p) -#define vcmulq_rot180_x_f16(__a, __b, __p) __arm_vcmulq_rot180_x_f16(__a, __b, __p) -#define vcmulq_rot180_x_f32(__a, __b, __p) __arm_vcmulq_rot180_x_f32(__a, __b, __p) -#define vcmulq_rot270_x_f16(__a, __b, __p) __arm_vcmulq_rot270_x_f16(__a, __b, __p) -#define vcmulq_rot270_x_f32(__a, __b, __p) __arm_vcmulq_rot270_x_f32(__a, __b, __p) #define vcvtaq_x_s16_f16(__a, __p) __arm_vcvtaq_x_s16_f16(__a, __p) #define vcvtaq_x_s32_f32(__a, __p) __arm_vcvtaq_x_s32_f32(__a, __p) #define vcvtaq_x_u16_f16(__a, __p) __arm_vcvtaq_x_u16_f16(__a, __p) @@ -4561,34 +4525,6 @@ __arm_vornq_f16 (float16x8_t __a, float16x8_t __b) return __builtin_mve_vornq_fv8hf (__a, __b); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot90_f16 (float16x8_t __a, float16x8_t __b) -{ - return __builtin_mve_vcmulq_rot90v8hf (__a, __b); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot270_f16 (float16x8_t __a, float16x8_t __b) -{ - return __builtin_mve_vcmulq_rot270v8hf (__a, __b); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot180_f16 (float16x8_t __a, float16x8_t __b) -{ - return __builtin_mve_vcmulq_rot180v8hf (__a, __b); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_f16 (float16x8_t __a, float16x8_t __b) -{ - return __builtin_mve_vcmulqv8hf 
(__a, __b); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_f16 (float16x8_t __a, float16x8_t __b) @@ -4603,34 +4539,6 @@ __arm_vornq_f32 (float32x4_t __a, float32x4_t __b) return __builtin_mve_vornq_fv4sf (__a, __b); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot90_f32 (float32x4_t __a, float32x4_t __b) -{ - return __builtin_mve_vcmulq_rot90v4sf (__a, __b); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot270_f32 (float32x4_t __a, float32x4_t __b) -{ - return __builtin_mve_vcmulq_rot270v4sf (__a, __b); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot180_f32 (float32x4_t __a, float32x4_t __b) -{ - return __builtin_mve_vcmulq_rot180v4sf (__a, __b); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_f32 (float32x4_t __a, float32x4_t __b) -{ - return __builtin_mve_vcmulqv4sf (__a, __b); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq_f32 (float32x4_t __a, float32x4_t __b) @@ -5003,62 +4911,6 @@ __arm_vcmlaq_rot90_m_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c, mve return __builtin_mve_vcmlaq_rot90_m_fv8hf (__a, __b, __c, __p); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_m_fv4sf (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_m_f16 (float16x8_t __inactive, float16x8_t 
__a, float16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_m_fv8hf (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot180_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_rot180_m_fv4sf (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot180_m_f16 (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_rot180_m_fv8hf (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot270_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_rot270_m_fv4sf (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot270_m_f16 (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_rot270_m_fv8hf (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot90_m_f32 (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_rot90_m_fv4sf (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot90_m_f16 (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_rot90_m_fv8hf (__inactive, __a, __b, __p); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, 
__artificial__)) __arm_vcvtq_m_n_s32_f32 (int32x4_t __inactive, float32x4_t __a, const int __imm6, mve_pred16_t __p) @@ -5359,62 +5211,6 @@ __arm_vstrwq_scatter_base_wb_p_f32 (uint32x4_t * __addr, const int __offset, flo *__addr = __builtin_mve_vstrwq_scatter_base_wb_p_fv4sf (*__addr, __offset, __value, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_m_fv8hf (__arm_vuninitializedq_f16 (), __a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_x_f32 (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_m_fv4sf (__arm_vuninitializedq_f32 (), __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot90_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_rot90_m_fv8hf (__arm_vuninitializedq_f16 (), __a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot90_x_f32 (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_rot90_m_fv4sf (__arm_vuninitializedq_f32 (), __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot180_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_rot180_m_fv8hf (__arm_vuninitializedq_f16 (), __a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot180_x_f32 (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_rot180_m_fv4sf 
(__arm_vuninitializedq_f32 (), __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot270_x_f16 (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_rot270_m_fv8hf (__arm_vuninitializedq_f16 (), __a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot270_x_f32 (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __builtin_mve_vcmulq_rot270_m_fv4sf (__arm_vuninitializedq_f32 (), __a, __b, __p); -} - __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtaq_x_s16_f16 (float16x8_t __a, mve_pred16_t __p) @@ -8580,34 +8376,6 @@ __arm_vornq (float16x8_t __a, float16x8_t __b) return __arm_vornq_f16 (__a, __b); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot90 (float16x8_t __a, float16x8_t __b) -{ - return __arm_vcmulq_rot90_f16 (__a, __b); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot270 (float16x8_t __a, float16x8_t __b) -{ - return __arm_vcmulq_rot270_f16 (__a, __b); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot180 (float16x8_t __a, float16x8_t __b) -{ - return __arm_vcmulq_rot180_f16 (__a, __b); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq (float16x8_t __a, float16x8_t __b) -{ - return __arm_vcmulq_f16 (__a, __b); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq (float16x8_t __a, float16x8_t __b) @@ -8622,34 +8390,6 @@ __arm_vornq (float32x4_t __a, float32x4_t 
__b) return __arm_vornq_f32 (__a, __b); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot90 (float32x4_t __a, float32x4_t __b) -{ - return __arm_vcmulq_rot90_f32 (__a, __b); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot270 (float32x4_t __a, float32x4_t __b) -{ - return __arm_vcmulq_rot270_f32 (__a, __b); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot180 (float32x4_t __a, float32x4_t __b) -{ - return __arm_vcmulq_rot180_f32 (__a, __b); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq (float32x4_t __a, float32x4_t __b) -{ - return __arm_vcmulq_f32 (__a, __b); -} - __extension__ extern __inline float32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vbicq (float32x4_t __a, float32x4_t __b) @@ -9007,62 +8747,6 @@ __arm_vcmlaq_rot90_m (float16x8_t __a, float16x8_t __b, float16x8_t __c, mve_pre return __arm_vcmlaq_rot90_m_f16 (__a, __b, __c, __p); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_m_f32 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_m (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_m_f16 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot180_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return 
__arm_vcmulq_rot180_m_f32 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot180_m (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_rot180_m_f16 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot270_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_rot270_m_f32 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot270_m (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_rot270_m_f16 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot90_m (float32x4_t __inactive, float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_rot90_m_f32 (__inactive, __a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot90_m (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_rot90_m_f16 (__inactive, __a, __b, __p); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtq_m_n (int32x4_t __inactive, float32x4_t __a, const int __imm6, mve_pred16_t __p) @@ -9301,62 +8985,6 @@ __arm_vstrwq_scatter_base_wb_p (uint32x4_t * __addr, const int __offset, float32 __arm_vstrwq_scatter_base_wb_p_f32 (__addr, __offset, __value, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_x (float16x8_t __a, 
float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_x_f16 (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_x (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_x_f32 (__a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot90_x (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_rot90_x_f16 (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot90_x (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_rot90_x_f32 (__a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot180_x (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_rot180_x_f16 (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot180_x (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_rot180_x_f32 (__a, __b, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot270_x (float16x8_t __a, float16x8_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_rot270_x_f16 (__a, __b, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmulq_rot270_x (float32x4_t __a, float32x4_t __b, mve_pred16_t __p) -{ - return __arm_vcmulq_rot270_x_f32 (__a, __b, __p); -} - __extension__ extern __inline float16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtq_x (uint16x8_t __a, mve_pred16_t __p) @@ -9912,30 +9540,6 @@ extern void 
*__ARM_undef; int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vornq_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \ int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vornq_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)));}) -#define __arm_vcmulq(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmulq_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmulq_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)));}) - -#define __arm_vcmulq_rot180(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmulq_rot180_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmulq_rot180_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)));}) - -#define __arm_vcmulq_rot270(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmulq_rot270_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmulq_rot270_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)));}) - -#define __arm_vcmulq_rot90(p0,p1) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - _Generic( (int 
(*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmulq_rot90_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t)), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmulq_rot90_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t)));}) - #define __arm_vmulltq_poly(p0,p1) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ @@ -10121,34 +9725,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmlaq_rot90_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmlaq_rot90_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) -#define __arm_vcmulq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmulq_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmulq_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - -#define __arm_vcmulq_rot180_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int 
(*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmulq_rot180_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmulq_rot180_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - -#define __arm_vcmulq_rot270_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmulq_rot270_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmulq_rot270_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - -#define __arm_vcmulq_rot90_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ - __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)] [__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmulq_rot90_m_f16(__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmulq_rot90_m_f32(__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - #define 
__arm_vornq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \ __typeof(p1) __p1 = (p1); \ __typeof(p2) __p2 = (p2); \ @@ -10486,24 +10062,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vbicq_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vbicq_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) -#define __arm_vcmulq_rot180_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmulq_rot180_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmulq_rot180_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - -#define __arm_vcmulq_rot270_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmulq_rot270_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmulq_rot270_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - -#define __arm_vcmulq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmulq_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmulq_x_f32 
(__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - #define __arm_vcvtq_x(p1,p2) ({ __typeof(p1) __p1 = (p1); \ _Generic( (int (*)[__ARM_mve_typeid(__p1)])0, \ int (*)[__ARM_mve_type_int16x8_t]: __arm_vcvtq_x_f16_s16 (__ARM_mve_coerce(__p1, int16x8_t), p2), \ @@ -10530,12 +10088,6 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vornq_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vornq_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) -#define __arm_vcmulq_rot90_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \ - __typeof(p2) __p2 = (p2); \ - _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ - int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmulq_rot90_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \ - int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmulq_rot90_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));}) - #define __arm_vgetq_lane(p0,p1) ({ __typeof(p0) __p0 = (p0); \ _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \ int (*)[__ARM_mve_type_int8x16_t]: __arm_vgetq_lane_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1), \
From patchwork Thu Jul 13 10:22:23 2023
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 119760
To: gcc-patches@gcc.gnu.org, Kyrylo.Tkachov@arm.com, richard.earnshaw@arm.com, richard.sandiford@arm.com
Cc: Christophe Lyon
Subject: [PATCH 5/6] arm: [MVE intrinsics] factorize vcmlaq
Date: Thu, 13 Jul 2023 10:22:23 +0000
Message-Id: <20230713102224.1161596-5-christophe.lyon@linaro.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230713102224.1161596-1-christophe.lyon@linaro.org>
References: <20230713102224.1161596-1-christophe.lyon@linaro.org>
MIME-Version: 1.0
From: Christophe Lyon

Factorize vcmlaq builtins so that they use parameterized names.

2023-07-13  Christophe Lyon

gcc/
	* config/arm/arm_mve_builtins.def (vcmlaq_rot90_f)
	(vcmlaq_rot270_f, vcmlaq_rot180_f, vcmlaq_f): Add "_f" suffix.
	* config/arm/iterators.md (MVE_VCMLAQ_M): New.
	(mve_insn): Add vcmla.
	(rot): Add VCMLAQ_M_F, VCMLAQ_ROT90_M_F, VCMLAQ_ROT180_M_F,
	VCMLAQ_ROT270_M_F.
	(mve_rot): Add VCMLAQ_M_F, VCMLAQ_ROT90_M_F, VCMLAQ_ROT180_M_F,
	VCMLAQ_ROT270_M_F.
	* config/arm/mve.md (mve_vcmlaq<mve_rot><mode>): Rename into ...
	(@mve_<mve_insn>q<mve_rot>_f<mode>): ... this.
	(mve_vcmlaq_m_f<mode>, mve_vcmlaq_rot180_m_f<mode>)
	(mve_vcmlaq_rot270_m_f<mode>, mve_vcmlaq_rot90_m_f<mode>): Merge
	into ...
	(@mve_<mve_insn>q<mve_rot>_m_f<mode>): ... this.
---
 gcc/config/arm/arm_mve_builtins.def | 10 ++---
 gcc/config/arm/iterators.md         | 19 ++++++++-
 gcc/config/arm/mve.md               | 64 ++++-------------------
 3 files changed, 29 insertions(+), 64 deletions(-)

diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def
index 56358c0bd02..43dacc3dda1 100644
--- a/gcc/config/arm/arm_mve_builtins.def
+++ b/gcc/config/arm/arm_mve_builtins.def
@@ -378,6 +378,10 @@ VAR3 (TERNOP_NONE_NONE_NONE_NONE, vmlasq_n_s, v16qi, v8hi, v4si)
 VAR3 (TERNOP_NONE_NONE_NONE_NONE, vmlaq_n_s, v16qi, v8hi, v4si)
 VAR3 (TERNOP_NONE_NONE_NONE_NONE, vmladavaxq_s, v16qi, v8hi, v4si)
 VAR3 (TERNOP_NONE_NONE_NONE_NONE, vmladavaq_s, v16qi, v8hi, v4si)
+VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot90_f, v8hf, v4sf)
+VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot270_f, v8hf, v4sf)
+VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot180_f, v8hf, v4sf)
+VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_f, v8hf, v4sf)
 VAR3 (TERNOP_NONE_NONE_NONE_IMM, vsriq_n_s, v16qi, v8hi, v4si)
 VAR3 (TERNOP_NONE_NONE_NONE_IMM, vsliq_n_s, v16qi, v8hi, v4si)
 VAR2 (TERNOP_UNONE_UNONE_UNONE_PRED, vrev32q_m_u, v16qi, v8hi)
@@ -876,9 +880,3 @@ VAR3 (QUADOP_NONE_NONE_UNONE_IMM_PRED, vshlcq_m_vec_s, v16qi, v8hi, v4si)
 VAR3 (QUADOP_NONE_NONE_UNONE_IMM_PRED, vshlcq_m_carry_s, v16qi, v8hi, v4si)
 VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_vec_u, v16qi, v8hi, v4si)
 VAR3 (QUADOP_UNONE_UNONE_UNONE_IMM_PRED, vshlcq_m_carry_u, v16qi, v8hi, v4si)
-
-/* optabs without any suffixes.  */
-VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot90, v8hf, v4sf)
-VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot270, v8hf, v4sf)
-VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq_rot180, v8hf, v4sf)
-VAR2 (TERNOP_NONE_NONE_NONE_NONE, vcmlaq, v8hf, v4sf)
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 9f71404e26c..b13ff53d36f 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -911,6 +911,10 @@
		     VCMULQ_M_F VCMULQ_ROT90_M_F VCMULQ_ROT180_M_F VCMULQ_ROT270_M_F
		     ])

+(define_int_iterator MVE_VCMLAQ_M [
+		     VCMLAQ_M_F VCMLAQ_ROT90_M_F VCMLAQ_ROT180_M_F VCMLAQ_ROT270_M_F
+		     ])
+
 (define_int_attr mve_insn [
		 (UNSPEC_VCADD90 "vcadd") (UNSPEC_VCADD270 "vcadd")
		 (UNSPEC_VCMUL "vcmul") (UNSPEC_VCMUL90 "vcmul") (UNSPEC_VCMUL180 "vcmul") (UNSPEC_VCMUL270 "vcmul")
@@ -942,6 +946,7 @@
		 (VCLSQ_M_S "vcls") (VCLSQ_S "vcls")
		 (VCLZQ_M_S "vclz") (VCLZQ_M_U "vclz")
+		 (VCMLAQ_M_F "vcmla") (VCMLAQ_ROT90_M_F "vcmla") (VCMLAQ_ROT180_M_F "vcmla") (VCMLAQ_ROT270_M_F "vcmla")
		 (VCMULQ_M_F "vcmul") (VCMULQ_ROT90_M_F "vcmul") (VCMULQ_ROT180_M_F "vcmul") (VCMULQ_ROT270_M_F "vcmul")
		 (VCREATEQ_S "vcreate") (VCREATEQ_U "vcreate") (VCREATEQ_F "vcreate")
		 (VDUPQ_M_N_S "vdup") (VDUPQ_M_N_U "vdup") (VDUPQ_M_N_F "vdup")
@@ -1204,6 +1209,7 @@
		 (VSUBQ_M_N_S "vsub") (VSUBQ_M_N_U "vsub") (VSUBQ_M_N_F "vsub")
		 (VSUBQ_M_S "vsub") (VSUBQ_M_U "vsub") (VSUBQ_M_F "vsub")
		 (VSUBQ_N_S "vsub") (VSUBQ_N_U "vsub") (VSUBQ_N_F "vsub")
+		 (UNSPEC_VCMLA "vcmla") (UNSPEC_VCMLA90 "vcmla") (UNSPEC_VCMLA180 "vcmla") (UNSPEC_VCMLA270 "vcmla")
		 ])

 (define_int_attr isu [
@@ -2198,7 +2204,12 @@
		      (VCMULQ_M_F "0")
		      (VCMULQ_ROT90_M_F "90")
		      (VCMULQ_ROT180_M_F "180")
-		      (VCMULQ_ROT270_M_F "270")])
+		      (VCMULQ_ROT270_M_F "270")
+		      (VCMLAQ_M_F "0")
+		      (VCMLAQ_ROT90_M_F "90")
+		      (VCMLAQ_ROT180_M_F "180")
+		      (VCMLAQ_ROT270_M_F "270")
+		      ])

 ;; The complex operations when performed on a real complex number require two
 ;; instructions to perform the operation.  e.g. complex multiplication requires
@@ -2250,7 +2261,11 @@
		    (VCMULQ_M_F "")
		    (VCMULQ_ROT90_M_F "_rot90")
		    (VCMULQ_ROT180_M_F "_rot180")
-		    (VCMULQ_ROT270_M_F "_rot270")])
+		    (VCMULQ_ROT270_M_F "_rot270")
+		    (VCMLAQ_M_F "")
+		    (VCMLAQ_ROT90_M_F "_rot90")
+		    (VCMLAQ_ROT180_M_F "_rot180")
+		    (VCMLAQ_ROT270_M_F "_rot270")])

 (define_int_attr fcmac1 [(UNSPEC_VCMLA "a") (UNSPEC_VCMLA_CONJ "a")
			 (UNSPEC_VCMLA180 "s") (UNSPEC_VCMLA180_CONJ "s")])
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 0b99bf017dc..a2cbcff1a6f 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -2087,7 +2087,7 @@
 ;;
 ;; [vcmlaq, vcmlaq_rot90, vcmlaq_rot180, vcmlaq_rot270])
 ;;
-(define_insn "mve_vcmlaq<mve_rot><mode>"
+(define_insn "@mve_<mve_insn>q<mve_rot>_f<mode>"
   [
    (set (match_operand:MVE_0 0 "s_register_operand" "=w,w")
	(plus:MVE_0 (match_operand:MVE_0 1 "reg_or_zero_operand" "Dz,0")
@@ -3180,70 +3180,22 @@
    (set_attr "length""8")])

 ;;
-;; [vcmlaq_m_f])
-;;
-(define_insn "mve_vcmlaq_m_f<mode>"
-  [
-   (set (match_operand:MVE_0 0 "s_register_operand" "=w")
-	(unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0")
-		       (match_operand:MVE_0 2 "s_register_operand" "w")
-		       (match_operand:MVE_0 3 "s_register_operand" "w")
-		       (match_operand:<MVE_VPRED> 4 "vpr_register_operand" "Up")]
-	 VCMLAQ_M_F))
-  ]
-  "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vpst\;vcmlat.f%#<V_sz_elem> %q0, %q2, %q3, #0"
-  [(set_attr "type" "mve_move")
-   (set_attr "length""8")])
-
-;;
-;; [vcmlaq_rot180_m_f])
-;;
-(define_insn "mve_vcmlaq_rot180_m_f<mode>"
-  [
-   (set (match_operand:MVE_0 0 "s_register_operand" "=w")
-	(unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0")
-		       (match_operand:MVE_0 2 "s_register_operand" "w")
-		       (match_operand:MVE_0 3 "s_register_operand" "w")
-		       (match_operand:<MVE_VPRED> 4 "vpr_register_operand" "Up")]
-	 VCMLAQ_ROT180_M_F))
-  ]
-  "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vpst\;vcmlat.f%#<V_sz_elem> %q0, %q2, %q3, #180"
-  [(set_attr "type" "mve_move")
-   (set_attr "length""8")])
-
-;;
-;; [vcmlaq_rot270_m_f])
+;; [vcmlaq_m_f]
+;; [vcmlaq_rot90_m_f]
+;; [vcmlaq_rot180_m_f]
+;; [vcmlaq_rot270_m_f]
 ;;
-(define_insn "mve_vcmlaq_rot270_m_f<mode>"
-  [
-   (set (match_operand:MVE_0 0 "s_register_operand" "=w")
-	(unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0")
-		       (match_operand:MVE_0 2 "s_register_operand" "w")
-		       (match_operand:MVE_0 3 "s_register_operand" "w")
-		       (match_operand:<MVE_VPRED> 4 "vpr_register_operand" "Up")]
-	 VCMLAQ_ROT270_M_F))
-  ]
-  "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vpst\;vcmlat.f%#<V_sz_elem> %q0, %q2, %q3, #270"
-  [(set_attr "type" "mve_move")
-   (set_attr "length""8")])
-
-;;
-;; [vcmlaq_rot90_m_f])
-;;
-(define_insn "mve_vcmlaq_rot90_m_f<mode>"
+(define_insn "@mve_<mve_insn>q<mve_rot>_m_f<mode>"
   [
    (set (match_operand:MVE_0 0 "s_register_operand" "=w")
	(unspec:MVE_0 [(match_operand:MVE_0 1 "s_register_operand" "0")
		       (match_operand:MVE_0 2 "s_register_operand" "w")
		       (match_operand:MVE_0 3 "s_register_operand" "w")
		       (match_operand:<MVE_VPRED> 4 "vpr_register_operand" "Up")]
-	 VCMLAQ_ROT90_M_F))
+	 MVE_VCMLAQ_M))
  ]
  "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vpst\;vcmlat.f%#<V_sz_elem> %q0, %q2, %q3, #90"
+  "vpst\;<mve_insn>t.f%#<V_sz_elem>\t%q0, %q2, %q3, #<rot>"
  [(set_attr "type" "mve_move")
   (set_attr "length""8")])

From patchwork Thu Jul 13 10:22:24 2023
To: gcc-patches@gcc.gnu.org, Kyrylo.Tkachov@arm.com, richard.earnshaw@arm.com, richard.sandiford@arm.com
Cc: Christophe Lyon
Subject: [PATCH 6/6] arm: [MVE intrinsics] rework vcmlaq
Date: Thu, 13 Jul 2023 10:22:24 +0000
Message-Id: <20230713102224.1161596-6-christophe.lyon@linaro.org>
In-Reply-To: <20230713102224.1161596-1-christophe.lyon@linaro.org>
References: <20230713102224.1161596-1-christophe.lyon@linaro.org>
From: Christophe Lyon

Implement vcmlaq using the new MVE builtins framework.

2023-07-13  Christophe Lyon

gcc/
	* config/arm/arm-mve-builtins-base.cc (vcmlaq, vcmlaq_rot90)
	(vcmlaq_rot180, vcmlaq_rot270): New.
	* config/arm/arm-mve-builtins-base.def (vcmlaq, vcmlaq_rot90)
	(vcmlaq_rot180, vcmlaq_rot270): New.
	* config/arm/arm-mve-builtins-base.h (vcmlaq, vcmlaq_rot90)
	(vcmlaq_rot180, vcmlaq_rot270): New.
	* config/arm/arm-mve-builtins.cc
	(function_instance::has_inactive_argument): Handle vcmlaq,
	vcmlaq_rot90, vcmlaq_rot180, vcmlaq_rot270.
	* config/arm/arm_mve.h (vcmlaq): Delete.
	(vcmlaq_rot180): Delete.
	(vcmlaq_rot270): Delete.
	(vcmlaq_rot90): Delete.
	(vcmlaq_m): Delete.
	(vcmlaq_rot180_m): Delete.
	(vcmlaq_rot270_m): Delete.
	(vcmlaq_rot90_m): Delete.
	(vcmlaq_f16): Delete.
	(vcmlaq_rot180_f16): Delete.
	(vcmlaq_rot270_f16): Delete.
	(vcmlaq_rot90_f16): Delete.
	(vcmlaq_f32): Delete.
	(vcmlaq_rot180_f32): Delete.
	(vcmlaq_rot270_f32): Delete.
	(vcmlaq_rot90_f32): Delete.
	(vcmlaq_m_f32): Delete.
	(vcmlaq_m_f16): Delete.
	(vcmlaq_rot180_m_f32): Delete.
	(vcmlaq_rot180_m_f16): Delete.
	(vcmlaq_rot270_m_f32): Delete.
	(vcmlaq_rot270_m_f16): Delete.
	(vcmlaq_rot90_m_f32): Delete.
	(vcmlaq_rot90_m_f16): Delete.
	(__arm_vcmlaq_f16): Delete.
	(__arm_vcmlaq_rot180_f16): Delete.
	(__arm_vcmlaq_rot270_f16): Delete.
	(__arm_vcmlaq_rot90_f16): Delete.
	(__arm_vcmlaq_f32): Delete.
	(__arm_vcmlaq_rot180_f32): Delete.
	(__arm_vcmlaq_rot270_f32): Delete.
	(__arm_vcmlaq_rot90_f32): Delete.
	(__arm_vcmlaq_m_f32): Delete.
	(__arm_vcmlaq_m_f16): Delete.
	(__arm_vcmlaq_rot180_m_f32): Delete.
	(__arm_vcmlaq_rot180_m_f16): Delete.
	(__arm_vcmlaq_rot270_m_f32): Delete.
	(__arm_vcmlaq_rot270_m_f16): Delete.
	(__arm_vcmlaq_rot90_m_f32): Delete.
	(__arm_vcmlaq_rot90_m_f16): Delete.
	(__arm_vcmlaq): Delete.
	(__arm_vcmlaq_rot180): Delete.
	(__arm_vcmlaq_rot270): Delete.
	(__arm_vcmlaq_rot90): Delete.
	(__arm_vcmlaq_m): Delete.
	(__arm_vcmlaq_rot180_m): Delete.
	(__arm_vcmlaq_rot270_m): Delete.
	(__arm_vcmlaq_rot90_m): Delete.
---
 gcc/config/arm/arm-mve-builtins-base.cc  |   4 +
 gcc/config/arm/arm-mve-builtins-base.def |   4 +
 gcc/config/arm/arm-mve-builtins-base.h   |  16 +-
 gcc/config/arm/arm-mve-builtins.cc       |   4 +
 gcc/config/arm/arm_mve.h                 | 304 -----------------------
 5 files changed, 22 insertions(+), 310 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index 3ad8df304e8..e31095ae112 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -262,6 +262,10 @@ FUNCTION_WITH_RTX_M (vandq, AND, VANDQ)
 FUNCTION_ONLY_N (vbrsrq, VBRSRQ)
 FUNCTION (vcaddq_rot90, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD90, UNSPEC_VCADD90, UNSPEC_VCADD90, VCADDQ_ROT90_M_S, VCADDQ_ROT90_M_U, VCADDQ_ROT90_M_F))
 FUNCTION (vcaddq_rot270, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD270, UNSPEC_VCADD270, UNSPEC_VCADD270, VCADDQ_ROT270_M_S, VCADDQ_ROT270_M_U, VCADDQ_ROT270_M_F))
+FUNCTION (vcmlaq, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMLA, -1, -1, VCMLAQ_M_F))
+FUNCTION (vcmlaq_rot90, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMLA90, -1, -1, VCMLAQ_ROT90_M_F))
+FUNCTION (vcmlaq_rot180, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMLA180, -1, -1, VCMLAQ_ROT180_M_F))
+FUNCTION (vcmlaq_rot270, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMLA270, -1, -1, VCMLAQ_ROT270_M_F))
 FUNCTION (vcmulq, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMUL, -1, -1, VCMULQ_M_F))
 FUNCTION (vcmulq_rot90, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMUL90, -1,
-1, VCMULQ_ROT90_M_F)) FUNCTION (vcmulq_rot180, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMUL180, -1, -1, VCMULQ_ROT180_M_F)) diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def index cbcf0d296cd..e7d466f2efd 100644 --- a/gcc/config/arm/arm-mve-builtins-base.def +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -158,6 +158,10 @@ DEF_MVE_FUNCTION (vandq, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vbrsrq, binary_imm32, all_float, mx_or_none) DEF_MVE_FUNCTION (vcaddq_rot90, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcaddq_rot270, binary, all_float, mx_or_none) +DEF_MVE_FUNCTION (vcmlaq, ternary, all_float, m_or_none) +DEF_MVE_FUNCTION (vcmlaq_rot90, ternary, all_float, m_or_none) +DEF_MVE_FUNCTION (vcmlaq_rot180, ternary, all_float, m_or_none) +DEF_MVE_FUNCTION (vcmlaq_rot270, ternary, all_float, m_or_none) DEF_MVE_FUNCTION (vcmulq, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcmulq_rot90, binary, all_float, mx_or_none) DEF_MVE_FUNCTION (vcmulq_rot180, binary, all_float, mx_or_none) diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h index 875b333ebef..be3698b4f4c 100644 --- a/gcc/config/arm/arm-mve-builtins-base.h +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -33,14 +33,14 @@ extern const function_base *const vaddvaq; extern const function_base *const vaddvq; extern const function_base *const vandq; extern const function_base *const vbrsrq; -extern const function_base *const vcaddq_rot90; extern const function_base *const vcaddq_rot270; -extern const function_base *const vcmulq; -extern const function_base *const vcmulq_rot90; -extern const function_base *const vcmulq_rot180; -extern const function_base *const vcmulq_rot270; +extern const function_base *const vcaddq_rot90; extern const function_base *const vclsq; extern const function_base *const vclzq; +extern const function_base *const vcmlaq; +extern const function_base *const vcmlaq_rot180; 
+extern const function_base *const vcmlaq_rot270; +extern const function_base *const vcmlaq_rot90; extern const function_base *const vcmpcsq; extern const function_base *const vcmpeqq; extern const function_base *const vcmpgeq; @@ -49,6 +49,10 @@ extern const function_base *const vcmphiq; extern const function_base *const vcmpleq; extern const function_base *const vcmpltq; extern const function_base *const vcmpneq; +extern const function_base *const vcmulq; +extern const function_base *const vcmulq_rot180; +extern const function_base *const vcmulq_rot270; +extern const function_base *const vcmulq_rot90; extern const function_base *const vcreateq; extern const function_base *const vdupq; extern const function_base *const veorq; @@ -56,8 +60,8 @@ extern const function_base *const vfmaq; extern const function_base *const vfmasq; extern const function_base *const vfmsq; extern const function_base *const vhaddq; -extern const function_base *const vhcaddq_rot90; extern const function_base *const vhcaddq_rot270; +extern const function_base *const vhcaddq_rot90; extern const function_base *const vhsubq; extern const function_base *const vmaxaq; extern const function_base *const vmaxavq; diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc index 7033e41a571..3272ece6326 100644 --- a/gcc/config/arm/arm-mve-builtins.cc +++ b/gcc/config/arm/arm-mve-builtins.cc @@ -670,6 +670,10 @@ function_instance::has_inactive_argument () const return false; if (mode_suffix_id == MODE_r + || base == functions::vcmlaq + || base == functions::vcmlaq_rot90 + || base == functions::vcmlaq_rot180 + || base == functions::vcmlaq_rot270 || base == functions::vcmpeqq || base == functions::vcmpneq || base == functions::vcmpgeq diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index b9d3a876369..88b2e77ffd9 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -159,18 +159,10 @@ #define vcvtq_m(__inactive, __a, __p) 
__arm_vcvtq_m(__inactive, __a, __p) #define vcvtbq_m(__a, __b, __p) __arm_vcvtbq_m(__a, __b, __p) #define vcvttq_m(__a, __b, __p) __arm_vcvttq_m(__a, __b, __p) -#define vcmlaq(__a, __b, __c) __arm_vcmlaq(__a, __b, __c) -#define vcmlaq_rot180(__a, __b, __c) __arm_vcmlaq_rot180(__a, __b, __c) -#define vcmlaq_rot270(__a, __b, __c) __arm_vcmlaq_rot270(__a, __b, __c) -#define vcmlaq_rot90(__a, __b, __c) __arm_vcmlaq_rot90(__a, __b, __c) #define vcvtmq_m(__inactive, __a, __p) __arm_vcvtmq_m(__inactive, __a, __p) #define vcvtnq_m(__inactive, __a, __p) __arm_vcvtnq_m(__inactive, __a, __p) #define vcvtpq_m(__inactive, __a, __p) __arm_vcvtpq_m(__inactive, __a, __p) #define vcvtq_m_n(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n(__inactive, __a, __imm6, __p) -#define vcmlaq_m(__a, __b, __c, __p) __arm_vcmlaq_m(__a, __b, __c, __p) -#define vcmlaq_rot180_m(__a, __b, __c, __p) __arm_vcmlaq_rot180_m(__a, __b, __c, __p) -#define vcmlaq_rot270_m(__a, __b, __c, __p) __arm_vcmlaq_rot270_m(__a, __b, __c, __p) -#define vcmlaq_rot90_m(__a, __b, __c, __p) __arm_vcmlaq_rot90_m(__a, __b, __c, __p) #define vcvtq_x(__a, __p) __arm_vcvtq_x(__a, __p) #define vcvtq_x_n(__a, __imm6, __p) __arm_vcvtq_x_n(__a, __imm6, __p) @@ -286,10 +278,6 @@ #define vcvtbq_m_f32_f16(__inactive, __a, __p) __arm_vcvtbq_m_f32_f16(__inactive, __a, __p) #define vcvttq_m_f16_f32(__a, __b, __p) __arm_vcvttq_m_f16_f32(__a, __b, __p) #define vcvttq_m_f32_f16(__inactive, __a, __p) __arm_vcvttq_m_f32_f16(__inactive, __a, __p) -#define vcmlaq_f16(__a, __b, __c) __arm_vcmlaq_f16(__a, __b, __c) -#define vcmlaq_rot180_f16(__a, __b, __c) __arm_vcmlaq_rot180_f16(__a, __b, __c) -#define vcmlaq_rot270_f16(__a, __b, __c) __arm_vcmlaq_rot270_f16(__a, __b, __c) -#define vcmlaq_rot90_f16(__a, __b, __c) __arm_vcmlaq_rot90_f16(__a, __b, __c) #define vcvtmq_m_s16_f16(__inactive, __a, __p) __arm_vcvtmq_m_s16_f16(__inactive, __a, __p) #define vcvtnq_m_s16_f16(__inactive, __a, __p) __arm_vcvtnq_m_s16_f16(__inactive, __a, __p) #define 
vcvtpq_m_s16_f16(__inactive, __a, __p) __arm_vcvtpq_m_s16_f16(__inactive, __a, __p) @@ -298,10 +286,6 @@ #define vcvtnq_m_u16_f16(__inactive, __a, __p) __arm_vcvtnq_m_u16_f16(__inactive, __a, __p) #define vcvtpq_m_u16_f16(__inactive, __a, __p) __arm_vcvtpq_m_u16_f16(__inactive, __a, __p) #define vcvtq_m_u16_f16(__inactive, __a, __p) __arm_vcvtq_m_u16_f16(__inactive, __a, __p) -#define vcmlaq_f32(__a, __b, __c) __arm_vcmlaq_f32(__a, __b, __c) -#define vcmlaq_rot180_f32(__a, __b, __c) __arm_vcmlaq_rot180_f32(__a, __b, __c) -#define vcmlaq_rot270_f32(__a, __b, __c) __arm_vcmlaq_rot270_f32(__a, __b, __c) -#define vcmlaq_rot90_f32(__a, __b, __c) __arm_vcmlaq_rot90_f32(__a, __b, __c) #define vcvtmq_m_s32_f32(__inactive, __a, __p) __arm_vcvtmq_m_s32_f32(__inactive, __a, __p) #define vcvtnq_m_s32_f32(__inactive, __a, __p) __arm_vcvtnq_m_s32_f32(__inactive, __a, __p) #define vcvtpq_m_s32_f32(__inactive, __a, __p) __arm_vcvtpq_m_s32_f32(__inactive, __a, __p) @@ -344,14 +328,6 @@ #define vmulltq_poly_m_p16(__inactive, __a, __b, __p) __arm_vmulltq_poly_m_p16(__inactive, __a, __b, __p) #define vbicq_m_f32(__inactive, __a, __b, __p) __arm_vbicq_m_f32(__inactive, __a, __b, __p) #define vbicq_m_f16(__inactive, __a, __b, __p) __arm_vbicq_m_f16(__inactive, __a, __b, __p) -#define vcmlaq_m_f32(__a, __b, __c, __p) __arm_vcmlaq_m_f32(__a, __b, __c, __p) -#define vcmlaq_m_f16(__a, __b, __c, __p) __arm_vcmlaq_m_f16(__a, __b, __c, __p) -#define vcmlaq_rot180_m_f32(__a, __b, __c, __p) __arm_vcmlaq_rot180_m_f32(__a, __b, __c, __p) -#define vcmlaq_rot180_m_f16(__a, __b, __c, __p) __arm_vcmlaq_rot180_m_f16(__a, __b, __c, __p) -#define vcmlaq_rot270_m_f32(__a, __b, __c, __p) __arm_vcmlaq_rot270_m_f32(__a, __b, __c, __p) -#define vcmlaq_rot270_m_f16(__a, __b, __c, __p) __arm_vcmlaq_rot270_m_f16(__a, __b, __c, __p) -#define vcmlaq_rot90_m_f32(__a, __b, __c, __p) __arm_vcmlaq_rot90_m_f32(__a, __b, __c, __p) -#define vcmlaq_rot90_m_f16(__a, __b, __c, __p) __arm_vcmlaq_rot90_m_f16(__a, __b, __c, 
__p) #define vcvtq_m_n_s32_f32(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_s32_f32(__inactive, __a, __imm6, __p) #define vcvtq_m_n_s16_f16(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_s16_f16(__inactive, __a, __imm6, __p) #define vcvtq_m_n_u32_f32(__inactive, __a, __imm6, __p) __arm_vcvtq_m_n_u32_f32(__inactive, __a, __imm6, __p) @@ -4645,34 +4621,6 @@ __arm_vcvttq_m_f32_f16 (float32x4_t __inactive, float16x8_t __a, mve_pred16_t __ return __builtin_mve_vcvttq_m_f32_f16v4sf (__inactive, __a, __p); } -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmlaq_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c) -{ - return __builtin_mve_vcmlaqv8hf (__a, __b, __c); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmlaq_rot180_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c) -{ - return __builtin_mve_vcmlaq_rot180v8hf (__a, __b, __c); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmlaq_rot270_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c) -{ - return __builtin_mve_vcmlaq_rot270v8hf (__a, __b, __c); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmlaq_rot90_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c) -{ - return __builtin_mve_vcmlaq_rot90v8hf (__a, __b, __c); -} - __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m_s16_f16 (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) @@ -4729,34 +4677,6 @@ __arm_vcvtq_m_u16_f16 (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p) return __builtin_mve_vcvtq_m_from_f_uv8hi (__inactive, __a, __p); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) 
-__arm_vcmlaq_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c) -{ - return __builtin_mve_vcmlaqv4sf (__a, __b, __c); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmlaq_rot180_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c) -{ - return __builtin_mve_vcmlaq_rot180v4sf (__a, __b, __c); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmlaq_rot270_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c) -{ - return __builtin_mve_vcmlaq_rot270v4sf (__a, __b, __c); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmlaq_rot90_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c) -{ - return __builtin_mve_vcmlaq_rot90v4sf (__a, __b, __c); -} - __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vcvtmq_m_s32_f32 (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p) @@ -4855,62 +4775,6 @@ __arm_vbicq_m_f16 (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve return __builtin_mve_vbicq_m_fv8hf (__inactive, __a, __b, __p); } -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmlaq_m_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c, mve_pred16_t __p) -{ - return __builtin_mve_vcmlaq_m_fv4sf (__a, __b, __c, __p); -} - -__extension__ extern __inline float16x8_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmlaq_m_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c, mve_pred16_t __p) -{ - return __builtin_mve_vcmlaq_m_fv8hf (__a, __b, __c, __p); -} - -__extension__ extern __inline float32x4_t -__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) -__arm_vcmlaq_rot180_m_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c, mve_pred16_t __p) 
-{
-  return __builtin_mve_vcmlaq_rot180_m_fv4sf (__a, __b, __c, __p);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot180_m_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c, mve_pred16_t __p)
-{
-  return __builtin_mve_vcmlaq_rot180_m_fv8hf (__a, __b, __c, __p);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot270_m_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c, mve_pred16_t __p)
-{
-  return __builtin_mve_vcmlaq_rot270_m_fv4sf (__a, __b, __c, __p);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot270_m_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c, mve_pred16_t __p)
-{
-  return __builtin_mve_vcmlaq_rot270_m_fv8hf (__a, __b, __c, __p);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot90_m_f32 (float32x4_t __a, float32x4_t __b, float32x4_t __c, mve_pred16_t __p)
-{
-  return __builtin_mve_vcmlaq_rot90_m_fv4sf (__a, __b, __c, __p);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot90_m_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c, mve_pred16_t __p)
-{
-  return __builtin_mve_vcmlaq_rot90_m_fv8hf (__a, __b, __c, __p);
-}
-
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vcvtq_m_n_s32_f32 (int32x4_t __inactive, float32x4_t __a, const int __imm6, mve_pred16_t __p)
@@ -8481,34 +8345,6 @@ __arm_vcvttq_m (float32x4_t __inactive, float16x8_t __a, mve_pred16_t __p)
   return __arm_vcvttq_m_f32_f16 (__inactive, __a, __p);
 }
 
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq (float16x8_t __a, float16x8_t __b, float16x8_t __c)
-{
-  return __arm_vcmlaq_f16 (__a, __b, __c);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot180 (float16x8_t __a, float16x8_t __b, float16x8_t __c)
-{
-  return __arm_vcmlaq_rot180_f16 (__a, __b, __c);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot270 (float16x8_t __a, float16x8_t __b, float16x8_t __c)
-{
-  return __arm_vcmlaq_rot270_f16 (__a, __b, __c);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot90 (float16x8_t __a, float16x8_t __b, float16x8_t __c)
-{
-  return __arm_vcmlaq_rot90_f16 (__a, __b, __c);
-}
-
 __extension__ extern __inline int16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vcvtmq_m (int16x8_t __inactive, float16x8_t __a, mve_pred16_t __p)
@@ -8565,34 +8401,6 @@ __arm_vcvtq_m (uint16x8_t __inactive, float16x8_t __a, mve_pred16_t __p)
   return __arm_vcvtq_m_u16_f16 (__inactive, __a, __p);
 }
 
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq (float32x4_t __a, float32x4_t __b, float32x4_t __c)
-{
-  return __arm_vcmlaq_f32 (__a, __b, __c);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot180 (float32x4_t __a, float32x4_t __b, float32x4_t __c)
-{
-  return __arm_vcmlaq_rot180_f32 (__a, __b, __c);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot270 (float32x4_t __a, float32x4_t __b, float32x4_t __c)
-{
-  return __arm_vcmlaq_rot270_f32 (__a, __b, __c);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot90 (float32x4_t __a, float32x4_t __b, float32x4_t __c)
-{
-  return __arm_vcmlaq_rot90_f32 (__a, __b, __c);
-}
-
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vcvtmq_m (int32x4_t __inactive, float32x4_t __a, mve_pred16_t __p)
@@ -8691,62 +8499,6 @@ __arm_vbicq_m (float16x8_t __inactive, float16x8_t __a, float16x8_t __b, mve_pre
   return __arm_vbicq_m_f16 (__inactive, __a, __b, __p);
 }
 
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_m (float32x4_t __a, float32x4_t __b, float32x4_t __c, mve_pred16_t __p)
-{
-  return __arm_vcmlaq_m_f32 (__a, __b, __c, __p);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_m (float16x8_t __a, float16x8_t __b, float16x8_t __c, mve_pred16_t __p)
-{
-  return __arm_vcmlaq_m_f16 (__a, __b, __c, __p);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot180_m (float32x4_t __a, float32x4_t __b, float32x4_t __c, mve_pred16_t __p)
-{
-  return __arm_vcmlaq_rot180_m_f32 (__a, __b, __c, __p);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot180_m (float16x8_t __a, float16x8_t __b, float16x8_t __c, mve_pred16_t __p)
-{
-  return __arm_vcmlaq_rot180_m_f16 (__a, __b, __c, __p);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot270_m (float32x4_t __a, float32x4_t __b, float32x4_t __c, mve_pred16_t __p)
-{
-  return __arm_vcmlaq_rot270_m_f32 (__a, __b, __c, __p);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot270_m (float16x8_t __a, float16x8_t __b, float16x8_t __c, mve_pred16_t __p)
-{
-  return __arm_vcmlaq_rot270_m_f16 (__a, __b, __c, __p);
-}
-
-__extension__ extern __inline float32x4_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot90_m (float32x4_t __a, float32x4_t __b, float32x4_t __c, mve_pred16_t __p)
-{
-  return __arm_vcmlaq_rot90_m_f32 (__a, __b, __c, __p);
-}
-
-__extension__ extern __inline float16x8_t
-__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vcmlaq_rot90_m (float16x8_t __a, float16x8_t __b, float16x8_t __c, mve_pred16_t __p)
-{
-  return __arm_vcmlaq_rot90_m_f16 (__a, __b, __c, __p);
-}
-
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vcvtq_m_n (int32x4_t __inactive, float32x4_t __a, const int __imm6, mve_pred16_t __p)
@@ -9620,34 +9372,6 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_uint16x8_t]: __arm_vcvtq_m_n_f16_u16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, uint16x8_t), p2, p3), \
   int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vcvtq_m_n_f32_u32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2, p3));})
 
-#define __arm_vcmlaq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
-  __typeof(p1) __p1 = (p1); \
-  __typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmlaq_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t)), \
-  int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmlaq_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t)));})
-
-#define __arm_vcmlaq_rot180(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
-  __typeof(p1) __p1 = (p1); \
-  __typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmlaq_rot180_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t)), \
-  int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmlaq_rot180_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t)));})
-
-#define __arm_vcmlaq_rot270(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
-  __typeof(p1) __p1 = (p1); \
-  __typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmlaq_rot270_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t)), \
-  int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmlaq_rot270_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t)));})
-
-#define __arm_vcmlaq_rot90(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
-  __typeof(p1) __p1 = (p1); \
-  __typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmlaq_rot90_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t)), \
-  int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmlaq_rot90_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t)));})
-
 #define __arm_vcvtbq_m(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
   __typeof(p1) __p1 = (p1); \
   _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \
@@ -9697,34 +9421,6 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vbicq_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \
   int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vbicq_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));})
 
-#define __arm_vcmlaq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \
-  __typeof(p1) __p1 = (p1); \
-  __typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmlaq_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \
-  int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmlaq_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));})
-
-#define __arm_vcmlaq_rot180_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \
-  __typeof(p1) __p1 = (p1); \
-  __typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmlaq_rot180_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \
-  int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmlaq_rot180_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));})
-
-#define __arm_vcmlaq_rot270_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \
-  __typeof(p1) __p1 = (p1); \
-  __typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmlaq_rot270_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \
-  int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmlaq_rot270_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));})
-
-#define __arm_vcmlaq_rot90_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \
-  __typeof(p1) __p1 = (p1); \
-  __typeof(p2) __p2 = (p2); \
-  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
-  int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmlaq_rot90_m_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \
-  int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmlaq_rot90_m_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));})
-
 #define __arm_vornq_m(p0,p1,p2,p3) ({ __typeof(p0) __p0 = (p0); \
   __typeof(p1) __p1 = (p1); \
   __typeof(p2) __p2 = (p2); \