From: Richard Sandiford <richard.sandiford@arm.com>
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [pushed v2 10/25] aarch64: Add tuple forms of svreinterpret
Date: Tue, 5 Dec 2023 10:13:08 +0000
Message-Id: <20231205101323.1914247-11-richard.sandiford@arm.com>
In-Reply-To: <20231205101323.1914247-1-richard.sandiford@arm.com>
References: <20231205101323.1914247-1-richard.sandiford@arm.com>

SME2 adds a number of intrinsics that operate on tuples of 2 and 4
vectors.
The ACLE therefore extends the existing svreinterpret intrinsics to
handle tuples as well.

gcc/
	* config/aarch64/aarch64-sve-builtins-base.cc
	(svreinterpret_impl::fold): Punt on tuple forms.
	(svreinterpret_impl::expand): Use tuple_mode instead of vector_mode.
	* config/aarch64/aarch64-sve-builtins-base.def (svreinterpret):
	Extend to x1234 groups.
	* config/aarch64/aarch64-sve-builtins-functions.h
	(multi_vector_function::vectors_per_tuple): If the function has
	a group suffix, get the number of vectors from there.
	* config/aarch64/aarch64-sve-builtins-shapes.h (reinterpret):
	Declare.
	* config/aarch64/aarch64-sve-builtins-shapes.cc (reinterpret_def)
	(reinterpret): New function shape.
	* config/aarch64/aarch64-sve-builtins.cc (function_groups): Handle
	DEF_SVE_FUNCTION_GS.
	* config/aarch64/aarch64-sve-builtins.def (DEF_SVE_FUNCTION_GS): New
	macro.
	(DEF_SVE_FUNCTION): Forward to DEF_SVE_FUNCTION_GS by default.
	* config/aarch64/aarch64-sve-builtins.h
	(function_instance::tuple_mode): New member function.
	(function_base::vectors_per_tuple): Take the function instance
	as argument and get the number from the group suffix.
	(function_instance::vectors_per_tuple): Update accordingly.
	* config/aarch64/iterators.md (SVE_FULLx2, SVE_FULLx3, SVE_FULLx4)
	(SVE_ALL_STRUCT): New mode iterators.
	(SVE_STRUCT): Redefine in terms of SVE_FULL*.
	* config/aarch64/aarch64-sve.md (@aarch64_sve_reinterpret)
	(*aarch64_sve_reinterpret): Extend to SVE structure modes.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/asm/test_sve_acle.h (TEST_DUAL_XN):
	New macro.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c: Add tests for
	tuple forms.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c: Likewise.
	* gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c: Likewise.
---
 .../aarch64/aarch64-sve-builtins-base.cc      |  5 +-
 .../aarch64/aarch64-sve-builtins-base.def     |  2 +-
 .../aarch64/aarch64-sve-builtins-functions.h  |  7 ++-
 .../aarch64/aarch64-sve-builtins-shapes.cc    | 28 +++++++++
 .../aarch64/aarch64-sve-builtins-shapes.h     |  1 +
 gcc/config/aarch64/aarch64-sve-builtins.cc    |  8 ++-
 gcc/config/aarch64/aarch64-sve-builtins.def   |  8 ++-
 gcc/config/aarch64/aarch64-sve-builtins.h     | 20 +++++-
 gcc/config/aarch64/aarch64-sve.md             |  8 +--
 gcc/config/aarch64/iterators.md               | 26 +++++---
 .../aarch64/sve/acle/asm/reinterpret_bf16.c   | 62 +++++++++++++++++++
 .../aarch64/sve/acle/asm/reinterpret_f16.c    | 62 +++++++++++++++++++
 .../aarch64/sve/acle/asm/reinterpret_f32.c    | 62 +++++++++++++++++++
 .../aarch64/sve/acle/asm/reinterpret_f64.c    | 62 +++++++++++++++++++
 .../aarch64/sve/acle/asm/reinterpret_s16.c    | 62 +++++++++++++++++++
 .../aarch64/sve/acle/asm/reinterpret_s32.c    | 62 +++++++++++++++++++
 .../aarch64/sve/acle/asm/reinterpret_s64.c    | 62 +++++++++++++++++++
 .../aarch64/sve/acle/asm/reinterpret_s8.c     | 62 +++++++++++++++++++
 .../aarch64/sve/acle/asm/reinterpret_u16.c    | 62 +++++++++++++++++++
 .../aarch64/sve/acle/asm/reinterpret_u32.c    | 62 +++++++++++++++++++
 .../aarch64/sve/acle/asm/reinterpret_u64.c    | 62 +++++++++++++++++++
 .../aarch64/sve/acle/asm/reinterpret_u8.c     | 62 +++++++++++++++++++
 .../aarch64/sve/acle/asm/test_sve_acle.h      | 14 +++++
 23 files changed, 851 insertions(+), 20 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 6e108de54ea..a219c88085a 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -2148,6 +2148,9 @@ public:
   gimple *
   fold (gimple_folder &f) const override
   {
+    if (f.vectors_per_tuple () > 1)
+      return NULL;
+
     /* Punt to rtl if the effect of the reinterpret on registers does not
        conform to GCC's endianness model.  */
     if (!targetm.can_change_mode_class (f.vector_mode (0),
@@ -2164,7 +2167,7 @@ public:
   rtx
   expand (function_expander &e) const override
   {
-    machine_mode mode = e.vector_mode (0);
+    machine_mode mode = e.tuple_mode (0);
     return e.use_exact_insn (code_for_aarch64_sve_reinterpret (mode));
   }
 };
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.def b/gcc/config/aarch64/aarch64-sve-builtins-base.def
index 0484863d3f7..4e31f67ac47 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.def
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.def
@@ -248,7 +248,7 @@ DEF_SVE_FUNCTION (svrdffr, rdffr, none, z_or_none)
 DEF_SVE_FUNCTION (svrecpe, unary, all_float, none)
 DEF_SVE_FUNCTION (svrecps, binary, all_float, none)
 DEF_SVE_FUNCTION (svrecpx, unary, all_float, mxz)
-DEF_SVE_FUNCTION (svreinterpret, unary_convert, reinterpret, none)
+DEF_SVE_FUNCTION_GS (svreinterpret, reinterpret, reinterpret, x1234, none)
 DEF_SVE_FUNCTION (svrev, unary, all_data, none)
 DEF_SVE_FUNCTION (svrev, unary_pred, all_pred, none)
 DEF_SVE_FUNCTION (svrevb, unary, hsd_integer, mxz)
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-functions.h b/gcc/config/aarch64/aarch64-sve-builtins-functions.h
index 2729877d914..4a10102038a 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-functions.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins-functions.h
@@ -48,8 +48,13 @@ public:
     : m_vectors_per_tuple (vectors_per_tuple) {}

   unsigned int
-  vectors_per_tuple () const override
+  vectors_per_tuple (const function_instance &fi) const override
   {
+    if (fi.group_suffix_id != GROUP_none)
+      {
+	gcc_checking_assert (m_vectors_per_tuple == 1);
+	return fi.group_suffix ().vectors_per_tuple;
+      }
     return m_vectors_per_tuple;
   }
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
index 86ec29a5caf..2c25b122f05 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.cc
@@ -2400,6 +2400,34 @@ struct reduction_wide_def : public overloaded_base<0>
 };
 SHAPE (reduction_wide)

+/* sv<t0>x<g>_t svfoo_t0[_t1_g](sv<t1>x<g>_t)
+
+   where the target type <t0> must be specified explicitly but the source
+   type <t1> can be inferred.  */
+struct reinterpret_def : public overloaded_base<1>
+{
+  bool explicit_group_suffix_p () const override { return false; }
+
+  void
+  build (function_builder &b, const function_group_info &group) const override
+  {
+    b.add_overloaded_functions (group, MODE_none);
+    build_all (b, "t0,t1", group, MODE_none);
+  }
+
+  tree
+  resolve (function_resolver &r) const override
+  {
+    sve_type type;
+    if (!r.check_num_arguments (1)
+	|| !(type = r.infer_sve_type (0)))
+      return error_mark_node;
+
+    return r.resolve_to (r.mode_suffix_id, type);
+  }
+};
+SHAPE (reinterpret)
+
 /* svxN_t svfoo[_t0](svxN_t, uint64_t, sv<t0>_t)

    where the second argument is an integer constant expression in the
diff --git a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h
index 7483c1d04b8..38d494761ae 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-shapes.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins-shapes.h
@@ -133,6 +133,7 @@ namespace aarch64_sve
     extern const function_shape *const rdffr;
     extern const function_shape *const reduction;
     extern const function_shape *const reduction_wide;
+    extern const function_shape *const reinterpret;
     extern const function_shape *const set;
     extern const function_shape *const setffr;
     extern const function_shape *const shift_left_imm_long;
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 55bd2662d1a..ecee554a890 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -494,6 +494,10 @@
 static const group_suffix_index groups_none[] = {
   GROUP_none, NUM_GROUP_SUFFIXES
 };

+static const group_suffix_index groups_x1234[] = {
+  GROUP_none, GROUP_x2, GROUP_x3, GROUP_x4, NUM_GROUP_SUFFIXES
+};
+
 /* Used by functions that have no governing predicate.  */
 static const predication_index preds_none[] = { PRED_none, NUM_PREDS };
@@ -534,8 +538,8 @@ static const predication_index preds_z[] = { PRED_z, NUM_PREDS };

 /* A list of all SVE ACLE functions.  */
 static CONSTEXPR const function_group_info function_groups[] = {
-#define DEF_SVE_FUNCTION(NAME, SHAPE, TYPES, PREDS) \
-  { #NAME, &functions::NAME, &shapes::SHAPE, types_##TYPES, groups_none, \
+#define DEF_SVE_FUNCTION_GS(NAME, SHAPE, TYPES, GROUPS, PREDS) \
+  { #NAME, &functions::NAME, &shapes::SHAPE, types_##TYPES, groups_##GROUPS, \
     preds_##PREDS, REQUIRED_EXTENSIONS },
 #include "aarch64-sve-builtins.def"
 };
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.def b/gcc/config/aarch64/aarch64-sve-builtins.def
index 5fbd486d74e..14d12f07415 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.def
+++ b/gcc/config/aarch64/aarch64-sve-builtins.def
@@ -33,8 +33,13 @@
 #define DEF_SVE_GROUP_SUFFIX(A, B, C)
 #endif

+#ifndef DEF_SVE_FUNCTION_GS
+#define DEF_SVE_FUNCTION_GS(A, B, C, D, E)
+#endif
+
 #ifndef DEF_SVE_FUNCTION
-#define DEF_SVE_FUNCTION(A, B, C, D)
+#define DEF_SVE_FUNCTION(NAME, SHAPE, TYPES, PREDS) \
+  DEF_SVE_FUNCTION_GS (NAME, SHAPE, TYPES, none, PREDS)
 #endif

 DEF_SVE_MODE (n, none, none, none)
@@ -107,6 +112,7 @@ DEF_SVE_GROUP_SUFFIX (x4, 0, 4)
 #include "aarch64-sve-builtins-sve2.def"

 #undef DEF_SVE_FUNCTION
+#undef DEF_SVE_FUNCTION_GS
 #undef DEF_SVE_GROUP_SUFFIX
 #undef DEF_SVE_TYPE_SUFFIX
 #undef DEF_SVE_TYPE
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h
index 0b40ad7b7cd..e770a4042fe 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins.h
@@ -364,6 +364,7 @@ public:
   tree tuple_type (unsigned int) const;
   unsigned int elements_per_vq (unsigned int i) const;
   machine_mode vector_mode (unsigned int) const;
+  machine_mode tuple_mode (unsigned int) const;
   machine_mode gp_mode (unsigned int) const;

   /* The properties of the function.  */
@@ -666,7 +667,7 @@ public:
   /* If the function operates on tuples of vectors, return the number
      of vectors in the tuples, otherwise return 1.  */
-  virtual unsigned int vectors_per_tuple () const { return 1; }
+  virtual unsigned int vectors_per_tuple (const function_instance &) const;

   /* If the function addresses memory, return the type of a single
      scalar memory element.  */
@@ -841,7 +842,7 @@ function_instance::operator!= (const function_instance &other) const
 inline unsigned int
 function_instance::vectors_per_tuple () const
 {
-  return base->vectors_per_tuple ();
+  return base->vectors_per_tuple (*this);
 }

 /* If the function addresses memory, return the type of a single
@@ -945,6 +946,15 @@ function_instance::vector_mode (unsigned int i) const
   return type_suffix (i).vector_mode;
 }

+/* Return the mode of tuple_type (I).  */
+inline machine_mode
+function_instance::tuple_mode (unsigned int i) const
+{
+  if (group_suffix ().vectors_per_tuple > 1)
+    return TYPE_MODE (tuple_type (i));
+  return vector_mode (i);
+}
+
 /* Return the mode of the governing predicate to use when operating on
    type suffix I.  */
 inline machine_mode
@@ -971,6 +981,12 @@ function_base::call_properties (const function_instance &instance) const
   return flags;
 }

+inline unsigned int
+function_base::vectors_per_tuple (const function_instance &instance) const
+{
+  return instance.group_suffix ().vectors_per_tuple;
+}
+
 /* Return the mode of the result of a call.  */
 inline machine_mode
 function_expander::result_mode () const
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index cfadac4f1be..e9cebffe3e0 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -787,8 +787,8 @@ (define_insn_and_split "*aarch64_sve_mov<mode>_subreg_be"
 ;; This is equivalent to a subreg on little-endian targets but not for
 ;; big-endian; see the comment at the head of the file for details.
 (define_expand "@aarch64_sve_reinterpret<mode>"
-  [(set (match_operand:SVE_ALL 0 "register_operand")
-	(unspec:SVE_ALL
+  [(set (match_operand:SVE_ALL_STRUCT 0 "register_operand")
+	(unspec:SVE_ALL_STRUCT
 	  [(match_operand 1 "aarch64_any_register_operand")]
 	  UNSPEC_REINTERPRET))]
   "TARGET_SVE"
@@ -805,8 +805,8 @@ (define_expand "@aarch64_sve_reinterpret<mode>"
 ;; A pattern for handling type punning on big-endian targets.  We use a
 ;; special predicate for operand 1 to reduce the number of patterns.
 (define_insn_and_split "*aarch64_sve_reinterpret<mode>"
-  [(set (match_operand:SVE_ALL 0 "register_operand" "=w")
-	(unspec:SVE_ALL
+  [(set (match_operand:SVE_ALL_STRUCT 0 "register_operand" "=w")
+	(unspec:SVE_ALL_STRUCT
 	  [(match_operand 1 "aarch64_any_register_operand" "w")]
 	  UNSPEC_REINTERPRET))]
   "TARGET_SVE"
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index a920de99ffc..e7aa7e35ae1 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -430,14 +430,6 @@ (define_mode_iterator VNx4SF_ONLY [VNx4SF])
 (define_mode_iterator VNx2DI_ONLY [VNx2DI])
 (define_mode_iterator VNx2DF_ONLY [VNx2DF])

-;; All SVE vector structure modes.
-(define_mode_iterator SVE_STRUCT [VNx32QI VNx16HI VNx8SI VNx4DI
-				  VNx16BF VNx16HF VNx8SF VNx4DF
-				  VNx48QI VNx24HI VNx12SI VNx6DI
-				  VNx24BF VNx24HF VNx12SF VNx6DF
-				  VNx64QI VNx32HI VNx16SI VNx8DI
-				  VNx32BF VNx32HF VNx16SF VNx8DF])
-
 ;; All fully-packed SVE vector modes.
 (define_mode_iterator SVE_FULL [VNx16QI VNx8HI VNx4SI VNx2DI
 				VNx8BF VNx8HF VNx4SF VNx2DF])
@@ -509,6 +501,24 @@ (define_mode_iterator SVE_ALL [VNx16QI VNx8QI VNx4QI VNx2QI
 			       VNx2DI
 			       VNx2DF])

+;; All SVE 2-vector modes.
+(define_mode_iterator SVE_FULLx2 [VNx32QI VNx16HI VNx8SI VNx4DI
+				  VNx16BF VNx16HF VNx8SF VNx4DF])
+
+;; All SVE 3-vector modes.
+(define_mode_iterator SVE_FULLx3 [VNx48QI VNx24HI VNx12SI VNx6DI
+				  VNx24BF VNx24HF VNx12SF VNx6DF])
+
+;; All SVE 4-vector modes.
+(define_mode_iterator SVE_FULLx4 [VNx64QI VNx32HI VNx16SI VNx8DI
+				  VNx32BF VNx32HF VNx16SF VNx8DF])
+
+;; All SVE vector structure modes.
+(define_mode_iterator SVE_STRUCT [SVE_FULLx2 SVE_FULLx3 SVE_FULLx4])
+
+;; All SVE vector and structure modes.
+(define_mode_iterator SVE_ALL_STRUCT [SVE_ALL SVE_STRUCT])
+
 ;; All SVE integer vector modes.
 (define_mode_iterator SVE_I [VNx16QI VNx8QI VNx4QI VNx2QI
 			     VNx8HI VNx4HI VNx2HI
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c
index 2d2c2a714b9..dd0daf2eff0 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_bf16.c
@@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_bf16_u64_tied1, svbfloat16_t, svuint64_t,
 TEST_DUAL_Z (reinterpret_bf16_u64_untied, svbfloat16_t, svuint64_t,
	     z0 = svreinterpret_bf16_u64 (z4),
	     z0 = svreinterpret_bf16 (z4))
+
+/*
+** reinterpret_bf16_bf16_x2_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_bf16_bf16_x2_tied1, svbfloat16x2_t, svbfloat16x2_t,
+		 z0_res = svreinterpret_bf16_bf16_x2 (z0),
+		 z0_res = svreinterpret_bf16 (z0))
+
+/*
+** reinterpret_bf16_f32_x2_untied:
+** (
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** |
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** )
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_bf16_f32_x2_untied, svbfloat16x2_t, svfloat32x2_t, z0,
+	      svreinterpret_bf16_f32_x2 (z4),
+	      svreinterpret_bf16 (z4))
+
+/*
+** reinterpret_bf16_s64_x3_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_bf16_s64_x3_tied1, svbfloat16x3_t, svint64x3_t,
+		 z0_res = svreinterpret_bf16_s64_x3 (z0),
+		 z0_res = svreinterpret_bf16 (z0))
+
+/*
+** reinterpret_bf16_u8_x3_untied:
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_bf16_u8_x3_untied, svbfloat16x3_t, svuint8x3_t, z18,
+	      svreinterpret_bf16_u8_x3 (z23),
+	      svreinterpret_bf16 (z23))
+
+/*
+** reinterpret_bf16_u32_x4_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_bf16_u32_x4_tied1, svbfloat16x4_t, svuint32x4_t,
+		 z0_res = svreinterpret_bf16_u32_x4 (z0),
+		 z0_res = svreinterpret_bf16 (z0))
+
+/*
+** reinterpret_bf16_f64_x4_untied:
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_bf16_f64_x4_untied, svbfloat16x4_t, svfloat64x4_t, z28,
+	      svreinterpret_bf16_f64_x4 (z4),
+	      svreinterpret_bf16 (z4))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c
index 60705e62879..9b6f8227d2a 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f16.c
@@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_f16_u64_tied1, svfloat16_t, svuint64_t,
 TEST_DUAL_Z (reinterpret_f16_u64_untied, svfloat16_t, svuint64_t,
	     z0 = svreinterpret_f16_u64 (z4),
	     z0 = svreinterpret_f16 (z4))
+
+/*
+** reinterpret_f16_bf16_x2_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_f16_bf16_x2_tied1, svfloat16x2_t, svbfloat16x2_t,
+		 z0_res = svreinterpret_f16_bf16_x2 (z0),
+		 z0_res = svreinterpret_f16 (z0))
+
+/*
+** reinterpret_f16_f32_x2_untied:
+** (
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** |
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** )
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_f16_f32_x2_untied, svfloat16x2_t, svfloat32x2_t, z0,
+	      svreinterpret_f16_f32_x2 (z4),
+	      svreinterpret_f16 (z4))
+
+/*
+** reinterpret_f16_s64_x3_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_f16_s64_x3_tied1, svfloat16x3_t, svint64x3_t,
+		 z0_res = svreinterpret_f16_s64_x3 (z0),
+		 z0_res = svreinterpret_f16 (z0))
+
+/*
+** reinterpret_f16_u8_x3_untied:
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_f16_u8_x3_untied, svfloat16x3_t, svuint8x3_t, z18,
+	      svreinterpret_f16_u8_x3 (z23),
+	      svreinterpret_f16 (z23))
+
+/*
+** reinterpret_f16_u32_x4_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_f16_u32_x4_tied1, svfloat16x4_t, svuint32x4_t,
+		 z0_res = svreinterpret_f16_u32_x4 (z0),
+		 z0_res = svreinterpret_f16 (z0))
+
+/*
+** reinterpret_f16_f64_x4_untied:
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_f16_f64_x4_untied, svfloat16x4_t, svfloat64x4_t, z28,
+	      svreinterpret_f16_f64_x4 (z4),
+	      svreinterpret_f16 (z4))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c
index 06fc46f25de..ce981fce9d8 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f32.c
@@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_f32_u64_tied1, svfloat32_t, svuint64_t,
 TEST_DUAL_Z (reinterpret_f32_u64_untied, svfloat32_t, svuint64_t,
	     z0 = svreinterpret_f32_u64 (z4),
	     z0 = svreinterpret_f32 (z4))
+
+/*
+** reinterpret_f32_bf16_x2_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_f32_bf16_x2_tied1, svfloat32x2_t, svbfloat16x2_t,
+		 z0_res = svreinterpret_f32_bf16_x2 (z0),
+		 z0_res = svreinterpret_f32 (z0))
+
+/*
+** reinterpret_f32_f32_x2_untied:
+** (
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** |
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** )
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_f32_f32_x2_untied, svfloat32x2_t, svfloat32x2_t, z0,
+	      svreinterpret_f32_f32_x2 (z4),
+	      svreinterpret_f32 (z4))
+
+/*
+** reinterpret_f32_s64_x3_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_f32_s64_x3_tied1, svfloat32x3_t, svint64x3_t,
+		 z0_res = svreinterpret_f32_s64_x3 (z0),
+		 z0_res = svreinterpret_f32 (z0))
+
+/*
+** reinterpret_f32_u8_x3_untied:
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_f32_u8_x3_untied, svfloat32x3_t, svuint8x3_t, z18,
+	      svreinterpret_f32_u8_x3 (z23),
+	      svreinterpret_f32 (z23))
+
+/*
+** reinterpret_f32_u32_x4_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_f32_u32_x4_tied1, svfloat32x4_t, svuint32x4_t,
+		 z0_res = svreinterpret_f32_u32_x4 (z0),
+		 z0_res = svreinterpret_f32 (z0))
+
+/*
+** reinterpret_f32_f64_x4_untied:
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_f32_f64_x4_untied, svfloat32x4_t, svfloat64x4_t, z28,
+	      svreinterpret_f32_f64_x4 (z4),
+	      svreinterpret_f32 (z4))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c
index 003ee3fe220..4f51824ab7e 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_f64.c
@@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_f64_u64_tied1, svfloat64_t, svuint64_t,
 TEST_DUAL_Z (reinterpret_f64_u64_untied, svfloat64_t, svuint64_t,
	     z0 = svreinterpret_f64_u64 (z4),
	     z0 = svreinterpret_f64 (z4))
+
+/*
+** reinterpret_f64_bf16_x2_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_f64_bf16_x2_tied1, svfloat64x2_t, svbfloat16x2_t,
+		 z0_res = svreinterpret_f64_bf16_x2 (z0),
+		 z0_res = svreinterpret_f64 (z0))
+
+/*
+** reinterpret_f64_f32_x2_untied:
+** (
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** |
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** )
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_f64_f32_x2_untied, svfloat64x2_t, svfloat32x2_t, z0,
+	      svreinterpret_f64_f32_x2 (z4),
+	      svreinterpret_f64 (z4))
+
+/*
+** reinterpret_f64_s64_x3_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_f64_s64_x3_tied1, svfloat64x3_t, svint64x3_t,
+		 z0_res = svreinterpret_f64_s64_x3 (z0),
+		 z0_res = svreinterpret_f64 (z0))
+
+/*
+** reinterpret_f64_u8_x3_untied:
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_f64_u8_x3_untied, svfloat64x3_t, svuint8x3_t, z18,
+	      svreinterpret_f64_u8_x3 (z23),
+	      svreinterpret_f64 (z23))
+
+/*
+** reinterpret_f64_u32_x4_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_f64_u32_x4_tied1, svfloat64x4_t, svuint32x4_t,
+		 z0_res = svreinterpret_f64_u32_x4 (z0),
+		 z0_res = svreinterpret_f64 (z0))
+
+/*
+** reinterpret_f64_f64_x4_untied:
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_f64_f64_x4_untied, svfloat64x4_t, svfloat64x4_t, z28,
+	      svreinterpret_f64_f64_x4 (z4),
+	      svreinterpret_f64 (z4))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c
index d62817c2cac..7e15f3e9bd3 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s16.c
@@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_s16_u64_tied1, svint16_t, svuint64_t,
 TEST_DUAL_Z (reinterpret_s16_u64_untied, svint16_t, svuint64_t,
	     z0 = svreinterpret_s16_u64 (z4),
	     z0 = svreinterpret_s16 (z4))
+
+/*
+** reinterpret_s16_bf16_x2_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_s16_bf16_x2_tied1, svint16x2_t, svbfloat16x2_t,
+		 z0_res = svreinterpret_s16_bf16_x2 (z0),
+		 z0_res = svreinterpret_s16 (z0))
+
+/*
+** reinterpret_s16_f32_x2_untied:
+** (
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** |
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** )
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_s16_f32_x2_untied, svint16x2_t, svfloat32x2_t, z0,
+	      svreinterpret_s16_f32_x2 (z4),
+	      svreinterpret_s16 (z4))
+
+/*
+** reinterpret_s16_s64_x3_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_s16_s64_x3_tied1, svint16x3_t, svint64x3_t,
+		 z0_res = svreinterpret_s16_s64_x3 (z0),
+		 z0_res = svreinterpret_s16 (z0))
+
+/*
+** reinterpret_s16_u8_x3_untied:
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_s16_u8_x3_untied, svint16x3_t, svuint8x3_t, z18,
+	      svreinterpret_s16_u8_x3 (z23),
+	      svreinterpret_s16 (z23))
+
+/*
+** reinterpret_s16_u32_x4_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_s16_u32_x4_tied1, svint16x4_t, svuint32x4_t,
+		 z0_res = svreinterpret_s16_u32_x4 (z0),
+		 z0_res = svreinterpret_s16 (z0))
+
+/*
+** reinterpret_s16_f64_x4_untied:
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_s16_f64_x4_untied, svint16x4_t, svfloat64x4_t, z28,
+	      svreinterpret_s16_f64_x4 (z4),
+	      svreinterpret_s16 (z4))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c
index e1068f244ed..60da8aef333 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s32.c
@@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_s32_u64_tied1, svint32_t, svuint64_t,
 TEST_DUAL_Z (reinterpret_s32_u64_untied, svint32_t, svuint64_t,
	     z0 = svreinterpret_s32_u64 (z4),
	     z0 = svreinterpret_s32 (z4))
+
+/*
+** reinterpret_s32_bf16_x2_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_s32_bf16_x2_tied1, svint32x2_t, svbfloat16x2_t,
+		 z0_res = svreinterpret_s32_bf16_x2 (z0),
+		 z0_res = svreinterpret_s32 (z0))
+
+/*
+** reinterpret_s32_f32_x2_untied:
+** (
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** |
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** )
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_s32_f32_x2_untied, svint32x2_t, svfloat32x2_t, z0,
+	      svreinterpret_s32_f32_x2 (z4),
+	      svreinterpret_s32 (z4))
+
+/*
+** reinterpret_s32_s64_x3_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_s32_s64_x3_tied1, svint32x3_t, svint64x3_t,
+		 z0_res = svreinterpret_s32_s64_x3 (z0),
+		 z0_res = svreinterpret_s32 (z0))
+
+/*
+** reinterpret_s32_u8_x3_untied:
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_s32_u8_x3_untied, svint32x3_t, svuint8x3_t, z18,
+	      svreinterpret_s32_u8_x3 (z23),
+	      svreinterpret_s32 (z23))
+
+/*
+** reinterpret_s32_u32_x4_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_s32_u32_x4_tied1, svint32x4_t, svuint32x4_t,
+		 z0_res = svreinterpret_s32_u32_x4 (z0),
+		 z0_res = svreinterpret_s32 (z0))
+
+/*
+** reinterpret_s32_f64_x4_untied:
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_s32_f64_x4_untied, svint32x4_t, svfloat64x4_t, z28,
+	      svreinterpret_s32_f64_x4 (z4),
+	      svreinterpret_s32 (z4))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c
index cada7533c53..d705c60dfd7 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s64.c
@@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_s64_u64_tied1, svint64_t, svuint64_t,
 TEST_DUAL_Z (reinterpret_s64_u64_untied, svint64_t, svuint64_t,
	     z0 = svreinterpret_s64_u64 (z4),
	     z0 = svreinterpret_s64 (z4))
+
+/*
+** reinterpret_s64_bf16_x2_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_s64_bf16_x2_tied1, svint64x2_t, svbfloat16x2_t,
+		 z0_res = svreinterpret_s64_bf16_x2 (z0),
+		 z0_res = svreinterpret_s64 (z0))
+
+/*
+** reinterpret_s64_f32_x2_untied:
+** (
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** |
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** )
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_s64_f32_x2_untied, svint64x2_t, svfloat32x2_t, z0,
+	      svreinterpret_s64_f32_x2 (z4),
+	      svreinterpret_s64 (z4))
+
+/*
+** reinterpret_s64_s64_x3_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_s64_s64_x3_tied1, svint64x3_t, svint64x3_t,
+		 z0_res = svreinterpret_s64_s64_x3 (z0),
+		 z0_res = svreinterpret_s64 (z0))
+
+/*
+** reinterpret_s64_u8_x3_untied:
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_s64_u8_x3_untied, svint64x3_t, svuint8x3_t, z18,
+	      svreinterpret_s64_u8_x3 (z23),
+	      svreinterpret_s64 (z23))
+
+/*
+** reinterpret_s64_u32_x4_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_s64_u32_x4_tied1, svint64x4_t, svuint32x4_t,
+		 z0_res = svreinterpret_s64_u32_x4 (z0),
+		 z0_res = svreinterpret_s64 (z0))
+
+/*
+** reinterpret_s64_f64_x4_untied:
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_s64_f64_x4_untied, svint64x4_t, svfloat64x4_t, z28,
+	      svreinterpret_s64_f64_x4 (z4),
+	      svreinterpret_s64 (z4))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c
index 23a40d0bab7..ab90a54d746 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_s8.c
@@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_s8_u64_tied1, svint8_t, svuint64_t,
 TEST_DUAL_Z (reinterpret_s8_u64_untied, svint8_t, svuint64_t,
	     z0 = svreinterpret_s8_u64 (z4),
	     z0 = svreinterpret_s8 (z4))
+
+/*
+** reinterpret_s8_bf16_x2_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_s8_bf16_x2_tied1, svint8x2_t, svbfloat16x2_t,
+		 z0_res = svreinterpret_s8_bf16_x2 (z0),
+		 z0_res = svreinterpret_s8 (z0))
+
+/*
+** reinterpret_s8_f32_x2_untied:
+** (
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** |
+**	mov	z0\.d, z4\.d
+**	mov	z1\.d, z5\.d
+** )
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_s8_f32_x2_untied, svint8x2_t, svfloat32x2_t, z0,
+	      svreinterpret_s8_f32_x2 (z4),
+	      svreinterpret_s8 (z4))
+
+/*
+** reinterpret_s8_s64_x3_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_s8_s64_x3_tied1, svint8x3_t, svint64x3_t,
+		 z0_res = svreinterpret_s8_s64_x3 (z0),
+		 z0_res = svreinterpret_s8 (z0))
+
+/*
+** reinterpret_s8_u8_x3_untied:
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	mov	(z18|z19|z20)\.d, (z23|z24|z25)\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_s8_u8_x3_untied, svint8x3_t, svuint8x3_t, z18,
+	      svreinterpret_s8_u8_x3 (z23),
+	      svreinterpret_s8 (z23))
+
+/*
+** reinterpret_s8_u32_x4_tied1:
+**	ret
+*/
+TEST_DUAL_Z_REV (reinterpret_s8_u32_x4_tied1, svint8x4_t, svuint32x4_t,
+		 z0_res = svreinterpret_s8_u32_x4 (z0),
+		 z0_res = svreinterpret_s8 (z0))
+
+/*
+** reinterpret_s8_f64_x4_untied:
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	mov	(z28|z29|z30|z31)\.d, z[4-7]\.d
+**	ret
+*/
+TEST_DUAL_XN (reinterpret_s8_f64_x4_untied, svint8x4_t, svfloat64x4_t, 
z28, + svreinterpret_s8_f64_x4 (z4), + svreinterpret_s8 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c index 48e8ecaff44..fcfc0eb9da5 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u16.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_u16_u64_tied1, svuint16_t, svuint64_t, TEST_DUAL_Z (reinterpret_u16_u64_untied, svuint16_t, svuint64_t, z0 = svreinterpret_u16_u64 (z4), z0 = svreinterpret_u16 (z4)) + +/* +** reinterpret_u16_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u16_bf16_x2_tied1, svuint16x2_t, svbfloat16x2_t, + z0_res = svreinterpret_u16_bf16_x2 (z0), + z0_res = svreinterpret_u16 (z0)) + +/* +** reinterpret_u16_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_u16_f32_x2_untied, svuint16x2_t, svfloat32x2_t, z0, + svreinterpret_u16_f32_x2 (z4), + svreinterpret_u16 (z4)) + +/* +** reinterpret_u16_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u16_s64_x3_tied1, svuint16x3_t, svint64x3_t, + z0_res = svreinterpret_u16_s64_x3 (z0), + z0_res = svreinterpret_u16 (z0)) + +/* +** reinterpret_u16_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u16_u8_x3_untied, svuint16x3_t, svuint8x3_t, z18, + svreinterpret_u16_u8_x3 (z23), + svreinterpret_u16 (z23)) + +/* +** reinterpret_u16_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u16_u32_x4_tied1, svuint16x4_t, svuint32x4_t, + z0_res = svreinterpret_u16_u32_x4 (z0), + z0_res = svreinterpret_u16 (z0)) + +/* +** reinterpret_u16_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov 
(z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u16_f64_x4_untied, svuint16x4_t, svfloat64x4_t, z28, + svreinterpret_u16_f64_x4 (z4), + svreinterpret_u16 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c index 1d4e857120e..6d7e05857fe 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u32.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_u32_u64_tied1, svuint32_t, svuint64_t, TEST_DUAL_Z (reinterpret_u32_u64_untied, svuint32_t, svuint64_t, z0 = svreinterpret_u32_u64 (z4), z0 = svreinterpret_u32 (z4)) + +/* +** reinterpret_u32_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u32_bf16_x2_tied1, svuint32x2_t, svbfloat16x2_t, + z0_res = svreinterpret_u32_bf16_x2 (z0), + z0_res = svreinterpret_u32 (z0)) + +/* +** reinterpret_u32_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_u32_f32_x2_untied, svuint32x2_t, svfloat32x2_t, z0, + svreinterpret_u32_f32_x2 (z4), + svreinterpret_u32 (z4)) + +/* +** reinterpret_u32_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u32_s64_x3_tied1, svuint32x3_t, svint64x3_t, + z0_res = svreinterpret_u32_s64_x3 (z0), + z0_res = svreinterpret_u32 (z0)) + +/* +** reinterpret_u32_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u32_u8_x3_untied, svuint32x3_t, svuint8x3_t, z18, + svreinterpret_u32_u8_x3 (z23), + svreinterpret_u32 (z23)) + +/* +** reinterpret_u32_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u32_u32_x4_tied1, svuint32x4_t, svuint32x4_t, + z0_res = svreinterpret_u32_u32_x4 (z0), + z0_res = svreinterpret_u32 (z0)) + +/* +** reinterpret_u32_f64_x4_untied: +** mov 
(z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u32_f64_x4_untied, svuint32x4_t, svfloat64x4_t, z28, + svreinterpret_u32_f64_x4 (z4), + svreinterpret_u32 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c index 07af69dce8d..55c0baefb6f 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u64.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_u64_u64_tied1, svuint64_t, svuint64_t, TEST_DUAL_Z (reinterpret_u64_u64_untied, svuint64_t, svuint64_t, z0 = svreinterpret_u64_u64 (z4), z0 = svreinterpret_u64 (z4)) + +/* +** reinterpret_u64_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u64_bf16_x2_tied1, svuint64x2_t, svbfloat16x2_t, + z0_res = svreinterpret_u64_bf16_x2 (z0), + z0_res = svreinterpret_u64 (z0)) + +/* +** reinterpret_u64_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_u64_f32_x2_untied, svuint64x2_t, svfloat32x2_t, z0, + svreinterpret_u64_f32_x2 (z4), + svreinterpret_u64 (z4)) + +/* +** reinterpret_u64_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u64_s64_x3_tied1, svuint64x3_t, svint64x3_t, + z0_res = svreinterpret_u64_s64_x3 (z0), + z0_res = svreinterpret_u64 (z0)) + +/* +** reinterpret_u64_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u64_u8_x3_untied, svuint64x3_t, svuint8x3_t, z18, + svreinterpret_u64_u8_x3 (z23), + svreinterpret_u64 (z23)) + +/* +** reinterpret_u64_u32_x4_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u64_u32_x4_tied1, svuint64x4_t, svuint32x4_t, + 
z0_res = svreinterpret_u64_u32_x4 (z0), + z0_res = svreinterpret_u64 (z0)) + +/* +** reinterpret_u64_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u64_f64_x4_untied, svuint64x4_t, svfloat64x4_t, z28, + svreinterpret_u64_f64_x4 (z4), + svreinterpret_u64 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c index a4c7f4c8d21..f7302196162 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/reinterpret_u8.c @@ -205,3 +205,65 @@ TEST_DUAL_Z_REV (reinterpret_u8_u64_tied1, svuint8_t, svuint64_t, TEST_DUAL_Z (reinterpret_u8_u64_untied, svuint8_t, svuint64_t, z0 = svreinterpret_u8_u64 (z4), z0 = svreinterpret_u8 (z4)) + +/* +** reinterpret_u8_bf16_x2_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u8_bf16_x2_tied1, svuint8x2_t, svbfloat16x2_t, + z0_res = svreinterpret_u8_bf16_x2 (z0), + z0_res = svreinterpret_u8 (z0)) + +/* +** reinterpret_u8_f32_x2_untied: +** ( +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** | +** mov z0\.d, z4\.d +** mov z1\.d, z5\.d +** ) +** ret +*/ +TEST_DUAL_XN (reinterpret_u8_f32_x2_untied, svuint8x2_t, svfloat32x2_t, z0, + svreinterpret_u8_f32_x2 (z4), + svreinterpret_u8 (z4)) + +/* +** reinterpret_u8_s64_x3_tied1: +** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u8_s64_x3_tied1, svuint8x3_t, svint64x3_t, + z0_res = svreinterpret_u8_s64_x3 (z0), + z0_res = svreinterpret_u8 (z0)) + +/* +** reinterpret_u8_u8_x3_untied: +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** mov (z18|z19|z20)\.d, (z23|z24|z25)\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u8_u8_x3_untied, svuint8x3_t, svuint8x3_t, z18, + svreinterpret_u8_u8_x3 (z23), + svreinterpret_u8 (z23)) + +/* +** reinterpret_u8_u32_x4_tied1: 
+** ret +*/ +TEST_DUAL_Z_REV (reinterpret_u8_u32_x4_tied1, svuint8x4_t, svuint32x4_t, + z0_res = svreinterpret_u8_u32_x4 (z0), + z0_res = svreinterpret_u8 (z0)) + +/* +** reinterpret_u8_f64_x4_untied: +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** mov (z28|z29|z30|z31)\.d, z[4-7]\.d +** ret +*/ +TEST_DUAL_XN (reinterpret_u8_f64_x4_untied, svuint8x4_t, svfloat64x4_t, z28, + svreinterpret_u8_f64_x4 (z4), + svreinterpret_u8 (z4)) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h index fbf392b3ed4..2da61ff5c0b 100644 --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/asm/test_sve_acle.h @@ -421,4 +421,18 @@ return z0_res; \ } +#define TEST_DUAL_XN(NAME, TTYPE1, TTYPE2, RES, CODE1, CODE2) \ + PROTO (NAME, void, ()) \ + { \ + register TTYPE1 z0 __asm ("z0"); \ + register TTYPE2 z4 __asm ("z4"); \ + register TTYPE1 z18 __asm ("z18"); \ + register TTYPE2 z23 __asm ("z23"); \ + register TTYPE1 z28 __asm ("z28"); \ + __asm volatile ("" : "=w" (z0), "=w" (z4), "=w" (z18), \ + "=w" (z23), "=w" (z28)); \ + INVOKE (RES = CODE1, RES = CODE2); \ + __asm volatile ("" :: "w" (RES)); \ + } + #endif