From patchwork Tue May 23 10:37:22 2023
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 97956
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 2/2] aarch64: Provide FPR alternatives for some bit insertions [PR109632]
Date: Tue, 23 May 2023 11:37:22 +0100

At -O2, and so with SLP vectorisation enabled:

    struct complx_t { float re, im; };
    complx_t add(complx_t a, complx_t b) {
      return {a.re + b.re, a.im + b.im};
    }

generates:

        fmov    w3, s1
        fmov    x0, d0
        fmov    x1, d2
        fmov    w2, s3
        bfi     x0, x3, 32, 32
        fmov    d31, x0
        bfi     x1, x2, 32, 32
        fmov    d30, x1
        fadd    v31.2s, v31.2s, v30.2s
        fmov    x1, d31
        lsr     x0, x1, 32
        fmov    s1, w0
        lsr     w0, w1, 0
        fmov    s0, w0
        ret

This is because complx_t is passed and returned in FPRs, but GCC
gives it DImode. We therefore “need” to assemble a DImode pseudo
from the two individual floats, bitcast it to a vector, do the
arithmetic, bitcast it back to a DImode pseudo, then extract the
individual floats.

There are many problems here. The most basic is that we shouldn't
use SLP for such a trivial example. But SLP should in principle be
beneficial for more complicated examples, so preventing SLP for the
example above just changes the reproducer needed. A more fundamental
problem is that it doesn't make sense to use single DImode pseudos in a
testcase like this. I have a WIP patch to allow re and im to be stored
in individual SFmode pseudos instead, but it's quite an invasive change
and might end up going nowhere.

A simpler problem to tackle is that we allow DImode pseudos to be stored
in FPRs, but we don't provide any patterns for inserting values into
them, even though INS makes that easy for element-like insertions.
This patch adds some patterns for that.

Doing that showed that aarch64_modes_tieable_p was too strict:
it didn't allow SFmode and DImode values to be tied, even though
both of them occupy a single GPR and FPR, and even though we allow
both classes to change between the modes.

The *aarch64_bfidi_subreg_ pattern is especially ugly, but it's not
clear what target-independent code ought to simplify it to, if it
was going to simplify it.

We should probably do the same thing for extractions, but that's left
as future work.

After the patch we generate:

        ins     v0.s[1], v1.s[0]
        ins     v2.s[1], v3.s[0]
        fadd    v0.2s, v0.2s, v2.2s
        fmov    x0, d0
        ushr    d1, d0, 32
        lsr     w0, w0, 0
        fmov    s0, w0
        ret

which seems like a step in the right direction.

All in all, there's nothing elegant about this patch. It just
seems like the least worst option.

Tested on aarch64-linux-gnu and aarch64_be-elf (including ILP32).
Pushed to trunk.

Richard


gcc/
        PR target/109632
        * config/aarch64/aarch64.cc (aarch64_modes_tieable_p): Allow
        subregs between any scalars that are 64 bits or smaller.
        * config/aarch64/iterators.md (SUBDI_BITS): New int iterator.
        (bits_etype): New int attribute.
        * config/aarch64/aarch64.md (*insv_reg_)
        (*aarch64_bfi_): New patterns.
        (*aarch64_bfidi_subreg_): Likewise.

gcc/testsuite/
        * gcc.target/aarch64/ins_bitfield_1.c: New test.
        * gcc.target/aarch64/ins_bitfield_2.c: Likewise.
        * gcc.target/aarch64/ins_bitfield_3.c: Likewise.
        * gcc.target/aarch64/ins_bitfield_4.c: Likewise.
        * gcc.target/aarch64/ins_bitfield_5.c: Likewise.
        * gcc.target/aarch64/ins_bitfield_6.c: Likewise.
---
 gcc/config/aarch64/aarch64.cc                 |  12 ++
 gcc/config/aarch64/aarch64.md                 |  62 +++++++
 gcc/config/aarch64/iterators.md               |   4 +
 .../gcc.target/aarch64/ins_bitfield_1.c       | 142 ++++++++++++++++
 .../gcc.target/aarch64/ins_bitfield_2.c       | 142 ++++++++++++++++
 .../gcc.target/aarch64/ins_bitfield_3.c       | 156 ++++++++++++++++++
 .../gcc.target/aarch64/ins_bitfield_4.c       | 156 ++++++++++++++++++
 .../gcc.target/aarch64/ins_bitfield_5.c       | 139 ++++++++++++++++
 .../gcc.target/aarch64/ins_bitfield_6.c       | 139 ++++++++++++++++
 9 files changed, 952 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ins_bitfield_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ins_bitfield_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ins_bitfield_3.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ins_bitfield_4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ins_bitfield_5.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ins_bitfield_6.c

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index d6fc94015fa..146c2ad4988 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -24827,6 +24827,18 @@ aarch64_modes_tieable_p (machine_mode mode1, machine_mode mode2)
   if (GET_MODE_CLASS (mode1) == GET_MODE_CLASS (mode2))
     return true;
 
+  /* Allow changes between scalar modes if both modes fit within 64 bits.
+     This is because:
+
+     - We allow all such modes for both FPRs and GPRs.
+     - They occupy a single register for both FPRs and GPRs.
+     - We can reinterpret one mode as another in both types of register. */
+  if (is_a (mode1)
+      && is_a (mode2)
+      && known_le (GET_MODE_SIZE (mode1), 8)
+      && known_le (GET_MODE_SIZE (mode2), 8))
+    return true;
+
   /* We specifically want to allow elements of "structure" modes to
      be tieable to the structure. This more general condition allows
      other rarer situations too. The reason we don't extend this to
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 286f044cb8b..8b8951d7b14 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -5862,6 +5862,25 @@ (define_expand "insv"
   operands[3] = force_reg (mode, value);
 })
 
+(define_insn "*insv_reg_"
+  [(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r,w,?w")
+                          (const_int SUBDI_BITS)
+                          (match_operand 1 "const_int_operand"))
+        (match_operand:GPI 2 "register_operand" "r,w,r"))]
+  "multiple_p (UINTVAL (operands[1]), )
+   && UINTVAL (operands[1]) + <= "
+  {
+    if (which_alternative == 0)
+      return "bfi\t%0, %2, %1, ";
+
+    operands[1] = gen_int_mode (UINTVAL (operands[1]) / , SImode);
+    if (which_alternative == 1)
+      return "ins\t%0.[%1], %2.[0]";
+    return "ins\t%0.[%1], %w2";
+  }
+  [(set_attr "type" "bfm,neon_ins_q,neon_ins_q")]
+)
+
 (define_insn "*insv_reg"
   [(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r")
                           (match_operand 1 "const_int_operand" "n")
@@ -5874,6 +5893,27 @@ (define_insn "*insv_reg"
   [(set_attr "type" "bfm")]
 )
 
+(define_insn_and_split "*aarch64_bfi_"
+  [(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r,w,?w")
+                          (const_int SUBDI_BITS)
+                          (match_operand 1 "const_int_operand"))
+        (zero_extend:GPI (match_operand:ALLX 2 "register_operand" "r,w,r")))]
+  " <=
+   && multiple_p (UINTVAL (operands[1]), )
+   && UINTVAL (operands[1]) + <= "
+  "#"
+  "&& 1"
+  [(set (zero_extract:GPI (match_dup 0)
+                          (const_int SUBDI_BITS)
+                          (match_dup 1))
+        (match_dup 2))]
+  {
+    operands[2] = lowpart_subreg (mode, operands[2],
+                                  mode);
+  }
+  [(set_attr "type" "bfm,neon_ins_q,neon_ins_q")]
+)
+
 (define_insn "*aarch64_bfi4"
   [(set (zero_extract:GPI (match_operand:GPI 0 "register_operand" "+r")
                           (match_operand 1 "const_int_operand" "n")
@@ -5884,6 +5924,28 @@ (define_insn "*aarch64_bfi4"
   [(set_attr "type" "bfm")]
 )
 
+(define_insn_and_split "*aarch64_bfidi_subreg_"
+  [(set (zero_extract:DI (match_operand:DI 0 "register_operand" "+r,w,?w")
+                         (const_int SUBDI_BITS)
+                         (match_operand 1 "const_int_operand"))
+        (match_operator:DI 2 "subreg_lowpart_operator"
+          [(zero_extend:SI
+             (match_operand:ALLX 3 "register_operand" "r,w,r"))]))]
+  " <=
+   && multiple_p (UINTVAL (operands[1]), )
+   && UINTVAL (operands[1]) + <= 64"
+  "#"
+  "&& 1"
+  [(set (zero_extract:DI (match_dup 0)
+                         (const_int SUBDI_BITS)
+                         (match_dup 1))
+        (match_dup 2))]
+  {
+    operands[2] = lowpart_subreg (DImode, operands[3], mode);
+  }
+  [(set_attr "type" "bfm,neon_ins_q,neon_ins_q")]
+)
+
 ;; Match a bfi instruction where the shift of OP3 means that we are
 ;; actually copying the least significant bits of OP3 into OP0 by way
 ;; of the AND masks and the IOR instruction. A similar instruction
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 4f1fd648e7f..8aabdb7c023 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -3174,6 +3174,8 @@ (define_int_attr atomic_ldoptab
  [(UNSPECV_ATOMIC_LDOP_OR "ior") (UNSPECV_ATOMIC_LDOP_BIC "bic")
   (UNSPECV_ATOMIC_LDOP_XOR "xor") (UNSPECV_ATOMIC_LDOP_PLUS "add")])
 
+(define_int_iterator SUBDI_BITS [8 16 32])
+
 ;; -------------------------------------------------------------------
 ;; Int Iterators Attributes.
 ;; -------------------------------------------------------------------
@@ -4004,3 +4006,5 @@ (define_int_attr fpscr_name
    (UNSPECV_SET_FPSR "fpsr")
    (UNSPECV_GET_FPCR "fpcr")
    (UNSPECV_SET_FPCR "fpcr")])
+
+(define_int_attr bits_etype [(8 "b") (16 "h") (32 "s")])
diff --git a/gcc/testsuite/gcc.target/aarch64/ins_bitfield_1.c b/gcc/testsuite/gcc.target/aarch64/ins_bitfield_1.c
new file mode 100644
index 00000000000..592e98b9470
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ins_bitfield_1.c
@@ -0,0 +1,142 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 -mlittle-endian --save-temps" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+typedef unsigned char v16qi __attribute__((vector_size(16)));
+typedef unsigned short v8hi __attribute__((vector_size(16)));
+typedef unsigned int v4si __attribute__((vector_size(16)));
+
+struct di_qi_1 { unsigned char c[4]; unsigned int x; };
+struct di_qi_2 { unsigned int x; unsigned char c[4]; };
+
+struct di_hi_1 { unsigned short s[2]; unsigned int x; };
+struct di_hi_2 { unsigned int x; unsigned short s[2]; };
+
+struct di_si { unsigned int i[2]; };
+
+struct si_qi_1 { unsigned char c[2]; unsigned short x; };
+struct si_qi_2 { unsigned short x; unsigned char c[2]; };
+
+struct si_hi { unsigned short s[2]; };
+
+#define TEST(NAME, STYPE, VTYPE, LHS, RHS) \
+  void \
+  NAME (VTYPE x) \
+  { \
+    register struct STYPE y asm ("v1"); \
+    asm volatile ("" : "=w" (y)); \
+    LHS = RHS; \
+    asm volatile ("" :: "w" (y)); \
+  }
+
+/*
+** f_di_qi_0:
+** ins v1\.b\[0\], v0\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_0, di_qi_1, v16qi, y.c[0], x[0])
+
+/*
+** f_di_qi_1:
+** ins v1\.b\[3\], v0\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_1, di_qi_1, v16qi, y.c[3], x[0])
+
+/*
+** f_di_qi_2:
+** ins v1\.b\[4\], v0\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_2, di_qi_2, v16qi, y.c[0], x[0])
+
+/*
+** f_di_qi_3:
+** ins v1\.b\[7\], v0\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_3, di_qi_2, v16qi, y.c[3], x[0])
+
+/*
+** f_di_hi_0:
+** ins v1\.h\[0\], v0\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_0, di_hi_1, v8hi, y.s[0], x[0])
+
+/*
+** f_di_hi_1:
+** ins v1\.h\[1\], v0\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_1, di_hi_1, v8hi, y.s[1], x[0])
+
+/*
+** f_di_hi_2:
+** ins v1\.h\[2\], v0\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_2, di_hi_2, v8hi, y.s[0], x[0])
+
+/*
+** f_di_hi_3:
+** ins v1\.h\[3\], v0\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_3, di_hi_2, v8hi, y.s[1], x[0])
+
+/*
+** f_di_si_0:
+** ins v1\.s\[0\], v0\.s\[0\]
+** ret
+*/
+TEST (f_di_si_0, di_si, v4si, y.i[0], x[0])
+
+/*
+** f_di_si_1:
+** ins v1\.s\[1\], v0\.s\[0\]
+** ret
+*/
+TEST (f_di_si_1, di_si, v4si, y.i[1], x[0])
+
+/*
+** f_si_qi_0:
+** ins v1\.b\[0\], v0\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_0, si_qi_1, v16qi, y.c[0], x[0])
+
+/*
+** f_si_qi_1:
+** ins v1\.b\[1\], v0\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_1, si_qi_1, v16qi, y.c[1], x[0])
+
+/*
+** f_si_qi_2:
+** ins v1\.b\[2\], v0\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_2, si_qi_2, v16qi, y.c[0], x[0])
+
+/*
+** f_si_qi_3:
+** ins v1\.b\[3\], v0\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_3, si_qi_2, v16qi, y.c[1], x[0])
+
+/*
+** f_si_hi_0:
+** ins v1\.h\[0\], v0\.h\[0\]
+** ret
+*/
+TEST (f_si_hi_0, si_hi, v8hi, y.s[0], x[0])
+
+/*
+** f_si_hi_1:
+** ins v1\.h\[1\], v0\.h\[0\]
+** ret
+*/
+TEST (f_si_hi_1, si_hi, v8hi, y.s[1], x[0])
diff --git a/gcc/testsuite/gcc.target/aarch64/ins_bitfield_2.c b/gcc/testsuite/gcc.target/aarch64/ins_bitfield_2.c
new file mode 100644
index 00000000000..152418889fa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ins_bitfield_2.c
@@ -0,0 +1,142 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 -mbig-endian --save-temps" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+typedef unsigned char v16qi __attribute__((vector_size(16)));
+typedef unsigned short v8hi __attribute__((vector_size(16)));
+typedef unsigned int v4si __attribute__((vector_size(16)));
+
+struct di_qi_1 { unsigned char c[4]; unsigned int x; };
+struct di_qi_2 { unsigned int x; unsigned char c[4]; };
+
+struct di_hi_1 { unsigned short s[2]; unsigned int x; };
+struct di_hi_2 { unsigned int x; unsigned short s[2]; };
+
+struct di_si { unsigned int i[2]; };
+
+struct si_qi_1 { unsigned char c[2]; unsigned short x; };
+struct si_qi_2 { unsigned short x; unsigned char c[2]; };
+
+struct si_hi { unsigned short s[2]; };
+
+#define TEST(NAME, STYPE, VTYPE, LHS, RHS) \
+  void \
+  NAME (VTYPE x) \
+  { \
+    register struct STYPE y asm ("v1"); \
+    asm volatile ("" : "=w" (y)); \
+    LHS = RHS; \
+    asm volatile ("" :: "w" (y)); \
+  }
+
+/*
+** f_di_qi_0:
+** ins v1\.b\[7\], v0\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_0, di_qi_1, v16qi, y.c[0], x[15])
+
+/*
+** f_di_qi_1:
+** ins v1\.b\[4\], v0\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_1, di_qi_1, v16qi, y.c[3], x[15])
+
+/*
+** f_di_qi_2:
+** ins v1\.b\[3\], v0\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_2, di_qi_2, v16qi, y.c[0], x[15])
+
+/*
+** f_di_qi_3:
+** ins v1\.b\[0\], v0\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_3, di_qi_2, v16qi, y.c[3], x[15])
+
+/*
+** f_di_hi_0:
+** ins v1\.h\[3\], v0\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_0, di_hi_1, v8hi, y.s[0], x[7])
+
+/*
+** f_di_hi_1:
+** ins v1\.h\[2\], v0\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_1, di_hi_1, v8hi, y.s[1], x[7])
+
+/*
+** f_di_hi_2:
+** ins v1\.h\[1\], v0\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_2, di_hi_2, v8hi, y.s[0], x[7])
+
+/*
+** f_di_hi_3:
+** ins v1\.h\[0\], v0\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_3, di_hi_2, v8hi, y.s[1], x[7])
+
+/*
+** f_di_si_0:
+** ins v1\.s\[1\], v0\.s\[0\]
+** ret
+*/
+TEST (f_di_si_0, di_si, v4si, y.i[0], x[3])
+
+/*
+** f_di_si_1:
+** ins v1\.s\[0\], v0\.s\[0\]
+** ret
+*/
+TEST (f_di_si_1, di_si, v4si, y.i[1], x[3])
+
+/*
+** f_si_qi_0:
+** ins v1\.b\[3\], v0\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_0, si_qi_1, v16qi, y.c[0], x[15])
+
+/*
+** f_si_qi_1:
+** ins v1\.b\[2\], v0\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_1, si_qi_1, v16qi, y.c[1], x[15])
+
+/*
+** f_si_qi_2:
+** ins v1\.b\[1\], v0\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_2, si_qi_2, v16qi, y.c[0], x[15])
+
+/*
+** f_si_qi_3:
+** ins v1\.b\[0\], v0\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_3, si_qi_2, v16qi, y.c[1], x[15])
+
+/*
+** f_si_hi_0:
+** ins v1\.h\[1\], v0\.h\[0\]
+** ret
+*/
+TEST (f_si_hi_0, si_hi, v8hi, y.s[0], x[7])
+
+/*
+** f_si_hi_1:
+** ins v1\.h\[0\], v0\.h\[0\]
+** ret
+*/
+TEST (f_si_hi_1, si_hi, v8hi, y.s[1], x[7])
diff --git a/gcc/testsuite/gcc.target/aarch64/ins_bitfield_3.c b/gcc/testsuite/gcc.target/aarch64/ins_bitfield_3.c
new file mode 100644
index 00000000000..0ef95a97996
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ins_bitfield_3.c
@@ -0,0 +1,156 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 -mlittle-endian --save-temps" } */
+/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
+
+struct di_qi_1 { unsigned char c[4]; unsigned int x; };
+struct di_qi_2 { unsigned int x; unsigned char c[4]; };
+
+struct di_hi_1 { unsigned short s[2]; unsigned int x; };
+struct di_hi_2 { unsigned int x; unsigned short s[2]; };
+
+struct di_si { unsigned int i[2]; };
+
+struct si_qi_1 { unsigned char c[2]; unsigned short x; };
+struct si_qi_2 { unsigned short x; unsigned char c[2]; };
+
+struct si_hi { unsigned short s[2]; };
+
+#define TEST(NAME, STYPE, ETYPE, LHS) \
+  void \
+  NAME (volatile ETYPE *ptr) \
+  { \
+    register struct STYPE y asm ("v1"); \
+    asm volatile ("" : "=w" (y)); \
+    ETYPE x = *ptr; \
+    __UINT64_TYPE__ value = (ETYPE) x; \
+    LHS = value; \
+    asm volatile ("" :: "w" (y)); \
+  }
+
+/*
+** f_di_qi_0:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[0\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_0, di_qi_1, unsigned char, y.c[0])
+
+/*
+** f_di_qi_1:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[3\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_1, di_qi_1, unsigned char, y.c[3])
+
+/*
+** f_di_qi_2:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[4\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_2, di_qi_2, unsigned char, y.c[0])
+
+/*
+** f_di_qi_3:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[7\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_3, di_qi_2, unsigned char, y.c[3])
+
+/*
+** f_di_hi_0:
+** ldr h([0-9]+), \[x0\]
+** ins v1\.h\[0\], v\1\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_0, di_hi_1, unsigned short, y.s[0])
+
+/*
+** f_di_hi_1:
+** ldr h([0-9]+), \[x0\]
+** ins v1\.h\[1\], v\1\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_1, di_hi_1, unsigned short, y.s[1])
+
+/*
+** f_di_hi_2:
+** ldr h([0-9]+), \[x0\]
+** ins v1\.h\[2\], v\1\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_2, di_hi_2, unsigned short, y.s[0])
+
+/*
+** f_di_hi_3:
+** ldr h([0-9]+), \[x0\]
+** ins v1\.h\[3\], v\1\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_3, di_hi_2, unsigned short, y.s[1])
+
+/*
+** f_di_si_0:
+** ldr s([0-9]+), \[x0\]
+** ins v1\.s\[0\], v\1\.s\[0\]
+** ret
+*/
+TEST (f_di_si_0, di_si, unsigned int, y.i[0])
+
+/*
+** f_di_si_1:
+** ldr s([0-9]+), \[x0\]
+** ins v1\.s\[1\], v\1\.s\[0\]
+** ret
+*/
+TEST (f_di_si_1, di_si, unsigned int, y.i[1])
+
+/*
+** f_si_qi_0:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[0\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_0, si_qi_1, unsigned char, y.c[0])
+
+/*
+** f_si_qi_1:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[1\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_1, si_qi_1, unsigned char, y.c[1])
+
+/*
+** f_si_qi_2:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[2\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_2, si_qi_2, unsigned char, y.c[0])
+
+/*
+** f_si_qi_3:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[3\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_3, si_qi_2, unsigned char, y.c[1])
+
+/*
+** f_si_hi_0:
+** ldr h([0-9]+), \[x0\]
+** ins v1\.h\[0\], v\1\.h\[0\]
+** ret
+*/
+TEST (f_si_hi_0, si_hi, unsigned short, y.s[0])
+
+/*
+** f_si_hi_1:
+** ldr h([0-9]+), \[x0\]
+** ins v1\.h\[1\], v\1\.h\[0\]
+** ret
+*/
+TEST (f_si_hi_1, si_hi, unsigned short, y.s[1])
diff --git a/gcc/testsuite/gcc.target/aarch64/ins_bitfield_4.c b/gcc/testsuite/gcc.target/aarch64/ins_bitfield_4.c
new file mode 100644
index 00000000000..98e25c86959
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ins_bitfield_4.c
@@ -0,0 +1,156 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 -mbig-endian --save-temps" } */
+/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
+
+struct di_qi_1 { unsigned char c[4]; unsigned int x; };
+struct di_qi_2 { unsigned int x; unsigned char c[4]; };
+
+struct di_hi_1 { unsigned short s[2]; unsigned int x; };
+struct di_hi_2 { unsigned int x; unsigned short s[2]; };
+
+struct di_si { unsigned int i[2]; };
+
+struct si_qi_1 { unsigned char c[2]; unsigned short x; };
+struct si_qi_2 { unsigned short x; unsigned char c[2]; };
+
+struct si_hi { unsigned short s[2]; };
+
+#define TEST(NAME, STYPE, ETYPE, LHS) \
+  void \
+  NAME (volatile ETYPE *ptr) \
+  { \
+    register struct STYPE y asm ("v1"); \
+    asm volatile ("" : "=w" (y)); \
+    ETYPE x = *ptr; \
+    __UINT64_TYPE__ value = (ETYPE) x; \
+    LHS = value; \
+    asm volatile ("" :: "w" (y)); \
+  }
+
+/*
+** f_di_qi_0:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[7\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_0, di_qi_1, unsigned char, y.c[0])
+
+/*
+** f_di_qi_1:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[4\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_1, di_qi_1, unsigned char, y.c[3])
+
+/*
+** f_di_qi_2:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[3\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_2, di_qi_2, unsigned char, y.c[0])
+
+/*
+** f_di_qi_3:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[0\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_di_qi_3, di_qi_2, unsigned char, y.c[3])
+
+/*
+** f_di_hi_0:
+** ldr h([0-9]+), \[x0\]
+** ins v1\.h\[3\], v\1\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_0, di_hi_1, unsigned short, y.s[0])
+
+/*
+** f_di_hi_1:
+** ldr h([0-9]+), \[x0\]
+** ins v1\.h\[2\], v\1\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_1, di_hi_1, unsigned short, y.s[1])
+
+/*
+** f_di_hi_2:
+** ldr h([0-9]+), \[x0\]
+** ins v1\.h\[1\], v\1\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_2, di_hi_2, unsigned short, y.s[0])
+
+/*
+** f_di_hi_3:
+** ldr h([0-9]+), \[x0\]
+** ins v1\.h\[0\], v\1\.h\[0\]
+** ret
+*/
+TEST (f_di_hi_3, di_hi_2, unsigned short, y.s[1])
+
+/*
+** f_di_si_0:
+** ldr s([0-9]+), \[x0\]
+** ins v1\.s\[1\], v\1\.s\[0\]
+** ret
+*/
+TEST (f_di_si_0, di_si, unsigned int, y.i[0])
+
+/*
+** f_di_si_1:
+** ldr s([0-9]+), \[x0\]
+** ins v1\.s\[0\], v\1\.s\[0\]
+** ret
+*/
+TEST (f_di_si_1, di_si, unsigned int, y.i[1])
+
+/*
+** f_si_qi_0:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[3\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_0, si_qi_1, unsigned char, y.c[0])
+
+/*
+** f_si_qi_1:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[2\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_1, si_qi_1, unsigned char, y.c[1])
+
+/*
+** f_si_qi_2:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[1\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_2, si_qi_2, unsigned char, y.c[0])
+
+/*
+** f_si_qi_3:
+** ldr b([0-9]+), \[x0\]
+** ins v1\.b\[0\], v\1\.b\[0\]
+** ret
+*/
+TEST (f_si_qi_3, si_qi_2, unsigned char, y.c[1])
+
+/*
+** f_si_hi_0:
+** ldr h([0-9]+), \[x0\]
+** ins v1\.h\[1\], v\1\.h\[0\]
+** ret
+*/
+TEST (f_si_hi_0, si_hi, unsigned short, y.s[0])
+
+/*
+** f_si_hi_1:
+** ldr h([0-9]+), \[x0\]
+** ins v1\.h\[0\], v\1\.h\[0\]
+** ret
+*/
+TEST (f_si_hi_1, si_hi, unsigned short, y.s[1])
diff --git a/gcc/testsuite/gcc.target/aarch64/ins_bitfield_5.c b/gcc/testsuite/gcc.target/aarch64/ins_bitfield_5.c
new file mode 100644
index 00000000000..6debf5419cd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ins_bitfield_5.c
@@ -0,0 +1,139 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 -mlittle-endian --save-temps" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+struct di_qi_1 { unsigned char c[4]; unsigned int x; };
+struct di_qi_2 { unsigned int x; unsigned char c[4]; };
+
+struct di_hi_1 { unsigned short s[2]; unsigned int x; };
+struct di_hi_2 { unsigned int x; unsigned short s[2]; };
+
+struct di_si { unsigned int i[2]; };
+
+struct si_qi_1 { unsigned char c[2]; unsigned short x; };
+struct si_qi_2 { unsigned short x; unsigned char c[2]; };
+
+struct si_hi { unsigned short s[2]; };
+
+#define TEST(NAME, STYPE, ETYPE, LHS) \
+  void \
+  NAME (void) \
+  { \
+    register struct STYPE y asm ("v1"); \
+    register ETYPE x asm ("x0"); \
+    asm volatile ("" : "=w" (y), "=r" (x)); \
+    LHS = x; \
+    asm volatile ("" :: "w" (y)); \
+  }
+
+/*
+** f_di_qi_0:
+** ins v1\.b\[0\], w0
+** ret
+*/
+TEST (f_di_qi_0, di_qi_1, unsigned char, y.c[0])
+
+/*
+** f_di_qi_1:
+** ins v1\.b\[3\], w0
+** ret
+*/
+TEST (f_di_qi_1, di_qi_1, unsigned char, y.c[3])
+
+/*
+** f_di_qi_2:
+** ins v1\.b\[4\], w0
+** ret
+*/
+TEST (f_di_qi_2, di_qi_2, unsigned char, y.c[0])
+
+/*
+** f_di_qi_3:
+** ins v1\.b\[7\], w0
+** ret
+*/
+TEST (f_di_qi_3, di_qi_2, unsigned char, y.c[3])
+
+/*
+** f_di_hi_0:
+** ins v1\.h\[0\], w0
+** ret
+*/
+TEST (f_di_hi_0, di_hi_1, unsigned short, y.s[0])
+
+/*
+** f_di_hi_1:
+** ins v1\.h\[1\], w0
+** ret
+*/
+TEST (f_di_hi_1, di_hi_1, unsigned short, y.s[1])
+
+/*
+** f_di_hi_2:
+** ins v1\.h\[2\], w0
+** ret
+*/
+TEST (f_di_hi_2, di_hi_2, unsigned short, y.s[0])
+
+/*
+** f_di_hi_3:
+** ins v1\.h\[3\], w0
+** ret
+*/
+TEST (f_di_hi_3, di_hi_2, unsigned short, y.s[1])
+
+/*
+** f_di_si_0:
+** ins v1\.s\[0\], w0
+** ret
+*/
+TEST (f_di_si_0, di_si, unsigned int, y.i[0])
+
+/*
+** f_di_si_1:
+** ins v1\.s\[1\], w0
+** ret
+*/
+TEST (f_di_si_1, di_si, unsigned int, y.i[1])
+
+/*
+** f_si_qi_0:
+** ins v1\.b\[0\], w0
+** ret
+*/
+TEST (f_si_qi_0, si_qi_1, unsigned char, y.c[0])
+
+/*
+** f_si_qi_1:
+** ins v1\.b\[1\], w0
+** ret
+*/
+TEST (f_si_qi_1, si_qi_1, unsigned char, y.c[1])
+
+/*
+** f_si_qi_2:
+** ins v1\.b\[2\], w0
+** ret
+*/
+TEST (f_si_qi_2, si_qi_2, unsigned char, y.c[0])
+
+/*
+** f_si_qi_3:
+** ins v1\.b\[3\], w0
+** ret
+*/
+TEST (f_si_qi_3, si_qi_2, unsigned char, y.c[1])
+
+/*
+** f_si_hi_0:
+** ins v1\.h\[0\], w0
+** ret
+*/
+TEST (f_si_hi_0, si_hi, unsigned short, y.s[0])
+
+/*
+** f_si_hi_1:
+** ins v1\.h\[1\], w0
+** ret
+*/
+TEST (f_si_hi_1, si_hi, unsigned short, y.s[1])
diff --git a/gcc/testsuite/gcc.target/aarch64/ins_bitfield_6.c b/gcc/testsuite/gcc.target/aarch64/ins_bitfield_6.c
new file mode 100644
index 00000000000..cb8af6b0623
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ins_bitfield_6.c
@@ -0,0 +1,139 @@
+/* { dg-do assemble } */
+/* { dg-options "-O2 -mbig-endian --save-temps" } */
+/* { dg-final { check-function-bodies "**" "" "" } } */
+
+struct di_qi_1 { unsigned char c[4]; unsigned int x; };
+struct di_qi_2 { unsigned int x; unsigned char c[4]; };
+
+struct di_hi_1 { unsigned short s[2]; unsigned int x; };
+struct di_hi_2 { unsigned int x; unsigned short s[2]; };
+
+struct di_si { unsigned int i[2]; };
+
+struct si_qi_1 { unsigned char c[2]; unsigned short x; };
+struct si_qi_2 { unsigned short x; unsigned char c[2]; };
+
+struct si_hi { unsigned short s[2]; };
+
+#define TEST(NAME, STYPE, ETYPE, LHS) \
+  void \
+  NAME (void) \
+  { \
+    register struct STYPE y asm ("v1"); \
+    register ETYPE x asm ("x0"); \
+    asm volatile ("" : "=w" (y), "=r" (x)); \
+    LHS = x; \
+    asm volatile ("" :: "w" (y)); \
+  }
+
+/*
+** f_di_qi_0:
+** ins v1\.b\[7\], w0
+** ret
+*/
+TEST (f_di_qi_0, di_qi_1, unsigned char, y.c[0])
+
+/*
+** f_di_qi_1:
+** ins v1\.b\[4\], w0
+** ret
+*/
+TEST (f_di_qi_1, di_qi_1, unsigned char, y.c[3])
+
+/*
+** f_di_qi_2:
+** ins v1\.b\[3\], w0
+** ret
+*/
+TEST (f_di_qi_2, di_qi_2, unsigned char, y.c[0])
+
+/*
+** f_di_qi_3:
+** ins v1\.b\[0\], w0
+** ret
+*/
+TEST (f_di_qi_3, di_qi_2, unsigned char, y.c[3])
+
+/*
+** f_di_hi_0:
+** ins v1\.h\[3\], w0
+** ret
+*/
+TEST (f_di_hi_0, di_hi_1, unsigned short, y.s[0])
+
+/*
+** f_di_hi_1:
+** ins v1\.h\[2\], w0
+** ret
+*/
+TEST (f_di_hi_1, di_hi_1, unsigned short, y.s[1])
+
+/*
+** f_di_hi_2:
+** ins v1\.h\[1\], w0
+** ret
+*/
+TEST (f_di_hi_2, di_hi_2, unsigned short, y.s[0])
+
+/*
+** f_di_hi_3:
+** ins v1\.h\[0\], w0
+** ret
+*/
+TEST (f_di_hi_3, di_hi_2, unsigned short, y.s[1])
+
+/*
+** f_di_si_0:
+** ins v1\.s\[1\], w0
+** ret
+*/
+TEST (f_di_si_0, di_si, unsigned int, y.i[0])
+
+/*
+** f_di_si_1:
+** ins v1\.s\[0\], w0
+** ret
+*/
+TEST (f_di_si_1, di_si, unsigned int, y.i[1])
+
+/*
+** f_si_qi_0:
+** ins v1\.b\[3\], w0
+** ret
+*/
+TEST (f_si_qi_0, si_qi_1, unsigned char, y.c[0])
+
+/*
+** f_si_qi_1:
+** ins v1\.b\[2\], w0
+** ret
+*/
+TEST (f_si_qi_1, si_qi_1, unsigned char, y.c[1])
+
+/*
+** f_si_qi_2:
+** ins v1\.b\[1\], w0
+** ret
+*/
+TEST (f_si_qi_2, si_qi_2, unsigned char, y.c[0])
+
+/*
+** f_si_qi_3:
+** ins v1\.b\[0\], w0
+** ret
+*/
+TEST (f_si_qi_3, si_qi_2, unsigned char, y.c[1])
+
+/*
+** f_si_hi_0:
+** ins v1\.h\[1\], w0
+** ret
+*/
+TEST (f_si_hi_0, si_hi, unsigned short, y.s[0])
+
+/*
+** f_si_hi_1:
+** ins v1\.h\[0\], w0
+** ret
+*/
+TEST (f_si_hi_1, si_hi, unsigned short, y.s[1])
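
As a reading aid for the lane numbers expected by the ins_bitfield tests, here is a minimal C sketch (not part of the patch) of how a bit position accepted by the new insertion patterns could map to an INS lane. The helper names `field_bit_pos` and `ins_lane` are hypothetical. The bound `pos + bits <= reg_bits` is an assumption modelled on the explicit `<= 64` check in the DImode subreg pattern. The sketch also assumes that bit positions count from the least significant bit on both endiannesses, which is what the big-endian tests suggest.

```c
#include <stdbool.h>

/* Hypothetical helper: bit position, counted from the least significant
   bit of the register, of a field that starts BYTE_OFFSET bytes into a
   STRUCT_BYTES-sized struct held in a single register.  On big-endian
   the lowest-addressed byte is the most significant one.  */
unsigned
field_bit_pos (unsigned byte_offset, unsigned field_bytes,
               unsigned struct_bytes, bool big_endian)
{
  unsigned lsb_byte = big_endian
                      ? struct_bytes - byte_offset - field_bytes
                      : byte_offset;
  return lsb_byte * 8;
}

/* Hypothetical helper: return the INS lane for inserting a BITS-wide
   element at bit position POS of a REG_BITS-wide scalar, or -1 if the
   insertion is not element-like.  BITS must be 8, 16 or 32, mirroring
   the SUBDI_BITS iterator.  */
int
ins_lane (unsigned pos, unsigned bits, unsigned reg_bits)
{
  if (bits != 8 && bits != 16 && bits != 32)
    return -1;
  if (pos % bits != 0)          /* position must be a multiple of the width */
    return -1;
  if (pos + bits > reg_bits)    /* assumed bound: stay inside the scalar */
    return -1;
  return (int) (pos / bits);    /* lane index used in the INS operand */
}
```

Under these assumptions, `y.c[0]` in `struct di_qi_2` (byte offset 4 of an 8-byte struct) maps to lane 4 on little-endian and lane 3 on big-endian, matching the `f_di_qi_2` expectations in ins_bitfield_1.c and ins_bitfield_2.c.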