From patchwork Tue May 9 06:48:25 2023
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 9067
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [PATCH 0/6] aarch64: Avoid hard-coding specific register allocations
Date: Tue, 9 May 2023 07:48:25 +0100
Message-Id: <20230509064831.1651327-1-richard.sandiford@arm.com>
List-Id: Gcc-patches mailing list

I have a patch that seems to improve register allocation for SIMD lane
operations, and for similar instructions that require a reduced register
range.  However, testing it showed that a lot of asm tests are sensitive
to the current register allocation.

This patch series tries to correct the affected cases.  Putting the
series in first is an attempt to “prove” that the new tests work both
ways, i.e. with and without the allocation patch.

Tested on aarch64-linux-gnu and pushed.
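
For illustration only (this is not taken from the series): the sketch
below shows why a match that hard-codes register numbers is fragile,
and how a relaxed match avoids the problem.  It uses POSIX extended
regexps as a stand-in for the Tcl regexps that the testsuite directives
actually use.  The fadd instruction, both patterns and the helper
asm_matches are hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical pattern that only accepts one specific allocation.  */
const char hard_coded[] = "fadd\tv0\\.4s, v0\\.4s, v1\\.4s";

/* Hypothetical relaxed pattern: any vector register in each operand.  */
const char relaxed[] = "fadd\tv[0-9]+\\.4s, v[0-9]+\\.4s, v[0-9]+\\.4s";

/* Return 1 if PATTERN matches somewhere in LINE, 0 otherwise.
   PATTERN must be a valid POSIX extended regexp.  */
int
asm_matches (const char *pattern, const char *line)
{
  regex_t re;
  int rc = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  assert (rc == 0);
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With these definitions, hard_coded accepts "fadd v0.4s, v0.4s, v1.4s"
but rejects the equally valid "fadd v0.4s, v1.4s, v0.4s", whereas
relaxed accepts both.  The cost is that a relaxed pattern no longer
checks which registers are used, so it only suits tests where the
choice of register is not the thing being tested.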

Richard

Richard Sandiford (6):
  aarch64: Fix move-after-intrinsic function-body tests
  aarch64: Allow moves after tied-register intrinsics
  aarch64: Relax ordering requirements in SVE dup tests
  aarch64: Relax predicate register matches
  aarch64: Relax FP/vector register matches
  aarch64: Avoid hard-coding specific register allocations

 .../g++.target/aarch64/sve/vcond_1.C          | 258 +++++++++---------
 .../advsimd-intrinsics/bfcvtnq2-untied.c      |   5 +
 .../aarch64/advsimd-intrinsics/bfdot-1.c      |  10 +
 .../aarch64/advsimd-intrinsics/vdot-3-1.c     |  10 +
 .../aarch64/advsimd-intrinsics/vshl-opt-6.c   |   2 +-
 .../gcc.target/aarch64/asimd-mul-to-shl-sub.c |   4 +-
 .../gcc.target/aarch64/asm-x-constraint-1.c   |   4 +-
 .../gcc.target/aarch64/auto-init-padding-1.c  |   2 +-
 .../gcc.target/aarch64/auto-init-padding-2.c  |   3 +-
 .../gcc.target/aarch64/auto-init-padding-3.c  |   3 +-
 .../gcc.target/aarch64/auto-init-padding-4.c  |   3 +-
 .../gcc.target/aarch64/auto-init-padding-9.c  |   2 +-
 .../gcc.target/aarch64/fmul_fcvt_2.c          |   6 +-
 gcc/testsuite/gcc.target/aarch64/ldp_stp_17.c |   2 +-
 gcc/testsuite/gcc.target/aarch64/ldp_stp_21.c |   2 +-
 gcc/testsuite/gcc.target/aarch64/ldp_stp_8.c  |   2 +-
 .../gcc.target/aarch64/memset-corner-cases.c  |  22 +-
 .../gcc.target/aarch64/memset-q-reg.c         |  22 +-
 .../gcc.target/aarch64/simd/vaddlv_1.c        |  24 +-
 .../gcc.target/aarch64/simd/vpaddd_f64.c      |   2 +-
 .../gcc.target/aarch64/simd/vpaddd_s64.c      |   2 +-
 .../gcc.target/aarch64/simd/vpaddd_u64.c      |   2 +-
 .../gcc.target/aarch64/sve-neon-modes_1.c     |   4 +-
 .../gcc.target/aarch64/sve-neon-modes_3.c     |  16 +-
 .../aarch64/sve/acle/asm/adda_f16.c           |   5 +
 .../aarch64/sve/acle/asm/adda_f32.c           |   5 +
 .../aarch64/sve/acle/asm/adda_f64.c           |   5 +
 .../gcc.target/aarch64/sve/acle/asm/brka_b.c  |   5 +
 .../gcc.target/aarch64/sve/acle/asm/brkb_b.c  |   5 +
 .../gcc.target/aarch64/sve/acle/asm/brkn_b.c  |   5 +
 .../aarch64/sve/acle/asm/clasta_bf16.c        |   5 +
 .../aarch64/sve/acle/asm/clasta_f16.c         |   5 +
 .../aarch64/sve/acle/asm/clasta_f32.c         |   5 +
 .../aarch64/sve/acle/asm/clasta_f64.c         |   5 +
 .../aarch64/sve/acle/asm/clastb_bf16.c        |   5 +
 .../aarch64/sve/acle/asm/clastb_f16.c         |   5 +
 .../aarch64/sve/acle/asm/clastb_f32.c         |   5 +
 .../aarch64/sve/acle/asm/clastb_f64.c         |   5 +
 .../gcc.target/aarch64/sve/acle/asm/dup_s16.c |  72 +++++
 .../gcc.target/aarch64/sve/acle/asm/dup_s32.c |  60 ++++
 .../gcc.target/aarch64/sve/acle/asm/dup_s64.c |  60 ++++
 .../gcc.target/aarch64/sve/acle/asm/dup_u16.c |  72 +++++
 .../gcc.target/aarch64/sve/acle/asm/dup_u32.c |  60 ++++
 .../gcc.target/aarch64/sve/acle/asm/dup_u64.c |  60 ++++
 .../aarch64/sve/acle/asm/dupq_b16.c           |  86 +++---
 .../aarch64/sve/acle/asm/dupq_b32.c           |  48 ++--
 .../aarch64/sve/acle/asm/dupq_b64.c           |  16 +-
 .../gcc.target/aarch64/sve/acle/asm/dupq_b8.c | 136 ++++-----
 .../aarch64/sve/acle/asm/pfirst_b.c           |   5 +
 .../aarch64/sve/acle/asm/pnext_b16.c          |   5 +
 .../aarch64/sve/acle/asm/pnext_b32.c          |   5 +
 .../aarch64/sve/acle/asm/pnext_b64.c          |   5 +
 .../aarch64/sve/acle/asm/pnext_b8.c           |   5 +
 .../aarch64/sve/acle/general/whilele_10.c     |   2 +-
 .../aarch64/sve/acle/general/whilele_5.c      |  10 +-
 .../aarch64/sve/acle/general/whilele_6.c      |   2 +-
 .../aarch64/sve/acle/general/whilele_7.c      |   6 +-
 .../aarch64/sve/acle/general/whilele_9.c      |   6 +-
 .../aarch64/sve/acle/general/whilelt_1.c      |  10 +-
 .../aarch64/sve/acle/general/whilelt_2.c      |   2 +-
 .../aarch64/sve/acle/general/whilelt_3.c      |   6 +-
 gcc/testsuite/gcc.target/aarch64/sve/adr_1.c  |  24 +-
 gcc/testsuite/gcc.target/aarch64/sve/adr_2.c  |  24 +-
 gcc/testsuite/gcc.target/aarch64/sve/adr_3.c  |  24 +-
 gcc/testsuite/gcc.target/aarch64/sve/adr_4.c  |   6 +-
 gcc/testsuite/gcc.target/aarch64/sve/adr_5.c  |  16 +-
 .../gcc.target/aarch64/sve/extract_1.c        |   4 +-
 .../gcc.target/aarch64/sve/extract_2.c        |   4 +-
 .../gcc.target/aarch64/sve/extract_3.c        |   4 +-
 .../gcc.target/aarch64/sve/extract_4.c        |   4 +-
 .../aarch64/sve/load_scalar_offset_1.c        |   8 +-
 .../aarch64/sve/mask_gather_load_6.c          |   4 +-
 .../aarch64/sve/pcs/args_5_be_bf16.c          |  18 +-
 .../aarch64/sve/pcs/args_5_be_f16.c           |  18 +-
 .../aarch64/sve/pcs/args_5_be_f32.c           |  18 +-
 .../aarch64/sve/pcs/args_5_be_f64.c           |  18 +-
 .../aarch64/sve/pcs/args_5_be_s16.c           |  18 +-
 .../aarch64/sve/pcs/args_5_be_s32.c           |  18 +-
 .../aarch64/sve/pcs/args_5_be_s64.c           |  18 +-
 .../gcc.target/aarch64/sve/pcs/args_5_be_s8.c |  18 +-
 .../aarch64/sve/pcs/args_5_be_u16.c           |  18 +-
 .../aarch64/sve/pcs/args_5_be_u32.c           |  18 +-
 .../aarch64/sve/pcs/args_5_be_u64.c           |  18 +-
 .../gcc.target/aarch64/sve/pcs/args_5_be_u8.c |  18 +-
 .../aarch64/sve/pcs/return_6_1024.c           |  48 ++--
 .../aarch64/sve/pcs/return_6_2048.c           |  48 ++--
 .../gcc.target/aarch64/sve/pcs/return_6_256.c |  48 ++--
 .../gcc.target/aarch64/sve/pcs/return_6_512.c |  48 ++--
 .../gcc.target/aarch64/sve/pcs/return_9.c     |  16 +-
 .../gcc.target/aarch64/sve/pcs/varargs_1.c    |   8 +-
 .../gcc.target/aarch64/sve/peel_ind_2.c       |   2 +-
 .../gcc.target/aarch64/sve/pr89007-1.c        |   2 +-
 .../gcc.target/aarch64/sve/pr89007-2.c        |   2 +-
 gcc/testsuite/gcc.target/aarch64/sve/slp_4.c  |   2 +-
 .../gcc.target/aarch64/sve/spill_3.c          |   8 +-
 .../aarch64/sve/store_scalar_offset_1.c       |   8 +-
 .../gcc.target/aarch64/sve/vcond_18.c         |  14 +-
 .../gcc.target/aarch64/sve/vcond_19.c         |  34 +--
 .../gcc.target/aarch64/sve/vcond_2.c          | 248 ++++++++---------
 .../gcc.target/aarch64/sve/vcond_20.c         |  34 +--
 .../gcc.target/aarch64/sve/vcond_3.c          |  26 +-
 .../gcc.target/aarch64/sve/vcond_7.c          | 198 +++++++-------
 .../aarch64/sve2/acle/asm/aesd_u8.c           |   4 +-
 .../aarch64/sve2/acle/asm/aese_u8.c           |   4 +-
 .../aarch64/sve2/acle/asm/aesimc_u8.c         |   2 +-
 .../aarch64/sve2/acle/asm/aesmc_u8.c          |   2 +-
 .../aarch64/sve2/acle/asm/sli_s16.c           |  15 +
 .../aarch64/sve2/acle/asm/sli_s32.c           |  15 +
 .../aarch64/sve2/acle/asm/sli_s64.c           |  15 +
 .../gcc.target/aarch64/sve2/acle/asm/sli_s8.c |  15 +
 .../aarch64/sve2/acle/asm/sli_u16.c           |  15 +
 .../aarch64/sve2/acle/asm/sli_u32.c           |  15 +
 .../aarch64/sve2/acle/asm/sli_u64.c           |  15 +
 .../gcc.target/aarch64/sve2/acle/asm/sli_u8.c |  15 +
 .../aarch64/sve2/acle/asm/sm4e_u32.c          |   2 +-
 .../aarch64/sve2/acle/asm/sri_s16.c           |  15 +
 .../aarch64/sve2/acle/asm/sri_s32.c           |  15 +
 .../aarch64/sve2/acle/asm/sri_s64.c           |  15 +
 .../gcc.target/aarch64/sve2/acle/asm/sri_s8.c |  15 +
 .../aarch64/sve2/acle/asm/sri_u16.c           |  15 +
 .../aarch64/sve2/acle/asm/sri_u32.c           |  15 +
 .../aarch64/sve2/acle/asm/sri_u64.c           |  15 +
 .../gcc.target/aarch64/sve2/acle/asm/sri_u8.c |  15 +
 .../gcc.target/aarch64/vadd_reduc-1.c         |   4 +-
 .../gcc.target/aarch64/vadd_reduc-2.c         |   4 +-
 gcc/testsuite/gcc.target/aarch64/vfp-1.c      |   4 +-
 126 files changed, 1680 insertions(+), 939 deletions(-)