From patchwork Thu Sep 14 10:42:04 2023
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 139492
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] aarch64: Coerce addresses to be suitable for LD1RQ
Date: Thu, 14 Sep 2023 11:42:04 +0100
From: Richard Sandiford

In the following test:

  svuint8_t ld(uint8_t *ptr)
  {
    return svld1rq(svptrue_b8(), ptr + 2);
  }

ptr + 2 is a valid address for an Advanced SIMD load, but not for
an SVE load.  We therefore ended up generating:

	ldr	q0, [x0, 2]
	dup	z0.q, z0.q[0]

This patch makes us generate LD1RQ for that case too.  It takes the
slightly old-school approach of making the predicate broader than
the constraint.  That is: any valid memory address is accepted as
an operand before RA.
If the instruction remains during RA, LRA will coerce the address to
match the constraint.  If the instruction gets split before RA, the
splitter will load invalid addresses into a scratch register.

Tested on aarch64-linux-gnu & pushed.

Richard


gcc/
	* config/aarch64/aarch64-sve.md (@aarch64_vec_duplicate_vq_le):
	Accept all nonimmediate_operands, but keep the existing constraints.
	If the instruction is split before RA, load invalid addresses into
	a temporary register.
	* config/aarch64/predicates.md (aarch64_sve_dup_ld1rq_operand): Delete.

gcc/testsuite/
	* gcc.target/aarch64/sve/acle/general/ld1rq_1.c: New test.
---
 gcc/config/aarch64/aarch64-sve.md             | 15 ++++++++-
 gcc/config/aarch64/predicates.md              |  4 ---
 .../aarch64/sve/acle/general/ld1rq_1.c        | 33 +++++++++++++++++++
 3 files changed, 47 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general/ld1rq_1.c

diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index da5534c3e32..b223e7d3c9d 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -2611,11 +2611,18 @@ (define_insn_and_split "*vec_duplicate_reg"
 )
 
 ;; Duplicate an Advanced SIMD vector to fill an SVE vector (LE version).
+;;
+;; The addressing mode range of LD1RQ does not match the addressing mode
+;; range of LDR Qn.  If the predicate enforced the LD1RQ range, we would
+;; not be able to combine LDR Qns outside that range.  The predicate
+;; therefore accepts all memory operands, with only the constraints
+;; enforcing the actual restrictions.  If the instruction is split
+;; before RA, we need to load invalid addresses into a temporary.
 (define_insn_and_split "@aarch64_vec_duplicate_vq_le"
   [(set (match_operand:SVE_FULL 0 "register_operand" "=w, w")
 	(vec_duplicate:SVE_FULL
-	  (match_operand: 1 "aarch64_sve_dup_ld1rq_operand" "w, UtQ")))
+	  (match_operand: 1 "nonimmediate_operand" "w, UtQ")))
    (clobber (match_scratch:VNx16BI 2 "=X, Upl"))]
   "TARGET_SVE && !BYTES_BIG_ENDIAN"
   {
@@ -2633,6 +2640,12 @@ (define_insn_and_split "@aarch64_vec_duplicate_vq_le"
   "&& MEM_P (operands[1])"
   [(const_int 0)]
   {
+    if (can_create_pseudo_p ()
+        && !aarch64_sve_ld1rq_operand (operands[1], mode))
+      {
+	rtx addr = force_reg (Pmode, XEXP (operands[1], 0));
+	operands[1] = replace_equiv_address (operands[1], addr);
+      }
     if (GET_CODE (operands[2]) == SCRATCH)
       operands[2] = gen_reg_rtx (VNx16BImode);
     emit_move_insn (operands[2], CONSTM1_RTX (VNx16BImode));
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 2d8d1fe25c1..01de4743974 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -732,10 +732,6 @@ (define_predicate "aarch64_sve_dup_operand"
   (ior (match_operand 0 "register_operand")
        (match_operand 0 "aarch64_sve_ld1r_operand")))
 
-(define_predicate "aarch64_sve_dup_ld1rq_operand"
-  (ior (match_operand 0 "register_operand")
-       (match_operand 0 "aarch64_sve_ld1rq_operand")))
-
 (define_predicate "aarch64_sve_ptrue_svpattern_immediate"
   (and (match_code "const")
        (match_test "aarch64_sve_ptrue_svpattern_p (op, NULL)")))
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/ld1rq_1.c b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/ld1rq_1.c
new file mode 100644
index 00000000000..9242c639731
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/ld1rq_1.c
@@ -0,0 +1,33 @@
+/* { dg-options "-O2" } */
+
+#include <arm_sve.h>
+
+#define TEST_OFFSET(TYPE, SUFFIX, OFFSET) \
+  sv##TYPE##_t \
+  test_##TYPE##_##SUFFIX (TYPE##_t *ptr) \
+  { \
+    return svld1rq(svptrue_b8(), ptr + OFFSET); \
+  }
+
+#define TEST(TYPE) \
+  TEST_OFFSET (TYPE, 0, 0) \
+  TEST_OFFSET (TYPE, 1, 1) \
+  TEST_OFFSET (TYPE, 2, 2) \
+  TEST_OFFSET (TYPE, 16, 16) \
+  TEST_OFFSET (TYPE, 0x10000, 0x10000) \
+  TEST_OFFSET (TYPE, 0x10001, 0x10001) \
+  TEST_OFFSET (TYPE, m1, -1) \
+  TEST_OFFSET (TYPE, m2, -2) \
+  TEST_OFFSET (TYPE, m16, -16) \
+  TEST_OFFSET (TYPE, m0x10000, -0x10000) \
+  TEST_OFFSET (TYPE, m0x10001, -0x10001)
+
+TEST (int8)
+TEST (int16)
+TEST (uint32)
+TEST (uint64)
+
+/* { dg-final { scan-assembler-times {\tld1rqb\t} 11 { target aarch64_little_endian } } } */
+/* { dg-final { scan-assembler-times {\tld1rqh\t} 11 { target aarch64_little_endian } } } */
+/* { dg-final { scan-assembler-times {\tld1rqw\t} 11 { target aarch64_little_endian } } } */
+/* { dg-final { scan-assembler-times {\tld1rqd\t} 11 { target aarch64_little_endian } } } */
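
For readers who want to see why ptr + 2 is fine for LDR Qn but not for
LD1RQ, here is a rough standalone C model of the two immediate-offset
ranges.  It is only a sketch based on the architectural encodings, not the
code GCC uses: the function names are made up for illustration, and it
ignores the register-offset forms of both instructions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: models the scalar-plus-immediate form of LD1RQ,
   whose byte offset must be a multiple of 16 in the range [-128, 112].  */
bool
ld1rq_imm_offset_ok (long offset)
{
  return offset % 16 == 0 && offset >= -128 && offset <= 112;
}

/* Hypothetical helper: models the immediate forms of LDR/LDUR Qn.  The
   offset is either an unscaled signed 9-bit value, or an unsigned
   multiple of 16 no greater than 65520.  */
bool
ldr_q_imm_offset_ok (long offset)
{
  bool unscaled = offset >= -256 && offset <= 255;
  bool scaled = offset >= 0 && offset <= 65520 && offset % 16 == 0;
  return unscaled || scaled;
}

/* Hypothetical helper: true if base + offset cannot be used directly by
   the immediate form of LD1RQ, so that (in this simplified model) the
   address would first have to be computed into a register.  */
bool
needs_address_in_register (long offset)
{
  return !ld1rq_imm_offset_ok (offset);
}

/* The case from the commit message: a byte offset of 2 is accepted by
   LDR Qn but not by LD1RQ.  */
void
check_commit_message_example (void)
{
  assert (ldr_q_imm_offset_ok (2));
  assert (needs_address_in_register (2));
}
```

Under this model, of the byte offsets exercised by TEST (int8), only 0, 16
and -16 fit the LD1RQ immediate form directly; the others are the ones that
rely on the address being coerced.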