From patchwork Tue May 16 10:53:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 94617
To: gcc-patches@gcc.gnu.org
Cc: kyrylo.tkachov@arm.com
Subject: [PATCH] aarch64: Allow moves after tied-register intrinsics (2nd edition)
Date: Tue, 16 May 2023 11:53:32 +0100
From: Richard Sandiford
List-Id: Gcc-patches mailing list

I missed these two in g:4ff89f10ca0d41f9cfa76 because I was testing on
a system that didn't support big-endian compilation.  Testing on
aarch64_be-elf shows no other related failures (although the overall
results are worse than for little-endian).

Tested on aarch64_be-elf & pushed.

Richard


gcc/testsuite/
	* gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c: Allow moves
	to occur after the intrinsic instruction, rather than requiring
	them to happen before.
	* gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c: Likewise.
---
 .../gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c  | 10 ++++++++++
 .../gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c | 10 ++++++++++
 2 files changed, 20 insertions(+)

diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c
index ae0a953f7b4..9975edb8fdb 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-2.c
@@ -70,8 +70,13 @@ float32x4_t ufooq_lane(float32x4_t r, bfloat16x8_t x, bfloat16x4_t y)
 
 /*
 **ufoo_untied:
+** (
 **	mov	v0.8b, v1.8b
 **	bfdot	v0.2s, (v2.4h, v3.4h|v3.4h, v2.4h)
+** |
+**	bfdot	v1.2s, (v2.4h, v3.4h|v3.4h, v2.4h)
+**	mov	v0.8b, v1.8b
+** )
 **	ret
 */
 float32x2_t ufoo_untied(float32x4_t unused, float32x2_t r, bfloat16x4_t x, bfloat16x4_t y)
@@ -81,8 +86,13 @@ float32x2_t ufoo_untied(float32x4_t unused, float32x2_t r, bfloat16x4_t x, bfloa
 
 /*
 **ufooq_lane_untied:
+** (
 **	mov	v0.16b, v1.16b
 **	bfdot	v0.4s, v2.8h, v3.2h\[1\]
+** |
+**	bfdot	v1.4s, v2.8h, v3.2h\[1\]
+**	mov	v0.16b, v1.16b
+** )
 **	ret
 */
 float32x4_t ufooq_lane_untied(float32x4_t unused, float32x4_t r, bfloat16x8_t x, bfloat16x4_t y)
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c
index 61c7c51f5ec..76787f6bedd 100644
--- a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-3-2.c
@@ -115,8 +115,13 @@ int32x4_t sfooq_laneq (int32x4_t r, int8x16_t x, uint8x16_t y)
 
 /*
 **ufoo_untied:
+** (
 **	mov	v0\.8b, v1\.8b
 **	usdot	v0\.2s, v2\.8b, v3\.8b
+** |
+**	usdot	v1\.2s, v2\.8b, v3\.8b
+**	mov	v0\.8b, v1\.8b
+** )
 **	ret
 */
 int32x2_t ufoo_untied (int32x2_t unused, int32x2_t r, uint8x8_t x, int8x8_t y)
@@ -126,8 +131,13 @@ int32x2_t ufoo_untied (int32x2_t unused, int32x2_t r, uint8x8_t x, int8x8_t y)
 
 /*
 **ufooq_laneq_untied:
+** (
 **	mov	v0\.16b, v1\.16b
 **	usdot	v0\.4s, v2\.16b, v3\.4b\[3\]
+** |
+**	usdot	v1\.4s, v2\.16b, v3\.4b\[3\]
+**	mov	v0\.16b, v1\.16b
+** )
 **	ret
 */
 int32x4_t ufooq_laneq_untied (int32x2_t unused, int32x4_t r, uint8x16_t x, int8x16_t y)
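
For readers unfamiliar with the `( ... | ... )` lines in the expected
function bodies, here is a rough sketch of the idea in Python.  It is a
simplified model and not the testsuite's real matcher: the helper names
`template_to_regex` and `accepts` are hypothetical, and it assumes only
that `(`, `|` and `)` lines form a regex alternation while every other
line is an instruction pattern.  It shows that the updated `ufoo_untied`
template from vdot-3-2.c accepts the move on either side of the `usdot`.

```python
import re

# Expected body for ufoo_untied after this patch (taken from the diff above).
UFOO_UNTIED = r"""
** (
** mov v0\.8b, v1\.8b
** usdot v0\.2s, v2\.8b, v3\.8b
** |
** usdot v1\.2s, v2\.8b, v3\.8b
** mov v0\.8b, v1\.8b
** )
** ret
"""


def template_to_regex(template: str) -> str:
    """Build one regex from a '**'-prefixed template (simplified model)."""
    parts = []
    for raw in template.strip().splitlines():
        raw = raw.strip()
        if not raw.startswith("**"):
            raise ValueError(f"unexpected template line: {raw!r}")
        line = raw[2:].strip()
        if line == "(":
            parts.append("(?:")   # open a non-capturing alternation group
        elif line == "|":
            parts.append("|")     # separate the alternatives
        elif line == ")":
            parts.append(")")     # close the group
        else:
            # Instruction lines are already regexes; tolerate any run of
            # whitespace between tokens and end each line with a newline.
            parts.append(r"\s*" + r"\s+".join(line.split()) + r"\n")
    return "".join(parts)


def accepts(template: str, body: str) -> bool:
    """Return True if the whole function body matches the template."""
    return re.fullmatch(template_to_regex(template), body) is not None


# Two orderings a compiler might plausibly emit, plus one that must fail.
MOV_FIRST = "\tmov\tv0.8b, v1.8b\n\tusdot\tv0.2s, v2.8b, v3.8b\n\tret\n"
MOV_LAST = "\tusdot\tv1.2s, v2.8b, v3.8b\n\tmov\tv0.8b, v1.8b\n\tret\n"
NO_MOV = "\tusdot\tv0.2s, v2.8b, v3.8b\n\tret\n"

if __name__ == "__main__":
    for name, body in [("mov first", MOV_FIRST), ("mov last", MOV_LAST),
                       ("no mov", NO_MOV)]:
        print(name, "->", accepts(UFOO_UNTIED, body))
```

Before this patch the template listed only the first alternative, so in
this model a body shaped like `MOV_LAST` would have been rejected.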