From patchwork Thu Apr 27 16:38:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Stubbs
X-Patchwork-Id: 88320
Message-ID: <32c7f0c6-1a92-5c8a-0607-5aaa1929216a@codesourcery.com>
Date: Thu, 27 Apr
 2023 17:38:30 +0100
To: "gcc-patches@gcc.gnu.org"
From: Andrew Stubbs
Subject: [committed] amdgcn: Fix addsub bug

I've committed this patch to fix a couple of bugs introduced in the
recent CMul patch.

First, the fmsubadd insn was accidentally all adds and no subtracts.

Second, there were input dependencies on the undefined output register,
which caused the compiler to reserve unnecessary slots in the stack
frame.

Both issues are now fixed.

This patch is already committed to OG12. I'll backport it to GCC 13
shortly.

Andrew

amdgcn: Fix addsub bug

The vec_fmsubadd instruction actually had add twice, by mistake.

Also improve code-gen for all the complex patterns by using properly
undefined values.  Mostly this just prevents the compiler reserving space
in the stack frame.

gcc/ChangeLog:

	* config/gcn/gcn-valu.md (cmul3): Use gcn_gen_undef.
	(cml4): Likewise.
	(vec_addsub3): Likewise.
	(cadd3): Likewise.
	(vec_fmaddsub4): Likewise.
	(vec_fmsubadd4): Likewise, and use sub for the odd lanes.

diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index 44c48468dd6..7290cdc2fd0 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -2323,8 +2323,9 @@ (define_expand "cmul3"
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
     rtx dest = operands[0];
-    emit_insn (gen_3_exec (dest, t1, t1_perm, dest, even));
-                                                       // a*c-b*d 0
+    emit_insn (gen_3_exec (dest, t1, t1_perm,
+                           gcn_gen_undef (mode),
+                           even)); // a*c-b*d 0
 
     rtx t2_perm = gen_reg_rtx (mode);
     emit_insn (gen_dpp_swap_pairs (t2_perm, t2)); // b*c a*d
@@ -2368,7 +2369,8 @@ (define_expand "cml4"
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
     rtx dest = operands[0];
-    emit_insn (gen_sub3_exec (dest, t1, t2_perm, dest, even));
+    emit_insn (gen_sub3_exec (dest, t1, t2_perm,
+                              gcn_gen_undef (mode), even));
 
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
@@ -2392,7 +2394,8 @@ (define_expand "vec_addsub3"
     rtx dest = operands[0];
     rtx x = operands[1];
     rtx y = operands[2];
-    emit_insn (gen_sub3_exec (dest, x, y, dest, even));
+    emit_insn (gen_sub3_exec (dest, x, y, gcn_gen_undef (mode),
+                              even));
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
     emit_insn (gen_add3_exec (dest, x, y, dest, odd));
@@ -2419,7 +2422,9 @@ (define_expand "cadd3"
 
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
-    emit_insn (gen_3_exec (dest, x, y, dest, even));
+    emit_insn (gen_3_exec (dest, x, y,
+                           gcn_gen_undef (mode),
+                           even));
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
     emit_insn (gen_3_exec (dest, x, y, dest, odd));
@@ -2439,7 +2444,8 @@ (define_expand "vec_fmaddsub4"
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
     rtx dest = operands[0];
-    emit_insn (gen_sub3_exec (dest, t1, operands[3], dest, even));
+    emit_insn (gen_sub3_exec (dest, t1, operands[3],
+                              gcn_gen_undef (mode), even));
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
     emit_insn (gen_add3_exec (dest, t1, operands[3], dest, odd));
@@ -2459,10 +2465,11 @@ (define_expand "vec_fmsubadd4"
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
     rtx dest = operands[0];
-    emit_insn (gen_add3_exec (dest, t1, operands[3], dest, even));
+    emit_insn (gen_add3_exec (dest, t1, operands[3],
+                              gcn_gen_undef (mode), even));
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
-    emit_insn (gen_add3_exec (dest, t1, operands[3], dest, odd));
+    emit_insn (gen_sub3_exec (dest, t1, operands[3], dest, odd));
     DONE;
   })
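
For anyone not familiar with the exec-mask idiom, here is a rough scalar model in plain C of what the vec_fmsubadd expander is meant to do. This is only a sketch, not compiler code: masked_add, masked_sub, LANES and the two fmsubadd_* helpers are hypothetical names invented for illustration, and the 8-lane width is an arbitrary assumption. Only the two mask constants are taken from the patch.

```c
#include <stdint.h>

#define LANES 8  /* assumed width, for illustration only */

/* Lane masks copied from the patch: bit i set means lane i is active. */
#define EVEN_LANES UINT64_C(0x5555555555555555)
#define ODD_LANES  UINT64_C(0xaaaaaaaaaaaaaaaa)

/* Hypothetical model of an exec-masked add: only active lanes are
   written; inactive lanes of dest keep whatever they held before.  */
static void masked_add(double *dest, const double *a, const double *b,
                       uint64_t exec)
{
    for (int i = 0; i < LANES; i++)
        if (exec & (UINT64_C(1) << i))
            dest[i] = a[i] + b[i];
}

/* Hypothetical model of an exec-masked subtract, same rules.  */
static void masked_sub(double *dest, const double *a, const double *b,
                       uint64_t exec)
{
    for (int i = 0; i < LANES; i++)
        if (exec & (UINT64_C(1) << i))
            dest[i] = a[i] - b[i];
}

/* Before the fix: both halves used add, so no lane ever subtracted.  */
static void fmsubadd_buggy(double *dest, const double *t1, const double *c)
{
    masked_add(dest, t1, c, EVEN_LANES);
    masked_add(dest, t1, c, ODD_LANES);
}

/* After the fix: even lanes add, odd lanes subtract.  The first op
   writes half the lanes and the second fills in the rest, so the
   initial contents of dest never matter (which is why the patch can
   pass an undefined value as the "previous" operand of the first op).  */
static void fmsubadd_fixed(double *dest, const double *t1, const double *c)
{
    masked_add(dest, t1, c, EVEN_LANES);
    masked_sub(dest, t1, c, ODD_LANES);
}
```

With t1 = {10, 20, 30, 40, ...} and c all ones, the fixed version gives 11, 19, 31, 39 in the first four lanes, whereas the buggy one gives 11, 21, 31, 41.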