From patchwork Thu Oct 5 18:20:43 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tamar Christina
X-Patchwork-Id: 148974
Date: Thu, 5 Oct 2023 19:20:43 +0100
From: Tamar Christina
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com, Kyrylo.Tkachov@arm.com, richard.sandiford@arm.com
Subject: [PATCH]AArch64 Handle copysign (x, -1) expansion efficiently

Hi All,

copysign (x, -1) is effectively fneg (abs (x)), which on AArch64 is most
efficiently done by ORing in the sign bit. The middle-end now canonicalizes
fneg (abs (x)) to copysign, so this patch optimizes the expansion.

If the target has an inclusive-OR that takes an immediate, the transformed
instruction is both shorter and faster. For targets that don't, the immediate
has to be constructed separately, but this still ends up being faster because
the immediate construction is not on the critical path.

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Note that this is part of another patch series; the additional testcases are
mutually dependent on the match.pd patch. As such, the tests are added there
instead of here.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	PR tree-optimization/109154
	* config/aarch64/aarch64.md (copysign3): Handle
	copysign (x, -1).
	* config/aarch64/aarch64-simd.md (copysign3): Likewise.
	* config/aarch64/aarch64-sve.md (copysign3): Likewise.

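For reviewers who want to convince themselves of the identity the cover text relies on, here is a minimal sketch in plain C (not part of the patch). It assumes IEEE 754 binary64 doubles; the helper names double_bits, or_sign_bit and forms_agree are made up for illustration. It compares copysign (x, -1.0), -fabs (x) and an integer OR of the sign bit, bit for bit.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a double (assumes a 64-bit IEEE 754 double). */
static uint64_t double_bits (double x)
{
  uint64_t bits;
  assert (sizeof bits == sizeof x);
  memcpy (&bits, &x, sizeof bits);
  return bits;
}

/* Set the sign bit with an integer OR, leaving exponent and mantissa
   untouched.  This mirrors the idea behind the ORR expansion.  */
static double or_sign_bit (double x)
{
  uint64_t bits = double_bits (x) | (UINT64_C (1) << 63);
  double result;
  memcpy (&result, &bits, sizeof result);
  return result;
}

/* Return 1 when copysign (x, -1.0), -fabs (x) and the OR form agree
   bit for bit on X, and 0 otherwise.  */
static int forms_agree (double x)
{
  uint64_t a = double_bits (copysign (x, -1.0));
  uint64_t b = double_bits (-fabs (x));
  uint64_t c = double_bits (or_sign_bit (x));
  return a == b && b == c;
}
```

The comparison is done on bit patterns rather than with == so that +0.0 and -0.0 are told apart. NaN inputs are deliberately left out of the sketch.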
--- inline copy of patch --

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 25a1e4e8ecf767636c0ff3cdab6cad6e1482f73e..a78e77dcc3473445108b06c50f9c28a8369f3e3f 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -754,15 +754,33 @@ (define_insn "aarch64_dot_lane<
 (define_expand "copysign3"
   [(match_operand:VHSDF 0 "register_operand")
    (match_operand:VHSDF 1 "register_operand")
-   (match_operand:VHSDF 2 "register_operand")]
+   (match_operand:VHSDF 2 "nonmemory_operand")]
   "TARGET_SIMD"
 {
-  rtx v_bitmask = gen_reg_rtx (mode);
+  machine_mode int_mode = mode;
+  rtx v_bitmask = gen_reg_rtx (int_mode);
   int bits = GET_MODE_UNIT_BITSIZE (mode) - 1;
 
   emit_move_insn (v_bitmask,
                   aarch64_simd_gen_const_vector_dup (mode,
                                                      HOST_WIDE_INT_M1U << bits));
+
+  /* copysign (x, -1) should instead be expanded as orr with the sign
+     bit.  */
+  if (!REG_P (operands[2]))
+    {
+      auto r0
+        = CONST_DOUBLE_REAL_VALUE (unwrap_const_vec_duplicate (operands[2]));
+      if (-1 == real_to_integer (r0))
+        {
+          emit_insn (gen_ior3 (
+            lowpart_subreg (int_mode, operands[0], mode),
+            lowpart_subreg (int_mode, operands[1], mode), v_bitmask));
+          DONE;
+        }
+    }
+
+  operands[2] = force_reg (mode, operands[2]);
   emit_insn (gen_aarch64_simd_bsl (operands[0], v_bitmask,
                                    operands[2], operands[1]));
   DONE;
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index 5a652d8536a0ef9461f40da7b22834e683e73ceb..071400c820a5b106ddf9dc9faebb117975d74ea0 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -6387,7 +6387,7 @@ (define_insn "*3"
 (define_expand "copysign3"
   [(match_operand:SVE_FULL_F 0 "register_operand")
    (match_operand:SVE_FULL_F 1 "register_operand")
-   (match_operand:SVE_FULL_F 2 "register_operand")]
+   (match_operand:SVE_FULL_F 2 "nonmemory_operand")]
   "TARGET_SVE"
   {
     rtx sign = gen_reg_rtx (mode);
@@ -6398,11 +6398,26 @@ (define_expand "copysign3"
     rtx arg1 = lowpart_subreg (mode, operands[1], mode);
     rtx arg2 = lowpart_subreg (mode, operands[2], mode);
 
-    emit_insn (gen_and3
-               (sign, arg2,
-                aarch64_simd_gen_const_vector_dup (mode,
-                                                   HOST_WIDE_INT_M1U
-                                                   << bits)));
+    rtx v_sign_bitmask
+      = aarch64_simd_gen_const_vector_dup (mode,
+                                           HOST_WIDE_INT_M1U << bits);
+
+    /* copysign (x, -1) should instead be expanded as orr with the sign
+       bit.  */
+    if (!REG_P (operands[2]))
+      {
+        auto r0
+          = CONST_DOUBLE_REAL_VALUE (unwrap_const_vec_duplicate (operands[2]));
+        if (-1 == real_to_integer (r0))
+          {
+            emit_insn (gen_ior3 (int_res, arg1, v_sign_bitmask));
+            emit_move_insn (operands[0], gen_lowpart (mode, int_res));
+            DONE;
+          }
+      }
+
+    operands[2] = force_reg (mode, operands[2]);
+    emit_insn (gen_and3 (sign, arg2, v_sign_bitmask));
     emit_insn (gen_and3
                (mant, arg1,
                 aarch64_simd_gen_const_vector_dup (mode,
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 24349ecdbbab875f21975f116732a9e53762d4c1..d6c581ad81615b4feb095391cbcf4f5b78fa72f1 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -6940,12 +6940,25 @@ (define_expand "lrint2"
 (define_expand "copysign3"
   [(match_operand:GPF 0 "register_operand")
    (match_operand:GPF 1 "register_operand")
-   (match_operand:GPF 2 "register_operand")]
+   (match_operand:GPF 2 "nonmemory_operand")]
   "TARGET_SIMD"
 {
-  rtx bitmask = gen_reg_rtx (mode);
+  machine_mode int_mode = mode;
+  rtx bitmask = gen_reg_rtx (int_mode);
   emit_move_insn (bitmask, GEN_INT (HOST_WIDE_INT_M1U
                                     << (GET_MODE_BITSIZE (mode) - 1)));
+  /* copysign (x, -1) should instead be expanded as orr with the sign
+     bit.  */
+  auto r0 = CONST_DOUBLE_REAL_VALUE (operands[2]);
+  if (-1 == real_to_integer (r0))
+    {
+      emit_insn (gen_ior3 (
+        lowpart_subreg (int_mode, operands[0], mode),
+        lowpart_subreg (int_mode, operands[1], mode), bitmask));
+      DONE;
+    }
+
+  operands[2] = force_reg (mode, operands[2]);
   emit_insn (gen_copysign3_insn (operands[0], operands[1], operands[2],
                                  bitmask));
   DONE;
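
As an aside on the mask the expanders build with HOST_WIDE_INT_M1U << bits, where bits is the element width minus one: the following is a rough sketch in plain C (not GCC code) of which value that yields per element. It assumes HOST_WIDE_INT_M1U behaves as a 64-bit all-ones unsigned constant and that the result is interpreted in the element's width; ALL_ONES and sign_mask are hypothetical names used only here.

```c
#include <stdint.h>

/* Stand-in for HOST_WIDE_INT_M1U: all-ones in a 64-bit unsigned type.
   This is an assumption made for illustration.  */
#define ALL_ONES UINT64_MAX

/* Compute ALL_ONES << (unit_bits - 1), then keep only the low UNIT_BITS
   bits, as an element of that width would.  UNIT_BITS must be in 1..64.  */
static uint64_t sign_mask (unsigned unit_bits)
{
  uint64_t shifted = ALL_ONES << (unit_bits - 1);
  if (unit_bits == 64)
    return shifted;
  return shifted & ((UINT64_C (1) << unit_bits) - 1);
}
```

Under those assumptions only the top bit of each element survives, so ORing the mask into the integer view of operand 1 sets the sign and nothing else.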