From: Wilco Dijkstra
To: GCC Patches
CC: Kyrylo Tkachov, Richard Sandiford
Date: Wed, 23 Aug 2023 14:39:13 +0000
Subject: [PATCH] AArch64: Fix MOPS memmove operand corruption [PR111121]
X-Patchwork-Id: 136683

A MOPS memmove may corrupt registers because the input operands are not
copied to temporary registers before the expansion.  Fix this by calling
aarch64_expand_cpymem, which makes those copies.  Also fix an issue with
STRICT_ALIGNMENT being ignored if TARGET_MOPS is true, and avoid crashing
or generating a huge expansion if aarch64_mops_memcpy_size_threshold is
large.

Passes regress/bootstrap, OK for commit?

gcc/ChangeLog:
	PR target/111121
	* config/aarch64/aarch64.md (cpymemdi): Remove STRICT_ALIGNMENT,
	add param for memmove.
	(aarch64_movmemdi): Add new expander similar to aarch64_cpymemdi.
	(movmemdi): Like cpymemdi call aarch64_expand_cpymem for correct
	expansion.
	* config/aarch64/aarch64.cc (aarch64_expand_cpymem_mops): Add
	support for memmove.
	(aarch64_expand_cpymem): Add support for memmove.  Handle
	STRICT_ALIGNMENT correctly.  Handle TARGET_MOPS size selection
	correctly.
	* config/aarch64/aarch64-protos.h (aarch64_expand_cpymem): Update
	prototype.

gcc/testsuite/ChangeLog:
	PR target/111121
	* gcc.target/aarch64/mops_4.c: Add memmove testcases.

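To make the new size selection in aarch64_expand_cpymem easier to follow, the decision it takes before emitting any inline sequence can be modelled in plain C roughly as below. This is only an illustrative sketch, not code from the patch: `struct copy_config`, `enum strategy`, `mops_or_libcall` and `choose_strategy` are hypothetical names, the thresholds are passed in rather than read from the real tuning parameters, and the later code-size comparison (`mops_cost < nops`) and the non-MOPS inline-versus-libcall choice are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the possible outcomes of the expansion.  */
enum strategy { STRATEGY_INLINE, STRATEGY_MOPS, STRATEGY_LIBCALL };

/* Hypothetical stand-ins for TARGET_MOPS, STRICT_ALIGNMENT and the two
   MOPS size thresholds; the real values come from the target options.  */
struct copy_config
{
  bool target_mops;
  bool strict_alignment;
  unsigned long mops_memcpy_threshold;
  unsigned long mops_memmove_threshold;
};

/* Models aarch64_expand_cpymem_mops: use MOPS if it is available,
   otherwise fail, which the caller turns into a library call.  */
static enum strategy
mops_or_libcall (const struct copy_config *cfg)
{
  return cfg->target_mops ? STRATEGY_MOPS : STRATEGY_LIBCALL;
}

/* Models the early decisions in aarch64_expand_cpymem after the patch.  */
enum strategy
choose_strategy (const struct copy_config *cfg, bool size_is_constant,
                 unsigned long size, bool is_memmove)
{
  assert (cfg != NULL);

  /* Variable-sized or strict-align copies never use the inline path.  */
  if (!size_is_constant || cfg->strict_alignment)
    return mops_or_libcall (cfg);

  /* Inline limit: 256 bytes for memcpy, nothing for memmove.  */
  unsigned long max_copy_size = is_memmove ? 0 : 256;
  unsigned long max_mops_size = max_copy_size;

  /* MOPS has its own threshold, separate from the inline limit.  */
  if (cfg->target_mops)
    max_mops_size = is_memmove ? cfg->mops_memmove_threshold
                               : cfg->mops_memcpy_threshold;

  /* Exceeding either limit rules out the inline sequence.  */
  if (size > max_copy_size || size > max_mops_size)
    return mops_or_libcall (cfg);

  return STRATEGY_INLINE;
}
```

Under this model a constant 512-byte memcpy is never expanded inline, however large the memcpy threshold is set, because the 256-byte inline limit is checked independently of the MOPS threshold. A constant-sized memmove of any non-zero size goes to MOPS when available and to a library call otherwise.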
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 70303d6fd953e0c397b9138ede8858c2db2e53db..97375e81cbda078847af83bf5dd4e0d7673d6af4 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -765,7 +765,7 @@ bool aarch64_emit_approx_div (rtx, rtx, rtx);
 bool aarch64_emit_approx_sqrt (rtx, rtx, bool);
 tree aarch64_vector_load_decl (tree);
 void aarch64_expand_call (rtx, rtx, rtx, bool);
-bool aarch64_expand_cpymem (rtx *);
+bool aarch64_expand_cpymem (rtx *, bool);
 bool aarch64_expand_setmem (rtx *);
 bool aarch64_float_const_zero_rtx_p (rtx);
 bool aarch64_float_const_rtx_p (rtx);
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index eba5d4a7e04b7af82437453a691d5607d98133c9..5e8d0a0c91bc7719de2a8c5627b354cf905a4db0 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -25135,10 +25135,11 @@ aarch64_copy_one_block_and_progress_pointers (rtx *src, rtx *dst,
   *dst = aarch64_progress_pointer (*dst);
 }
 
-/* Expand a cpymem using the MOPS extension.  OPERANDS are taken
-   from the cpymem pattern.  Return true iff we succeeded.  */
+/* Expand a cpymem/movmem using the MOPS extension.  OPERANDS are taken
+   from the cpymem/movmem pattern.  IS_MEMMOVE is true if this is a memmove
+   rather than memcpy.  Return true iff we succeeded.  */
 static bool
-aarch64_expand_cpymem_mops (rtx *operands)
+aarch64_expand_cpymem_mops (rtx *operands, bool is_memmove)
 {
   if (!TARGET_MOPS)
     return false;
@@ -25150,17 +25151,19 @@ aarch64_expand_cpymem_mops (rtx *operands)
   rtx dst_mem = replace_equiv_address (operands[0], dst_addr);
   rtx src_mem = replace_equiv_address (operands[1], src_addr);
   rtx sz_reg = copy_to_mode_reg (DImode, operands[2]);
-  emit_insn (gen_aarch64_cpymemdi (dst_mem, src_mem, sz_reg));
-
+  if (is_memmove)
+    emit_insn (gen_aarch64_movmemdi (dst_mem, src_mem, sz_reg));
+  else
+    emit_insn (gen_aarch64_cpymemdi (dst_mem, src_mem, sz_reg));
   return true;
 }
 
-/* Expand cpymem, as if from a __builtin_memcpy.  Return true if
-   we succeed, otherwise return false, indicating that a libcall to
-   memcpy should be emitted.  */
-
+/* Expand cpymem/movmem, as if from a __builtin_memcpy/memmove.
+   OPERANDS are taken from the cpymem/movmem pattern.  IS_MEMMOVE is true
+   if this is a memmove rather than memcpy.  Return true if we succeed,
+   otherwise return false, indicating that a libcall should be emitted.  */
 bool
-aarch64_expand_cpymem (rtx *operands)
+aarch64_expand_cpymem (rtx *operands, bool is_memmove)
 {
   int mode_bits;
   rtx dst = operands[0];
@@ -25168,25 +25171,23 @@ aarch64_expand_cpymem (rtx *operands)
   rtx base;
   machine_mode cur_mode = BLKmode;
 
-  /* Variable-sized memcpy can go through the MOPS expansion if available.  */
-  if (!CONST_INT_P (operands[2]))
-    return aarch64_expand_cpymem_mops (operands);
+  /* Variable-sized or strict align copies may use the MOPS expansion.  */
+  if (!CONST_INT_P (operands[2]) || STRICT_ALIGNMENT)
+    return aarch64_expand_cpymem_mops (operands, is_memmove);
 
   unsigned HOST_WIDE_INT size = INTVAL (operands[2]);
 
-  /* Try to inline up to 256 bytes or use the MOPS threshold if available.  */
-  unsigned HOST_WIDE_INT max_copy_size
-    = TARGET_MOPS ? aarch64_mops_memcpy_size_threshold : 256;
+  /* Set inline limits for memmove/memcpy.  MOPS has a separate threshold.  */
+  unsigned HOST_WIDE_INT max_copy_size = is_memmove ? 0 : 256;
+  unsigned HOST_WIDE_INT max_mops_size = max_copy_size;
 
-  bool size_p = optimize_function_for_size_p (cfun);
+  if (TARGET_MOPS)
+    max_mops_size = is_memmove ? aarch64_mops_memmove_size_threshold
+                               : aarch64_mops_memcpy_size_threshold;
 
-  /* Large constant-sized cpymem should go through MOPS when possible.
-     It should be a win even for size optimization in the general case.
-     For speed optimization the choice between MOPS and the SIMD sequence
-     depends on the size of the copy, rather than number of instructions,
-     alignment etc.  */
-  if (size > max_copy_size)
-    return aarch64_expand_cpymem_mops (operands);
+  /* Large copies use library call or MOPS when available.  */
+  if (size > max_copy_size || size > max_mops_size)
+    return aarch64_expand_cpymem_mops (operands, is_memmove);
 
   int copy_bits = 256;
 
@@ -25254,6 +25255,8 @@ aarch64_expand_cpymem (rtx *operands)
      the constant size into a register.  */
   unsigned mops_cost = 3 + 1;
 
+  bool size_p = optimize_function_for_size_p (cfun);
+
   /* If MOPS is available at this point we don't consider the libcall as it's
      not a win even on code size.  At this point only consider MOPS if
      optimizing for size.  For speed optimizations we will have chosen between
@@ -25261,7 +25264,7 @@ aarch64_expand_cpymem (rtx *operands)
   if (TARGET_MOPS)
     {
       if (size_p && mops_cost < nops)
-        return aarch64_expand_cpymem_mops (operands);
+        return aarch64_expand_cpymem_mops (operands, is_memmove);
       emit_insn (seq);
       return true;
     }
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 01cf989641fce8e6c3828f6cfef62e101c4142df..97f70d39cc0ddeb330e044bae0544d85a695567d 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1609,15 +1609,30 @@ (define_expand "cpymemdi"
    (match_operand:BLK 1 "memory_operand")
    (match_operand:DI 2 "general_operand")
    (match_operand:DI 3 "immediate_operand")]
-   "!STRICT_ALIGNMENT || TARGET_MOPS"
+   ""
 {
-  if (aarch64_expand_cpymem (operands))
+  if (aarch64_expand_cpymem (operands, false))
     DONE;
   FAIL;
 }
 )
 
-(define_insn "aarch64_movmemdi"
+(define_expand "aarch64_movmemdi"
+  [(parallel
+     [(set (match_operand 2) (const_int 0))
+      (clobber (match_dup 3))
+      (clobber (match_dup 4))
+      (clobber (reg:CC CC_REGNUM))
+      (set (match_operand 0)
+           (unspec:BLK [(match_operand 1) (match_dup 2)] UNSPEC_MOVMEM))])]
+  "TARGET_MOPS"
+  {
+    operands[3] = XEXP (operands[0], 0);
+    operands[4] = XEXP (operands[1], 0);
+  }
+)
+
+(define_insn "*aarch64_movmemdi"
   [(parallel [
    (set (match_operand:DI 2 "register_operand" "+&r") (const_int 0))
    (clobber (match_operand:DI 0 "register_operand" "+&r"))
@@ -1640,27 +1655,11 @@ (define_expand "movmemdi"
    (match_operand:BLK 1 "memory_operand")
    (match_operand:DI 2 "general_operand")
    (match_operand:DI 3 "immediate_operand")]
-   "TARGET_MOPS"
+   ""
 {
-   rtx sz_reg = operands[2];
-   /* For constant-sized memmoves check the threshold.
-      FIXME: We should add a non-MOPS memmove expansion for smaller,
-      constant-sized memmove to avoid going to a libcall.  */
-   if (CONST_INT_P (sz_reg)
-       && INTVAL (sz_reg) < aarch64_mops_memmove_size_threshold)
-     FAIL;
-
-   rtx addr_dst = XEXP (operands[0], 0);
-   rtx addr_src = XEXP (operands[1], 0);
-
-   if (!REG_P (sz_reg))
-     sz_reg = force_reg (DImode, sz_reg);
-   if (!REG_P (addr_dst))
-     addr_dst = force_reg (DImode, addr_dst);
-   if (!REG_P (addr_src))
-     addr_src = force_reg (DImode, addr_src);
-   emit_insn (gen_aarch64_movmemdi (addr_dst, addr_src, sz_reg));
-   DONE;
+  if (aarch64_expand_cpymem (operands, true))
+    DONE;
+  FAIL;
 }
 )
 
diff --git a/gcc/testsuite/gcc.target/aarch64/mops_4.c b/gcc/testsuite/gcc.target/aarch64/mops_4.c
index 1b87759cb5e8bbcbb58cf63404d1d579d44b2818..dd796115cb4093251964d881e93bf4b98ade0c32 100644
--- a/gcc/testsuite/gcc.target/aarch64/mops_4.c
+++ b/gcc/testsuite/gcc.target/aarch64/mops_4.c
@@ -50,6 +50,54 @@ copy3 (int *x, int *y, long z, long *res)
   *res = z;
 }
 
+/*
+** move1:
+**	mov	(x[0-9]+), x0
+**	cpyp	\[\1\]!, \[x1\]!, x2!
+**	cpym	\[\1\]!, \[x1\]!, x2!
+**	cpye	\[\1\]!, \[x1\]!, x2!
+**	str	x0, \[x3\]
+**	ret
+*/
+void
+move1 (int *x, int *y, long z, int **res)
+{
+  __builtin_memmove (x, y, z);
+  *res = x;
+}
+
+/*
+** move2:
+**	mov	(x[0-9]+), x1
+**	cpyp	\[x0\]!, \[\1\]!, x2!
+**	cpym	\[x0\]!, \[\1\]!, x2!
+**	cpye	\[x0\]!, \[\1\]!, x2!
+**	str	x1, \[x3\]
+**	ret
+*/
+void
+move2 (int *x, int *y, long z, int **res)
+{
+  __builtin_memmove (x, y, z);
+  *res = y;
+}
+
+/*
+** move3:
+**	mov	(x[0-9]+), x2
+**	cpyp	\[x0\]!, \[x1\]!, \1!
+**	cpym	\[x0\]!, \[x1\]!, \1!
+**	cpye	\[x0\]!, \[x1\]!, \1!
+**	str	x2, \[x3\]
+**	ret
+*/
+void
+move3 (int *x, int *y, long z, long *res)
+{
+  __builtin_memmove (x, y, z);
+  *res = z;
+}
+
 /*
 ** set1:
 **	mov	(x[0-9]+), x0