From patchwork Sun Nov 13 23:05:19 2022
X-Patchwork-Submitter: Christoph Müllner
X-Patchwork-Id: 19472
From: Christoph Muellner
To: gcc-patches@gcc.gnu.org, Kito Cheng, Jim Wilson, Palmer Dabbelt,
 Andrew Waterman, Philipp Tomsich, Jeff Law, Vineet Gupta
Cc: Christoph Müllner
Subject: [PATCH 5/7] riscv: Use by-pieces to do overlapping accesses in
 block_move_straight
Date: Mon, 14 Nov 2022 00:05:19 +0100
Message-Id: <20221113230521.712693-6-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu>
References: <20221113230521.712693-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

The current implementation of riscv_block_move_straight() emits as many
load-store pairs of maximum width (e.g. 8 bytes on RV64) as fit into the
given length.  The remainder is handed over to move_by_pieces(), which
emits code based on target settings such as slow_unaligned_access and
overlap_op_by_pieces.

move_by_pieces() will emit overlapping memory accesses with maximum
width only if the given length exceeds the size of one access
(e.g. 15 bytes with 8-byte accesses).

This patch changes the implementation of riscv_block_move_straight()
such that it preserves a remainder within the interval [delta..2*delta)
instead of [0..delta), so that overlapping memory accesses may be
emitted (if the requirements for them are met).

gcc/ChangeLog:

	* config/riscv/riscv-string.cc (riscv_block_move_straight):
	Adjust range for emitted load/store pairs.

Signed-off-by: Christoph Müllner
---
 gcc/config/riscv/riscv-string.cc              |  8 ++++----
 .../gcc.target/riscv/memcpy-overlapping.c     | 19 ++++++++----------
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 6882f0be269..1137df475be 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -57,18 +57,18 @@ riscv_block_move_straight (rtx dest, rtx src, unsigned HOST_WIDE_INT length)
   delta = bits / BITS_PER_UNIT;
 
   /* Allocate a buffer for the temporary registers.  */
-  regs = XALLOCAVEC (rtx, length / delta);
+  regs = XALLOCAVEC (rtx, length / delta - 1);
 
   /* Load as many BITS-sized chunks as possible.  Use a normal load if
      the source has enough alignment, otherwise use left/right pairs.  */
-  for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++)
+  for (offset = 0, i = 0; offset + 2 * delta <= length; offset += delta, i++)
     {
       regs[i] = gen_reg_rtx (mode);
       riscv_emit_move (regs[i], adjust_address (src, mode, offset));
     }
 
   /* Copy the chunks to the destination.  */
-  for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++)
+  for (offset = 0, i = 0; offset + 2 * delta <= length; offset += delta, i++)
     riscv_emit_move (adjust_address (dest, mode, offset), regs[i]);
 
   /* Mop up any left-over bytes.  */
@@ -166,7 +166,7 @@ riscv_expand_block_move (rtx dest, rtx src, rtx length)
 
       if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor))
         {
-          riscv_block_move_straight (dest, src, INTVAL (length));
+          riscv_block_move_straight (dest, src, hwi_length);
           return true;
         }
       else if (optimize && align >= BITS_PER_WORD)
diff --git a/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c b/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c
index ffb7248bfd1..ef95bfb879b 100644
--- a/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c
+++ b/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c
@@ -25,26 +25,23 @@ COPY_N(15)
 /* Emits 2x {ld,sd} and 1x {lw,sw}. */
 COPY_N(19)
 
-/* Emits 3x ld and 3x sd. */
+/* Emits 3x {ld,sd}. */
 COPY_N(23)
 
 /* The by-pieces infrastructure handles up to 24 bytes.
    So the code below is emitted via cpymemsi/block_move_straight. */
 
-/* Emits 3x {ld,sd} and 1x {lhu,lbu,sh,sb}. */
+/* Emits 3x {ld,sd} and 1x {lw,sw}. */
 COPY_N(27)
 
-/* Emits 3x {ld,sd} and 1x {lw,lbu,sw,sb}. */
+/* Emits 4x {ld,sd}. */
 COPY_N(29)
 
-/* Emits 3x {ld,sd} and 2x {lw,sw}. */
+/* Emits 4x {ld,sd}. */
 COPY_N(31)
 
-/* { dg-final { scan-assembler-times "ld\t" 21 } } */
-/* { dg-final { scan-assembler-times "sd\t" 21 } } */
+/* { dg-final { scan-assembler-times "ld\t" 23 } } */
+/* { dg-final { scan-assembler-times "sd\t" 23 } } */
 
-/* { dg-final { scan-assembler-times "lw\t" 5 } } */
-/* { dg-final { scan-assembler-times "sw\t" 5 } } */
-
-/* { dg-final { scan-assembler-times "lbu\t" 2 } } */
-/* { dg-final { scan-assembler-times "sb\t" 2 } } */
+/* { dg-final { scan-assembler-times "lw\t" 3 } } */
+/* { dg-final { scan-assembler-times "sw\t" 3 } } */
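
Note for reviewers: the effect of changing the loop bound from
"offset + delta <= length" to "offset + 2 * delta <= length" can be
checked with a small standalone model.  The sketch below is not GCC
code; struct split, count_pairs(), split_old() and split_new() are
hypothetical names used only for illustration.  It counts how many
full-width load/store pairs the loop would emit and how many bytes are
left for move_by_pieces().

```c
#include <assert.h>

/* Hypothetical helper type, not part of GCC: how a copy of LENGTH bytes
   is divided between the full-width loop and move_by_pieces().  */
struct split
{
  unsigned long pairs;      /* full-width load/store pairs from the loop */
  unsigned long remainder;  /* bytes left over for move_by_pieces() */
};

/* Count loop iterations for the bound "offset + reserve <= length",
   stepping by DELTA bytes per iteration.  */
static struct split
count_pairs (unsigned long length, unsigned long delta, unsigned long reserve)
{
  struct split s = { 0, 0 };
  unsigned long offset;

  assert (delta > 0);
  for (offset = 0; offset + reserve <= length; offset += delta)
    s.pairs++;
  s.remainder = length - offset;
  return s;
}

/* Loop bound before the patch: offset + delta <= length.  */
static struct split
split_old (unsigned long length, unsigned long delta)
{
  return count_pairs (length, delta, delta);
}

/* Loop bound after the patch: offset + 2 * delta <= length.  */
static struct split
split_new (unsigned long length, unsigned long delta)
{
  return count_pairs (length, delta, 2 * delta);
}
```

With delta = 8, split_old (27, 8) gives 3 pairs and a 3-byte remainder,
while split_new (27, 8) gives 2 pairs and an 11-byte remainder, which
lies in [delta..2*delta).  The model counts iterations instead of using
the closed form "length / delta - 1" from the patch, because that
expression only matches the loop count when length >= delta.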