From patchwork Sat Sep 9 07:03:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Xi Ruoyao
X-Patchwork-Id: 137824
Subject: Pushed: [PATCH v2] LoongArch: Use LSX and LASX for block move
To: chenglulu, gcc-patches@gcc.gnu.org
Cc: Chenghui Pan, i@xen0n.name, xuchenghua@loongson.cn
Date: Sat, 09 Sep 2023 15:03:40 +0800
References: <20230907161407.27338-2-xry111@xry111.site>
User-Agent: Evolution 3.48.4
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Xi Ruoyao via Gcc-patches
From: Xi Ruoyao
Reply-To: Xi Ruoyao

Pushed r14-3818 with test cases added.  The pushed patch is attached.

On Sat, 2023-09-09 at 14:10 +0800, chenglulu wrote:
>
> On 2023/9/8 at 12:14 AM, Xi Ruoyao wrote:
> > gcc/ChangeLog:
> >
> >         * config/loongarch/loongarch.h (LARCH_MAX_MOVE_PER_INSN):
> >         Define to the maximum amount of bytes able to be loaded or
> >         stored with one machine instruction.
> >         * config/loongarch/loongarch.cc (loongarch_mode_for_move_size):
> >         New static function.
> >         (loongarch_block_move_straight): Call
> >         loongarch_mode_for_move_size for machine_mode to be moved.
> >         (loongarch_expand_block_move): Use LARCH_MAX_MOVE_PER_INSN
> >         instead of UNITS_PER_WORD.
> > ---
> >
> > Bootstrapped and regtested on loongarch64-linux-gnu, with PR110939 patch
> > applied, the "lib_build_self_spec = %<..." line in t-linux commented out
> > (because it's silently making -mlasx in BOOT_CFLAGS ineffective, Yujie
> > is working on a proper fix), and BOOT_CFLAGS="-O3 -mlasx".  Ok for trunk?
>
> I think test cases need to be added here.
>
> Otherwise OK, thanks!

/* snip */

From 35adc54b55aa199f17e2c84e382792e424b6171e Mon Sep 17 00:00:00 2001
From: Xi Ruoyao
Date: Tue, 5 Sep 2023 21:02:38 +0800
Subject: [PATCH v2] LoongArch: Use LSX and LASX for block move

gcc/ChangeLog:

        * config/loongarch/loongarch.h (LARCH_MAX_MOVE_PER_INSN):
        Define to the maximum amount of bytes able to be loaded or
        stored with one machine instruction.
        * config/loongarch/loongarch.cc (loongarch_mode_for_move_size):
        New static function.
        (loongarch_block_move_straight): Call
        loongarch_mode_for_move_size for machine_mode to be moved.
        (loongarch_expand_block_move): Use LARCH_MAX_MOVE_PER_INSN
        instead of UNITS_PER_WORD.

gcc/testsuite/ChangeLog:

        * gcc.target/loongarch/memcpy-vec-1.c: New test.
        * gcc.target/loongarch/memcpy-vec-2.c: New test.
        * gcc.target/loongarch/memcpy-vec-3.c: New test.
---
 gcc/config/loongarch/loongarch.cc             | 22 +++++++++++++++----
 gcc/config/loongarch/loongarch.h              |  3 +++
 .../gcc.target/loongarch/memcpy-vec-1.c       | 11 ++++++++++
 .../gcc.target/loongarch/memcpy-vec-2.c       | 12 ++++++++++
 .../gcc.target/loongarch/memcpy-vec-3.c       |  6 +++++
 5 files changed, 50 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/memcpy-vec-1.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/memcpy-vec-2.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/memcpy-vec-3.c

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 6698414281e..509ef2b97f1 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -5191,6 +5191,20 @@ loongarch_function_ok_for_sibcall (tree decl ATTRIBUTE_UNUSED,
   return true;
 }
 
+static machine_mode
+loongarch_mode_for_move_size (HOST_WIDE_INT size)
+{
+  switch (size)
+    {
+    case 32:
+      return V32QImode;
+    case 16:
+      return V16QImode;
+    }
+
+  return int_mode_for_size (size * BITS_PER_UNIT, 0).require ();
+}
+
 /* Emit straight-line code to move LENGTH bytes from SRC to DEST.
    Assume that the areas do not overlap.  */
 
@@ -5220,7 +5234,7 @@ loongarch_block_move_straight (rtx dest, rtx src, HOST_WIDE_INT length,
 
   for (delta_cur = delta, i = 0, offs = 0; offs < length; delta_cur /= 2)
     {
-      mode = int_mode_for_size (delta_cur * BITS_PER_UNIT, 0).require ();
+      mode = loongarch_mode_for_move_size (delta_cur);
 
       for (; offs + delta_cur <= length; offs += delta_cur, i++)
         {
@@ -5231,7 +5245,7 @@ loongarch_block_move_straight (rtx dest, rtx src, HOST_WIDE_INT length,
 
   for (delta_cur = delta, i = 0, offs = 0; offs < length; delta_cur /= 2)
     {
-      mode = int_mode_for_size (delta_cur * BITS_PER_UNIT, 0).require ();
+      mode = loongarch_mode_for_move_size (delta_cur);
 
       for (; offs + delta_cur <= length; offs += delta_cur, i++)
         loongarch_emit_move (adjust_address (dest, mode, offs), regs[i]);
@@ -5326,8 +5340,8 @@ loongarch_expand_block_move (rtx dest, rtx src, rtx r_length, rtx r_align)
 
   HOST_WIDE_INT align = INTVAL (r_align);
 
-  if (!TARGET_STRICT_ALIGN || align > UNITS_PER_WORD)
-    align = UNITS_PER_WORD;
+  if (!TARGET_STRICT_ALIGN || align > LARCH_MAX_MOVE_PER_INSN)
+    align = LARCH_MAX_MOVE_PER_INSN;
 
   if (length <= align * LARCH_MAX_MOVE_OPS_STRAIGHT)
     {
diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index 3fc9dc43ab1..7e391205583 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -1181,6 +1181,9 @@ typedef struct {
    least twice.  */
 #define LARCH_MAX_MOVE_OPS_STRAIGHT (LARCH_MAX_MOVE_OPS_PER_LOOP_ITER * 2)
 
+#define LARCH_MAX_MOVE_PER_INSN \
+  (ISA_HAS_LASX ? 32 : (ISA_HAS_LSX ? 16 : UNITS_PER_WORD))
+
 /* The base cost of a memcpy call, for MOVE_RATIO and friends.  These
    values were determined experimentally by benchmarking with CSiBE.  */
 
diff --git a/gcc/testsuite/gcc.target/loongarch/memcpy-vec-1.c b/gcc/testsuite/gcc.target/loongarch/memcpy-vec-1.c
new file mode 100644
index 00000000000..8d9fedc9e4f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/memcpy-vec-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mabi=lp64d -march=la464 -mno-strict-align" } */
+/* { dg-final { scan-assembler-times "xvst" 2 } } */
+/* { dg-final { scan-assembler-times "\tvst" 1 } } */
+/* { dg-final { scan-assembler-times "st\\.d|stptr\\.d" 1 } } */
+/* { dg-final { scan-assembler-times "st\\.w|stptr\\.w" 1 } } */
+/* { dg-final { scan-assembler-times "st\\.h" 1 } } */
+/* { dg-final { scan-assembler-times "st\\.b" 1 } } */
+
+extern char a[], b[];
+void test() { __builtin_memcpy(a, b, 95); }
diff --git a/gcc/testsuite/gcc.target/loongarch/memcpy-vec-2.c b/gcc/testsuite/gcc.target/loongarch/memcpy-vec-2.c
new file mode 100644
index 00000000000..6b28b884db0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/memcpy-vec-2.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mabi=lp64d -march=la464 -mno-strict-align" } */
+/* { dg-final { scan-assembler-times "xvst" 2 } } */
+/* { dg-final { scan-assembler-times "\tvst" 1 } } */
+/* { dg-final { scan-assembler-times "st\\.d|stptr\\.d" 1 } } */
+/* { dg-final { scan-assembler-times "st\\.w|stptr\\.w" 1 } } */
+/* { dg-final { scan-assembler-times "st\\.h" 1 } } */
+/* { dg-final { scan-assembler-times "st\\.b" 1 } } */
+
+typedef char __attribute__ ((vector_size (32), aligned (32))) vec;
+extern vec a[], b[];
+void test() { __builtin_memcpy(a, b, 95); }
diff --git a/gcc/testsuite/gcc.target/loongarch/memcpy-vec-3.c b/gcc/testsuite/gcc.target/loongarch/memcpy-vec-3.c
new file mode 100644
index 00000000000..233ed215078
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/memcpy-vec-3.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=la464 -mabi=lp64d -mstrict-align" } */
+/* { dg-final { scan-assembler-not "vst" } } */
+
+extern char a[], b[];
+void test() { __builtin_memcpy(a, b, 16); }
-- 
2.42.0