diff mbox series

[1/3] RISC-V: movmem for RISCV with V extension

Message ID	20231211094728.1623032-2-slewis@rivosinc.com
State	Accepted
Headers	Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A243A3858C98 From: Sergei Lewis <slewis@rivosinc.com> To: gcc-patches@gcc.gnu.org Subject: [PATCH 1/3] RISC-V: movmem for RISCV with V extension Date: Mon, 11 Dec 2023 09:47:26 +0000 Message-Id: <20231211094728.1623032-2-slewis@rivosinc.com> In-Reply-To: <20231211094728.1623032-1-slewis@rivosinc.com> References: <20231211094728.1623032-1-slewis@rivosinc.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: list Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX
Series	RISC-V: vectorised memory operations \| [0/3] RISC-V: vectorised memory operations [1/3] RISC-V: movmem for RISCV with V extension [2/3] RISC-V: setmem for RISCV with V extension [3/3] RISC-V: cmpmem for RISCV with V extension

Checks

Context	Check	Description
snail/gcc-patch-check	success	Github commit url

Commit Message

Sergei Lewis Dec. 11, 2023, 9:47 a.m. UTC

  gcc/ChangeLog

    * config/riscv/riscv.md (movmem<mode>): Use riscv_vector::expand_block_move,
    if and only if we know the entire operation can be performed using one vector
    load followed by one vector store

gcc/testsuite/ChangeLog

    * gcc.target/riscv/rvv/base/movmem-1.c: New test
---
 gcc/config/riscv/riscv.md                     | 22 +++++++
 .../gcc.target/riscv/rvv/base/movmem-1.c      | 59 +++++++++++++++++++
 2 files changed, 81 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c

Comments

Robin Dapp Dec. 11, 2023, 10:08 a.m. UTC | #1

Hi Sergei,

thanks for contributing this!

Small general remarks/nits upfront:

The code looks like it hasn't been run through clang-format or
similar.  Please make sure that it adheres to the GNU coding
conventions.  The same applies to comments.  Some of them start
in lowercase.

As you rely on the vector length, please make sure to test various
combinations (also "exotic" ones) like zve32 and zve64.
Also, please specify which configurations it has been tested on. 

>     * config/riscv/riscv.md (movmem<mode>): Use riscv_vector::expand_block_move,
>     if and only if we know the entire operation can be performed using one vector
>     load followed by one vector store
> 
> gcc/testsuite/ChangeLog
> 
>     * gcc.target/riscv/rvv/base/movmem-1.c: New test

Please add a PR target/112109 here.  I believe after these
patches have landed we can close that bug.

> ---
>  gcc/config/riscv/riscv.md                     | 22 +++++++
>  .../gcc.target/riscv/rvv/base/movmem-1.c      | 59 +++++++++++++++++++
>  2 files changed, 81 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c
> 
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index eed997116b0..88fde290a8a 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -2359,6 +2359,28 @@
>      FAIL;
>  })
>  
> +;; inlining general memmove is a pessimisation: we can't avoid having to decide
> +;; which direction to go at runtime, which is costly in instruction count
> +;; however for situations where the entire move fits in one vector operation
> +;; we can do all reads before doing any writes so we don't have to worry
> +;; so generate the inline vector code in such situations
> +;; nb. prefer scalar path for tiny memmoves
> +(define_expand "movmem<mode>"
> +  [(parallel [(set (match_operand:BLK 0 "general_operand")
> +      (match_operand:BLK 1 "general_operand"))
> +	      (use (match_operand:P 2 ""))
> +	      (use (match_operand:SI 3 "const_int_operand"))])]
> +  "TARGET_VECTOR"
> +{
> +  if ((INTVAL (operands[2]) >= TARGET_MIN_VLEN/8)

If operands[2] is used as an int we need to make sure the constraint
says so.  Shouldn't operand[1] be a memory_operand?

> +	&& (INTVAL (operands[2]) <= TARGET_MIN_VLEN)
> +	&& riscv_vector::expand_block_move (operands[0], operands[1],
> +	     operands[2]))
> +    DONE;
> +  else
> +    FAIL;
> +})
> +

> +#define MIN_VECTOR_BYTES (__riscv_v_min_vlen/8)
> +
> +/* tiny memmoves should not be vectorised
> +** f1:
> +**  li\s+a2,15
> +**  tail\s+memmove
> +*/
> +char * f1 (char *a, char const *b)
> +{
> +  return memmove (a, b, 15);
> +}

The < 16 assumption might not hold on embedded targets.
Same with the other tests.

Regards
 Robin

Robin Dapp Dec. 11, 2023, 10:21 a.m. UTC | #2

Ah, please also ensure to include (and follow) the stringop_strategy
checks. (LIBCALL, VECTOR)
The naming is a bit unfortunate still but that need not be fixed
in this patch.  

Regards
 Robin

diff mbox series

Patch

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index eed997116b0..88fde290a8a 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2359,6 +2359,28 @@ 
     FAIL;
 })
 
+;; inlining general memmove is a pessimisation: we can't avoid having to decide
+;; which direction to go at runtime, which is costly in instruction count
+;; however for situations where the entire move fits in one vector operation
+;; we can do all reads before doing any writes so we don't have to worry
+;; so generate the inline vector code in such situations
+;; nb. prefer scalar path for tiny memmoves
+(define_expand "movmem<mode>"
+  [(parallel [(set (match_operand:BLK 0 "general_operand")
+      (match_operand:BLK 1 "general_operand"))
+	      (use (match_operand:P 2 ""))
+	      (use (match_operand:SI 3 "const_int_operand"))])]
+  "TARGET_VECTOR"
+{
+  if ((INTVAL (operands[2]) >= TARGET_MIN_VLEN/8)
+	&& (INTVAL (operands[2]) <= TARGET_MIN_VLEN)
+	&& riscv_vector::expand_block_move (operands[0], operands[1],
+	     operands[2]))
+    DONE;
+  else
+    FAIL;
+})
+
 ;; Expand in-line code to clear the instruction cache between operand[0] and
 ;; operand[1].
 (define_expand "clear_cache"
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c b/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c
new file mode 100644
index 00000000000..b930241ae5d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/movmem-1.c
@@ -0,0 +1,59 @@ 
+/* { dg-do compile } */
+/* { dg-add-options riscv_v } */
+/* { dg-additional-options "-O3" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <string.h>
+
+#define MIN_VECTOR_BYTES (__riscv_v_min_vlen/8)
+
+/* tiny memmoves should not be vectorised
+** f1:
+**  li\s+a2,15
+**  tail\s+memmove
+*/
+char * f1 (char *a, char const *b)
+{
+  return memmove (a, b, 15);
+}
+
+/* vectorise+inline minimum vector register width with LMUL=1
+** f2:
+**  (
+**  vsetivli\s+zero,16,e8,m1,ta,ma
+**  |
+**  li\s+[ta][0-7],\d+
+**  vsetvli\s+zero,[ta][0-7],e8,m1,ta,ma
+**  )
+**  vle8\.v\s+v\d+,0\(a1\)
+**  vse8\.v\s+v\d+,0\(a0\)
+**  ret
+*/
+char * f2 (char *a, char const *b)
+{
+  return memmove (a, b, MIN_VECTOR_BYTES);
+}
+
+/* vectorise+inline up to LMUL=8
+** f3:
+**  li\s+[ta][0-7],\d+
+**  vsetvli\s+zero,[ta][0-7],e8,m8,ta,ma
+**  vle8\.v\s+v\d+,0\(a1\)
+**  vse8\.v\s+v\d+,0\(a0\)
+**  ret
+*/
+char * f3 (char *a, char const *b)
+{
+  return memmove (a, b, MIN_VECTOR_BYTES*8);
+}
+
+/* don't vectorise if the move is too large for one operation
+** f4:
+**  li\s+a2,\d+
+**  tail\s+memmove
+*/
+char * f4 (char *a, char const *b)
+{
+  return memmove (a, b, MIN_VECTOR_BYTES*8+1);
+}
+