[11/11] aarch64: Use individual loads/stores for mem{cpy,set} expansion

Message ID ZVZbYrRa/M+jTFcm@arm.com
State Unresolved

Checks

Context                Check    Description
snail/gcc-patch-check  warning  Git am fail log

Commit Message

Alex Coplan Nov. 16, 2023, 6:11 p.m. UTC
This patch adjusts the mem{cpy,set} expansion in the aarch64 backend to use
individual loads/stores instead of ldp/stp at expand time.  The idea is to rely
on the ldp fusion pass to fuse the accesses together later in the RTL pipeline.

The earlier parts of the RTL pipeline should be able to do a better job with the
individual (non-paired) accesses, especially given that an earlier patch in this
series moves the pair representation to use unspecs.

Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?

Thanks,
Alex

gcc/ChangeLog:

	* config/aarch64/aarch64.cc
	(aarch64_copy_one_block_and_progress_pointers): Emit individual
	accesses instead of load/store pairs.
	(aarch64_set_one_block_and_progress_pointer): Likewise.
---
 gcc/config/aarch64/aarch64.cc | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)
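
For illustration, the kind of source that reaches this expansion is a fixed-size 32-byte copy or set, which corresponds to the 32-byte pointer step in the two functions touched by the patch. The sketch below is not part of the patch: the names `copy32` and `set32` are hypothetical, and whether GCC expands the calls inline, and whether the later fusion pass then forms ldp/stp, is assumed to depend on the optimization level and target tuning.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical example functions, not taken from the GCC testsuite.

// A constant 32-byte copy.  With this patch the expander would emit two
// individual loads and two individual stores rather than a load pair and
// a store pair, leaving pair formation to the later fusion pass.
void copy32 (unsigned char *dst, const unsigned char *src)
{
  std::memcpy (dst, src, 32);
}

// A constant 32-byte set.  With this patch the expander would emit two
// individual stores of the same value rather than one store pair.
void set32 (unsigned char *dst, int c)
{
  std::memset (dst, c, 32);
}
```

Compiling such a file for aarch64 with optimization enabled and inspecting the assembly is one way to check that the paired instructions are still formed after the change.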
  

Comments

Richard Sandiford Nov. 22, 2023, 11:26 a.m. UTC | #1
Alex Coplan <alex.coplan@arm.com> writes:
> This patch adjusts the mem{cpy,set} expansion in the aarch64 backend to use
> individual loads/stores instead of ldp/stp at expand time.  The idea is to rely
> on the ldp fusion pass to fuse the accesses together later in the RTL pipeline.
>
> The earlier parts of the RTL pipeline should be able to do a better job with the
> individual (non-paired) accesses, especially given that an earlier patch in this
> series moves the pair representation to use unspecs.
>
> Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?
>
> Thanks,
> Alex
>
> gcc/ChangeLog:
>
> 	* config/aarch64/aarch64.cc
> 	(aarch64_copy_one_block_and_progress_pointers): Emit individual
> 	accesses instead of load/store pairs.
> 	(aarch64_set_one_block_and_progress_pointer): Likewise.

OK, thanks.

Richard

  

Patch

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 1f6094bf1bc..315ba7119c0 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -25457,9 +25457,12 @@  aarch64_copy_one_block_and_progress_pointers (rtx *src, rtx *dst,
       /* "Cast" the pointers to the correct mode.  */
       *src = adjust_address (*src, mode, 0);
       *dst = adjust_address (*dst, mode, 0);
-      /* Emit the memcpy.  */
-      emit_insn (aarch64_gen_load_pair (reg1, reg2, *src));
-      emit_insn (aarch64_gen_store_pair (*dst, reg1, reg2));
+      /* Emit the memcpy.  The load/store pair pass should form
+	 a load/store pair from these moves.  */
+      emit_move_insn (reg1, *src);
+      emit_move_insn (reg2, aarch64_progress_pointer (*src));
+      emit_move_insn (*dst, reg1);
+      emit_move_insn (aarch64_progress_pointer (*dst), reg2);
       /* Move the pointers forward.  */
       *src = aarch64_move_pointer (*src, 32);
       *dst = aarch64_move_pointer (*dst, 32);
@@ -25638,7 +25641,8 @@  aarch64_set_one_block_and_progress_pointer (rtx src, rtx *dst,
       /* "Cast" the *dst to the correct mode.  */
       *dst = adjust_address (*dst, mode, 0);
       /* Emit the memset.  */
-      emit_insn (aarch64_gen_store_pair (*dst, src, src));
+      emit_move_insn (*dst, src);
+      emit_move_insn (aarch64_progress_pointer (*dst), src);
 
       /* Move the pointers forward.  */
       *dst = aarch64_move_pointer (*dst, 32);
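
The division of labour the patch relies on (emit individual accesses at expand time, pair them later) can be sketched with a toy model. This is only an illustration under stated assumptions: `Access`, `Insn`, `expand_copy32` and `fuse_pairs` are hypothetical names, not GCC data structures or functions, and the pairing rule here (same kind, same base, consecutive offsets, immediately adjacent) is far simpler than what the real ldp fusion pass handles.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a single memory access; not GCC's RTL representation.
struct Access {
  bool is_load;  // true for a load, false for a store
  int base;      // hypothetical base register number
  long offset;   // byte offset from the base
  int reg;       // hypothetical data register number
};

// Toy instruction: either one access or a fused pair of accesses.
struct Insn {
  Access first;
  bool paired;
  Access second; // meaningful only when paired is true
};

// Model of the new expansion: a 32-byte copy as two 16-byte loads
// followed by two 16-byte stores, all emitted individually.
std::vector<Access> expand_copy32 (int src_base, int dst_base)
{
  return { { true, src_base, 0, 1 },  { true, src_base, 16, 2 },
           { false, dst_base, 0, 1 }, { false, dst_base, 16, 2 } };
}

// Model of a later fusion step: pair two neighbouring accesses when they
// are of the same kind, share a base and touch consecutive slots.
std::vector<Insn> fuse_pairs (const std::vector<Access> &in, long size = 16)
{
  std::vector<Insn> out;
  for (std::size_t i = 0; i < in.size (); ++i)
    {
      if (i + 1 < in.size ())
        {
          const Access &a = in[i];
          const Access &b = in[i + 1];
          bool ok = a.is_load == b.is_load && a.base == b.base
                    && b.offset == a.offset + size;
          // A load pair needs two distinct destination registers.
          if (ok && a.is_load && a.reg == b.reg)
            ok = false;
          if (ok)
            {
              out.push_back ({ a, true, b });
              ++i; // the second access has been consumed by the pair
              continue;
            }
        }
      out.push_back ({ in[i], false, Access{} });
    }
  return out;
}
```

In this model, `fuse_pairs (expand_copy32 (0, 1))` yields two paired instructions, one load pair and one store pair, which mirrors the intent of the patch: the same final code, with the pairing decision deferred.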