MIPS: don't expand large block move

Message ID 20230519061152.3154332-1-yunqiang.su@cipunited.com
State Accepted
Headers
Series MIPS: don't expand large block move |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

YunQiang Su May 19, 2023, 6:11 a.m. UTC
  On platform with LWL/LWR, mips_block_move_loop is always used,
which expand __buildin_memcpy/strcpy to a loop of lwl/lwr/swl/swl etc.

For short (normally <=64), it has better performance,
but when the src/dest are long, use memcpy/strcpy lib call may have
better performance.

At the same time, lib call may be optimized with SIMD, so,
on the platform with SIMD, lib call may have much better performace.

gcc/ChangeLog:
	* config/mips/mips.cc (mips_expand_block_move): don't expand
	  if length>=64.

gcc/testsuite/ChangeLog:
	* gcc.target/mips/expand-block-move-large.c: New test.
---
 gcc/config/mips/mips.cc                         |  6 ++++++
 .../gcc.target/mips/expand-block-move-large.c   | 17 +++++++++++++++++
 2 files changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/mips/expand-block-move-large.c
  

Comments

Jeff Law May 19, 2023, 4:56 p.m. UTC | #1
On 5/19/23 00:11, YunQiang Su wrote:
> On platform with LWL/LWR, mips_block_move_loop is always used,
> which expand __buildin_memcpy/strcpy to a loop of lwl/lwr/swl/swl etc.
> 
> For short (normally <=64), it has better performance,
> but when the src/dest are long, use memcpy/strcpy lib call may have
> better performance.
> 
> At the same time, lib call may be optimized with SIMD, so,
> on the platform with SIMD, lib call may have much better performace.
> 
> gcc/ChangeLog:
> 	* config/mips/mips.cc (mips_expand_block_move): don't expand
> 	  if length>=64.
> 
> gcc/testsuite/ChangeLog:
> 	* gcc.target/mips/expand-block-move-large.c: New test.
> ---
>   gcc/config/mips/mips.cc                         |  6 ++++++
>   .../gcc.target/mips/expand-block-move-large.c   | 17 +++++++++++++++++
>   2 files changed, 23 insertions(+)
>   create mode 100644 gcc/testsuite/gcc.target/mips/expand-block-move-large.c
> 
> diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
> index ca491b981a3..00f26d5e923 100644
> --- a/gcc/config/mips/mips.cc
> +++ b/gcc/config/mips/mips.cc
> @@ -8313,6 +8313,12 @@ mips_expand_block_move (rtx dest, rtx src, rtx length)
>   	}
>         else if (optimize)
>   	{
> +	  /* When the length is big enough, the lib call has better performace
> +	     than load/store insns.
> +	     In most platform, the value is about 64-128.
> +	     And in fact lib call may be optimized with SIMD */
> +	  if (INTVAL(length) >= 64)
> +	    return false;
Just a formatting nit.  Space between INTVAL and the open paren for its 
argument list.

OK with that change.

jeff
  
Maciej W. Rozycki May 19, 2023, 7:21 p.m. UTC | #2
On Fri, 19 May 2023, Jeff Law wrote:

> > diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
> > index ca491b981a3..00f26d5e923 100644
> > --- a/gcc/config/mips/mips.cc
> > +++ b/gcc/config/mips/mips.cc
> > @@ -8313,6 +8313,12 @@ mips_expand_block_move (rtx dest, rtx src, rtx
> > length)
> >   	}
> >         else if (optimize)
> >   	{
> > +	  /* When the length is big enough, the lib call has better performace
> > +	     than load/store insns.
> > +	     In most platform, the value is about 64-128.
> > +	     And in fact lib call may be optimized with SIMD */
> > +	  if (INTVAL(length) >= 64)
> > +	    return false;
> Just a formatting nit.  Space between INTVAL and the open paren for its
> argument list.

 This is oddly wrapped too.  I'd move "performace" (typo there!) to the 
second line, to align better with the rest of the text.

 Plus s/platform/platforms/ and there's a full stop missing along with two 
spaces at the end.  Also there's inconsistent style around <= and >=; the 
GNU Coding Standards ask for spaces around binary operators.  And "don't" 
in the change heading ought to be capitalised.

 In fact, I'd justify the whole paragraph as each sentence doesn't have to 
start on a new line, and the commit description could benefit from some 
reformatting too, as it's now odd to read.

> OK with that change.

 I think the conditional would be better readable if it was flattened 
though:

      if (INTVAL (length) <= MIPS_MAX_MOVE_BYTES_STRAIGHT)
	...
      else if (INTVAL (length) >= 64)
	...
      else if (optimize)
	...

or even:

      if (INTVAL (length) <= MIPS_MAX_MOVE_BYTES_STRAIGHT)
	...
      else if (INTVAL (length) < 64 && optimize)
	...

One just wouldn't write it as proposed if creating the whole piece from 
scratch rather than retrofitting this extra conditional.

 Ultimately it may have to be tunable as LWL/LWR, etc. may be subject to 
fusion and may be faster after all.

  Maciej
  
YunQiang Su May 24, 2023, 2:04 a.m. UTC | #3
Maciej W. Rozycki <macro@orcam.me.uk> 于2023年5月20日周六 03:21写道:
>
> On Fri, 19 May 2023, Jeff Law wrote:
>
> > > diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
> > > index ca491b981a3..00f26d5e923 100644
> > > --- a/gcc/config/mips/mips.cc
> > > +++ b/gcc/config/mips/mips.cc
> > > @@ -8313,6 +8313,12 @@ mips_expand_block_move (rtx dest, rtx src, rtx
> > > length)
> > >     }
> > >         else if (optimize)
> > >     {
> > > +     /* When the length is big enough, the lib call has better performace
> > > +        than load/store insns.
> > > +        In most platform, the value is about 64-128.
> > > +        And in fact lib call may be optimized with SIMD */
> > > +     if (INTVAL(length) >= 64)
> > > +       return false;
> > Just a formatting nit.  Space between INTVAL and the open paren for its
> > argument list.
>
>  This is oddly wrapped too.  I'd move "performace" (typo there!) to the
> second line, to align better with the rest of the text.
>
>  Plus s/platform/platforms/ and there's a full stop missing along with two
> spaces at the end.  Also there's inconsistent style around <= and >=; the
> GNU Coding Standards ask for spaces around binary operators.  And "don't"
> in the change heading ought to be capitalised.
>
>  In fact, I'd justify the whole paragraph as each sentence doesn't have to
> start on a new line, and the commit description could benefit from some
> reformatting too, as it's now odd to read.
>

Thank you. I will fix these problems.

> > OK with that change.
>
>  I think the conditional would be better readable if it was flattened
> though:
>
>       if (INTVAL (length) <= MIPS_MAX_MOVE_BYTES_STRAIGHT)
>         ...
>       else if (INTVAL (length) >= 64)
>         ...
>       else if (optimize)
>         ...
>

This sounds good.

> or even:
>
>       if (INTVAL (length) <= MIPS_MAX_MOVE_BYTES_STRAIGHT)
>         ...
>       else if (INTVAL (length) < 64 && optimize)
>         ...
>

I don't think this is a good option, since somebody may add some code,
and may break our logic.

> One just wouldn't write it as proposed if creating the whole piece from
> scratch rather than retrofitting this extra conditional.
>
>  Ultimately it may have to be tunable as LWL/LWR, etc. may be subject to
> fusion and may be faster after all.
>

oohhh, you are right.
And in fact this patch has some problems:
If the data is aligned, the value is about 1024, instead of 64-128.

>   Maciej
  
Maciej W. Rozycki May 30, 2023, 11:44 a.m. UTC | #4
On Wed, 24 May 2023, YunQiang Su wrote:

> > or even:
> >
> >       if (INTVAL (length) <= MIPS_MAX_MOVE_BYTES_STRAIGHT)
> >         ...
> >       else if (INTVAL (length) < 64 && optimize)
> >         ...
> >
> 
> I don't think this is a good option, since somebody may add some code,
> and may break our logic.

 There's no need to plan ahead for changes, which may never happen.  As it 
stands reading through such flattened code here requires one step less to 
analyse as there's one nested level and one exit point less here.  If more 
cases have to be added in the future, then whoever needs them will make 
any necessary adjustments to the structure (assuming minimal understanding 
how code works, which I think is a reasonable requirement for working on a 
compiler).

  Maciej
  

Patch

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index ca491b981a3..00f26d5e923 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -8313,6 +8313,12 @@  mips_expand_block_move (rtx dest, rtx src, rtx length)
 	}
       else if (optimize)
 	{
+	  /* When the length is big enough, the lib call has better performace
+	     than load/store insns.
+	     In most platform, the value is about 64-128.
+	     And in fact lib call may be optimized with SIMD */
+	  if (INTVAL(length) >= 64)
+	    return false;
 	  mips_block_move_loop (dest, src, INTVAL (length),
 				MIPS_MAX_MOVE_BYTES_PER_LOOP_ITER);
 	  return true;
diff --git a/gcc/testsuite/gcc.target/mips/expand-block-move-large.c b/gcc/testsuite/gcc.target/mips/expand-block-move-large.c
new file mode 100644
index 00000000000..ae054551a2a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/expand-block-move-large.c
@@ -0,0 +1,17 @@ 
+/* { dg-options "isa_rev<=5" } */
+/* { dg-final { scan-assembler-not "lwl" } } */
+/* { dg-final { scan-assembler-not "swl" } } */
+/* { dg-final { scan-assembler-not "lwr" } } */
+/* { dg-final { scan-assembler-not "swr" } } */
+/* { dg-final { scan-assembler-not "ldl" } } */
+/* { dg-final { scan-assembler-not "sdl" } } */
+/* { dg-final { scan-assembler-not "ldr" } } */
+/* { dg-final { scan-assembler-not "sdr" } } */
+
+char a[4097], b[4097];
+
+NOCOMPRESSION void
+foo (volatile int *x)
+{
+  __builtin_memcpy(&a[1], &b[1], 64);
+}