[rs6000] Call vector load/store with length expand only on 64-bit Power10 [PR96762]

Message ID 528dd350-d75e-d0d8-0b91-326151b274e5@linux.ibm.com
State Accepted
Headers
Series [rs6000] Call vector load/store with length expand only on 64-bit Power10 [PR96762] |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

HAO CHEN GUI Aug. 29, 2023, 2:50 a.m. UTC
  Hi,
  This patch adds "TARGET_64BIT" check when calling vector load/store
with length expand in expand_block_move. It matches the expand condition
of "lxvl" and "stxvl" defined in vsx.md.

  This patch fixes the ICE occurred with the test case on 32-bit Power10.

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.

Thanks
Gui Haochen


ChangeLog
rs6000: call vector load/store with length expand only on 64-bit Power10

gcc/
	PR target/96762
	* config/rs6000/rs6000-string.cc (expand_block_move): Call vector
	load/store with length expand only on 64-bit Power10.

gcc/testsuite/
	PR target/96762
	* gcc.target/powerpc/pr96762.c: New.


patch.diff
  

Comments

Kewen.Lin Aug. 29, 2023, 8:55 a.m. UTC | #1
Hi Haochen,

on 2023/8/29 10:50, HAO CHEN GUI wrote:
> Hi,
>   This patch adds "TARGET_64BIT" check when calling vector load/store
> with length expand in expand_block_move. It matches the expand condition
> of "lxvl" and "stxvl" defined in vsx.md.
> 
>   This patch fixes the ICE occurred with the test case on 32-bit Power10.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> 
> Thanks
> Gui Haochen
> 
> 
> ChangeLog
> rs6000: call vector load/store with length expand only on 64-bit Power10
> 
> gcc/
> 	PR target/96762
> 	* config/rs6000/rs6000-string.cc (expand_block_move): Call vector
> 	load/store with length expand only on 64-bit Power10.
> 
> gcc/testsuite/
> 	PR target/96762
> 	* gcc.target/powerpc/pr96762.c: New.
> 
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000-string.cc b/gcc/config/rs6000/rs6000-string.cc
> index cd8ee8c..d1b48c2 100644
> --- a/gcc/config/rs6000/rs6000-string.cc
> +++ b/gcc/config/rs6000/rs6000-string.cc
> @@ -2811,8 +2811,9 @@ expand_block_move (rtx operands[], bool might_overlap)
>  	  gen_func.mov = gen_vsx_movv2di_64bit;
>  	}
>        else if (TARGET_BLOCK_OPS_UNALIGNED_VSX
> -	       && TARGET_POWER10 && bytes < 16
> -	       && orig_bytes > 16
> +	       /* Only use lxvl/stxvl on 64bit POWER10.  */
> +	       && TARGET_POWER10 && TARGET_64BIT
> +	       && bytes < 16 && orig_bytes > 16
>  	       && !(bytes == 1 || bytes == 2
>  		    || bytes == 4 || bytes == 8)
>  	       && (align >= 128 || !STRICT_ALIGNMENT))

Nit: Since you touched this part of code, could you format it better as well, like:

      else if (TARGET_BLOCK_OPS_UNALIGNED_VSX
	       /* Only use lxvl/stxvl on 64bit POWER10.  */
	       && TARGET_POWER10
	       && TARGET_64BIT
	       && bytes < 16
	       && orig_bytes > 16
	       && !(bytes == 1
		    || bytes == 2
		    || bytes == 4
		    || bytes == 8)
	       && (align >= 128
		   || !STRICT_ALIGNMENT))


> diff --git a/gcc/testsuite/gcc.target/powerpc/pr96762.c b/gcc/testsuite/gcc.target/powerpc/pr96762.c
> new file mode 100644
> index 0000000..1145dd1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr96762.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target ilp32 } } */

Nit: we can compile this on lp64, so you can remove the ilp32 restriction,
...

> +/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
> +

... but add one comment line to note the initial purpose, like:

/* Verify there is no ICE on ilp32 env.  */

or similar.

Okay for trunk with these nits fixed, thanks!

BR,
Kewen

> +extern void foo (char *);
> +
> +void
> +bar (void)
> +{
> +  char zj[] = "XXXXXXXXXXXXXXXX";
> +  foo (zj);
> +}
  
HAO CHEN GUI Aug. 31, 2023, 5:47 a.m. UTC | #2
Kewen,
  I refined the patch according to your comments and it passed bootstrap
and regression test.

  I committed it as
https://gcc.gnu.org/g:946b8967b905257ac9f140225db744c9a6ab91be

Thanks
Gui Haochen

在 2023/8/29 16:55, Kewen.Lin 写道:
> Hi Haochen,
> 
> on 2023/8/29 10:50, HAO CHEN GUI wrote:
>> Hi,
>>   This patch adds "TARGET_64BIT" check when calling vector load/store
>> with length expand in expand_block_move. It matches the expand condition
>> of "lxvl" and "stxvl" defined in vsx.md.
>>
>>   This patch fixes the ICE occurred with the test case on 32-bit Power10.
>>
>>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
>>
>> Thanks
>> Gui Haochen
>>
>>
>> ChangeLog
>> rs6000: call vector load/store with length expand only on 64-bit Power10
>>
>> gcc/
>> 	PR target/96762
>> 	* config/rs6000/rs6000-string.cc (expand_block_move): Call vector
>> 	load/store with length expand only on 64-bit Power10.
>>
>> gcc/testsuite/
>> 	PR target/96762
>> 	* gcc.target/powerpc/pr96762.c: New.
>>
>>
>> patch.diff
>> diff --git a/gcc/config/rs6000/rs6000-string.cc b/gcc/config/rs6000/rs6000-string.cc
>> index cd8ee8c..d1b48c2 100644
>> --- a/gcc/config/rs6000/rs6000-string.cc
>> +++ b/gcc/config/rs6000/rs6000-string.cc
>> @@ -2811,8 +2811,9 @@ expand_block_move (rtx operands[], bool might_overlap)
>>  	  gen_func.mov = gen_vsx_movv2di_64bit;
>>  	}
>>        else if (TARGET_BLOCK_OPS_UNALIGNED_VSX
>> -	       && TARGET_POWER10 && bytes < 16
>> -	       && orig_bytes > 16
>> +	       /* Only use lxvl/stxvl on 64bit POWER10.  */
>> +	       && TARGET_POWER10 && TARGET_64BIT
>> +	       && bytes < 16 && orig_bytes > 16
>>  	       && !(bytes == 1 || bytes == 2
>>  		    || bytes == 4 || bytes == 8)
>>  	       && (align >= 128 || !STRICT_ALIGNMENT))
> 
> Nit: Since you touched this part of code, could you format it better as well, like:
> 
>       else if (TARGET_BLOCK_OPS_UNALIGNED_VSX
> 	       /* Only use lxvl/stxvl on 64bit POWER10.  */
> 	       && TARGET_POWER10
> 	       && TARGET_64BIT
> 	       && bytes < 16
> 	       && orig_bytes > 16
> 	       && !(bytes == 1
> 		    || bytes == 2
> 		    || bytes == 4
> 		    || bytes == 8)
> 	       && (align >= 128
> 		   || !STRICT_ALIGNMENT))
> 
> 
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr96762.c b/gcc/testsuite/gcc.target/powerpc/pr96762.c
>> new file mode 100644
>> index 0000000..1145dd1
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr96762.c
>> @@ -0,0 +1,11 @@
>> +/* { dg-do compile { target ilp32 } } */
> 
> Nit: we can compile this on lp64, so you can remove the ilp32 restriction,
> ...
> 
>> +/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
>> +
> 
> ... but add one comment line to note the initial purpose, like:
> 
> /* Verify there is no ICE on ilp32 env.  */
> 
> or similar.
> 
> Okay for trunk with these nits fixed, thanks!
> 
> BR,
> Kewen
> 
>> +extern void foo (char *);
>> +
>> +void
>> +bar (void)
>> +{
>> +  char zj[] = "XXXXXXXXXXXXXXXX";
>> +  foo (zj);
>> +}
  
Kewen.Lin Aug. 31, 2023, 6:14 a.m. UTC | #3
on 2023/8/31 13:47, HAO CHEN GUI wrote:
> Kewen,
>   I refined the patch according to your comments and it passed bootstrap
> and regression test.
> 
>   I committed it as
> https://gcc.gnu.org/g:946b8967b905257ac9f140225db744c9a6ab91be

Thanks!  We want this to be backported, so it's also ok for backporting to all
affected branch releases after a week or so.

BR,
Kewen
  

Patch

diff --git a/gcc/config/rs6000/rs6000-string.cc b/gcc/config/rs6000/rs6000-string.cc
index cd8ee8c..d1b48c2 100644
--- a/gcc/config/rs6000/rs6000-string.cc
+++ b/gcc/config/rs6000/rs6000-string.cc
@@ -2811,8 +2811,9 @@  expand_block_move (rtx operands[], bool might_overlap)
 	  gen_func.mov = gen_vsx_movv2di_64bit;
 	}
       else if (TARGET_BLOCK_OPS_UNALIGNED_VSX
-	       && TARGET_POWER10 && bytes < 16
-	       && orig_bytes > 16
+	       /* Only use lxvl/stxvl on 64bit POWER10.  */
+	       && TARGET_POWER10 && TARGET_64BIT
+	       && bytes < 16 && orig_bytes > 16
 	       && !(bytes == 1 || bytes == 2
 		    || bytes == 4 || bytes == 8)
 	       && (align >= 128 || !STRICT_ALIGNMENT))
diff --git a/gcc/testsuite/gcc.target/powerpc/pr96762.c b/gcc/testsuite/gcc.target/powerpc/pr96762.c
new file mode 100644
index 0000000..1145dd1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr96762.c
@@ -0,0 +1,11 @@ 
+/* { dg-do compile { target ilp32 } } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10" } */
+
+extern void foo (char *);
+
+void
+bar (void)
+{
+  char zj[] = "XXXXXXXXXXXXXXXX";
+  foo (zj);
+}