PR target/112886, Add %S<n> to print_operand for vector pair support

Message ID ZZiAHl_TMOICKNPx@cowardly-lion.the-meissners.org
State Accepted
Headers
Series PR target/112886, Add %S<n> to print_operand for vector pair support |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Michael Meissner Jan. 5, 2024, 10:18 p.m. UTC
  In looking at support for load vector pair and store vector pair for the
PowerPC in GCC, I noticed that we were missing a print_operand output modifier
if you are dealing with vector pairs to print the 2nd register in the vector
pair.

If the instruction inside of the asm used the Altivec encoding, then we could
use the %L<n> modifier:

	__vector_pair *p, *q, *r;
	// ...
	__asm__ ("vaddudm %0,%1,%2\n\tvaddudm %L0,%L1,%L2"
		 : "=v" (*p)
		 : "v" (*q), "v" (*r));

Likewise if we know the value to be in a tradiational FPR register, %L<n> will
work for instructions that use the VSX encoding:

	__vector_pair *p, *q, *r;
	// ...
	__asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %L0,%L1,%L2"
		 : "=f" (*p)
		 : "f" (*q), "f" (*r));

But if have a value that is in a traditional Altivec register, and the
instruction uses the VSX encoding, %L<n> will a value between 0 and 31, when it
should give a value between 32 and 63.

This patch adds %S<n> that acts like %x<n>, except that it adds 1 to the
register number.

I have tested this on power10 and power9 little endian systems and on a power9
big endian system.  There were no regressions in the patch.  Can I apply it to
the trunk?

It would be nice if I could apply it to the open branches.  Can I backport it
after a burn-in period?

2024-01-04  Michael Meissner  <meissner@linux.ibm.com>

gcc/

	PR target/112886
	* config/rs6000/rs6000.cc (print_operand): Add %S<n> output modifier.
	* doc/md.texi (Modifiers): Mention %S can be used like %x.

gcc/testsuite/

	PR target/112886
	* /gcc.target/powerpc/pr112886.c: New test.
---
 gcc/config/rs6000/rs6000.cc                 | 10 +++++++---
 gcc/doc/md.texi                             |  5 +++--
 gcc/testsuite/gcc.target/powerpc/pr112886.c | 19 +++++++++++++++++++
 3 files changed, 29 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr112886.c
  

Comments

Peter Bergner Jan. 9, 2024, 10:35 p.m. UTC | #1
On 1/5/24 4:18 PM, Michael Meissner wrote:
> @@ -14504,13 +14504,17 @@ print_operand (FILE *file, rtx x, int code)
>  	print_operand (file, x, 0);
>        return;
>  
> +    case 'S':
>      case 'x':
> -      /* X is a FPR or Altivec register used in a VSX context.  */
> +      /* X is a FPR or Altivec register used in a VSX context.  %x<n> prints
> +	 the VSX register number, %S<n> prints the 2nd register number for
> +	 vector pair, decimal 128-bit floating and IBM 128-bit binary floating
> +	 values.  */
>        if (!REG_P (x) || !VSX_REGNO_P (REGNO (x)))
> -	output_operand_lossage ("invalid %%x value");
> +	output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : 'x'));
>        else
>  	{
> -	  int reg = REGNO (x);
> +	  int reg = REGNO (x) + (code == 'S' ? 1 : 0);
>  	  int vsx_reg = (FP_REGNO_P (reg)
>  			 ? reg - 32
>  			 : reg - FIRST_ALTIVEC_REGNO + 32);

The above looks good to me.  However:


> +	   : "=v" (*p)
> +	   : "v" (*q), "v" (*r));

These really should use "wa" rather than "v", since these are
VSX instructions... or did you use those to ensure you got
Altivec registers numbers assigned?



> +/* { dg-final { scan-assembler-times {\mxvadddp (3[2-9]|[45][0-9]|6[0-3]),(3[2-9]|[45][0-9]|6[0-3]),(3[2-9]|[45][0-9]|6[0-3])\M} 2 } } */

...and this is really ugly and hard to read/understand.  Can't we use
register variables to make it simpler?  Something like the following
which tests having both FPR and Altivec reg numbers assigned?

...
void
test (__vector_pair *ptr)
{
  register __vector_pair p asm ("vs10");
  register __vector_pair q asm ("vs42");
  register __vector_pair r asm ("vs44");
  q = ptr[1];
  r = ptr[2];
  __asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %S0,%S1,%S2"
	   : "=wa" (p)
	   : "wa" (q), "wa" (r));
  ptr[2] = p;
}

/* { dg-final { scan-assembler-times {\mxvadddp 10,42,44\M} 1 } } */
/* { dg-final { scan-assembler-times {\mxvadddp 11,43,45\M} 1 } } */
...

Peter
  
Michael Meissner Jan. 10, 2024, 8:52 a.m. UTC | #2
On Tue, Jan 09, 2024 at 04:35:22PM -0600, Peter Bergner wrote:
> On 1/5/24 4:18 PM, Michael Meissner wrote:
> > @@ -14504,13 +14504,17 @@ print_operand (FILE *file, rtx x, int code)
> >  	print_operand (file, x, 0);
> >        return;
> >  
> > +    case 'S':
> >      case 'x':
> > -      /* X is a FPR or Altivec register used in a VSX context.  */
> > +      /* X is a FPR or Altivec register used in a VSX context.  %x<n> prints
> > +	 the VSX register number, %S<n> prints the 2nd register number for
> > +	 vector pair, decimal 128-bit floating and IBM 128-bit binary floating
> > +	 values.  */
> >        if (!REG_P (x) || !VSX_REGNO_P (REGNO (x)))
> > -	output_operand_lossage ("invalid %%x value");
> > +	output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : 'x'));
> >        else
> >  	{
> > -	  int reg = REGNO (x);
> > +	  int reg = REGNO (x) + (code == 'S' ? 1 : 0);
> >  	  int vsx_reg = (FP_REGNO_P (reg)
> >  			 ? reg - 32
> >  			 : reg - FIRST_ALTIVEC_REGNO + 32);
> 
> The above looks good to me.  However:
> 
> 
> > +	   : "=v" (*p)
> > +	   : "v" (*q), "v" (*r));
> 
> These really should use "wa" rather than "v", since these are
> VSX instructions... or did you use those to ensure you got
> Altivec registers numbers assigned?

Yes in real code you would typically use "wa" instead of "v".  I used them in
the test to ensure that I was getting a register to show the problem.

But I can imagine circumstances where you are doing extended asm with 2 or more
instructions, one that uses an instruction that uses the VSX encoding (where
you would use %S<n> and the other where you use the Altivec encoding where you
would use %L<n>, and you would use the "v" constraint.

> > +/* { dg-final { scan-assembler-times {\mxvadddp (3[2-9]|[45][0-9]|6[0-3]),(3[2-9]|[45][0-9]|6[0-3]),(3[2-9]|[45][0-9]|6[0-3])\M} 2 } } */
> 
> ...and this is really ugly and hard to read/understand.  Can't we use
> register variables to make it simpler?  Something like the following
> which tests having both FPR and Altivec reg numbers assigned?
> 
> ...
> void
> test (__vector_pair *ptr)
> {
>   register __vector_pair p asm ("vs10");
>   register __vector_pair q asm ("vs42");
>   register __vector_pair r asm ("vs44");
>   q = ptr[1];
>   r = ptr[2];
>   __asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %S0,%S1,%S2"
> 	   : "=wa" (p)
> 	   : "wa" (q), "wa" (r));
>   ptr[2] = p;
> }
> 
> /* { dg-final { scan-assembler-times {\mxvadddp 10,42,44\M} 1 } } */
> /* { dg-final { scan-assembler-times {\mxvadddp 11,43,45\M} 1 } } */

Yes that probably will work.
  
Kewen.Lin Jan. 10, 2024, 9:06 a.m. UTC | #3
Hi Mike,

on 2024/1/6 06:18, Michael Meissner wrote:
> In looking at support for load vector pair and store vector pair for the
> PowerPC in GCC, I noticed that we were missing a print_operand output modifier
> if you are dealing with vector pairs to print the 2nd register in the vector
> pair.
> 
> If the instruction inside of the asm used the Altivec encoding, then we could
> use the %L<n> modifier:

It seems there is no Power specific documentation on operand modifiers like this
"%L"?

> 
> 	__vector_pair *p, *q, *r;
> 	// ...
> 	__asm__ ("vaddudm %0,%1,%2\n\tvaddudm %L0,%L1,%L2"
> 		 : "=v" (*p)
> 		 : "v" (*q), "v" (*r));
> 
> Likewise if we know the value to be in a tradiational FPR register, %L<n> will
> work for instructions that use the VSX encoding:
> 
> 	__vector_pair *p, *q, *r;
> 	// ...
> 	__asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %L0,%L1,%L2"
> 		 : "=f" (*p)
> 		 : "f" (*q), "f" (*r));
> 
> But if have a value that is in a traditional Altivec register, and the
> instruction uses the VSX encoding, %L<n> will a value between 0 and 31, when it
> should give a value between 32 and 63.
> 
> This patch adds %S<n> that acts like %x<n>, except that it adds 1 to the
> register number.

Excepting for Peter's comments, since the existing "%L" has different handlings
on REG_P and MEM_P:

    case 'L':
      /* Write second word of DImode or DFmode reference.  Works on register
	 or non-indexed memory only.  */
      if (REG_P (x))
	fputs (reg_names[REGNO (x) + 1], file);
      else if (MEM_P (x))
	...

, maybe we can extend the existing '%X' for this similarly (as it's capital of
%x so easier to remember and it's only used for MEM_P now) instead of introducing
a new "%S".  But one argument can be a new character is more clear.  Thoughts?

BR,
Kewen

> 
> I have tested this on power10 and power9 little endian systems and on a power9
> big endian system.  There were no regressions in the patch.  Can I apply it to
> the trunk?
> 
> It would be nice if I could apply it to the open branches.  Can I backport it
> after a burn-in period?
> 
> 2024-01-04  Michael Meissner  <meissner@linux.ibm.com>
> 
> gcc/
> 
> 	PR target/112886
> 	* config/rs6000/rs6000.cc (print_operand): Add %S<n> output modifier.
> 	* doc/md.texi (Modifiers): Mention %S can be used like %x.
> 
> gcc/testsuite/
> 
> 	PR target/112886
> 	* /gcc.target/powerpc/pr112886.c: New test.
> ---
>  gcc/config/rs6000/rs6000.cc                 | 10 +++++++---
>  gcc/doc/md.texi                             |  5 +++--
>  gcc/testsuite/gcc.target/powerpc/pr112886.c | 19 +++++++++++++++++++
>  3 files changed, 29 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr112886.c
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 5a7e00b03d1..ba89377c9ec 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -14504,13 +14504,17 @@ print_operand (FILE *file, rtx x, int code)
>  	print_operand (file, x, 0);
>        return;
>  
> +    case 'S':
>      case 'x':
> -      /* X is a FPR or Altivec register used in a VSX context.  */
> +      /* X is a FPR or Altivec register used in a VSX context.  %x<n> prints
> +	 the VSX register number, %S<n> prints the 2nd register number for
> +	 vector pair, decimal 128-bit floating and IBM 128-bit binary floating
> +	 values.  */
>        if (!REG_P (x) || !VSX_REGNO_P (REGNO (x)))
> -	output_operand_lossage ("invalid %%x value");
> +	output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : 'x'));
>        else
>  	{
> -	  int reg = REGNO (x);
> +	  int reg = REGNO (x) + (code == 'S' ? 1 : 0);
>  	  int vsx_reg = (FP_REGNO_P (reg)
>  			 ? reg - 32
>  			 : reg - FIRST_ALTIVEC_REGNO + 32);
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 47a87d6ceec..53ec957cb23 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -3386,8 +3386,9 @@ A VSX register (VSR), @code{vs0}@dots{}@code{vs63}.  This is either an
>  FPR (@code{vs0}@dots{}@code{vs31} are @code{f0}@dots{}@code{f31}) or a VR
>  (@code{vs32}@dots{}@code{vs63} are @code{v0}@dots{}@code{v31}).
>  
> -When using @code{wa}, you should use the @code{%x} output modifier, so that
> -the correct register number is printed.  For example:
> +When using @code{wa}, you should use either the @code{%x} or @code{%S}
> +output modifier, so that the correct register number is printed.  For
> +example:
>  
>  @smallexample
>  asm ("xvadddp %x0,%x1,%x2"
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr112886.c b/gcc/testsuite/gcc.target/powerpc/pr112886.c
> new file mode 100644
> index 00000000000..07196bdc220
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr112886.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> +
> +/* PR target/112886: Test that print_operand %S<n> gives the correct register
> +   number for VSX registers (i.e. if the register is an Altivec register, the
> +   register number is 32..63 instead of 0..31.  */
> +
> +void
> +test (__vector_pair *p, __vector_pair *q, __vector_pair *r)
> +{
> +  __asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %S0,%S1,%S2"
> +	   : "=v" (*p)
> +	   : "v" (*q), "v" (*r));
> +}
> +
> +/* { dg-final { scan-assembler-times {\mxvadddp (3[2-9]|[45][0-9]|6[0-3]),(3[2-9]|[45][0-9]|6[0-3]),(3[2-9]|[45][0-9]|6[0-3])\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mlxvp\M}  2 } } */
> +/* { dg-final { scan-assembler-times {\mstxvp\M} 1 } } */
  
Michael Meissner Jan. 11, 2024, 5:30 p.m. UTC | #4
On Tue, Jan 09, 2024 at 04:35:22PM -0600, Peter Bergner wrote:
> ...and this is really ugly and hard to read/understand.  Can't we use
> register variables to make it simpler?  Something like the following
> which tests having both FPR and Altivec reg numbers assigned?
> 
> ...
> void
> test (__vector_pair *ptr)
> {
>   register __vector_pair p asm ("vs10");
>   register __vector_pair q asm ("vs42");
>   register __vector_pair r asm ("vs44");
>   q = ptr[1];
>   r = ptr[2];
>   __asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %S0,%S1,%S2"
> 	   : "=wa" (p)
> 	   : "wa" (q), "wa" (r));
>   ptr[2] = p;
> }
> 
> /* { dg-final { scan-assembler-times {\mxvadddp 10,42,44\M} 1 } } */
> /* { dg-final { scan-assembler-times {\mxvadddp 11,43,45\M} 1 } } */
> ...

I have submitted V2 of the patch that changes the test case.  Thank you.
  

Patch

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 5a7e00b03d1..ba89377c9ec 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -14504,13 +14504,17 @@  print_operand (FILE *file, rtx x, int code)
 	print_operand (file, x, 0);
       return;
 
+    case 'S':
     case 'x':
-      /* X is a FPR or Altivec register used in a VSX context.  */
+      /* X is a FPR or Altivec register used in a VSX context.  %x<n> prints
+	 the VSX register number, %S<n> prints the 2nd register number for
+	 vector pair, decimal 128-bit floating and IBM 128-bit binary floating
+	 values.  */
       if (!REG_P (x) || !VSX_REGNO_P (REGNO (x)))
-	output_operand_lossage ("invalid %%x value");
+	output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : 'x'));
       else
 	{
-	  int reg = REGNO (x);
+	  int reg = REGNO (x) + (code == 'S' ? 1 : 0);
 	  int vsx_reg = (FP_REGNO_P (reg)
 			 ? reg - 32
 			 : reg - FIRST_ALTIVEC_REGNO + 32);
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 47a87d6ceec..53ec957cb23 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3386,8 +3386,9 @@  A VSX register (VSR), @code{vs0}@dots{}@code{vs63}.  This is either an
 FPR (@code{vs0}@dots{}@code{vs31} are @code{f0}@dots{}@code{f31}) or a VR
 (@code{vs32}@dots{}@code{vs63} are @code{v0}@dots{}@code{v31}).
 
-When using @code{wa}, you should use the @code{%x} output modifier, so that
-the correct register number is printed.  For example:
+When using @code{wa}, you should use either the @code{%x} or @code{%S}
+output modifier, so that the correct register number is printed.  For
+example:
 
 @smallexample
 asm ("xvadddp %x0,%x1,%x2"
diff --git a/gcc/testsuite/gcc.target/powerpc/pr112886.c b/gcc/testsuite/gcc.target/powerpc/pr112886.c
new file mode 100644
index 00000000000..07196bdc220
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr112886.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* PR target/112886: Test that print_operand %S<n> gives the correct register
+   number for VSX registers (i.e. if the register is an Altivec register, the
+   register number is 32..63 instead of 0..31.  */
+
+void
+test (__vector_pair *p, __vector_pair *q, __vector_pair *r)
+{
+  __asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %S0,%S1,%S2"
+	   : "=v" (*p)
+	   : "v" (*q), "v" (*r));
+}
+
+/* { dg-final { scan-assembler-times {\mxvadddp (3[2-9]|[45][0-9]|6[0-3]),(3[2-9]|[45][0-9]|6[0-3]),(3[2-9]|[45][0-9]|6[0-3])\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mlxvp\M}  2 } } */
+/* { dg-final { scan-assembler-times {\mstxvp\M} 1 } } */