[V2] PR target/112886, Add %S<n> to print_operand for vector pair support.
Checks
Commit Message
This is version 2 of the patch. The only difference is I made the test case
simpler to read.
In looking at support for load vector pair and store vector pair for the
PowerPC in GCC, I noticed that we were missing a print_operand output modifier
if you are dealing with vector pairs to print the 2nd register in the vector
pair.
If the instruction inside of the asm used the Altivec encoding, then we could
use the %L<n> modifier:
__vector_pair *p, *q, *r;
// ...
__asm__ ("vaddudm %0,%1,%2\n\tvaddudm %L0,%L1,%L2"
: "=v" (*p)
: "v" (*q), "v" (*r));
Likewise if we know the value to be in a tradiational FPR register, %L<n> will
work for instructions that use the VSX encoding:
__vector_pair *p, *q, *r;
// ...
__asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %L0,%L1,%L2"
: "=f" (*p)
: "f" (*q), "f" (*r));
But if have a value that is in a traditional Altivec register, and the
instruction uses the VSX encoding, %L<n> will a value between 0 and 31, when it
should give a value between 32 and 63.
This patch adds %S<n> that acts like %x<n>, except that it adds 1 to the
register number.
I have tested this on power10 and power9 little endian systems and on a power9
big endian system. There were no regressions in the patch. Can I apply it to
the trunk?
It would be nice if I could apply it to the open branches. Can I backport it
after a burn-in period?
2024-01-10 Michael Meissner <meissner@linux.ibm.com>
gcc/
PR target/112886
* config/rs6000/rs6000.cc (print_operand): Add %S<n> output modifier.
* doc/md.texi (Modifiers): Mention %S can be used like %x.
gcc/testsuite/
PR target/112886
* /gcc.target/powerpc/pr112886.c: New test.
---
gcc/config/rs6000/rs6000.cc | 10 ++++---
gcc/doc/md.texi | 5 ++--
gcc/testsuite/gcc.target/powerpc/pr112886.c | 29 +++++++++++++++++++++
3 files changed, 39 insertions(+), 5 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr112886.c
Comments
Ping
| Date: Thu, 11 Jan 2024 12:29:23 -0500
| From: Michael Meissner <meissner@linux.ibm.com>
| Subject: [PATCH, V2] PR target/112886, Add %S<n> to print_operand for vector pair support.
| Message-ID: <ZaAlc7laA0EbGjRn@cowardly-lion.the-meissners.org>
https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642727.html
On 1/11/24 11:29 AM, Michael Meissner wrote:
> This is version 2 of the patch. The only difference is I made the test case
> simpler to read.
[snip]
> gcc/
>
> PR target/112886
> * config/rs6000/rs6000.cc (print_operand): Add %S<n> output modifier.
> * doc/md.texi (Modifiers): Mention %S can be used like %x.
>
> gcc/testsuite/
>
> PR target/112886
> * /gcc.target/powerpc/pr112886.c: New test.
This resolves my issue with the first patch, so LGTM.
Peter
Hi Mike,
on 2024/1/12 01:29, Michael Meissner wrote:
> This is version 2 of the patch. The only difference is I made the test case
> simpler to read.
>
> In looking at support for load vector pair and store vector pair for the
> PowerPC in GCC, I noticed that we were missing a print_operand output modifier
> if you are dealing with vector pairs to print the 2nd register in the vector
> pair.
>
> If the instruction inside of the asm used the Altivec encoding, then we could
> use the %L<n> modifier:
>
> __vector_pair *p, *q, *r;
> // ...
> __asm__ ("vaddudm %0,%1,%2\n\tvaddudm %L0,%L1,%L2"
> : "=v" (*p)
> : "v" (*q), "v" (*r));
>
> Likewise if we know the value to be in a tradiational FPR register, %L<n> will
> work for instructions that use the VSX encoding:
>
> __vector_pair *p, *q, *r;
> // ...
> __asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %L0,%L1,%L2"
> : "=f" (*p)
> : "f" (*q), "f" (*r));
>
> But if have a value that is in a traditional Altivec register, and the
> instruction uses the VSX encoding, %L<n> will a value between 0 and 31, when it
> should give a value between 32 and 63.
>
> This patch adds %S<n> that acts like %x<n>, except that it adds 1 to the
> register number.
>
> I have tested this on power10 and power9 little endian systems and on a power9
> big endian system. There were no regressions in the patch. Can I apply it to
> the trunk?
>
> It would be nice if I could apply it to the open branches. Can I backport it
> after a burn-in period?
>
> 2024-01-10 Michael Meissner <meissner@linux.ibm.com>
>
> gcc/
>
> PR target/112886
> * config/rs6000/rs6000.cc (print_operand): Add %S<n> output modifier.
> * doc/md.texi (Modifiers): Mention %S can be used like %x.
>
> gcc/testsuite/
>
> PR target/112886
> * /gcc.target/powerpc/pr112886.c: New test.
> ---
> gcc/config/rs6000/rs6000.cc | 10 ++++---
> gcc/doc/md.texi | 5 ++--
> gcc/testsuite/gcc.target/powerpc/pr112886.c | 29 +++++++++++++++++++++
> 3 files changed, 39 insertions(+), 5 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/powerpc/pr112886.c
>
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 872df3140e3..6353a7ccfb2 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -14504,13 +14504,17 @@ print_operand (FILE *file, rtx x, int code)
> print_operand (file, x, 0);
> return;
>
> + case 'S':
> case 'x':
> - /* X is a FPR or Altivec register used in a VSX context. */
> + /* X is a FPR or Altivec register used in a VSX context. %x<n> prints
> + the VSX register number, %S<n> prints the 2nd register number for
> + vector pair, decimal 128-bit floating and IBM 128-bit binary floating
> + values. */
> if (!REG_P (x) || !VSX_REGNO_P (REGNO (x)))
> - output_operand_lossage ("invalid %%x value");
> + output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : 'x'));
Nit: Seems simpler with
output_operand_lossage ("invalid %%%c value", (char) code);
> else
> {
> - int reg = REGNO (x);
> + int reg = REGNO (x) + (code == 'S' ? 1 : 0);
> int vsx_reg = (FP_REGNO_P (reg)
> ? reg - 32
> : reg - FIRST_ALTIVEC_REGNO + 32);
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 47a87d6ceec..53ec957cb23 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -3386,8 +3386,9 @@ A VSX register (VSR), @code{vs0}@dots{}@code{vs63}. This is either an
> FPR (@code{vs0}@dots{}@code{vs31} are @code{f0}@dots{}@code{f31}) or a VR
> (@code{vs32}@dots{}@code{vs63} are @code{v0}@dots{}@code{v31}).
>
> -When using @code{wa}, you should use the @code{%x} output modifier, so that
> -the correct register number is printed. For example:
> +When using @code{wa}, you should use either the @code{%x} or @code{%S}
> +output modifier, so that the correct register number is printed. For
> +example:
As I questioned in [1], "It seems there is no Power specific documentation on
operand modifiers like this %L?", even if we document this new "%S" as above,
users can still be confused.
Does it sound better with something like " ... @code{%x} or @code{%S} (for the
next consecutive register number in the context like vector pair etc.) ... "?
[1] https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642414.html
>
> @smallexample
> asm ("xvadddp %x0,%x1,%x2"
And we can also extend this by adding one more example like your associated test
case (hope it can make it more clear).
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr112886.c b/gcc/testsuite/gcc.target/powerpc/pr112886.c
> new file mode 100644
> index 00000000000..4e59dcda6ea
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr112886.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target power10_ok } */
I think this needs one more:
/* { dg-require-effective-target powerpc_vsx_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
... and
/* { dg-options "-mdejagnu-cpu=power10 -O2 -mvsx" } */
, otherwise with explicit -mno-vsx, this test case would fail.
The other looks good to me, thanks!
BR,
Kewen
> +
> +/* PR target/112886: Test that print_operand %S<n> gives the correct register
> + number for VSX registers (i.e. if the register is an Altivec register, the
> + register number is 32..63 instead of 0..31. */
> +
> +void
> +test (__vector_pair *ptr1, __vector_pair *ptr2, __vector_pair *ptr3)
> +{
> + register __vector_pair p asm ("vs10");
> + register __vector_pair q asm ("vs42");
> + register __vector_pair r asm ("vs44");
> +
> + q = *ptr2;
> + r = *ptr3;
> +
> + __asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %S0,%S1,%S2"
> + : "=wa" (p)
> + : "wa" (q), "wa" (r));
> +
> + *ptr1 = p;
> +}
> +
> +/* { dg-final { scan-assembler-times {\mxvadddp 10,42,44\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxvadddp 11,43,45\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mlxvpx?\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mstxvpx?\M} 1 } } */
On 1/23/24 8:30 PM, Kewen.Lin wrote:
>> - output_operand_lossage ("invalid %%x value");
>> + output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : 'x'));
>
> Nit: Seems simpler with
>
> output_operand_lossage ("invalid %%%c value", (char) code);
Agreed, good catch.
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr112886.c b/gcc/testsuite/gcc.target/powerpc/pr112886.c
>> new file mode 100644
>> index 00000000000..4e59dcda6ea
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr112886.c
>> @@ -0,0 +1,29 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target power10_ok } */
>
> I think this needs one more:
>
> /* { dg-require-effective-target powerpc_vsx_ok } */
I agree with this...
>> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
>
> ... and
>
> /* { dg-options "-mdejagnu-cpu=power10 -O2 -mvsx" } */
>
> , otherwise with explicit -mno-vsx, this test case would fail.
But not with this. The -mdejagnu-cpu=power10 option already enables -mvsx.
If the user explcitly forces -mno-vsx via RUNTESTFLAGS, then let them.
The options set in RUNTESTFLAGS come after the options in the dg-options
line, so even adding -mvsx like the above won't help the test case PASS
if we didn't have the powerpc_vsx_ok test. In other words, the -mvsx option
doesn't help with anything.
All we need is the new powerpc_vsx_ok check and that will guard against the FAIL
in the case the user forces -mno-vsx. In that case, we'll just get an UNSUPPORTED
and that is fine.
Peter
on 2024/1/24 11:11, Peter Bergner wrote:
> On 1/23/24 8:30 PM, Kewen.Lin wrote:
>>> - output_operand_lossage ("invalid %%x value");
>>> + output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : 'x'));
>>
>> Nit: Seems simpler with
>>
>> output_operand_lossage ("invalid %%%c value", (char) code);
>
> Agreed, good catch.
>
>
>
>
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr112886.c b/gcc/testsuite/gcc.target/powerpc/pr112886.c
>>> new file mode 100644
>>> index 00000000000..4e59dcda6ea
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr112886.c
>>> @@ -0,0 +1,29 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target power10_ok } */
>>
>> I think this needs one more:
>>
>> /* { dg-require-effective-target powerpc_vsx_ok } */
>
> I agree with this...
>
>
>
>>> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
>>
>> ... and
>>
>> /* { dg-options "-mdejagnu-cpu=power10 -O2 -mvsx" } */
>>
>> , otherwise with explicit -mno-vsx, this test case would fail.
>
> But not with this. The -mdejagnu-cpu=power10 option already enables -mvsx.
> If the user explcitly forces -mno-vsx via RUNTESTFLAGS, then let them.
> The options set in RUNTESTFLAGS come after the options in the dg-options
> line, so even adding -mvsx like the above won't help the test case PASS
But this is NOT true, at least on one of internal Power10 machine (ltcden2-lp1).
With the command below:
make check-gcc-c RUNTESTFLAGS="--target_board=unix/-mno-vsx powerpc.exp=pr112886.c"
this test case fails without the explicit -mvsx in dg-options.
From the verbose dumping, the compilation command looks like:
/home/linkw/gcc/build/gcc-test-debug/gcc/xgcc -B/home/linkw/gcc/build/gcc-test-debug/gcc/
/home/linkw/gcc/gcc-test/gcc/testsuite/gcc.target/powerpc/pr112886.c -mno-vsx
-fdiagnostics-plain-output -mdejagnu-cpu=power10 -O2 -ffat-lto-objects -fno-ident -S
-o pr112886.s
"-mno-vsx" comes **before** "-mdejagnu-cpu=power10 -O2" rather than **after**.
I guess it might be due to different behaviors of different versions of runtest framework?
So there can be two cases with user explicitly specified -mno-vsx:
1) RUNTESTFLAGS comes after dg-options (assuming same order for -mvsx in powerpc_vsx_ok)
powerpc_vsx_ok test failed, so UNSUPPORTED
// with explicit -mvsx does nothing as you said.
2) RUNTESTFLAGS comes before dg-options
powerpc_vsx_ok test succeeds, but FAIL.
// with suggested -mvsx, make it match the powerpc_vsx_ok checking and the case not fail.
As above I think we still need to append the "-mvsx" explicitly. As tested/verified, it
does help the case not to fail on ltcden2-lp1.
BR,
Kewen
> if we didn't have the powerpc_vsx_ok test. In other words, the -mvsx option
> doesn't help with anything.
>
> All we need is the new powerpc_vsx_ok check and that will guard against the FAIL
> in the case the user forces -mno-vsx. In that case, we'll just get an UNSUPPORTED
> and that is fine.
>
> Peter
>
>
>
On 1/24/24 12:04 AM, Kewen.Lin wrote:
> on 2024/1/24 11:11, Peter Bergner wrote:
>> But not with this. The -mdejagnu-cpu=power10 option already enables -mvsx.
>> If the user explcitly forces -mno-vsx via RUNTESTFLAGS, then let them.
>> The options set in RUNTESTFLAGS come after the options in the dg-options
>> line, so even adding -mvsx like the above won't help the test case PASS
>
> But this is NOT true, at least on one of internal Power10 machine (ltcden2-lp1).
>
> With the command below:
>
> make check-gcc-c RUNTESTFLAGS="--target_board=unix/-mno-vsx powerpc.exp=pr112886.c"
>
> this test case fails without the explicit -mvsx in dg-options.
>
> From the verbose dumping, the compilation command looks like:
>
> /home/linkw/gcc/build/gcc-test-debug/gcc/xgcc -B/home/linkw/gcc/build/gcc-test-debug/gcc/
> /home/linkw/gcc/gcc-test/gcc/testsuite/gcc.target/powerpc/pr112886.c -mno-vsx
> -fdiagnostics-plain-output -mdejagnu-cpu=power10 -O2 -ffat-lto-objects -fno-ident -S
> -o pr112886.s
>
> "-mno-vsx" comes **before** "-mdejagnu-cpu=power10 -O2" rather than **after**.
>
> I guess it might be due to different behaviors of different versions of runtest framework?
That is confusing, unless as you say, the behavior changed. The whole reason we added
-mdejagnu-cpu= (and the dg-skip usage before that) was due to encountering problems
when the test case wanted a specific -mcpu= value and the user overrode it in their
RUNTESTFLAGS and that can only happen when its options come last on the command line.
Then again, why didn't the powerpc_vsx_ok test not save us here?
> So there can be two cases with user explicitly specified -mno-vsx:
>
> 1) RUNTESTFLAGS comes after dg-options (assuming same order for -mvsx in powerpc_vsx_ok)
>
> powerpc_vsx_ok test failed, so UNSUPPORTED
>
> // with explicit -mvsx does nothing as you said.
>
> 2) RUNTESTFLAGS comes before dg-options
>
> powerpc_vsx_ok test succeeds, but FAIL.
>
> // with suggested -mvsx, make it match the powerpc_vsx_ok checking and the case not fail.
>
> As above I think we still need to append the "-mvsx" explicitly. As tested/verified, it
> does help the case not to fail on ltcden2-lp1.
I'd like to verify that the behavior did change before we enforce adding that option.
The problem is, there are a HUGE number of older test cases that would need updating
to "fix" them too. ...and not just adding -mnsx, but -maltivec and basically any
-mfoo option where the test case is expecting the feature foo to be used/tested.
It would be a huge mess.
Peter
on 2024/1/24 23:51, Peter Bergner wrote:
> On 1/24/24 12:04 AM, Kewen.Lin wrote:
>> on 2024/1/24 11:11, Peter Bergner wrote:
>>> But not with this. The -mdejagnu-cpu=power10 option already enables -mvsx.
>>> If the user explcitly forces -mno-vsx via RUNTESTFLAGS, then let them.
>>> The options set in RUNTESTFLAGS come after the options in the dg-options
>>> line, so even adding -mvsx like the above won't help the test case PASS
>>
>> But this is NOT true, at least on one of internal Power10 machine (ltcden2-lp1).
>>
>> With the command below:
>>
>> make check-gcc-c RUNTESTFLAGS="--target_board=unix/-mno-vsx powerpc.exp=pr112886.c"
>>
>> this test case fails without the explicit -mvsx in dg-options.
>>
>> From the verbose dumping, the compilation command looks like:
>>
>> /home/linkw/gcc/build/gcc-test-debug/gcc/xgcc -B/home/linkw/gcc/build/gcc-test-debug/gcc/
>> /home/linkw/gcc/gcc-test/gcc/testsuite/gcc.target/powerpc/pr112886.c -mno-vsx
>> -fdiagnostics-plain-output -mdejagnu-cpu=power10 -O2 -ffat-lto-objects -fno-ident -S
>> -o pr112886.s
>>
>> "-mno-vsx" comes **before** "-mdejagnu-cpu=power10 -O2" rather than **after**.
>>
>> I guess it might be due to different behaviors of different versions of runtest framework?
>
> That is confusing, unless as you say, the behavior changed. The whole reason we added
> -mdejagnu-cpu= (and the dg-skip usage before that) was due to encountering problems
> when the test case wanted a specific -mcpu= value and the user overrode it in their
> RUNTESTFLAGS and that can only happen when its options come last on the command line.
I think the behavior changed, or more accurately: the behavior doesn't keep consistent
on all test environments (suspecting due to different versions of runtest).
>
> Then again, why didn't the powerpc_vsx_ok test not save us here?
Same reason, -mno-vsx comes before the append "-mvsx", so the checking passes.
>
>
>
>> So there can be two cases with user explicitly specified -mno-vsx:
>>
>> 1) RUNTESTFLAGS comes after dg-options (assuming same order for -mvsx in powerpc_vsx_ok)
>>
>> powerpc_vsx_ok test failed, so UNSUPPORTED
>>
>> // with explicit -mvsx does nothing as you said.
>>
>> 2) RUNTESTFLAGS comes before dg-options
>>
>> powerpc_vsx_ok test succeeds, but FAIL.
>>
>> // with suggested -mvsx, make it match the powerpc_vsx_ok checking and the case not fail.
>>
>> As above I think we still need to append the "-mvsx" explicitly. As tested/verified, it
>> does help the case not to fail on ltcden2-lp1.
>
> I'd like to verify that the behavior did change before we enforce adding that option.
Great.
> The problem is, there are a HUGE number of older test cases that would need updating
> to "fix" them too. ...and not just adding -mnsx, but -maltivec and basically any
> -mfoo option where the test case is expecting the feature foo to be used/tested.
> It would be a huge mess.
I agree, such "-mno-foo" is rarely used in testing, so it doesn't get enough notices.
BR,
Kewen
@@ -14504,13 +14504,17 @@ print_operand (FILE *file, rtx x, int code)
print_operand (file, x, 0);
return;
+ case 'S':
case 'x':
- /* X is a FPR or Altivec register used in a VSX context. */
+ /* X is a FPR or Altivec register used in a VSX context. %x<n> prints
+ the VSX register number, %S<n> prints the 2nd register number for
+ vector pair, decimal 128-bit floating and IBM 128-bit binary floating
+ values. */
if (!REG_P (x) || !VSX_REGNO_P (REGNO (x)))
- output_operand_lossage ("invalid %%x value");
+ output_operand_lossage ("invalid %%%c value", (code == 'S' ? 'S' : 'x'));
else
{
- int reg = REGNO (x);
+ int reg = REGNO (x) + (code == 'S' ? 1 : 0);
int vsx_reg = (FP_REGNO_P (reg)
? reg - 32
: reg - FIRST_ALTIVEC_REGNO + 32);
@@ -3386,8 +3386,9 @@ A VSX register (VSR), @code{vs0}@dots{}@code{vs63}. This is either an
FPR (@code{vs0}@dots{}@code{vs31} are @code{f0}@dots{}@code{f31}) or a VR
(@code{vs32}@dots{}@code{vs63} are @code{v0}@dots{}@code{v31}).
-When using @code{wa}, you should use the @code{%x} output modifier, so that
-the correct register number is printed. For example:
+When using @code{wa}, you should use either the @code{%x} or @code{%S}
+output modifier, so that the correct register number is printed. For
+example:
@smallexample
asm ("xvadddp %x0,%x1,%x2"
new file mode 100644
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+/* PR target/112886: Test that print_operand %S<n> gives the correct register
+ number for VSX registers (i.e. if the register is an Altivec register, the
+ register number is 32..63 instead of 0..31. */
+
+void
+test (__vector_pair *ptr1, __vector_pair *ptr2, __vector_pair *ptr3)
+{
+ register __vector_pair p asm ("vs10");
+ register __vector_pair q asm ("vs42");
+ register __vector_pair r asm ("vs44");
+
+ q = *ptr2;
+ r = *ptr3;
+
+ __asm__ ("xvadddp %x0,%x1,%x2\n\txvadddp %S0,%S1,%S2"
+ : "=wa" (p)
+ : "wa" (q), "wa" (r));
+
+ *ptr1 = p;
+}
+
+/* { dg-final { scan-assembler-times {\mxvadddp 10,42,44\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxvadddp 11,43,45\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mlxvpx?\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mstxvpx?\M} 1 } } */