[committed,6/6] amdgcn: vector testsuite tweaks
Commit Message
The testsuite needs a few tweaks following my patches to add multiple vector
sizes for amdgcn.
gcc/testsuite/ChangeLog:
* gcc.dg/pr104464.c: Xfail on amdgcn.
* gcc.dg/signbit-2.c: Likewise.
* gcc.dg/signbit-5.c: Likewise.
* gcc.dg/vect/bb-slp-68.c: Likewise.
* gcc.dg/vect/bb-slp-cond-1.c: Change expectations on amdgcn.
* gcc.dg/vect/bb-slp-subgroups-3.c: Likewise.
* gcc.dg/vect/no-vfa-vect-depend-2.c: Change expectations for multiple
vector sizes.
* gcc.dg/vect/pr33953.c: Likewise.
* gcc.dg/vect/pr65947-12.c: Likewise.
* gcc.dg/vect/pr65947-13.c: Likewise.
* gcc.dg/vect/pr80631-2.c: Likewise.
* gcc.dg/vect/slp-reduc-4.c: Likewise.
* gcc.dg/vect/trapv-vect-reduc-4.c: Likewise.
* lib/target-supports.exp (available_vector_sizes): Add more sizes
for amdgcn.
---
gcc/testsuite/gcc.dg/pr104464.c | 2 ++
gcc/testsuite/gcc.dg/signbit-2.c | 5 +++--
gcc/testsuite/gcc.dg/signbit-5.c | 1 +
gcc/testsuite/gcc.dg/vect/bb-slp-68.c | 5 +++--
gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c | 3 ++-
gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c | 5 ++++-
gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c | 3 ++-
gcc/testsuite/gcc.dg/vect/pr33953.c | 3 ++-
gcc/testsuite/gcc.dg/vect/pr65947-12.c | 3 ++-
gcc/testsuite/gcc.dg/vect/pr65947-13.c | 3 ++-
gcc/testsuite/gcc.dg/vect/pr80631-2.c | 3 ++-
gcc/testsuite/gcc.dg/vect/slp-reduc-4.c | 3 ++-
gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c | 3 ++-
gcc/testsuite/lib/target-supports.exp | 3 ++-
14 files changed, 31 insertions(+), 14 deletions(-)
Comments
Hi!
On 2022-10-11T12:02:08+0100, Andrew Stubbs <ams@codesourcery.com> wrote:
> The testsuite needs a few tweaks following my patches to add multiple vector
> sizes for amdgcn.
While 'grep'ping for some other GCN thing, this:
> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
> +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
> @@ -46,5 +46,6 @@ int main ()
> }
>
> /* { dg-final { scan-tree-dump {(no need for alias check [^\n]* when VF is 1|no alias between [^\n]* when [^\n]* is outside \(-16, 16\))} "vect" { target vect_element_align } } } */
> -/* { dg-final { scan-tree-dump-times "loop vectorized" 1 "vect" { target vect_element_align } } } */
> +/* { dg-final { scan-tree-dump-times "loop vectorized" 1 "vect" { target { vect_element_align && !amdgcn-*-* } } } } */
> +/* { dg-final { scan-tree-dump-times "loop vectorized" 2 "vect" { target amdgcn-*-* } } } */
... the target selector expression '!amdgcn-*-*' struck me as dubious,
so I checked, and have now pushed to master branch
commit 0607307768b66a90e27c5bc91a247acc938f070e
"Fix target selector syntax in 'gcc.dg/vect/bb-slp-cond-1.c'", see attached.
Cherry-picked and pushed to devel/omp/gcc-12 branch
commit 5f4d2a15403d7231d7be673a9d633c0b4a22e19c
"Fix target selector syntax in 'gcc.dg/vect/bb-slp-cond-1.c'", see attached.
Looking into commit r13-3225-gbd9a05594d227cde79a67dc715bd9d82e9c464e9
"amdgcn: vector testsuite tweaks" for a moment, I also did wonder about
the following changes, because for 'vect_multiple_sizes' (for example,
x86_64-pc-linux-gnu) that seems to lose more specific testing;
previously: 'scan-tree-dump-times' exactly once, now: 'scan-tree-dump'
any number of times. But I've no clue about that myself, so just
mentioning this, in case somebody else has an opinion. ;-)
> * gcc.dg/vect/no-vfa-vect-depend-2.c: Change expectations for multiple
> vector sizes.
> * gcc.dg/vect/pr33953.c: Likewise.
> * gcc.dg/vect/pr65947-12.c: Likewise.
> * gcc.dg/vect/pr65947-13.c: Likewise.
> * gcc.dg/vect/pr80631-2.c: Likewise.
> * gcc.dg/vect/slp-reduc-4.c: Likewise.
> * gcc.dg/vect/trapv-vect-reduc-4.c: Likewise.
> --- a/gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c
> +++ b/gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c
> @@ -51,4 +51,5 @@ int main (void)
> }
>
> /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" {xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
> -/* { dg-final { scan-tree-dump-times "dependence distance negative" 1 "vect" } } */
> +/* { dg-final { scan-tree-dump-times "dependence distance negative" 1 "vect" { target { ! vect_multiple_sizes } } } } */
> +/* { dg-final { scan-tree-dump "dependence distance negative" "vect" { target vect_multiple_sizes } } } */
> --- a/gcc/testsuite/gcc.dg/vect/pr33953.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr33953.c
> @@ -29,6 +29,7 @@ void blockmove_NtoN_blend_noremap32 (const UINT32 *srcdata, int srcwidth,
> }
>
> /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { ! vect_multiple_sizes } xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
> +/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" { target vect_multiple_sizes xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
> --- a/gcc/testsuite/gcc.dg/vect/pr65947-12.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr65947-12.c
> @@ -42,5 +42,6 @@ main (void)
> }
>
> /* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" } } */
> -/* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 2 "vect" { target vect_fold_extract_last } } } */
> +/* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 2 "vect" { target { vect_fold_extract_last && { ! vect_multiple_sizes } } } } } */
> +/* { dg-final { scan-tree-dump "optimizing condition reduction with FOLD_EXTRACT_LAST" "vect" { target { vect_fold_extract_last && vect_multiple_sizes } } } } */
> /* { dg-final { scan-tree-dump-not "condition expression based on integer induction." "vect" } } */
> --- a/gcc/testsuite/gcc.dg/vect/pr65947-13.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr65947-13.c
> @@ -44,4 +44,5 @@ main (void)
>
> /* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" } } */
> /* { dg-final { scan-tree-dump-times "condition expression based on integer induction." 2 "vect" { xfail vect_fold_extract_last } } } */
> -/* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 2 "vect" { target vect_fold_extract_last } } } */
> +/* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 2 "vect" { target { vect_fold_extract_last && { ! vect_multiple_sizes } } } } } */
> +/* { dg-final { scan-tree-dump "optimizing condition reduction with FOLD_EXTRACT_LAST" "vect" { target { vect_fold_extract_last && vect_multiple_sizes } } } } */
> --- a/gcc/testsuite/gcc.dg/vect/pr80631-2.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr80631-2.c
> @@ -75,4 +75,5 @@ main ()
>
> /* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 5 "vect" { target vect_condition } } } */
> /* { dg-final { scan-tree-dump-times "condition expression based on integer induction." 5 "vect" { target vect_condition xfail vect_fold_extract_last } } } */
> -/* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 5 "vect" { target vect_fold_extract_last } } } */
> +/* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 5 "vect" { target { { ! vect_multiple_sizes } && vect_fold_extract_last } } } } */
> +/* { dg-final { scan-tree-dump "optimizing condition reduction with FOLD_EXTRACT_LAST" "vect" { target { vect_multiple_sizes && vect_fold_extract_last } } } } */
> --- a/gcc/testsuite/gcc.dg/vect/slp-reduc-4.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-reduc-4.c
> @@ -59,6 +59,7 @@ int main (void)
> /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_int_min_max } } } */
> /* For variable-length SVE, the number of scalar statements in the
> reduction exceeds the number of elements in a 128-bit granule. */
> -/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail { vect_no_int_min_max || { aarch64_sve && vect_variable_length } } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { ! vect_multiple_sizes } xfail { vect_no_int_min_max || { aarch64_sve && vect_variable_length } } } } } */
> +/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" { target { vect_multiple_sizes } } } } */
> /* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "vect" { xfail { aarch64_sve && vect_variable_length } } } } */
> --- a/gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c
> +++ b/gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c
> @@ -50,6 +50,7 @@ int main (void)
>
> /* We can't handle the first loop with variable-length vectors and so
> fall back to the fixed-length mininum instead. */
> -/* { dg-final { scan-tree-dump-times "Detected reduction\\." 3 "vect" { xfail vect_variable_length } } } */
> +/* { dg-final { scan-tree-dump-times "Detected reduction\\." 3 "vect" { target { ! vect_multiple_sizes } xfail vect_variable_length } } } */
> +/* { dg-final { scan-tree-dump "Detected reduction\\." "vect" { target vect_multiple_sizes } } } */
> /* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { ! vect_no_int_min_max } } } } */
> /* { dg-final { scan-tree-dump-times {using an in-order \(fold-left\) reduction} 1 "vect" } } */
Regards
Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
> -----Original Message-----
> Looking into commit r13-3225-gbd9a05594d227cde79a67dc715bd9d82e9c464e9
> "amdgcn: vector testsuite tweaks" for a moment, I also did wonder about
> the following changes, because for 'vect_multiple_sizes' (for example,
> x86_64-pc-linux-gnu) that seems to lose more specific testing;
> previously: 'scan-tree-dump-times' exactly once, now: 'scan-tree-dump'
> any number of times. But I've no clue about that myself, so just
> mentioning this, in case somebody else has an opinion. ;-)
When vect_multiple_sizes is true the number of times the pattern appears will be greater than normal. Most likely the pattern will appear once for each vector size. In the case of GCN, a pattern that normally appears 4 times now appears 24 times.
The alternative would be to have a whole set of patterns for each configuration of each target that can have the multiple sizes. That or change the implementation of 'scan-tree-dump-times' to support expressions of some kind, but even then the expressions would get hairy.
Andrew
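To make the trade-off above concrete, here is a minimal sketch of the two checks. It is an illustration only, not the DejaGnu implementation: the real 'scan-tree-dump-times' and 'scan-tree-dump' are Tcl procedures, and the Python functions below are hypothetical stand-ins that assume an exact-count match and an at-least-once match respectively. The dump contents are made up, using the GCN figures quoted above (4 occurrences becoming 24).

```python
import re

def scan_tree_dump_times(dump, pattern, expected):
    """Stand-in for 'scan-tree-dump-times': pass only on an exact count."""
    return len(re.findall(pattern, dump)) == expected

def scan_tree_dump(dump, pattern):
    """Stand-in for 'scan-tree-dump': pass if the pattern occurs at all."""
    return re.search(pattern, dump) is not None

PATTERN = r"Detected reduction\."
LINE = "Detected reduction."

# Hypothetical dumps: 4 messages with one vector size, 24 when the
# vectorizer repeats its analysis for several sizes.
single_size_dump = "\n".join([LINE] * 4)
multi_size_dump = "\n".join([LINE] * 24)

print(scan_tree_dump_times(single_size_dump, PATTERN, 4))  # True
print(scan_tree_dump_times(multi_size_dump, PATTERN, 4))   # False
print(scan_tree_dump(multi_size_dump, PATTERN))            # True
```

The exact-count check breaks as soon as the count depends on how many sizes a target tries, while the presence check keeps passing but no longer notices a wrong count.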
Hi Andrew!
On 2022-10-28T10:38:11+0200, "Stubbs, Andrew" <Andrew_Stubbs@mentor.com> wrote:
>> -----Original Message-----
>> Looking into commit r13-3225-gbd9a05594d227cde79a67dc715bd9d82e9c464e9
>> "amdgcn: vector testsuite tweaks" for a moment, I also did wonder about
>> the following changes, because for 'vect_multiple_sizes' (for example,
>> x86_64-pc-linux-gnu) that seems to lose more specific testing;
>> previously: 'scan-tree-dump-times' exactly once, now: 'scan-tree-dump'
>> any number of times. But I've no clue about that myself, so just
>> mentioning this, in case somebody else has an opinion. ;-)
>
> When vect_multiple_sizes is true the number of times the pattern appears will be greater than normal. Most likely the pattern will appear once for each vector size. In the case of GCN, a pattern that normally appears 4 times now appears 24 times.
>
> The alternative would be to have a whole set of patterns for each configuration of each target that can have the multiple sizes. That or change the implementation of 'scan-tree-dump-times' to support expressions of some kind, but even then the expressions would get hairy.
I guess my confusion is why this then hasn't previously been
FAILing for example for x86_64-pc-linux-gnu, which also is
'vect_multiple_sizes'? Anyway: "I've no clue about that myself".
Regards
Thomas
--- a/gcc/testsuite/gcc.dg/pr104464.c
+++ b/gcc/testsuite/gcc.dg/pr104464.c
@@ -9,3 +9,5 @@ foo(void)
{
f += (F)(f != (F){}[0]);
}
+
+/* { dg-xfail-if "-fnon-call-exceptions unsupported" { amdgcn-*-* } } */
--- a/gcc/testsuite/gcc.dg/signbit-2.c
+++ b/gcc/testsuite/gcc.dg/signbit-2.c
@@ -20,6 +20,7 @@ void fun2(int32_t *x, int n)
x[i] = (-x[i]) >> 30;
}
-/* { dg-final { scan-tree-dump {\s+>\s+\{ 0(, 0)+ \}} optimized { target vect_int } } } */
+/* Xfail amdgcn where vector truth type is not integer type. */
+/* { dg-final { scan-tree-dump {\s+>\s+\{ 0(, 0)+ \}} optimized { target vect_int xfail amdgcn-*-* } } } */
/* { dg-final { scan-tree-dump {\s+>\s+0} optimized { target { ! vect_int } } } } */
-/* { dg-final { scan-tree-dump-not {\s+>>\s+31} optimized } } */
+/* { dg-final { scan-tree-dump-not {\s+>>\s+31} optimized { xfail amdgcn-*-* } } } */
--- a/gcc/testsuite/gcc.dg/signbit-5.c
+++ b/gcc/testsuite/gcc.dg/signbit-5.c
@@ -4,6 +4,7 @@
/* This test does not work when the truth type does not match vector type. */
/* { dg-additional-options "-mno-avx512f" { target { i?86-*-* x86_64-*-* } } } */
/* { dg-additional-options "-march=armv8-a" { target aarch64_sve } } */
+/* { dg-xfail-run-if "truth type does not match vector type" { amdgcn-*-* } } */
#include <stdint.h>
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-68.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-68.c
@@ -18,5 +18,6 @@ void foo ()
x[9] = z[3] + 1.;
}
-/* We want to have the store group split into 4, 2, 4 when using 32byte vectors. */
-/* { dg-final { scan-tree-dump-not "from scalars" "slp2" } } */
+/* We want to have the store group split into 4, 2, 4 when using 32byte vectors.
+ Unfortunately it does not work when 64-byte vectors are available. */
+/* { dg-final { scan-tree-dump-not "from scalars" "slp2" { xfail amdgcn-*-* } } } */
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-cond-1.c
@@ -46,5 +46,6 @@ int main ()
}
/* { dg-final { scan-tree-dump {(no need for alias check [^\n]* when VF is 1|no alias between [^\n]* when [^\n]* is outside \(-16, 16\))} "vect" { target vect_element_align } } } */
-/* { dg-final { scan-tree-dump-times "loop vectorized" 1 "vect" { target vect_element_align } } } */
+/* { dg-final { scan-tree-dump-times "loop vectorized" 1 "vect" { target { vect_element_align && !amdgcn-*-* } } } } */
+/* { dg-final { scan-tree-dump-times "loop vectorized" 2 "vect" { target amdgcn-*-* } } } */
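--- a/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-subgroups-3.c
@@ -42,4 +42,7 @@ main (int argc, char **argv)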
/* Because we disable the cost model, targets with variable-length
vectors can end up vectorizing the store to a[0..7] on its own.
With the cost model we do something sensible. */
-/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" { xfail vect_variable_length } } } */
+/* { dg-final { scan-tree-dump-times "optimized: basic block" 2 "slp2" { target { ! amdgcn-*-* } xfail vect_variable_length } } } */
+
+/* amdgcn can do this in one vector. */
+/* { dg-final { scan-tree-dump-times "optimized: basic block" 1 "slp2" { target amdgcn-*-* } } } */
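--- a/gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c
+++ b/gcc/testsuite/gcc.dg/vect/no-vfa-vect-depend-2.c
@@ -51,4 +51,5 @@ int main (void)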
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" {xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
-/* { dg-final { scan-tree-dump-times "dependence distance negative" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "dependence distance negative" 1 "vect" { target { ! vect_multiple_sizes } } } } */
+/* { dg-final { scan-tree-dump "dependence distance negative" "vect" { target vect_multiple_sizes } } } */
--- a/gcc/testsuite/gcc.dg/vect/pr33953.c
+++ b/gcc/testsuite/gcc.dg/vect/pr33953.c
@@ -29,6 +29,7 @@ void blockmove_NtoN_blend_noremap32 (const UINT32 *srcdata, int srcwidth,
}
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { ! vect_multiple_sizes } xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" { target vect_multiple_sizes xfail { vect_no_align && { ! vect_hw_misalign } } } } } */
--- a/gcc/testsuite/gcc.dg/vect/pr65947-12.c
+++ b/gcc/testsuite/gcc.dg/vect/pr65947-12.c
@@ -42,5 +42,6 @@ main (void)
}
/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" } } */
-/* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 2 "vect" { target vect_fold_extract_last } } } */
+/* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 2 "vect" { target { vect_fold_extract_last && { ! vect_multiple_sizes } } } } } */
+/* { dg-final { scan-tree-dump "optimizing condition reduction with FOLD_EXTRACT_LAST" "vect" { target { vect_fold_extract_last && vect_multiple_sizes } } } } */
/* { dg-final { scan-tree-dump-not "condition expression based on integer induction." "vect" } } */
--- a/gcc/testsuite/gcc.dg/vect/pr65947-13.c
+++ b/gcc/testsuite/gcc.dg/vect/pr65947-13.c
@@ -44,4 +44,5 @@ main (void)
/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 2 "vect" } } */
/* { dg-final { scan-tree-dump-times "condition expression based on integer induction." 2 "vect" { xfail vect_fold_extract_last } } } */
-/* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 2 "vect" { target vect_fold_extract_last } } } */
+/* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 2 "vect" { target { vect_fold_extract_last && { ! vect_multiple_sizes } } } } } */
+/* { dg-final { scan-tree-dump "optimizing condition reduction with FOLD_EXTRACT_LAST" "vect" { target { vect_fold_extract_last && vect_multiple_sizes } } } } */
--- a/gcc/testsuite/gcc.dg/vect/pr80631-2.c
+++ b/gcc/testsuite/gcc.dg/vect/pr80631-2.c
@@ -75,4 +75,5 @@ main ()
/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 5 "vect" { target vect_condition } } } */
/* { dg-final { scan-tree-dump-times "condition expression based on integer induction." 5 "vect" { target vect_condition xfail vect_fold_extract_last } } } */
-/* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 5 "vect" { target vect_fold_extract_last } } } */
+/* { dg-final { scan-tree-dump-times "optimizing condition reduction with FOLD_EXTRACT_LAST" 5 "vect" { target { { ! vect_multiple_sizes } && vect_fold_extract_last } } } } */
+/* { dg-final { scan-tree-dump "optimizing condition reduction with FOLD_EXTRACT_LAST" "vect" { target { vect_multiple_sizes && vect_fold_extract_last } } } } */
--- a/gcc/testsuite/gcc.dg/vect/slp-reduc-4.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-reduc-4.c
@@ -59,6 +59,7 @@ int main (void)
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail vect_no_int_min_max } } } */
/* For variable-length SVE, the number of scalar statements in the
reduction exceeds the number of elements in a 128-bit granule. */
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { xfail { vect_no_int_min_max || { aarch64_sve && vect_variable_length } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { ! vect_multiple_sizes } xfail { vect_no_int_min_max || { aarch64_sve && vect_variable_length } } } } } */
+/* { dg-final { scan-tree-dump "vectorizing stmts using SLP" "vect" { target { vect_multiple_sizes } } } } */
/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 0 "vect" { xfail { aarch64_sve && vect_variable_length } } } } */
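--- a/gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c
+++ b/gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c
@@ -50,6 +50,7 @@ int main (void)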
/* We can't handle the first loop with variable-length vectors and so
fall back to the fixed-length mininum instead. */
-/* { dg-final { scan-tree-dump-times "Detected reduction\\." 3 "vect" { xfail vect_variable_length } } } */
+/* { dg-final { scan-tree-dump-times "Detected reduction\\." 3 "vect" { target { ! vect_multiple_sizes } xfail vect_variable_length } } } */
+/* { dg-final { scan-tree-dump "Detected reduction\\." "vect" { target vect_multiple_sizes } } } */
/* { dg-final { scan-tree-dump-times "vectorized 3 loops" 1 "vect" { target { ! vect_no_int_min_max } } } } */
/* { dg-final { scan-tree-dump-times {using an in-order \(fold-left\) reduction} 1 "vect" } } */
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -8400,7 +8400,8 @@ proc available_vector_sizes { } {
} elseif { [istarget sparc*-*-*] } {
lappend result 64
} elseif { [istarget amdgcn*-*-*] } {
- lappend result 4096
+ # 6 different lane counts, and 4 element sizes
+ lappend result 4096 2048 1024 512 256 128 64 32 16 8 4 2
} else {
# The traditional default asumption.
lappend result 128