match.pd: Fix up fneg/fadd simplification [PR109230]
Commit Message
Hi!
The following testcase is miscompiled on aarch64-linux. match.pd
has a simplification for addsub, where it negates one of the vectors
viewed as a vector with floating point elements twice as wide (effectively
negating every other element) and then does the addition.
But a requirement for that is that the permutation picks the right elements,
in particular 0, nelts+1, 2, nelts+3, 4, nelts+5, ...
The pattern tests this with a sel.series_p (0, 2, 0, 2) check, which, as
documented, verifies that the even elements of the permutation mask are
identity, but says nothing about the odd ones.
The following patch fixes it by also checking that the odd elements
start at nelts + 1 with the same step of 2.
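
To make the two checks concrete, here is a minimal sketch of what they test on a fixed-length mask. The helper names (series_p_model, addsub_mask_old_p, addsub_mask_new_p) are hypothetical and only approximate vec_perm_indices::series_p; the real implementation also handles variable-length encodings, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for sel.series_p on a fixed-length
   mask: true if sel[out_base + i * out_step] == in_base + i * in_step
   for every such position inside the mask.  */
static bool
series_p_model (const unsigned *sel, size_t nelts, size_t out_base,
                size_t out_step, unsigned in_base, unsigned in_step)
{
  assert (out_step != 0);
  unsigned expected = in_base;
  for (size_t i = out_base; i < nelts; i += out_step)
    {
      if (sel[i] != expected)
        return false;
      expected += in_step;
    }
  return true;
}

/* Condition before the patch: only the even positions are checked.  */
static bool
addsub_mask_old_p (const unsigned *sel, size_t nelts)
{
  return series_p_model (sel, nelts, 0, 2, 0, 2);
}

/* Condition after the patch: the odd positions must also be
   nelts + 1, nelts + 3, ...  */
static bool
addsub_mask_new_p (const unsigned *sel, size_t nelts)
{
  return (series_p_model (sel, nelts, 0, 2, 0, 2)
          && series_p_model (sel, nelts, 1, 2, (unsigned) nelts + 1, 2));
}
```

With nelts = 4, the mask { 0, 5, 2, 3 } from the testcase satisfies the old condition but not the new one, because position 3 holds 3 rather than nelts + 3 = 7. The mask { 0, 5, 2, 7 } satisfies both.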
Bootstrapped/regtested on aarch64-linux, x86_64-linux and i686-linux,
ok for trunk?
2023-03-22 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/109230
* match.pd (fneg/fadd simplify): Verify also odd permutation indexes.
* gcc.dg/pr109230.c: New test.
Jakub
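
The "negate in a twice as large element" step described in the commit message can be illustrated with a small scalar sketch. It models the negation as flipping the sign bit of each 64-bit lane, and it assumes 32-bit IEEE floats whose byte order matches that of integers. The names negate_as_wide_lanes and high_half_index are hypothetical and are not part of GCC.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert (sizeof (float) == 4, "sketch assumes 32-bit float");

/* Treat four floats as two 64-bit lanes and flip the sign bit of each
   lane, which is what negating a double does to its bit pattern.  */
static void
negate_as_wide_lanes (float v[4])
{
  uint64_t wide[2];
  memcpy (wide, v, sizeof wide);
  for (int i = 0; i < 2; i++)
    wide[i] ^= UINT64_C (1) << 63;
  memcpy (v, wide, sizeof wide);
}

/* Index, within each pair of floats, of the element that holds the top
   bits of the 64-bit lane: 1 on little-endian, 0 on big-endian.  */
static int
high_half_index (void)
{
  const uint64_t probe = UINT64_C (1) << 63;
  uint32_t halves[2];
  memcpy (halves, &probe, sizeof halves);
  return halves[1] != 0 ? 1 : 0;
}
```

On a little-endian target, applying negate_as_wide_lanes to { 1, 2, 3, 4 } negates only elements 1 and 3, so adding the result to another vector yields sums in the even lanes and differences in the odd lanes.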
Comments
On 3/22/23 04:16, Jakub Jelinek via Gcc-patches wrote:
OK
Jeff
@@ -8096,6 +8096,7 @@ and,
scalar_mode inner_mode = GET_MODE_INNER (vec_mode);
}
(if (sel.series_p (0, 2, 0, 2)
+ && sel.series_p (1, 2, nelts + 1, 2)
&& GET_MODE_2XWIDER_MODE (inner_mode).exists (&wide_elt_mode)
&& multiple_p (GET_MODE_NUNITS (vec_mode), 2, &wide_nunits)
&& related_vector_mode (vec_mode, wide_elt_mode,
@@ -0,0 +1,31 @@
+/* PR tree-optimization/109230 */
+/* { dg-do run } */
+/* { dg-options "-O2 -Wno-psabi" } */
+
+#if __SIZEOF_FLOAT__ == __SIZEOF_INT__
+typedef float V __attribute__((vector_size (4 * sizeof (float))));
+typedef int VI __attribute__((vector_size (4 * sizeof (float))));
+
+__attribute__((noipa)) V
+foo (V x, V y)
+{
+  V a = x - y;
+  V b = y + x;
+  return __builtin_shuffle (b, a, (VI) { 0, 5, 2, 3 });
+}
+
+int
+main ()
+{
+  V a = (V) { 1.0f, 2.0f, 3.0f, 4.0f };
+  V b = (V) { 8.0f, 9.0f, 10.0f, 11.0f };
+  V c = foo (a, b);
+  if (c[0] != 9.0f || c[1] != -7.0f || c[2] != 13.0f || c[3] != 15.0f)
+    __builtin_abort ();
+}
+#else
+int
+main ()
+{
+}
+#endif
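
The expected values in the testcase can be cross-checked with a scalar model that avoids vector extensions. This is only a sketch: shuffle2_model, foo_model and addsub_model are hypothetical names, and the shuffle model relies on the documented behaviour that two-operand __builtin_shuffle takes mask elements modulo 2 * N.

```c
#include <assert.h>
#include <stddef.h>

enum { NELTS = 4 };

/* Scalar model of __builtin_shuffle (first, second, sel): indices below
   nelts select from FIRST, the remaining ones from SECOND.  */
static void
shuffle2_model (const float *first, const float *second,
                const unsigned *sel, size_t nelts, float *out)
{
  assert (nelts > 0);
  for (size_t i = 0; i < nelts; i++)
    {
      size_t idx = sel[i] % (2 * nelts);
      out[i] = idx < nelts ? first[idx] : second[idx - nelts];
    }
}

/* What foo from the testcase computes.  */
static void
foo_model (const float *x, const float *y, float *out)
{
  static const unsigned sel[NELTS] = { 0, 5, 2, 3 };
  float a[NELTS], b[NELTS];
  for (size_t i = 0; i < NELTS; i++)
    {
      a[i] = x[i] - y[i];
      b[i] = y[i] + x[i];
    }
  shuffle2_model (b, a, sel, NELTS, out);
}

/* What an addsub would compute instead: sums in the even lanes,
   differences in the odd lanes.  */
static void
addsub_model (const float *x, const float *y, float *out)
{
  for (size_t i = 0; i < NELTS; i++)
    out[i] = (i % 2 == 0) ? x[i] + y[i] : x[i] - y[i];
}
```

For x = { 1, 2, 3, 4 } and y = { 8, 9, 10, 11 }, foo_model yields { 9, -7, 13, 15 }, matching the values the testcase checks, while addsub_model yields { 9, -7, 13, -7 }. The two differ only in the last lane, which is the one the even-only check left unverified.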