[v2] tree-optimization/110979 - fold-left reduction and partial vectors

Message ID 20230811130321.D7EB613592@imap2.suse-dmz.suse.de
State Accepted

Checks

Context                 Check    Description
snail/gcc-patch-check   success  Github commit url

Commit Message

Richard Biener Aug. 11, 2023, 1:03 p.m. UTC
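  When we vectorize fold-left reductions with partial vectors and
no target operation is available, we use a vector conditional to
force the excess elements to zero.  That does not preserve the sign
of zero: in round-to-nearest, -0.0 + 0.0 is +0.0, so a sum that
should be -0.0 comes out positive.  When round-to-nearest is in
effect and the sign of zero has to be preserved, the following patch
uses negative zero for the excess elements instead, since adding
-0.0 leaves every value unchanged in that mode.  When rounding modes
other than round-to-nearest also have to be honored, neither zero is
a safe filler (rounding toward negative infinity gives
+0.0 + -0.0 == -0.0), so the patch disables partial vector support
in that case.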
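
The effect can be reproduced without any vector code.  The following is
only a scalar sketch, assuming the default round-to-nearest mode; the
helper names fold_left_padded and sign_preserved_with and the lane count
of 8 are made up for illustration and are not part of GCC or of this
patch.  Lanes past the end of the array are replaced by a filler value,
as the vector conditional does for excess elements.

```c
#include <assert.h>
#include <math.h>   /* signbit */

/* Hypothetical scalar model of a fold-left reduction on partial vectors:
   the input is processed in chunks of VF lanes, and lanes past the end
   of the array contribute PAD instead of an array element.  */
double
fold_left_padded (const double *a, int n, int vf, double init, double pad)
{
  assert (vf > 0);
  double sum = init;
  for (int base = 0; base < n; base += vf)
    for (int lane = 0; lane < vf; lane++)
      {
        int i = base + lane;
        /* Excess lanes are forced to the filler value.  */
        double elt = i < n ? a[i] : pad;
        sum += elt;
      }
  return sum;
}

/* Return 1 if summing twenty -0.0 values, starting from -0.0, keeps the
   sign bit when excess lanes are filled with PAD, and 0 otherwise.  */
int
sign_preserved_with (double pad)
{
  double a[20];
  for (int i = 0; i < 20; i++)
    a[i] = -0.0;
  /* 20 elements with 8 lanes leave 4 excess lanes in the last chunk.  */
  return signbit (fold_left_padded (a, 20, 8, -0.0, pad)) != 0;
}
```

In this model sign_preserved_with (0.0) returns 0, because the first
+0.0 filler turns the running -0.0 into +0.0, while
sign_preserved_with (-0.0) returns 1.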

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

	PR tree-optimization/110979
	* tree-vect-loop.cc (vectorizable_reduction): For
	FOLD_LEFT_REDUCTION without target support make sure
	we don't need to honor signed zeros and sign dependent rounding.

	* gcc.dg/torture/pr110979.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr110979.c | 25 +++++++++++++++++++++++++
 gcc/tree-vect-loop.cc                   | 24 +++++++++++++++++++++++-
 2 files changed, 48 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr110979.c
  

Patch

diff --git a/gcc/testsuite/gcc.dg/torture/pr110979.c b/gcc/testsuite/gcc.dg/torture/pr110979.c
new file mode 100644
index 00000000000..c25ad7a8a31
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr110979.c
@@ -0,0 +1,25 @@ 
+/* { dg-do run } */
+/* { dg-additional-options "--param vect-partial-vector-usage=2" } */
+
+#define FLT double
+#define N 20
+
+__attribute__((noipa))
+FLT
+foo3 (FLT *a)
+{
+  FLT sum = -0.0;
+  for (int i = 0; i != N; i++)
+    sum += a[i];
+  return sum;
+}
+
+int main()
+{
+  FLT a[N];
+  for (int i = 0; i != N; i++)
+    a[i] = -0.0;
+  if (!__builtin_signbit(foo3(a)))
+    __builtin_abort();
+  return 0;
+}
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index bf8d677b584..bc3063c3615 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -6905,7 +6905,17 @@  vectorize_fold_left_reduction (loop_vec_info loop_vinfo,
 
   tree vector_identity = NULL_TREE;
   if (LOOP_VINFO_FULLY_MASKED_P (loop_vinfo))
-    vector_identity = build_zero_cst (vectype_out);
+    {
+      vector_identity = build_zero_cst (vectype_out);
+      if (!HONOR_SIGNED_ZEROS (vectype_out))
+	;
+      else
+	{
+	  gcc_assert (!HONOR_SIGN_DEPENDENT_ROUNDING (vectype_out));
+	  vector_identity = const_unop (NEGATE_EXPR, vectype_out,
+					vector_identity);
+	}
+    }
 
   tree scalar_dest_var = vect_create_destination_var (scalar_dest, NULL);
   int i;
@@ -8037,6 +8047,18 @@  vectorizable_reduction (loop_vec_info loop_vinfo,
 			     " no conditional operation is available.\n");
 	  LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
 	}
+      else if (reduction_type == FOLD_LEFT_REDUCTION
+	       && reduc_fn == IFN_LAST
+	       && FLOAT_TYPE_P (vectype_in)
+	       && HONOR_SIGNED_ZEROS (vectype_in)
+	       && HONOR_SIGN_DEPENDENT_ROUNDING (vectype_in))
+	{
+	  if (dump_enabled_p ())
+	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+			     "can't operate on partial vectors because"
+			     " signed zeros cannot be preserved.\n");
+	  LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false;
+	}
       else
 	{
 	  internal_fn mask_reduc_fn