tree-optimization/106922 - missed FRE/PRE

Message ID 20220922111009.3EF0413AA5@imap2.suse-dmz.suse.de
State New, archived
Headers
Series tree-optimization/106922 - missed FRE/PRE |

Commit Message

Richard Biener Sept. 22, 2022, 11:10 a.m. UTC
  The following enhances the store-with-same-value trick in
vn_reference_lookup_3 by not only looking for

  a = val;
  *ptr = val;
  .. = a;

but also

  *ptr = val;
  other = x;
  .. = a;

where the earlier store is more than one hop away.  It does this
by queueing the actual value to compare until after the walk but
as disadvantage only allows a single such skipped store from a
constant value.

Unfortunately we cannot handle defs from non-constants this way
since we're prone to pick up values from the past loop iteration
this way and we have no good way to identify values that are
invariant in the currently iterated cycle.  That's why we keep
the single-hop lookup for those cases.  gcc.dg/tree-ssa/pr87126.c
would be a testcase that's un-XFAILed when we'd handle those
as well.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

	PR tree-optimization/106922
	* tree-ssa-sccvn.cc (vn_walk_cb_data::same_val): New member.
	(vn_walk_cb_data::finish): Perform delayed verification of
	a skipped may-alias.
	(vn_reference_lookup_pieces): Likewise.
	(vn_reference_lookup): Likewise.
	(vn_reference_lookup_3): When skipping stores of the same
	value also handle constant stores that are more than a
	single VDEF away by delaying the verification.

	* gcc.dg/tree-ssa/ssa-fre-100.c: New testcase.
	* g++.dg/tree-ssa/pr106922.C: Adjust.
---
 gcc/testsuite/g++.dg/tree-ssa/pr106922.C    |  3 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c | 25 ++++++
 gcc/tree-ssa-sccvn.cc                       | 97 ++++++++++++++-------
 3 files changed, 93 insertions(+), 32 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c
  

Comments

Jakub Jelinek Sept. 23, 2022, 7:44 a.m. UTC | #1
On Thu, Sep 22, 2022 at 01:10:08PM +0200, Richard Biener via Gcc-patches wrote:
> 	* g++.dg/tree-ssa/pr106922.C: Adjust.

> --- a/gcc/testsuite/g++.dg/tree-ssa/pr106922.C
> +++ b/gcc/testsuite/g++.dg/tree-ssa/pr106922.C
> @@ -87,5 +87,4 @@ void testfunctionfoo() {
>    }
>  }
>  
> -// { dg-final { scan-tree-dump-times "Found fully redundant value" 4 "pre" { xfail { ! lp64 } } } }
> -// { dg-final { scan-tree-dump-not "m_initialized" "cddce3" { xfail { ! lp64 } } } }
> +// { dg-final { scan-tree-dump-not "m_initialized" "dce3" } }

I've noticed
+UNRESOLVED: g++.dg/tree-ssa/pr106922.C  -std=gnu++20  scan-tree-dump-not dce3 "m_initialized"
+UNRESOLVED: g++.dg/tree-ssa/pr106922.C  -std=gnu++2b  scan-tree-dump-not dce3 "m_initialized"
with this change, both on x86_64 and i686.
The dump is still cddce3, additionally as the last reference to the pre
dump is gone, not sure it is worth creating that dump.

With the following patch, there aren't FAILs nor UNRESOLVED tests with
GXX_TESTSUITE_STDS=98,11,14,17,20,2b make check-g++ RUNTESTFLAGS="--target_board=unix\{-m32,-m64\} dg.exp='pr106922.C'"

Ok for trunk?

2022-09-23  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/106922
	* g++.dg/tree-ssa/pr106922.C: Scan in cddce3 dump rather than
	dce3.  Remove -fdump-tree-pre-details from dg-options.

--- gcc/testsuite/g++.dg/tree-ssa/pr106922.C.jj	2022-09-23 09:02:57.011311664 +0200
+++ gcc/testsuite/g++.dg/tree-ssa/pr106922.C	2022-09-23 09:41:06.348797951 +0200
@@ -1,5 +1,5 @@
 // { dg-require-effective-target c++20 }
-// { dg-options "-O2 -fdump-tree-pre-details -fdump-tree-cddce3" }
+// { dg-options "-O2 -fdump-tree-cddce3" }
 
 template <typename> struct __new_allocator {
   void deallocate(int *, int) { operator delete(0); }
@@ -87,4 +87,4 @@ void testfunctionfoo() {
   }
 }
 
-// { dg-final { scan-tree-dump-not "m_initialized" "dce3" } }
+// { dg-final { scan-tree-dump-not "m_initialized" "cddce3" } }


	Jakub
  
Richard Biener Sept. 23, 2022, 7:46 a.m. UTC | #2
On Fri, 23 Sep 2022, Jakub Jelinek wrote:

> On Thu, Sep 22, 2022 at 01:10:08PM +0200, Richard Biener via Gcc-patches wrote:
> > 	* g++.dg/tree-ssa/pr106922.C: Adjust.
> 
> > --- a/gcc/testsuite/g++.dg/tree-ssa/pr106922.C
> > +++ b/gcc/testsuite/g++.dg/tree-ssa/pr106922.C
> > @@ -87,5 +87,4 @@ void testfunctionfoo() {
> >    }
> >  }
> >  
> > -// { dg-final { scan-tree-dump-times "Found fully redundant value" 4 "pre" { xfail { ! lp64 } } } }
> > -// { dg-final { scan-tree-dump-not "m_initialized" "cddce3" { xfail { ! lp64 } } } }
> > +// { dg-final { scan-tree-dump-not "m_initialized" "dce3" } }
> 
> I've noticed
> +UNRESOLVED: g++.dg/tree-ssa/pr106922.C  -std=gnu++20  scan-tree-dump-not dce3 "m_initialized"
> +UNRESOLVED: g++.dg/tree-ssa/pr106922.C  -std=gnu++2b  scan-tree-dump-not dce3 "m_initialized"
> with this change, both on x86_64 and i686.
> The dump is still cddce3, additionally as the last reference to the pre
> dump is gone, not sure it is worth creating that dump.

oops...

> With the following patch, there aren't FAILs nor UNRESOLVED tests with
> GXX_TESTSUITE_STDS=98,11,14,17,20,2b make check-g++ RUNTESTFLAGS="--target_board=unix\{-m32,-m64\} dg.exp='pr106922.C'"
> 
> Ok for trunk?

OK.

Thanks,
Richard.

> 2022-09-23  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR tree-optimization/106922
> 	* g++.dg/tree-ssa/pr106922.C: Scan in cddce3 dump rather than
> 	dce3.  Remove -fdump-tree-pre-details from dg-options.
> 
> --- gcc/testsuite/g++.dg/tree-ssa/pr106922.C.jj	2022-09-23 09:02:57.011311664 +0200
> +++ gcc/testsuite/g++.dg/tree-ssa/pr106922.C	2022-09-23 09:41:06.348797951 +0200
> @@ -1,5 +1,5 @@
>  // { dg-require-effective-target c++20 }
> -// { dg-options "-O2 -fdump-tree-pre-details -fdump-tree-cddce3" }
> +// { dg-options "-O2 -fdump-tree-cddce3" }
>  
>  template <typename> struct __new_allocator {
>    void deallocate(int *, int) { operator delete(0); }
> @@ -87,4 +87,4 @@ void testfunctionfoo() {
>    }
>  }
>  
> -// { dg-final { scan-tree-dump-not "m_initialized" "dce3" } }
> +// { dg-final { scan-tree-dump-not "m_initialized" "cddce3" } }
> 
> 
> 	Jakub
> 
>
  

Patch

diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr106922.C b/gcc/testsuite/g++.dg/tree-ssa/pr106922.C
index faf379b0361..14fa061de20 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/pr106922.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr106922.C
@@ -87,5 +87,4 @@  void testfunctionfoo() {
   }
 }
 
-// { dg-final { scan-tree-dump-times "Found fully redundant value" 4 "pre" { xfail { ! lp64 } } } }
-// { dg-final { scan-tree-dump-not "m_initialized" "cddce3" { xfail { ! lp64 } } } }
+// { dg-final { scan-tree-dump-not "m_initialized" "dce3" } }
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c
new file mode 100644
index 00000000000..ead76548f3d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c
@@ -0,0 +1,25 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-fre1" } */
+
+float bar, baz;
+void foo (int *p, int n)
+{
+  *p = 0;
+  do
+    {
+      bar = 1.;
+      /* When iterating we should have optimistically value-numbered
+	 *p to zero, on the second iteration we have to prove the
+	 store below does not affect the value of this load though.
+	 We can compare the stored value against the value from the
+	 previous iteration instead relying on a non-walking lookup.  */
+      if (*p)
+        {
+          baz = 2.;
+          *p = 0;
+        }
+    }
+  while (--n);
+}
+
+/* { dg-final { scan-tree-dump-not "baz" "fre1" } } */
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 85a7698f694..9c12a8e4f03 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -1803,7 +1803,8 @@  struct vn_walk_cb_data
 		   vn_lookup_kind vn_walk_kind_, bool tbaa_p_, tree mask_,
 		   bool redundant_store_removal_p_)
     : vr (vr_), last_vuse_ptr (last_vuse_ptr_), last_vuse (NULL_TREE),
-      mask (mask_), masked_result (NULL_TREE), vn_walk_kind (vn_walk_kind_),
+      mask (mask_), masked_result (NULL_TREE), same_val (NULL_TREE),
+      vn_walk_kind (vn_walk_kind_),
       tbaa_p (tbaa_p_), redundant_store_removal_p (redundant_store_removal_p_),
       saved_operands (vNULL), first_set (-2), first_base_set (-2),
       known_ranges (NULL)
@@ -1864,6 +1865,7 @@  struct vn_walk_cb_data
   tree last_vuse;
   tree mask;
   tree masked_result;
+  tree same_val;
   vn_lookup_kind vn_walk_kind;
   bool tbaa_p;
   bool redundant_store_removal_p;
@@ -1902,6 +1904,8 @@  vn_walk_cb_data::finish (alias_set_type set, alias_set_type base_set, tree val)
       masked_result = val;
       return (void *) -1;
     }
+  if (same_val && !operand_equal_p (val, same_val))
+    return (void *) -1;
   vec<vn_reference_op_s> &operands
     = saved_operands.exists () ? saved_operands : vr->operands;
   return vn_reference_lookup_or_insert_for_pieces (last_vuse, set, base_set,
@@ -2675,36 +2679,57 @@  vn_reference_lookup_3 (ao_ref *ref, tree vuse, void *data_,
 	 and return the found value.  */
       if (is_gimple_reg_type (TREE_TYPE (lhs))
 	  && types_compatible_p (TREE_TYPE (lhs), vr->type)
-	  && (ref->ref || data->orig_ref.ref))
-	{
-	  tree *saved_last_vuse_ptr = data->last_vuse_ptr;
-	  /* Do not update last_vuse_ptr in vn_reference_lookup_2.  */
-	  data->last_vuse_ptr = NULL;
-	  tree saved_vuse = vr->vuse;
-	  hashval_t saved_hashcode = vr->hashcode;
-	  void *res = vn_reference_lookup_2 (ref, gimple_vuse (def_stmt), data);
-	  /* Need to restore vr->vuse and vr->hashcode.  */
-	  vr->vuse = saved_vuse;
-	  vr->hashcode = saved_hashcode;
-	  data->last_vuse_ptr = saved_last_vuse_ptr;
-	  if (res && res != (void *)-1)
+	  && (ref->ref || data->orig_ref.ref)
+	  && !data->same_val
+	  && !data->mask
+	  && data->partial_defs.is_empty ()
+	  && multiple_p (get_object_alignment
+			   (ref->ref ? ref->ref : data->orig_ref.ref),
+			   ref->size)
+	  && multiple_p (get_object_alignment (lhs), ref->size))
+	{
+	  tree rhs = gimple_assign_rhs1 (def_stmt);
+	  /* ???  We may not compare to ahead values which might be from
+	     a different loop iteration but only to loop invariants.  Use
+	     CONSTANT_CLASS_P (unvalueized!) as conservative approximation.
+	     The one-hop lookup below doesn't have this issue since there's
+	     a virtual PHI before we ever reach a backedge to cross.  */
+	  if (CONSTANT_CLASS_P (rhs))
 	    {
-	      vn_reference_t vnresult = (vn_reference_t) res;
-	      tree rhs = gimple_assign_rhs1 (def_stmt);
-	      if (TREE_CODE (rhs) == SSA_NAME)
-		rhs = SSA_VAL (rhs);
-	      if (vnresult->result
-		  && operand_equal_p (vnresult->result, rhs, 0)
-		  /* We have to honor our promise about union type punning
-		     and also support arbitrary overlaps with
-		     -fno-strict-aliasing.  So simply resort to alignment to
-		     rule out overlaps.  Do this check last because it is
-		     quite expensive compared to the hash-lookup above.  */
-		  && multiple_p (get_object_alignment
-				   (ref->ref ? ref->ref : data->orig_ref.ref),
-				 ref->size)
-		  && multiple_p (get_object_alignment (lhs), ref->size))
-		return res;
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		{
+		  fprintf (dump_file,
+			   "Skipping possible redundant definition ");
+		  print_gimple_stmt (dump_file, def_stmt, 0);
+		}
+	      /* Delay the actual compare of the values to the end of the walk
+		 but do not update last_vuse from here.  */
+	      data->last_vuse_ptr = NULL;
+	      data->same_val = rhs;
+	      return NULL;
+	    }
+	  else
+	    {
+	      tree *saved_last_vuse_ptr = data->last_vuse_ptr;
+	      /* Do not update last_vuse_ptr in vn_reference_lookup_2.  */
+	      data->last_vuse_ptr = NULL;
+	      tree saved_vuse = vr->vuse;
+	      hashval_t saved_hashcode = vr->hashcode;
+	      void *res = vn_reference_lookup_2 (ref, gimple_vuse (def_stmt),
+						 data);
+	      /* Need to restore vr->vuse and vr->hashcode.  */
+	      vr->vuse = saved_vuse;
+	      vr->hashcode = saved_hashcode;
+	      data->last_vuse_ptr = saved_last_vuse_ptr;
+	      if (res && res != (void *)-1)
+		{
+		  vn_reference_t vnresult = (vn_reference_t) res;
+		  if (TREE_CODE (rhs) == SSA_NAME)
+		    rhs = SSA_VAL (rhs);
+		  if (vnresult->result
+		      && operand_equal_p (vnresult->result, rhs, 0))
+		    return res;
+		}
 	    }
 	}
     }
@@ -3798,6 +3823,14 @@  vn_reference_lookup_pieces (tree vuse, alias_set_type set,
       if (ops_for_ref != shared_lookup_references)
 	ops_for_ref.release ();
       gcc_checking_assert (vr1.operands == shared_lookup_references);
+      if (*vnresult
+	  && data.same_val
+	  && (!(*vnresult)->result
+	      || !operand_equal_p ((*vnresult)->result, data.same_val)))
+	{
+	  *vnresult = NULL;
+	  return NULL_TREE;
+	}
     }
 
   if (*vnresult)
@@ -3913,6 +3946,10 @@  vn_reference_lookup (tree op, tree vuse, vn_lookup_kind kind,
       if (wvnresult)
 	{
 	  gcc_assert (mask == NULL_TREE);
+	  if (data.same_val
+	      && (!wvnresult->result
+		  || !operand_equal_p (wvnresult->result, data.same_val)))
+	    return NULL_TREE;
 	  if (vnresult)
 	    *vnresult = wvnresult;
 	  return wvnresult->result;