cselib: add function to check if SET is redundant [PR106187]

Message ID e0748603-3540-d067-c61c-97f60115a588@foss.arm.com
State New, archived
Headers
Series cselib: add function to check if SET is redundant [PR106187] |

Commit Message

Richard Earnshaw (lists) July 28, 2022, 4:46 p.m. UTC
  [resend with correct subject line]

A SET operation that writes memory may have the same value as an earlier 
store but if the alias sets of the new and earlier store do not conflict 
then the set is not truly redundant.  This can happen, for example, if 
objects of different types share a stack slot.

To fix this we define a new function in cselib that first checks for 
equality and if that is successful then finds the earlier store in the 
value history and checks the alias sets.

The routine is used in two places elsewhere in the compiler.  Firstly
in cfgcleanup and secondly in postreload.

gcc/ChangeLog:
	* cselib.h (cselib_redundant_set_p): Declare.
	* cselib.cc: Include alias.h
	(cselib_redundant_set_p): New function.
	* cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead
	of rtx_equal_for_cselib_p.
	* postreload.c (reload_cse_simplify): Use cselib_redundant_set_p.
	(reload_cse_noop_set_p): Delete.
  

Comments

Richard Biener July 29, 2022, 7:06 a.m. UTC | #1
On Thu, Jul 28, 2022 at 6:46 PM Richard Earnshaw
<Richard.Earnshaw@foss.arm.com> wrote:
>
> [resend with correct subject line]
>
> A SET operation that writes memory may have the same value as an earlier
> store but if the alias sets of the new and earlier store do not conflict
> then the set is not truly redundant.  This can happen, for example, if
> objects of different types share a stack slot.
>
> To fix this we define a new function in cselib that first checks for
> equality and if that is successful then finds the earlier store in the
> value history and checks the alias sets.
>
> The routine is used in two places elsewhere in the compiler.  Firstly
> in cfgcleanup and secondly in postreload.

I can't comment on the stripping on SUBREGs and friends but it seems
to be conservative apart from

+  if (!flag_strict_aliasing || !MEM_P (dest))
+    return true;

where if dest is not a MEM but were to contain one we'd miss it.
Double-checking
from more RTL literate people appreciated.

+  /* Lookup the equivalents to the dest.  This is more likely to succeed
+     than looking up the equivalents to the source (for example, when the
+     src is some form of constant).  */

I think the comment is misleading - we _do_ have to lookup the MEM,
looking up equivalences of a reg or an expression on the RHS isn't
what we are interested in.

+               return alias_sets_conflict_p (MEM_ALIAS_SET (dest),
+                                             MEM_ALIAS_SET (src_equiv));

that's not conservative enough - dse.cc has correct boilerplate, we have
to check both MEM_ALIAS_SET and MEM_EXPR here (the latter only
if the former load/store has a MEM_EXPR).  Note in particular
using alias_set_subset_of instead of alias_sets_conflict_p.

              /* We can only remove the later store if the earlier aliases
                 at least all accesses the later one.  */
              && ((MEM_ALIAS_SET (mem) == MEM_ALIAS_SET (s_info->mem)
                   || alias_set_subset_of (MEM_ALIAS_SET (mem),
                                           MEM_ALIAS_SET (s_info->mem)))
                  && (!MEM_EXPR (s_info->mem)
                      || refs_same_for_tbaa_p (MEM_EXPR (s_info->mem),
                                               MEM_EXPR (mem)))))

+  /* We failed to find a recorded value in the cselib history, so try the
+     source of this set.  */
+  rtx src = SET_SRC (set);
+  while (GET_CODE (src) == SUBREG)
+    src = XEXP (src, 0);
+
+  if (MEM_P (src) && rtx_equal_for_cselib_1 (dest_addr, XEXP (src, 0),
+                                            GET_MODE (dest), 0))
+    return alias_sets_conflict_p (MEM_ALIAS_SET (dest),
+                                 MEM_ALIAS_SET (src));

this looks like an odd case to me - wouldn't that only catch things
like self-assignments, aka *p = *p?  So I'd simply drop this fallback.

Otherwise it looks OK to me.

Thanks,
Richard.

> gcc/ChangeLog:
>         * cselib.h (cselib_redundant_set_p): Declare.
>         * cselib.cc: Include alias.h
>         (cselib_redundant_set_p): New function.
>         * cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead
>         of rtx_equal_for_cselib_p.
>         * postreload.c (reload_cse_simplify): Use cselib_redundant_set_p.
>         (reload_cse_noop_set_p): Delete.
  
Richard Earnshaw July 29, 2022, 9:52 a.m. UTC | #2
On 29/07/2022 08:06, Richard Biener via Gcc-patches wrote:
> On Thu, Jul 28, 2022 at 6:46 PM Richard Earnshaw
> <Richard.Earnshaw@foss.arm.com> wrote:
>>
>> [resend with correct subject line]
>>
>> A SET operation that writes memory may have the same value as an earlier
>> store but if the alias sets of the new and earlier store do not conflict
>> then the set is not truly redundant.  This can happen, for example, if
>> objects of different types share a stack slot.
>>
>> To fix this we define a new function in cselib that first checks for
>> equality and if that is successful then finds the earlier store in the
>> value history and checks the alias sets.
>>
>> The routine is used in two places elsewhere in the compiler.  Firstly
>> in cfgcleanup and secondly in postreload.
> 
> I can't comment on the stripping on SUBREGs and friends but it seems
> to be conservative apart from
> 
> +  if (!flag_strict_aliasing || !MEM_P (dest))
> +    return true;
> 
> where if dest is not a MEM but were to contain one we'd miss it.
> Double-checking
> from more RTL literate people appreciated.

There are very few things that can wrap a MEM in a SET_DEST.  I'm pretty 
sure that's all of them.  It certainly matches the code in 
cselib_invalidate_rtx which has to deal with this sort of case.

> 
> +  /* Lookup the equivalents to the dest.  This is more likely to succeed
> +     than looking up the equivalents to the source (for example, when the
> +     src is some form of constant).  */
> 
> I think the comment is misleading - we _do_ have to lookup the MEM,
> looking up equivalences of a reg or an expression on the RHS isn't
> what we are interested in.

OK, I'll try to reword it.

> 
> +               return alias_sets_conflict_p (MEM_ALIAS_SET (dest),
> +                                             MEM_ALIAS_SET (src_equiv));
> 
> that's not conservative enough - dse.cc has correct boilerplate, we have
> to check both MEM_ALIAS_SET and MEM_EXPR here (the latter only
> if the former load/store has a MEM_EXPR).  Note in particular
> using alias_set_subset_of instead of alias_sets_conflict_p.
> 
>                /* We can only remove the later store if the earlier aliases
>                   at least all accesses the later one.  */
>                && ((MEM_ALIAS_SET (mem) == MEM_ALIAS_SET (s_info->mem)
>                     || alias_set_subset_of (MEM_ALIAS_SET (mem),
>                                             MEM_ALIAS_SET (s_info->mem)))
>                    && (!MEM_EXPR (s_info->mem)
>                        || refs_same_for_tbaa_p (MEM_EXPR (s_info->mem),
>                                                 MEM_EXPR (mem)))))
> 

OK, that's an easy enough change.

> +  /* We failed to find a recorded value in the cselib history, so try the
> +     source of this set.  */
> +  rtx src = SET_SRC (set);
> +  while (GET_CODE (src) == SUBREG)
> +    src = XEXP (src, 0);
> +
> +  if (MEM_P (src) && rtx_equal_for_cselib_1 (dest_addr, XEXP (src, 0),
> +                                            GET_MODE (dest), 0))
> +    return alias_sets_conflict_p (MEM_ALIAS_SET (dest),
> +                                 MEM_ALIAS_SET (src));
> 
> this looks like an odd case to me - wouldn't that only catch things
> like self-assignments, aka *p = *p?  So I'd simply drop this fallback.

It catches the case of *p = *q when p and q have the same value.  It did 
come up in testing on x86 (when previously I was aborting to make sure 
I'd caught everything).  We could leave it out as the fallback case in 
this instance is to record a conflict, but it's not a path that's likely 
to be performance critical and the probability of this being a redundant 
store is quite high.  I'll update the comment to make this clearer.


R.

> 
> Otherwise it looks OK to me.
> 
> Thanks,
> Richard.
> 
>> gcc/ChangeLog:
>>          * cselib.h (cselib_redundant_set_p): Declare.
>>          * cselib.cc: Include alias.h
>>          (cselib_redundant_set_p): New function.
>>          * cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead
>>          of rtx_equal_for_cselib_p.
>>          * postreload.c (reload_cse_simplify): Use cselib_redundant_set_p.
>>          (reload_cse_noop_set_p): Delete.
  
Richard Biener July 29, 2022, 9:56 a.m. UTC | #3
On Fri, Jul 29, 2022 at 11:52 AM Richard Earnshaw
<Richard.Earnshaw@foss.arm.com> wrote:
>
>
>
> On 29/07/2022 08:06, Richard Biener via Gcc-patches wrote:
> > On Thu, Jul 28, 2022 at 6:46 PM Richard Earnshaw
> > <Richard.Earnshaw@foss.arm.com> wrote:
> >>
> >> [resend with correct subject line]
> >>
> >> A SET operation that writes memory may have the same value as an earlier
> >> store but if the alias sets of the new and earlier store do not conflict
> >> then the set is not truly redundant.  This can happen, for example, if
> >> objects of different types share a stack slot.
> >>
> >> To fix this we define a new function in cselib that first checks for
> >> equality and if that is successful then finds the earlier store in the
> >> value history and checks the alias sets.
> >>
> >> The routine is used in two places elsewhere in the compiler.  Firstly
> >> in cfgcleanup and secondly in postreload.
> >
> > I can't comment on the stripping on SUBREGs and friends but it seems
> > to be conservative apart from
> >
> > +  if (!flag_strict_aliasing || !MEM_P (dest))
> > +    return true;
> >
> > where if dest is not a MEM but were to contain one we'd miss it.
> > Double-checking
> > from more RTL literate people appreciated.
>
> There are very few things that can wrap a MEM in a SET_DEST.  I'm pretty
> sure that's all of them.  It certainly matches the code in
> cselib_invalidate_rtx which has to deal with this sort of case.
>
> >
> > +  /* Lookup the equivalents to the dest.  This is more likely to succeed
> > +     than looking up the equivalents to the source (for example, when the
> > +     src is some form of constant).  */
> >
> > I think the comment is misleading - we _do_ have to lookup the MEM,
> > looking up equivalences of a reg or an expression on the RHS isn't
> > what we are interested in.
>
> OK, I'll try to reword it.
>
> >
> > +               return alias_sets_conflict_p (MEM_ALIAS_SET (dest),
> > +                                             MEM_ALIAS_SET (src_equiv));
> >
> > that's not conservative enough - dse.cc has correct boilerplate, we have
> > to check both MEM_ALIAS_SET and MEM_EXPR here (the latter only
> > if the former load/store has a MEM_EXPR).  Note in particular
> > using alias_set_subset_of instead of alias_sets_conflict_p.
> >
> >                /* We can only remove the later store if the earlier aliases
> >                   at least all accesses the later one.  */
> >                && ((MEM_ALIAS_SET (mem) == MEM_ALIAS_SET (s_info->mem)
> >                     || alias_set_subset_of (MEM_ALIAS_SET (mem),
> >                                             MEM_ALIAS_SET (s_info->mem)))
> >                    && (!MEM_EXPR (s_info->mem)
> >                        || refs_same_for_tbaa_p (MEM_EXPR (s_info->mem),
> >                                                 MEM_EXPR (mem)))))
> >
>
> OK, that's an easy enough change.
>
> > +  /* We failed to find a recorded value in the cselib history, so try the
> > +     source of this set.  */
> > +  rtx src = SET_SRC (set);
> > +  while (GET_CODE (src) == SUBREG)
> > +    src = XEXP (src, 0);
> > +
> > +  if (MEM_P (src) && rtx_equal_for_cselib_1 (dest_addr, XEXP (src, 0),
> > +                                            GET_MODE (dest), 0))
> > +    return alias_sets_conflict_p (MEM_ALIAS_SET (dest),
> > +                                 MEM_ALIAS_SET (src));
> >
> > this looks like an odd case to me - wouldn't that only catch things
> > like self-assignments, aka *p = *p?  So I'd simply drop this fallback.
>
> It catches the case of *p = *q when p and q have the same value.  It did
> come up in testing on x86 (when previously I was aborting to make sure
> I'd caught everything).  We could leave it out as the fallback case in
> this instance is to record a conflict, but it's not a path that's likely
> to be performance critical and the probability of this being a redundant
> store is quite high.  I'll update the comment to make this clearer.

Ah OK - if it did actually catch cases then it's fine to keep.  Note the
alias check needs to be updated the same as above.

Richard.

>
>
> R.
>
> >
> > Otherwise it looks OK to me.
> >
> > Thanks,
> > Richard.
> >
> >> gcc/ChangeLog:
> >>          * cselib.h (cselib_redundant_set_p): Declare.
> >>          * cselib.cc: Include alias.h
> >>          (cselib_redundant_set_p): New function.
> >>          * cfgcleanup.cc: (mark_effect): Use cselib_redundant_set_p instead
> >>          of rtx_equal_for_cselib_p.
> >>          * postreload.c (reload_cse_simplify): Use cselib_redundant_set_p.
> >>          (reload_cse_noop_set_p): Delete.
  

Patch

diff --git a/gcc/cfgcleanup.cc b/gcc/cfgcleanup.cc
index 18047da7b24..a8b0139bb4d 100644
--- a/gcc/cfgcleanup.cc
+++ b/gcc/cfgcleanup.cc
@@ -208,7 +208,7 @@  mark_effect (rtx exp, regset nonequal)
       return false;
 
     case SET:
-      if (rtx_equal_for_cselib_p (SET_DEST (exp), SET_SRC (exp)))
+      if (cselib_redundant_set_p (exp))
 	return false;
       dest = SET_DEST (exp);
       if (dest == pc_rtx)
diff --git a/gcc/cselib.cc b/gcc/cselib.cc
index 6769beeeaf8..fcfd8340a4a 100644
--- a/gcc/cselib.cc
+++ b/gcc/cselib.cc
@@ -32,6 +32,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "dumpfile.h"
 #include "cselib.h"
 #include "function-abi.h"
+#include "alias.h"
 
 /* A list of cselib_val structures.  */
 struct elt_list
@@ -1157,6 +1158,75 @@  rtx_equal_for_cselib_1 (rtx x, rtx y, machine_mode memmode, int depth)
   return 1;
 }
 
+/* Wrapper for rtx_equal_for_cselib_p to determine whether a SET is
+   truly redundant, taking into account aliasing information.  */
+bool
+cselib_redundant_set_p (rtx set)
+{
+  gcc_assert (GET_CODE (set) == SET);
+  rtx dest = SET_DEST (set);
+  if (cselib_reg_set_mode (dest) != GET_MODE (dest))
+    return false;
+
+  if (!rtx_equal_for_cselib_p (dest, SET_SRC (set)))
+    return false;
+
+  while (GET_CODE (dest) == SUBREG
+	 || GET_CODE (dest) == ZERO_EXTRACT
+	 || GET_CODE (dest) == STRICT_LOW_PART)
+    dest = XEXP (dest, 0);
+
+  if (!flag_strict_aliasing || !MEM_P (dest))
+    return true;
+
+  /* For a store we need to check that suppressing it will not change
+     the effective alias set.  */
+  rtx dest_addr = XEXP (dest, 0);
+
+  /* Lookup the equivalents to the dest.  This is more likely to succeed
+     than looking up the equivalents to the source (for example, when the
+     src is some form of constant).  */
+  cselib_val *src_val = cselib_lookup (SET_DEST (set),
+				       GET_MODE (SET_DEST (set)),
+				       0, VOIDmode);
+
+  if (src_val)
+    {
+      /* Walk the list of source equivalents to find the MEM accessing the same
+	 location.  */
+      for (elt_loc_list *l = src_val->locs; l; l = l->next)
+	{
+	  rtx src_equiv = l->loc;
+	  while (GET_CODE (src_equiv) == SUBREG
+		 || GET_CODE (src_equiv) == ZERO_EXTRACT
+		 || GET_CODE (src_equiv) == STRICT_LOW_PART)
+	    src_equiv = XEXP (src_equiv, 0);
+
+	  if (MEM_P (src_equiv))
+	    {
+	      /* Match the MEMs by comparing the addresses.  */
+	      if (rtx_equal_for_cselib_1 (dest_addr, XEXP (src_equiv, 0),
+					  GET_MODE (dest), 0))
+		return alias_sets_conflict_p (MEM_ALIAS_SET (dest),
+					      MEM_ALIAS_SET (src_equiv));
+	    }
+	}
+    }
+
+  /* We failed to find a recorded value in the cselib history, so try the
+     source of this set.  */
+  rtx src = SET_SRC (set);
+  while (GET_CODE (src) == SUBREG)
+    src = XEXP (src, 0);
+
+  if (MEM_P (src) && rtx_equal_for_cselib_1 (dest_addr, XEXP (src, 0),
+					     GET_MODE (dest), 0))
+    return alias_sets_conflict_p (MEM_ALIAS_SET (dest),
+				  MEM_ALIAS_SET (src));
+
+  return false;
+}
+
 /* Helper function for cselib_hash_rtx.  Arguments like for cselib_hash_rtx,
    except that it hashes (plus:P x c).  */
 
diff --git a/gcc/cselib.h b/gcc/cselib.h
index 9ae65e6459e..b0905053ea5 100644
--- a/gcc/cselib.h
+++ b/gcc/cselib.h
@@ -83,6 +83,7 @@  extern void cselib_process_insn (rtx_insn *);
 extern bool fp_setter_insn (rtx_insn *);
 extern machine_mode cselib_reg_set_mode (const_rtx);
 extern int rtx_equal_for_cselib_1 (rtx, rtx, machine_mode, int);
+extern bool cselib_redundant_set_p (rtx);
 extern int references_value_p (const_rtx, int);
 extern rtx cselib_expand_value_rtx (rtx, bitmap, int);
 typedef rtx (*cselib_expand_callback)(rtx, bitmap, int, void *);
diff --git a/gcc/postreload.cc b/gcc/postreload.cc
index d1c99fe6dc9..41f61d32648 100644
--- a/gcc/postreload.cc
+++ b/gcc/postreload.cc
@@ -43,7 +43,6 @@  along with GCC; see the file COPYING3.  If not see
 #include "function-abi.h"
 #include "rtl-iter.h"
 
-static int reload_cse_noop_set_p (rtx);
 static bool reload_cse_simplify (rtx_insn *, rtx);
 static void reload_cse_regs_1 (void);
 static int reload_cse_simplify_set (rtx, rtx_insn *);
@@ -74,16 +73,6 @@  reload_cse_regs (rtx_insn *first ATTRIBUTE_UNUSED)
     }
 }
 
-/* See whether a single set SET is a noop.  */
-static int
-reload_cse_noop_set_p (rtx set)
-{
-  if (cselib_reg_set_mode (SET_DEST (set)) != GET_MODE (SET_DEST (set)))
-    return 0;
-
-  return rtx_equal_for_cselib_p (SET_DEST (set), SET_SRC (set));
-}
-
 /* Try to simplify INSN.  Return true if the CFG may have changed.  */
 static bool
 reload_cse_simplify (rtx_insn *insn, rtx testreg)
@@ -118,7 +107,7 @@  reload_cse_simplify (rtx_insn *insn, rtx testreg)
          this out, so it's safer to simplify before we delete.  */
       count += reload_cse_simplify_set (body, insn);
 
-      if (!count && reload_cse_noop_set_p (body))
+      if (!count && cselib_redundant_set_p (body))
 	{
 	  if (check_for_inc_dec (insn))
 	    delete_insn_and_edges (insn);
@@ -157,7 +146,7 @@  reload_cse_simplify (rtx_insn *insn, rtx testreg)
 	  rtx part = XVECEXP (body, 0, i);
 	  if (GET_CODE (part) == SET)
 	    {
-	      if (! reload_cse_noop_set_p (part))
+	      if (! cselib_redundant_set_p (part))
 		break;
 	      if (REG_P (SET_DEST (part))
 		  && REG_FUNCTION_VALUE_P (SET_DEST (part)))