[v3,1/4] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

Message ID 20230830101400.1539313-2-manolis.tsamis@vrull.eu
State Accepted
Headers
Series ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Manolis Tsamis Aug. 30, 2023, 10:13 a.m. UTC
  This is an extension of what was done in PR106590.

Currently if a sequence generated in noce_convert_multiple_sets clobbers the
condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is used afterwards
(sequences that emit the comparison itself). Since this applies only from the
next iteration it assumes that the sequences generated (in particular seq2)
doesn't clobber the condition rtx itself before using it in the if_then_else,
which is only true in specific cases (currently only register/subregister moves
are allowed).

This patch changes this so it also tests if seq2 clobbers cc_cmp/rev_cc_cmp in
the current iteration. This makes it possible to include arithmetic operations
in noce_convert_multiple_sets.

gcc/ChangeLog:

	* ifcvt.cc (check_for_cc_cmp_clobbers): Use modified_in_p instead.
	(noce_convert_multiple_sets_1): Don't use seq2 if it clobbers cc_cmp.

Signed-off-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
---

(no changes since v1)

 gcc/ifcvt.cc | 49 +++++++++++++++++++------------------------------
 1 file changed, 19 insertions(+), 30 deletions(-)
  

Comments

Richard Sandiford Oct. 19, 2023, 7:41 p.m. UTC | #1
Manolis Tsamis <manolis.tsamis@vrull.eu> writes:
> This is an extension of what was done in PR106590.
>
> Currently if a sequence generated in noce_convert_multiple_sets clobbers the
> condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is used afterwards
> (sequences that emit the comparison itself). Since this applies only from the
> next iteration it assumes that the sequences generated (in particular seq2)
> doesn't clobber the condition rtx itself before using it in the if_then_else,
> which is only true in specific cases (currently only register/subregister moves
> are allowed).
>
> This patch changes this so it also tests if seq2 clobbers cc_cmp/rev_cc_cmp in
> the current iteration. This makes it possible to include arithmetic operations
> in noce_convert_multiple_sets.
>
> gcc/ChangeLog:
>
> 	* ifcvt.cc (check_for_cc_cmp_clobbers): Use modified_in_p instead.
> 	(noce_convert_multiple_sets_1): Don't use seq2 if it clobbers cc_cmp.
>
> Signed-off-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
> ---
>
> (no changes since v1)
>
>  gcc/ifcvt.cc | 49 +++++++++++++++++++------------------------------
>  1 file changed, 19 insertions(+), 30 deletions(-)

Sorry for the slow review.  TBH I was hoping someone else would pick
it up, since (a) I'm not very familiar with this code, and (b) I don't
really agree with the way that the current code works.  I'm not sure the
current dependency checking is safe, so I'm nervous about adding even
more cases to it.  And it feels like the different ifcvt techniques are
not now well distinguished, so that they're beginning to overlap and
compete with one another.  None of that is your fault, of course. :)

> diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
> index a0af553b9ff..3273aeca125 100644
> --- a/gcc/ifcvt.cc
> +++ b/gcc/ifcvt.cc
> @@ -3375,20 +3375,6 @@ noce_convert_multiple_sets (struct noce_if_info *if_info)
>    return true;
>  }
>  
> -/* Helper function for noce_convert_multiple_sets_1.  If store to
> -   DEST can affect P[0] or P[1], clear P[0].  Called via note_stores.  */
> -
> -static void
> -check_for_cc_cmp_clobbers (rtx dest, const_rtx, void *p0)
> -{
> -  rtx *p = (rtx *) p0;
> -  if (p[0] == NULL_RTX)
> -    return;
> -  if (reg_overlap_mentioned_p (dest, p[0])
> -      || (p[1] && reg_overlap_mentioned_p (dest, p[1])))
> -    p[0] = NULL_RTX;
> -}
> -
>  /* This goes through all relevant insns of IF_INFO->then_bb and tries to
>     create conditional moves.  In case a simple move sufficis the insn
>     should be listed in NEED_NO_CMOV.  The rewired-src cases should be
> @@ -3552,9 +3538,17 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info,
>  	 creating an additional compare for each.  If successful, costing
>  	 is easier and this sequence is usually preferred.  */
>        if (cc_cmp)
> -	seq2 = try_emit_cmove_seq (if_info, temp, cond,
> -				   new_val, old_val, need_cmov,
> -				   &cost2, &temp_dest2, cc_cmp, rev_cc_cmp);
> +	{
> +	  seq2 = try_emit_cmove_seq (if_info, temp, cond,
> +				     new_val, old_val, need_cmov,
> +				     &cost2, &temp_dest2, cc_cmp, rev_cc_cmp);
> +
> +	  /* The if_then_else in SEQ2 may be affected when cc_cmp/rev_cc_cmp is
> +	     clobbered.  We can't safely use the sequence in this case.  */
> +	  if (seq2 && (modified_in_p (cc_cmp, seq2)
> +	      || (rev_cc_cmp && modified_in_p (rev_cc_cmp, seq2))))
> +	    seq2 = NULL;

modified_in_p only checks the first instruction in seq2, not the whole
sequence.

I think the unpatched approach is OK in cases where seq2 clobbers
cc_cmp/rev_cc_cmp in or after the last use, since instructions are
defined to operate on a read-all/compute/write-all basis.

Soon after the snippet above, the unpatched code has this loop:

      /* The backend might have created a sequence that uses the
	 condition.  Check this.  */
      rtx_insn *walk = seq2;
      while (walk)
	{
	  rtx set = single_set (walk);

	  if (!set || !SET_SRC (set))

This condition looks odd.  A SET_SRC is never null.  But more importantly,
continuing means "assume the best", and I don't think we should assume
the best for (say) an insn with two parallel sets.

It doesn't look like the series addresses this, but !set seems more
likely to occur if we extend the function to general operations.

	    {
	      walk = NEXT_INSN (walk);
	      continue;
	    }

	  rtx src = SET_SRC (set);

	  if (XEXP (set, 1) && GET_CODE (XEXP (set, 1)) == IF_THEN_ELSE)
	    ; /* We assume that this is the cmove created by the backend that
		 naturally uses the condition.  Therefore we ignore it.  */

XEXP (set, 1) is src.  But here too, I'm a bit nervous about assuming
that any if_then_else is safe to ignore, even currently.  It seems
more dangerous if we extend the function to handle more cases.

	  else
	    {
	      if (reg_mentioned_p (XEXP (cond, 0), src)
		  || reg_mentioned_p (XEXP (cond, 1), src))
		{
		  read_comparison = true;
		  break;
		}
	    }

	  walk = NEXT_INSN (walk);
	}

Maybe a safer way to check whether "insn" is sensitive to the values
in "val" is to use:

  subrtx_iterator::array_type array;
  FOR_EACH_SUBRTX (iter, array, val, NONCONST)
    if (OBJECT_P (*iter) && reg_overlap_mentioned_p (*iter, insn))
      return true;
  return false;

which doesn't assume that insn is a single set.

But I'm not sure which cases this code is trying to catch.  Is it trying
to catch cases where seq2 "spontaneously" uses registers that happen to
overlap with cond?  If so, then when does that happen?  And if it does
happen, wouldn't the sequence also have to set the registers first?

If instead the code is trying to detect uses of cond that were fed
to seq2 outside of cond itself, via try_emit_cmove_seq, then wouldn't
it be enough to check old_val and new_val?

Anyway, the point I was trying to get to was: we could use the
FOR_EACH_SUBRTX above to check whether an instruction in seq2
is sensitive to the value of cc_cmp/rev_cc_cmp.  And we can use
modified_in_p, as in your patch, to check whether an instruction
changes the value of cc_cmp/rev_cc_cmp.  We could therefore record
whether we've seen a modification of cc_cmp/rev_cc_cmp, then reject
seq2 if we see a subsequent use.

Thanks,
Richard

> +	}
>  
>        /* The backend might have created a sequence that uses the
>  	 condition.  Check this.  */
> @@ -3609,21 +3603,16 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info,
>  	  return false;
>  	}
>  
> -      if (cc_cmp)
> +      if (cc_cmp && seq == seq1)
>  	{
> -	  /* Check if SEQ can clobber registers mentioned in
> -	     cc_cmp and/or rev_cc_cmp.  If yes, we need to use
> -	     only seq1 from that point on.  */
> -	  rtx cc_cmp_pair[2] = { cc_cmp, rev_cc_cmp };
> -	  for (walk = seq; walk; walk = NEXT_INSN (walk))
> +	  /* Check if SEQ can clobber registers mentioned in cc_cmp/rev_cc_cmp.
> +	     If yes, we need to use only seq1 from that point on.
> +	     Only check when we use seq1 since we have already tested seq2.  */
> +	  if (modified_in_p (cc_cmp, seq)
> +	      || (rev_cc_cmp && modified_in_p (rev_cc_cmp, seq)))
>  	    {
> -	      note_stores (walk, check_for_cc_cmp_clobbers, cc_cmp_pair);
> -	      if (cc_cmp_pair[0] == NULL_RTX)
> -		{
> -		  cc_cmp = NULL_RTX;
> -		  rev_cc_cmp = NULL_RTX;
> -		  break;
> -		}
> +	      cc_cmp = NULL_RTX;
> +	      rev_cc_cmp = NULL_RTX;
>  	    }
>  	}
  
Robin Dapp Oct. 20, 2023, 7:04 a.m. UTC | #2
> Sorry for the slow review.  TBH I was hoping someone else would pick
> it up, since (a) I'm not very familiar with this code, and (b) I don't
> really agree with the way that the current code works.  I'm not sure the
> current dependency checking is safe, so I'm nervous about adding even
> more cases to it.  And it feels like the different ifcvt techniques are
> not now well distinguished, so that they're beginning to overlap and
> compete with one another.  None of that is your fault, of course. :)

I might be to blame, at least partially :)  The idea back then was to
do it here because (1) it can handle cases the other part cannot and
(2) its costing is based on sequence cost rather than just counting
instructions.

> This condition looks odd.  A SET_SRC is never null.  But more importantly,
> continuing means "assume the best", and I don't think we should assume
> the best for (say) an insn with two parallel sets.

IIRC I didn't consider two parallel sets at all :/

> But I'm not sure which cases this code is trying to catch.  Is it trying
> to catch cases where seq2 "spontaneously" uses registers that happen to
> overlap with cond?  If so, then when does that happen?  And if it does
> happen, wouldn't the sequence also have to set the registers first?

In order for sequence costing to be superior to just counting "conditional"
instructions we need to make sure that as few redundant instructions as
possible are present in the costed sequences. (redundant as in "will be
removed in a subsequent pass").
 
So, originally, the backend would always be passed a comparison
(set cc (cmp a b)).  On e.g. s390 this would result in at least two
redundant instructions per conditional move so costing was always wrong.
When passing the backend the comparison "result" e.g. (cmp cc 0)
there is no redundant instruction.

Now, passing the comparison is useful as well when we want to turn
a conditional move into a min/max in the backend.  This, however,
overwrites the initial condition result and any subsequent iterations
must not pass the result but the comparison itself to the backend.

In my first approach I hadn't considered several special cases which,
of course, complicated things further down the line.

All that said - maybe trying hard to get rid of later-redundant insns
is not a good approach after all?  A lot of complexity would go away
if we counted emitted conditional instructions just like the other
ifcvt part.  Maybe a middle ground would be to cost the initial
sequence as well as if + jmp and compare this against insn_cost of
only the created conditional insns?

In that case we might need to tighten/audit our preconditions in order
not to accidentally allow cases that result in strictly worse code.
But maybe that's an easier problem than what we are currently solving?

Regards
 Robin
  
Richard Sandiford Oct. 20, 2023, 9:16 a.m. UTC | #3
Thanks for the context.

Robin Dapp <rdapp.gcc@gmail.com> writes:
>> Sorry for the slow review.  TBH I was hoping someone else would pick
>> it up, since (a) I'm not very familiar with this code, and (b) I don't
>> really agree with the way that the current code works.  I'm not sure the
>> current dependency checking is safe, so I'm nervous about adding even
>> more cases to it.  And it feels like the different ifcvt techniques are
>> not now well distinguished, so that they're beginning to overlap and
>> compete with one another.  None of that is your fault, of course. :)
>
> I might be to blame, at least partially :)  The idea back then was to
> do it here because (1) it can handle cases the other part cannot and
> (2) its costing is based on sequence cost rather than just counting
> instructions.

Ah, OK.  (2) seems like a good reason.

>> This condition looks odd.  A SET_SRC is never null.  But more importantly,
>> continuing means "assume the best", and I don't think we should assume
>> the best for (say) an insn with two parallel sets.
>
> IIRC I didn't consider two parallel sets at all :/
>
>> But I'm not sure which cases this code is trying to catch.  Is it trying
>> to catch cases where seq2 "spontaneously" uses registers that happen to
>> overlap with cond?  If so, then when does that happen?  And if it does
>> happen, wouldn't the sequence also have to set the registers first?
>
> In order for sequence costing to be superior to just counting "conditional"
> instructions we need to make sure that as few redundant instructions as
> possible are present in the costed sequences. (redundant as in "will be
> removed in a subsequent pass").
>  
> So, originally, the backend would always be passed a comparison
> (set cc (cmp a b)).  On e.g. s390 this would result in at least two
> redundant instructions per conditional move so costing was always wrong.
> When passing the backend the comparison "result" e.g. (cmp cc 0)
> there is no redundant instruction.
>
> Now, passing the comparison is useful as well when we want to turn
> a conditional move into a min/max in the backend.  This, however,
> overwrites the initial condition result and any subsequent iterations
> must not pass the result but the comparison itself to the backend.

Yeah, makes sense.  Using your example, there seem to be two things
that we're checking:

(1) Does the sequence change cc?  This is checked via:

      if (cc_cmp)
	{
	  /* Check if SEQ can clobber registers mentioned in
	     cc_cmp and/or rev_cc_cmp.  If yes, we need to use
	     only seq1 from that point on.  */
	  rtx cc_cmp_pair[2] = { cc_cmp, rev_cc_cmp };
	  for (walk = seq; walk; walk = NEXT_INSN (walk))
	    {
	      note_stores (walk, check_for_cc_cmp_clobbers, cc_cmp_pair);
	      if (cc_cmp_pair[0] == NULL_RTX)
		{
		  cc_cmp = NULL_RTX;
		  rev_cc_cmp = NULL_RTX;
		  break;
		}
	    }
	}

    and is the case that Manolis's patch is dealing with.

(2) Does the sequence use a and b?  If so, we need to use temporary
    destinations for any earlier writes to a and b.

Is that right?

(1) looks OK, although Manolis's modified_in_p would work too.
(2) is the code I quoted yesterday and is the part I'm not sure
about.  First of all:

      seq1 = try_emit_cmove_seq (if_info, temp, cond,
				 new_val, old_val, need_cmov,
				 &cost1, &temp_dest1);

must have a consistent view of what a and b are.  So old_val and new_val
cannot at this point reference "newer" values of a and b (as set by previous
instructions in the sequence).  AIUI:

      if (int *ii = rewired_src->get (insn))
	new_val = simplify_replace_rtx (new_val, (*targets)[*ii],
					(*temporaries)[*ii]);

is the code that ensures this.  If a previous write to a and b has been
replaced by a temporary, this code substitutes that temporary into new_val.

The same cond, new_val and old_val are used in:

	seq2 = try_emit_cmove_seq (if_info, temp, cond,
				   new_val, old_val, need_cmov,
				   &cost2, &temp_dest2, cc_cmp, rev_cc_cmp);

So won't any use of a and b in seq2 to be from cond, rather than old_val
and new_val?  If so, shouldn't we set read_comparison for any use of a
and b, rather than skipping IF_THEN_ELSE?

> In my first approach I hadn't considered several special cases which,
> of course, complicated things further down the line.
>
> All that said - maybe trying hard to get rid of later-redundant insns
> is not a good approach after all?  A lot of complexity would go away
> if we counted emitted conditional instructions just like the other
> ifcvt part.  Maybe a middle ground would be to cost the initial
> sequence as well as if + jmp and compare this against insn_cost of
> only the created conditional insns?

Using seq_cost seems like the best way of costing things.  And I agree
that it's worth trying to avoid costing (and generating) redundant
instructions.

> In that case we might need to tighten/audit our preconditions in order
> not to accidentally allow cases that result in strictly worse code.
> But maybe that's an easier problem than what we are currently solving?

Not sure :)

Thanks,
Richard
  
Jeff Law Nov. 10, 2023, 9:25 p.m. UTC | #4
On 10/19/23 13:41, Richard Sandiford wrote:
> Manolis Tsamis <manolis.tsamis@vrull.eu> writes:
>> This is an extension of what was done in PR106590.
>>
>> Currently if a sequence generated in noce_convert_multiple_sets clobbers the
>> condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is used afterwards
>> (sequences that emit the comparison itself). Since this applies only from the
>> next iteration it assumes that the sequences generated (in particular seq2)
>> doesn't clobber the condition rtx itself before using it in the if_then_else,
>> which is only true in specific cases (currently only register/subregister moves
>> are allowed).
>>
>> This patch changes this so it also tests if seq2 clobbers cc_cmp/rev_cc_cmp in
>> the current iteration. This makes it possible to include arithmetic operations
>> in noce_convert_multiple_sets.
>>
>> gcc/ChangeLog:
>>
>> 	* ifcvt.cc (check_for_cc_cmp_clobbers): Use modified_in_p instead.
>> 	(noce_convert_multiple_sets_1): Don't use seq2 if it clobbers cc_cmp.
>>
>> Signed-off-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
>> ---
>>
>> (no changes since v1)
>>
>>   gcc/ifcvt.cc | 49 +++++++++++++++++++------------------------------
>>   1 file changed, 19 insertions(+), 30 deletions(-)
> 
> Sorry for the slow review.  TBH I was hoping someone else would pick
> it up, since (a) I'm not very familiar with this code, and (b) I don't
> really agree with the way that the current code works.  I'm not sure the
> current dependency checking is safe, so I'm nervous about adding even
> more cases to it.  And it feels like the different ifcvt techniques are
> not now well distinguished, so that they're beginning to overlap and
> compete with one another.  None of that is your fault, of course. :)
I'd been hoping to get it it as well, particularly since I've got a TODO 
to sort out the conditional zero support in ifcvt.cc from various 
contibutors.  While there isn't any overlap I can see between that work 
and this submission from Manolis, it's close enough that if I'm going to 
get re-familiar with ifcvt.cc I figured I'd try to handle both.


jeff
  
Jeff Law Nov. 10, 2023, 9:31 p.m. UTC | #5
On 10/20/23 01:04, Robin Dapp wrote:
> 
>> But I'm not sure which cases this code is trying to catch.  Is it trying
>> to catch cases where seq2 "spontaneously" uses registers that happen to
>> overlap with cond?  If so, then when does that happen?  And if it does
>> happen, wouldn't the sequence also have to set the registers first?
> 
> In order for sequence costing to be superior to just counting "conditional"
> instructions we need to make sure that as few redundant instructions as
> possible are present in the costed sequences. (redundant as in "will be
> removed in a subsequent pass").
[ ... ]
Sounds a lot like a scenario we had with threading.  Threading will 
often generate code in duplicated blocks that will trivially be eliminated.

IIRC I had Alex O. tackle that in threading several years back with good 
success.  I don't think he tried to be exhaustive, just sensible in what 
was likely to be dead after threading.  It helped numerous cases where 
we clearly should have threaded, but weren't because of artificially 
high costing.

It's not a trivial balancing act.

Jeff
  
Jeff Law Nov. 10, 2023, 9:48 p.m. UTC | #6
On 10/20/23 03:16, Richard Sandiford wrote:
> Thanks for the context.
> 
> Robin Dapp <rdapp.gcc@gmail.com> writes:
>>> Sorry for the slow review.  TBH I was hoping someone else would pick
>>> it up, since (a) I'm not very familiar with this code, and (b) I don't
>>> really agree with the way that the current code works.  I'm not sure the
>>> current dependency checking is safe, so I'm nervous about adding even
>>> more cases to it.  And it feels like the different ifcvt techniques are
>>> not now well distinguished, so that they're beginning to overlap and
>>> compete with one another.  None of that is your fault, of course. :)
>>
>> I might be to blame, at least partially :)  The idea back then was to
>> do it here because (1) it can handle cases the other part cannot and
>> (2) its costing is based on sequence cost rather than just counting
>> instructions.
> 
> Ah, OK.  (2) seems like a good reason.
Agreed.  It's been a problem area (costing ifcvt), but it's still the 
right thing to do.  No doubt if we change something from counting insns 
to sequence costing it'll cause another set of problems, but that 
shouldn't stop us from doing the right thing here.


> 
> Yeah, makes sense.  Using your example, there seem to be two things
> that we're checking:
> 
> (1) Does the sequence change cc?  This is checked via:
> 
>        if (cc_cmp)
> 	{
> 	  /* Check if SEQ can clobber registers mentioned in
> 	     cc_cmp and/or rev_cc_cmp.  If yes, we need to use
> 	     only seq1 from that point on.  */
> 	  rtx cc_cmp_pair[2] = { cc_cmp, rev_cc_cmp };
> 	  for (walk = seq; walk; walk = NEXT_INSN (walk))
> 	    {
> 	      note_stores (walk, check_for_cc_cmp_clobbers, cc_cmp_pair);
> 	      if (cc_cmp_pair[0] == NULL_RTX)
> 		{
> 		  cc_cmp = NULL_RTX;
> 		  rev_cc_cmp = NULL_RTX;
> 		  break;
> 		}
> 	    }
> 	}
> 
>      and is the case that Manolis's patch is dealing with.
> 
> (2) Does the sequence use a and b?  If so, we need to use temporary
>      destinations for any earlier writes to a and b.
> 
> Is that right?
> 
> (1) looks OK, although Manolis's modified_in_p would work too.
Agreed.


> (2) is the code I quoted yesterday and is the part I'm not sure
> about.  First of all:
> 
>        seq1 = try_emit_cmove_seq (if_info, temp, cond,
> 				 new_val, old_val, need_cmov,
> 				 &cost1, &temp_dest1);
> 
> must have a consistent view of what a and b are.  So old_val and new_val
> cannot at this point reference "newer" values of a and b (as set by previous
> instructions in the sequence).  AIUI:
Sigh.  ifcvt seems to pervasively adjust arguments, then you have to 
figure out which one is the right one for any given context.  I was 
driving me nuts a couple weeks ago when I was looking at the condzero 
work.  It's part of why I set everything down at the time.  I ran into 
it in the VRULL code, Ventana's hacks on top of the VRULL code and in 
the ESWIN code, got frustrated and decided to look at something else for 
a bit (which has led to its own little rathole).





> 
> The same cond, new_val and old_val are used in:
> 
> 	seq2 = try_emit_cmove_seq (if_info, temp, cond,
> 				   new_val, old_val, need_cmov,
> 				   &cost2, &temp_dest2, cc_cmp, rev_cc_cmp);
> 
> So won't any use of a and b in seq2 to be from cond, rather than old_val
> and new_val?  If so, shouldn't we set read_comparison for any use of a
> and b, rather than skipping IF_THEN_ELSE?
Seems like it to me, yes.



> 
> Using seq_cost seems like the best way of costing things.  And I agree
> that it's worth trying to avoid costing (and generating) redundant
> instructions.
I think there's general agreement on seq_cost.  I wonder if we should 
look to split that out on its own, then figure out what to do with the 
bigger issues in this space.

Jeff
  

Patch

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index a0af553b9ff..3273aeca125 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -3375,20 +3375,6 @@  noce_convert_multiple_sets (struct noce_if_info *if_info)
   return true;
 }
 
-/* Helper function for noce_convert_multiple_sets_1.  If store to
-   DEST can affect P[0] or P[1], clear P[0].  Called via note_stores.  */
-
-static void
-check_for_cc_cmp_clobbers (rtx dest, const_rtx, void *p0)
-{
-  rtx *p = (rtx *) p0;
-  if (p[0] == NULL_RTX)
-    return;
-  if (reg_overlap_mentioned_p (dest, p[0])
-      || (p[1] && reg_overlap_mentioned_p (dest, p[1])))
-    p[0] = NULL_RTX;
-}
-
 /* This goes through all relevant insns of IF_INFO->then_bb and tries to
    create conditional moves.  In case a simple move sufficis the insn
    should be listed in NEED_NO_CMOV.  The rewired-src cases should be
@@ -3552,9 +3538,17 @@  noce_convert_multiple_sets_1 (struct noce_if_info *if_info,
 	 creating an additional compare for each.  If successful, costing
 	 is easier and this sequence is usually preferred.  */
       if (cc_cmp)
-	seq2 = try_emit_cmove_seq (if_info, temp, cond,
-				   new_val, old_val, need_cmov,
-				   &cost2, &temp_dest2, cc_cmp, rev_cc_cmp);
+	{
+	  seq2 = try_emit_cmove_seq (if_info, temp, cond,
+				     new_val, old_val, need_cmov,
+				     &cost2, &temp_dest2, cc_cmp, rev_cc_cmp);
+
+	  /* The if_then_else in SEQ2 may be affected when cc_cmp/rev_cc_cmp is
+	     clobbered.  We can't safely use the sequence in this case.  */
+	  if (seq2 && (modified_in_p (cc_cmp, seq2)
+	      || (rev_cc_cmp && modified_in_p (rev_cc_cmp, seq2))))
+	    seq2 = NULL;
+	}
 
       /* The backend might have created a sequence that uses the
 	 condition.  Check this.  */
@@ -3609,21 +3603,16 @@  noce_convert_multiple_sets_1 (struct noce_if_info *if_info,
 	  return false;
 	}
 
-      if (cc_cmp)
+      if (cc_cmp && seq == seq1)
 	{
-	  /* Check if SEQ can clobber registers mentioned in
-	     cc_cmp and/or rev_cc_cmp.  If yes, we need to use
-	     only seq1 from that point on.  */
-	  rtx cc_cmp_pair[2] = { cc_cmp, rev_cc_cmp };
-	  for (walk = seq; walk; walk = NEXT_INSN (walk))
+	  /* Check if SEQ can clobber registers mentioned in cc_cmp/rev_cc_cmp.
+	     If yes, we need to use only seq1 from that point on.
+	     Only check when we use seq1 since we have already tested seq2.  */
+	  if (modified_in_p (cc_cmp, seq)
+	      || (rev_cc_cmp && modified_in_p (rev_cc_cmp, seq)))
 	    {
-	      note_stores (walk, check_for_cc_cmp_clobbers, cc_cmp_pair);
-	      if (cc_cmp_pair[0] == NULL_RTX)
-		{
-		  cc_cmp = NULL_RTX;
-		  rev_cc_cmp = NULL_RTX;
-		  break;
-		}
+	      cc_cmp = NULL_RTX;
+	      rev_cc_cmp = NULL_RTX;
 	    }
 	}