[2/2] ifcvt: Allow more operations in multiple set if conversion

Message ID 20230701092413.2488806-3-manolis.tsamis@vrull.eu
State Accepted
Headers
Series ifcvt: Allow if conversion of arithmetic in basic blocks with multiple sets |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Manolis Tsamis July 1, 2023, 9:24 a.m. UTC
  Currently the operations allowed for if conversion of a basic block with
multiple sets are few, namely REG, SUBREG and CONST_INT (as controlled by
bb_ok_for_noce_convert_multiple_sets).

This commit allows more operations (arithmetic, compare, etc) to participate
in if conversion. The target's profitability hook and ifcvt's costing is
expected to reject sequences that are unprofitable.

This is especially useful for targets which provide a rich selection of
conditional instructions (like aarch64 which has cinc, csneg, csinv, ccmp, ...)
which are currently not used in basic blocks with more than a single set.

gcc/ChangeLog:

	* ifcvt.cc (try_emit_cmove_seq): Modify comments.
	(noce_convert_multiple_sets_1): Modify comments.
	(bb_ok_for_noce_convert_multiple_sets): Allow more operations.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/ifcvt_multiple_sets_arithm.c: New test.

Signed-off-by: Manolis Tsamis <manolis.tsamis@vrull.eu>
---

 gcc/ifcvt.cc                                  | 60 +++++++++++------
 .../aarch64/ifcvt_multiple_sets_arithm.c      | 67 +++++++++++++++++++
 2 files changed, 108 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c
  

Comments

Robin Dapp July 3, 2023, 9:12 a.m. UTC | #1
Hi Manolis,

that looks like a nice enhancement of what's already possible.  The concern
I had some years back already was that this function would eventually
grow and cannibalize on some of what the other functions in ifcvt already
do :)  At some point we really should unify but that's not within the
scope of this patch.

IMHO we're already pretty far towards general "conditional execution"
with conditional increments, selects and so on (and the function is still
called "_noce") and historically the cond_exec functions would have
taken care of that.  To my knowledge though, none of the major backends
implements anything like (cond_exec ...) anymore and relies on bit-twiddling
tricks to generate the conditional instructions.

Have you checked whether cond_exec and others could be adjusted to
handle the conditional instructions you want to see?  They don't perform
full cost comparison though but just count.

I would expect a bit of discussion around that but from a first look
I don't have major concerns.

> -/* Return true iff basic block TEST_BB is comprised of only
> -   (SET (REG) (REG)) insns suitable for conversion to a series
> -   of conditional moves.  Also check that we have more than one set
> -   (other routines can handle a single set better than we would), and
> -   fewer than PARAM_MAX_RTL_IF_CONVERSION_INSNS sets.  While going
> +/* Return true iff basic block TEST_BB is suitable for conversion to a
> +   series of conditional moves.  Also check that we have more than one

Might want to change the "conditional moves" while you're at it.

>  
> -      if (!((REG_P (src) || CONSTANT_P (src))
> -	    || (GET_CODE (src) == SUBREG && REG_P (SUBREG_REG (src))
> -	      && subreg_lowpart_p (src))))
> +      /* Allow a wide range of operations and let the costing function decide
> +	 if the conversion is worth it later.  */
> +      enum rtx_code code = GET_CODE (src);
> +      if (!(CONSTANT_P (src)
> +	    || code == REG
> +	    || code == SUBREG
> +	    || code == ZERO_EXTEND
> +	    || code == SIGN_EXTEND
> +	    || code == NOT
> +	    || code == NEG
> +	    || code == PLUS
> +	    || code == MINUS
> +	    || code == AND
> +	    || code == IOR
> +	    || code == MULT
> +	    || code == ASHIFT
> +	    || code == ASHIFTRT
> +	    || code == NE
> +	    || code == EQ
> +	    || code == GE
> +	    || code == GT
> +	    || code == LE
> +	    || code == LT
> +	    || code == GEU
> +	    || code == GTU
> +	    || code == LEU
> +	    || code == LTU
> +	    || code == COMPARE))

We're potentially checking many more patterns than before.  Maybe it
would make sense to ask the backend whether it has a pattern for
the respective code?

Regards
 Robin
  
Manolis Tsamis July 4, 2023, 2:32 p.m. UTC | #2
On Mon, Jul 3, 2023 at 12:12 PM Robin Dapp <rdapp.gcc@gmail.com> wrote:
>
> Hi Manolis,
>
> that looks like a nice enhancement of what's already possible.  The concern
> I had some years back already was that this function would eventually
> grow and cannibalize on some of what the other functions in ifcvt already
> do :)  At some point we really should unify but that's not within the
> scope of this patch.
>

Hi Robin,

Indeed and it would be nice to extend the multi statement
implementation to the point that the others are not needed :)
I have some future plans to analyze cases where the multi-statement
performs worse and improve on that.

> IMHO we're already pretty far towards general "conditional execution"
> with conditional increments, selects and so on (and the function is still
> called "_noce") and historically the cond_exec functions would have
> taken care of that.  To my knowledge though, none of the major backends
> implements anything like (cond_exec ...) anymore and relies on bit-twiddling
> tricks to generate the conditional instructions.
>
> Have you checked whether cond_exec and others could be adjusted to
> handle the conditional instructions you want to see?  They don't perform
> full cost comparison though but just count.
>

Thanks for mentioning that, I was not really aware of cond_exec usage.
As you say, it looks like cond_exec isn't used very much on major backends.

Since noce_convert_multiple_sets_1 is just using the existing ifcvt
machinery (specifically noce_emit_cmove / try_emit_cmove_seq), is this
a question of whether we want to replace (if_then_else ...) with
(cond_exec ...) in general?
If that is beneficial then I could try to implement a change like
this, but that should probably be a separate effort from this
implementation.

> I would expect a bit of discussion around that but from a first look
> I don't have major concerns.
>
> > -/* Return true iff basic block TEST_BB is comprised of only
> > -   (SET (REG) (REG)) insns suitable for conversion to a series
> > -   of conditional moves.  Also check that we have more than one set
> > -   (other routines can handle a single set better than we would), and
> > -   fewer than PARAM_MAX_RTL_IF_CONVERSION_INSNS sets.  While going
> > +/* Return true iff basic block TEST_BB is suitable for conversion to a
> > +   series of conditional moves.  Also check that we have more than one
>
> Might want to change the "conditional moves" while you're at it.
>

Thanks for pointing out this comment, I missed it. I will rewrite the
relevant parts.

> >
> > -      if (!((REG_P (src) || CONSTANT_P (src))
> > -         || (GET_CODE (src) == SUBREG && REG_P (SUBREG_REG (src))
> > -           && subreg_lowpart_p (src))))
> > +      /* Allow a wide range of operations and let the costing function decide
> > +      if the conversion is worth it later.  */
> > +      enum rtx_code code = GET_CODE (src);
> > +      if (!(CONSTANT_P (src)
> > +         || code == REG
> > +         || code == SUBREG
> > +         || code == ZERO_EXTEND
> > +         || code == SIGN_EXTEND
> > +         || code == NOT
> > +         || code == NEG
> > +         || code == PLUS
> > +         || code == MINUS
> > +         || code == AND
> > +         || code == IOR
> > +         || code == MULT
> > +         || code == ASHIFT
> > +         || code == ASHIFTRT
> > +         || code == NE
> > +         || code == EQ
> > +         || code == GE
> > +         || code == GT
> > +         || code == LE
> > +         || code == LT
> > +         || code == GEU
> > +         || code == GTU
> > +         || code == LEU
> > +         || code == LTU
> > +         || code == COMPARE))
>
> We're potentially checking many more patterns than before.  Maybe it
> would make sense to ask the backend whether it has a pattern for
> the respective code?
>

Is it an issue if the backend doesn't have a pattern for a respective code?

My goal here is to not limit if conversion for sequences based on the
code but rather let ifcvt / the backedn decide based on costing.
That's because from what I've seen, conditional set instructions can
be beneficial even when the backend doesn't have a specific
instruction for that code.

Best,
Manolis

> Regards
>  Robin
>
  
Manolis Tsamis July 13, 2023, 2:11 p.m. UTC | #3
I resent this with just the change in the comment.
OK to merge?

Manolis

On Tue, Jul 4, 2023 at 5:32 PM Manolis Tsamis <manolis.tsamis@vrull.eu> wrote:
>
> On Mon, Jul 3, 2023 at 12:12 PM Robin Dapp <rdapp.gcc@gmail.com> wrote:
> >
> > Hi Manolis,
> >
> > that looks like a nice enhancement of what's already possible.  The concern
> > I had some years back already was that this function would eventually
> > grow and cannibalize on some of what the other functions in ifcvt already
> > do :)  At some point we really should unify but that's not within the
> > scope of this patch.
> >
>
> Hi Robin,
>
> Indeed and it would be nice to extend the multi statement
> implementation to the point that the others are not needed :)
> I have some future plans to analyze cases where the multi-statement
> performs worse and improve on that.
>
> > IMHO we're already pretty far towards general "conditional execution"
> > with conditional increments, selects and so on (and the function is still
> > called "_noce") and historically the cond_exec functions would have
> > taken care of that.  To my knowledge though, none of the major backends
> > implements anything like (cond_exec ...) anymore and relies on bit-twiddling
> > tricks to generate the conditional instructions.
> >
> > Have you checked whether cond_exec and others could be adjusted to
> > handle the conditional instructions you want to see?  They don't perform
> > full cost comparison though but just count.
> >
>
> Thanks for mentioning that, I was not really aware of cond_exec usage.
> As you say, it looks like cond_exec isn't used very much on major backends.
>
> Since noce_convert_multiple_sets_1 is just using the existing ifcvt
> machinery (specifically noce_emit_cmove / try_emit_cmove_seq), is this
> a question of whether we want to replace (if_then_else ...) with
> (cond_exec ...) in general?
> If that is beneficial then I could try to implement a change like
> this, but that should probably be a separate effort from this
> implementation.
>
> > I would expect a bit of discussion around that but from a first look
> > I don't have major concerns.
> >
> > > -/* Return true iff basic block TEST_BB is comprised of only
> > > -   (SET (REG) (REG)) insns suitable for conversion to a series
> > > -   of conditional moves.  Also check that we have more than one set
> > > -   (other routines can handle a single set better than we would), and
> > > -   fewer than PARAM_MAX_RTL_IF_CONVERSION_INSNS sets.  While going
> > > +/* Return true iff basic block TEST_BB is suitable for conversion to a
> > > +   series of conditional moves.  Also check that we have more than one
> >
> > Might want to change the "conditional moves" while you're at it.
> >
>
> Thanks for pointing out this comment, I missed it. I will rewrite the
> relevant parts.
>
> > >
> > > -      if (!((REG_P (src) || CONSTANT_P (src))
> > > -         || (GET_CODE (src) == SUBREG && REG_P (SUBREG_REG (src))
> > > -           && subreg_lowpart_p (src))))
> > > +      /* Allow a wide range of operations and let the costing function decide
> > > +      if the conversion is worth it later.  */
> > > +      enum rtx_code code = GET_CODE (src);
> > > +      if (!(CONSTANT_P (src)
> > > +         || code == REG
> > > +         || code == SUBREG
> > > +         || code == ZERO_EXTEND
> > > +         || code == SIGN_EXTEND
> > > +         || code == NOT
> > > +         || code == NEG
> > > +         || code == PLUS
> > > +         || code == MINUS
> > > +         || code == AND
> > > +         || code == IOR
> > > +         || code == MULT
> > > +         || code == ASHIFT
> > > +         || code == ASHIFTRT
> > > +         || code == NE
> > > +         || code == EQ
> > > +         || code == GE
> > > +         || code == GT
> > > +         || code == LE
> > > +         || code == LT
> > > +         || code == GEU
> > > +         || code == GTU
> > > +         || code == LEU
> > > +         || code == LTU
> > > +         || code == COMPARE))
> >
> > We're potentially checking many more patterns than before.  Maybe it
> > would make sense to ask the backend whether it has a pattern for
> > the respective code?
> >
>
> Is it an issue if the backend doesn't have a pattern for a respective code?
>
> My goal here is to not limit if conversion for sequences based on the
> code but rather let ifcvt / the backedn decide based on costing.
> That's because from what I've seen, conditional set instructions can
> be beneficial even when the backend doesn't have a specific
> instruction for that code.
>
> Best,
> Manolis
>
> > Regards
> >  Robin
> >
  

Patch

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index fd1ce8a1049..a9e5352a0a0 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -3213,13 +3213,13 @@  try_emit_cmove_seq (struct noce_if_info *if_info, rtx temp,
 /* We have something like:
 
      if (x > y)
-       { i = a; j = b; k = c; }
+       { i = EXPR_A; j = EXPR_B; k = EXPR_C; }
 
    Make it:
 
-     tmp_i = (x > y) ? a : i;
-     tmp_j = (x > y) ? b : j;
-     tmp_k = (x > y) ? c : k;
+     tmp_i = (x > y) ? EXPR_A : i;
+     tmp_j = (x > y) ? EXPR_B : j;
+     tmp_k = (x > y) ? EXPR_C : k;
      i = tmp_i;
      j = tmp_j;
      k = tmp_k;
@@ -3635,11 +3635,10 @@  noce_convert_multiple_sets_1 (struct noce_if_info *if_info,
 
 
 
-/* Return true iff basic block TEST_BB is comprised of only
-   (SET (REG) (REG)) insns suitable for conversion to a series
-   of conditional moves.  Also check that we have more than one set
-   (other routines can handle a single set better than we would), and
-   fewer than PARAM_MAX_RTL_IF_CONVERSION_INSNS sets.  While going
+/* Return true iff basic block TEST_BB is suitable for conversion to a
+   series of conditional moves.  Also check that we have more than one
+   set (other routines can handle a single set better than we would),
+   and fewer than PARAM_MAX_RTL_IF_CONVERSION_INSNS sets.  While going
    through the insns store the sum of their potential costs in COST.  */
 
 static bool
@@ -3665,20 +3664,43 @@  bb_ok_for_noce_convert_multiple_sets (basic_block test_bb, unsigned *cost)
       rtx dest = SET_DEST (set);
       rtx src = SET_SRC (set);
 
-      /* We can possibly relax this, but for now only handle REG to REG
-	 (including subreg) moves.  This avoids any issues that might come
-	 from introducing loads/stores that might violate data-race-freedom
-	 guarantees.  */
-      if (!REG_P (dest))
+      /* Do not handle anything involving memory loads/stores since it might
+	 violate data-race-freedom guarantees.  */
+      if (!REG_P (dest) || contains_mem_rtx_p (src))
 	return false;
 
-      if (!((REG_P (src) || CONSTANT_P (src))
-	    || (GET_CODE (src) == SUBREG && REG_P (SUBREG_REG (src))
-	      && subreg_lowpart_p (src))))
+      /* Allow a wide range of operations and let the costing function decide
+	 if the conversion is worth it later.  */
+      enum rtx_code code = GET_CODE (src);
+      if (!(CONSTANT_P (src)
+	    || code == REG
+	    || code == SUBREG
+	    || code == ZERO_EXTEND
+	    || code == SIGN_EXTEND
+	    || code == NOT
+	    || code == NEG
+	    || code == PLUS
+	    || code == MINUS
+	    || code == AND
+	    || code == IOR
+	    || code == MULT
+	    || code == ASHIFT
+	    || code == ASHIFTRT
+	    || code == NE
+	    || code == EQ
+	    || code == GE
+	    || code == GT
+	    || code == LE
+	    || code == LT
+	    || code == GEU
+	    || code == GTU
+	    || code == LEU
+	    || code == LTU
+	    || code == COMPARE))
 	return false;
 
-      /* Destination must be appropriate for a conditional write.  */
-      if (!noce_operand_ok (dest))
+      /* Destination and source must be appropriate.  */
+      if (!noce_operand_ok (dest) || !noce_operand_ok (src))
 	return false;
 
       /* We must be able to conditionally move in this mode.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c b/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c
new file mode 100644
index 00000000000..f29cc72263a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c
@@ -0,0 +1,67 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-ce1" } */
+
+void sink2(int, int);
+void sink3(int, int, int);
+
+void cond_1(int cond, int x, int y) {
+    if (cond) {
+        x = x << 4;
+        y = 1;
+    }
+
+    sink2(x, y);
+}
+
+void cond_2(int cond, int x, int y) {
+    if (cond) {
+        x++;
+        y++;
+    }
+
+    sink2(x, y);
+}
+
+void cond_3(int cond, int x1, int x2, int x3) {
+    if (cond) {
+        x1++;
+        x2++;
+        x3++;
+    }
+
+    sink3(x1, x2, x3);
+}
+
+void cond_4(int cond, int x, int y) {
+    if (cond) {
+        x += 2;
+        y += 3;
+    }
+
+    sink2(x, y);
+}
+
+void cond_5(int cond, int x, int y, int r1, int r2) {
+    if (cond) {
+        x = r1 + 2;
+        y = r2 - 34;
+    }
+
+    sink2(x, y);
+}
+
+void cond_6(int cond, int x, int y) {
+    if (cond) {
+        x = -x;
+        y = ~y;
+    }
+
+    sink2(x, y);
+}
+
+/* { dg-final { scan-assembler-times "cinc\t" 5 } } */
+/* { dg-final { scan-assembler-times "csneg\t" 1 } } */
+/* { dg-final { scan-assembler-times "csinv\t" 1 } } */
+/* { dg-final { scan-assembler "csel\t" 1 } } */
+
+/* { dg-final { scan-rtl-dump-times "if-conversion succeeded through noce_convert_multiple_sets" 6 "ce1" } } */
\ No newline at end of file