[v2] PR 91865: Avoid ZERO_EXTEND of ZERO_EXTEND in make_compound_operation.

Message ID 001501da0724$b12ffc20$138ff460$@nextmovesoftware.com
State Accepted

Checks

Context                Check    Description
snail/gcc-patch-check  success  Github commit url

Commit Message

Roger Sayle Oct. 25, 2023, 9:21 a.m. UTC
  Hi Jeff,
Many thanks for the review/approval of my fix for PR rtl-optimization/91865.
Based on your and Richard Biener's feedback, I’d like to propose a revision
calling simplify_unary_operation instead of simplify_const_unary_operation
(i.e. Richi's recommendation).  I was originally concerned that this might
potentially result in unbounded recursion, and testing for ZERO_EXTEND was
safer but "uglier", but testing hasn't shown any issues.  If we do see issues
in the future, it's easy to fall back to the previous version of this patch.
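
To make the difference between the two entry points concrete, here is a
minimal toy sketch in C. It is not GCC's code: struct toy_expr and the
toy_* functions are hypothetical stand-ins for rtx and the simplify-rtx
routines, and the widths used are illustrative. It only models the idea
that a constant-only folder gives up on a non-constant operand, while a
general simplifier that tries constant folding first and structural rules
second can collapse a nested zero extension.

```c
#include <assert.h>

/* Toy expression node.  This is NOT GCC's rtx; it carries just enough
   state to model a register, a constant, and a zero extension.  */
enum toy_code { TOY_REG, TOY_CONST, TOY_ZERO_EXTEND };

struct toy_expr
{
  enum toy_code code;
  unsigned bits;               /* Width of the result in bits.  */
  unsigned long value;         /* Constant value (TOY_CONST only).  */
  int regno;                   /* Register number (TOY_REG only).  */
  const struct toy_expr *op;   /* Operand (TOY_ZERO_EXTEND only).  */
};

/* Constant-only folding: succeeds (returns 1) only when OP is a
   constant, in which case OUT receives the zero-extended constant.  */
static int
toy_simplify_const_zext (unsigned bits, const struct toy_expr *op,
                         struct toy_expr *out)
{
  if (op->code != TOY_CONST)
    return 0;
  assert (op->bits < 32);      /* Keep the shift below well defined.  */
  unsigned long mask = (1UL << op->bits) - 1;
  *out = (struct toy_expr) { .code = TOY_CONST, .bits = bits,
                             .value = op->value & mask };
  return 1;
}

/* General simplification: try constant folding first, then the
   structural rule zext (zext (X)) -> zext (X).  */
static int
toy_simplify_zext (unsigned bits, const struct toy_expr *op,
                   struct toy_expr *out)
{
  if (toy_simplify_const_zext (bits, op, out))
    return 1;
  if (op->code == TOY_ZERO_EXTEND)
    {
      *out = (struct toy_expr) { .code = TOY_ZERO_EXTEND, .bits = bits,
                                 .op = op->op };
      return 1;
    }
  return 0;
}
```

Given a toy 8-bit register wrapped in a 16-bit zero extension, calling
toy_simplify_const_zext for a wider mode returns 0, whereas
toy_simplify_zext returns a single zero extension of the register. The
structural rule does not call back into the simplifier, so this toy
version cannot recurse.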

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-10-25  Roger Sayle  <roger@nextmovesoftware.com>
            Richard Biener  <rguenther@suse.de>

gcc/ChangeLog
        PR rtl-optimization/91865
        * combine.cc (make_compound_operation): Avoid creating a
        ZERO_EXTEND of a ZERO_EXTEND.

gcc/testsuite/ChangeLog
        PR rtl-optimization/91865
        * gcc.target/msp430/pr91865.c: New test case.


Thanks again,
Roger
--

> -----Original Message-----
> From: Jeff Law <jeffreyalaw@gmail.com>
> Sent: 19 October 2023 16:20
> 
> On 10/14/23 16:14, Roger Sayle wrote:
> >
> > This patch is my proposed solution to PR rtl-optimization/91865.
> > Normally RTX simplification canonicalizes a ZERO_EXTEND of a
> > ZERO_EXTEND to a single ZERO_EXTEND, but as shown in this PR it is
> > possible for combine's make_compound_operation to unintentionally
> > generate a non-canonical ZERO_EXTEND of a ZERO_EXTEND, which is
> > unlikely to be matched by the backend.
> >
> > For the new test case:
> >
> > const int table[2] = {1, 2};
> > int foo (char i) { return table[i]; }
> >
> > compiling with -O2 -mlarge on msp430 we currently see:
> >
> > Trying 2 -> 7:
> >      2: r25:HI=zero_extend(R12:QI)
> >        REG_DEAD R12:QI
> >      7: r28:PSI=sign_extend(r25:HI)#0
> >        REG_DEAD r25:HI
> > Failed to match this instruction:
> > (set (reg:PSI 28 [ iD.1772 ])
> >      (zero_extend:PSI (zero_extend:HI (reg:QI 12 R12 [ iD.1772 ]))))
> >
> > which results in the following code:
> >
> > foo:    AND     #0xff, R12
> >          RLAM.A #4, R12 { RRAM.A #4, R12
> >          RLAM.A  #1, R12
> >          MOVX.W  table(R12), R12
> >          RETA
> >
> > With this patch, we now see:
> >
> > Trying 2 -> 7:
> >      2: r25:HI=zero_extend(R12:QI)
> >        REG_DEAD R12:QI
> >      7: r28:PSI=sign_extend(r25:HI)#0
> >        REG_DEAD r25:HI
> > Successfully matched this instruction:
> > (set (reg:PSI 28 [ iD.1772 ])
> >      (zero_extend:PSI (reg:QI 12 R12 [ iD.1772 ])))
> > allowing combination of insns 2 and 7
> > original costs 4 + 8 = 12
> > replacement cost 8
> >
> > foo:    MOV.B   R12, R12
> >          RLAM.A  #1, R12
> >          MOVX.W  table(R12), R12
> >          RETA
> >
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32}
> > with no new failures.  Ok for mainline?
> >
> > 2023-10-14  Roger Sayle  <roger@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >          PR rtl-optimization/91865
> >          * combine.cc (make_compound_operation): Avoid creating a
> >          ZERO_EXTEND of a ZERO_EXTEND.
> Final question.  Is there a reasonable expectation that we could get a
> similar situation with sign extensions?   If so we probably ought to try
> and handle both.
> 
> OK with the obvious change to handle nested sign extensions if you think it's
> useful to do so.  And OK as-is if you don't think handling nested sign extensions is
> useful.
> 
> jeff
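
The arithmetic behind the thread can be checked exhaustively in plain C.
The sketch below is illustrative only: the fixed-width types uint8_t,
uint16_t and uint32_t (and their signed counterparts) stand in for the
narrow, middle and wide modes, and those widths are an assumption rather
than the msp430 mode sizes. It checks that a nested zero extension equals
a single one, that the same holds for nested sign extensions, and that
sign-extending an already zero-extended value equals zero-extending it.
It says nothing about whether combine actually produces nested sign
extensions.

```c
#include <assert.h>
#include <stdint.h>

/* Nested versus single zero extension of an 8-bit value.  */
static uint32_t zext_nested (uint8_t x) { return (uint32_t) (uint16_t) x; }
static uint32_t zext_single (uint8_t x) { return (uint32_t) x; }

/* Nested versus single sign extension of an 8-bit value.  */
static int32_t sext_nested (int8_t x) { return (int32_t) (int16_t) x; }
static int32_t sext_single (int8_t x) { return (int32_t) x; }

/* Sign extension of a zero-extended value.  The intermediate value is
   at most 255, so it always fits in int16_t.  */
static int32_t
sext_of_zext (uint8_t x)
{
  return (int32_t) (int16_t) (uint16_t) x;
}

/* Count the inputs for which any of the three identities fails.  */
static int
count_mismatches (void)
{
  int bad = 0;
  for (int v = 0; v < 256; v++)
    {
      uint8_t u = (uint8_t) v;
      int8_t s = (int8_t) (v - 128);   /* Covers -128 .. 127.  */
      if (zext_nested (u) != zext_single (u))
        bad++;
      if (sext_nested (s) != sext_single (s))
        bad++;
      if (sext_of_zext (u) != (int32_t) zext_single (u))
        bad++;
    }
  return bad;
}

/* Abort if any identity fails for any 8-bit input.  */
static void
check_extension_identities (void)
{
  assert (count_mismatches () == 0);
}
```

Calling check_extension_identities () from a test driver exercises all
256 inputs for each identity.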
  

Comments

Jeff Law Oct. 25, 2023, 4:26 p.m. UTC | #1
On 10/25/23 03:21, Roger Sayle wrote:
> 
> Hi Jeff,
> Many thanks for the review/approval of my fix for PR rtl-optimization/91865.
> Based on your and Richard Biener's feedback, I’d like to propose a revision
> calling simplify_unary_operation instead of simplify_const_unary_operation
> (i.e. Richi's recommendation).  I was originally concerned that this might
> potentially result in unbounded recursion, and testing for ZERO_EXTEND was
> safer but "uglier", but testing hasn't shown any issues.  If we do see issues
> in the future, it's easy to fall back to the previous version of this patch.
> 
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
> 
> 
> 2023-10-25  Roger Sayle  <roger@nextmovesoftware.com>
>              Richard Biener  <rguenther@suse.de>
> 
> gcc/ChangeLog
>          PR rtl-optimization/91865
>          * combine.cc (make_compound_operation): Avoid creating a
>          ZERO_EXTEND of a ZERO_EXTEND.
> 
> gcc/testsuite/ChangeLog
>          PR rtl-optimization/91865
>          * gcc.target/msp430/pr91865.c: New test case.
I'm not terribly worried about recursion.  For the case you want to
handle, it's going to be picked up by the call to
simplify_const_unary_operation at the start of simplify_unary_operation.
It's only if that fails that we call into simplify_unary_operation_1.

The only thing that even comes close to worrisome to me in this space is 
the asserts in do_SUBST.  But I don't think your patch is likely to make 
the problems with those asserts any worse than they already are.

OK for the trunk.

Jeff
  

Patch

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 360aa2f25e6..b1b16ac7bb2 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -8449,8 +8449,8 @@  make_compound_operation (rtx x, enum rtx_code in_code)
   if (code == ZERO_EXTEND)
     {
       new_rtx = make_compound_operation (XEXP (x, 0), next_code);
-      tem = simplify_const_unary_operation (ZERO_EXTEND, GET_MODE (x),
-					    new_rtx, GET_MODE (XEXP (x, 0)));
+      tem = simplify_unary_operation (ZERO_EXTEND, GET_MODE (x),
+				      new_rtx, GET_MODE (XEXP (x, 0)));
       if (tem)
 	return tem;
       SUBST (XEXP (x, 0), new_rtx);
diff --git a/gcc/testsuite/gcc.target/msp430/pr91865.c b/gcc/testsuite/gcc.target/msp430/pr91865.c
new file mode 100644
index 00000000000..8cc21c8b9e8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/msp430/pr91865.c
@@ -0,0 +1,8 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -mlarge" } */
+
+const int table[2] = {1, 2};
+int foo (char i) { return table[i]; }
+
+/* { dg-final { scan-assembler-not "AND" } } */
+/* { dg-final { scan-assembler-not "RRAM" } } */