tree-ssa-math-opts: Small uaddc/usubc pattern matching improvement [PR79173]

Message ID ZJHXpbRnyrPrFROM@tucnak
State Unresolved

Commit Message

Jakub Jelinek June 20, 2023, 4:45 p.m. UTC
  Hi!

In the following testcase we fail to pattern-recognize the least significant
.UADDC call.  The reason is that arg3 in that case is
  _3 = .ADD_OVERFLOW (...);
  _2 = __imag__ _3;
  _1 = _2 != 0;
  arg3 = (unsigned long) _1;
Before the changes, arg3 has a single use in some later .ADD_OVERFLOW.
We then add a .UADDC call next to it (we gsi_remove/gsi_replace only what
is strictly necessary and leave quite a few dead stmts around for the next
DCE to clean up), so all of a sudden arg3 is used not once but twice
(.ADD_OVERFLOW and .UADDC), and uaddc_cast fails.  While we could tweak
uaddc_cast not to require has_single_use for these uses, there is also
no vrp pass that would figure out that, because __imag__ _3 is in the
[0, 1] range, it can just use arg3 = __imag__ _3; and drop the comparison
and cast.
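To illustrate why the comparison and cast are redundant, here is a minimal
sketch in plain C++ rather than GIMPLE.  It assumes the GCC/Clang builtin
__builtin_add_overflow as a stand-in for .ADD_OVERFLOW; the function names
are hypothetical and are not part of the patch.

```cpp
#include <cassert>
#include <climits>

// Stand-in for:  _3 = .ADD_OVERFLOW (a, b);  _2 = __imag__ _3;
// The overflow flag is returned in the operand type and is always 0 or 1.
unsigned long imag_of_add_overflow(unsigned long a, unsigned long b)
{
  unsigned long sum;
  return __builtin_add_overflow(a, b, &sum) ? 1UL : 0UL;
}

// Long form:  _1 = _2 != 0;  arg3 = (unsigned long) _1;
unsigned long carry_compare_and_cast(unsigned long im)
{
  bool ne0 = im != 0;
  return static_cast<unsigned long>(ne0);
}

// Short form:  arg3 = _2;
// Equivalent to the long form only because im is known to be 0 or 1.
unsigned long carry_direct(unsigned long im)
{
  return im;
}
```

For any a and b, carry_compare_and_cast (imag_of_add_overflow (a, b)) and
carry_direct (imag_of_add_overflow (a, b)) return the same value, which is
the equivalence the patch relies on when it uses the __imag__ lhs directly.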

We already check whether either arg2 or arg3 is ultimately set from __imag__
of a .{{ADD,SUB}_OVERFLOW,U{ADD,SUB}C} call, so the following patch just
remembers the lhs of that __imag__ and later uses it as arg3 if it has the
right type.
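The idea of remembering the __imag__ lhs while walking the definition chain
can be sketched on a toy IR.  This is only an illustration under loud
assumptions: Op, Stmt and find_imag_lhs are hypothetical and are not GCC's
data structures or helpers, and the sketch ignores the has_single_use and
type checks that the real code performs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature IR; NOT GCC's gimple representation.
enum class Op { AddOverflow, ImagPart, NeZero, Cast, Other };

struct Stmt
{
  Op op;
  int operand;  // index of the stmt defining the single operand, or -1
};

// Walk back from stmts[start] through an optional cast and an optional
// "!= 0" comparison.  If that reaches an ImagPart fed by an AddOverflow
// call, return the index of the ImagPart stmt (standing in for its lhs),
// otherwise return -1.
int find_imag_lhs(const std::vector<Stmt> &stmts, int start)
{
  int i = start;
  if (i >= 0 && stmts[i].op == Op::Cast)
    i = stmts[i].operand;
  if (i >= 0 && stmts[i].op == Op::NeZero)
    i = stmts[i].operand;
  if (i < 0 || stmts[i].op != Op::ImagPart)
    return -1;
  int call = stmts[i].operand;
  if (call < 0 || stmts[call].op != Op::AddOverflow)
    return -1;
  return i;  // remembered so it can later be used in place of the cast
}
```

With the chain from the description (call, __imag__, != 0, cast) stored at
indices 0 to 3, find_imag_lhs (stmts, 3) returns 1, the index of the
__imag__ stmt, so a later step can use that value instead of the cast.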

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-06-20  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/79173
	* tree-ssa-math-opts.cc (match_uaddc_usubc): Remember lhs of
	IMAGPART_EXPR of arg2/arg3 and use that as arg3 if it has the right
	type.

	* g++.target/i386/pr79173-1.C: New test.


	Jakub
  

Comments

Richard Biener June 20, 2023, 6:03 p.m. UTC | #1
> On 20.06.2023 at 18:46, Jakub Jelinek via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok

Richard 


Patch

--- gcc/tree-ssa-math-opts.cc.jj	2023-06-20 08:57:38.000000000 +0200
+++ gcc/tree-ssa-math-opts.cc	2023-06-20 10:33:52.969805538 +0200
@@ -4728,6 +4728,7 @@  match_uaddc_usubc (gimple_stmt_iterator
   if (!types_compatible_p (type, TREE_TYPE (arg1)))
     return false;
   int kind[2] = { 0, 0 };
+  tree arg_im[2] = { NULL_TREE, NULL_TREE };
   /* At least one of arg2 and arg3 should have type compatible
      with arg1/rhs[0], and the other one should have value in [0, 1]
      range.  If both are in [0, 1] range and type compatible with
@@ -4758,6 +4759,7 @@  match_uaddc_usubc (gimple_stmt_iterator
 	  g = uaddc_ne0 (g);
 	  if (!uaddc_is_cplxpart (g, IMAGPART_EXPR))
 	    continue;
+	  arg_im[i] = gimple_assign_lhs (g);
 	  g = SSA_NAME_DEF_STMT (TREE_OPERAND (gimple_assign_rhs1 (g), 0));
 	  if (!is_gimple_call (g) || !gimple_call_internal_p (g))
 	    continue;
@@ -4781,6 +4783,7 @@  match_uaddc_usubc (gimple_stmt_iterator
     {
       std::swap (arg2, arg3);
       std::swap (kind[0], kind[1]);
+      std::swap (arg_im[0], arg_im[1]);
     }
   if ((kind[0] & 1) == 0 || (kind[1] & 6) == 0)
     return false;
@@ -4810,6 +4813,8 @@  match_uaddc_usubc (gimple_stmt_iterator
   /* Build .UADDC/.USUBC call which will be placed before the stmt.  */
   gimple_stmt_iterator gsi2 = gsi_for_stmt (ovf2);
   gimple *g;
+  if ((kind[1] & 4) != 0 && types_compatible_p (type, TREE_TYPE (arg_im[1])))
+    arg3 = arg_im[1];
   if ((kind[1] & 1) == 0)
     {
       if (TREE_CODE (arg3) == INTEGER_CST)
--- gcc/testsuite/g++.target/i386/pr79173-1.C.jj	2023-06-20 09:44:37.515578731 +0200
+++ gcc/testsuite/g++.target/i386/pr79173-1.C	2023-06-20 10:35:33.650418101 +0200
@@ -0,0 +1,33 @@ 
+// PR middle-end/79173
+// { dg-do compile { target c++11 } }
+// { dg-options "-O2 -fno-stack-protector -masm=att" }
+// { dg-final { scan-assembler-times "addq\t%r\[^\n\r]*, \\\(%rdi\\\)" 1 { target lp64 } } }
+// { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 8\\\(%rdi\\\)" 1 { target lp64 } } }
+// { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 16\\\(%rdi\\\)" 1 { target lp64 } } }
+// { dg-final { scan-assembler-times "adcq\t%r\[^\n\r]*, 24\\\(%rdi\\\)" 1 { target lp64 } } }
+// { dg-final { scan-assembler-times "addl\t%e\[^\n\r]*, \\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } }
+// { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 4\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } }
+// { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 8\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } }
+// { dg-final { scan-assembler-times "adcl\t%e\[^\n\r]*, 12\\\(%e\[^\n\r]*\\\)" 1 { target ia32 } } }
+
+template <typename T>
+inline constexpr T
+uaddc (T x, T y, T carry_in, T &carry_out) noexcept
+{
+  [[gnu::assume (carry_in <= 1)]];
+  x += y;
+  carry_out = x < y;
+  x += carry_in;
+  carry_out += x < carry_in;
+  return x;
+}
+
+void
+foo (unsigned long *p, unsigned long *q)
+{
+  unsigned long c;
+  p[0] = uaddc (p[0], q[0], 0UL, c);
+  p[1] = uaddc (p[1], q[1], c, c);
+  p[2] = uaddc (p[2], q[2], c, c);
+  p[3] = uaddc (p[3], q[3], c, c);
+}