c, c++, v2: Avoid incorrect shortening of divisions [PR108365]

Message ID Y8BuGyewY8o2YFkR@tucnak
State Unresolved
Headers
Series c, c++, v2: Avoid incorrect shortening of divisions [PR108365] |

Checks

Context Check Description
snail/gcc-patch-check warning Git am fail log

Commit Message

Jakub Jelinek Jan. 12, 2023, 8:31 p.m. UTC
  On Thu, Jan 12, 2023 at 08:55:32PM +0100, Jakub Jelinek via Gcc-patches wrote:
> > > So, the following patch for the NOP_EXPR cases checks just in case that
> > > it is from integral type and more importantly checks it is a widening
> > > conversion, and then next to it also allows op0 to be just unsigned,
> > > promoted or not, as that is what the C FE will do for those cases too
> > > and I believe it must work - either the division/modulo common type
> > > will be that unsigned type, then we can shorten and don't need to worry
> > > about UB, or it will be some wider signed type but then it can't be most
> > > negative value of the wider type.
> > 
> > Why not use the same condition in C and C++?
> 
> I can test that.  Do you mean change the C FE to match the patched C++
> or change C++ FE to just test TYPE_UNSIGNED (orig_op0)?
> I think both should work, though what I wrote perhaps can shorten in more
> cases.  Can try to construct testcases where it differs...

E.g.
int f1 (int x, int y) { return (unsigned) x / y; }
unsigned short f2 (unsigned short x, unsigned short y) { return (unsigned) x / y; }
unsigned int f3 (unsigned int x, unsigned int y) { return (long long) x / y; }
C++ FE before and after the patch shortens the division in f2 and f3,
C FE shortens only in f2.  So using the C FE condition would be a regression
for C++.

Therefore I'm going to test following patch:

2023-01-12  Jakub Jelinek  <jakub@redhat.com>

	PR c++/108365
	* c-typeck.cc (build_binary_op): For integral division or modulo,
	shorten if type0 is unsigned, or op0 is cast from narrower unsigned
	integral type or op1 is INTEGER_CST other than -1.

	* typeck.cc (cp_build_binary_op): For integral division or modulo,
	shorten if type0 is unsigned, or op0 is cast from narrower unsigned
	integral type or stripped_op1 is INTEGER_CST other than -1.

	* c-c++-common/pr108365.c: New test.
	* g++.dg/opt/pr108365.C: New test.
	* g++.dg/warn/pr108365.C: New test.



	Jakub
  

Comments

Jakub Jelinek Jan. 13, 2023, 12:25 a.m. UTC | #1
On Thu, Jan 12, 2023 at 09:31:23PM +0100, Jakub Jelinek via Gcc-patches wrote:
> On Thu, Jan 12, 2023 at 08:55:32PM +0100, Jakub Jelinek via Gcc-patches wrote:
> > > > So, the following patch for the NOP_EXPR cases checks just in case that
> > > > it is from integral type and more importantly checks it is a widening
> > > > conversion, and then next to it also allows op0 to be just unsigned,
> > > > promoted or not, as that is what the C FE will do for those cases too
> > > > and I believe it must work - either the division/modulo common type
> > > > will be that unsigned type, then we can shorten and don't need to worry
> > > > about UB, or it will be some wider signed type but then it can't be most
> > > > negative value of the wider type.
> > > 
> > > Why not use the same condition in C and C++?
> > 
> > I can test that.  Do you mean change the C FE to match the patched C++
> > or change C++ FE to just test TYPE_UNSIGNED (orig_op0)?
> > I think both should work, though what I wrote perhaps can shorten in more
> > cases.  Can try to construct testcases where it differs...
> 
> E.g.
> int f1 (int x, int y) { return (unsigned) x / y; }
> unsigned short f2 (unsigned short x, unsigned short y) { return (unsigned) x / y; }
> unsigned int f3 (unsigned int x, unsigned int y) { return (long long) x / y; }
> C++ FE before and after the patch shortens the division in f2 and f3,
> C FE shortens only in f2.  So using the C FE condition would be a regression
> for C++.
> 
> Therefore I'm going to test following patch:

Bootstrapped/regtested successfully on x86_64-linux and i686-linux.

	Jakub
  
Jason Merrill Jan. 13, 2023, 4:58 p.m. UTC | #2
On 1/12/23 15:31, Jakub Jelinek wrote:
> On Thu, Jan 12, 2023 at 08:55:32PM +0100, Jakub Jelinek via Gcc-patches wrote:
>>>> So, the following patch for the NOP_EXPR cases checks just in case that
>>>> it is from integral type and more importantly checks it is a widening
>>>> conversion, and then next to it also allows op0 to be just unsigned,
>>>> promoted or not, as that is what the C FE will do for those cases too
>>>> and I believe it must work - either the division/modulo common type
>>>> will be that unsigned type, then we can shorten and don't need to worry
>>>> about UB, or it will be some wider signed type but then it can't be most
>>>> negative value of the wider type.
>>>
>>> Why not use the same condition in C and C++?
>>
>> I can test that.  Do you mean change the C FE to match the patched C++
>> or change C++ FE to just test TYPE_UNSIGNED (orig_op0)?
>> I think both should work, though what I wrote perhaps can shorten in more
>> cases.  Can try to construct testcases where it differs...
> 
> E.g.
> int f1 (int x, int y) { return (unsigned) x / y; }
> unsigned short f2 (unsigned short x, unsigned short y) { return (unsigned) x / y; }
> unsigned int f3 (unsigned int x, unsigned int y) { return (long long) x / y; }
> C++ FE before and after the patch shortens the division in f2 and f3,
> C FE shortens only in f2.  So using the C FE condition would be a regression
> for C++.
> 
> Therefore I'm going to test following patch:

LGTM, though we might put that condition in c-common somewhere?

> 2023-01-12  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR c++/108365
> 	* c-typeck.cc (build_binary_op): For integral division or modulo,
> 	shorten if type0 is unsigned, or op0 is cast from narrower unsigned
> 	integral type or op1 is INTEGER_CST other than -1.
> 
> 	* typeck.cc (cp_build_binary_op): For integral division or modulo,
> 	shorten if type0 is unsigned, or op0 is cast from narrower unsigned
> 	integral type or stripped_op1 is INTEGER_CST other than -1.
> 
> 	* c-c++-common/pr108365.c: New test.
> 	* g++.dg/opt/pr108365.C: New test.
> 	* g++.dg/warn/pr108365.C: New test.
> 
> --- gcc/c/c-typeck.cc.jj	2022-11-13 12:29:08.197504249 +0100
> +++ gcc/c/c-typeck.cc	2023-01-12 21:06:53.310875131 +0100
> @@ -12431,7 +12431,14 @@ build_binary_op (location_t location, en
>   	       undefined if the quotient can't be represented in the
>   	       computation mode.  We shorten only if unsigned or if
>   	       dividing by something we know != -1.  */
> -	    shorten = (TYPE_UNSIGNED (TREE_TYPE (orig_op0))
> +	    shorten = (TYPE_UNSIGNED (type0)
> +		       || (TREE_CODE (op0) == NOP_EXPR
> +			   && INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op0,
> +									0)))
> +			   && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0)))
> +			   && (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (op0,
> +									0)))
> +			       < TYPE_PRECISION (type0)))
>   		       || (TREE_CODE (op1) == INTEGER_CST
>   			   && !integer_all_onesp (op1)));
>   	  common = 1;
> @@ -12467,7 +12474,12 @@ build_binary_op (location_t location, en
>   	     on some targets, since the modulo instruction is undefined if the
>   	     quotient can't be represented in the computation mode.  We shorten
>   	     only if unsigned or if dividing by something we know != -1.  */
> -	  shorten = (TYPE_UNSIGNED (TREE_TYPE (orig_op0))
> +	  shorten = (TYPE_UNSIGNED (type0)
> +		     || (TREE_CODE (op0) == NOP_EXPR
> +			 && INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
> +			 && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0)))
> +			 && (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (op0, 0)))
> +			     < TYPE_PRECISION (type0)))
>   		     || (TREE_CODE (op1) == INTEGER_CST
>   			 && !integer_all_onesp (op1)));
>   	  common = 1;
> --- gcc/cp/typeck.cc.jj	2023-01-11 12:47:56.099672340 +0100
> +++ gcc/cp/typeck.cc	2023-01-12 21:04:23.738022528 +0100
> @@ -5455,8 +5455,15 @@ cp_build_binary_op (const op_location_t
>   		 point, so we have to dig out the original type to find out if
>   		 it was unsigned.  */
>   	      tree stripped_op1 = tree_strip_any_location_wrapper (op1);
> -	      shorten = ((TREE_CODE (op0) == NOP_EXPR
> -			  && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0))))
> +	      shorten = (TYPE_UNSIGNED (type0)
> +			 || (TREE_CODE (op0) == NOP_EXPR
> +			     && INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op0,
> +									  0)))
> +			     && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0,
> +									0)))
> +			     && (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (op0,
> +									  0)))
> +				 < TYPE_PRECISION (type0)))
>   			 || (TREE_CODE (stripped_op1) == INTEGER_CST
>   			     && ! integer_all_onesp (stripped_op1)));
>   	    }
> @@ -5491,8 +5498,12 @@ cp_build_binary_op (const op_location_t
>   	     quotient can't be represented in the computation mode.  We shorten
>   	     only if unsigned or if dividing by something we know != -1.  */
>   	  tree stripped_op1 = tree_strip_any_location_wrapper (op1);
> -	  shorten = ((TREE_CODE (op0) == NOP_EXPR
> -		      && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0))))
> +	  shorten = (TYPE_UNSIGNED (type0)
> +		     || (TREE_CODE (op0) == NOP_EXPR
> +			 && INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
> +			 && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0)))
> +			 && (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (op0, 0)))
> +			     < TYPE_PRECISION (type0)))
>   		     || (TREE_CODE (stripped_op1) == INTEGER_CST
>   			 && ! integer_all_onesp (stripped_op1)));
>   	  common = 1;
> --- gcc/testsuite/c-c++-common/pr108365.c.jj	2023-01-12 21:27:05.825467236 +0100
> +++ gcc/testsuite/c-c++-common/pr108365.c	2023-01-12 21:26:44.095779209 +0100
> @@ -0,0 +1,16 @@
> +/* PR c++/108365 */
> +/* { dg-do compile { target { { ilp32 || lp64 } || llp64 } } } */
> +/* { dg-options "-O2 -fdump-tree-gimple" } */
> +/* { dg-final { scan-tree-dump-not " \\\((int|unsigned short int|long long int|unsigned int)\\\) " "gimple" } } */
> +
> +unsigned short
> +foo (unsigned short x, unsigned short y)
> +{
> +  return (unsigned) x / y;
> +}
> +
> +unsigned int
> +bar (unsigned int x, unsigned int y)
> +{
> +  return (long long) x / y;
> +}
> --- gcc/testsuite/g++.dg/opt/pr108365.C.jj	2023-01-12 21:04:23.738022528 +0100
> +++ gcc/testsuite/g++.dg/opt/pr108365.C	2023-01-12 21:04:23.738022528 +0100
> @@ -0,0 +1,13 @@
> +// PR c++/108365
> +// { dg-do run }
> +
> +char b = 1;
> +
> +int
> +main ()
> +{
> +#if __CHAR_BIT__ == 8 && __SIZEOF_SHORT__ == 2 && __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8
> +  while ((short) ((long long) (unsigned long long) (-__INT_MAX__ - 1) / (long long) (b ? -1 : 0)))
> +    ;
> +#endif
> +}
> --- gcc/testsuite/g++.dg/warn/pr108365.C.jj	2023-01-12 21:04:23.738022528 +0100
> +++ gcc/testsuite/g++.dg/warn/pr108365.C	2023-01-12 21:04:23.738022528 +0100
> @@ -0,0 +1,5 @@
> +// PR c++/108365
> +// { dg-do compile { target { { { ilp32 || lp64 } || llp64 } && c++11 } } }
> +
> +constexpr char b = 1;
> +long t = (short) ((long long) (unsigned long long) (-__INT_MAX__ - 1) / (long long) (b ? -1 : 0)); // { dg-bogus "integer overflow in expression of type" }
> 
> 
> 	Jakub
>
  

Patch

--- gcc/c/c-typeck.cc.jj	2022-11-13 12:29:08.197504249 +0100
+++ gcc/c/c-typeck.cc	2023-01-12 21:06:53.310875131 +0100
@@ -12431,7 +12431,14 @@  build_binary_op (location_t location, en
 	       undefined if the quotient can't be represented in the
 	       computation mode.  We shorten only if unsigned or if
 	       dividing by something we know != -1.  */
-	    shorten = (TYPE_UNSIGNED (TREE_TYPE (orig_op0))
+	    shorten = (TYPE_UNSIGNED (type0)
+		       || (TREE_CODE (op0) == NOP_EXPR
+			   && INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op0,
+									0)))
+			   && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0)))
+			   && (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (op0,
+									0)))
+			       < TYPE_PRECISION (type0)))
 		       || (TREE_CODE (op1) == INTEGER_CST
 			   && !integer_all_onesp (op1)));
 	  common = 1;
@@ -12467,7 +12474,12 @@  build_binary_op (location_t location, en
 	     on some targets, since the modulo instruction is undefined if the
 	     quotient can't be represented in the computation mode.  We shorten
 	     only if unsigned or if dividing by something we know != -1.  */
-	  shorten = (TYPE_UNSIGNED (TREE_TYPE (orig_op0))
+	  shorten = (TYPE_UNSIGNED (type0)
+		     || (TREE_CODE (op0) == NOP_EXPR
+			 && INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
+			 && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0)))
+			 && (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (op0, 0)))
+			     < TYPE_PRECISION (type0)))
 		     || (TREE_CODE (op1) == INTEGER_CST
 			 && !integer_all_onesp (op1)));
 	  common = 1;
--- gcc/cp/typeck.cc.jj	2023-01-11 12:47:56.099672340 +0100
+++ gcc/cp/typeck.cc	2023-01-12 21:04:23.738022528 +0100
@@ -5455,8 +5455,15 @@  cp_build_binary_op (const op_location_t
 		 point, so we have to dig out the original type to find out if
 		 it was unsigned.  */
 	      tree stripped_op1 = tree_strip_any_location_wrapper (op1);
-	      shorten = ((TREE_CODE (op0) == NOP_EXPR
-			  && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0))))
+	      shorten = (TYPE_UNSIGNED (type0)
+			 || (TREE_CODE (op0) == NOP_EXPR
+			     && INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op0,
+									  0)))
+			     && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0,
+									0)))
+			     && (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (op0,
+									  0)))
+				 < TYPE_PRECISION (type0)))
 			 || (TREE_CODE (stripped_op1) == INTEGER_CST
 			     && ! integer_all_onesp (stripped_op1)));
 	    }
@@ -5491,8 +5498,12 @@  cp_build_binary_op (const op_location_t
 	     quotient can't be represented in the computation mode.  We shorten
 	     only if unsigned or if dividing by something we know != -1.  */
 	  tree stripped_op1 = tree_strip_any_location_wrapper (op1);
-	  shorten = ((TREE_CODE (op0) == NOP_EXPR
-		      && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0))))
+	  shorten = (TYPE_UNSIGNED (type0)
+		     || (TREE_CODE (op0) == NOP_EXPR
+			 && INTEGRAL_TYPE_P (TREE_TYPE (TREE_OPERAND (op0, 0)))
+			 && TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (op0, 0)))
+			 && (TYPE_PRECISION (TREE_TYPE (TREE_OPERAND (op0, 0)))
+			     < TYPE_PRECISION (type0)))
 		     || (TREE_CODE (stripped_op1) == INTEGER_CST
 			 && ! integer_all_onesp (stripped_op1)));
 	  common = 1;
--- gcc/testsuite/c-c++-common/pr108365.c.jj	2023-01-12 21:27:05.825467236 +0100
+++ gcc/testsuite/c-c++-common/pr108365.c	2023-01-12 21:26:44.095779209 +0100
@@ -0,0 +1,16 @@ 
+/* PR c++/108365 */
+/* { dg-do compile { target { { ilp32 || lp64 } || llp64 } } } */
+/* { dg-options "-O2 -fdump-tree-gimple" } */
+/* { dg-final { scan-tree-dump-not " \\\((int|unsigned short int|long long int|unsigned int)\\\) " "gimple" } } */
+
+unsigned short
+foo (unsigned short x, unsigned short y)
+{
+  return (unsigned) x / y;
+}
+
+unsigned int
+bar (unsigned int x, unsigned int y)
+{
+  return (long long) x / y;
+}
--- gcc/testsuite/g++.dg/opt/pr108365.C.jj	2023-01-12 21:04:23.738022528 +0100
+++ gcc/testsuite/g++.dg/opt/pr108365.C	2023-01-12 21:04:23.738022528 +0100
@@ -0,0 +1,13 @@ 
+// PR c++/108365
+// { dg-do run }
+
+char b = 1;
+
+int
+main ()
+{
+#if __CHAR_BIT__ == 8 && __SIZEOF_SHORT__ == 2 && __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8
+  while ((short) ((long long) (unsigned long long) (-__INT_MAX__ - 1) / (long long) (b ? -1 : 0)))
+    ;
+#endif
+}
--- gcc/testsuite/g++.dg/warn/pr108365.C.jj	2023-01-12 21:04:23.738022528 +0100
+++ gcc/testsuite/g++.dg/warn/pr108365.C	2023-01-12 21:04:23.738022528 +0100
@@ -0,0 +1,5 @@ 
+// PR c++/108365
+// { dg-do compile { target { { { ilp32 || lp64 } || llp64 } && c++11 } } }
+
+constexpr char b = 1;
+long t = (short) ((long long) (unsigned long long) (-__INT_MAX__ - 1) / (long long) (b ? -1 : 0)); // { dg-bogus "integer overflow in expression of type" }