v2: Add IntegerRange for -param=min-nondebug-insn-uid= and fix vector growing in LRA and vec [PR112411]

Message ID ZXGEqWQnsyIHljTu@tucnak
State Unresolved
Headers
Series v2: Add IntegerRange for -param=min-nondebug-insn-uid= and fix vector growing in LRA and vec [PR112411] |

Checks

Context Check Description
snail/gcc-patch-check warning Git am fail log

Commit Message

Jakub Jelinek Dec. 7, 2023, 8:39 a.m. UTC
  On Thu, Dec 07, 2023 at 09:36:22AM +0100, Jakub Jelinek wrote:
> So, one way to fix the LRA issue would be just to use
>   lra_insn_recog_data_len = index * 3U / 2;
>   if (lra_insn_recog_data_len <= index)
>     lra_insn_recog_data_len = index + 1;
> basically do what vec.cc does.  I thought we can do better for
> both vec.cc and LRA on 64-bit hosts even without growing the allocated
> counters, but now that I look at it again, perhaps we can't.
> The above overflows already with original alloc or lra_insn_recog_data_len
> 0x55555556, where 0x5555555 * 3U / 2 is still 0x7fffffff
> and so representable in the 32-bit, but 0x55555556 * 3U / 2 is
> 1.  I thought (and the patch implements it) that we could use
> alloc * (size_t) 3 / 2 so that on 64-bit hosts it wouldn't overflow
> that quickly, but 0x55555556 * (size_t) 3 / 2 there is 0x80000001
> which is still ok in unsigned, but given that vec.h then stores the
> counter into unsigned m_alloc:31; bit-field, it is too much.
> 
> The patch below is what I've actually bootstrapped/regtested on
> x86_64-linux and i686-linux, but given the above I think I should drop
> the vec.cc hunk and change (size_t) 3 in the LRA hunk to 3U.

Here is so far untested adjusted patch, which does the computation
just in unsigned int rather than size_t, because doing it in size_t
wouldn't improve things.

2023-12-07  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/112411
	* params.opt (-param=min-nondebug-insn-uid=): Add
	IntegerRange(0, 1073741824).
	* lra.cc (check_and_expand_insn_recog_data): Use 3U rather than 3
	in * 3 / 2 computation and if the result is smaller or equal to
	index, use index + 1.

	* gcc.dg/params/blocksort-part.c: Add dg-skip-if for
	--param min-nondebug-insn-uid=1073741824.



	Jakub
  

Comments

Richard Biener Dec. 7, 2023, 10:12 a.m. UTC | #1
On Thu, 7 Dec 2023, Jakub Jelinek wrote:

> On Thu, Dec 07, 2023 at 09:36:22AM +0100, Jakub Jelinek wrote:
> > So, one way to fix the LRA issue would be just to use
> >   lra_insn_recog_data_len = index * 3U / 2;
> >   if (lra_insn_recog_data_len <= index)
> >     lra_insn_recog_data_len = index + 1;
> > basically do what vec.cc does.  I thought we can do better for
> > both vec.cc and LRA on 64-bit hosts even without growing the allocated
> > counters, but now that I look at it again, perhaps we can't.
> > The above overflows already with original alloc or lra_insn_recog_data_len
> > 0x55555556, where 0x5555555 * 3U / 2 is still 0x7fffffff
> > and so representable in the 32-bit, but 0x55555556 * 3U / 2 is
> > 1.  I thought (and the patch implements it) that we could use
> > alloc * (size_t) 3 / 2 so that on 64-bit hosts it wouldn't overflow
> > that quickly, but 0x55555556 * (size_t) 3 / 2 there is 0x80000001
> > which is still ok in unsigned, but given that vec.h then stores the
> > counter into unsigned m_alloc:31; bit-field, it is too much.
> > 
> > The patch below is what I've actually bootstrapped/regtested on
> > x86_64-linux and i686-linux, but given the above I think I should drop
> > the vec.cc hunk and change (size_t) 3 in the LRA hunk to 3U.
> 
> Here is so far untested adjusted patch, which does the computation
> just in unsigned int rather than size_t, because doing it in size_t
> wouldn't improve things.

OK, but ...

> 2023-12-07  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR middle-end/112411
> 	* params.opt (-param=min-nondebug-insn-uid=): Add
> 	IntegerRange(0, 1073741824).
> 	* lra.cc (check_and_expand_insn_recog_data): Use 3U rather than 3
> 	in * 3 / 2 computation and if the result is smaller or equal to
> 	index, use index + 1.
> 
> 	* gcc.dg/params/blocksort-part.c: Add dg-skip-if for
> 	--param min-nondebug-insn-uid=1073741824.

what's this change for?  Do we test the actual param limit?  Can you
skip for the param without specifying the actual upper bound?

Thanks,
Richard.

> --- gcc/params.opt.jj	2023-11-02 07:49:18.010852541 +0100
> +++ gcc/params.opt	2023-12-06 18:55:57.045420935 +0100
> @@ -779,7 +779,7 @@ Common Joined UInteger Var(param_min_loo
>  The minimum threshold for probability of semi-invariant condition statement to trigger loop split.
>  
>  -param=min-nondebug-insn-uid=
> -Common Joined UInteger Var(param_min_nondebug_insn_uid) Param
> +Common Joined UInteger Var(param_min_nondebug_insn_uid) Param IntegerRange(0, 1073741824)
>  The minimum UID to be used for a nondebug insn.
>  
>  -param=min-size-for-stack-sharing=
> --- gcc/lra.cc.jj	2023-12-05 13:17:29.642260866 +0100
> +++ gcc/lra.cc	2023-12-06 19:52:01.759241999 +0100
> @@ -768,7 +768,9 @@ check_and_expand_insn_recog_data (int in
>    if (lra_insn_recog_data_len > index)
>      return;
>    old = lra_insn_recog_data_len;
> -  lra_insn_recog_data_len = index * 3 / 2 + 1;
> +  lra_insn_recog_data_len = index * 3U / 2;
> +  if (lra_insn_recog_data_len <= index)
> +    lra_insn_recog_data_len = index + 1;
>    lra_insn_recog_data = XRESIZEVEC (lra_insn_recog_data_t,
>  				    lra_insn_recog_data,
>  				    lra_insn_recog_data_len);
> --- gcc/testsuite/gcc.dg/params/blocksort-part.c.jj	2020-01-12 11:54:37.463397567 +0100
> +++ gcc/testsuite/gcc.dg/params/blocksort-part.c	2023-12-07 08:46:11.656974144 +0100
> @@ -1,4 +1,5 @@
>  /* { dg-skip-if "AArch64 does not support these bounds." { aarch64*-*-* } { "--param stack-clash-protection-*" } } */
> +/* { dg-skip-if "For 32-bit hosts such param is too much and even for 64-bit might require hundreds of GB of RAM" { *-*-* } { "--param min-nondebug-insn-uid=1073741824" } } */
>  
>  /*-------------------------------------------------------------*/
>  /*--- Block sorting machinery                               ---*/
> 
> 
> 	Jakub
> 
>
  
Jakub Jelinek Dec. 7, 2023, 10:32 a.m. UTC | #2
On Thu, Dec 07, 2023 at 11:12:39AM +0100, Richard Biener wrote:
> > 2023-12-07  Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	PR middle-end/112411
> > 	* params.opt (-param=min-nondebug-insn-uid=): Add
> > 	IntegerRange(0, 1073741824).
> > 	* lra.cc (check_and_expand_insn_recog_data): Use 3U rather than 3
> > 	in * 3 / 2 computation and if the result is smaller or equal to
> > 	index, use index + 1.
> > 
> > 	* gcc.dg/params/blocksort-part.c: Add dg-skip-if for
> > 	--param min-nondebug-insn-uid=1073741824.
> 
> what's this change for?  Do we test the actual param limit?  Can you
> skip for the param without specifying the actual upper bound?

params.exp iterates over all params which have a range selected and tries
to compile the testcase(s) with both the minimum and if any maximum of the
range.
I think it is useful to test normally with --param min-nondebug-insn-uid=0
the minimum, that means it is off, it is just the maximum which either
doesn't work or requires those hundreds of gigabytes of memory (guess I
should look at what needs that much).
I don't know how else to skip just the maximum test for the param except
to specify the exact value; if params.opt changes that value, people will
notice FAILs of the test and the test can be adjusted too (unless the
maximum is lowered into something so small that it works well even on low
memory 32-bit hosts).

	Jakub
  
Richard Biener Dec. 7, 2023, 12:06 p.m. UTC | #3
On Thu, 7 Dec 2023, Jakub Jelinek wrote:

> On Thu, Dec 07, 2023 at 11:12:39AM +0100, Richard Biener wrote:
> > > 2023-12-07  Jakub Jelinek  <jakub@redhat.com>
> > > 
> > > 	PR middle-end/112411
> > > 	* params.opt (-param=min-nondebug-insn-uid=): Add
> > > 	IntegerRange(0, 1073741824).
> > > 	* lra.cc (check_and_expand_insn_recog_data): Use 3U rather than 3
> > > 	in * 3 / 2 computation and if the result is smaller or equal to
> > > 	index, use index + 1.
> > > 
> > > 	* gcc.dg/params/blocksort-part.c: Add dg-skip-if for
> > > 	--param min-nondebug-insn-uid=1073741824.
> > 
> > what's this change for?  Do we test the actual param limit?  Can you
> > skip for the param without specifying the actual upper bound?
> 
> params.exp iterates over all params which have a range selected and tries
> to compile the testcase(s) with both the minimum and if any maximum of the
> range.
> I think it is useful to test normally with --param min-nondebug-insn-uid=0
> the minimum, that means it is off, it is just the maximum which either
> doesn't work or requires those hundreds of gigabytes of memory (guess I
> should look at what needs that much).
> I don't know how else to skip just the maximum test for the param except
> to specify the exact value; if params.opt changes that value, people will
> notice FAILs of the test and the test can be adjusted too (unless the
> maximum is lowered into something so small that it works well even on low
> memory 32-bit hosts).

Ah, OK then.

Richard.
  
Vladimir Makarov Dec. 8, 2023, 5:49 p.m. UTC | #4
On 12/7/23 03:39, Jakub Jelinek wrote:
> On Thu, Dec 07, 2023 at 09:36:22AM +0100, Jakub Jelinek wrote:
>> So, one way to fix the LRA issue would be just to use
>>    lra_insn_recog_data_len = index * 3U / 2;
>>    if (lra_insn_recog_data_len <= index)
>>      lra_insn_recog_data_len = index + 1;
>> basically do what vec.cc does.  I thought we can do better for
>> both vec.cc and LRA on 64-bit hosts even without growing the allocated
>> counters, but now that I look at it again, perhaps we can't.
>> The above overflows already with original alloc or lra_insn_recog_data_len
>> 0x55555556, where 0x5555555 * 3U / 2 is still 0x7fffffff
>> and so representable in the 32-bit, but 0x55555556 * 3U / 2 is
>> 1.  I thought (and the patch implements it) that we could use
>> alloc * (size_t) 3 / 2 so that on 64-bit hosts it wouldn't overflow
>> that quickly, but 0x55555556 * (size_t) 3 / 2 there is 0x80000001
>> which is still ok in unsigned, but given that vec.h then stores the
>> counter into unsigned m_alloc:31; bit-field, it is too much.
>>
>> The patch below is what I've actually bootstrapped/regtested on
>> x86_64-linux and i686-linux, but given the above I think I should drop
>> the vec.cc hunk and change (size_t) 3 in the LRA hunk to 3U.
> Here is so far untested adjusted patch, which does the computation
> just in unsigned int rather than size_t, because doing it in size_t
> wouldn't improve things.
>
> 2023-12-07  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR middle-end/112411
> 	* params.opt (-param=min-nondebug-insn-uid=): Add
> 	IntegerRange(0, 1073741824).
> 	* lra.cc (check_and_expand_insn_recog_data): Use 3U rather than 3
> 	in * 3 / 2 computation and if the result is smaller or equal to
> 	index, use index + 1.
>
> 	* gcc.dg/params/blocksort-part.c: Add dg-skip-if for
> 	--param min-nondebug-insn-uid=1073741824.
>
Jakub, if you are still waiting for an approval,  LRA change is ok for 
me with given max param.

Thank you for fixing this.
  

Patch

--- gcc/params.opt.jj	2023-11-02 07:49:18.010852541 +0100
+++ gcc/params.opt	2023-12-06 18:55:57.045420935 +0100
@@ -779,7 +779,7 @@  Common Joined UInteger Var(param_min_loo
 The minimum threshold for probability of semi-invariant condition statement to trigger loop split.
 
 -param=min-nondebug-insn-uid=
-Common Joined UInteger Var(param_min_nondebug_insn_uid) Param
+Common Joined UInteger Var(param_min_nondebug_insn_uid) Param IntegerRange(0, 1073741824)
 The minimum UID to be used for a nondebug insn.
 
 -param=min-size-for-stack-sharing=
--- gcc/lra.cc.jj	2023-12-05 13:17:29.642260866 +0100
+++ gcc/lra.cc	2023-12-06 19:52:01.759241999 +0100
@@ -768,7 +768,9 @@  check_and_expand_insn_recog_data (int in
   if (lra_insn_recog_data_len > index)
     return;
   old = lra_insn_recog_data_len;
-  lra_insn_recog_data_len = index * 3 / 2 + 1;
+  lra_insn_recog_data_len = index * 3U / 2;
+  if (lra_insn_recog_data_len <= index)
+    lra_insn_recog_data_len = index + 1;
   lra_insn_recog_data = XRESIZEVEC (lra_insn_recog_data_t,
 				    lra_insn_recog_data,
 				    lra_insn_recog_data_len);
--- gcc/testsuite/gcc.dg/params/blocksort-part.c.jj	2020-01-12 11:54:37.463397567 +0100
+++ gcc/testsuite/gcc.dg/params/blocksort-part.c	2023-12-07 08:46:11.656974144 +0100
@@ -1,4 +1,5 @@ 
 /* { dg-skip-if "AArch64 does not support these bounds." { aarch64*-*-* } { "--param stack-clash-protection-*" } } */
+/* { dg-skip-if "For 32-bit hosts such param is too much and even for 64-bit might require hundreds of GB of RAM" { *-*-* } { "--param min-nondebug-insn-uid=1073741824" } } */
 
 /*-------------------------------------------------------------*/
 /*--- Block sorting machinery                               ---*/