c-family, middle-end: Add __builtin_c[lt]zg (arg, 0ULL) exception

Message ID ZVkQ8KJMpceSkrbj@tucnak
State Unresolved
Headers
Series c-family, middle-end: Add __builtin_c[lt]zg (arg, 0ULL) exception |

Checks

Context Check Description
snail/gcc-patch-check warning Git am fail log

Commit Message

Jakub Jelinek Nov. 18, 2023, 7:30 p.m. UTC
  Hi!

In https://sourceware.org/pipermail/libc-alpha/2023-November/152819.html
Florian Weimer raised concern that the type-generic stdbit.h macros
currently being considered suffer from similar problem as old tgmath.h
implementation, in particular that the macros expand during preprocessing
their arguments multiple times and if one nests these stdbit.h type-generic
macros several times, that can result in extremely large preprocessed source
and long compile times, even when the argument is only actually evaluated
once at runtime for side-effects.

As I'd strongly prefer not to add new builtins for all the 14 stdbit.h
type-generic macros, I think it is better to build the macros from
smaller building blocks.

The following patch adds the first one.
While one can say implement e.g. stdc_leading_zeros(value) macro
as ((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
that expands the argument 3 times, and even if it just used
((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) -1)))
relying on 2-s complement, that is still twice.

I'd prefer not to add optional 3rd argument to these, but given that the
second argument if specified right now has to have signed int type,
the following patch adds an exception that it allows 0ULL as a magic
value for the argument to mean fill in the precision of the first argument.

Ok for trunk if it passes bootstrap/regtest?

2023-11-18  Jakub Jelinek  <jakub@redhat.com>

	PR c/111309
gcc/
	* builtins.cc (fold_builtin_bit_query): If arg1 is 0ULL, use
	TYPE_PRECISION (arg0_type) instead of it.
	* fold-const-call.cc (fold_const_call_sss): Rename arg0_type
	argument to arg_type, add arg1_type argument, if for CLZ/CTZ
	second argument is unsigned long long, use
	TYPE_PRECISION (arg0_type).
	(fold_const_call_1): Pass also TREE_TYPE (arg1) to
	fold_const_call_sss.
	* doc/extend.texi (__builtin_clzg, __builtin_ctzg): Document
	behavior for second argument 0ULL.
gcc/c-family/
	* c-common.cc (check_builtin_function_arguments): If args[1] is
	0ULL, use TYPE_PRECISION (TREE_TYPE (args[0])) instead of it.
gcc/testsuite/
	* c-c++-common/pr111309-3.c: New test.
	* gcc.dg/torture/bitint-43.c: Add tests with 0ULL second argument.


	Jakub
  

Comments

Richard Biener Nov. 20, 2023, 7:44 a.m. UTC | #1
On Sat, 18 Nov 2023, Jakub Jelinek wrote:

> Hi!
> 
> In https://sourceware.org/pipermail/libc-alpha/2023-November/152819.html
> Florian Weimer raised concern that the type-generic stdbit.h macros
> currently being considered suffer from similar problem as old tgmath.h
> implementation, in particular that the macros expand during preprocessing
> their arguments multiple times and if one nests these stdbit.h type-generic
> macros several times, that can result in extremely large preprocessed source
> and long compile times, even when the argument is only actually evaluated
> once at runtime for side-effects.
> 
> As I'd strongly prefer not to add new builtins for all the 14 stdbit.h
> type-generic macros, I think it is better to build the macros from
> smaller building blocks.
> 
> The following patch adds the first one.
> While one can say implement e.g. stdc_leading_zeros(value) macro
> as ((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
> that expands the argument 3 times, and even if it just used
> ((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) -1)))
> relying on 2-s complement, that is still twice.
> 
> I'd prefer not to add optional 3rd argument to these, but given that the
> second argument if specified right now has to have signed int type,
> the following patch adds an exception that it allows 0ULL as a magic
> value for the argument to mean fill in the precision of the first argument.

Ugh.  First of all I don't like that the exception is applied during
folding.  As for the problem of multi evaluation can't consumers use
stmt expressions for this, say

{( auto __tem = value; __builtin_xyz (__tem, __typeof (__tem)); ... )}

?  Thus use 'auto' to avoid spelling 'value' multiple times?

Richard.

> Ok for trunk if it passes bootstrap/regtest?
> 
> 2023-11-18  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR c/111309
> gcc/
> 	* builtins.cc (fold_builtin_bit_query): If arg1 is 0ULL, use
> 	TYPE_PRECISION (arg0_type) instead of it.
> 	* fold-const-call.cc (fold_const_call_sss): Rename arg0_type
> 	argument to arg_type, add arg1_type argument, if for CLZ/CTZ
> 	second argument is unsigned long long, use
> 	TYPE_PRECISION (arg0_type).
> 	(fold_const_call_1): Pass also TREE_TYPE (arg1) to
> 	fold_const_call_sss.
> 	* doc/extend.texi (__builtin_clzg, __builtin_ctzg): Document
> 	behavior for second argument 0ULL.
> gcc/c-family/
> 	* c-common.cc (check_builtin_function_arguments): If args[1] is
> 	0ULL, use TYPE_PRECISION (TREE_TYPE (args[0])) instead of it.
> gcc/testsuite/
> 	* c-c++-common/pr111309-3.c: New test.
> 	* gcc.dg/torture/bitint-43.c: Add tests with 0ULL second argument.
> 
> --- gcc/builtins.cc.jj	2023-11-14 10:52:16.170276318 +0100
> +++ gcc/builtins.cc	2023-11-18 13:55:02.996395917 +0100
> @@ -9591,6 +9591,10 @@ fold_builtin_bit_query (location_t loc,
>      case BUILT_IN_CLZG:
>        if (arg1 && TREE_CODE (arg1) != INTEGER_CST)
>  	return NULL_TREE;
> +      if (arg1
> +	  && (TYPE_MAIN_VARIANT (TREE_TYPE (arg1))
> +	      == long_long_unsigned_type_node))
> +	arg1 = build_int_cst (integer_type_node, TYPE_PRECISION (arg0_type));
>        ifn = IFN_CLZ;
>        fcodei = BUILT_IN_CLZ;
>        fcodel = BUILT_IN_CLZL;
> @@ -9599,6 +9603,10 @@ fold_builtin_bit_query (location_t loc,
>      case BUILT_IN_CTZG:
>        if (arg1 && TREE_CODE (arg1) != INTEGER_CST)
>  	return NULL_TREE;
> +      if (arg1
> +	  && (TYPE_MAIN_VARIANT (TREE_TYPE (arg1))
> +	      == long_long_unsigned_type_node))
> +	arg1 = build_int_cst (integer_type_node, TYPE_PRECISION (arg0_type));
>        ifn = IFN_CTZ;
>        fcodei = BUILT_IN_CTZ;
>        fcodel = BUILT_IN_CTZL;
> --- gcc/fold-const-call.cc.jj	2023-11-14 10:52:16.186276097 +0100
> +++ gcc/fold-const-call.cc	2023-11-18 13:49:57.514641417 +0100
> @@ -1543,13 +1543,13 @@ fold_const_call_sss (real_value *result,
>  
>        *RESULT = FN (ARG0, ARG1)
>  
> -   where ARG_TYPE is the type of ARG0 and PRECISION is the number of bits in
> -   the result.  Return true on success.  */
> +   where ARG0_TYPE is the type of ARG0, ARG1_TYPE is the type of ARG1 and
> +   PRECISION is the number of bits in the result.  Return true on success.  */
>  
>  static bool
>  fold_const_call_sss (wide_int *result, combined_fn fn,
>  		     const wide_int_ref &arg0, const wide_int_ref &arg1,
> -		     unsigned int precision, tree arg_type ATTRIBUTE_UNUSED)
> +		     unsigned int precision, tree arg0_type, tree arg1_type)
>  {
>    switch (fn)
>      {
> @@ -1559,6 +1559,8 @@ fold_const_call_sss (wide_int *result, c
>  	int tmp;
>  	if (wi::ne_p (arg0, 0))
>  	  tmp = wi::clz (arg0);
> +	else if (TYPE_MAIN_VARIANT (arg1_type) == long_long_unsigned_type_node)
> +	  tmp = TYPE_PRECISION (arg0_type);
>  	else
>  	  tmp = arg1.to_shwi ();
>  	*result = wi::shwi (tmp, precision);
> @@ -1571,6 +1573,8 @@ fold_const_call_sss (wide_int *result, c
>  	int tmp;
>  	if (wi::ne_p (arg0, 0))
>  	  tmp = wi::ctz (arg0);
> +	else if (TYPE_MAIN_VARIANT (arg1_type) == long_long_unsigned_type_node)
> +	  tmp = TYPE_PRECISION (arg0_type);
>  	else
>  	  tmp = arg1.to_shwi ();
>  	*result = wi::shwi (tmp, precision);
> @@ -1625,7 +1629,7 @@ fold_const_call_1 (combined_fn fn, tree
>  	  wide_int result;
>  	  if (fold_const_call_sss (&result, fn, wi::to_wide (arg0),
>  				   wi::to_wide (arg1), TYPE_PRECISION (type),
> -				   TREE_TYPE (arg0)))
> +				   TREE_TYPE (arg0), TREE_TYPE (arg1)))
>  	    return wide_int_to_tree (type, result);
>  	}
>        return NULL_TREE;
> --- gcc/doc/extend.texi.jj	2023-11-16 17:27:39.838028110 +0100
> +++ gcc/doc/extend.texi	2023-11-18 13:17:40.982551766 +0100
> @@ -15031,6 +15031,9 @@ optional second argument with int type.
>  are performed on the first argument.  If two arguments are specified,
>  and first argument is 0, the result is the second argument.  If only
>  one argument is specified and it is 0, the result is undefined.
> +As an exception, if two arguments are specified and the second argument
> +is 0ULL, it is as if the second argument was the bit width of the first
> +argument.
>  @enddefbuiltin
>  
>  @defbuiltin{int __builtin_ctzg (...)}
> @@ -15040,6 +15043,9 @@ optional second argument with int type.
>  are performed on the first argument.  If two arguments are specified,
>  and first argument is 0, the result is the second argument.  If only
>  one argument is specified and it is 0, the result is undefined.
> +As an exception, if two arguments are specified and the second argument
> +is 0ULL, it is as if the second argument was the bit width of the first
> +argument.
>  @enddefbuiltin
>  
>  @defbuiltin{int __builtin_clrsbg (...)}
> --- gcc/c-family/c-common.cc.jj	2023-11-14 18:26:05.193616416 +0100
> +++ gcc/c-family/c-common.cc	2023-11-18 13:20:55.751844490 +0100
> @@ -6540,6 +6540,13 @@ check_builtin_function_arguments (locati
>  			"%qE does not have integral type", 2, fndecl);
>  	      return false;
>  	    }
> +	  if (integer_zerop (args[1])
> +	      && (TYPE_MAIN_VARIANT (TREE_TYPE (args[1]))
> +		  == long_long_unsigned_type_node))
> +	    args[1] = build_int_cst (integer_type_node,
> +				     INTEGRAL_TYPE_P (TREE_TYPE (args[0]))
> +				     ? TYPE_PRECISION (TREE_TYPE (args[0]))
> +				     : 0);
>  	  if ((TYPE_PRECISION (TREE_TYPE (args[1]))
>  	       > TYPE_PRECISION (integer_type_node))
>  	      || (TYPE_PRECISION (TREE_TYPE (args[1]))
> --- gcc/testsuite/c-c++-common/pr111309-3.c.jj	2023-11-18 13:22:22.084644472 +0100
> +++ gcc/testsuite/c-c++-common/pr111309-3.c	2023-11-18 13:26:12.894436233 +0100
> @@ -0,0 +1,26 @@
> +/* PR c/111309 */
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +int
> +main ()
> +{
> +  if (__builtin_clzg ((unsigned char) 0, 0ULL) != __CHAR_BIT__
> +      || __builtin_clzg ((unsigned short) 0, 0ULL) != __SIZEOF_SHORT__ * __CHAR_BIT__
> +      || __builtin_clzg (0U, 0ULL) != __SIZEOF_INT__ * __CHAR_BIT__
> +      || __builtin_clzg (0UL, 0ULL) != __SIZEOF_LONG__ * __CHAR_BIT__
> +      || __builtin_clzg (0ULL, 0ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_clzg ((unsigned __int128) 0, 0ULL) != __SIZEOF_INT128__ * __CHAR_BIT__
> +#endif
> +      || __builtin_clzg ((unsigned char) 1, 0ULL) != __CHAR_BIT__ - 1
> +      || __builtin_clzg ((unsigned short) 2, 0ULL) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 2
> +      || __builtin_clzg (4U, 0ULL) != __SIZEOF_INT__ * __CHAR_BIT__ - 3
> +      || __builtin_clzg (8UL, 0ULL) != __SIZEOF_LONG__ * __CHAR_BIT__ - 4
> +      || __builtin_clzg (16ULL, 0ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 5
> +#ifdef __SIZEOF_INT128__
> +      || __builtin_clzg ((unsigned __int128) 32, 0ULL) != __SIZEOF_INT128__ * __CHAR_BIT__ - 6
> +#endif
> +      || 0)
> +    __builtin_abort ();
> +}
> --- gcc/testsuite/gcc.dg/torture/bitint-43.c.jj	2023-11-14 10:52:16.191276028 +0100
> +++ gcc/testsuite/gcc.dg/torture/bitint-43.c	2023-11-18 13:28:49.335261722 +0100
> @@ -141,7 +141,9 @@ main ()
>        || parity156 (0) != 0
>        || popcount156 (0) != 0
>        || __builtin_clzg ((unsigned _BitInt(156)) 0, 156 + 32) != 156 + 32
> +      || __builtin_clzg ((unsigned _BitInt(156)) 0, 0ULL) != 156
>        || __builtin_ctzg ((unsigned _BitInt(156)) 0, 156) != 156
> +      || __builtin_ctzg ((unsigned _BitInt(156)) 0, 0ULL) != 156
>        || __builtin_clrsbg ((_BitInt(156)) 0) != 156 - 1
>        || __builtin_ffsg ((_BitInt(156)) 0) != 0
>        || __builtin_parityg ((unsigned _BitInt(156)) 0) != 0
> @@ -159,8 +161,10 @@ main ()
>        || popcount156 (-1) != 156
>        || __builtin_clzg ((unsigned _BitInt(156)) -1) != 0
>        || __builtin_clzg ((unsigned _BitInt(156)) -1, 156 + 32) != 0
> +      || __builtin_clzg ((unsigned _BitInt(156)) -1, 0ULL) != 0
>        || __builtin_ctzg ((unsigned _BitInt(156)) -1) != 0
>        || __builtin_ctzg ((unsigned _BitInt(156)) -1, 156) != 0
> +      || __builtin_ctzg ((unsigned _BitInt(156)) -1, 0ULL) != 0
>        || __builtin_clrsbg ((_BitInt(156)) -1) != 156 - 1
>        || __builtin_ffsg ((_BitInt(156)) -1) != 1
>        || __builtin_parityg ((unsigned _BitInt(156)) -1) != 0
> 
> 	Jakub
> 
>
  
Richard Biener Nov. 20, 2023, 7:49 a.m. UTC | #2
On Mon, 20 Nov 2023, Richard Biener wrote:

> On Sat, 18 Nov 2023, Jakub Jelinek wrote:
> 
> > Hi!
> > 
> > In https://sourceware.org/pipermail/libc-alpha/2023-November/152819.html
> > Florian Weimer raised concern that the type-generic stdbit.h macros
> > currently being considered suffer from similar problem as old tgmath.h
> > implementation, in particular that the macros expand during preprocessing
> > their arguments multiple times and if one nests these stdbit.h type-generic
> > macros several times, that can result in extremely large preprocessed source
> > and long compile times, even when the argument is only actually evaluated
> > once at runtime for side-effects.
> > 
> > As I'd strongly prefer not to add new builtins for all the 14 stdbit.h
> > type-generic macros, I think it is better to build the macros from
> > smaller building blocks.
> > 
> > The following patch adds the first one.
> > While one can say implement e.g. stdc_leading_zeros(value) macro
> > as ((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
> > that expands the argument 3 times, and even if it just used
> > ((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) -1)))
> > relying on 2-s complement, that is still twice.
> > 
> > I'd prefer not to add optional 3rd argument to these, but given that the
> > second argument if specified right now has to have signed int type,
> > the following patch adds an exception that it allows 0ULL as a magic
> > value for the argument to mean fill in the precision of the first argument.
> 
> Ugh.  First of all I don't like that the exception is applied during
> folding.  As for the problem of multi evaluation can't consumers use
> stmt expressions for this, say
> 
> {( auto __tem = value; __builtin_xyz (__tem, __typeof (__tem)); ... )}
> 
> ?  Thus use 'auto' to avoid spelling 'value' multiple times?

Of course the other obvious alternative would be to parse
stdc_leading_zeros and friends directly, and not go through macros
and builtins.

> Richard.
> 
> > Ok for trunk if it passes bootstrap/regtest?
> > 
> > 2023-11-18  Jakub Jelinek  <jakub@redhat.com>
> > 
> > 	PR c/111309
> > gcc/
> > 	* builtins.cc (fold_builtin_bit_query): If arg1 is 0ULL, use
> > 	TYPE_PRECISION (arg0_type) instead of it.
> > 	* fold-const-call.cc (fold_const_call_sss): Rename arg0_type
> > 	argument to arg_type, add arg1_type argument, if for CLZ/CTZ
> > 	second argument is unsigned long long, use
> > 	TYPE_PRECISION (arg0_type).
> > 	(fold_const_call_1): Pass also TREE_TYPE (arg1) to
> > 	fold_const_call_sss.
> > 	* doc/extend.texi (__builtin_clzg, __builtin_ctzg): Document
> > 	behavior for second argument 0ULL.
> > gcc/c-family/
> > 	* c-common.cc (check_builtin_function_arguments): If args[1] is
> > 	0ULL, use TYPE_PRECISION (TREE_TYPE (args[0])) instead of it.
> > gcc/testsuite/
> > 	* c-c++-common/pr111309-3.c: New test.
> > 	* gcc.dg/torture/bitint-43.c: Add tests with 0ULL second argument.
> > 
> > --- gcc/builtins.cc.jj	2023-11-14 10:52:16.170276318 +0100
> > +++ gcc/builtins.cc	2023-11-18 13:55:02.996395917 +0100
> > @@ -9591,6 +9591,10 @@ fold_builtin_bit_query (location_t loc,
> >      case BUILT_IN_CLZG:
> >        if (arg1 && TREE_CODE (arg1) != INTEGER_CST)
> >  	return NULL_TREE;
> > +      if (arg1
> > +	  && (TYPE_MAIN_VARIANT (TREE_TYPE (arg1))
> > +	      == long_long_unsigned_type_node))
> > +	arg1 = build_int_cst (integer_type_node, TYPE_PRECISION (arg0_type));
> >        ifn = IFN_CLZ;
> >        fcodei = BUILT_IN_CLZ;
> >        fcodel = BUILT_IN_CLZL;
> > @@ -9599,6 +9603,10 @@ fold_builtin_bit_query (location_t loc,
> >      case BUILT_IN_CTZG:
> >        if (arg1 && TREE_CODE (arg1) != INTEGER_CST)
> >  	return NULL_TREE;
> > +      if (arg1
> > +	  && (TYPE_MAIN_VARIANT (TREE_TYPE (arg1))
> > +	      == long_long_unsigned_type_node))
> > +	arg1 = build_int_cst (integer_type_node, TYPE_PRECISION (arg0_type));
> >        ifn = IFN_CTZ;
> >        fcodei = BUILT_IN_CTZ;
> >        fcodel = BUILT_IN_CTZL;
> > --- gcc/fold-const-call.cc.jj	2023-11-14 10:52:16.186276097 +0100
> > +++ gcc/fold-const-call.cc	2023-11-18 13:49:57.514641417 +0100
> > @@ -1543,13 +1543,13 @@ fold_const_call_sss (real_value *result,
> >  
> >        *RESULT = FN (ARG0, ARG1)
> >  
> > -   where ARG_TYPE is the type of ARG0 and PRECISION is the number of bits in
> > -   the result.  Return true on success.  */
> > +   where ARG0_TYPE is the type of ARG0, ARG1_TYPE is the type of ARG1 and
> > +   PRECISION is the number of bits in the result.  Return true on success.  */
> >  
> >  static bool
> >  fold_const_call_sss (wide_int *result, combined_fn fn,
> >  		     const wide_int_ref &arg0, const wide_int_ref &arg1,
> > -		     unsigned int precision, tree arg_type ATTRIBUTE_UNUSED)
> > +		     unsigned int precision, tree arg0_type, tree arg1_type)
> >  {
> >    switch (fn)
> >      {
> > @@ -1559,6 +1559,8 @@ fold_const_call_sss (wide_int *result, c
> >  	int tmp;
> >  	if (wi::ne_p (arg0, 0))
> >  	  tmp = wi::clz (arg0);
> > +	else if (TYPE_MAIN_VARIANT (arg1_type) == long_long_unsigned_type_node)
> > +	  tmp = TYPE_PRECISION (arg0_type);
> >  	else
> >  	  tmp = arg1.to_shwi ();
> >  	*result = wi::shwi (tmp, precision);
> > @@ -1571,6 +1573,8 @@ fold_const_call_sss (wide_int *result, c
> >  	int tmp;
> >  	if (wi::ne_p (arg0, 0))
> >  	  tmp = wi::ctz (arg0);
> > +	else if (TYPE_MAIN_VARIANT (arg1_type) == long_long_unsigned_type_node)
> > +	  tmp = TYPE_PRECISION (arg0_type);
> >  	else
> >  	  tmp = arg1.to_shwi ();
> >  	*result = wi::shwi (tmp, precision);
> > @@ -1625,7 +1629,7 @@ fold_const_call_1 (combined_fn fn, tree
> >  	  wide_int result;
> >  	  if (fold_const_call_sss (&result, fn, wi::to_wide (arg0),
> >  				   wi::to_wide (arg1), TYPE_PRECISION (type),
> > -				   TREE_TYPE (arg0)))
> > +				   TREE_TYPE (arg0), TREE_TYPE (arg1)))
> >  	    return wide_int_to_tree (type, result);
> >  	}
> >        return NULL_TREE;
> > --- gcc/doc/extend.texi.jj	2023-11-16 17:27:39.838028110 +0100
> > +++ gcc/doc/extend.texi	2023-11-18 13:17:40.982551766 +0100
> > @@ -15031,6 +15031,9 @@ optional second argument with int type.
> >  are performed on the first argument.  If two arguments are specified,
> >  and first argument is 0, the result is the second argument.  If only
> >  one argument is specified and it is 0, the result is undefined.
> > +As an exception, if two arguments are specified and the second argument
> > +is 0ULL, it is as if the second argument was the bit width of the first
> > +argument.
> >  @enddefbuiltin
> >  
> >  @defbuiltin{int __builtin_ctzg (...)}
> > @@ -15040,6 +15043,9 @@ optional second argument with int type.
> >  are performed on the first argument.  If two arguments are specified,
> >  and first argument is 0, the result is the second argument.  If only
> >  one argument is specified and it is 0, the result is undefined.
> > +As an exception, if two arguments are specified and the second argument
> > +is 0ULL, it is as if the second argument was the bit width of the first
> > +argument.
> >  @enddefbuiltin
> >  
> >  @defbuiltin{int __builtin_clrsbg (...)}
> > --- gcc/c-family/c-common.cc.jj	2023-11-14 18:26:05.193616416 +0100
> > +++ gcc/c-family/c-common.cc	2023-11-18 13:20:55.751844490 +0100
> > @@ -6540,6 +6540,13 @@ check_builtin_function_arguments (locati
> >  			"%qE does not have integral type", 2, fndecl);
> >  	      return false;
> >  	    }
> > +	  if (integer_zerop (args[1])
> > +	      && (TYPE_MAIN_VARIANT (TREE_TYPE (args[1]))
> > +		  == long_long_unsigned_type_node))
> > +	    args[1] = build_int_cst (integer_type_node,
> > +				     INTEGRAL_TYPE_P (TREE_TYPE (args[0]))
> > +				     ? TYPE_PRECISION (TREE_TYPE (args[0]))
> > +				     : 0);
> >  	  if ((TYPE_PRECISION (TREE_TYPE (args[1]))
> >  	       > TYPE_PRECISION (integer_type_node))
> >  	      || (TYPE_PRECISION (TREE_TYPE (args[1]))
> > --- gcc/testsuite/c-c++-common/pr111309-3.c.jj	2023-11-18 13:22:22.084644472 +0100
> > +++ gcc/testsuite/c-c++-common/pr111309-3.c	2023-11-18 13:26:12.894436233 +0100
> > @@ -0,0 +1,26 @@
> > +/* PR c/111309 */
> > +/* { dg-do run } */
> > +/* { dg-options "-O2" } */
> > +
> > +int
> > +main ()
> > +{
> > +  if (__builtin_clzg ((unsigned char) 0, 0ULL) != __CHAR_BIT__
> > +      || __builtin_clzg ((unsigned short) 0, 0ULL) != __SIZEOF_SHORT__ * __CHAR_BIT__
> > +      || __builtin_clzg (0U, 0ULL) != __SIZEOF_INT__ * __CHAR_BIT__
> > +      || __builtin_clzg (0UL, 0ULL) != __SIZEOF_LONG__ * __CHAR_BIT__
> > +      || __builtin_clzg (0ULL, 0ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__
> > +#ifdef __SIZEOF_INT128__
> > +      || __builtin_clzg ((unsigned __int128) 0, 0ULL) != __SIZEOF_INT128__ * __CHAR_BIT__
> > +#endif
> > +      || __builtin_clzg ((unsigned char) 1, 0ULL) != __CHAR_BIT__ - 1
> > +      || __builtin_clzg ((unsigned short) 2, 0ULL) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 2
> > +      || __builtin_clzg (4U, 0ULL) != __SIZEOF_INT__ * __CHAR_BIT__ - 3
> > +      || __builtin_clzg (8UL, 0ULL) != __SIZEOF_LONG__ * __CHAR_BIT__ - 4
> > +      || __builtin_clzg (16ULL, 0ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 5
> > +#ifdef __SIZEOF_INT128__
> > +      || __builtin_clzg ((unsigned __int128) 32, 0ULL) != __SIZEOF_INT128__ * __CHAR_BIT__ - 6
> > +#endif
> > +      || 0)
> > +    __builtin_abort ();
> > +}
> > --- gcc/testsuite/gcc.dg/torture/bitint-43.c.jj	2023-11-14 10:52:16.191276028 +0100
> > +++ gcc/testsuite/gcc.dg/torture/bitint-43.c	2023-11-18 13:28:49.335261722 +0100
> > @@ -141,7 +141,9 @@ main ()
> >        || parity156 (0) != 0
> >        || popcount156 (0) != 0
> >        || __builtin_clzg ((unsigned _BitInt(156)) 0, 156 + 32) != 156 + 32
> > +      || __builtin_clzg ((unsigned _BitInt(156)) 0, 0ULL) != 156
> >        || __builtin_ctzg ((unsigned _BitInt(156)) 0, 156) != 156
> > +      || __builtin_ctzg ((unsigned _BitInt(156)) 0, 0ULL) != 156
> >        || __builtin_clrsbg ((_BitInt(156)) 0) != 156 - 1
> >        || __builtin_ffsg ((_BitInt(156)) 0) != 0
> >        || __builtin_parityg ((unsigned _BitInt(156)) 0) != 0
> > @@ -159,8 +161,10 @@ main ()
> >        || popcount156 (-1) != 156
> >        || __builtin_clzg ((unsigned _BitInt(156)) -1) != 0
> >        || __builtin_clzg ((unsigned _BitInt(156)) -1, 156 + 32) != 0
> > +      || __builtin_clzg ((unsigned _BitInt(156)) -1, 0ULL) != 0
> >        || __builtin_ctzg ((unsigned _BitInt(156)) -1) != 0
> >        || __builtin_ctzg ((unsigned _BitInt(156)) -1, 156) != 0
> > +      || __builtin_ctzg ((unsigned _BitInt(156)) -1, 0ULL) != 0
> >        || __builtin_clrsbg ((_BitInt(156)) -1) != 156 - 1
> >        || __builtin_ffsg ((_BitInt(156)) -1) != 1
> >        || __builtin_parityg ((unsigned _BitInt(156)) -1) != 0
> > 
> > 	Jakub
> > 
> > 
> 
>
  
Jakub Jelinek Nov. 20, 2023, 8:13 a.m. UTC | #3
On Mon, Nov 20, 2023 at 07:44:18AM +0000, Richard Biener wrote:
> > In https://sourceware.org/pipermail/libc-alpha/2023-November/152819.html
> > Florian Weimer raised concern that the type-generic stdbit.h macros
> > currently being considered suffer from similar problem as old tgmath.h
> > implementation, in particular that the macros expand during preprocessing
> > their arguments multiple times and if one nests these stdbit.h type-generic
> > macros several times, that can result in extremely large preprocessed source
> > and long compile times, even when the argument is only actually evaluated
> > once at runtime for side-effects.
> > 
> > As I'd strongly prefer not to add new builtins for all the 14 stdbit.h
> > type-generic macros, I think it is better to build the macros from
> > smaller building blocks.
> > 
> > The following patch adds the first one.
> > While one can say implement e.g. stdc_leading_zeros(value) macro
> > as ((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) ~(__typeof (value)) 0)))
> > that expands the argument 3 times, and even if it just used
> > ((unsigned int) __builtin_clzg (value, __builtin_popcountg ((__typeof (value)) -1)))
> > relying on 2-s complement, that is still twice.
> > 
> > I'd prefer not to add optional 3rd argument to these, but given that the
> > second argument if specified right now has to have signed int type,
> > the following patch adds an exception that it allows 0ULL as a magic
> > value for the argument to mean fill in the precision of the first argument.
> 
> Ugh.  First of all I don't like that the exception is applied during
> folding.  As for the problem of multi evaluation can't consumers use
> stmt expressions for this, say
> 
> {( auto __tem = value; __builtin_xyz (__tem, __typeof (__tem)); ... )}
> 
> ?  Thus use 'auto' to avoid spelling 'value' multiple times?

Unfortunately not.  See
https://sourceware.org/pipermail/libc-alpha/2023-November/152757.html
where Joseph said that ({ ... }) can't be used in the implementation,
as ({ ... }) is not allowed outside of functions.  And I believe e.g.
size_t a = sizeof (stdc_leading_zeros (0ULL));
at file scope is valid C23 (we disallow ({ ... }) outside of functions
even for C++ BTW).

The reason I've changed fold_builtin_bit_query and
fold_const_call_sss/fold_const_call_1 is that while the
check_builtin_function_arguments change was apparently sufficient for C
which when check_builtin_function_arguments changes the args[1] argument
will just use whatever it changed to to construct the call, in C++
we first create the call and only then call
check_builtin_function_arguments.  But I can try to change it...

	Jakub
  
Florian Weimer Nov. 20, 2023, 8:18 a.m. UTC | #4
* Richard Biener:

> Ugh.  First of all I don't like that the exception is applied during
> folding.  As for the problem of multi evaluation can't consumers use
> stmt expressions for this, say
>
> {( auto __tem = value; __builtin_xyz (__tem, __typeof (__tem)); ... )}
>
> ?  Thus use 'auto' to avoid spelling 'value' multiple times?

{( … )} cannot be used in a constant expression, but the new macros are
supposed to be usable there.

Thanks,
Florian
  
Jakub Jelinek Nov. 20, 2023, 8:19 a.m. UTC | #5
On Mon, Nov 20, 2023 at 07:49:40AM +0000, Richard Biener wrote:
> Of course the other obvious alternative would be to parse
> stdc_leading_zeros and friends directly, and not go through macros
> and builtins.

The stdc_* names are not keywords and without including stdbit.h I'm
not sure it is ok to change their behavior.
That said, the last patch in the series implements 3 of the 14
type-generic macros as builtins, so one can then
#if __glibc_has_builtin (__builtin_stdc_bit_width)
#define stdc_bit_width(value) __builtin_stdc_bit_width (value)
#else
...
#endif
and be done with that.  If there is an agreement we should do that
for all 14 rather than just those 3 + the 2 ugly hacks (__builtin_c{l,t}zg with
0ULL second argument and __builtin_bit_complement), I can change the
patch to implement all 14 too (again, in the FE in the same spot, so there
isn't actually any builtins.def for it and it doesn't affect the middle-end
at all).  Note, __builtin_bit_complement doesn't affect the middle-end
at all either.

	Jakub
  
Jakub Jelinek Nov. 20, 2023, 8:31 a.m. UTC | #6
On Mon, Nov 20, 2023 at 09:18:57AM +0100, Florian Weimer wrote:
> * Richard Biener:
> 
> > Ugh.  First of all I don't like that the exception is applied during
> > folding.  As for the problem of multi evaluation can't consumers use
> > stmt expressions for this, say
> >
> > {( auto __tem = value; __builtin_xyz (__tem, __typeof (__tem)); ... )}
> >
> > ?  Thus use 'auto' to avoid spelling 'value' multiple times?
> 
> {( … )} cannot be used in a constant expression, but the new macros are
> supposed to be usable there.

I'm not sure about that, it would be nice for them to be usable there,
but I think e.g. none of Joseph's implementation of those macros
made them usable there (except inside of sizeof/typeof/typeof_unquall)
and I don't see a requirement in the C23 standard that they must be usable
in constant expressions.
The versions I've posted on Thursday were usable there except for
stdc_has_single_bit (but that actually can be implemented that way too)
and stdc_bit_floor.  And the version I haven't posted that used the 3
patches posted on Saturday would have all functions usable when the
argument to those macros is a constant expression.

BTW, if we go route of implementing all of the stdc_ type-generic macros
as builtins, we could as well not implement that way the following 4
# define stdc_first_leading_one(x) (__builtin_clzg (x, -1) + 1U)
# define stdc_first_trailing_one(x) (__builtin_ctzg (x, -1) + 1U)
# define stdc_count_ones(x) ((unsigned int) __builtin_popcountg (x))
# define stdc_has_single_bit(x) ((_Bool) (__builtin_popcountg (x) == 1))
which are implementable without any new extensions.

	Jakub
  
Richard Biener Nov. 20, 2023, 8:37 a.m. UTC | #7
On Mon, 20 Nov 2023, Jakub Jelinek wrote:

> On Mon, Nov 20, 2023 at 09:18:57AM +0100, Florian Weimer wrote:
> > * Richard Biener:
> > 
> > > Ugh.  First of all I don't like that the exception is applied during
> > > folding.  As for the problem of multi evaluation can't consumers use
> > > stmt expressions for this, say
> > >
> > > {( auto __tem = value; __builtin_xyz (__tem, __typeof (__tem)); ... )}
> > >
> > > ?  Thus use 'auto' to avoid spelling 'value' multiple times?
> > 
> > {( ? )} cannot be used in a constant expression, but the new macros are
> > supposed to be usable there.
> 
> I'm not sure about that, it would be nice for them to be usable there,

Btw, I think that {( .. )} should be made usable in sizeof () and
possibly even in at least C++ constant expressions (not sure about C).

> but I think e.g. none of Joseph's implementation of those macros
> made them usable there (except inside of sizeof/typeof/typeof_unquall)
> and I don't see a requirement in the C23 standard that they must be usable
> in constant expressions.
> The versions I've posted on Thursday were usable there except for
> stdc_has_single_bit (but that actually can be implemented that way too)
> and stdc_bit_floor.  And the version I haven't posted that used the 3
> patches posted on Saturday would have all functions usable when the
> argument to those macros is a constant expression.
> 
> BTW, if we go route of implementing all of the stdc_ type-generic macros
> as builtins, we could as well not implement that way the following 4
> # define stdc_first_leading_one(x) (__builtin_clzg (x, -1) + 1U)
> # define stdc_first_trailing_one(x) (__builtin_ctzg (x, -1) + 1U)
> # define stdc_count_ones(x) ((unsigned int) __builtin_popcountg (x))
> # define stdc_has_single_bit(x) ((_Bool) (__builtin_popcountg (x) == 1))
> which are implementable without any new extensions.

I'd rather do all of those necessary as builtins instead of hacking
around limitations.  If we don't want to solve those limitations in
a more generic way.

And of course nobody would write

const int x = sizeof (stdc_first_leading_one (5));

that's just stupid ... (but oh well).

Richard.
  
Jakub Jelinek Nov. 20, 2023, 8:48 a.m. UTC | #8
On Mon, Nov 20, 2023 at 08:37:55AM +0000, Richard Biener wrote:
> > I'm not sure about that, it would be nice for them to be usable there,
> 
> Btw, I think that {( .. )} should be made usable in sizeof () and
> possibly even in at least C++ constant expressions (not sure about C).

I believe the problkem is having new VAR_DECLs in those which actually
aren't file scope/namespace scope variables but there is no function
DECL_CONTEXT to attach to them.  So, it probably wouldn't be one afternoon
change to allow that.

> > but I think e.g. none of Joseph's implementation of those macros
> > made them usable there (except inside of sizeof/typeof/typeof_unquall)
> > and I don't see a requirement in the C23 standard that they must be usable
> > in constant expressions.
> > The versions I've posted on Thursday were usable there except for
> > stdc_has_single_bit (but that actually can be implemented that way too)
> > and stdc_bit_floor.  And the version I haven't posted that used the 3
> > patches posted on Saturday would have all functions usable when the
> > argument to those macros is a constant expression.
> > 
> > BTW, if we go route of implementing all of the stdc_ type-generic macros
> > as builtins, we could as well not implement that way the following 4
> > # define stdc_first_leading_one(x) (__builtin_clzg (x, -1) + 1U)
> > # define stdc_first_trailing_one(x) (__builtin_ctzg (x, -1) + 1U)
> > # define stdc_count_ones(x) ((unsigned int) __builtin_popcountg (x))
> > # define stdc_has_single_bit(x) ((_Bool) (__builtin_popcountg (x) == 1))
> > which are implementable without any new extensions.
> 
> I'd rather do all of those necessary as builtins instead of hacking
> around limitations.  If we don't want to solve those limitations in
> a more generic way.

Ok, I can prepare a patch for that, shouldn't be that hard.
Do you want all 14, or just the 10 and leave the above 4 with the
above definitions?

> And of course nobody would write
> 
> const int x = sizeof (stdc_first_leading_one (5));
> 
> that's just stupid ... (but oh well).

Well, standard testsuite needs to include that at least.
But of course, if it is usable in constant expressions,
unsigned a = stdc_bit_width ((unsigned _BitInt(824)) 435987349856735489657489657468954768954674589674598uwb * 49876558967549867548967548967548967549867548967456uwb);
etc. can be useful in constant expressions.

	Jakub
  
Richard Biener Nov. 20, 2023, 8:58 a.m. UTC | #9
On Mon, 20 Nov 2023, Jakub Jelinek wrote:

> On Mon, Nov 20, 2023 at 08:37:55AM +0000, Richard Biener wrote:
> > > I'm not sure about that, it would be nice for them to be usable there,
> > 
> > Btw, I think that {( .. )} should be made usable in sizeof () and
> > possibly even in at least C++ constant expressions (not sure about C).
> 
> I believe the problkem is having new VAR_DECLs in those which actually
> aren't file scope/namespace scope variables but there is no function
> DECL_CONTEXT to attach to them.  So, it probably wouldn't be one afternoon
> change to allow that.
> 
> > > but I think e.g. none of Joseph's implementation of those macros
> > > made them usable there (except inside of sizeof/typeof/typeof_unquall)
> > > and I don't see a requirement in the C23 standard that they must be usable
> > > in constant expressions.
> > > The versions I've posted on Thursday were usable there except for
> > > stdc_has_single_bit (but that actually can be implemented that way too)
> > > and stdc_bit_floor.  And the version I haven't posted that used the 3
> > > patches posted on Saturday would have all functions usable when the
> > > argument to those macros is a constant expression.
> > > 
> > > BTW, if we go route of implementing all of the stdc_ type-generic macros
> > > as builtins, we could as well not implement that way the following 4
> > > # define stdc_first_leading_one(x) (__builtin_clzg (x, -1) + 1U)
> > > # define stdc_first_trailing_one(x) (__builtin_ctzg (x, -1) + 1U)
> > > # define stdc_count_ones(x) ((unsigned int) __builtin_popcountg (x))
> > > # define stdc_has_single_bit(x) ((_Bool) (__builtin_popcountg (x) == 1))
> > > which are implementable without any new extensions.
> > 
> > I'd rather do all of those necessary as builtins instead of hacking
> > around limitations.  If we don't want to solve those limitations in
> > a more generic way.
> 
> Ok, I can prepare a patch for that, shouldn't be that hard.
> Do you want all 14, or just the 10 and leave the above 4 with the
> above definitions?

I'd say all of them for consistency, we can parse/gimplify them to
the open-coded variants then.

> > And of course nobody would write
> > 
> > const int x = sizeof (stdc_first_leading_one (5));
> > 
> > that's just stupid ... (but oh well).
> 
> Well, standard testsuite needs to include that at least.
> But of course, if it is usable in constant expressions,
> unsigned a = stdc_bit_width ((unsigned _BitInt(824)) 435987349856735489657489657468954768954674589674598uwb * 49876558967549867548967548967548967549867548967456uwb);
> etc. can be useful in constant expressions.
> 
> 	Jakub
> 
>
  
Joseph Myers Nov. 21, 2023, 12:12 a.m. UTC | #10
On Mon, 20 Nov 2023, Jakub Jelinek wrote:

> and be done with that.  If there is an agreement we should do that
> for all 14 rather than just those 3 + the 2 ugly hacks (__builtin_c{l,t}zg with
> 0ULL second argument and __builtin_bit_complement), I can change the

I tend to agree with the "ugly hack" description of the 0ULL second 
argument special case.  __builtin_bit_complement seems reasonable enough 
as a primitive for implementing such operations, but so does just defining 
built-in functions in the front end for all 14 (or for all 14 except those 
that are trivial wrappers round existing built-in functions without 
needing to use ({}) or expand argument tokens more than once).
  
Martin Uecker Nov. 21, 2023, 5:59 a.m. UTC | #11
(corrected address)


> On Mon, 20 Nov 2023, Jakub Jelinek wrote:
> 
> > On Mon, Nov 20, 2023 at 08:37:55AM +0000, Richard Biener wrote:
> > > > I'm not sure about that, it would be nice for them to be usable there,
> > > 
> > > Btw, I think that {( .. )} should be made usable in sizeof () and
> > > possibly even in at least C++ constant expressions (not sure about C).
> > 
> > I believe the problkem is having new VAR_DECLs in those which actually
> > aren't file scope/namespace scope variables but there is no function
> > DECL_CONTEXT to attach to them.  So, it probably wouldn't be one afternoon
> > change to allow that.

There is an open bug about this:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93239

But the required feature is simpler than full statement 
expression, essentially

let x = y in z

where x is an identifier and y and z are expression, and
this should be much easier to implement.

I thought about an extension to _Generic which would be
useful here:

_GNU_Generic(y, int x1: z1, float x2: z2)

or even

_GNU_Generic(y, default x: z)

which would be useful in general.


> > 
> > > > but I think e.g. none of Joseph's implementation of those macros
> > > > made them usable there (except inside of sizeof/typeof/typeof_unquall)
> > > > and I don't see a requirement in the C23 standard that they must be usable
> > > > in constant expressions.
> > > > The versions I've posted on Thursday were usable there except for
> > > > stdc_has_single_bit (but that actually can be implemented that way too)
> > > > and stdc_bit_floor.  And the version I haven't posted that used the 3
> > > > patches posted on Saturday would have all functions usable when the
> > > > argument to those macros is a constant expression.
> > > > 
> > > > BTW, if we go route of implementing all of the stdc_ type-generic macros
> > > > as builtins, we could as well not implement that way the following 4
> > > > # define stdc_first_leading_one(x) (__builtin_clzg (x, -1) + 1U)
> > > > # define stdc_first_trailing_one(x) (__builtin_ctzg (x, -1) + 1U)
> > > > # define stdc_count_ones(x) ((unsigned int) __builtin_popcountg (x))
> > > > # define stdc_has_single_bit(x) ((_Bool) (__builtin_popcountg (x) == 1))
> > > > which are implementable without any new extensions.
> > > 
> > > I'd rather do all of those necessary as builtins instead of hacking
> > > around limitations.  If we don't want to solve those limitations in
> > > a more generic way.
> > 
> > Ok, I can prepare a patch for that, shouldn't be that hard.
> > Do you want all 14, or just the 10 and leave the above 4 with the
> > above definitions?
> 
> I'd say all of them for consistency, we can parse/gimplify them to
> the open-coded variants then.

For use of _Generic with _BitInt one would need some kind
of _BitInt_Width(x) macro/builtin that returns the width as an
constant expressions, which would also be useful in general.

Then one could write:

_Generic(x, int a: foo, _BitInt(_BitInt_Width(x)): bar);

With this and an extension as suggested above, I think one could
solve this in a generic way.

Martin

> > > And of course nobody would write
> > > 
> > > const int x = sizeof (stdc_first_leading_one (5));
> > > 
> > > that's just stupid ... (but oh well).
> > 
> > Well, standard testsuite needs to include that at least.
> > But of course, if it is usable in constant expressions,
> > unsigned a = stdc_bit_width ((unsigned _BitInt(824)) 435987349856735489657489657468954768954674589674598uwb * 49876558967549867548967548967548967549867548967456uwb);
> > etc. can be useful in constant expressions.


> > 
> > 	Jakub
> > 
> >
  
Jakub Jelinek Nov. 21, 2023, 8:30 a.m. UTC | #12
On Tue, Nov 21, 2023 at 12:12:18AM +0000, Joseph Myers wrote:
> On Mon, 20 Nov 2023, Jakub Jelinek wrote:
> 
> > and be done with that.  If there is an agreement we should do that
> > for all 14 rather than just those 3 + the 2 ugly hacks (__builtin_c{l,t}zg with
> > 0ULL second argument and __builtin_bit_complement), I can change the
> 
> I tend to agree with the "ugly hack" description of the 0ULL second 
> argument special case.

It could be done differently, e.g. by adding optional third argument with
the meaning that if that third argument is present and constant non-zero,
the return value on 0 first argument would be second argument + bitsize
of the first argument, while if the third argument is not present or
constant zero, it is the current behavior of second argument on constant
zero.

Given that I've already posted all 14 __builtin_stdc_*, that isn't
strictly needed for stdbit.h, the only question is if it would be useful
for users in other cases.
If there is the _Generic extension or say even just a new builtin like
__builtin_save_expr (x, y)
which would introduce some special fixed identifier when evaluating y for
the value (but not side-effects) of x, that wouldn't be needed and I agree
going for something like that would be cleaner.

	Jakub
  

Patch

--- gcc/builtins.cc.jj	2023-11-14 10:52:16.170276318 +0100
+++ gcc/builtins.cc	2023-11-18 13:55:02.996395917 +0100
@@ -9591,6 +9591,10 @@  fold_builtin_bit_query (location_t loc,
     case BUILT_IN_CLZG:
       if (arg1 && TREE_CODE (arg1) != INTEGER_CST)
 	return NULL_TREE;
+      if (arg1
+	  && (TYPE_MAIN_VARIANT (TREE_TYPE (arg1))
+	      == long_long_unsigned_type_node))
+	arg1 = build_int_cst (integer_type_node, TYPE_PRECISION (arg0_type));
       ifn = IFN_CLZ;
       fcodei = BUILT_IN_CLZ;
       fcodel = BUILT_IN_CLZL;
@@ -9599,6 +9603,10 @@  fold_builtin_bit_query (location_t loc,
     case BUILT_IN_CTZG:
       if (arg1 && TREE_CODE (arg1) != INTEGER_CST)
 	return NULL_TREE;
+      if (arg1
+	  && (TYPE_MAIN_VARIANT (TREE_TYPE (arg1))
+	      == long_long_unsigned_type_node))
+	arg1 = build_int_cst (integer_type_node, TYPE_PRECISION (arg0_type));
       ifn = IFN_CTZ;
       fcodei = BUILT_IN_CTZ;
       fcodel = BUILT_IN_CTZL;
--- gcc/fold-const-call.cc.jj	2023-11-14 10:52:16.186276097 +0100
+++ gcc/fold-const-call.cc	2023-11-18 13:49:57.514641417 +0100
@@ -1543,13 +1543,13 @@  fold_const_call_sss (real_value *result,
 
       *RESULT = FN (ARG0, ARG1)
 
-   where ARG_TYPE is the type of ARG0 and PRECISION is the number of bits in
-   the result.  Return true on success.  */
+   where ARG0_TYPE is the type of ARG0, ARG1_TYPE is the type of ARG1 and
+   PRECISION is the number of bits in the result.  Return true on success.  */
 
 static bool
 fold_const_call_sss (wide_int *result, combined_fn fn,
 		     const wide_int_ref &arg0, const wide_int_ref &arg1,
-		     unsigned int precision, tree arg_type ATTRIBUTE_UNUSED)
+		     unsigned int precision, tree arg0_type, tree arg1_type)
 {
   switch (fn)
     {
@@ -1559,6 +1559,8 @@  fold_const_call_sss (wide_int *result, c
 	int tmp;
 	if (wi::ne_p (arg0, 0))
 	  tmp = wi::clz (arg0);
+	else if (TYPE_MAIN_VARIANT (arg1_type) == long_long_unsigned_type_node)
+	  tmp = TYPE_PRECISION (arg0_type);
 	else
 	  tmp = arg1.to_shwi ();
 	*result = wi::shwi (tmp, precision);
@@ -1571,6 +1573,8 @@  fold_const_call_sss (wide_int *result, c
 	int tmp;
 	if (wi::ne_p (arg0, 0))
 	  tmp = wi::ctz (arg0);
+	else if (TYPE_MAIN_VARIANT (arg1_type) == long_long_unsigned_type_node)
+	  tmp = TYPE_PRECISION (arg0_type);
 	else
 	  tmp = arg1.to_shwi ();
 	*result = wi::shwi (tmp, precision);
@@ -1625,7 +1629,7 @@  fold_const_call_1 (combined_fn fn, tree
 	  wide_int result;
 	  if (fold_const_call_sss (&result, fn, wi::to_wide (arg0),
 				   wi::to_wide (arg1), TYPE_PRECISION (type),
-				   TREE_TYPE (arg0)))
+				   TREE_TYPE (arg0), TREE_TYPE (arg1)))
 	    return wide_int_to_tree (type, result);
 	}
       return NULL_TREE;
--- gcc/doc/extend.texi.jj	2023-11-16 17:27:39.838028110 +0100
+++ gcc/doc/extend.texi	2023-11-18 13:17:40.982551766 +0100
@@ -15031,6 +15031,9 @@  optional second argument with int type.
 are performed on the first argument.  If two arguments are specified,
 and first argument is 0, the result is the second argument.  If only
 one argument is specified and it is 0, the result is undefined.
+As an exception, if two arguments are specified and the second argument
+is 0ULL, it is as if the second argument was the bit width of the first
+argument.
 @enddefbuiltin
 
 @defbuiltin{int __builtin_ctzg (...)}
@@ -15040,6 +15043,9 @@  optional second argument with int type.
 are performed on the first argument.  If two arguments are specified,
 and first argument is 0, the result is the second argument.  If only
 one argument is specified and it is 0, the result is undefined.
+As an exception, if two arguments are specified and the second argument
+is 0ULL, it is as if the second argument was the bit width of the first
+argument.
 @enddefbuiltin
 
 @defbuiltin{int __builtin_clrsbg (...)}
--- gcc/c-family/c-common.cc.jj	2023-11-14 18:26:05.193616416 +0100
+++ gcc/c-family/c-common.cc	2023-11-18 13:20:55.751844490 +0100
@@ -6540,6 +6540,13 @@  check_builtin_function_arguments (locati
 			"%qE does not have integral type", 2, fndecl);
 	      return false;
 	    }
+	  if (integer_zerop (args[1])
+	      && (TYPE_MAIN_VARIANT (TREE_TYPE (args[1]))
+		  == long_long_unsigned_type_node))
+	    args[1] = build_int_cst (integer_type_node,
+				     INTEGRAL_TYPE_P (TREE_TYPE (args[0]))
+				     ? TYPE_PRECISION (TREE_TYPE (args[0]))
+				     : 0);
 	  if ((TYPE_PRECISION (TREE_TYPE (args[1]))
 	       > TYPE_PRECISION (integer_type_node))
 	      || (TYPE_PRECISION (TREE_TYPE (args[1]))
--- gcc/testsuite/c-c++-common/pr111309-3.c.jj	2023-11-18 13:22:22.084644472 +0100
+++ gcc/testsuite/c-c++-common/pr111309-3.c	2023-11-18 13:26:12.894436233 +0100
@@ -0,0 +1,26 @@ 
+/* PR c/111309 */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+int
+main ()
+{
+  if (__builtin_clzg ((unsigned char) 0, 0ULL) != __CHAR_BIT__
+      || __builtin_clzg ((unsigned short) 0, 0ULL) != __SIZEOF_SHORT__ * __CHAR_BIT__
+      || __builtin_clzg (0U, 0ULL) != __SIZEOF_INT__ * __CHAR_BIT__
+      || __builtin_clzg (0UL, 0ULL) != __SIZEOF_LONG__ * __CHAR_BIT__
+      || __builtin_clzg (0ULL, 0ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__
+#ifdef __SIZEOF_INT128__
+      || __builtin_clzg ((unsigned __int128) 0, 0ULL) != __SIZEOF_INT128__ * __CHAR_BIT__
+#endif
+      || __builtin_clzg ((unsigned char) 1, 0ULL) != __CHAR_BIT__ - 1
+      || __builtin_clzg ((unsigned short) 2, 0ULL) != __SIZEOF_SHORT__ * __CHAR_BIT__ - 2
+      || __builtin_clzg (4U, 0ULL) != __SIZEOF_INT__ * __CHAR_BIT__ - 3
+      || __builtin_clzg (8UL, 0ULL) != __SIZEOF_LONG__ * __CHAR_BIT__ - 4
+      || __builtin_clzg (16ULL, 0ULL) != __SIZEOF_LONG_LONG__ * __CHAR_BIT__ - 5
+#ifdef __SIZEOF_INT128__
+      || __builtin_clzg ((unsigned __int128) 32, 0ULL) != __SIZEOF_INT128__ * __CHAR_BIT__ - 6
+#endif
+      || 0)
+    __builtin_abort ();
+}
--- gcc/testsuite/gcc.dg/torture/bitint-43.c.jj	2023-11-14 10:52:16.191276028 +0100
+++ gcc/testsuite/gcc.dg/torture/bitint-43.c	2023-11-18 13:28:49.335261722 +0100
@@ -141,7 +141,9 @@  main ()
       || parity156 (0) != 0
       || popcount156 (0) != 0
       || __builtin_clzg ((unsigned _BitInt(156)) 0, 156 + 32) != 156 + 32
+      || __builtin_clzg ((unsigned _BitInt(156)) 0, 0ULL) != 156
       || __builtin_ctzg ((unsigned _BitInt(156)) 0, 156) != 156
+      || __builtin_ctzg ((unsigned _BitInt(156)) 0, 0ULL) != 156
       || __builtin_clrsbg ((_BitInt(156)) 0) != 156 - 1
       || __builtin_ffsg ((_BitInt(156)) 0) != 0
       || __builtin_parityg ((unsigned _BitInt(156)) 0) != 0
@@ -159,8 +161,10 @@  main ()
       || popcount156 (-1) != 156
       || __builtin_clzg ((unsigned _BitInt(156)) -1) != 0
       || __builtin_clzg ((unsigned _BitInt(156)) -1, 156 + 32) != 0
+      || __builtin_clzg ((unsigned _BitInt(156)) -1, 0ULL) != 0
       || __builtin_ctzg ((unsigned _BitInt(156)) -1) != 0
       || __builtin_ctzg ((unsigned _BitInt(156)) -1, 156) != 0
+      || __builtin_ctzg ((unsigned _BitInt(156)) -1, 0ULL) != 0
       || __builtin_clrsbg ((_BitInt(156)) -1) != 156 - 1
       || __builtin_ffsg ((_BitInt(156)) -1) != 1
       || __builtin_parityg ((unsigned _BitInt(156)) -1) != 0