[rs6000] Fix addg6s builtin with long long parameters. (PR100693)

Message ID b2128dcf14408b394358f51802e73bcc9d922889.camel@vnet.ibm.com
State Accepted, archived
Headers
Series [rs6000] Fix addg6s builtin with long long parameters. (PR100693) |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

will schmidt Oct. 6, 2022, 9:29 p.m. UTC
  [PATCH, rs6000] Fix addg6s builtin with long long parameters. (PR100693)

Hi,
  As reported in PR 100693, attempts to use __builtin_addg6s
with long long arguments result in truncated results.

Since the int and long long types can be coerced into each other,
(documented further near the rs6000-c.cc change), this is handled
by adding a builtin overload (ADDG6S_OV), and the addition of some
special handling in altivec_resolve_overloaded_builtin() to map
the calls to addg6s_32 or addg6s_64; similar to how the SCAL_CMPB
builtins are currently handled.

This has sniff-tested cleanly.

I'm seeing a regression failure show up in
testsuite/g++.dg/modules/adl-3*.c; which seems entirely unrelated
to the content in this change.  I'm poking at that a bit more to
see if I can tell the what/why for that.

OK for trunk?

Thanks,
-Will

gcc/
	PR target/100693

	* config/rs6000/rs6000-builtins.def ([POWER7]): Replace bif-name
	__builtin_addg6s with bif-name __builtin_addg6s_32.
	([POWER7-64]): New bif-name __builtin_addg6s_64.
	* config/rs6000/rs6000-c.cc (altivec_resolve_overloaded_builtin):
	Add handler mapping RS6000_OVLD_ADDG6S_OV to RS6000_BIF_ADDG6S
	and RS6000_BIF_ADDG6S_32.
	* config/rs6000/rs6000-overload.def (ADDG6S_OV): Add overloaded
	entry __builtin_addg6s mapped to ADDG6S_32 and ADDG6S.
	* config/rs6000/rs6000.md ("addg6s", UNSPEC_ADDG6S): Replace with
	("addg6s<mode>3") and rework.
	* doc/extend.texi (__builtin_addg6s): Add documentation for
	__builtin_addg6s with unsigned long long parameters.

gcc/testsuite/
	* testsuite/gcc.target/powerpc/pr100693-compile.c: New.
	* testsuite/gcc.target/powerpc/pr100693-run.c: New.
  

Comments

Segher Boessenkool Oct. 7, 2022, 8 p.m. UTC | #1
Hi!

On Thu, Oct 06, 2022 at 04:29:57PM -0500, will schmidt wrote:
>   As reported in PR 100693, attempts to use __builtin_addg6s
> with long long arguments result in truncated results.
> 
> Since the int and long long types can be coerced into each other,
> (documented further near the rs6000-c.cc change), this is handled
> by adding a builtin overload (ADDG6S_OV), and the addition of some
> special handling in altivec_resolve_overloaded_builtin() to map
> the calls to addg6s_32 or addg6s_64; similar to how the SCAL_CMPB
> builtins are currently handled.

Another option is to make "addg6sl" and "addg6ll" versions, like many
generic builtins have (popcount for example).  This is ugly and not very
userfriendly of course.  OTOH, the overload thing is more confusing, if
you use constant arguments for example.

Have you considered just always using "long" arguments, treating this as
a bugfix?  It will only hurt users that depend on 64-bit arguments being
cut short to 32 bits (implicitly!), not a very sensible thing to expect.

> I'm seeing a regression failure show up in
> testsuite/g++.dg/modules/adl-3*.c; which seems entirely unrelated
> to the content in this change.  I'm poking at that a bit more to
> see if I can tell the what/why for that.

Yeah, we need that resolved before this patch can be okay at all.  But I
guess it is just an unstable test?

> 	* config/rs6000/rs6000-builtins.def ([POWER7]): Replace bif-name
> 	__builtin_addg6s with bif-name __builtin_addg6s_32.

The stanza is lowercase, not uppercase.  But, Sthis should be
	* config/rs6000/rs6000-builtins.def (ADDG6S): Replace bif-name
	__builtin_addg6s with bif-name __builtin_addg6s_32.
or don't even mention the old name?  Like
	* config/rs6000/rs6000-builtins.def (ADDG6S): Use bif-name
	__builtin_addg6s_32.

> 	([POWER7-64]): New bif-name __builtin_addg6s_64.

Similar.

> 	* config/rs6000/rs6000-c.cc (altivec_resolve_overloaded_builtin):
> 	Add handler mapping RS6000_OVLD_ADDG6S_OV to RS6000_BIF_ADDG6S
> 	and RS6000_BIF_ADDG6S_32.

Please don't wrap lines early, certainly not if that then leaves a colon
at the end of a line (it looks like you forgot something, that way).
Changelog lines are 80 character positions long.

> 	* config/rs6000/rs6000.md ("addg6s", UNSPEC_ADDG6S): Replace with
> 	("addg6s<mode>3") and rework.

	* config/rs6000/rs6000.md (addg6s): Delete.
	(addg6s<mode>3 for GPR): New.

There is no pattern called "UNSPEC_ADDG6S", it doesn't belong in the
changelog :-)

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100693-compile.c
> @@ -0,0 +1,30 @@
> +/* { dg-do compile { target { powerpc*-*-linux* } } } */

target {*-*-linux*}  (everything is powerpc*-*-* in gcc.target/powerpc)

> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */

Don't skip on darwin unless you know it is needed.

> +/* { dg-require-effective-target powerpc_vsx_ok } */

addg6s does not need VSX.  You want has_arch_pwr7 here?

> +/* { dg-options "-mdejagnu-cpu=power7 -O3" } */

-O2 if that works please.

> +/* { dg-final { scan-assembler-times {\maddg6s\M} 4 } } */
> +/* { dg-final { scan-assembler-not    "bl __builtin" } } */

Some ABIs will use tailcalls here.  You can prevent that in the
testcase (add an  asm("");  before the return statement), or you can
scan for something less specific than the exact "bl"?

> +unsigned long long test4_ull (unsigned long long *a, unsigned long long *b)
> +{
> +	return __builtin_addg6s(*a, *b);
> +}

This does not work with -m32, no?


Segher
  

Patch

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index f76f54793d73..11050e4c26d5 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2010,12 +2010,13 @@ 
     XXSPLTD_V2DI vsx_xxspltd_v2di {}
 
 
 ; Power7 builtins (ISA 2.06).
 [power7]
-  const unsigned int __builtin_addg6s (unsigned int, unsigned int);
-    ADDG6S addg6s {}
+
+  const unsigned int __builtin_addg6s_32 (unsigned int, unsigned int);
+    ADDG6S_32 addg6ssi3 {}
 
   const signed long __builtin_bpermd (signed long, signed long);
     BPERMD bpermd_di {32bit}
 
   const unsigned int __builtin_cbcdtd (unsigned int);
@@ -2041,10 +2042,14 @@ 
     UNPACK_V1TI unpackv1ti {}
 
 
 ; Power7 builtins requiring 64-bit GPRs (even with 32-bit addressing).
 [power7-64]
+
+  const unsigned long __builtin_addg6s_64 (unsigned long, unsigned long);
+    ADDG6S addg6sdi3 {no32bit}
+
   const signed long long __builtin_divde (signed long long, signed long long);
     DIVDE dive_di {}
 
   const unsigned long long __builtin_divdeu (unsigned long long, \
                                              unsigned long long);
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 566094626293..28e8b6761ce5 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -1919,10 +1919,35 @@  altivec_resolve_overloaded_builtin (location_t loc, tree fndecl,
 				   instance_code, fcode, types, args);
 	if (call != error_mark_node)
 	  return call;
 	break;
       }
+      /* We need to special case __builtin_addg6s because the overloaded
+	 forms of this function take (unsigned int, unsigned int) or
+	 (unsigned long long, unsigned long long).  Since C conventions
+	 allow the respective argument types to be implicitly coerced into
+	 each other, the default handling does not provide adequate
+	 discrimination between the desired forms of the function.  */
+    case RS6000_OVLD_ADDG6S_OV:
+      {
+	machine_mode arg1_mode = TYPE_MODE (types[0]);
+	machine_mode arg2_mode = TYPE_MODE (types[1]);
+
+	/* If any supplied arguments are wider than 32 bits, resolve to
+	   64-bit variant of built-in function.  */
+	if (GET_MODE_PRECISION (arg1_mode) > 32
+	    || GET_MODE_PRECISION (arg2_mode) > 32)
+	  instance_code = RS6000_BIF_ADDG6S;
+	else
+	  instance_code = RS6000_BIF_ADDG6S_32;
+
+	tree call = find_instance (&unsupported_builtin, &instance,
+				   instance_code, fcode, types, args);
+	if (call != error_mark_node)
+	  return call;
+	break;
+      }
     case RS6000_OVLD_VEC_VSIE:
       {
 	machine_mode arg1_mode = TYPE_MODE (types[0]);
 
 	/* If supplied first argument is wider than 64 bits, resolve to
diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
index 44e2945aaa0e..41b74c0c1500 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -193,10 +193,16 @@ 
   unsigned int __builtin_cmpb (unsigned int, unsigned int);
     CMPB_32
   unsigned long long __builtin_cmpb (unsigned long long, unsigned long long);
     CMPB
 
+[ADDG6S_OV, SKIP, __builtin_addg6s]
+  unsigned int __builtin_addg6s (unsigned int, unsigned int);
+    ADDG6S_32
+  unsigned long long __builtin_addg6s (unsigned long long, unsigned long long);
+    ADDG6S
+
 [VEC_ABS, vec_abs, __builtin_vec_abs]
   vsc __builtin_vec_abs (vsc);
     ABS_V16QI
   vss __builtin_vec_abs (vss);
     ABS_V8HI
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index ad5a4cf2ef83..b172ffd097ab 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -14534,14 +14534,14 @@  (define_peephole2
   operands[5] = change_address (mem, <ALTIVEC_DFORM:MODE>mode, new_addr);
 })
    
 
 ;; Miscellaneous ISA 2.06 (power7) instructions
-(define_insn "addg6s"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-	(unspec:SI [(match_operand:SI 1 "register_operand" "r")
-		    (match_operand:SI 2 "register_operand" "r")]
+(define_insn "addg6s<mode>3"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(unspec:GPR [(match_operand:GPR 1 "register_operand" "r")
+		    (match_operand:GPR 2 "register_operand" "r")]
 		   UNSPEC_ADDG6S))]
   "TARGET_POPCNTD"
   "addg6s %0,%1,%2"
   [(set_attr "type" "integer")])
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index a5afb467d235..d393062e790f 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -18139,10 +18139,11 @@  addition to the @option{-maltivec}, @option{-mpopcntd}, and
 @option{-mvsx} options.
 
 The following basic built-in functions require @option{-mpopcntd}:
 @smallexample
 unsigned int __builtin_addg6s (unsigned int, unsigned int);
+unsigned long long __builtin_addg6s (unsigned long long, unsigned long long);
 long long __builtin_bpermd (long long, long long);
 unsigned int __builtin_cbcdtd (unsigned int);
 unsigned int __builtin_cdtbcd (unsigned int);
 long long __builtin_divde (long long, long long);
 unsigned long long __builtin_divdeu (unsigned long long, unsigned long long);
diff --git a/gcc/testsuite/gcc.target/powerpc/pr100693-compile.c b/gcc/testsuite/gcc.target/powerpc/pr100693-compile.c
new file mode 100644
index 000000000000..113936a3d9f8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr100693-compile.c
@@ -0,0 +1,30 @@ 
+/* { dg-do compile { target { powerpc*-*-linux* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mdejagnu-cpu=power7 -O3" } */
+/* { dg-final { scan-assembler-times {\maddg6s\M} 4 } } */
+/* { dg-final { scan-assembler-not    "bl __builtin" } } */
+
+/* Test case for the addg6s builtin, exercising both
+   unsigned int and unsigned long long arguments. */
+
+unsigned long long test2_ull (unsigned long long a, unsigned long long b)
+{
+    return __builtin_addg6s (a, b);
+}
+
+unsigned int test1_ui (unsigned int a, unsigned int b)
+{
+    return __builtin_addg6s (a, b);
+}
+
+unsigned int test3_ui (unsigned int *a, unsigned int *b)
+{
+	return __builtin_addg6s(*a, *b);
+}
+
+unsigned long long test4_ull (unsigned long long *a, unsigned long long *b)
+{
+	return __builtin_addg6s(*a, *b);
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr100693-run.c b/gcc/testsuite/gcc.target/powerpc/pr100693-run.c
new file mode 100644
index 000000000000..3694010b10ae
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr100693-run.c
@@ -0,0 +1,78 @@ 
+/* { dg-do run { target { powerpc*-*-linux* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mdejagnu-cpu=power7 -O3" } */
+
+/* Test case for the addg6s builtin, exercising both
+   unsigned int and unsigned long long arguments.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+unsigned int test_addg6s_int (unsigned int a, unsigned int b)
+{
+    return __builtin_addg6s (a, b);
+}
+
+unsigned long long test_addg6s_longlong (unsigned long long a, unsigned long long b)
+{
+    return __builtin_addg6s (a, b);
+}
+
+/* Table of expected values.  */
+
+unsigned int exp_i[] = {
+0x66666660,
+0x66666606,
+0x66666066,
+0x66660666,
+0x66606666,
+0x66066666,
+0x60666666,
+0x06666666,
+0x77777777
+};
+
+unsigned long long exp_ll[] = {
+0x6666666666666660,
+0x6666666666666606,
+0x6666666666666066,
+0x6666666666660666,
+0x6666666666606666,
+0x6666666666066666,
+0x6666666660666666,
+0x6666666606666666,
+0x6666666066666666,
+0x6666660666666666,
+0x6666606666666666,
+0x6666066666666666,
+0x6660666666666666, 
+0x6606666666666666, 
+0x6066666666666666, 
+0x0666666666666666, 
+0x7777777777777777
+};
+
+int main() {
+	unsigned int intresult;
+	unsigned long long longresult;
+	int idx;
+	int fail=0;
+
+	for (idx=0; idx<8; idx++) {
+	   intresult = test_addg6s_int (0x01<<4*idx, 0x0f<<4*idx);
+	   if (exp_i[idx] != intresult)
+	      printf("index:%2d Got:%8x Expected:%8x %d\n",
+			idx, intresult, exp_i[idx], ++fail);
+	}
+
+	for (idx=0; idx<16; idx++) {
+	   longresult = test_addg6s_longlong (0x01ULL<<4*idx, 0x0fULL<<4*idx);
+	   if (exp_ll[idx] != longresult)
+	      printf("index:%2d Got:%16llx Expected:%16llx %d\n",
+			idx, longresult, exp_ll[idx], ++fail);
+	}
+	if (fail)
+		abort();
+}
+