[v7,rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]
Commit Message
Hi,
This patch implements the f[min/max]_optab optabs with xs[min/max]dp on rs6000.
Tests show that the results of xs[min/max]dp are consistent with the C99
semantics of fmin/fmax.
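For readers who want the C99 rule spelled out: the standard treats a NaN argument to fmin/fmax as missing data, so when exactly one operand is NaN the other operand is returned. The following is only a minimal reference sketch of that rule in portable C. The helper names (`is_nan`, `ref_fmin`, `ref_fmax`, `check_nan_rule`) are hypothetical and are not part of the patch; the sketch says nothing about how xs[min/max]dp is implemented in hardware, and it should not be built with -ffast-math, which may fold the `x != x` test away.

```c
#include <assert.h>
#include <math.h>   /* only for the NAN macro */

/* Hypothetical reference helpers, not part of the patch.  */

/* A NaN is the only value that compares unequal to itself.  */
static int is_nan (double x)
{
  return x != x;
}

/* C99 fmin rule: if exactly one operand is NaN, return the other one.
   If both are NaN, the result is NaN.  */
static double ref_fmin (double a, double b)
{
  if (is_nan (a))
    return b;
  if (is_nan (b))
    return a;
  return a < b ? a : b;
}

/* C99 fmax rule, symmetric to ref_fmin.  */
static double ref_fmax (double a, double b)
{
  if (is_nan (a))
    return b;
  if (is_nan (b))
    return a;
  return a > b ? a : b;
}

/* Spot-check the NaN rule for one ordinary (non-NaN) value x.  */
static void check_nan_rule (double x)
{
  double qnan = NAN;
  assert (ref_fmin (qnan, x) == x);
  assert (ref_fmin (x, qnan) == x);
  assert (ref_fmax (qnan, x) == x);
  assert (ref_fmax (x, qnan) == x);
  assert (is_nan (ref_fmin (qnan, qnan)));
  assert (is_nan (ref_fmax (qnan, qnan)));
}
```

Calling `check_nan_rule (1.0)` from a small driver exercises every NaN position once. The results can be compared against `fmin`/`fmax` from `<math.h>` on any target.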
This patch also binds __builtin_vsx_xs[min/max]dp to fmin/fmax instead
of smin/smax when fast-math is not set. When fast-math is set, xs[min/max]dp
are folded to MIN/MAX_EXPR in gimple and finally expanded to smin/smax.
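To illustrate why the smin/smax binding is only safe when NaNs are assumed absent: as far as I understand the GCC internals documentation, smin/smax on floating-point operands leave the result unspecified when either operand is NaN. The sketch below uses a plain compare-and-select as one possible such behaviour; `select_min` and `check_order_dependence` are hypothetical names, not GCC code. It shows that this form is operand-order dependent once a NaN is involved, which the C99 fmin rule is not.

```c
#include <assert.h>
#include <math.h>   /* only for the NAN macro */

/* Hypothetical compare-and-select minimum.  This is one behaviour a
   MIN_EXPR/smin-style operation might have; it is an assumption for
   illustration, not what GCC is guaranteed to emit.  */
static double select_min (double a, double b)
{
  return a < b ? a : b;
}

/* Every ordered comparison with a NaN is false, so the NaN test is
   written as a self-inequality.  */
static int is_nan (double x)
{
  return x != x;
}

/* With one NaN operand the result depends on operand order:
   the second operand is always selected.  */
static void check_order_dependence (double x)
{
  double qnan = NAN;
  assert (select_min (qnan, x) == x);          /* NaN < x is false, picks x   */
  assert (is_nan (select_min (x, qnan)));      /* x < NaN is false, picks NaN */
}
```

With finite operands `select_min` and fmin agree, which is why folding to MIN/MAX_EXPR is acceptable under fast-math, where NaNs are assumed not to occur.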
Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.
ChangeLog
2022-09-26 Haochen Gui <guihaoc@linux.ibm.com>
gcc/
PR target/103605
* config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Gimple
fold RS6000_BIF_XSMINDP and RS6000_BIF_XSMAXDP when fast-math is set.
* config/rs6000/rs6000.md (FMINMAX): New int iterator.
(minmax_op): New int attribute.
(UNSPEC_FMAX, UNSPEC_FMIN): New unspecs.
(f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN.
* config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set
pattern to fmaxdf3.
(__builtin_vsx_xsmindp): Set pattern to fmindf3.
gcc/testsuite/
PR target/103605
* gcc.target/powerpc/pr103605.h: New.
* gcc.target/powerpc/pr103605-1.c: New.
* gcc.target/powerpc/pr103605-2.c: New.
patch.diff
Comments
Hi,
Since the ticket (PR107013, adding fmin/fmax to the RTL codes) is suspended, I am
pinging this patch. The fmin/fmax unspecs can be replaced with the corresponding
RTL codes once that ticket is resolved.
https://gcc.gnu.org/pipermail/gcc-patches/2022-September/602181.html
Thanks
Gui Haochen
Hi Haochen,
On 2022/9/26 11:35, HAO CHEN GUI wrote:
> Hi,
> This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000.
> Tests show that outputs of xs[min/max]dp are consistent with the standard
> of C99 fmin/max.
>
> This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead
> of smin/max when fast-math is not set. While fast-math is set, xs[min/max]dp
> are folded to MIN/MAX_EXPR in gimple, and finally expanded to smin/max.
>
> Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
Sorry for the late review. This patch is okay for trunk, with or without
the nit below addressed. Thanks!
>
> ChangeLog
> 2022-09-26 Haochen Gui <guihaoc@linux.ibm.com>
>
> gcc/
> PR target/103605
> * config/rs6000/rs6000-builtin.cc (rs6000_gimple_fold_builtin): Gimple
> fold RS6000_BIF_XSMINDP and RS6000_BIF_XSMAXDP when fast-math is set.
> * config/rs6000/rs6000.md (FMINMAX): New int iterator.
> (minmax_op): New int attribute.
> (UNSPEC_FMAX, UNSPEC_FMIN): New unspecs.
> (f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN.
> * config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set
> pattern to fmaxdf3.
> (__builtin_vsx_xsmindp): Set pattern to fmindf3.
>
> gcc/testsuite/
> PR target/103605
> * gcc.target/powerpc/pr103605.h: New.
> * gcc.target/powerpc/pr103605-1.c: New.
> * gcc.target/powerpc/pr103605-2.c: New.
>
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
> index e925ba9fad9..944ae9fe55c 100644
> --- a/gcc/config/rs6000/rs6000-builtin.cc
> +++ b/gcc/config/rs6000/rs6000-builtin.cc
> @@ -1588,6 +1588,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> gimple_set_location (g, gimple_location (stmt));
> gsi_replace (gsi, g, true);
> return true;
> + /* fold into MIN_EXPR when fast-math is set. */
> + case RS6000_BIF_XSMINDP:
> /* flavors of vec_min. */
> case RS6000_BIF_XVMINDP:
> case RS6000_BIF_XVMINSP:
> @@ -1614,6 +1616,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> gimple_set_location (g, gimple_location (stmt));
> gsi_replace (gsi, g, true);
> return true;
> + /* fold into MAX_EXPR when fast-math is set. */
> + case RS6000_BIF_XSMAXDP:
> /* flavors of vec_max. */
> case RS6000_BIF_XVMAXDP:
> case RS6000_BIF_XVMAXSP:
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index f4a9f24bcc5..8b735493b40 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -1613,10 +1613,10 @@
> XSCVSPDP vsx_xscvspdp {}
>
> const double __builtin_vsx_xsmaxdp (double, double);
> - XSMAXDP smaxdf3 {}
> + XSMAXDP fmaxdf3 {}
>
> const double __builtin_vsx_xsmindp (double, double);
> - XSMINDP smindf3 {}
> + XSMINDP fmindf3 {}
>
> const double __builtin_vsx_xsrdpi (double);
> XSRDPI vsx_xsrdpi {}
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index bf85baa5370..ae0dd98f0f9 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -158,6 +158,8 @@ (define_c_enum "unspec"
> UNSPEC_HASHCHK
> UNSPEC_XXSPLTIDP_CONST
> UNSPEC_XXSPLTIW_CONST
> + UNSPEC_FMAX
> + UNSPEC_FMIN
> ])
>
> ;;
> @@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr"
> DONE;
> })
>
> +
> +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN])
> +
> +(define_int_attr minmax_op [(UNSPEC_FMAX "max")
> + (UNSPEC_FMIN "min")])
> +
> +(define_insn "f<minmax_op><mode>3"
> + [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
> + (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
> + (match_operand:SFDF 2 "vsx_register_operand" "wa")]
> + FMINMAX))]
> + "TARGET_VSX && !flag_finite_math_only"
> + "xs<minmax_op>dp %x0,%x1,%x2"
> + [(set_attr "type" "fp")]
> +)
> +
> (define_expand "mov<mode>cc"
> [(set (match_operand:GPR 0 "gpc_reg_operand")
> (if_then_else:GPR (match_operand 1 "comparison_operator")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605-1.c b/gcc/testsuite/gcc.target/powerpc/pr103605-1.c
> new file mode 100644
> index 00000000000..923deec6a1e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605-1.c
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */
> +/* { dg-options "-O2 -mvsx" } */
Nit: Add a comment here like:
/* Verify that GCC generates expected min/max hw insns instead of fmin/fmax calls. */
> +/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */
> +/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */
> +
> +#include "pr103605.h"
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605-2.c b/gcc/testsuite/gcc.target/powerpc/pr103605-2.c
> new file mode 100644
> index 00000000000..f50fe9468f5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605-2.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_vsx_ok } */
> +/* { dg-options "-O2 -mvsx -ffast-math" } */
Ditto.
BR,
Kewen
> +/* { dg-final { scan-assembler-times {\mxsmaxcdp\M} 3 { target has_arch_pwr9 } } } */
> +/* { dg-final { scan-assembler-times {\mxsmincdp\M} 3 { target has_arch_pwr9 } } } */
> +/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 { target { ! has_arch_pwr9 } } } } */
> +/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 { target { ! has_arch_pwr9 } } } } */
> +
> +#include "pr103605.h"
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.h b/gcc/testsuite/gcc.target/powerpc/pr103605.h
> new file mode 100644
> index 00000000000..c99dfe6d7eb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.h
> @@ -0,0 +1,31 @@
> +#include <math.h>
> +
> +double test1 (double d0, double d1)
> +{
> + return fmin (d0, d1);
> +}
> +
> +float test2 (float d0, float d1)
> +{
> + return fmin (d0, d1);
> +}
> +
> +double test3 (double d0, double d1)
> +{
> + return fmax (d0, d1);
> +}
> +
> +float test4 (float d0, float d1)
> +{
> + return fmax (d0, d1);
> +}
> +
> +double test5 (double d0, double d1)
> +{
> + return __builtin_vsx_xsmindp (d0, d1);
> +}
> +
> +double test6 (double d0, double d1)
> +{
> + return __builtin_vsx_xsmaxdp (d0, d1);
> +}
>
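diff --git a/gcc/config/rs6000/rs6000-builtin.cc b/gcc/config/rs6000/rs6000-builtin.cc
index e925ba9fad9..944ae9fe55c 100644
--- a/gcc/config/rs6000/rs6000-builtin.cc
+++ b/gcc/config/rs6000/rs6000-builtin.cc
@@ -1588,6 +1588,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)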
gimple_set_location (g, gimple_location (stmt));
gsi_replace (gsi, g, true);
return true;
+ /* fold into MIN_EXPR when fast-math is set. */
+ case RS6000_BIF_XSMINDP:
/* flavors of vec_min. */
case RS6000_BIF_XVMINDP:
case RS6000_BIF_XVMINSP:
@@ -1614,6 +1616,8 @@ rs6000_gimple_fold_builtin (gimple_stmt_iterator *gsi)
gimple_set_location (g, gimple_location (stmt));
gsi_replace (gsi, g, true);
return true;
+ /* fold into MAX_EXPR when fast-math is set. */
+ case RS6000_BIF_XSMAXDP:
/* flavors of vec_max. */
case RS6000_BIF_XVMAXDP:
case RS6000_BIF_XVMAXSP:
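diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index f4a9f24bcc5..8b735493b40 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -1613,10 +1613,10 @@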
XSCVSPDP vsx_xscvspdp {}
const double __builtin_vsx_xsmaxdp (double, double);
- XSMAXDP smaxdf3 {}
+ XSMAXDP fmaxdf3 {}
const double __builtin_vsx_xsmindp (double, double);
- XSMINDP smindf3 {}
+ XSMINDP fmindf3 {}
const double __builtin_vsx_xsrdpi (double);
XSRDPI vsx_xsrdpi {}
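diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index bf85baa5370..ae0dd98f0f9 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -158,6 +158,8 @@ (define_c_enum "unspec"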
UNSPEC_HASHCHK
UNSPEC_XXSPLTIDP_CONST
UNSPEC_XXSPLTIW_CONST
+ UNSPEC_FMAX
+ UNSPEC_FMIN
])
;;
@@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr"
DONE;
})
+
+(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN])
+
+(define_int_attr minmax_op [(UNSPEC_FMAX "max")
+ (UNSPEC_FMIN "min")])
+
+(define_insn "f<minmax_op><mode>3"
+ [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
+ (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
+ (match_operand:SFDF 2 "vsx_register_operand" "wa")]
+ FMINMAX))]
+ "TARGET_VSX && !flag_finite_math_only"
+ "xs<minmax_op>dp %x0,%x1,%x2"
+ [(set_attr "type" "fp")]
+)
+
(define_expand "mov<mode>cc"
[(set (match_operand:GPR 0 "gpc_reg_operand")
(if_then_else:GPR (match_operand 1 "comparison_operator")
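diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605-1.c b/gcc/testsuite/gcc.target/powerpc/pr103605-1.c
new file mode 100644
index 00000000000..923deec6a1e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103605-1.c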
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */
+
+#include "pr103605.h"
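diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605-2.c b/gcc/testsuite/gcc.target/powerpc/pr103605-2.c
new file mode 100644
index 00000000000..f50fe9468f5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103605-2.c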
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx -ffast-math" } */
+/* { dg-final { scan-assembler-times {\mxsmaxcdp\M} 3 { target has_arch_pwr9 } } } */
+/* { dg-final { scan-assembler-times {\mxsmincdp\M} 3 { target has_arch_pwr9 } } } */
+/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 { target { ! has_arch_pwr9 } } } } */
+/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 { target { ! has_arch_pwr9 } } } } */
+
+#include "pr103605.h"
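diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.h b/gcc/testsuite/gcc.target/powerpc/pr103605.h
new file mode 100644
index 00000000000..c99dfe6d7eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103605.h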
@@ -0,0 +1,31 @@
+#include <math.h>
+
+double test1 (double d0, double d1)
+{
+ return fmin (d0, d1);
+}
+
+float test2 (float d0, float d1)
+{
+ return fmin (d0, d1);
+}
+
+double test3 (double d0, double d1)
+{
+ return fmax (d0, d1);
+}
+
+float test4 (float d0, float d1)
+{
+ return fmax (d0, d1);
+}
+
+double test5 (double d0, double d1)
+{
+ return __builtin_vsx_xsmindp (d0, d1);
+}
+
+double test6 (double d0, double d1)
+{
+ return __builtin_vsx_xsmaxdp (d0, d1);
+}