[ver,2] rs6000, __builtin_set_fpscr_rn add retrun value

Message ID b39733dc4b96c263db0e75b69c293e2654e2b7e5.camel@us.ibm.com
State Accepted
Headers
Series [ver,2] rs6000, __builtin_set_fpscr_rn add retrun value |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Carl Love July 1, 2023, 12:58 a.m. UTC
  GCC maintainers:

Ver 2,  Went back thru the requirements and emails.  Not sure where I
came up with the requirement for an overloaded version with double
argument.  Removed the overloaded version with the double argument. 
Added the macro to announce if the __builtin_set_fpscr_rn returns a
void or a double with the FPSCR bits.  Updated the documentation file. 
Retested on Power 8 BE/LE, Power 9 BE/LE, Power 10 LE.  Redid the test
file.  Per request, the original test file functionality was not
changed.  Just changed the name from test_fpscr_rn_builtin.c to 
test_fpscr_rn_builtin_1.c.  Put new tests for the return values into a
new test file, test_fpscr_rn_builtin_2.c.

The GLibC team requested a builtin to replace the mffscrn and
mffscrniinline asm instructions in the GLibC code.  Previously there
was discussion on adding builtins for the mffscrn instructions.

https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html

In the end, it was felt that it would be to extend the existing
__builtin_set_fpscr_rn builtin to return a double instead of a void
type.  The desire is that we could have the functionality of the
mffscrn and mffscrni instructions on older ISAs.  The two instructions
were initially added in ISA 3.0.  The __builtin_set_fpscr_rn has the
needed functionality to set the RN field using the mffscrn and mffscrni
instructions if ISA 3.0 is supported or fall back to using logical
instructions to mask and set the bits for earlier ISAs.  The
instructions return the current value of the FPSCR fields DRN, VE, OE,
UE, ZE, XE, NI, RN bit positions then update the RN bit positions with
the new RN value provided.

The current __builtin_set_fpscr_rn builtin has a return type of void. 
So, changing the return type to double and returning the  FPSCR fields
DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
functionally equivalent of the mffscrn and mffscrni instructions.  Any
current uses of the builtin would just ignore the return value yet any
new uses could use the return value.  So the requirement is for the
change to the __builtin_set_fpscr_rn builtin to be backwardly
compatible and work for all ISAs.

The following patch changes the return type of the
 __builtin_set_fpscr_rn builtin from void to double.  The return value
is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE,
XE, NI, RN bit positions when the builtin is called.  The builtin then
updated the RN field with the new value provided as an argument to the
builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c to
check that the builtin returns the current value of the FPSCR fields
and then updates the RN field.

The GLibC team has reviewed the patch to make sure it met their needs
as a drop in replacement for the inline asm mffscr and mffscrni
statements in the GLibC code.  T

The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10
LE.

Please let me know if the patch is acceptable for mainline.  Thanks.

                               Carl 


----------------------------------
rs6000, __builtin_set_fpscr_rn add retrun value

Change the return value from void to double.  The return value consists of
the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions.  Add an
overloaded version which accepts a double argument.

The test powerpc/test_fpscr_rn_builtin.c is updated to add tests for the
double reterun value and the new double argument.

gcc/ChangeLog:
	* config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn): Update
	builtin definition return type.
	* config/rs6000-c.cc(rs6000_target_modify_macros): Add check, define
	__SET_FPSCR_RN_RETURNS_FPSCR__ macro.
	* config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New
	define_expand.
	(rs6000_update_fpscr_rn_field): New define_expand.
	(rs6000_set_fpscr_rn): Added	return argument.  Updated to use new
	rs6000_get_fpscr_fields and rs6000_update_fpscr_rn_field define
	 _expands.
	* doc/extend.texi (__builtin_set_fpscr_rn): Update description for
	the return value and new double argument.  Add descripton for
	__SET_FPSCR_RN_RETURNS_FPSCR__ macro.

gcc/testsuite/ChangeLog:
	gcc.target/powerpc/test_fpscr_rn_builtin.c: Renamed to
	test_fpscr_rn_builtin_1.c.  Added comment.
	gcc.target/powerpc/test_fpscr_rn_builtin_2.c: New test for the
	return value of __builtin_set_fpscr_rn builtin.
---
 gcc/config/rs6000/rs6000-builtins.def         |   2 +-
 gcc/config/rs6000/rs6000-c.cc                 |   4 +
 gcc/config/rs6000/rs6000.md                   |  87 +++++++---
 gcc/doc/extend.texi                           |  26 ++-
 ...rn_builtin.c => test_fpscr_rn_builtin_1.c} |   6 +
 .../powerpc/test_fpscr_rn_builtin_2.c         | 153 ++++++++++++++++++
 6 files changed, 246 insertions(+), 32 deletions(-)
 rename gcc/testsuite/gcc.target/powerpc/{test_fpscr_rn_builtin.c => test_fpscr_rn_builtin_1.c} (92%)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
  

Comments

Peter Bergner July 6, 2023, 10:54 p.m. UTC | #1
On 6/30/23 7:58 PM, Carl Love via Gcc-patches wrote:
> rs6000, __builtin_set_fpscr_rn add retrun value

s/retrun/return/

Maybe better written as:

rs6000: Add return value to __builtin_set_fpscr_rn


> Change the return value from void to double.  The return value consists of
> the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions.  Add an
> overloaded version which accepts a double argument.

You're not adding an overloaded version anymore, so I think you can just
remove the last sentence.



> The test powerpc/test_fpscr_rn_builtin.c is updated to add tests for the
> double reterun value and the new double argument.

s/reterun/return/   ...and there is no double argument anymore, so that
part can be removed.



> 	* config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New
> 	define_expand.

Too many '('.



> 	(rs6000_set_fpscr_rn): Added	return argument.  Updated to use new

Looks like a <tab> after Added instead of a space.


> 	rs6000_get_fpscr_fields and rs6000_update_fpscr_rn_field define
> 	 _expands.

Don't split define_expand across two lines.



> 	* doc/extend.texi (__builtin_set_fpscr_rn): Update description for
> 	the return value and new double argument.  Add descripton for
> 	__SET_FPSCR_RN_RETURNS_FPSCR__ macro.

s/descripton/description/






> +  /* Tell the user the __builtin_set_fpscr_rn now returns the FPSCR fields
> +     in a double.  Originally the builtin returned void.  */

Either:
  1) s/Tell the user the __builtin_set_fpscr_rn/Tell the user __builtin_set_fpscr_rn/ 
  2) s/the __builtin_set_fpscr_rn now/the __builtin_set_fpscr_rn built-in now/ 


> +  if ((flags & OPTION_MASK_SOFT_FLOAT) == 0)
> +  rs6000_define_or_undefine_macro (define_p, "__SET_FPSCR_RN_RETURNS_FPSCR__");

This doesn't look like it's indented correctly.




> +(define_expand "rs6000_get_fpscr_fields"
> + [(match_operand:DF 0 "gpc_reg_operand")]
> +  "TARGET_HARD_FLOAT"
> +{
> +  /* Extract fields bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
> +     RN) from the FPSCR and return them.  */
> +  rtx tmp_df = gen_reg_rtx (DFmode);
> +  rtx tmp_di = gen_reg_rtx (DImode);
> +
> +  emit_insn (gen_rs6000_mffs (tmp_df));
> +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (0x00000007000000FFULL)));
> +  rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
> +  emit_move_insn (operands[0], tmp_rtn);
> +  DONE;
> +})

This doesn't look correct.  You first set tmp_di to a new reg rtx but then
throw that away with the return value of simplify_gen_subreg().  I'm guessing
you want that tmp_di as a gen_reg_rtx for the destination of the gen_anddi3, so
you probably want a different rtx for the subreg that feeds the gen_anddi3.



> +(define_expand "rs6000_update_fpscr_rn_field"
> + [(match_operand:DI 0 "gpc_reg_operand")]
> +  "TARGET_HARD_FLOAT"
> +{
> +  /* Insert the new RN value from operands[0] into FPSCR bit [62:63].  */
> +  rtx tmp_di = gen_reg_rtx (DImode);
> +  rtx tmp_df = gen_reg_rtx (DFmode);
> +
> +  emit_insn (gen_rs6000_mffs (tmp_df));
> +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);

Ditto.




> +The @code{__builtin_set_fpscr_rn} builtin allows changing both of the floating
> +point rounding mode bits and returning the various FPSCR fields before the RN
> +field is updated.  The builtin returns a double consisting of the initial value
> +of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, and RN bit positions with all
> +other bits set to zero. The builtin argument is a 2-bit value for the new RN
> +field value.  The argument can either be an @code{const int} or stored in a
> +variable.  Earlier versions of @code{__builtin_set_fpscr_rn} returned void.  A
> +@code{__SET_FPSCR_RN_RETURNS_FPSCR__} macro has been added.  If defined, then
> +the @code{__builtin_set_fpscr_rn} builtin returns the FPSCR fields.  If not
> +defined, the @code{__builtin_set_fpscr_rn} does not return a vaule.  If the
> +@option{-msoft-float} option is used, the @code{__builtin_set_fpscr_rn} builtin
> +will not return a value.

Multiple occurrences of "builtin" that should be spelled "built-in" (not in the
built-in function name itself though).



> +/* Originally the __builtin_set_fpscr_rn builtin was defined to return
> +   void.  It was later extended to return a double with the various
> +   FPSCR bits.  The extended builtin is inteded to be a drop in replacement
> +   for the original version.  This test is for the original version of the
> +   builtin and should work exactly as before.  */

Ditto.




> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> @@ -0,0 +1,153 @@
> +/* { dg-do run { target { powerpc*-*-* } } } */

powerpc*-*-* is the default for this test directory, so you can drop that,
but you need to disable this test for soft-float systems, so you probably want:

  /* { dg-do run { target powerpc_fprs } } */

I know you didn't write it, but test_fpscr_rn_builtin_1.c should be changed to
use the same dg-do run test above as well.


Peter
  
Peter Bergner July 6, 2023, 11 p.m. UTC | #2
On 7/6/23 5:54 PM, Peter Bergner wrote:
> On 6/30/23 7:58 PM, Carl Love via Gcc-patches wrote:
>> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
>> @@ -0,0 +1,153 @@
>> +/* { dg-do run { target { powerpc*-*-* } } } */
> 
> powerpc*-*-* is the default for this test directory, so you can drop that,
> but you need to disable this test for soft-float systems, so you probably want:
> 
>   /* { dg-do run { target powerpc_fprs } } */

We actually want something like powerpc_fprs_hw, but that doesn't exist.

Peter
  
Kewen.Lin July 7, 2023, 4:06 a.m. UTC | #3
Hi Carl,

Some more minor comments are inline below on top of Peter's insightful
review comments.

on 2023/7/1 08:58, Carl Love wrote:
> 
> GCC maintainers:
> 
> Ver 2,  Went back thru the requirements and emails.  Not sure where I
> came up with the requirement for an overloaded version with double
> argument.  Removed the overloaded version with the double argument. 
> Added the macro to announce if the __builtin_set_fpscr_rn returns a
> void or a double with the FPSCR bits.  Updated the documentation file. 
> Retested on Power 8 BE/LE, Power 9 BE/LE, Power 10 LE.  Redid the test
> file.  Per request, the original test file functionality was not
> changed.  Just changed the name from test_fpscr_rn_builtin.c to 
> test_fpscr_rn_builtin_1.c.  Put new tests for the return values into a
> new test file, test_fpscr_rn_builtin_2.c.
> 
> The GLibC team requested a builtin to replace the mffscrn and
> mffscrniinline asm instructions in the GLibC code.  Previously there
> was discussion on adding builtins for the mffscrn instructions.
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html
> 
> In the end, it was felt that it would be to extend the existing
> __builtin_set_fpscr_rn builtin to return a double instead of a void
> type.  The desire is that we could have the functionality of the
> mffscrn and mffscrni instructions on older ISAs.  The two instructions
> were initially added in ISA 3.0.  The __builtin_set_fpscr_rn has the
> needed functionality to set the RN field using the mffscrn and mffscrni
> instructions if ISA 3.0 is supported or fall back to using logical
> instructions to mask and set the bits for earlier ISAs.  The
> instructions return the current value of the FPSCR fields DRN, VE, OE,
> UE, ZE, XE, NI, RN bit positions then update the RN bit positions with
> the new RN value provided.
> 
> The current __builtin_set_fpscr_rn builtin has a return type of void. 
> So, changing the return type to double and returning the  FPSCR fields
> DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
> functionally equivalent of the mffscrn and mffscrni instructions.  Any
> current uses of the builtin would just ignore the return value yet any
> new uses could use the return value.  So the requirement is for the
> change to the __builtin_set_fpscr_rn builtin to be backwardly
> compatible and work for all ISAs.
> 
> The following patch changes the return type of the
>  __builtin_set_fpscr_rn builtin from void to double.  The return value
> is the current value of the various FPSCR fields DRN, VE, OE, UE, ZE,
> XE, NI, RN bit positions when the builtin is called.  The builtin then
> updated the RN field with the new value provided as an argument to the
> builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c to
> check that the builtin returns the current value of the FPSCR fields
> and then updates the RN field.
> 
> The GLibC team has reviewed the patch to make sure it met their needs
> as a drop in replacement for the inline asm mffscr and mffscrni
> statements in the GLibC code.  T
> 
> The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power 10
> LE.
> 
> Please let me know if the patch is acceptable for mainline.  Thanks.
> 
>                                Carl 
> 
> 
> ----------------------------------
> rs6000, __builtin_set_fpscr_rn add retrun value
> 
> Change the return value from void to double.  The return value consists of
> the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit positions.  Add an
> overloaded version which accepts a double argument.
> 
> The test powerpc/test_fpscr_rn_builtin.c is updated to add tests for the
> double reterun value and the new double argument.
> 
> gcc/ChangeLog:
> 	* config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn): Update
> 	builtin definition return type.
> 	* config/rs6000-c.cc(rs6000_target_modify_macros): Add check, define
> 	__SET_FPSCR_RN_RETURNS_FPSCR__ macro.
> 	* config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New
> 	define_expand.
> 	(rs6000_update_fpscr_rn_field): New define_expand.
> 	(rs6000_set_fpscr_rn): Added	return argument.  Updated to use new
> 	rs6000_get_fpscr_fields and rs6000_update_fpscr_rn_field define
> 	 _expands.
> 	* doc/extend.texi (__builtin_set_fpscr_rn): Update description for
> 	the return value and new double argument.  Add descripton for
> 	__SET_FPSCR_RN_RETURNS_FPSCR__ macro.
> 
> gcc/testsuite/ChangeLog:
> 	gcc.target/powerpc/test_fpscr_rn_builtin.c: Renamed to
> 	test_fpscr_rn_builtin_1.c.  Added comment.
> 	gcc.target/powerpc/test_fpscr_rn_builtin_2.c: New test for the
> 	return value of __builtin_set_fpscr_rn builtin.
> ---
>  gcc/config/rs6000/rs6000-builtins.def         |   2 +-
>  gcc/config/rs6000/rs6000-c.cc                 |   4 +
>  gcc/config/rs6000/rs6000.md                   |  87 +++++++---
>  gcc/doc/extend.texi                           |  26 ++-
>  ...rn_builtin.c => test_fpscr_rn_builtin_1.c} |   6 +
>  .../powerpc/test_fpscr_rn_builtin_2.c         | 153 ++++++++++++++++++
>  6 files changed, 246 insertions(+), 32 deletions(-)
>  rename gcc/testsuite/gcc.target/powerpc/{test_fpscr_rn_builtin.c => test_fpscr_rn_builtin_1.c} (92%)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> 
> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
> index 289a37998b1..28788b69c7d 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -237,7 +237,7 @@
>    const __ibm128 __builtin_pack_ibm128 (double, double);
>      PACK_IF packif {ibm128}
>  
> -  void __builtin_set_fpscr_rn (const int[0,3]);
> +  double __builtin_set_fpscr_rn (const int[0,3]);
>      SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
>  
>    const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
> diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
> index 8555174d36e..8373bb66919 100644
> --- a/gcc/config/rs6000/rs6000-c.cc
> +++ b/gcc/config/rs6000/rs6000-c.cc
> @@ -604,6 +604,10 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags)
>    /* Tell the user -mrop-protect is in play.  */
>    if (rs6000_rop_protect)
>      rs6000_define_or_undefine_macro (define_p, "__ROP_PROTECT__");
> +  /* Tell the user the __builtin_set_fpscr_rn now returns the FPSCR fields
> +     in a double.  Originally the builtin returned void.  */
> +  if ((flags & OPTION_MASK_SOFT_FLOAT) == 0)
> +  rs6000_define_or_undefine_macro (define_p, "__SET_FPSCR_RN_RETURNS_FPSCR__");
>  }
>  
>  void
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index b0db8ae508d..1b77a13c8a1 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -6440,8 +6440,51 @@
>    "mffscdrn %0,%1"
>    [(set_attr "type" "fp")])
>  
> +
> +(define_expand "rs6000_get_fpscr_fields"
> + [(match_operand:DF 0 "gpc_reg_operand")]
> +  "TARGET_HARD_FLOAT"
> +{
> +  /* Extract fields bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
> +     RN) from the FPSCR and return them.  */
> +  rtx tmp_df = gen_reg_rtx (DFmode);
> +  rtx tmp_di = gen_reg_rtx (DImode);
> +
> +  emit_insn (gen_rs6000_mffs (tmp_df));
> +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (0x00000007000000FFULL)));
> +  rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
> +  emit_move_insn (operands[0], tmp_rtn);
> +  DONE;
> +})
> +
> +(define_expand "rs6000_update_fpscr_rn_field"
> + [(match_operand:DI 0 "gpc_reg_operand")]
> +  "TARGET_HARD_FLOAT"
> +{
> +  /* Insert the new RN value from operands[0] into FPSCR bit [62:63].  */
> +  rtx tmp_di = gen_reg_rtx (DImode);
> +  rtx tmp_df = gen_reg_rtx (DFmode);
> +
> +  emit_insn (gen_rs6000_mffs (tmp_df));
> +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
> +  emit_insn (gen_iordi3 (tmp_di, tmp_di, operands[0]));
> +
> +  /* Need to write to field k=15.  The fields are [0:15].  Hence with
> +     L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is an
> +     8-bit field[0:7]. Need to set the bit that corresponds to the
> +     value of i that you want [0:7].  */
> +
> +  tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
> +  emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
> +  DONE;
> +})
> +
>  (define_expand "rs6000_set_fpscr_rn"
> - [(match_operand:SI 0 "reg_or_cint_operand")]
> + [(set (match_operand:DF 0 "gpc_reg_operand")
> +       (unspec_volatile:DF [(match_operand:SI 1 "reg_or_cint_operand")]
> +       UNSPECV_MFFSCDRN))]

I don't think we need this pattern to be complete, since it's always
expanded with "DONE", the pattern is useless, so it can be simpler like:

   [(use (match_operand:DF 0 "gpc_reg_operand"))
    (use (match_operand:SI 1 "reg_or_cint_operand"))]


>    "TARGET_HARD_FLOAT"
>  {
>    rtx tmp_df = gen_reg_rtx (DFmode);
> @@ -6450,25 +6493,34 @@
>       new rounding mode bits from operands[0][62:63] into FPSCR[62:63].  */
>    if (TARGET_P9_MISC)
>      {
> -      if (const_0_to_3_operand (operands[0], VOIDmode))
> -	emit_insn (gen_rs6000_mffscrni (tmp_df, operands[0]));
> +      if (const_0_to_3_operand (operands[1], VOIDmode))
> +	emit_insn (gen_rs6000_mffscrni (tmp_df, operands[1]));
>        else
>  	{
> -	  rtx op0 = convert_to_mode (DImode, operands[0], false);
> -	  rtx src_df = simplify_gen_subreg (DFmode, op0, DImode, 0);
> +	  rtx op1 = convert_to_mode (DImode, operands[1], false);
> +	  rtx src_df = simplify_gen_subreg (DFmode, op1, DImode, 0);
>  	  emit_insn (gen_rs6000_mffscrn (tmp_df, src_df));
>  	}
> -      DONE;
> +       emit_move_insn (operands[0], tmp_df);
> +       DONE;
>      }
>  
> -  if (CONST_INT_P (operands[0]))
> +  /* Emulate the behavior of the mffscrni, mffscrn instructions for earlier
> +     ISAs.  Return bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
> +     RN) from the FPSCR.  Set the RN field based on the value in operands[1].
> +  */
> +
> +  /* Get the current FPSCR fields to return.  */
> +  emit_insn (gen_rs6000_get_fpscr_fields (operands[0]));

IMHO the above define_expand rs6000_get_fpscr_fields isn't needed, just inline
it here, since it doesn't get reused somewhere else, and more importantly the
read fpscr in DImode (tmp_di from simplify_gen_subreg on tmp_df) can be shared 
with the uses in rs6000_update_fpscr_rn_field (so I'd also suggest not factoring
out the previous RN bit handlings into rs6000_update_fpscr_rn_field).

> +
> +  if (CONST_INT_P (operands[1]))
>      {
> -      if ((INTVAL (operands[0]) & 0x1) == 0x1)
> +      if ((INTVAL (operands[1]) & 0x1) == 0x1)
>  	emit_insn (gen_rs6000_mtfsb1 (GEN_INT (31)));
>        else
>  	emit_insn (gen_rs6000_mtfsb0 (GEN_INT (31)));
>  
> -      if ((INTVAL (operands[0]) & 0x2) == 0x2)
> +      if ((INTVAL (operands[1]) & 0x2) == 0x2)
>  	emit_insn (gen_rs6000_mtfsb1 (GEN_INT (30)));
>        else
>  	emit_insn (gen_rs6000_mtfsb0 (GEN_INT (30)));
> @@ -6476,24 +6528,13 @@
>    else
>      {
>        rtx tmp_rn = gen_reg_rtx (DImode);
> -      rtx tmp_di = gen_reg_rtx (DImode);
>  
>        /* Extract new RN mode from operand.  */
> -      rtx op0 = convert_to_mode (DImode, operands[0], false);
> -      emit_insn (gen_anddi3 (tmp_rn, op0, GEN_INT (3)));
> +      rtx op1 = convert_to_mode (DImode, operands[1], false);
> +      emit_insn (gen_anddi3 (tmp_rn, op1, GEN_INT (3)));
>  
>        /* Insert new RN mode into FSCPR.  */
> -      emit_insn (gen_rs6000_mffs (tmp_df));
> -      tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> -      emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
> -      emit_insn (gen_iordi3 (tmp_di, tmp_di, tmp_rn));
> -
> -      /* Need to write to field k=15.  The fields are [0:15].  Hence with
> -	 L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is an
> -	 8-bit field[0:7]. Need to set the bit that corresponds to the
> -	 value of i that you want [0:7].  */
> -      tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
> -      emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
> +      emit_insn (gen_rs6000_update_fpscr_rn_field (tmp_rn));
>      }
>    DONE;
>  })
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index cdbd4b34a35..fee35ac40ec 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -18188,7 +18188,6 @@ double __builtin_mffs (void);
>  void __builtin_mtfsf (const int, double);
>  void __builtin_mtfsb0 (const int);
>  void __builtin_mtfsb1 (const int);
> -void __builtin_set_fpscr_rn (int);

Just update void to double here, instead of the below seperated smallexample?
As the below documentation is to describe these above functions one by
one, just moving one down looks a bit inconsistent.

>  @end smallexample
>  
>  The @code{__builtin_ppc_get_timebase} and @code{__builtin_ppc_mftb}
> @@ -18209,13 +18208,24 @@ values to selected fields of the FPSCR.  The
>  as an argument.  The valid bit range is between 0 and 31.  The builtins map to
>  the @code{mtfsb0} and @code{mtfsb1} instructions which take the argument and
>  add 32.  Hence these instructions only modify the FPSCR[32:63] bits by
> -changing the specified bit to a zero or one respectively.  The
> -@code{__builtin_set_fpscr_rn} builtin allows changing both of the floating
> -point rounding mode bits.  The argument is a 2-bit value.  The argument can
> -either be a @code{const int} or stored in a variable. The builtin uses
> -the ISA 3.0
> -instruction @code{mffscrn} if available, otherwise it reads the FPSCR, masks
> -the current rounding mode bits out and OR's in the new value.
> +changing the specified bit to a zero or one respectively.
> +
> +@smallexample
> +double __builtin_set_fpscr_rn (int);
> +@end smallexample
> +
> +The @code{__builtin_set_fpscr_rn} builtin allows changing both of the floating
> +point rounding mode bits and returning the various FPSCR fields before the RN
> +field is updated.  The builtin returns a double consisting of the initial value
> +of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, and RN bit positions with all
> +other bits set to zero. The builtin argument is a 2-bit value for the new RN
> +field value.  The argument can either be an @code{const int} or stored in a
> +variable.  Earlier versions of @code{__builtin_set_fpscr_rn} returned void.  A
> +@code{__SET_FPSCR_RN_RETURNS_FPSCR__} macro has been added.  If defined, then
> +the @code{__builtin_set_fpscr_rn} builtin returns the FPSCR fields.  If not
> +defined, the @code{__builtin_set_fpscr_rn} does not return a vaule.  If the
> +@option{-msoft-float} option is used, the @code{__builtin_set_fpscr_rn} builtin
> +will not return a value.

__builtin_set_fpscr_rn isn't available for -msoft-float, your patch doesn't change
it so IMHO you don't need to note this.

>  
>  @node Basic PowerPC Built-in Functions Available on ISA 2.05
>  @subsubsection Basic PowerPC Built-in Functions Available on ISA 2.05
> diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
> similarity index 92%
> rename from gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c
> rename to gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
> index 04707ad8a56..d917ee19656 100644
> --- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c
> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
> @@ -1,6 +1,12 @@
>  /* { dg-do run { target { powerpc*-*-* } } } */
>  /* { dg-options "-O2 -std=c99" } */
>  
> +/* Originally the __builtin_set_fpscr_rn builtin was defined to return
> +   void.  It was later extended to return a double with the various
> +   FPSCR bits.  The extended builtin is inteded to be a drop in replacement
> +   for the original version.  This test is for the original version of the
> +   builtin and should work exactly as before.  */
> +
>  #ifdef DEBUG
>  #include <stdio.h>
>  #endif
> diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> new file mode 100644
> index 00000000000..62fca67c948
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> @@ -0,0 +1,153 @@
> +/* { dg-do run { target { powerpc*-*-* } } } */
> +/* { dg-options "-O2 -std=c99" } */

Nit: The previous existing test_fpscr_rn_builtin.c introduced -std=c99 but
I think it is useless for this case, so you can remove it if you like.

> +
> +/* The __builtin_set_fpscr_rn builtin was originally defined to return
> +   void.  It was to return a double with the various FPSCR bits.  This test
> +   verifies the new version returns the desired FPSCR bits.  */
> +
> +#ifdef DEBUG
> +#include <stdio.h>
> +#endif
> +
> +#define RN_MASK  0x3LL             /* RN field mask */
> +#define FIELD_MASK 0x00000007000000FFULL
> +
> +union blah {
> +  double d;
> +  unsigned long long ll;
> +} conv_val;
> +
> +void abort (void);
> +double __attribute__ ((noipa)) wrap_set_fpscr_rn (int val)
> +{
> +  return __builtin_set_fpscr_rn (val);
> +}
> +

Nit: I noticed you want this case to abort if __SET_FPSCR_RN_RETURNS_FPSCR__
doesn't get defined, but actually the compilation should fail
for compiling this function wrap_set_fpscr_rn already, so
the below __SET_FPSCR_RN_RETURNS_FPSCR__ check and abort seems 
useless.  That said, the explicit use of return value of this built-in
__builtin_set_fpscr_rn already implicitly checks it.

BR,
Kewen
  
Kewen.Lin July 7, 2023, 5:08 a.m. UTC | #4
on 2023/7/7 07:00, Peter Bergner wrote:
> On 7/6/23 5:54 PM, Peter Bergner wrote:
>> On 6/30/23 7:58 PM, Carl Love via Gcc-patches wrote:
>>> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
>>> @@ -0,0 +1,153 @@
>>> +/* { dg-do run { target { powerpc*-*-* } } } */
>>
>> powerpc*-*-* is the default for this test directory, so you can drop that,
>> but you need to disable this test for soft-float systems, so you probably want:
>>
>>   /* { dg-do run { target powerpc_fprs } } */
> 
> We actually want something like powerpc_fprs_hw, but that doesn't exist.
> 

Yeah, good point!  I noticed that we have a few test cases which need to
check soft-float env as well but they don't, I didn't find any related
issues have been reported, so I would assume that there are very few
actual testings on this area.  Based on this, I'm not sure if it's worthy
to add a new effective target for it.  Personally I'm happy with just using
powerpc_fprs here to keep it simple. :)

BR,
Kewen
  
Peter Bergner July 7, 2023, 2:23 p.m. UTC | #5
On 7/7/23 12:08 AM, Kewen.Lin wrote:
> on 2023/7/7 07:00, Peter Bergner wrote:
>> On 7/6/23 5:54 PM, Peter Bergner wrote:
>>> On 6/30/23 7:58 PM, Carl Love via Gcc-patches wrote:
>>>> +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
>>>> @@ -0,0 +1,153 @@
>>>> +/* { dg-do run { target { powerpc*-*-* } } } */
>>>
>>> powerpc*-*-* is the default for this test directory, so you can drop that,
>>> but you need to disable this test for soft-float systems, so you probably want:
>>>
>>>   /* { dg-do run { target powerpc_fprs } } */
>>
>> We actually want something like powerpc_fprs_hw, but that doesn't exist.
>>
> 
> Yeah, good point!  I noticed that we have a few test cases which need to
> check soft-float env as well but they don't, I didn't find any related
> issues have been reported, so I would assume that there are very few
> actual testings on this area.  Based on this, I'm not sure if it's worthy
> to add a new effective target for it.  Personally I'm happy with just using
> powerpc_fprs here to keep it simple. :)

I think powerpc_fprs_hw can be added later by someone if they care.
Using powerpc_fprs is an improvement over powerpc*-*-*, since it will
reduce some FAILs, just not all of them a powerpc_fprs_hw would.
I doubt many people are running the testsuite on real ppc hardware
that doesn't have an FP unit.

Peter
  
Carl Love July 10, 2023, 7:17 p.m. UTC | #6
On Thu, 2023-07-06 at 17:54 -0500, Peter Bergner wrote:
> On 6/30/23 7:58 PM, Carl Love via Gcc-patches wrote:
> > rs6000, __builtin_set_fpscr_rn add retrun value
> 
> s/retrun/return/
> 
> Maybe better written as:
> 
> rs6000: Add return value to __builtin_set_fpscr_rn

Changed subject, fixed misspelling.
> 
> 
> > Change the return value from void to double.  The return value
> > consists of
> > the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit
> > positions.  Add an
> > overloaded version which accepts a double argument.
> 
> You're not adding an overloaded version anymore, so I think you can
> just
> remove the last sentence.

Yup, didn't get that removed when removing the overloaded instance. 
fixed.

> 
> 
> 
> > The test powerpc/test_fpscr_rn_builtin.c is updated to add tests
> > for the
> > double reterun value and the new double argument.
> 
> s/reterun/return/   ...and there is no double argument anymore, so
> that
> part can be removed.

Fixed.  Note, the new return value tests were moved to new test file.
> 
> 
> 
> > 	* config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New
> > 	define_expand.
> 
> Too many '('.

fixed.

> 
> 
> 
> > 	(rs6000_set_fpscr_rn): Added	return argument.  Updated
> > to use new
> 
> Looks like a <tab> after Added instead of a space.
> 
> 
> > 	rs6000_get_fpscr_fields and rs6000_update_fpscr_rn_field define
> > 	 _expands.
> 
> Don't split define_expand across two lines.

Fixed.

> 
> 
> 
> > 	* doc/extend.texi (__builtin_set_fpscr_rn): Update description
> > for
> > 	the return value and new double argument.  Add descripton for
> > 	__SET_FPSCR_RN_RETURNS_FPSCR__ macro.
> 
> s/descripton/description/

Fixed.

> 
> 
> 
> 
> 
> 
> > +  /* Tell the user the __builtin_set_fpscr_rn now returns the
> > FPSCR fields
> > +     in a double.  Originally the builtin returned void.  */
> 
> Either:
>   1) s/Tell the user the __builtin_set_fpscr_rn/Tell the user
> __builtin_set_fpscr_rn/ 
>   2) s/the __builtin_set_fpscr_rn now/the __builtin_set_fpscr_rn
> built-in now/ 
> 
> 
> > +  if ((flags & OPTION_MASK_SOFT_FLOAT) == 0)
> > +  rs6000_define_or_undefine_macro (define_p,
> > "__SET_FPSCR_RN_RETURNS_FPSCR__");
> 
> This doesn't look like it's indented correctly.
> 
> 

Fixed indentation.

> 
> 
> > +(define_expand "rs6000_get_fpscr_fields"
> > + [(match_operand:DF 0 "gpc_reg_operand")]
> > +  "TARGET_HARD_FLOAT"
> > +{
> > +  /* Extract fields bits 29:31 (DRN) and bits 56:63 (VE, OE, UE,
> > ZE, XE, NI,
> > +     RN) from the FPSCR and return them.  */
> > +  rtx tmp_df = gen_reg_rtx (DFmode);
> > +  rtx tmp_di = gen_reg_rtx (DImode);
> > +
> > +  emit_insn (gen_rs6000_mffs (tmp_df));
> > +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> > +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT
> > (0x00000007000000FFULL)));
> > +  rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
> > +  emit_move_insn (operands[0], tmp_rtn);
> > +  DONE;
> > +})
> 
> This doesn't look correct.  You first set tmp_di to a new reg rtx but
> then
> throw that away with the return value of simplify_gen_subreg().  I'm
> guessing
> you want that tmp_di as a gen_reg_rtx for the destination of the
> gen_anddi3, so
> you probably want a different rtx for the subreg that feeds the
> gen_anddi3.

OK, fixed the use of the tmp values.  Note the define_expand was
inlined into define_expand "rs6000_set_fpscr_rn per comments from
Kewen.  Inlining allows the reuse some of the tmp values.

> 
> 
> 
> > +(define_expand "rs6000_update_fpscr_rn_field"
> > + [(match_operand:DI 0 "gpc_reg_operand")]
> > +  "TARGET_HARD_FLOAT"
> > +{
> > +  /* Insert the new RN value from operands[0] into FPSCR bit
> > [62:63].  */
> > +  rtx tmp_di = gen_reg_rtx (DImode);
> > +  rtx tmp_df = gen_reg_rtx (DFmode);
> > +
> > +  emit_insn (gen_rs6000_mffs (tmp_df));
> > +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> 
> Ditto.

Fixed.

> 
> 
> 
> 
> > +The @code{__builtin_set_fpscr_rn} builtin allows changing both of
> > the floating
> > +point rounding mode bits and returning the various FPSCR fields
> > before the RN
> > +field is updated.  The builtin returns a double consisting of the
> > initial value
> > +of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, and RN bit
> > positions with all
> > +other bits set to zero. The builtin argument is a 2-bit value for
> > the new RN
> > +field value.  The argument can either be an @code{const int} or
> > stored in a
> > +variable.  Earlier versions of @code{__builtin_set_fpscr_rn}
> > returned void.  A
> > +@code{__SET_FPSCR_RN_RETURNS_FPSCR__} macro has been added.  If
> > defined, then
> > +the @code{__builtin_set_fpscr_rn} builtin returns the FPSCR
> > fields.  If not
> > +defined, the @code{__builtin_set_fpscr_rn} does not return a
> > vaule.  If the
> > +@option{-msoft-float} option is used, the
> > @code{__builtin_set_fpscr_rn} builtin
> > +will not return a value.
> 
> Multiple occurrences of "builtin" that should be spelled "built-in"
> (not in the
> built-in function name itself though).

Went thru the patch to fix the spelling of built-in.

> 
> 
> 
> > +/* Originally the __builtin_set_fpscr_rn builtin was defined to
> > return
> > +   void.  It was later extended to return a double with the
> > various
> > +   FPSCR bits.  The extended builtin is inteded to be a drop in
> > replacement
> > +   for the original version.  This test is for the original
> > version of the
> > +   builtin and should work exactly as before.  */
> 
> Ditto.

Fixed.

> 
> 
> 
> 
> > +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> > @@ -0,0 +1,153 @@
> > +/* { dg-do run { target { powerpc*-*-* } } } */
> 
> powerpc*-*-* is the default for this test directory, so you can drop
> that,
> but you need to disable this test for soft-float systems, so you
> probably want:
> 
>   /* { dg-do run { target powerpc_fprs } } */
> 
> I know you didn't write it, but test_fpscr_rn_builtin_1.c should be
> changed to
> use the same dg-do run test above as well.

Fixed in both test files.

                                        Carl
  
Carl Love July 10, 2023, 7:18 p.m. UTC | #7
On Fri, 2023-07-07 at 12:06 +0800, Kewen.Lin wrote:
> Hi Carl,
> 
> Some more minor comments are inline below on top of Peter's
> insightful
> review comments.
> 
> on 2023/7/1 08:58, Carl Love wrote:
> > GCC maintainers:
> > 
> > Ver 2,  Went back thru the requirements and emails.  Not sure where
> > I
> > came up with the requirement for an overloaded version with double
> > argument.  Removed the overloaded version with the double
> > argument. 
> > Added the macro to announce if the __builtin_set_fpscr_rn returns a
> > void or a double with the FPSCR bits.  Updated the documentation
> > file. 
> > Retested on Power 8 BE/LE, Power 9 BE/LE, Power 10 LE.  Redid the
> > test
> > file.  Per request, the original test file functionality was not
> > changed.  Just changed the name from test_fpscr_rn_builtin.c to 
> > test_fpscr_rn_builtin_1.c.  Put new tests for the return values
> > into a
> > new test file, test_fpscr_rn_builtin_2.c.
> > 
> > The GLibC team requested a builtin to replace the mffscrn and
> > mffscrniinline asm instructions in the GLibC code.  Previously
> > there
> > was discussion on adding builtins for the mffscrn instructions.
> > 
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html
> > 
> > In the end, it was felt that it would be to extend the existing
> > __builtin_set_fpscr_rn builtin to return a double instead of a void
> > type.  The desire is that we could have the functionality of the
> > mffscrn and mffscrni instructions on older ISAs.  The two
> > instructions
> > were initially added in ISA 3.0.  The __builtin_set_fpscr_rn has
> > the
> > needed functionality to set the RN field using the mffscrn and
> > mffscrni
> > instructions if ISA 3.0 is supported or fall back to using logical
> > instructions to mask and set the bits for earlier ISAs.  The
> > instructions return the current value of the FPSCR fields DRN, VE,
> > OE,
> > UE, ZE, XE, NI, RN bit positions then update the RN bit positions
> > with
> > the new RN value provided.
> > 
> > The current __builtin_set_fpscr_rn builtin has a return type of
> > void. 
> > So, changing the return type to double and returning the  FPSCR
> > fields
> > DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
> > functionally equivalent of the mffscrn and mffscrni
> > instructions.  Any
> > current uses of the builtin would just ignore the return value yet
> > any
> > new uses could use the return value.  So the requirement is for the
> > change to the __builtin_set_fpscr_rn builtin to be backwardly
> > compatible and work for all ISAs.
> > 
> > The following patch changes the return type of the
> >  __builtin_set_fpscr_rn builtin from void to double.  The return
> > value
> > is the current value of the various FPSCR fields DRN, VE, OE, UE,
> > ZE,
> > XE, NI, RN bit positions when the builtin is called.  The builtin
> > then
> > updated the RN field with the new value provided as an argument to
> > the
> > builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c
> > to
> > check that the builtin returns the current value of the FPSCR
> > fields
> > and then updates the RN field.
> > 
> > The GLibC team has reviewed the patch to make sure it met their
> > needs
> > as a drop in replacement for the inline asm mffscr and mffscrni
> > statements in the GLibC code.  T
> > 
> > The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power
> > 10
> > LE.
> > 
> > Please let me know if the patch is acceptable for
> > mainline.  Thanks.
> > 
> >                                Carl 
> > 
> > 
> > ----------------------------------
> > rs6000, __builtin_set_fpscr_rn add retrun value
> > 
> > Change the return value from void to double.  The return value
> > consists of
> > the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit
> > positions.  Add an
> > overloaded version which accepts a double argument.
> > 
> > The test powerpc/test_fpscr_rn_builtin.c is updated to add tests
> > for the
> > double reterun value and the new double argument.
> > 
> > gcc/ChangeLog:
> > 	* config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn):
> > Update
> > 	builtin definition return type.
> > 	* config/rs6000-c.cc(rs6000_target_modify_macros): Add check,
> > define
> > 	__SET_FPSCR_RN_RETURNS_FPSCR__ macro.
> > 	* config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New
> > 	define_expand.
> > 	(rs6000_update_fpscr_rn_field): New define_expand.
> > 	(rs6000_set_fpscr_rn): Added	return argument.  Updated
> > to use new
> > 	rs6000_get_fpscr_fields and rs6000_update_fpscr_rn_field define
> > 	 _expands.
> > 	* doc/extend.texi (__builtin_set_fpscr_rn): Update description
> > for
> > 	the return value and new double argument.  Add descripton for
> > 	__SET_FPSCR_RN_RETURNS_FPSCR__ macro.
> > 
> > gcc/testsuite/ChangeLog:
> > 	gcc.target/powerpc/test_fpscr_rn_builtin.c: Renamed to
> > 	test_fpscr_rn_builtin_1.c.  Added comment.
> > 	gcc.target/powerpc/test_fpscr_rn_builtin_2.c: New test for the
> > 	return value of __builtin_set_fpscr_rn builtin.
> > ---
> >  gcc/config/rs6000/rs6000-builtins.def         |   2 +-
> >  gcc/config/rs6000/rs6000-c.cc                 |   4 +
> >  gcc/config/rs6000/rs6000.md                   |  87 +++++++---
> >  gcc/doc/extend.texi                           |  26 ++-
> >  ...rn_builtin.c => test_fpscr_rn_builtin_1.c} |   6 +
> >  .../powerpc/test_fpscr_rn_builtin_2.c         | 153
> > ++++++++++++++++++
> >  6 files changed, 246 insertions(+), 32 deletions(-)
> >  rename gcc/testsuite/gcc.target/powerpc/{test_fpscr_rn_builtin.c
> > => test_fpscr_rn_builtin_1.c} (92%)
> >  create mode 100644
> > gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> > 
> > diff --git a/gcc/config/rs6000/rs6000-builtins.def
> > b/gcc/config/rs6000/rs6000-builtins.def
> > index 289a37998b1..28788b69c7d 100644
> > --- a/gcc/config/rs6000/rs6000-builtins.def
> > +++ b/gcc/config/rs6000/rs6000-builtins.def
> > @@ -237,7 +237,7 @@
> >    const __ibm128 __builtin_pack_ibm128 (double, double);
> >      PACK_IF packif {ibm128}
> >  
> > -  void __builtin_set_fpscr_rn (const int[0,3]);
> > +  double __builtin_set_fpscr_rn (const int[0,3]);
> >      SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
> >  
> >    const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
> > diff --git a/gcc/config/rs6000/rs6000-c.cc
> > b/gcc/config/rs6000/rs6000-c.cc
> > index 8555174d36e..8373bb66919 100644
> > --- a/gcc/config/rs6000/rs6000-c.cc
> > +++ b/gcc/config/rs6000/rs6000-c.cc
> > @@ -604,6 +604,10 @@ rs6000_target_modify_macros (bool define_p,
> > HOST_WIDE_INT flags)
> >    /* Tell the user -mrop-protect is in play.  */
> >    if (rs6000_rop_protect)
> >      rs6000_define_or_undefine_macro (define_p, "__ROP_PROTECT__");
> > +  /* Tell the user the __builtin_set_fpscr_rn now returns the
> > FPSCR fields
> > +     in a double.  Originally the builtin returned void.  */
> > +  if ((flags & OPTION_MASK_SOFT_FLOAT) == 0)
> > +  rs6000_define_or_undefine_macro (define_p,
> > "__SET_FPSCR_RN_RETURNS_FPSCR__");
> >  }
> >  
> >  void
> > diff --git a/gcc/config/rs6000/rs6000.md
> > b/gcc/config/rs6000/rs6000.md
> > index b0db8ae508d..1b77a13c8a1 100644
> > --- a/gcc/config/rs6000/rs6000.md
> > +++ b/gcc/config/rs6000/rs6000.md
> > @@ -6440,8 +6440,51 @@
> >    "mffscdrn %0,%1"
> >    [(set_attr "type" "fp")])
> >  
> > +
> > +(define_expand "rs6000_get_fpscr_fields"
> > + [(match_operand:DF 0 "gpc_reg_operand")]
> > +  "TARGET_HARD_FLOAT"
> > +{
> > +  /* Extract fields bits 29:31 (DRN) and bits 56:63 (VE, OE, UE,
> > ZE, XE, NI,
> > +     RN) from the FPSCR and return them.  */
> > +  rtx tmp_df = gen_reg_rtx (DFmode);
> > +  rtx tmp_di = gen_reg_rtx (DImode);
> > +
> > +  emit_insn (gen_rs6000_mffs (tmp_df));
> > +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> > +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT
> > (0x00000007000000FFULL)));
> > +  rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
> > +  emit_move_insn (operands[0], tmp_rtn);
> > +  DONE;
> > +})
> > +
> > +(define_expand "rs6000_update_fpscr_rn_field"
> > + [(match_operand:DI 0 "gpc_reg_operand")]
> > +  "TARGET_HARD_FLOAT"
> > +{
> > +  /* Insert the new RN value from operands[0] into FPSCR bit
> > [62:63].  */
> > +  rtx tmp_di = gen_reg_rtx (DImode);
> > +  rtx tmp_df = gen_reg_rtx (DFmode);
> > +
> > +  emit_insn (gen_rs6000_mffs (tmp_df));
> > +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> > +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
> > +  emit_insn (gen_iordi3 (tmp_di, tmp_di, operands[0]));
> > +
> > +  /* Need to write to field k=15.  The fields are [0:15].  Hence
> > with
> > +     L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is
> > an
> > +     8-bit field[0:7]. Need to set the bit that corresponds to the
> > +     value of i that you want [0:7].  */
> > +
> > +  tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
> > +  emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
> > +  DONE;
> > +})
> > +
> >  (define_expand "rs6000_set_fpscr_rn"
> > - [(match_operand:SI 0 "reg_or_cint_operand")]
> > + [(set (match_operand:DF 0 "gpc_reg_operand")
> > +       (unspec_volatile:DF [(match_operand:SI 1
> > "reg_or_cint_operand")]
> > +       UNSPECV_MFFSCDRN))]
> 
> I don't think we need this pattern to be complete, since it's always
> expanded with "DONE", the pattern is useless, so it can be simpler
> like:
> 
>    [(use (match_operand:DF 0 "gpc_reg_operand"))
>     (use (match_operand:SI 1 "reg_or_cint_operand"))]

OK, changed this.  Went and reviewed the use of set in the GCC internal
documentation.  Not really sure why we can use set with the DONE.  Not
really following your explanation.  It does seem to work.

> 
> 
> >    "TARGET_HARD_FLOAT"
> >  {
> >    rtx tmp_df = gen_reg_rtx (DFmode);
> > @@ -6450,25 +6493,34 @@
> >       new rounding mode bits from operands[0][62:63] into
> > FPSCR[62:63].  */
> >    if (TARGET_P9_MISC)
> >      {
> > -      if (const_0_to_3_operand (operands[0], VOIDmode))
> > -	emit_insn (gen_rs6000_mffscrni (tmp_df, operands[0]));
> > +      if (const_0_to_3_operand (operands[1], VOIDmode))
> > +	emit_insn (gen_rs6000_mffscrni (tmp_df, operands[1]));
> >        else
> >  	{
> > -	  rtx op0 = convert_to_mode (DImode, operands[0], false);
> > -	  rtx src_df = simplify_gen_subreg (DFmode, op0, DImode, 0);
> > +	  rtx op1 = convert_to_mode (DImode, operands[1], false);
> > +	  rtx src_df = simplify_gen_subreg (DFmode, op1, DImode, 0);
> >  	  emit_insn (gen_rs6000_mffscrn (tmp_df, src_df));
> >  	}
> > -      DONE;
> > +       emit_move_insn (operands[0], tmp_df);
> > +       DONE;
> >      }
> >  
> > -  if (CONST_INT_P (operands[0]))
> > +  /* Emulate the behavior of the mffscrni, mffscrn instructions
> > for earlier
> > +     ISAs.  Return bits 29:31 (DRN) and bits 56:63 (VE, OE, UE,
> > ZE, XE, NI,
> > +     RN) from the FPSCR.  Set the RN field based on the value in
> > operands[1].
> > +  */
> > +
> > +  /* Get the current FPSCR fields to return.  */
> > +  emit_insn (gen_rs6000_get_fpscr_fields (operands[0]));
> 
> IMHO the above define_expand rs6000_get_fpscr_fields isn't needed,
> just inline
> it here, since it doesn't get reused somewhere else, and more
> importantly the
> read fpscr in DImode (tmp_di from simplify_gen_subreg on tmp_df) can
> be shared 
> with the uses in rs6000_update_fpscr_rn_field (so I'd also suggest
> not factoring
> out the previous RN bit handlings into rs6000_update_fpscr_rn_field).

I in lined rs6000_get_fpscr_fields and rs6000_update_fpscr_rn_field. 
Since the overloaded version for the double operand was removed, there
is now just the single use of the define_expands.  Inlining improves
the code.

> 
> > +
> > +  if (CONST_INT_P (operands[1]))
> >      {
> > -      if ((INTVAL (operands[0]) & 0x1) == 0x1)
> > +      if ((INTVAL (operands[1]) & 0x1) == 0x1)
> >  	emit_insn (gen_rs6000_mtfsb1 (GEN_INT (31)));
> >        else
> >  	emit_insn (gen_rs6000_mtfsb0 (GEN_INT (31)));
> >  
> > -      if ((INTVAL (operands[0]) & 0x2) == 0x2)
> > +      if ((INTVAL (operands[1]) & 0x2) == 0x2)
> >  	emit_insn (gen_rs6000_mtfsb1 (GEN_INT (30)));
> >        else
> >  	emit_insn (gen_rs6000_mtfsb0 (GEN_INT (30)));
> > @@ -6476,24 +6528,13 @@
> >    else
> >      {
> >        rtx tmp_rn = gen_reg_rtx (DImode);
> > -      rtx tmp_di = gen_reg_rtx (DImode);
> >  
> >        /* Extract new RN mode from operand.  */
> > -      rtx op0 = convert_to_mode (DImode, operands[0], false);
> > -      emit_insn (gen_anddi3 (tmp_rn, op0, GEN_INT (3)));
> > +      rtx op1 = convert_to_mode (DImode, operands[1], false);
> > +      emit_insn (gen_anddi3 (tmp_rn, op1, GEN_INT (3)));
> >  
> >        /* Insert new RN mode into FSCPR.  */
> > -      emit_insn (gen_rs6000_mffs (tmp_df));
> > -      tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
> > -      emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
> > -      emit_insn (gen_iordi3 (tmp_di, tmp_di, tmp_rn));
> > -
> > -      /* Need to write to field k=15.  The fields are
> > [0:15].  Hence with
> > -	 L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is
> > an
> > -	 8-bit field[0:7]. Need to set the bit that corresponds to the
> > -	 value of i that you want [0:7].  */
> > -      tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
> > -      emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
> > +      emit_insn (gen_rs6000_update_fpscr_rn_field (tmp_rn));
> >      }
> >    DONE;
> >  })
> > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > index cdbd4b34a35..fee35ac40ec 100644
> > --- a/gcc/doc/extend.texi
> > +++ b/gcc/doc/extend.texi
> > @@ -18188,7 +18188,6 @@ double __builtin_mffs (void);
> >  void __builtin_mtfsf (const int, double);
> >  void __builtin_mtfsb0 (const int);
> >  void __builtin_mtfsb1 (const int);
> > -void __builtin_set_fpscr_rn (int);
> 
> Just update void to double here, instead of the below seperated
> smallexample?
> As the below documentation is to describe these above functions one
> by
> one, just moving one down looks a bit inconsistent.

OK, fixed.

> 
> >  @end smallexample
> >  
> >  The @code{__builtin_ppc_get_timebase} and
> > @code{__builtin_ppc_mftb}
> > @@ -18209,13 +18208,24 @@ values to selected fields of the
> > FPSCR.  The
> >  as an argument.  The valid bit range is between 0 and 31.  The
> > builtins map to
> >  the @code{mtfsb0} and @code{mtfsb1} instructions which take the
> > argument and
> >  add 32.  Hence these instructions only modify the FPSCR[32:63]
> > bits by
> > -changing the specified bit to a zero or one respectively.  The
> > -@code{__builtin_set_fpscr_rn} builtin allows changing both of the
> > floating
> > -point rounding mode bits.  The argument is a 2-bit value.  The
> > argument can
> > -either be a @code{const int} or stored in a variable. The builtin
> > uses
> > -the ISA 3.0
> > -instruction @code{mffscrn} if available, otherwise it reads the
> > FPSCR, masks
> > -the current rounding mode bits out and OR's in the new value.
> > +changing the specified bit to a zero or one respectively.
> > +
> > +@smallexample
> > +double __builtin_set_fpscr_rn (int);
> > +@end smallexample
> > +
> > +The @code{__builtin_set_fpscr_rn} builtin allows changing both of
> > the floating
> > +point rounding mode bits and returning the various FPSCR fields
> > before the RN
> > +field is updated.  The builtin returns a double consisting of the
> > initial value
> > +of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, and RN bit
> > positions with all
> > +other bits set to zero. The builtin argument is a 2-bit value for
> > the new RN
> > +field value.  The argument can either be an @code{const int} or
> > stored in a
> > +variable.  Earlier versions of @code{__builtin_set_fpscr_rn}
> > returned void.  A
> > +@code{__SET_FPSCR_RN_RETURNS_FPSCR__} macro has been added.  If
> > defined, then
> > +the @code{__builtin_set_fpscr_rn} builtin returns the FPSCR
> > fields.  If not
> > +defined, the @code{__builtin_set_fpscr_rn} does not return a
> > vaule.  If the
> > +@option{-msoft-float} option is used, the
> > @code{__builtin_set_fpscr_rn} builtin
> > +will not return a value.
> 
> __builtin_set_fpscr_rn isn't available for -msoft-float, your patch
> doesn't change
> it so IMHO you don't need to note this.

OK, dropped.

> 
> >  
> >  @node Basic PowerPC Built-in Functions Available on ISA 2.05
> >  @subsubsection Basic PowerPC Built-in Functions Available on ISA
> > 2.05
> > diff --git
> > a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c
> > b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
> > similarity index 92%
> > rename from
> > gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c
> > rename to
> > gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
> > index 04707ad8a56..d917ee19656 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
> > @@ -1,6 +1,12 @@
> >  /* { dg-do run { target { powerpc*-*-* } } } */
> >  /* { dg-options "-O2 -std=c99" } */
> >  
> > +/* Originally the __builtin_set_fpscr_rn builtin was defined to
> > return
> > +   void.  It was later extended to return a double with the
> > various
> > +   FPSCR bits.  The extended builtin is inteded to be a drop in
> > replacement
> > +   for the original version.  This test is for the original
> > version of the
> > +   builtin and should work exactly as before.  */
> > +
> >  #ifdef DEBUG
> >  #include <stdio.h>
> >  #endif
> > diff --git
> > a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> > b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> > new file mode 100644
> > index 00000000000..62fca67c948
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
> > @@ -0,0 +1,153 @@
> > +/* { dg-do run { target { powerpc*-*-* } } } */
> > +/* { dg-options "-O2 -std=c99" } */
> 
> Nit: The previous existing test_fpscr_rn_builtin.c introduced
> -std=c99 but
> I think it is useless for this case, so you can remove it if you
> like.
> 

Removed -stc=c99.


> > +
> > +/* The __builtin_set_fpscr_rn builtin was originally defined to
> > return
> > +   void.  It was to return a double with the various FPSCR
> > bits.  This test
> > +   verifies the new version returns the desired FPSCR bits.  */
> > +
> > +#ifdef DEBUG
> > +#include <stdio.h>
> > +#endif
> > +
> > +#define RN_MASK  0x3LL             /* RN field mask */
> > +#define FIELD_MASK 0x00000007000000FFULL
> > +
> > +union blah {
> > +  double d;
> > +  unsigned long long ll;
> > +} conv_val;
> > +
> > +void abort (void);
> > +double __attribute__ ((noipa)) wrap_set_fpscr_rn (int val)
> > +{
> > +  return __builtin_set_fpscr_rn (val);
> > +}
> > +
> 
> Nit: I noticed you want this case to abort if
> __SET_FPSCR_RN_RETURNS_FPSCR__
> doesn't get defined, but actually the compilation should fail
> for compiling this function wrap_set_fpscr_rn already, so
> the below __SET_FPSCR_RN_RETURNS_FPSCR__ check and abort seems 
> useless.  That said, the explicit use of return value of this built-
> in
> __builtin_set_fpscr_rn already implicitly checks it.

OK, removed the __SET_FPSCR_RN_RETURNS_FPSCR__ and will just rely on
the compiler to complain instead if it isn't expecting
__builtin_set_fpscr_rn to return a value.

                             Carl
  
Kewen.Lin July 11, 2023, 5:39 a.m. UTC | #8
on 2023/7/11 03:18, Carl Love wrote:
> On Fri, 2023-07-07 at 12:06 +0800, Kewen.Lin wrote:
>> Hi Carl,
>>
>> Some more minor comments are inline below on top of Peter's
>> insightful
>> review comments.
>>
>> on 2023/7/1 08:58, Carl Love wrote:
>>> GCC maintainers:
>>>
>>> Ver 2,  Went back thru the requirements and emails.  Not sure where
>>> I
>>> came up with the requirement for an overloaded version with double
>>> argument.  Removed the overloaded version with the double
>>> argument. 
>>> Added the macro to announce if the __builtin_set_fpscr_rn returns a
>>> void or a double with the FPSCR bits.  Updated the documentation
>>> file. 
>>> Retested on Power 8 BE/LE, Power 9 BE/LE, Power 10 LE.  Redid the
>>> test
>>> file.  Per request, the original test file functionality was not
>>> changed.  Just changed the name from test_fpscr_rn_builtin.c to 
>>> test_fpscr_rn_builtin_1.c.  Put new tests for the return values
>>> into a
>>> new test file, test_fpscr_rn_builtin_2.c.
>>>
>>> The GLibC team requested a builtin to replace the mffscrn and
>>> mffscrniinline asm instructions in the GLibC code.  Previously
>>> there
>>> was discussion on adding builtins for the mffscrn instructions.
>>>
>>> https://gcc.gnu.org/pipermail/gcc-patches/2023-May/620261.html
>>>
>>> In the end, it was felt that it would be to extend the existing
>>> __builtin_set_fpscr_rn builtin to return a double instead of a void
>>> type.  The desire is that we could have the functionality of the
>>> mffscrn and mffscrni instructions on older ISAs.  The two
>>> instructions
>>> were initially added in ISA 3.0.  The __builtin_set_fpscr_rn has
>>> the
>>> needed functionality to set the RN field using the mffscrn and
>>> mffscrni
>>> instructions if ISA 3.0 is supported or fall back to using logical
>>> instructions to mask and set the bits for earlier ISAs.  The
>>> instructions return the current value of the FPSCR fields DRN, VE,
>>> OE,
>>> UE, ZE, XE, NI, RN bit positions then update the RN bit positions
>>> with
>>> the new RN value provided.
>>>
>>> The current __builtin_set_fpscr_rn builtin has a return type of
>>> void. 
>>> So, changing the return type to double and returning the  FPSCR
>>> fields
>>> DRN, VE, OE, UE, ZE, XE, NI, RN bit positions would then give the
>>> functionally equivalent of the mffscrn and mffscrni
>>> instructions.  Any
>>> current uses of the builtin would just ignore the return value yet
>>> any
>>> new uses could use the return value.  So the requirement is for the
>>> change to the __builtin_set_fpscr_rn builtin to be backwardly
>>> compatible and work for all ISAs.
>>>
>>> The following patch changes the return type of the
>>>  __builtin_set_fpscr_rn builtin from void to double.  The return
>>> value
>>> is the current value of the various FPSCR fields DRN, VE, OE, UE,
>>> ZE,
>>> XE, NI, RN bit positions when the builtin is called.  The builtin
>>> then
>>> updated the RN field with the new value provided as an argument to
>>> the
>>> builtin.  The patch adds new testcases to test_fpscr_rn_builtin.c
>>> to
>>> check that the builtin returns the current value of the FPSCR
>>> fields
>>> and then updates the RN field.
>>>
>>> The GLibC team has reviewed the patch to make sure it met their
>>> needs
>>> as a drop in replacement for the inline asm mffscr and mffscrni
>>> statements in the GLibC code.  T
>>>
>>> The patch has been tested on Power 8 LE/BE, Power 9 LE/BE and Power
>>> 10
>>> LE.
>>>
>>> Please let me know if the patch is acceptable for
>>> mainline.  Thanks.
>>>
>>>                                Carl 
>>>
>>>
>>> ----------------------------------
>>> rs6000, __builtin_set_fpscr_rn add retrun value
>>>
>>> Change the return value from void to double.  The return value
>>> consists of
>>> the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, RN bit
>>> positions.  Add an
>>> overloaded version which accepts a double argument.
>>>
>>> The test powerpc/test_fpscr_rn_builtin.c is updated to add tests
>>> for the
>>> double reterun value and the new double argument.
>>>
>>> gcc/ChangeLog:
>>> 	* config/rs6000/rs6000-builtins.def (__builtin_set_fpscr_rn):
>>> Update
>>> 	builtin definition return type.
>>> 	* config/rs6000-c.cc(rs6000_target_modify_macros): Add check,
>>> define
>>> 	__SET_FPSCR_RN_RETURNS_FPSCR__ macro.
>>> 	* config/rs6000/rs6000.md ((rs6000_get_fpscr_fields): New
>>> 	define_expand.
>>> 	(rs6000_update_fpscr_rn_field): New define_expand.
>>> 	(rs6000_set_fpscr_rn): Added	return argument.  Updated
>>> to use new
>>> 	rs6000_get_fpscr_fields and rs6000_update_fpscr_rn_field define
>>> 	 _expands.
>>> 	* doc/extend.texi (__builtin_set_fpscr_rn): Update description
>>> for
>>> 	the return value and new double argument.  Add descripton for
>>> 	__SET_FPSCR_RN_RETURNS_FPSCR__ macro.
>>>
>>> gcc/testsuite/ChangeLog:
>>> 	gcc.target/powerpc/test_fpscr_rn_builtin.c: Renamed to
>>> 	test_fpscr_rn_builtin_1.c.  Added comment.
>>> 	gcc.target/powerpc/test_fpscr_rn_builtin_2.c: New test for the
>>> 	return value of __builtin_set_fpscr_rn builtin.
>>> ---
>>>  gcc/config/rs6000/rs6000-builtins.def         |   2 +-
>>>  gcc/config/rs6000/rs6000-c.cc                 |   4 +
>>>  gcc/config/rs6000/rs6000.md                   |  87 +++++++---
>>>  gcc/doc/extend.texi                           |  26 ++-
>>>  ...rn_builtin.c => test_fpscr_rn_builtin_1.c} |   6 +
>>>  .../powerpc/test_fpscr_rn_builtin_2.c         | 153
>>> ++++++++++++++++++
>>>  6 files changed, 246 insertions(+), 32 deletions(-)
>>>  rename gcc/testsuite/gcc.target/powerpc/{test_fpscr_rn_builtin.c
>>> => test_fpscr_rn_builtin_1.c} (92%)
>>>  create mode 100644
>>> gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
>>>
>>> diff --git a/gcc/config/rs6000/rs6000-builtins.def
>>> b/gcc/config/rs6000/rs6000-builtins.def
>>> index 289a37998b1..28788b69c7d 100644
>>> --- a/gcc/config/rs6000/rs6000-builtins.def
>>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>>> @@ -237,7 +237,7 @@
>>>    const __ibm128 __builtin_pack_ibm128 (double, double);
>>>      PACK_IF packif {ibm128}
>>>  
>>> -  void __builtin_set_fpscr_rn (const int[0,3]);
>>> +  double __builtin_set_fpscr_rn (const int[0,3]);
>>>      SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
>>>  
>>>    const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
>>> diff --git a/gcc/config/rs6000/rs6000-c.cc
>>> b/gcc/config/rs6000/rs6000-c.cc
>>> index 8555174d36e..8373bb66919 100644
>>> --- a/gcc/config/rs6000/rs6000-c.cc
>>> +++ b/gcc/config/rs6000/rs6000-c.cc
>>> @@ -604,6 +604,10 @@ rs6000_target_modify_macros (bool define_p,
>>> HOST_WIDE_INT flags)
>>>    /* Tell the user -mrop-protect is in play.  */
>>>    if (rs6000_rop_protect)
>>>      rs6000_define_or_undefine_macro (define_p, "__ROP_PROTECT__");
>>> +  /* Tell the user the __builtin_set_fpscr_rn now returns the
>>> FPSCR fields
>>> +     in a double.  Originally the builtin returned void.  */
>>> +  if ((flags & OPTION_MASK_SOFT_FLOAT) == 0)
>>> +  rs6000_define_or_undefine_macro (define_p,
>>> "__SET_FPSCR_RN_RETURNS_FPSCR__");
>>>  }
>>>  
>>>  void
>>> diff --git a/gcc/config/rs6000/rs6000.md
>>> b/gcc/config/rs6000/rs6000.md
>>> index b0db8ae508d..1b77a13c8a1 100644
>>> --- a/gcc/config/rs6000/rs6000.md
>>> +++ b/gcc/config/rs6000/rs6000.md
>>> @@ -6440,8 +6440,51 @@
>>>    "mffscdrn %0,%1"
>>>    [(set_attr "type" "fp")])
>>>  
>>> +
>>> +(define_expand "rs6000_get_fpscr_fields"
>>> + [(match_operand:DF 0 "gpc_reg_operand")]
>>> +  "TARGET_HARD_FLOAT"
>>> +{
>>> +  /* Extract fields bits 29:31 (DRN) and bits 56:63 (VE, OE, UE,
>>> ZE, XE, NI,
>>> +     RN) from the FPSCR and return them.  */
>>> +  rtx tmp_df = gen_reg_rtx (DFmode);
>>> +  rtx tmp_di = gen_reg_rtx (DImode);
>>> +
>>> +  emit_insn (gen_rs6000_mffs (tmp_df));
>>> +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
>>> +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT
>>> (0x00000007000000FFULL)));
>>> +  rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
>>> +  emit_move_insn (operands[0], tmp_rtn);
>>> +  DONE;
>>> +})
>>> +
>>> +(define_expand "rs6000_update_fpscr_rn_field"
>>> + [(match_operand:DI 0 "gpc_reg_operand")]
>>> +  "TARGET_HARD_FLOAT"
>>> +{
>>> +  /* Insert the new RN value from operands[0] into FPSCR bit
>>> [62:63].  */
>>> +  rtx tmp_di = gen_reg_rtx (DImode);
>>> +  rtx tmp_df = gen_reg_rtx (DFmode);
>>> +
>>> +  emit_insn (gen_rs6000_mffs (tmp_df));
>>> +  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
>>> +  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
>>> +  emit_insn (gen_iordi3 (tmp_di, tmp_di, operands[0]));
>>> +
>>> +  /* Need to write to field k=15.  The fields are [0:15].  Hence
>>> with
>>> +     L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is
>>> an
>>> +     8-bit field[0:7]. Need to set the bit that corresponds to the
>>> +     value of i that you want [0:7].  */
>>> +
>>> +  tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
>>> +  emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
>>> +  DONE;
>>> +})
>>> +
>>>  (define_expand "rs6000_set_fpscr_rn"
>>> - [(match_operand:SI 0 "reg_or_cint_operand")]
>>> + [(set (match_operand:DF 0 "gpc_reg_operand")
>>> +       (unspec_volatile:DF [(match_operand:SI 1
>>> "reg_or_cint_operand")]
>>> +       UNSPECV_MFFSCDRN))]
>>
>> I don't think we need this pattern to be complete, since it's always
>> expanded with "DONE", the pattern is useless, so it can be simpler
>> like:
>>
>>    [(use (match_operand:DF 0 "gpc_reg_operand"))
>>     (use (match_operand:SI 1 "reg_or_cint_operand"))]
> 
> OK, changed this.  Went and reviewed the use of set in the GCC internal
> documentation.  Not really sure why we can use set with the DONE.  Not
> really following your explanation.  It does seem to work.
> 

It's mainly based on the documentation of define_expand:

"If the preparation falls through (invokes neither DONE nor FAIL), then
 the define_expand acts like a define_insn in that the RTL template is
 used to generate the insn.

 The RTL template is not used for matching, only for generating the initial
 insn list. **If the preparation statement always invokes DONE or FAIL, the RTL
 template may be reduced to a simple list of operands** ..."

https://gcc.gnu.org/onlinedocs/gccint/Expander-Definitions.html

BR,
Kewen
  

Patch

diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index 289a37998b1..28788b69c7d 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -237,7 +237,7 @@ 
   const __ibm128 __builtin_pack_ibm128 (double, double);
     PACK_IF packif {ibm128}
 
-  void __builtin_set_fpscr_rn (const int[0,3]);
+  double __builtin_set_fpscr_rn (const int[0,3]);
     SET_FPSCR_RN rs6000_set_fpscr_rn {nosoft}
 
   const double __builtin_unpack_ibm128 (__ibm128, const int<1>);
diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 8555174d36e..8373bb66919 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -604,6 +604,10 @@  rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags)
   /* Tell the user -mrop-protect is in play.  */
   if (rs6000_rop_protect)
     rs6000_define_or_undefine_macro (define_p, "__ROP_PROTECT__");
+  /* Tell the user the __builtin_set_fpscr_rn now returns the FPSCR fields
+     in a double.  Originally the builtin returned void.  */
+  if ((flags & OPTION_MASK_SOFT_FLOAT) == 0)
+  rs6000_define_or_undefine_macro (define_p, "__SET_FPSCR_RN_RETURNS_FPSCR__");
 }
 
 void
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index b0db8ae508d..1b77a13c8a1 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -6440,8 +6440,51 @@ 
   "mffscdrn %0,%1"
   [(set_attr "type" "fp")])
 
+
+(define_expand "rs6000_get_fpscr_fields"
+ [(match_operand:DF 0 "gpc_reg_operand")]
+  "TARGET_HARD_FLOAT"
+{
+  /* Extract fields bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
+     RN) from the FPSCR and return them.  */
+  rtx tmp_df = gen_reg_rtx (DFmode);
+  rtx tmp_di = gen_reg_rtx (DImode);
+
+  emit_insn (gen_rs6000_mffs (tmp_df));
+  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
+  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (0x00000007000000FFULL)));
+  rtx tmp_rtn = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
+  emit_move_insn (operands[0], tmp_rtn);
+  DONE;
+})
+
+(define_expand "rs6000_update_fpscr_rn_field"
+ [(match_operand:DI 0 "gpc_reg_operand")]
+  "TARGET_HARD_FLOAT"
+{
+  /* Insert the new RN value from operands[0] into FPSCR bit [62:63].  */
+  rtx tmp_di = gen_reg_rtx (DImode);
+  rtx tmp_df = gen_reg_rtx (DFmode);
+
+  emit_insn (gen_rs6000_mffs (tmp_df));
+  tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
+  emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
+  emit_insn (gen_iordi3 (tmp_di, tmp_di, operands[0]));
+
+  /* Need to write to field k=15.  The fields are [0:15].  Hence with
+     L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is an
+     8-bit field[0:7]. Need to set the bit that corresponds to the
+     value of i that you want [0:7].  */
+
+  tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
+  emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
+  DONE;
+})
+
 (define_expand "rs6000_set_fpscr_rn"
- [(match_operand:SI 0 "reg_or_cint_operand")]
+ [(set (match_operand:DF 0 "gpc_reg_operand")
+       (unspec_volatile:DF [(match_operand:SI 1 "reg_or_cint_operand")]
+       UNSPECV_MFFSCDRN))]
   "TARGET_HARD_FLOAT"
 {
   rtx tmp_df = gen_reg_rtx (DFmode);
@@ -6450,25 +6493,34 @@ 
      new rounding mode bits from operands[0][62:63] into FPSCR[62:63].  */
   if (TARGET_P9_MISC)
     {
-      if (const_0_to_3_operand (operands[0], VOIDmode))
-	emit_insn (gen_rs6000_mffscrni (tmp_df, operands[0]));
+      if (const_0_to_3_operand (operands[1], VOIDmode))
+	emit_insn (gen_rs6000_mffscrni (tmp_df, operands[1]));
       else
 	{
-	  rtx op0 = convert_to_mode (DImode, operands[0], false);
-	  rtx src_df = simplify_gen_subreg (DFmode, op0, DImode, 0);
+	  rtx op1 = convert_to_mode (DImode, operands[1], false);
+	  rtx src_df = simplify_gen_subreg (DFmode, op1, DImode, 0);
 	  emit_insn (gen_rs6000_mffscrn (tmp_df, src_df));
 	}
-      DONE;
+       emit_move_insn (operands[0], tmp_df);
+       DONE;
     }
 
-  if (CONST_INT_P (operands[0]))
+  /* Emulate the behavior of the mffscrni, mffscrn instructions for earlier
+     ISAs.  Return bits 29:31 (DRN) and bits 56:63 (VE, OE, UE, ZE, XE, NI,
+     RN) from the FPSCR.  Set the RN field based on the value in operands[1].
+  */
+
+  /* Get the current FPSCR fields to return.  */
+  emit_insn (gen_rs6000_get_fpscr_fields (operands[0]));
+
+  if (CONST_INT_P (operands[1]))
     {
-      if ((INTVAL (operands[0]) & 0x1) == 0x1)
+      if ((INTVAL (operands[1]) & 0x1) == 0x1)
 	emit_insn (gen_rs6000_mtfsb1 (GEN_INT (31)));
       else
 	emit_insn (gen_rs6000_mtfsb0 (GEN_INT (31)));
 
-      if ((INTVAL (operands[0]) & 0x2) == 0x2)
+      if ((INTVAL (operands[1]) & 0x2) == 0x2)
 	emit_insn (gen_rs6000_mtfsb1 (GEN_INT (30)));
       else
 	emit_insn (gen_rs6000_mtfsb0 (GEN_INT (30)));
@@ -6476,24 +6528,13 @@ 
   else
     {
       rtx tmp_rn = gen_reg_rtx (DImode);
-      rtx tmp_di = gen_reg_rtx (DImode);
 
       /* Extract new RN mode from operand.  */
-      rtx op0 = convert_to_mode (DImode, operands[0], false);
-      emit_insn (gen_anddi3 (tmp_rn, op0, GEN_INT (3)));
+      rtx op1 = convert_to_mode (DImode, operands[1], false);
+      emit_insn (gen_anddi3 (tmp_rn, op1, GEN_INT (3)));
 
       /* Insert new RN mode into FSCPR.  */
-      emit_insn (gen_rs6000_mffs (tmp_df));
-      tmp_di = simplify_gen_subreg (DImode, tmp_df, DFmode, 0);
-      emit_insn (gen_anddi3 (tmp_di, tmp_di, GEN_INT (-4)));
-      emit_insn (gen_iordi3 (tmp_di, tmp_di, tmp_rn));
-
-      /* Need to write to field k=15.  The fields are [0:15].  Hence with
-	 L=0, W=0, FLM_i must be equal to 8, 16 = i + 8*(1-W).  FLM is an
-	 8-bit field[0:7]. Need to set the bit that corresponds to the
-	 value of i that you want [0:7].  */
-      tmp_df = simplify_gen_subreg (DFmode, tmp_di, DImode, 0);
-      emit_insn (gen_rs6000_mtfsf (GEN_INT (0x01), tmp_df));
+      emit_insn (gen_rs6000_update_fpscr_rn_field (tmp_rn));
     }
   DONE;
 })
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index cdbd4b34a35..fee35ac40ec 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -18188,7 +18188,6 @@  double __builtin_mffs (void);
 void __builtin_mtfsf (const int, double);
 void __builtin_mtfsb0 (const int);
 void __builtin_mtfsb1 (const int);
-void __builtin_set_fpscr_rn (int);
 @end smallexample
 
 The @code{__builtin_ppc_get_timebase} and @code{__builtin_ppc_mftb}
@@ -18209,13 +18208,24 @@  values to selected fields of the FPSCR.  The
 as an argument.  The valid bit range is between 0 and 31.  The builtins map to
 the @code{mtfsb0} and @code{mtfsb1} instructions which take the argument and
 add 32.  Hence these instructions only modify the FPSCR[32:63] bits by
-changing the specified bit to a zero or one respectively.  The
-@code{__builtin_set_fpscr_rn} builtin allows changing both of the floating
-point rounding mode bits.  The argument is a 2-bit value.  The argument can
-either be a @code{const int} or stored in a variable. The builtin uses
-the ISA 3.0
-instruction @code{mffscrn} if available, otherwise it reads the FPSCR, masks
-the current rounding mode bits out and OR's in the new value.
+changing the specified bit to a zero or one respectively.
+
+@smallexample
+double __builtin_set_fpscr_rn (int);
+@end smallexample
+
+The @code{__builtin_set_fpscr_rn} builtin allows changing both of the floating
+point rounding mode bits and returning the various FPSCR fields before the RN
+field is updated.  The builtin returns a double consisting of the initial value
+of the FPSCR fields DRN, VE, OE, UE, ZE, XE, NI, and RN bit positions with all
+other bits set to zero. The builtin argument is a 2-bit value for the new RN
+field value.  The argument can either be an @code{const int} or stored in a
+variable.  Earlier versions of @code{__builtin_set_fpscr_rn} returned void.  A
+@code{__SET_FPSCR_RN_RETURNS_FPSCR__} macro has been added.  If defined, then
+the @code{__builtin_set_fpscr_rn} builtin returns the FPSCR fields.  If not
+defined, the @code{__builtin_set_fpscr_rn} does not return a vaule.  If the
+@option{-msoft-float} option is used, the @code{__builtin_set_fpscr_rn} builtin
+will not return a value.
 
 @node Basic PowerPC Built-in Functions Available on ISA 2.05
 @subsubsection Basic PowerPC Built-in Functions Available on ISA 2.05
diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
similarity index 92%
rename from gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c
rename to gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
index 04707ad8a56..d917ee19656 100644
--- a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin.c
+++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_1.c
@@ -1,6 +1,12 @@ 
 /* { dg-do run { target { powerpc*-*-* } } } */
 /* { dg-options "-O2 -std=c99" } */
 
+/* Originally the __builtin_set_fpscr_rn builtin was defined to return
+   void.  It was later extended to return a double with the various
+   FPSCR bits.  The extended builtin is inteded to be a drop in replacement
+   for the original version.  This test is for the original version of the
+   builtin and should work exactly as before.  */
+
 #ifdef DEBUG
 #include <stdio.h>
 #endif
diff --git a/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
new file mode 100644
index 00000000000..62fca67c948
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/test_fpscr_rn_builtin_2.c
@@ -0,0 +1,153 @@ 
+/* { dg-do run { target { powerpc*-*-* } } } */
+/* { dg-options "-O2 -std=c99" } */
+
+/* The __builtin_set_fpscr_rn builtin was originally defined to return
+   void.  It was to return a double with the various FPSCR bits.  This test
+   verifies the new version returns the desired FPSCR bits.  */
+
+#ifdef DEBUG
+#include <stdio.h>
+#endif
+
+#define RN_MASK  0x3LL             /* RN field mask */
+#define FIELD_MASK 0x00000007000000FFULL
+
+union blah {
+  double d;
+  unsigned long long ll;
+} conv_val;
+
+void abort (void);
+double __attribute__ ((noipa)) wrap_set_fpscr_rn (int val)
+{
+  return __builtin_set_fpscr_rn (val);
+}
+
+double __attribute__ ((noipa)) wrap_const_fpscr_rn (int val)
+{
+  switch (val)
+    {
+    case 0: return __builtin_set_fpscr_rn (0x0);
+    case 1: return __builtin_set_fpscr_rn (0x1);
+    case 2: return __builtin_set_fpscr_rn (0x2);
+    case 3: return __builtin_set_fpscr_rn (0x3);
+    }
+}
+
+void check_builtin_set_fpscr_rn (unsigned long long initial_fpscr,
+				 int new_RN, double result)
+{
+  register double  f14;
+  unsigned long long masked_fpscr = initial_fpscr & FIELD_MASK;
+  
+  conv_val.d = result;
+
+  /* Check the result.  */
+  if (conv_val.ll != masked_fpscr)
+    {
+#ifdef DEBUG
+       printf("ERROR, __builtin_set_fpscr_rn(%d) did not return expected value %llx.\n",
+	      new_RN, masked_fpscr);
+       printf("fpscr_val_initial = 0x%llx\n", initial_fpscr);       
+       printf("result = 0x%llx\n", conv_val.ll);
+#else
+       abort();
+#endif
+    }
+
+  /* Check to see if the RN field was updated.  */
+    __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+
+  if ((conv_val.ll & RN_MASK) != new_RN)
+#ifdef DEBUG
+    {
+      printf("ERROR,  __builtin_set_fpscr_rn(%d) did not update RN to %llx.\n",
+	     new_RN, new_RN);
+      printf("  conv_val.ll = 0x%llx\n", conv_val.ll);
+    }
+#else
+    abort();
+#endif
+}
+
+int
+main ()
+{
+  int i;
+  int val, bit;
+  double fpscr_val;
+  unsigned long long fpscr_val_initial;
+  
+  unsigned long long ll_value;
+  union blah src_double;
+  register double  f14;
+
+#ifdef __SET_FPSCR_RN_RETURNS_FPSCR__
+  /* If __SET_FPSCR_RN_RETURNS_FPSCR__ is defined, the __builtin_set_fpscr_rn()
+     builtin returns the FPSCR fields.*/
+  
+  /* __builtin_set_fpscr_rn() builtin can take a const or a variable
+     value between 0 and 3 as the argument.
+     __builtin_mtfsb0 and __builtin_mtfsb1 argument must be a constant 
+     30 or 31.
+  */
+
+  /* Test reading the FPSCR register */
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+
+  if (conv_val.d != __builtin_mffs())
+    {
+#ifdef DEBUG
+       printf("ERROR, __builtin_mffs() returned 0x%llx, not the expecected value 0x%llx\n",
+	      __builtin_mffs(), conv_val.d);
+#else
+       abort();
+#endif
+    }		  
+
+  /* Test return value from __builtin_set_fpscr_rn. The FPSCR fields (DRN, VE,
+     OE, UE, ZE, XE, NI, RN) are returned and  the RN field of FPSCR is updated
+     with the specified argument for the builtin.  */
+
+  /* Check immediate argument cases */
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  val = 0x0;
+  fpscr_val = wrap_const_fpscr_rn (val);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, val, fpscr_val);
+
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  val = 0x3;
+  fpscr_val = wrap_const_fpscr_rn (val);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, val, fpscr_val);
+  
+  /* Check int argument cases */
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  val = 0x1;
+  fpscr_val = wrap_set_fpscr_rn (val);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, val, fpscr_val);
+
+  __asm __volatile ("mffs %0" : "=f"(f14));
+  conv_val.d = f14;
+  fpscr_val_initial = conv_val.ll;
+
+  val = 0x2;
+  fpscr_val = wrap_set_fpscr_rn (val);
+  check_builtin_set_fpscr_rn (fpscr_val_initial, val, fpscr_val);
+  return 0;
+
+#endif
+
+  /* The __SET_FPSCR_RN_RETURNS_FPSCR__ should be defined.  */
+  abort();
+}