[fortran] Fix common subexpression elimination with IEEE rounding (PR108329)

Message ID 7bd3545a-7b9d-a9b2-6923-0d02df809177@netcologne.de
State Accepted
Series [fortran] Fix common subexpression elimination with IEEE rounding (PR108329)

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Thomas Koenig Jan. 7, 2023, 3:46 p.m. UTC
  Hello world,

this patch fixes Fortran's handling of common subexpression elimination
across ieee_set_rounding_mode calls.  It does so by issuing a memory
barrier after each such call, which forces a reload from memory
(and thus a recomputation).

This is a rather big hammer, so if there are more elegant ways
to fix it, I am very much open to suggestions.

If PR 34678 is fixed, then that solution can also be applied here.

OK for trunk?  How do you feel about a backport?

Best regards

	Thomas

Add memory barrier for calls to ieee_set_rounding_mode.

gcc/fortran/ChangeLog:

         PR fortran/108329
         * trans-expr.cc (trans_memory_barrier): New function.
         (gfc_conv_procedure_call): Insert memory barrier for
         ieee_set_rounding_mode.

gcc/testsuite/ChangeLog:

         PR fortran/108329
         * gfortran.dg/rounding_4.f90: New test.
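
For readers more at home in C or C++ than in Fortran, the barrier in
question is the familiar empty asm statement with a "memory" clobber.
The following is only a sketch of the idea, not code from the patch:
memory_barrier and sum_in_all_modes are made-up names, the asm and
attribute syntax are GCC/Clang extensions, and the pointer arguments
plus noinline are assumptions chosen so that the operands live in
memory, roughly as Fortran dummy arguments do.

```cpp
#include <cassert>
#include <cfenv>

// Empty asm statement with a "memory" clobber (GCC/Clang extension).
// It emits no instructions, but the compiler must assume that memory
// changed, so values loaded before it are loaded again afterwards.
static inline void memory_barrier() {
    __asm__ __volatile__("" : : : "memory");
}

// Hypothetical analogue of foo() in rounding_4.f90.  Kept out of line
// and given pointer arguments so that b and c are read from memory.
__attribute__((noinline))
void sum_in_all_modes(double out[4], const double *b, const double *c) {
    static const int modes[4] = {FE_TONEAREST, FE_TOWARDZERO,
                                 FE_UPWARD, FE_DOWNWARD};
    for (int i = 0; i < 4; ++i) {
        int rc = std::fesetround(modes[i]);
        assert(rc == 0);       // the rounding mode must be supported
        (void) rc;
        memory_barrier();      // after the call, as in the patch
        out[i] = *b + *c;      // operands are reloaded, sum recomputed
    }
    std::fesetround(FE_TONEAREST);  // restore the default mode
}
```

Called with 0.1 and 0.2, the four results should satisfy the same
relations that rounding_4.f90 checks: nearest and upward agree, toward
zero and downward agree, and the first pair is larger than the second.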
  

Comments

Paul Richard Thomas Jan. 8, 2023, 1:31 p.m. UTC | #1
Hi Thomas,

Following your off-line explanation that the seemingly empty
assembly line forces an effective reload from memory, all is now clear.

OK for mainline and for backporting as you see fit.

Thanks for the patch.

Paul


  
Richard Biener Jan. 8, 2023, 3:53 p.m. UTC | #2
> Am 08.01.2023 um 14:31 schrieb Paul Richard Thomas via Fortran <fortran@gcc.gnu.org>:
> 
> Hi Thomas,
> 
> Following your off-line explanation that the seemingly empty looking
> assembly line forces an effective reload from memory, all is now clear.

It's not a full fix (for register vars) and it's "superior" to the call itself only because asm handling is implemented in a rather stupid way in the alias oracle.  So I don't think this is a "fix" at all.

Richard 

  
Thomas Koenig Jan. 8, 2023, 4:21 p.m. UTC | #3
Hi Richard,

>> Am 08.01.2023 um 14:31 schrieb Paul Richard Thomas via Fortran <fortran@gcc.gnu.org>:
>>
>> Hi Thomas,
>>
>> Following your off-line explanation that the seemingly empty looking
>> assembly line forces an effective reload from memory, all is now clear.
> 
> It's not a full fix (for register vars) and it's "superior" to the call itself only because asm handling is implemented in a rather stupid way in the alias oracle.  So I don't think this is a "fix" at all.

There are no register variables in Fortran, this is Fortran FE only,
and it is a fix in the sense that correct code is no longer miscompiled.

There's a FIXME in the code pointing to the relevant PR precisely
because I think that this is less than elegant (as do you, obviously).
Do you have other suggestions how to implement this?  If PR 34678
is solved, this would probably provide a mechanism that we could
simply re-use.

Best regards

	Thomas
  
Richard Biener Jan. 9, 2023, 12:59 p.m. UTC | #4
On Sun, Jan 8, 2023 at 5:21 PM Thomas Koenig <tkoenig@netcologne.de> wrote:
>
> Hi Richard,
>
> >> Am 08.01.2023 um 14:31 schrieb Paul Richard Thomas via Fortran <fortran@gcc.gnu.org>:
> >>
> >> Hi Thomas,
> >>
> >> Following your off-line explanation that the seemingly empty looking
> >> assembly line forces an effective reload from memory, all is now clear.
> >
> > It's not a full fix (for register vars) and it's "superior" to the call itself only because asm handling is implemented in a rather stupid way in the alias oracle.  So I don't think this is a "fix" at all.
>
> There are no register variables in Fortran, this is Fortran FE only,
> and it is a fix in the sense that correct code is no longer miscompiled.

It's quite a big hammer, and the fact that it "works" is just luck.
The memory barrier implied by the ieee_set_rounding_mode call itself
does not work because arguments passed by reference are marked by
the frontend so that they can be CSEd, since memory barriers may not
affect them.

As said, the fact that this "works" is just because we're lazy on GIMPLE:

/* If the statement STMT may clobber the memory reference REF return true,
   otherwise return false.  */

bool
stmt_may_clobber_ref_p_1 (gimple *stmt, ao_ref *ref, bool tbaa_p)
{
...
  else if (gimple_code (stmt) == GIMPLE_ASM)
    return true;

> There's a FIXME in the code pointing to the relevant PR precisely
> because I think that this is less than elegant (as do you, obviously).
> Do you have other suggestions how to implement this?  If PR 34678
> is solved, this would probably provide a mechanism that we could
> simply re-use.

There is no reliable way to get this correct at the moment and if there
were good and easy ways to get this working they'd be implemented already.

Richard.

> Best regards
>
>         Thomas
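
The remark about register variables can be illustrated outside Fortran.
A memory clobber says nothing about values the compiler keeps in
registers, so a narrower idiom in C and C++ is to pass the operand
itself through an empty asm statement.  The sketch below is not from
the patch or from GCC: opaque and add_in_mode are hypothetical names,
the "+m" constraint and asm syntax are GCC/Clang extensions, and it
only pins a single addition rather than solving the general problem
described above.

```cpp
#include <cfenv>

// Passes `value` through an empty asm statement as a read-write memory
// operand (GCC/Clang extension).  No instructions are emitted, but the
// compiler must assume the value may have changed.
static inline void opaque(double &value) {
    __asm__ __volatile__("" : "+m"(value));
}

// Hypothetical helper: computes b + c under the given rounding mode.
double add_in_mode(double b, double c, int mode) {
    const int saved = std::fegetround();
    std::fesetround(mode);
    opaque(b);                 // b + c cannot be reused from before the call
    double sum = b + c;
    opaque(sum);               // keep the addition ahead of the restore
    std::fesetround(saved);
    return sum;
}
```

Unlike the memory clobber, this has to be applied to every affected
operand by hand, which is one way to see why a frontend cannot simply
adopt it.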
  
Thomas Koenig Jan. 9, 2023, 3:27 p.m. UTC | #5
Hi Richard,

> There is no reliable way to get this correct at the moment and if there
> were good and easy ways to get this working they'd be implemented already.

OK, I then withdraw the patch (and have unassigned myself from the PR).

Best regards

	Thomas
  

Patch

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 4f3ae82d39c..29be7804e11 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -5981,6 +5981,20 @@  post_call:
     gfc_add_block_to_block (&parmse->post, &block);
 }
 
+/* Helper function - generate a memory barrier.  */
+
+static tree
+trans_memory_barrier (void)
+{
+  tree tmp;
+
+  tmp = gfc_build_string_const (sizeof ("memory"), "memory");
+  tmp = build5_loc (input_location, ASM_EXPR, void_type_node,
+		    gfc_build_string_const (1, ""), NULL_TREE, NULL_TREE,
+		    tree_cons (NULL_TREE, tmp, NULL_TREE), NULL_TREE);
+  ASM_VOLATILE_P (tmp) = 1;
+  return tmp;
+}
 
 /* Generate code for a procedure call.  Note can return se->post != NULL.
    If se->direct_byref is set then se->expr contains the return parameter.
@@ -7692,6 +7706,19 @@  gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
   else
     conv_base_obj_fcn_val (se, base_object, expr);
 
+  /* FIXME: Special handling of ieee_set_rounding_mode - we clobber
+     memory here to prevent common subexpression elimination from
+     reusing results across calls to ieee_set_rounding_mode.  This
+     should only be done for floating point, but currently gcc offers
+     no other possibility.  See PR 108329.  */
+
+  if (sym->from_intmod == INTMOD_IEEE_ARITHMETIC
+      && strcmp (sym->name, "ieee_set_rounding_mode") == 0)
+    {
+      tree tmp = trans_memory_barrier ();
+      gfc_add_expr_to_block (&post, tmp);
+    }
+
   /* If there are alternate return labels, function type should be
      integer.  Can't modify the type in place though, since it can be shared
      with other functions.  For dummy arguments, the typing is done to
diff --git a/gcc/testsuite/gfortran.dg/rounding_4.f90 b/gcc/testsuite/gfortran.dg/rounding_4.f90
new file mode 100644
index 00000000000..e8799da67dc
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/rounding_4.f90
@@ -0,0 +1,31 @@ 
+! { dg-do run }
+module y
+  implicit none
+  integer, parameter :: wp = selected_real_kind(15)
+contains
+  subroutine foo(a,b,c)
+    use ieee_arithmetic
+    real(kind=wp), dimension(4), intent(out) :: a
+    real(kind=wp), intent(in) :: b, c
+    type (ieee_round_type), dimension(4), parameter :: mode = &
+         [ieee_nearest, ieee_to_zero, ieee_up, ieee_down]
+    call ieee_set_rounding_mode (mode(1))
+    a(1) = b + c
+    call ieee_set_rounding_mode (mode(2))
+    a(2) = b + c
+    call ieee_set_rounding_mode (mode(3))
+    a(3) = b + c
+    call ieee_set_rounding_mode (mode(4))
+    a(4) = b + c
+  end subroutine foo
+end module y
+
+program main
+  use y
+  real(kind=wp), dimension(4) :: a
+  call foo(a,0.1_wp,0.2_wp)
+  if (a(1) <= a(2)) stop 1
+  if (a(3) <= a(4)) stop 2
+  if (a(1) /= a(3)) stop 3
+  if (a(2) /= a(4)) stop 4
+end program main