rtl: Try to remove EH edges after {pro,epi}logue generation [PR90259]
Checks
Commit Message
Hi,
After prologue and epilogue generation, the judgement on whether
one memory access onto stack frame may trap or not could change,
since we get more exact stack information by now.
As PR90259 shows, some memory access becomes impossible to trap
any more after prologue and epilogue generation, it can make
subsequent optimization be able to remove it if safe, but it
results in unexpected control flow status due to REG_EH_REGION
note missing.
This patch proposes to try to remove EH edges with function
purge_all_dead_edges after prologue and epilogue generation,
it simplifies CFG as early as we can and don't need any fixup
in downstream passes.
CFG simplification result with PR90259's case as example:
*before*
18: %1:TF=call [`__gcc_qdiv'] argc:0
REG_EH_REGION 0x2
77: NOTE_INSN_BASIC_BLOCK 3
19: NOTE_INSN_DELETED
20: NOTE_INSN_DELETED
110: [%31:SI+0x20]=%1:DF
REG_EH_REGION 0x2
116: NOTE_INSN_BASIC_BLOCK 4
111: [%31:SI+0x28]=%2:DF
REG_EH_REGION 0x2
22: NOTE_INSN_BASIC_BLOCK 5
108: %0:DF=[%31:SI+0x20]
REG_EH_REGION 0x2
117: NOTE_INSN_BASIC_BLOCK 6
109: %1:DF=[%31:SI+0x28]
REG_EH_REGION 0x2
79: NOTE_INSN_BASIC_BLOCK 7
26: [%31:SI+0x18]=%0:DF
104: pc=L69
105: barrier
*after*
18: %1:TF=call [`__gcc_qdiv'] argc:0
REG_EH_REGION 0x2
77: NOTE_INSN_BASIC_BLOCK 3
19: NOTE_INSN_DELETED
20: NOTE_INSN_DELETED
110: [%31:SI+0x20]=%1:DF
111: [%31:SI+0x28]=%2:DF
108: %0:DF=[%31:SI+0x20]
109: %1:DF=[%31:SI+0x28]
26: [%31:SI+0x18]=%0:DF
104: pc=L69
105: barrier
Bootstrapped and regtested on x86_64-redhat-linux,
aarch64-linux-gnu and powerpc64{,le}-linux-gnu.
Is it ok for trunk?
BR,
Kewen
-----
PR rtl-optimization/90259
gcc/ChangeLog:
* function.cc (rest_of_handle_thread_prologue_and_epilogue): Add
parameter fun, and call function purge_all_dead_edges.
(pass_thread_prologue_and_epilogue::execute): Name unamed parameter
as fun, and use it for rest_of_handle_thread_prologue_and_epilogue.
gcc/testsuite/ChangeLog:
* g++.target/powerpc/pr90259.C: New.
---
gcc/function.cc | 13 ++-
gcc/testsuite/g++.target/powerpc/pr90259.C | 103 +++++++++++++++++++++
2 files changed, 113 insertions(+), 3 deletions(-)
create mode 100644 gcc/testsuite/g++.target/powerpc/pr90259.C
--
2.35.4
Comments
On Tue, Nov 8, 2022 at 3:49 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> Hi,
>
> After prologue and epilogue generation, the judgement on whether
> one memory access onto stack frame may trap or not could change,
> since we get more exact stack information by now.
>
> As PR90259 shows, some memory access becomes impossible to trap
> any more after prologue and epilogue generation, it can make
> subsequent optimization be able to remove it if safe, but it
> results in unexpected control flow status due to REG_EH_REGION
> note missing.
>
> This patch proposes to try to remove EH edges with function
> purge_all_dead_edges after prologue and epilogue generation,
> it simplifies CFG as early as we can and don't need any fixup
> in downstream passes.
>
> CFG simplification result with PR90259's case as example:
>
> *before*
>
> 18: %1:TF=call [`__gcc_qdiv'] argc:0
> REG_EH_REGION 0x2
> 77: NOTE_INSN_BASIC_BLOCK 3
> 19: NOTE_INSN_DELETED
> 20: NOTE_INSN_DELETED
> 110: [%31:SI+0x20]=%1:DF
> REG_EH_REGION 0x2
> 116: NOTE_INSN_BASIC_BLOCK 4
> 111: [%31:SI+0x28]=%2:DF
> REG_EH_REGION 0x2
> 22: NOTE_INSN_BASIC_BLOCK 5
> 108: %0:DF=[%31:SI+0x20]
> REG_EH_REGION 0x2
> 117: NOTE_INSN_BASIC_BLOCK 6
> 109: %1:DF=[%31:SI+0x28]
> REG_EH_REGION 0x2
> 79: NOTE_INSN_BASIC_BLOCK 7
> 26: [%31:SI+0x18]=%0:DF
> 104: pc=L69
> 105: barrier
>
> *after*
>
> 18: %1:TF=call [`__gcc_qdiv'] argc:0
> REG_EH_REGION 0x2
> 77: NOTE_INSN_BASIC_BLOCK 3
> 19: NOTE_INSN_DELETED
> 20: NOTE_INSN_DELETED
> 110: [%31:SI+0x20]=%1:DF
> 111: [%31:SI+0x28]=%2:DF
> 108: %0:DF=[%31:SI+0x20]
> 109: %1:DF=[%31:SI+0x28]
> 26: [%31:SI+0x18]=%0:DF
> 104: pc=L69
> 105: barrier
>
> Bootstrapped and regtested on x86_64-redhat-linux,
> aarch64-linux-gnu and powerpc64{,le}-linux-gnu.
>
> Is it ok for trunk?
It looks reasonable - OK if the others CCed have no comments.
Thanks,
Richard.
> BR,
> Kewen
>
> -----
> PR rtl-optimization/90259
>
> gcc/ChangeLog:
>
> * function.cc (rest_of_handle_thread_prologue_and_epilogue): Add
> parameter fun, and call function purge_all_dead_edges.
> (pass_thread_prologue_and_epilogue::execute): Name unamed parameter
> as fun, and use it for rest_of_handle_thread_prologue_and_epilogue.
>
> gcc/testsuite/ChangeLog:
>
> * g++.target/powerpc/pr90259.C: New.
> ---
> gcc/function.cc | 13 ++-
> gcc/testsuite/g++.target/powerpc/pr90259.C | 103 +++++++++++++++++++++
> 2 files changed, 113 insertions(+), 3 deletions(-)
> create mode 100644 gcc/testsuite/g++.target/powerpc/pr90259.C
>
> diff --git a/gcc/function.cc b/gcc/function.cc
> index 6474a663b30..3757ded547d 100644
> --- a/gcc/function.cc
> +++ b/gcc/function.cc
> @@ -6540,7 +6540,7 @@ make_pass_leaf_regs (gcc::context *ctxt)
> }
>
> static unsigned int
> -rest_of_handle_thread_prologue_and_epilogue (void)
> +rest_of_handle_thread_prologue_and_epilogue (function *fun)
> {
> /* prepare_shrink_wrap is sensitive to the block structure of the control
> flow graph, so clean it up first. */
> @@ -6557,6 +6557,13 @@ rest_of_handle_thread_prologue_and_epilogue (void)
> Fix that up. */
> fixup_partitions ();
>
> + /* After prologue and epilogue generation, the judgement on whether
> + one memory access onto stack frame may trap or not could change,
> + since we get more exact stack information by now. So try to
> + remove any EH edges here, see PR90259. */
> + if (fun->can_throw_non_call_exceptions)
> + purge_all_dead_edges ();
> +
> /* Shrink-wrapping can result in unreachable edges in the epilogue,
> see PR57320. */
> cleanup_cfg (optimize ? CLEANUP_EXPENSIVE : 0);
> @@ -6625,9 +6632,9 @@ public:
> {}
>
> /* opt_pass methods: */
> - unsigned int execute (function *) final override
> + unsigned int execute (function * fun) final override
> {
> - return rest_of_handle_thread_prologue_and_epilogue ();
> + return rest_of_handle_thread_prologue_and_epilogue (fun);
> }
>
> }; // class pass_thread_prologue_and_epilogue
> diff --git a/gcc/testsuite/g++.target/powerpc/pr90259.C b/gcc/testsuite/g++.target/powerpc/pr90259.C
> new file mode 100644
> index 00000000000..db75ac7fe02
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/powerpc/pr90259.C
> @@ -0,0 +1,103 @@
> +/* { dg-require-effective-target long_double_ibm128 } */
> +/* { dg-options "-O2 -ffloat-store -fgcse -fnon-call-exceptions -fno-forward-propagate -fno-omit-frame-pointer -fstack-protector-all" } */
> +/* { dg-add-options long_double_ibm128 } */
> +
> +/* Verify there is no ICE. */
> +
> +template <int a> struct b
> +{
> + static constexpr int c = a;
> +};
> +template <bool a> using d = b<a>;
> +struct e
> +{
> + int f;
> + int
> + g ()
> + {
> + return __builtin_ceil (f / (long double) h);
> + }
> + float h;
> +};
> +template <typename, typename> using k = d<!bool ()>;
> +template <typename> class n
> +{
> +public:
> + e ae;
> + void af ();
> +};
> +template <typename l>
> +void
> +n<l>::af ()
> +{
> + ae.g ();
> +}
> +template <bool> using m = int;
> +template <typename ag, typename ah, typename ai = m<k<ag, ah>::c>>
> +using aj = n<ai>;
> +struct o
> +{
> + void
> + af ()
> + {
> + al.af ();
> + }
> + aj<int, int> al;
> +};
> +template <typename> class am;
> +template <typename i> class ao
> +{
> +protected:
> + static i *ap (int);
> +};
> +template <typename, typename> class p;
> +template <typename ar, typename i, typename... j> class p<ar (j...), i> : ao<i>
> +{
> +public:
> + static ar
> + as (const int &p1, j...)
> + {
> + (*ao<i>::ap (p1)) (j ()...);
> + }
> +};
> +template <typename ar, typename... j> class am<ar (j...)>
> +{
> + template <typename, typename> using av = int;
> +
> +public:
> + template <typename i, typename = av<d<!bool ()>, void>,
> + typename = av<i, void>>
> + am (i);
> + using aw = ar (*) (const int &, j...);
> + aw ax;
> +};
> +template <typename ar, typename... j>
> +template <typename i, typename, typename>
> +am<ar (j...)>::am (i)
> +{
> + ax = p<ar (j...), i>::as;
> +}
> +struct G
> +{
> + void ba (am<void (o)>);
> +};
> +struct q
> +{
> + q ()
> + {
> + G a;
> + a.ba (r ());
> + }
> + struct r
> + {
> + void
> + operator() (o p1)
> + try
> + {
> + p1.af ();
> + }
> + catch (int)
> + {
> + }
> + };
> +} s;
> --
> 2.35.4
> It looks reasonable - OK if the others CCed have no comments.
My only comment is that it needs to be tested with languages enabling -fnon-
call-exceptions by default (Ada & Go), if not already done.
Hi Richi and Eric,
Thanks for your reviews and comments!
on 2022/11/8 18:37, Eric Botcazou wrote:
>> It looks reasonable - OK if the others CCed have no comments.
>
> My only comment is that it needs to be tested with languages enabling -fnon-
> call-exceptions by default (Ada & Go), if not already done.
>
The previous testings on powerpc64{,le}-linux-gnu covered language Go, but
not Ada. I re-tested it with languages c,c++,fortran,objc,obj-c++,go,ada
on powerpc64le-linux-gnu, the result looked good. Both x86 and aarch64
cfarm machines which I used for testing don't have gnat installed, do you
think testing Ada on ppc64le is enough?
BR,
Kewen
> The previous testings on powerpc64{,le}-linux-gnu covered language Go, but
> not Ada. I re-tested it with languages c,c++,fortran,objc,obj-c++,go,ada
> on powerpc64le-linux-gnu, the result looked good. Both x86 and aarch64
> cfarm machines which I used for testing don't have gnat installed, do you
> think testing Ada on ppc64le is enough?
Sure, thanks for having done it!
on 2022/11/9 15:56, Eric Botcazou wrote:
>> The previous testings on powerpc64{,le}-linux-gnu covered language Go, but
>> not Ada. I re-tested it with languages c,c++,fortran,objc,obj-c++,go,ada
>> on powerpc64le-linux-gnu, the result looked good. Both x86 and aarch64
>> cfarm machines which I used for testing don't have gnat installed, do you
>> think testing Ada on ppc64le is enough?
>
> Sure, thanks for having done it!
>
Thanks for confirming.
I'm going to push this next Monday if there is no objection or further
comments in this week.
BR,
Kewen
on 2022/11/10 11:30, Kewen.Lin wrote:
> on 2022/11/9 15:56, Eric Botcazou wrote:
>>> The previous testings on powerpc64{,le}-linux-gnu covered language Go, but
>>> not Ada. I re-tested it with languages c,c++,fortran,objc,obj-c++,go,ada
>>> on powerpc64le-linux-gnu, the result looked good. Both x86 and aarch64
>>> cfarm machines which I used for testing don't have gnat installed, do you
>>> think testing Ada on ppc64le is enough?
>>
>> Sure, thanks for having done it!
>>
>
> Thanks for confirming.
>
> I'm going to push this next Monday if there is no objection or further
> comments in this week.
>
Committed in r13-4079. Do we want this to be backported a week later?
BR,
Kewen
On Wed, Nov 16, 2022 at 3:33 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> on 2022/11/10 11:30, Kewen.Lin wrote:
> > on 2022/11/9 15:56, Eric Botcazou wrote:
> >>> The previous testings on powerpc64{,le}-linux-gnu covered language Go, but
> >>> not Ada. I re-tested it with languages c,c++,fortran,objc,obj-c++,go,ada
> >>> on powerpc64le-linux-gnu, the result looked good. Both x86 and aarch64
> >>> cfarm machines which I used for testing don't have gnat installed, do you
> >>> think testing Ada on ppc64le is enough?
> >>
> >> Sure, thanks for having done it!
> >>
> >
> > Thanks for confirming.
> >
> > I'm going to push this next Monday if there is no objection or further
> > comments in this week.
> >
>
> Committed in r13-4079. Do we want this to be backported a week later?
It doesn't seem to be a regression and the report was from a fuzzed testcase
so I think no.
Richard.
> BR,
> Kewen
@@ -6540,7 +6540,7 @@ make_pass_leaf_regs (gcc::context *ctxt)
}
static unsigned int
-rest_of_handle_thread_prologue_and_epilogue (void)
+rest_of_handle_thread_prologue_and_epilogue (function *fun)
{
/* prepare_shrink_wrap is sensitive to the block structure of the control
flow graph, so clean it up first. */
@@ -6557,6 +6557,13 @@ rest_of_handle_thread_prologue_and_epilogue (void)
Fix that up. */
fixup_partitions ();
+ /* After prologue and epilogue generation, the judgement on whether
+ one memory access onto stack frame may trap or not could change,
+ since we get more exact stack information by now. So try to
+ remove any EH edges here, see PR90259. */
+ if (fun->can_throw_non_call_exceptions)
+ purge_all_dead_edges ();
+
/* Shrink-wrapping can result in unreachable edges in the epilogue,
see PR57320. */
cleanup_cfg (optimize ? CLEANUP_EXPENSIVE : 0);
@@ -6625,9 +6632,9 @@ public:
{}
/* opt_pass methods: */
- unsigned int execute (function *) final override
+ unsigned int execute (function * fun) final override
{
- return rest_of_handle_thread_prologue_and_epilogue ();
+ return rest_of_handle_thread_prologue_and_epilogue (fun);
}
}; // class pass_thread_prologue_and_epilogue
new file mode 100644
@@ -0,0 +1,103 @@
+/* { dg-require-effective-target long_double_ibm128 } */
+/* { dg-options "-O2 -ffloat-store -fgcse -fnon-call-exceptions -fno-forward-propagate -fno-omit-frame-pointer -fstack-protector-all" } */
+/* { dg-add-options long_double_ibm128 } */
+
+/* Verify there is no ICE. */
+
+template <int a> struct b
+{
+ static constexpr int c = a;
+};
+template <bool a> using d = b<a>;
+struct e
+{
+ int f;
+ int
+ g ()
+ {
+ return __builtin_ceil (f / (long double) h);
+ }
+ float h;
+};
+template <typename, typename> using k = d<!bool ()>;
+template <typename> class n
+{
+public:
+ e ae;
+ void af ();
+};
+template <typename l>
+void
+n<l>::af ()
+{
+ ae.g ();
+}
+template <bool> using m = int;
+template <typename ag, typename ah, typename ai = m<k<ag, ah>::c>>
+using aj = n<ai>;
+struct o
+{
+ void
+ af ()
+ {
+ al.af ();
+ }
+ aj<int, int> al;
+};
+template <typename> class am;
+template <typename i> class ao
+{
+protected:
+ static i *ap (int);
+};
+template <typename, typename> class p;
+template <typename ar, typename i, typename... j> class p<ar (j...), i> : ao<i>
+{
+public:
+ static ar
+ as (const int &p1, j...)
+ {
+ (*ao<i>::ap (p1)) (j ()...);
+ }
+};
+template <typename ar, typename... j> class am<ar (j...)>
+{
+ template <typename, typename> using av = int;
+
+public:
+ template <typename i, typename = av<d<!bool ()>, void>,
+ typename = av<i, void>>
+ am (i);
+ using aw = ar (*) (const int &, j...);
+ aw ax;
+};
+template <typename ar, typename... j>
+template <typename i, typename, typename>
+am<ar (j...)>::am (i)
+{
+ ax = p<ar (j...), i>::as;
+}
+struct G
+{
+ void ba (am<void (o)>);
+};
+struct q
+{
+ q ()
+ {
+ G a;
+ a.ba (r ());
+ }
+ struct r
+ {
+ void
+ operator() (o p1)
+ try
+ {
+ p1.af ();
+ }
+ catch (int)
+ {
+ }
+ };
+} s;