[v2] predict: Adjust optimize_function_for_size_p [PR105818]
Commit Message
Hi,
Function optimize_function_for_size_p returns OPTIMIZE_SIZE_NO
if fun->decl is not null but no cgraph node is available for it.
As PR105818 shows, this can cause unexpected consequences. For
the case in PR105818, when parsing the bar decl in function foo,
cfun is the function structure for foo, for which there is no
cgraph node yet, so it returns OPTIMIZE_SIZE_NO. But that is
incorrect, since the context is to optimize for size: the flag
optimize_size is true.
This patch makes optimize_function_for_size_p further check
opt_for_fn (fun->decl, optimize_size) when fun->decl is
available but there is no cgraph node, just like what function
cgraph_node::optimize_for_size_p does as its first step.
One regression failure was exposed on aarch64-linux-gnu:
PASS->FAIL: gcc.dg/guality/pr54693-2.c -Os \
-DPREVENT_OPTIMIZATION line 21 x == 10 - i
The difference comes from the macro LOGICAL_OP_NON_SHORT_CIRCUIT
used in function fold_range_test during C parsing; it uses
optimize_function_for_speed_p, which is the inversion of
optimize_function_for_size_p. At that time cfun->decl is valid
but there is no cgraph node for it, so without this patch
optimize_function_for_speed_p eventually returns true, while
with this patch it returns false. Since the command-line option
-Os is specified, there is no reason to interpret it as "for
speed". I think this failure is expected, so I adjusted the
test case accordingly.
v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596628.html
Compared with v1, v2 adopts opt_for_fn (fun->decl, optimize_size)
instead of optimize_size, as per Honza's previous comments.
Besides, the reply to Honza's question "Why exactly PR105818 hits
the flag change issue?" was at the link:
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596667.html
Bootstrapped and regtested on x86_64-redhat-linux,
aarch64-linux-gnu and powerpc64{,le}-linux-gnu.
Is it ok for trunk?
BR,
Kewen
-----
PR middle-end/105818
gcc/ChangeLog:
* predict.cc (optimize_function_for_size_p): Further check
optimize_size of fun->decl when it is valid but no cgraph node.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr105818.c: New test.
* gcc.dg/guality/pr54693-2.c: Adjust for aarch64.
---
gcc/predict.cc | 3 ++-
gcc/testsuite/gcc.dg/guality/pr54693-2.c | 2 +-
gcc/testsuite/gcc.target/powerpc/pr105818.c | 11 +++++++++++
3 files changed, 14 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr105818.c
--
2.27.0
Comments
Hi,
Gentle ping: https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607527.html
BR,
Kewen
on 2022/11/30 16:30, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> Function optimize_function_for_size_p returns OPTIMIZE_SIZE_NO
> if fun->decl is not null but no cgraph node is available for it.
> As PR105818 shows, this can cause unexpected consequences. For
> the case in PR105818, when parsing the bar decl in function foo,
> cfun is the function structure for foo, for which there is no
> cgraph node yet, so it returns OPTIMIZE_SIZE_NO. But that is
> incorrect, since the context is to optimize for size: the flag
> optimize_size is true.
>
> This patch makes optimize_function_for_size_p further check
> opt_for_fn (fun->decl, optimize_size) when fun->decl is
> available but there is no cgraph node, just like what function
> cgraph_node::optimize_for_size_p does as its first step.
>
> One regression failure was exposed on aarch64-linux-gnu:
>
> PASS->FAIL: gcc.dg/guality/pr54693-2.c -Os \
> -DPREVENT_OPTIMIZATION line 21 x == 10 - i
>
> The difference comes from the macro LOGICAL_OP_NON_SHORT_CIRCUIT
> used in function fold_range_test during C parsing; it uses
> optimize_function_for_speed_p, which is the inversion of
> optimize_function_for_size_p. At that time cfun->decl is valid
> but there is no cgraph node for it, so without this patch
> optimize_function_for_speed_p eventually returns true, while
> with this patch it returns false. Since the command-line option
> -Os is specified, there is no reason to interpret it as "for
> speed". I think this failure is expected, so I adjusted the
> test case accordingly.
>
> v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596628.html
>
> Compared with v1, v2 adopts opt_for_fn (fun->decl, optimize_size)
> instead of optimize_size, as per Honza's previous comments.
>
> Besides, the reply to Honza's question "Why exactly PR105818 hits
> the flag change issue?" was at the link:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596667.html
>
> Bootstrapped and regtested on x86_64-redhat-linux,
> aarch64-linux-gnu and powerpc64{,le}-linux-gnu.
>
> Is it ok for trunk?
>
> BR,
> Kewen
> -----
> PR middle-end/105818
>
> gcc/ChangeLog:
>
> * predict.cc (optimize_function_for_size_p): Further check
> optimize_size of fun->decl when it is valid but no cgraph node.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/powerpc/pr105818.c: New test.
> * gcc.dg/guality/pr54693-2.c: Adjust for aarch64.
> ---
> gcc/predict.cc | 3 ++-
> gcc/testsuite/gcc.dg/guality/pr54693-2.c | 2 +-
> gcc/testsuite/gcc.target/powerpc/pr105818.c | 11 +++++++++++
> 3 files changed, 14 insertions(+), 2 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/powerpc/pr105818.c
>
> diff --git a/gcc/predict.cc b/gcc/predict.cc
> index 1bc7ab94454..ecb4aabc9df 100644
> --- a/gcc/predict.cc
> +++ b/gcc/predict.cc
> @@ -268,7 +268,8 @@ optimize_function_for_size_p (struct function *fun)
> cgraph_node *n = cgraph_node::get (fun->decl);
> if (n)
> return n->optimize_for_size_p ();
> - return OPTIMIZE_SIZE_NO;
> + return opt_for_fn (fun->decl, optimize_size) ? OPTIMIZE_SIZE_MAX
> + : OPTIMIZE_SIZE_NO;
> }
>
> /* Return true if function FUN should always be optimized for speed. */
> diff --git a/gcc/testsuite/gcc.dg/guality/pr54693-2.c b/gcc/testsuite/gcc.dg/guality/pr54693-2.c
> index 68aa6c63d71..14ca94ad37d 100644
> --- a/gcc/testsuite/gcc.dg/guality/pr54693-2.c
> +++ b/gcc/testsuite/gcc.dg/guality/pr54693-2.c
> @@ -17,7 +17,7 @@ foo (int x, int y, int z)
> int i = 0;
> while (x > 3 && y > 3 && z > 3)
> { /* { dg-final { gdb-test .+2 "i" "v + 1" } } */
> - /* { dg-final { gdb-test .+1 "x" "10 - i" } } */
> + /* { dg-final { gdb-test .+1 "x" "10 - i" { xfail { aarch64*-*-* && { any-opts "-Os" } } } } } */
> bar (i); /* { dg-final { gdb-test . "y" "20 - 2 * i" } } */
> /* { dg-final { gdb-test .-1 "z" "30 - 3 * i" { xfail { aarch64*-*-* && { any-opts "-fno-fat-lto-objects" "-Os" } } } } } */
> i++, x--, y -= 2, z -= 3;
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr105818.c b/gcc/testsuite/gcc.target/powerpc/pr105818.c
> new file mode 100644
> index 00000000000..679647e189d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr105818.c
> @@ -0,0 +1,11 @@
> +/* { dg-options "-Os -fno-tree-vectorize" } */
> +
> +/* Verify there is no ICE. */
> +
> +#pragma GCC optimize "-fno-tree-vectorize"
> +
> +void
> +foo (void)
> +{
> + void bar (void);
> +}
> --
> 2.27.0
> > PR middle-end/105818
> >
> > gcc/ChangeLog:
> >
> > * predict.cc (optimize_function_for_size_p): Further check
> > optimize_size of fun->decl when it is valid but no cgraph node.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/powerpc/pr105818.c: New test.
> > * gcc.dg/guality/pr54693-2.c: Adjust for aarch64.
> > diff --git a/gcc/testsuite/gcc.target/powerpc/pr105818.c b/gcc/testsuite/gcc.target/powerpc/pr105818.c
> > new file mode 100644
> > index 00000000000..679647e189d
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr105818.c
> > @@ -0,0 +1,11 @@
> > +/* { dg-options "-Os -fno-tree-vectorize" } */
> > +
> > +/* Verify there is no ICE. */
> > +
> > +#pragma GCC optimize "-fno-tree-vectorize"
> > +
> > +void
> > +foo (void)
> > +{
> > + void bar (void);
> > +}
So the testcase starts with optimize_size set but then it switches to
optimize_size==0 due to the GCC optimize pragma. I think this is
behaviour Martin wants to change, so perhaps the testcase should be
written with explicit -O2.
I also wonder what happens when you add the attribute later?
/* { dg-options "-Os -fno-tree-vectorize" } */
/* Verify there is no ICE. */
#pragma GCC optimize "-fno-tree-vectorize"
void
foo (void)
{
void bar (void);
}
__attribute__ ((optimize("-fno-tree-vectorize"))) void foo (void);
I think we should generally avoid making decisions about size/speed
optimizations so early, since the setting may change due to attributes or
profile feedback...
Honza
On 12/14/22 14:22, Jan Hubicka via Gcc-patches wrote:
>>> PR middle-end/105818
>>>
>>> gcc/ChangeLog:
>>>
>>> * predict.cc (optimize_function_for_size_p): Further check
>>> optimize_size of fun->decl when it is valid but no cgraph node.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/powerpc/pr105818.c: New test.
>>> * gcc.dg/guality/pr54693-2.c: Adjust for aarch64.
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr105818.c b/gcc/testsuite/gcc.target/powerpc/pr105818.c
>>> new file mode 100644
>>> index 00000000000..679647e189d
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr105818.c
>>> @@ -0,0 +1,11 @@
>>> +/* { dg-options "-Os -fno-tree-vectorize" } */
>>> +
>>> +/* Verify there is no ICE. */
>>> +
>>> +#pragma GCC optimize "-fno-tree-vectorize"
>>> +
>>> +void
>>> +foo (void)
>>> +{
>>> + void bar (void);
>>> +}
Hi.
Next time, please CC me if you cite me.
> So the testcase starts with optimize_size set but then it switches to
> optimize_size==0 due to the GCC optimize pragma. I think this is
> behaviour Martin wants to change, so perhaps the testcase should be
> written with explicit -O2.
No, the pragma does not modify optimize_size, as the "optimize" attribute
behaves as documented:
```
...
The optimize attribute arguments of a function behave as if appended to the command-line.
```
Martin
>
> I also wonder what happens when you add the attribute later?
> /* { dg-options "-Os -fno-tree-vectorize" } */
>
> /* Verify there is no ICE. */
>
> #pragma GCC optimize "-fno-tree-vectorize"
>
> void
> foo (void)
> {
> void bar (void);
> }
>
> __attribute__ ((optimize("-fno-tree-vectorize"))) void foo (void);
>
> I think we should generally avoid making decisions about size/speed
> optimizations so early, since the setting may change due to attributes or
> profile feedback...
>
> Honza
Hi Honza,
Thanks for the comments.
on 2022/12/14 21:22, Jan Hubicka wrote:
>>> PR middle-end/105818
>>>
>>> gcc/ChangeLog:
>>>
>>> * predict.cc (optimize_function_for_size_p): Further check
>>> optimize_size of fun->decl when it is valid but no cgraph node.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/powerpc/pr105818.c: New test.
>>> * gcc.dg/guality/pr54693-2.c: Adjust for aarch64.
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr105818.c b/gcc/testsuite/gcc.target/powerpc/pr105818.c
>>> new file mode 100644
>>> index 00000000000..679647e189d
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr105818.c
>>> @@ -0,0 +1,11 @@
>>> +/* { dg-options "-Os -fno-tree-vectorize" } */
>>> +
>>> +/* Verify there is no ICE. */
>>> +
>>> +#pragma GCC optimize "-fno-tree-vectorize"
>>> +
>>> +void
>>> +foo (void)
>>> +{
>>> + void bar (void);
>>> +}
> So the testcase starts with optimize_size set but then it switches to
> optimize_size==0 due to the GCC optimize pragma. I think this is
> behaviour Martin wants to change, so perhaps the testcase should be
> written with explicit -O2.
No, both the global context and the function context are to optimize
for size, as Martin also clarified. So the issue is the inconsistent
information on optimizing for size when parsing bar: at that time
cfun->decl is available but there is no cgraph node, so it says
OPTIMIZE_SIZE_NO.
>
> I also wonder what happens when you add the attribute later?
> /* { dg-options "-Os -fno-tree-vectorize" } */
>
> /* Verify there is no ICE. */
>
> #pragma GCC optimize "-fno-tree-vectorize"
>
// marker A
> void
> foo (void)
> {
> void bar (void);
> }
>
> __attribute__ ((optimize("-fno-tree-vectorize"))) void foo (void);
There is still one ICE with this additional decl. The same ICE
occurs if I move it to "marker A" above.
>
> I think we should generally avoid making decisions about size/speed
> optimizations so early, since the setting may change due to attributes or
> profile feedback...
>
Good point, I'll make a separate patch to move the
optimize_function_for_speed_p uses out of function
rs6000_option_override_internal, but I think that is a separate
issue which just results in an imperfect "optimize for size"
decision. Fixing it can cover this exposed ICE, but IMHO the
inconsistent information on "optimize for size" exposed here is
still an issue: the whole context is to optimize for size, yet it
still returns OPTIMIZE_SIZE_NO. Do you agree?
BR,
Kewen