PR tree-optimization/108356 - Ranger cache - always use range_from_dom when updating.
Checks
Commit Message
This turned out to be a more interesting problem than I wanted.
the situation boils down to:
<bb 10>
# g_5 = PHI <0(2), 2(8)>
if (g_5 <= 1)
goto <bb 4>; [INV]
<bb 4> :
if (g_5 != 0)
goto <bb 7>; [INV]
else
goto <bb 8>; [INV]
<bb 7> :
c = 0;
<bb 8> :
goto <bb 10>; [INV]
We globally know that g_5 is [0,0][2,2]
we also know that 10->4 , g_5 will be [0,0]
Which means we also know that that the branch in bb_4 will never be
taken, and will always go to bb 8.
THis is all processed in EVRP, the branch is changed, and the next time
VRP is called, we blow the block with c = 0 like we want...
Unfortunately it doesnt happen within EVRP because when this updated
range for g_5 is propagated in the cache, it was tripping over a shotcut
which tried to avoid using lookups when it thinks it didnt matter, and
would occasionally drop back to the global range.
In particular, the cache had originally propagated [0,0][2,2] as the on
entry range to bb8. when we rewrite the branch, we mark 4->7 and 7->8
as unexecutable edges, then propagate the new range for g_5.. This
requires recalculating the existing range on entry to bb8.
It properly picked up [0,0] from 4->8, but when the cache query checked
the range from 7->8, it discovered that no value was yet set, so instead
of looking it up, it fell back to the global range of [0,0][2,2]. If it
properly calculates that edge instead, it comes up with UNDEFINED (as it
is unexecutable) and results in [0,0]... and EVRP then also removes the
block is c = 0 in.
By picking up the global value, it still thought 2 was a possibility
later on, and a single VRP pass couldn't eliminate the branch ultimately
leading to the store... it required a second one with the adjusted CFG
to catch it.
This patch tells the cache to always do a read-only scan of the
dominator tree to find the nearest actual value and use that instead.
This may solve other lingering weird propagation issues.
I also ran a performance run on this change. It does slow VRP by down
about 1%, but the overall change is nominal at around 0.05%.
Bootstraps on x86_64-pc-linux-gnu with no regressions. OK?
Andrew
Comments
On Tue, Jan 31, 2023 at 9:11 PM Andrew MacLeod via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> This turned out to be a more interesting problem than I wanted.
>
> the situation boils down to:
>
> <bb 10>
> # g_5 = PHI <0(2), 2(8)>
> if (g_5 <= 1)
> goto <bb 4>; [INV]
>
> <bb 4> :
> if (g_5 != 0)
> goto <bb 7>; [INV]
> else
> goto <bb 8>; [INV]
>
> <bb 7> :
> c = 0;
>
> <bb 8> :
> goto <bb 10>; [INV]
>
> We globally know that g_5 is [0,0][2,2]
> we also know that 10->4 , g_5 will be [0,0]
> Which means we also know that that the branch in bb_4 will never be
> taken, and will always go to bb 8.
> THis is all processed in EVRP, the branch is changed, and the next time
> VRP is called, we blow the block with c = 0 like we want...
>
> Unfortunately it doesnt happen within EVRP because when this updated
> range for g_5 is propagated in the cache, it was tripping over a shotcut
> which tried to avoid using lookups when it thinks it didnt matter, and
> would occasionally drop back to the global range.
>
> In particular, the cache had originally propagated [0,0][2,2] as the on
> entry range to bb8. when we rewrite the branch, we mark 4->7 and 7->8
> as unexecutable edges, then propagate the new range for g_5.. This
> requires recalculating the existing range on entry to bb8.
>
> It properly picked up [0,0] from 4->8, but when the cache query checked
> the range from 7->8, it discovered that no value was yet set, so instead
> of looking it up, it fell back to the global range of [0,0][2,2]. If it
> properly calculates that edge instead, it comes up with UNDEFINED (as it
> is unexecutable) and results in [0,0]... and EVRP then also removes the
> block is c = 0 in.
>
> By picking up the global value, it still thought 2 was a possibility
> later on, and a single VRP pass couldn't eliminate the branch ultimately
> leading to the store... it required a second one with the adjusted CFG
> to catch it.
>
> This patch tells the cache to always do a read-only scan of the
> dominator tree to find the nearest actual value and use that instead.
> This may solve other lingering weird propagation issues.
>
> I also ran a performance run on this change. It does slow VRP by down
> about 1%, but the overall change is nominal at around 0.05%.
>
> Bootstraps on x86_64-pc-linux-gnu with no regressions. OK?
OK.
Thanks,
Richard.
> Andrew
>
From ddb9df254ed2bc5f11bdde213e1485d7971a25d9 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod <amacleod@redhat.com>
Date: Tue, 10 Jan 2023 13:40:56 -0500
Subject: [PATCH] Ranger cache - always use range_from_dom when updating.
When updating an existing range, if we dont query the dom tree, we can
get the global range instead of a proper range on some incoming edges
which cause the range to not be refined properly.
PR tree-optimization/108356
gcc/
* gimple-range-cache.cc (ranger_cache::range_on_edge): Always
do a search of the DOM tree for a range.
gcc/testsuite/
* gcc.dg/pr108356.c: New.
---
gcc/gimple-range-cache.cc | 2 +-
gcc/testsuite/gcc.dg/pr108356.c | 23 +++++++++++++++++++++++
2 files changed, 24 insertions(+), 1 deletion(-)
create mode 100644 gcc/testsuite/gcc.dg/pr108356.c
@@ -998,7 +998,7 @@ bool
ranger_cache::range_on_edge (vrange &r, edge e, tree expr)
{
if (gimple_range_ssa_p (expr))
- return edge_range (r, e, expr, RFD_NONE);
+ return edge_range (r, e, expr, RFD_READ_ONLY);
return get_tree_range (r, expr, NULL);
}
new file mode 100644
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+char a;
+static char c = 3;
+char d;
+void foo();
+short(b)(short e, short f) { return e + f; }
+int main() {
+ unsigned g = 0;
+ if (c)
+ ;
+ else
+ foo();
+ for (; g < 2; g = b(g, 2)) {
+ d = g ? 0 : a;
+ if (g)
+ c = 0;
+ }
+}
+
+
+/* { dg-final { scan-tree-dump-not "foo" "optimized" } } */
--
2.39.0