early-remat: Resync with new DF postorders [PR109940]
Checks
Commit Message
When I wrote early-remat, the DF_FORWARD block order was a postorder
of a reverse/backward walk (i.e. of the inverted cfg), rather than a
reverse postorder of a forward walk. A postorder of a backward walk
lacked the important property that dominators come before the blocks
they dominate; instead it ensures that postdominators come after
the blocks that they postdominate.
The DF_BACKWARD block order was similarly a postorder of a forward
walk. Since early-remat wanted a standard postorder and reverse
postorder with normal dominator properties, it used the DF_BACKWARD
order instead of the DF_FORWARD order.
g:53dddbfeb213ac4ec39f fixed the DF orders so that DF_FORWARD was
an RPO of a forward walk and so that DF_BACKWARD was an RPO of a
backward walk. This meant that iterating backwards over the
DF_BACKWARD order had the exact problem that the original DF_FORWARD
order had, triggering a flurry of ICEs for SVE.
This fixes the build with SVE enabled. It also fixes an ICE
in g++.target/aarch64/sve/pr99766.C with normal builds. I've
included the test from the PR as well, for extra coverage.
Tested on aarch64-linux-gnu and aarch64_be-elf. OK to install?
Richard
gcc/
PR rtl-optimization/109940
* early-remat.cc (postorder_index): Rename to...
(rpo_index): ...this.
(compare_candidates): Sort by decreasing rpo_index rather than
increasing postorder_index.
(early_remat::sort_candidates): Calculate the forward RPO from
DF_FORWARD.
(early_remat::local_phase): Follow forward RPO using DF_FORWARD,
rather than DF_BACKWARD in reverse.
gcc/testsuite/
* gcc.dg/torture/pr109940.c: New test.
---
gcc/early-remat.cc | 28 ++++++++++++-------------
gcc/testsuite/gcc.dg/torture/pr109940.c | 18 ++++++++++++++++
2 files changed, 32 insertions(+), 14 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/torture/pr109940.c
Comments
On Wed, 24 May 2023, Richard Sandiford wrote:
> When I wrote early-remat, the DF_FORWARD block order was a postorder
> of a reverse/backward walk (i.e. of the inverted cfg), rather than a
> reverse postorder of a forward walk. A postorder of a backward walk
> lacked the important property that dominators come before the blocks
> they dominate; instead it ensures that postdominators come after
> the blocks that they postdominate.
>
> The DF_BACKWARD block order was similarly a postorder of a forward
> walk. Since early-remat wanted a standard postorder and reverse
> postorder with normal dominator properties, it used the DF_BACKWARD
> order instead of the DF_FORWARD order.
>
> g:53dddbfeb213ac4ec39f fixed the DF orders so that DF_FORWARD was
> an RPO of a forward walk and so that DF_BACKWARD was an RPO of a
> backward walk. This meant that iterating backwards over the
> DF_BACKWARD order had the exact problem that the original DF_FORWARD
> order had, triggering a flurry of ICEs for SVE.
>
> This fixes the build with SVE enabled. It also fixes an ICE
> in g++.target/aarch64/sve/pr99766.C with normal builds. I've
> included the test from the PR as well, for extra coverage.
>
> Tested on aarch64-linux-gnu and aarch64_be-elf. OK to install?
OK.
Thanks,
Richard.
> Richard
>
>
> gcc/
> PR rtl-optimization/109940
> * early-remat.cc (postorder_index): Rename to...
> (rpo_index): ...this.
> (compare_candidates): Sort by decreasing rpo_index rather than
> increasing postorder_index.
> (early_remat::sort_candidates): Calculate the forward RPO from
> DF_FORWARD.
> (early_remat::local_phase): Follow forward RPO using DF_FORWARD,
> rather than DF_BACKWARD in reverse.
>
> gcc/testsuite/
> * gcc.dg/torture/pr109940.c: New test.
> ---
> gcc/early-remat.cc | 28 ++++++++++++-------------
> gcc/testsuite/gcc.dg/torture/pr109940.c | 18 ++++++++++++++++
> 2 files changed, 32 insertions(+), 14 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/torture/pr109940.c
>
> diff --git a/gcc/early-remat.cc b/gcc/early-remat.cc
> index b76771eaf0d..1ee63c73c1b 100644
> --- a/gcc/early-remat.cc
> +++ b/gcc/early-remat.cc
> @@ -1010,8 +1010,8 @@ early_remat::init_block_info (void)
> m_block_info.safe_grow_cleared (n_blocks, true);
> }
>
> -/* Maps basic block indices to their position in the post order. */
> -static unsigned int *postorder_index;
> +/* Maps basic block indices to their position in the forward RPO. */
> +static unsigned int *rpo_index;
>
> /* Order remat_candidates X_IN and Y_IN according to the cfg postorder. */
>
> @@ -1024,7 +1024,7 @@ compare_candidates (const void *x_in, const void *y_in)
> basic_block y_bb = BLOCK_FOR_INSN (y->insn);
> if (x_bb != y_bb)
> /* Make X and Y follow block postorder. */
> - return postorder_index[x_bb->index] - postorder_index[y_bb->index];
> + return rpo_index[y_bb->index] - rpo_index[x_bb->index];
>
> /* Make X and Y follow a backward traversal of the containing block. */
> return DF_INSN_LUID (y->insn) - DF_INSN_LUID (x->insn);
> @@ -1051,15 +1051,15 @@ early_remat::sort_candidates (void)
> /* Create a mapping from block numbers to their position in the
> postorder. */
> unsigned int n_blocks = last_basic_block_for_fn (m_fn);
> - int *postorder = df_get_postorder (DF_BACKWARD);
> - unsigned int postorder_len = df_get_n_blocks (DF_BACKWARD);
> - postorder_index = new unsigned int[n_blocks];
> - for (unsigned int i = 0; i < postorder_len; ++i)
> - postorder_index[postorder[i]] = i;
> + int *rpo = df_get_postorder (DF_FORWARD);
> + unsigned int rpo_len = df_get_n_blocks (DF_FORWARD);
> + rpo_index = new unsigned int[n_blocks];
> + for (unsigned int i = 0; i < rpo_len; ++i)
> + rpo_index[rpo[i]] = i;
>
> m_candidates.qsort (compare_candidates);
>
> - delete[] postorder_index;
> + delete[] rpo_index;
> }
>
> /* Commit to the current candidate indices and initialize cross-references. */
> @@ -2097,11 +2097,11 @@ early_remat::local_phase (void)
> if (dump_file)
> fprintf (dump_file, "\n;; Local phase:\n");
>
> - int *postorder = df_get_postorder (DF_BACKWARD);
> - unsigned int postorder_len = df_get_n_blocks (DF_BACKWARD);
> - for (unsigned int i = postorder_len; i-- > 0; )
> - if (postorder[i] >= NUM_FIXED_BLOCKS)
> - process_block (BASIC_BLOCK_FOR_FN (m_fn, postorder[i]));
> + int *rpo = df_get_postorder (DF_FORWARD);
> + unsigned int rpo_len = df_get_n_blocks (DF_FORWARD);
> + for (unsigned int i = 0; i < rpo_len; ++i)
> + if (rpo[i] >= NUM_FIXED_BLOCKS)
> + process_block (BASIC_BLOCK_FOR_FN (m_fn, rpo[i]));
> }
>
> /* Return true if available values survive across edge E. */
> diff --git a/gcc/testsuite/gcc.dg/torture/pr109940.c b/gcc/testsuite/gcc.dg/torture/pr109940.c
> new file mode 100644
> index 00000000000..23364708e86
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr109940.c
> @@ -0,0 +1,18 @@
> +/* { dg-additional-options "-march=armv9-a" { target aarch64*-*-* } } */
> +
> +int a;
> +int *b;
> +void
> +c (int *d) { *d = a; }
> +
> +int
> +e(int d, int f) {
> + if (d <= 1)
> + return 1;
> + int g = d / 2;
> + for (int h = 0; h < g; h++)
> + if (f == (long int)b > b[h])
> + c(&b[h]);
> + e(g, f);
> + e(g, f);
> +}
>
@@ -1010,8 +1010,8 @@ early_remat::init_block_info (void)
m_block_info.safe_grow_cleared (n_blocks, true);
}
-/* Maps basic block indices to their position in the post order. */
-static unsigned int *postorder_index;
+/* Maps basic block indices to their position in the forward RPO. */
+static unsigned int *rpo_index;
/* Order remat_candidates X_IN and Y_IN according to the cfg postorder. */
@@ -1024,7 +1024,7 @@ compare_candidates (const void *x_in, const void *y_in)
basic_block y_bb = BLOCK_FOR_INSN (y->insn);
if (x_bb != y_bb)
/* Make X and Y follow block postorder. */
- return postorder_index[x_bb->index] - postorder_index[y_bb->index];
+ return rpo_index[y_bb->index] - rpo_index[x_bb->index];
/* Make X and Y follow a backward traversal of the containing block. */
return DF_INSN_LUID (y->insn) - DF_INSN_LUID (x->insn);
@@ -1051,15 +1051,15 @@ early_remat::sort_candidates (void)
/* Create a mapping from block numbers to their position in the
postorder. */
unsigned int n_blocks = last_basic_block_for_fn (m_fn);
- int *postorder = df_get_postorder (DF_BACKWARD);
- unsigned int postorder_len = df_get_n_blocks (DF_BACKWARD);
- postorder_index = new unsigned int[n_blocks];
- for (unsigned int i = 0; i < postorder_len; ++i)
- postorder_index[postorder[i]] = i;
+ int *rpo = df_get_postorder (DF_FORWARD);
+ unsigned int rpo_len = df_get_n_blocks (DF_FORWARD);
+ rpo_index = new unsigned int[n_blocks];
+ for (unsigned int i = 0; i < rpo_len; ++i)
+ rpo_index[rpo[i]] = i;
m_candidates.qsort (compare_candidates);
- delete[] postorder_index;
+ delete[] rpo_index;
}
/* Commit to the current candidate indices and initialize cross-references. */
@@ -2097,11 +2097,11 @@ early_remat::local_phase (void)
if (dump_file)
fprintf (dump_file, "\n;; Local phase:\n");
- int *postorder = df_get_postorder (DF_BACKWARD);
- unsigned int postorder_len = df_get_n_blocks (DF_BACKWARD);
- for (unsigned int i = postorder_len; i-- > 0; )
- if (postorder[i] >= NUM_FIXED_BLOCKS)
- process_block (BASIC_BLOCK_FOR_FN (m_fn, postorder[i]));
+ int *rpo = df_get_postorder (DF_FORWARD);
+ unsigned int rpo_len = df_get_n_blocks (DF_FORWARD);
+ for (unsigned int i = 0; i < rpo_len; ++i)
+ if (rpo[i] >= NUM_FIXED_BLOCKS)
+ process_block (BASIC_BLOCK_FOR_FN (m_fn, rpo[i]));
}
/* Return true if available values survive across edge E. */
new file mode 100644
@@ -0,0 +1,18 @@
+/* { dg-additional-options "-march=armv9-a" { target aarch64*-*-* } } */
+
+int a;
+int *b;
+void
+c (int *d) { *d = a; }
+
+int
+e(int d, int f) {
+ if (d <= 1)
+ return 1;
+ int g = d / 2;
+ for (int h = 0; h < g; h++)
+ if (f == (long int)b > b[h])
+ c(&b[h]);
+ e(g, f);
+ e(g, f);
+}