Fix builtin vs non-builtin partition merge in loop distribution
Commit Message
When r7-6373-g40b6bff965d004 fixed a costing issue, it failed to
make the logic symmetric: we now fuse normal vs. builtin partitions
when the cost model says so, but we don't fuse builtin vs. normal.
The following fixes that. It also allows the cost model to decide to
fuse two builtin partitions, as otherwise an intermediate non-builtin
partition can result in a partial merge as well.
Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
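
The asymmetry described above can be illustrated with a small toy model.
This is only a sketch, not GCC code: toy_partition, toy_kind, mem_group
and toy_fuse are hypothetical names, and "same mem_group" is an assumed
stand-in for the cost model deciding that two partitions share memory
accesses and should be fused.  The skip_builtin_into flag mimics the
early "continue" that the patch removes.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for illustration only; not GCC's real data structures.  */
enum toy_kind { TOY_NORMAL, TOY_BUILTIN };

struct toy_partition
{
  enum toy_kind kind;
  int mem_group;     /* Equal groups model "shares memory accesses".  */
  bool merged_away;  /* Set once the partition was fused into another.  */
};

/* Fuse later partitions into earlier ones when they share a memory
   group.  With SKIP_BUILTIN_INTO set, a builtin partition is never
   used as the merge target, which models the old, asymmetric logic.
   Returns the number of partitions left after fusing.  */
static int
toy_fuse (struct toy_partition *parts, int n, bool skip_builtin_into)
{
  assert (n >= 0);
  int remaining = n;
  for (int i = 0; i < n; ++i)
    {
      if (parts[i].merged_away)
        continue;
      if (skip_builtin_into && parts[i].kind == TOY_BUILTIN)
        continue;
      for (int j = i + 1; j < n; ++j)
        {
          if (parts[j].merged_away)
            continue;
          if (parts[i].mem_group == parts[j].mem_group)
            {
              parts[j].merged_away = true;
              --remaining;
            }
        }
    }
  return remaining;
}
```

In this model a { normal, builtin } pair in one memory group is fused
either way, while a { builtin, normal } or { builtin, builtin } pair is
only fused once skip_builtin_into is false, so the result no longer
depends on partition order.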
* tree-loop-distribution.cc (loop_distribution::distribute_loop):
When computing cost-based merging do not disregard builtin
classified partitions in some cases.
* gcc.dg/tree-ssa/ldist-24.c: XFAIL.
* gcc.dg/tree-ssa/ldist-36.c: Adjust expected outcome.
---
gcc/testsuite/gcc.dg/tree-ssa/ldist-24.c | 5 +++--
gcc/testsuite/gcc.dg/tree-ssa/ldist-36.c | 3 ++-
gcc/tree-loop-distribution.cc | 5 +----
3 files changed, 6 insertions(+), 7 deletions(-)
@@ -20,5 +20,6 @@ void foo ()
}
}
-/* { dg-final { scan-tree-dump "generated memcpy" "ldist" } } */
-/* { dg-final { scan-tree-dump "generated memset zero" "ldist" } } */
+/* The cost modeling does not consider WAR as beneficial to split. */
+/* { dg-final { scan-tree-dump "generated memcpy" "ldist" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump "generated memset zero" "ldist" { xfail *-*-* } } } */
@@ -25,4 +25,5 @@ foo (struct st * restrict p)
}
}
-/* { dg-final { scan-tree-dump-times "Loop nest . distributed: split to 0 loops and 3 library" 1 "ldist" } } */
+/* The cost modeling doesn't consider splitting a WAR re-use profitable. */
+/* { dg-final { scan-tree-dump-times "Loop nest . distributed: split to 1 loops and 1 library" 1 "ldist" } } */
@@ -3090,10 +3090,7 @@ loop_distribution::distribute_loop (class loop *loop,
for (i = 0; partitions.iterate (i, &into); ++i)
{
bool changed = false;
- if (partition_builtin_p (into) || into->kind == PKIND_PARTIAL_MEMSET)
- continue;
- for (int j = i + 1;
- partitions.iterate (j, &partition); ++j)
+ for (int j = i + 1; partitions.iterate (j, &partition); ++j)
{
if (share_memory_accesses (rdg, into, partition))
{