RISC-V: Enable pressure-aware scheduling by default.
Checks
Commit Message
Hi,
this patch enables pressure-aware scheduling for riscv. There have been
various requests for it so I figured I'd just go ahead and send
the patch.
There is some slight regression in code quality for a number of
vector tests where we spill more due to different instructions order.
The ones I looked at were a mix of bad luck and/or brittle tests.
Comparing the size of the generated assembly or the number of vsetvls
for SPECint also didn't show any immediate benefit but that's obviously
not a very fine-grained analysis.
As cost and scheduling models mature I expect the situation to improve
and for now I think it's generally favorable to enable pressure-aware
scheduling so we can work with it rather than trying to find every
possible problem in advance. Any other opinions on that?
Regards
Robin
This patch enables register -fsched-pressure by default and sets
the algorithm to "model". As with other backends, this helps
reduce unnecessary spills.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add -fsched-pressure.
* config/riscv/riscv.cc (riscv_option_override): Set sched
pressure algorithm.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/narrow_constraint-1.c: Add
-fno-sched-pressure.
* gcc.target/riscv/rvv/base/narrow_constraint-17.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-18.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-19.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-20.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-21.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-22.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-23.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-24.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-25.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-26.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-27.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-28.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-29.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-30.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-31.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-4.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-5.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-8.c: Ditto.
* gcc.target/riscv/rvv/base/narrow_constraint-9.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-11.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-12.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-3.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-9.c: Ditto.
---
gcc/common/config/riscv/riscv-common.cc | 2 ++
gcc/config/riscv/riscv.cc | 5 +++++
.../gcc.target/riscv/rvv/base/narrow_constraint-1.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-17.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-18.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-19.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-20.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-21.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-22.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-23.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-24.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-25.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-26.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-27.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-28.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-29.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-30.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-31.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-4.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-5.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-8.c | 2 +-
.../gcc.target/riscv/rvv/base/narrow_constraint-9.c | 2 +-
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c | 2 +-
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-11.c | 2 +-
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-12.c | 2 +-
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-3.c | 2 +-
gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-9.c | 2 +-
27 files changed, 32 insertions(+), 25 deletions(-)
Comments
On 8/18/23 07:57, Robin Dapp wrote:
> Hi,
>
> this patch enables pressure-aware scheduling for riscv. There have been
> various requests for it so I figured I'd just go ahead and send
> the patch.
Thanks. Your timing is good, I was pondering nominating a victim to
pick this up next week ;-)
>
> There is some slight regression in code quality for a number of
> vector tests where we spill more due to different instructions order.
> The ones I looked at were a mix of bad luck and/or brittle tests.
> Comparing the size of the generated assembly or the number of vsetvls
> for SPECint also didn't show any immediate benefit but that's obviously
> not a very fine-grained analysis.
Yea. In fact I wouldn't really expect significant changes other than
those key loops in x264.
>
> As cost and scheduling models mature I expect the situation to improve
> and for now I think it's generally favorable to enable pressure-aware
> scheduling so we can work with it rather than trying to find every
> possible problem in advance. Any other opinions on that?
Agreed. I wouldn't be surprised if largely turns into a set-and-forget.
>
> Regards
> Robin
>
> This patch enables register -fsched-pressure by default and sets
> the algorithm to "model". As with other backends, this helps
> reduce unnecessary spills.
>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc: Add -fsched-pressure.
> * config/riscv/riscv.cc (riscv_option_override): Set sched
> pressure algorithm.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/narrow_constraint-1.c: Add
> -fno-sched-pressure.
> * gcc.target/riscv/rvv/base/narrow_constraint-17.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-18.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-19.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-20.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-21.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-22.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-23.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-24.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-25.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-26.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-27.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-28.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-29.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-30.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-31.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-4.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-5.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-8.c: Ditto.
> * gcc.target/riscv/rvv/base/narrow_constraint-9.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-11.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-12.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-3.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-9.c: Ditto.
OK. Just one nit below.
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
> index 4737dcd44a1..59848b21162 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -2017,9 +2017,11 @@ static const struct default_options riscv_option_optimization_table[] =
> {
> { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
> { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
> + { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
> #if TARGET_DEFAULT_ASYNC_UNWIND_TABLES == 1
> { OPT_LEVELS_ALL, OPT_fasynchronous_unwind_tables, NULL, 1 },
> { OPT_LEVELS_ALL, OPT_funwind_tables, NULL, 1},
> + /* Enable -fsched-pressure by default when optimizing. */
> #endif
> { OPT_LEVELS_NONE, 0, NULL, 0 }
> };
Shouldn't the comment move up to before the OPT_fsched_pressure line?
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 49062bef9fc..96c5362d2fd 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -65,6 +65,7 @@ along with GCC; see the file COPYING3. If not see
> #include "cfgloop.h"
> #include "cfgrtl.h"
> #include "sel-sched.h"
> +#include "sched-int.h"
> #include "fold-const.h"
> #include "gimple-iterator.h"
> #include "gimple-expr.h"
> @@ -7095,6 +7096,10 @@ riscv_option_override (void)
> sorry (
> "Current RISC-V GCC cannot support VLEN greater than 4096bit for 'V' Extension");
>
> + SET_OPTION_IF_UNSET (&global_options, &global_options_set,
> + param_sched_pressure_algorithm,
> + SCHED_PRESSURE_MODEL);
> +
FWIW, I tried both variants of the pressure model. They seemed roughly
on-par with each other across specint.
Jeff
On 8/18/23 16:08, Jeff Law wrote:
>> There is some slight regression in code quality for a number of
>> vector tests where we spill more due to different instructions order.
>> The ones I looked at were a mix of bad luck and/or brittle tests.
>> Comparing the size of the generated assembly or the number of vsetvls
>> for SPECint also didn't show any immediate benefit but that's obviously
>> not a very fine-grained analysis.
> Yea. In fact I wouldn't really expect significant changes other than
> those key loops in x264.
Care to elaborate a bit more please. I've seen severe reg pressure /
spills in a bunch of others: cactu, lbm, exchange2. Is there something
specific to x264 spills ?
>
>>
>> diff --git a/gcc/common/config/riscv/riscv-common.cc
>> b/gcc/common/config/riscv/riscv-common.cc
>> index 4737dcd44a1..59848b21162 100644
>> --- a/gcc/common/config/riscv/riscv-common.cc
>> +++ b/gcc/common/config/riscv/riscv-common.cc
>> @@ -2017,9 +2017,11 @@ static const struct default_options
>> riscv_option_optimization_table[] =
>> {
>> { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
>> { OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
>> + { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
Nit2: maybe move this 1 line up to keep LEVEL_1 together, at least the
new ones being added.
>> #if TARGET_DEFAULT_ASYNC_UNWIND_TABLES == 1
>> { OPT_LEVELS_ALL, OPT_fasynchronous_unwind_tables, NULL, 1 },
>> { OPT_LEVELS_ALL, OPT_funwind_tables, NULL, 1},
>> + /* Enable -fsched-pressure by default when optimizing. */
>> #endif
>> { OPT_LEVELS_NONE, 0, NULL, 0 }
>> };
> Shouldn't the comment move up to before the OPT_fsched_pressure line?
Yep I had the exact same first though but then thought it was something
deeper.
Turned out to be Occam's Razor after all :-)
Thx,
-Vineet
On 8/18/23 17:24, Vineet Gupta wrote:
>
>
> On 8/18/23 16:08, Jeff Law wrote:
>>> There is some slight regression in code quality for a number of
>>> vector tests where we spill more due to different instructions order.
>>> The ones I looked at were a mix of bad luck and/or brittle tests.
>>> Comparing the size of the generated assembly or the number of vsetvls
>>> for SPECint also didn't show any immediate benefit but that's obviously
>>> not a very fine-grained analysis.
>> Yea. In fact I wouldn't really expect significant changes other than
>> those key loops in x264.
>
> Care to elaborate a bit more please. I've seen severe reg pressure /
> spills in a bunch of others: cactu, lbm, exchange2. Is there something
> specific to x264 spills ?
The only thing that's particularly interesting about the x264 spills is
they're caused by scheduling.
In simplest terms GCC's scheduler tries to minimize the latency of the
critical path in a block. For x264 we've got a loop that we unrolled 8
times with 8 byte sized loads per loop iteration. So 64 byte loads, all
higher from a critical path latency standpoint than anything else.
Naturally there's no way we can hold 64 values live as we only have 32
registers and thus we blow out the register file.
By turning on pressure sensitive scheduling, as register pressure
approaches the threshold, the scheduler will select a lower priority
instruction (say computing the difference of two previously loaded
values) that reduces register pressure. So it's not critical path
optimal, but it keep us from blowing out the register file and
ultimately we get better performance as a result.
jeff
@@ -2017,9 +2017,11 @@ static const struct default_options riscv_option_optimization_table[] =
{
{ OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
{ OPT_LEVELS_2_PLUS, OPT_free, NULL, 1 },
+ { OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
#if TARGET_DEFAULT_ASYNC_UNWIND_TABLES == 1
{ OPT_LEVELS_ALL, OPT_fasynchronous_unwind_tables, NULL, 1 },
{ OPT_LEVELS_ALL, OPT_funwind_tables, NULL, 1},
+ /* Enable -fsched-pressure by default when optimizing. */
#endif
{ OPT_LEVELS_NONE, 0, NULL, 0 }
};
@@ -65,6 +65,7 @@ along with GCC; see the file COPYING3. If not see
#include "cfgloop.h"
#include "cfgrtl.h"
#include "sel-sched.h"
+#include "sched-int.h"
#include "fold-const.h"
#include "gimple-iterator.h"
#include "gimple-expr.h"
@@ -7095,6 +7096,10 @@ riscv_option_override (void)
sorry (
"Current RISC-V GCC cannot support VLEN greater than 4096bit for 'V' Extension");
+ SET_OPTION_IF_UNSET (&global_options, &global_options_set,
+ param_sched_pressure_algorithm,
+ SCHED_PRESSURE_MODEL);
+
/* Convert -march to a chunks count. */
riscv_vector_chunks = riscv_convert_vector_bits ();
}
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3 -fno-sched-pressure" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-sched-pressure -fno-tree-vectorize" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-sched-pressure -fno-tree-vectorize" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-sched-pressure -fno-tree-vectorize" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-sched-pressure -fno-tree-vectorize" } */
#include "riscv_vector.h"
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize" } */
+/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-sched-pressure -fno-tree-vectorize" } */
#include "riscv_vector.h"