aarch64: update ampere1 vectorization cost
Checks
Commit Message
The original submission of AmpereOne (-mcpu=ampere1) costs occurred
prior to exhaustive testing of vectorizable workloads against
hardware.
Adjust the vector costs to achieve the best results and more closely
match the underlying hardware.
gcc/ChangeLog:
* config/aarch64/aarch64.cc: Update vector costs for ampere1.
Co-Authored-By: Manolis Tsamis <manolis.tsamis@vrull.eu>
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
---
We would like to get this into GCC 13 to avoid having to backport at
the start of the next cycle.
OK for backports?
gcc/config/aarch64/aarch64.cc | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
Comments
Hi Philipp,
> -----Original Message-----
> From: Gcc-patches <gcc-patches-
> bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Philipp
> Tomsich
> Sent: Monday, March 27, 2023 8:47 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina
> <Tamar.Christina@arm.com>; Philipp Tomsich <philipp.tomsich@vrull.eu>;
> Manolis Tsamis <manolis.tsamis@vrull.eu>
> Subject: [PATCH] aarch64: update ampere1 vectorization cost
>
> The original submission of AmpereOne (-mcpu=ampere1) costs occurred
> prior to exhaustive testing of vectorizable workloads against
> hardware.
>
> Adjust the vector costs to achieve the best results and more closely
> match the underlying hardware.
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64.cc: Update vector costs for ampere1.
>
> Co-Authored-By: Manolis Tsamis <manolis.tsamis@vrull.eu>
>
> Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> ---
> We would like to get this into GCC 13 to avoid having to backport at
> the start of the next cycle.
>
Given this affects only the ampere1 costs that sounds fine to me and fairly low risk, you are being trusted that these costs are actually desirable and properly validated on the hardware involved.
> OK for backports?
This is ok for trunk (GCC 13). Do you also want to backport this to other branches?
Thanks,
Kyrill
>
> gcc/config/aarch64/aarch64.cc | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index b27f4354031..661fff65cea 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -1132,7 +1132,7 @@ static const struct cpu_vector_cost
> thunderx3t110_vector_cost =
>
> static const advsimd_vec_cost ampere1_advsimd_vector_cost =
> {
> - 3, /* int_stmt_cost */
> + 1, /* int_stmt_cost */
> 3, /* fp_stmt_cost */
> 0, /* ld2_st2_permute_cost */
> 0, /* ld3_st3_permute_cost */
> @@ -1148,17 +1148,17 @@ static const advsimd_vec_cost
> ampere1_advsimd_vector_cost =
> 8, /* store_elt_extra_cost */
> 6, /* vec_to_scalar_cost */
> 7, /* scalar_to_vec_cost */
> - 5, /* align_load_cost */
> - 5, /* unalign_load_cost */
> - 2, /* unalign_store_cost */
> - 2 /* store_cost */
> + 4, /* align_load_cost */
> + 4, /* unalign_load_cost */
> + 1, /* unalign_store_cost */
> + 1 /* store_cost */
> };
>
> /* Ampere-1 costs for vector insn classes. */
> static const struct cpu_vector_cost ampere1_vector_cost =
> {
> 1, /* scalar_int_stmt_cost */
> - 1, /* scalar_fp_stmt_cost */
> + 3, /* scalar_fp_stmt_cost */
> 4, /* scalar_load_cost */
> 1, /* scalar_store_cost */
> 1, /* cond_taken_branch_cost */
> --
> 2.34.1
On Mon, 27 Mar 2023 at 16:45, Kyrylo Tkachov <Kyrylo.Tkachov@arm.com> wrote:
>
> Hi Philipp,
>
> > -----Original Message-----
> > From: Gcc-patches <gcc-patches-
> > bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Philipp
> > Tomsich
> > Sent: Monday, March 27, 2023 8:47 AM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina
> > <Tamar.Christina@arm.com>; Philipp Tomsich <philipp.tomsich@vrull.eu>;
> > Manolis Tsamis <manolis.tsamis@vrull.eu>
> > Subject: [PATCH] aarch64: update ampere1 vectorization cost
> >
> > The original submission of AmpereOne (-mcpu=ampere1) costs occurred
> > prior to exhaustive testing of vectorizable workloads against
> > hardware.
> >
> > Adjust the vector costs to achieve the best results and more closely
> > match the underlying hardware.
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64.cc: Update vector costs for ampere1.
> >
> > Co-Authored-By: Manolis Tsamis <manolis.tsamis@vrull.eu>
> >
> > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > ---
> > We would like to get this into GCC 13 to avoid having to backport at
> > the start of the next cycle.
> >
>
> Given this affects only the ampere1 costs that sounds fine to me and fairly low risk, you are being trusted that these costs are actually desirable and properly validated on the hardware involved.
>
> > OK for backports?
>
> This is ok for trunk (GCC 13). Do you also want to backport this to other branches?
Ampere1 (with the older vector costs) are in GCC12 and GCC11.
I would like to backport to those as well.
Thanks,
Philipp.
> Thanks,
> Kyrill
>
> >
> > gcc/config/aarch64/aarch64.cc | 12 ++++++------
> > 1 file changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > index b27f4354031..661fff65cea 100644
> > --- a/gcc/config/aarch64/aarch64.cc
> > +++ b/gcc/config/aarch64/aarch64.cc
> > @@ -1132,7 +1132,7 @@ static const struct cpu_vector_cost
> > thunderx3t110_vector_cost =
> >
> > static const advsimd_vec_cost ampere1_advsimd_vector_cost =
> > {
> > - 3, /* int_stmt_cost */
> > + 1, /* int_stmt_cost */
> > 3, /* fp_stmt_cost */
> > 0, /* ld2_st2_permute_cost */
> > 0, /* ld3_st3_permute_cost */
> > @@ -1148,17 +1148,17 @@ static const advsimd_vec_cost
> > ampere1_advsimd_vector_cost =
> > 8, /* store_elt_extra_cost */
> > 6, /* vec_to_scalar_cost */
> > 7, /* scalar_to_vec_cost */
> > - 5, /* align_load_cost */
> > - 5, /* unalign_load_cost */
> > - 2, /* unalign_store_cost */
> > - 2 /* store_cost */
> > + 4, /* align_load_cost */
> > + 4, /* unalign_load_cost */
> > + 1, /* unalign_store_cost */
> > + 1 /* store_cost */
> > };
> >
> > /* Ampere-1 costs for vector insn classes. */
> > static const struct cpu_vector_cost ampere1_vector_cost =
> > {
> > 1, /* scalar_int_stmt_cost */
> > - 1, /* scalar_fp_stmt_cost */
> > + 3, /* scalar_fp_stmt_cost */
> > 4, /* scalar_load_cost */
> > 1, /* scalar_store_cost */
> > 1, /* cond_taken_branch_cost */
> > --
> > 2.34.1
>
Applied to master, thanks!
Philipp.
On Mon, 27 Mar 2023 at 16:45, Kyrylo Tkachov <Kyrylo.Tkachov@arm.com> wrote:
>
> Hi Philipp,
>
> > -----Original Message-----
> > From: Gcc-patches <gcc-patches-
> > bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Philipp
> > Tomsich
> > Sent: Monday, March 27, 2023 8:47 AM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina
> > <Tamar.Christina@arm.com>; Philipp Tomsich <philipp.tomsich@vrull.eu>;
> > Manolis Tsamis <manolis.tsamis@vrull.eu>
> > Subject: [PATCH] aarch64: update ampere1 vectorization cost
> >
> > The original submission of AmpereOne (-mcpu=ampere1) costs occurred
> > prior to exhaustive testing of vectorizable workloads against
> > hardware.
> >
> > Adjust the vector costs to achieve the best results and more closely
> > match the underlying hardware.
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64.cc: Update vector costs for ampere1.
> >
> > Co-Authored-By: Manolis Tsamis <manolis.tsamis@vrull.eu>
> >
> > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > ---
> > We would like to get this into GCC 13 to avoid having to backport at
> > the start of the next cycle.
> >
>
> Given this affects only the ampere1 costs that sounds fine to me and fairly low risk, you are being trusted that these costs are actually desirable and properly validated on the hardware involved.
>
> > OK for backports?
>
> This is ok for trunk (GCC 13). Do you also want to backport this to other branches?
> Thanks,
> Kyrill
>
> >
> > gcc/config/aarch64/aarch64.cc | 12 ++++++------
> > 1 file changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > index b27f4354031..661fff65cea 100644
> > --- a/gcc/config/aarch64/aarch64.cc
> > +++ b/gcc/config/aarch64/aarch64.cc
> > @@ -1132,7 +1132,7 @@ static const struct cpu_vector_cost
> > thunderx3t110_vector_cost =
> >
> > static const advsimd_vec_cost ampere1_advsimd_vector_cost =
> > {
> > - 3, /* int_stmt_cost */
> > + 1, /* int_stmt_cost */
> > 3, /* fp_stmt_cost */
> > 0, /* ld2_st2_permute_cost */
> > 0, /* ld3_st3_permute_cost */
> > @@ -1148,17 +1148,17 @@ static const advsimd_vec_cost
> > ampere1_advsimd_vector_cost =
> > 8, /* store_elt_extra_cost */
> > 6, /* vec_to_scalar_cost */
> > 7, /* scalar_to_vec_cost */
> > - 5, /* align_load_cost */
> > - 5, /* unalign_load_cost */
> > - 2, /* unalign_store_cost */
> > - 2 /* store_cost */
> > + 4, /* align_load_cost */
> > + 4, /* unalign_load_cost */
> > + 1, /* unalign_store_cost */
> > + 1 /* store_cost */
> > };
> >
> > /* Ampere-1 costs for vector insn classes. */
> > static const struct cpu_vector_cost ampere1_vector_cost =
> > {
> > 1, /* scalar_int_stmt_cost */
> > - 1, /* scalar_fp_stmt_cost */
> > + 3, /* scalar_fp_stmt_cost */
> > 4, /* scalar_load_cost */
> > 1, /* scalar_store_cost */
> > 1, /* cond_taken_branch_cost */
> > --
> > 2.34.1
>
> -----Original Message-----
> From: Philipp Tomsich <philipp.tomsich@vrull.eu>
> Sent: Monday, March 27, 2023 9:50 AM
> To: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> Cc: gcc-patches@gcc.gnu.org; Richard Sandiford
> <Richard.Sandiford@arm.com>; Tamar Christina
> <Tamar.Christina@arm.com>; Manolis Tsamis <manolis.tsamis@vrull.eu>
> Subject: Re: [PATCH] aarch64: update ampere1 vectorization cost
>
> On Mon, 27 Mar 2023 at 16:45, Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> wrote:
> >
> > Hi Philipp,
> >
> > > -----Original Message-----
> > > From: Gcc-patches <gcc-patches-
> > > bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Philipp
> > > Tomsich
> > > Sent: Monday, March 27, 2023 8:47 AM
> > > To: gcc-patches@gcc.gnu.org
> > > Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina
> > > <Tamar.Christina@arm.com>; Philipp Tomsich
> <philipp.tomsich@vrull.eu>;
> > > Manolis Tsamis <manolis.tsamis@vrull.eu>
> > > Subject: [PATCH] aarch64: update ampere1 vectorization cost
> > >
> > > The original submission of AmpereOne (-mcpu=ampere1) costs occurred
> > > prior to exhaustive testing of vectorizable workloads against
> > > hardware.
> > >
> > > Adjust the vector costs to achieve the best results and more closely
> > > match the underlying hardware.
> > >
> > > gcc/ChangeLog:
> > >
> > > * config/aarch64/aarch64.cc: Update vector costs for ampere1.
> > >
> > > Co-Authored-By: Manolis Tsamis <manolis.tsamis@vrull.eu>
> > >
> > > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > > ---
> > > We would like to get this into GCC 13 to avoid having to backport at
> > > the start of the next cycle.
> > >
> >
> > Given this affects only the ampere1 costs that sounds fine to me and fairly
> low risk, you are being trusted that these costs are actually desirable and
> properly validated on the hardware involved.
> >
> > > OK for backports?
> >
> > This is ok for trunk (GCC 13). Do you also want to backport this to other
> branches?
>
> Ampere1 (with the older vector costs) are in GCC12 and GCC11.
> I would like to backport to those as well.
Ok then, though you may want to run the benchmarks on the branches as well to make sure the costs give the expected benefit there as well.
Thanks,
Kyrill
>
> Thanks,
> Philipp.
>
> > Thanks,
> > Kyrill
> >
> > >
> > > gcc/config/aarch64/aarch64.cc | 12 ++++++------
> > > 1 file changed, 6 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/gcc/config/aarch64/aarch64.cc
> b/gcc/config/aarch64/aarch64.cc
> > > index b27f4354031..661fff65cea 100644
> > > --- a/gcc/config/aarch64/aarch64.cc
> > > +++ b/gcc/config/aarch64/aarch64.cc
> > > @@ -1132,7 +1132,7 @@ static const struct cpu_vector_cost
> > > thunderx3t110_vector_cost =
> > >
> > > static const advsimd_vec_cost ampere1_advsimd_vector_cost =
> > > {
> > > - 3, /* int_stmt_cost */
> > > + 1, /* int_stmt_cost */
> > > 3, /* fp_stmt_cost */
> > > 0, /* ld2_st2_permute_cost */
> > > 0, /* ld3_st3_permute_cost */
> > > @@ -1148,17 +1148,17 @@ static const advsimd_vec_cost
> > > ampere1_advsimd_vector_cost =
> > > 8, /* store_elt_extra_cost */
> > > 6, /* vec_to_scalar_cost */
> > > 7, /* scalar_to_vec_cost */
> > > - 5, /* align_load_cost */
> > > - 5, /* unalign_load_cost */
> > > - 2, /* unalign_store_cost */
> > > - 2 /* store_cost */
> > > + 4, /* align_load_cost */
> > > + 4, /* unalign_load_cost */
> > > + 1, /* unalign_store_cost */
> > > + 1 /* store_cost */
> > > };
> > >
> > > /* Ampere-1 costs for vector insn classes. */
> > > static const struct cpu_vector_cost ampere1_vector_cost =
> > > {
> > > 1, /* scalar_int_stmt_cost */
> > > - 1, /* scalar_fp_stmt_cost */
> > > + 3, /* scalar_fp_stmt_cost */
> > > 4, /* scalar_load_cost */
> > > 1, /* scalar_store_cost */
> > > 1, /* cond_taken_branch_cost */
> > > --
> > > 2.34.1
> >
Kyrill,
We reran on GCC12 and GCC11, reproducing the same improvements (e.g.,
on fotonik3d) that prompted the changes.
I'll apply the backports later this week, unless you have any further concerns…
Thanks,
Philipp.
On Mon, 27 Mar 2023 at 11:24, Kyrylo Tkachov <Kyrylo.Tkachov@arm.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > Sent: Monday, March 27, 2023 9:50 AM
> > To: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> > Cc: gcc-patches@gcc.gnu.org; Richard Sandiford
> > <Richard.Sandiford@arm.com>; Tamar Christina
> > <Tamar.Christina@arm.com>; Manolis Tsamis <manolis.tsamis@vrull.eu>
> > Subject: Re: [PATCH] aarch64: update ampere1 vectorization cost
> >
> > On Mon, 27 Mar 2023 at 16:45, Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> > wrote:
> > >
> > > Hi Philipp,
> > >
> > > > -----Original Message-----
> > > > From: Gcc-patches <gcc-patches-
> > > > bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Philipp
> > > > Tomsich
> > > > Sent: Monday, March 27, 2023 8:47 AM
> > > > To: gcc-patches@gcc.gnu.org
> > > > Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar Christina
> > > > <Tamar.Christina@arm.com>; Philipp Tomsich
> > <philipp.tomsich@vrull.eu>;
> > > > Manolis Tsamis <manolis.tsamis@vrull.eu>
> > > > Subject: [PATCH] aarch64: update ampere1 vectorization cost
> > > >
> > > > The original submission of AmpereOne (-mcpu=ampere1) costs occurred
> > > > prior to exhaustive testing of vectorizable workloads against
> > > > hardware.
> > > >
> > > > Adjust the vector costs to achieve the best results and more closely
> > > > match the underlying hardware.
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * config/aarch64/aarch64.cc: Update vector costs for ampere1.
> > > >
> > > > Co-Authored-By: Manolis Tsamis <manolis.tsamis@vrull.eu>
> > > >
> > > > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > > > ---
> > > > We would like to get this into GCC 13 to avoid having to backport at
> > > > the start of the next cycle.
> > > >
> > >
> > > Given this affects only the ampere1 costs that sounds fine to me and fairly
> > low risk, you are being trusted that these costs are actually desirable and
> > properly validated on the hardware involved.
> > >
> > > > OK for backports?
> > >
> > > This is ok for trunk (GCC 13). Do you also want to backport this to other
> > branches?
> >
> > Ampere1 (with the older vector costs) are in GCC12 and GCC11.
> > I would like to backport to those as well.
>
> Ok then, though you may want to run the benchmarks on the branches as well to make sure the costs give the expected benefit there as well.
> Thanks,
> Kyrill
>
> >
> > Thanks,
> > Philipp.
> >
> > > Thanks,
> > > Kyrill
> > >
> > > >
> > > > gcc/config/aarch64/aarch64.cc | 12 ++++++------
> > > > 1 file changed, 6 insertions(+), 6 deletions(-)
> > > >
> > > > diff --git a/gcc/config/aarch64/aarch64.cc
> > b/gcc/config/aarch64/aarch64.cc
> > > > index b27f4354031..661fff65cea 100644
> > > > --- a/gcc/config/aarch64/aarch64.cc
> > > > +++ b/gcc/config/aarch64/aarch64.cc
> > > > @@ -1132,7 +1132,7 @@ static const struct cpu_vector_cost
> > > > thunderx3t110_vector_cost =
> > > >
> > > > static const advsimd_vec_cost ampere1_advsimd_vector_cost =
> > > > {
> > > > - 3, /* int_stmt_cost */
> > > > + 1, /* int_stmt_cost */
> > > > 3, /* fp_stmt_cost */
> > > > 0, /* ld2_st2_permute_cost */
> > > > 0, /* ld3_st3_permute_cost */
> > > > @@ -1148,17 +1148,17 @@ static const advsimd_vec_cost
> > > > ampere1_advsimd_vector_cost =
> > > > 8, /* store_elt_extra_cost */
> > > > 6, /* vec_to_scalar_cost */
> > > > 7, /* scalar_to_vec_cost */
> > > > - 5, /* align_load_cost */
> > > > - 5, /* unalign_load_cost */
> > > > - 2, /* unalign_store_cost */
> > > > - 2 /* store_cost */
> > > > + 4, /* align_load_cost */
> > > > + 4, /* unalign_load_cost */
> > > > + 1, /* unalign_store_cost */
> > > > + 1 /* store_cost */
> > > > };
> > > >
> > > > /* Ampere-1 costs for vector insn classes. */
> > > > static const struct cpu_vector_cost ampere1_vector_cost =
> > > > {
> > > > 1, /* scalar_int_stmt_cost */
> > > > - 1, /* scalar_fp_stmt_cost */
> > > > + 3, /* scalar_fp_stmt_cost */
> > > > 4, /* scalar_load_cost */
> > > > 1, /* scalar_store_cost */
> > > > 1, /* cond_taken_branch_cost */
> > > > --
> > > > 2.34.1
> > >
Hi Philipp,
> -----Original Message-----
> From: Philipp Tomsich <philipp.tomsich@vrull.eu>
> Sent: Monday, April 3, 2023 12:46 PM
> To: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> Cc: gcc-patches@gcc.gnu.org; Richard Sandiford
> <Richard.Sandiford@arm.com>; Tamar Christina
> <Tamar.Christina@arm.com>; Manolis Tsamis <manolis.tsamis@vrull.eu>
> Subject: Re: [PATCH] aarch64: update ampere1 vectorization cost
>
> Kyrill,
>
> We reran on GCC12 and GCC11, reproducing the same improvements (e.g.,
> on fotonik3d) that prompted the changes.
> I'll apply the backports later this week, unless you have any further
> concerns…
Ok, thanks for checking.
Kyrill
>
> Thanks,
> Philipp.
>
>
> On Mon, 27 Mar 2023 at 11:24, Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> wrote:
> >
> >
> >
> > > -----Original Message-----
> > > From: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > > Sent: Monday, March 27, 2023 9:50 AM
> > > To: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> > > Cc: gcc-patches@gcc.gnu.org; Richard Sandiford
> > > <Richard.Sandiford@arm.com>; Tamar Christina
> > > <Tamar.Christina@arm.com>; Manolis Tsamis
> <manolis.tsamis@vrull.eu>
> > > Subject: Re: [PATCH] aarch64: update ampere1 vectorization cost
> > >
> > > On Mon, 27 Mar 2023 at 16:45, Kyrylo Tkachov
> <Kyrylo.Tkachov@arm.com>
> > > wrote:
> > > >
> > > > Hi Philipp,
> > > >
> > > > > -----Original Message-----
> > > > > From: Gcc-patches <gcc-patches-
> > > > > bounces+kyrylo.tkachov=arm.com@gcc.gnu.org> On Behalf Of Philipp
> > > > > Tomsich
> > > > > Sent: Monday, March 27, 2023 8:47 AM
> > > > > To: gcc-patches@gcc.gnu.org
> > > > > Cc: Richard Sandiford <Richard.Sandiford@arm.com>; Tamar
> Christina
> > > > > <Tamar.Christina@arm.com>; Philipp Tomsich
> > > <philipp.tomsich@vrull.eu>;
> > > > > Manolis Tsamis <manolis.tsamis@vrull.eu>
> > > > > Subject: [PATCH] aarch64: update ampere1 vectorization cost
> > > > >
> > > > > The original submission of AmpereOne (-mcpu=ampere1) costs
> occurred
> > > > > prior to exhaustive testing of vectorizable workloads against
> > > > > hardware.
> > > > >
> > > > > Adjust the vector costs to achieve the best results and more closely
> > > > > match the underlying hardware.
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > > * config/aarch64/aarch64.cc: Update vector costs for ampere1.
> > > > >
> > > > > Co-Authored-By: Manolis Tsamis <manolis.tsamis@vrull.eu>
> > > > >
> > > > > Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> > > > > ---
> > > > > We would like to get this into GCC 13 to avoid having to backport at
> > > > > the start of the next cycle.
> > > > >
> > > >
> > > > Given this affects only the ampere1 costs that sounds fine to me and
> fairly
> > > low risk, you are being trusted that these costs are actually desirable and
> > > properly validated on the hardware involved.
> > > >
> > > > > OK for backports?
> > > >
> > > > This is ok for trunk (GCC 13). Do you also want to backport this to other
> > > branches?
> > >
> > > Ampere1 (with the older vector costs) are in GCC12 and GCC11.
> > > I would like to backport to those as well.
> >
> > Ok then, though you may want to run the benchmarks on the branches as
> well to make sure the costs give the expected benefit there as well.
> > Thanks,
> > Kyrill
> >
> > >
> > > Thanks,
> > > Philipp.
> > >
> > > > Thanks,
> > > > Kyrill
> > > >
> > > > >
> > > > > gcc/config/aarch64/aarch64.cc | 12 ++++++------
> > > > > 1 file changed, 6 insertions(+), 6 deletions(-)
> > > > >
> > > > > diff --git a/gcc/config/aarch64/aarch64.cc
> > > b/gcc/config/aarch64/aarch64.cc
> > > > > index b27f4354031..661fff65cea 100644
> > > > > --- a/gcc/config/aarch64/aarch64.cc
> > > > > +++ b/gcc/config/aarch64/aarch64.cc
> > > > > @@ -1132,7 +1132,7 @@ static const struct cpu_vector_cost
> > > > > thunderx3t110_vector_cost =
> > > > >
> > > > > static const advsimd_vec_cost ampere1_advsimd_vector_cost =
> > > > > {
> > > > > - 3, /* int_stmt_cost */
> > > > > + 1, /* int_stmt_cost */
> > > > > 3, /* fp_stmt_cost */
> > > > > 0, /* ld2_st2_permute_cost */
> > > > > 0, /* ld3_st3_permute_cost */
> > > > > @@ -1148,17 +1148,17 @@ static const advsimd_vec_cost
> > > > > ampere1_advsimd_vector_cost =
> > > > > 8, /* store_elt_extra_cost */
> > > > > 6, /* vec_to_scalar_cost */
> > > > > 7, /* scalar_to_vec_cost */
> > > > > - 5, /* align_load_cost */
> > > > > - 5, /* unalign_load_cost */
> > > > > - 2, /* unalign_store_cost */
> > > > > - 2 /* store_cost */
> > > > > + 4, /* align_load_cost */
> > > > > + 4, /* unalign_load_cost */
> > > > > + 1, /* unalign_store_cost */
> > > > > + 1 /* store_cost */
> > > > > };
> > > > >
> > > > > /* Ampere-1 costs for vector insn classes. */
> > > > > static const struct cpu_vector_cost ampere1_vector_cost =
> > > > > {
> > > > > 1, /* scalar_int_stmt_cost */
> > > > > - 1, /* scalar_fp_stmt_cost */
> > > > > + 3, /* scalar_fp_stmt_cost */
> > > > > 4, /* scalar_load_cost */
> > > > > 1, /* scalar_store_cost */
> > > > > 1, /* cond_taken_branch_cost */
> > > > > --
> > > > > 2.34.1
> > > >
@@ -1132,7 +1132,7 @@ static const struct cpu_vector_cost thunderx3t110_vector_cost =
static const advsimd_vec_cost ampere1_advsimd_vector_cost =
{
- 3, /* int_stmt_cost */
+ 1, /* int_stmt_cost */
3, /* fp_stmt_cost */
0, /* ld2_st2_permute_cost */
0, /* ld3_st3_permute_cost */
@@ -1148,17 +1148,17 @@ static const advsimd_vec_cost ampere1_advsimd_vector_cost =
8, /* store_elt_extra_cost */
6, /* vec_to_scalar_cost */
7, /* scalar_to_vec_cost */
- 5, /* align_load_cost */
- 5, /* unalign_load_cost */
- 2, /* unalign_store_cost */
- 2 /* store_cost */
+ 4, /* align_load_cost */
+ 4, /* unalign_load_cost */
+ 1, /* unalign_store_cost */
+ 1 /* store_cost */
};
/* Ampere-1 costs for vector insn classes. */
static const struct cpu_vector_cost ampere1_vector_cost =
{
1, /* scalar_int_stmt_cost */
- 1, /* scalar_fp_stmt_cost */
+ 3, /* scalar_fp_stmt_cost */
4, /* scalar_load_cost */
1, /* scalar_store_cost */
1, /* cond_taken_branch_cost */