x86/cpu: Add support for modern x86-64-v* march
Commit Message
Add support for setting march to x86-64-v2, x86-64-v3, x86-64-v4 with
tuning set to an early family of CPUs that support such instruction
levels. By default gcc sets generic tuning for x86-64-v*, which is
suboptimal for all brands of CPUs with such instruction set support.
Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
---
arch/x86/Kconfig.cpu | 23 ++++++++++++++++++++++-
arch/x86/Makefile | 11 +++++++++++
2 files changed, 33 insertions(+), 1 deletion(-)
Comments
On Fri, Jul 07, 2023 at 11:56:00AM +0100, Dimitri John Ledkov wrote:
> Add support for setting march to x86-64-v2, x86-64-v3, x86-64-v4 with
> tuning set to an early family of CPUs that support such instruction
> levels. By default gcc sets generic tuning for x86-64-v*, which is
> suboptimal for all brands of CPUs with such instruction set support.
Prove that it is suboptimal for the kernel. Numbers please.
And even if it shows on *some* uarch:
* we need a *single* setting for distro kernels - i.e.,
CONFIG_GENERIC_CPU and compilers do make sure that -mtune=generic does
the most optimal code generation for all uarches
* our Kconfig option set is abysmal so don't need any more if useless.
Yeah, a patch like that keeps popping up on a regular basis but no,
thanks.
On Fri, 7 Jul 2023 at 12:28, Borislav Petkov <bp@alien8.de> wrote:
>
> On Fri, Jul 07, 2023 at 11:56:00AM +0100, Dimitri John Ledkov wrote:
> > Add support for setting march to x86-64-v2, x86-64-v3, x86-64-v4 with
> > tuning set to an early family of CPUs that support such instruction
> > levels. By default gcc sets generic tuning for x86-64-v*, which is
> > suboptimal for all brands of CPUs with such instruction set support.
>
> Prove that it is suboptimal for the kernel. Numbers please.
It's not suboptimal for the kernel as is, it is suboptimal for
march=x86-64-v* as documented in gcc - probably gcc should actually
prohibit mtune=generic with march=x86-64-v* settings.
> And even if it shows on *some* uarch:
>
> * we need a *single* setting for distro kernels - i.e.,
> CONFIG_GENERIC_CPU and compilers do make sure that -mtune=generic does
> the most optimal code generation for all uarches
>
> * our Kconfig option set is abysmal so don't need any more if useless.
>
> Yeah, a patch like that keeps popping up on a regular basis but no,
> thanks.
The biggest issue is that march & mtune is always set, and there is no
option to use the compiler configured builtin default, or pass in an
arbitrary string.
Would it be acceptable to change GENERIC_CPU to not set neither march
nor mtune and thus use the compiler configured default? If not, would
it be acceptable to have a new option GENERIC_NONE which does not set
any march/mtune and thus uses a compiler configured default? Or for
example, allow a new freeform string for march and mtune?
+ Matz
On Fri, Jul 07, 2023 at 12:46:50PM +0100, Dimitri John Ledkov wrote:
> On Fri, 7 Jul 2023 at 12:28, Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Fri, Jul 07, 2023 at 11:56:00AM +0100, Dimitri John Ledkov wrote:
> > > Add support for setting march to x86-64-v2, x86-64-v3, x86-64-v4 with
> > > tuning set to an early family of CPUs that support such instruction
> > > levels. By default gcc sets generic tuning for x86-64-v*, which is
> > > suboptimal for all brands of CPUs with such instruction set support.
> >
> > Prove that it is suboptimal for the kernel. Numbers please.
>
> It's not suboptimal for the kernel as is, it is suboptimal for
> march=x86-64-v* as documented in gcc - probably gcc should actually
> prohibit mtune=generic with march=x86-64-v* settings.
I can't parse that. You probably need to explain in greater detail:
these are the current GENERIC_CPU options, the problem is <bla>, the fix
is <foo> and so on.
> > And even if it shows on *some* uarch:
> >
> > * we need a *single* setting for distro kernels - i.e.,
> > CONFIG_GENERIC_CPU and compilers do make sure that -mtune=generic does
> > the most optimal code generation for all uarches
> >
> > * our Kconfig option set is abysmal so don't need any more if useless.
> >
> > Yeah, a patch like that keeps popping up on a regular basis but no,
> > thanks.
>
> The biggest issue is that march & mtune is always set, and there is no
> option to use the compiler configured builtin default, or pass in an
> arbitrary string.
Why would you want to use the compiler configured builtin default?
> Would it be acceptable to change GENERIC_CPU to not set neither march
> nor mtune and thus use the compiler configured default? If not, would
> it be acceptable to have a new option GENERIC_NONE which does not set
> any march/mtune and thus uses a compiler configured default? Or for
> example, allow a new freeform string for march and mtune?
You're asking me all those questions but you're not giving any reasoning
*why*. You need to sell it to me and explain why it should be in the
kernel. What's the use case that the majority of users out there will
profit from and so on and so on.
Thx.
@@ -294,6 +294,27 @@ config GENERIC_CPU
Generic x86-64 CPU.
Run equally well on all x86-64 CPUs.
+config X86_64_V2
+ bool "x86-64-v2"
+ depends on X86_64
+ help
+ x86-64-v2 CPU.
+ Run well on x86-64-v2 capable CPUs.
+
+config X86_64_V3
+ bool "x86-64-v3"
+ depends on X86_64
+ help
+ x86-64-v3 CPU.
+ Run well on x86-64-v3 capable CPUs.
+
+config X86_64_V4
+ bool "x86-64-v4"
+ depends on X86_64
+ help
+ x86-64-v4 CPU.
+ Run well on x86-64-v4 capable CPUs.
+
endchoice
config X86_GENERIC
@@ -318,7 +339,7 @@ config X86_INTERNODE_CACHE_SHIFT
config X86_L1_CACHE_SHIFT
int
default "7" if MPENTIUM4 || MPSC
- default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
+ default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU || X86_64_V2 || X86_64_V3 || X86_64_V4
default "4" if MELAN || M486SX || M486 || MGEODEGX1
default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
@@ -154,6 +154,14 @@ else
cflags-$(CONFIG_MCORE2) += -march=core2
cflags-$(CONFIG_MATOM) += -march=atom
cflags-$(CONFIG_GENERIC_CPU) += -mtune=generic
+ # Note gcc for -march=x86-64-v* sets genetic tuning only,
+ # rather than something sensible / minimum tuning from the
+ # same family. Set mtune to early family of CPUs that support
+ # such instructions, which works better across all CPU brands.
+ cflags-$(CONFIG_X86_64_V2) += -march=x86-64-v2 -mtune=nehalem
+ cflags-$(CONFIG_X86_64_V3) += -march=x86-64-v3 -mtune=skylake
+ cflags-$(CONFIG_X86_64_V4) += -march=x86-64-v4 -mtune=skylake-avx512
+
KBUILD_CFLAGS += $(cflags-y)
rustflags-$(CONFIG_MK8) += -Ctarget-cpu=k8
@@ -161,6 +169,9 @@ else
rustflags-$(CONFIG_MCORE2) += -Ctarget-cpu=core2
rustflags-$(CONFIG_MATOM) += -Ctarget-cpu=atom
rustflags-$(CONFIG_GENERIC_CPU) += -Ztune-cpu=generic
+ rustflags-$(CONFIG_X86_64_V2) += -Ctarget-cpu=x86-64-v2
+ rustflags-$(CONFIG_X86_64_V3) += -Ctarget-cpu=x86-64-v3
+ rustflags-$(CONFIG_X86_64_V4) += -Ctarget-cpu=x86-64-v4
KBUILD_RUSTFLAGS += $(rustflags-y)
KBUILD_CFLAGS += -mno-red-zone