x86/cpu: Add support for modern x86-64-v* march

Message ID 20230707105601.133221-1-dimitri.ledkov@canonical.com
State New
Headers
Series x86/cpu: Add support for modern x86-64-v* march |

Commit Message

Dimitri John Ledkov July 7, 2023, 10:56 a.m. UTC
  Add support for setting march to x86-64-v2, x86-64-v3, x86-64-v4 with
tuning set to an early family of CPUs that support such instruction
levels. By default gcc sets generic tuning for x86-64-v*, which is
suboptimal for all brands of CPUs with such instruction set support.

Signed-off-by: Dimitri John Ledkov <dimitri.ledkov@canonical.com>
---
 arch/x86/Kconfig.cpu | 23 ++++++++++++++++++++++-
 arch/x86/Makefile    | 11 +++++++++++
 2 files changed, 33 insertions(+), 1 deletion(-)
  

Comments

Borislav Petkov July 7, 2023, 11:28 a.m. UTC | #1
On Fri, Jul 07, 2023 at 11:56:00AM +0100, Dimitri John Ledkov wrote:
> Add support for setting march to x86-64-v2, x86-64-v3, x86-64-v4 with
> tuning set to an early family of CPUs that support such instruction
> levels. By default gcc sets generic tuning for x86-64-v*, which is
> suboptimal for all brands of CPUs with such instruction set support.

Prove that it is suboptimal for the kernel. Numbers please.

And even if it shows on *some* uarch:

* we need a *single* setting for distro kernels - i.e.,
CONFIG_GENERIC_CPU and compilers do make sure that -mtune=generic does
the most optimal code generation for all uarches

* our Kconfig option set is abysmal so don't need any more if useless.

Yeah, a patch like that keeps popping up on a regular basis but no,
thanks.
  
Dimitri John Ledkov July 7, 2023, 11:46 a.m. UTC | #2
On Fri, 7 Jul 2023 at 12:28, Borislav Petkov <bp@alien8.de> wrote:
>
> On Fri, Jul 07, 2023 at 11:56:00AM +0100, Dimitri John Ledkov wrote:
> > Add support for setting march to x86-64-v2, x86-64-v3, x86-64-v4 with
> > tuning set to an early family of CPUs that support such instruction
> > levels. By default gcc sets generic tuning for x86-64-v*, which is
> > suboptimal for all brands of CPUs with such instruction set support.
>
> Prove that it is suboptimal for the kernel. Numbers please.

It's not suboptimal for the kernel as is, it is suboptimal for
march=x86-64-v* as documented in gcc - probably gcc should actually
prohibit mtune=generic with march=x86-64-v* settings.

> And even if it shows on *some* uarch:
>
> * we need a *single* setting for distro kernels - i.e.,
> CONFIG_GENERIC_CPU and compilers do make sure that -mtune=generic does
> the most optimal code generation for all uarches
>
> * our Kconfig option set is abysmal so don't need any more if useless.
>
> Yeah, a patch like that keeps popping up on a regular basis but no,
> thanks.

The biggest issue is that march & mtune is always set, and there is no
option to use the compiler configured builtin default, or pass in an
arbitrary string.

Would it be acceptable to change GENERIC_CPU to not set neither march
nor mtune and thus use the compiler configured default? If not, would
it be acceptable to have a new option GENERIC_NONE which does not set
any march/mtune and thus uses a compiler configured default? Or for
example, allow a new freeform string for march and mtune?
  
Borislav Petkov July 7, 2023, 12:05 p.m. UTC | #3
+ Matz

On Fri, Jul 07, 2023 at 12:46:50PM +0100, Dimitri John Ledkov wrote:
> On Fri, 7 Jul 2023 at 12:28, Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Fri, Jul 07, 2023 at 11:56:00AM +0100, Dimitri John Ledkov wrote:
> > > Add support for setting march to x86-64-v2, x86-64-v3, x86-64-v4 with
> > > tuning set to an early family of CPUs that support such instruction
> > > levels. By default gcc sets generic tuning for x86-64-v*, which is
> > > suboptimal for all brands of CPUs with such instruction set support.
> >
> > Prove that it is suboptimal for the kernel. Numbers please.
> 
> It's not suboptimal for the kernel as is, it is suboptimal for
> march=x86-64-v* as documented in gcc - probably gcc should actually
> prohibit mtune=generic with march=x86-64-v* settings.

I can't parse that. You probably need to explain in greater detail:
these are the current GENERIC_CPU options, the problem is <bla>, the fix
is <foo> and so on.

> > And even if it shows on *some* uarch:
> >
> > * we need a *single* setting for distro kernels - i.e.,
> > CONFIG_GENERIC_CPU and compilers do make sure that -mtune=generic does
> > the most optimal code generation for all uarches
> >
> > * our Kconfig option set is abysmal so don't need any more if useless.
> >
> > Yeah, a patch like that keeps popping up on a regular basis but no,
> > thanks.
> 
> The biggest issue is that march & mtune is always set, and there is no
> option to use the compiler configured builtin default, or pass in an
> arbitrary string.

Why would you want to use the compiler configured builtin default?

> Would it be acceptable to change GENERIC_CPU to not set neither march
> nor mtune and thus use the compiler configured default? If not, would
> it be acceptable to have a new option GENERIC_NONE which does not set
> any march/mtune and thus uses a compiler configured default? Or for
> example, allow a new freeform string for march and mtune?

You're asking me all those questions but you're not giving any reasoning
*why*. You need to sell it to me and explain why it should be in the
kernel. What's the use case that the majority of users out there will
profit from and so on and so on.

Thx.
  

Patch

diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index 00468adf18..70e70b4733 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -294,6 +294,27 @@  config GENERIC_CPU
 	  Generic x86-64 CPU.
 	  Run equally well on all x86-64 CPUs.
 
+config X86_64_V2
+	bool "x86-64-v2"
+	depends on X86_64
+	help
+	  x86-64-v2 CPU.
+	  Run well on x86-64-v2 capable CPUs.
+
+config X86_64_V3
+	bool "x86-64-v3"
+	depends on X86_64
+	help
+	  x86-64-v3 CPU.
+	  Run well on x86-64-v3 capable CPUs.
+
+config X86_64_V4
+	bool "x86-64-v4"
+	depends on X86_64
+	help
+	  x86-64-v4 CPU.
+	  Run well on x86-64-v4 capable CPUs.
+
 endchoice
 
 config X86_GENERIC
@@ -318,7 +339,7 @@  config X86_INTERNODE_CACHE_SHIFT
 config X86_L1_CACHE_SHIFT
 	int
 	default "7" if MPENTIUM4 || MPSC
-	default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU
+	default "6" if MK7 || MK8 || MPENTIUMM || MCORE2 || MATOM || MVIAC7 || X86_GENERIC || GENERIC_CPU || X86_64_V2 || X86_64_V3 || X86_64_V4
 	default "4" if MELAN || M486SX || M486 || MGEODEGX1
 	default "5" if MWINCHIP3D || MWINCHIPC6 || MCRUSOE || MEFFICEON || MCYRIXIII || MK6 || MPENTIUMIII || MPENTIUMII || M686 || M586MMX || M586TSC || M586 || MVIAC3_2 || MGEODE_LX
 
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index fdc2e3abd6..ee97501970 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -154,6 +154,14 @@  else
         cflags-$(CONFIG_MCORE2)		+= -march=core2
         cflags-$(CONFIG_MATOM)		+= -march=atom
         cflags-$(CONFIG_GENERIC_CPU)	+= -mtune=generic
+        # Note gcc for -march=x86-64-v* sets genetic tuning only,
+        # rather than something sensible / minimum tuning from the
+        # same family. Set mtune to early family of CPUs that support
+        # such instructions, which works better across all CPU brands.
+        cflags-$(CONFIG_X86_64_V2)	+= -march=x86-64-v2 -mtune=nehalem
+        cflags-$(CONFIG_X86_64_V3)	+= -march=x86-64-v3 -mtune=skylake
+        cflags-$(CONFIG_X86_64_V4)	+= -march=x86-64-v4 -mtune=skylake-avx512
+
         KBUILD_CFLAGS += $(cflags-y)
 
         rustflags-$(CONFIG_MK8)		+= -Ctarget-cpu=k8
@@ -161,6 +169,9 @@  else
         rustflags-$(CONFIG_MCORE2)	+= -Ctarget-cpu=core2
         rustflags-$(CONFIG_MATOM)	+= -Ctarget-cpu=atom
         rustflags-$(CONFIG_GENERIC_CPU)	+= -Ztune-cpu=generic
+        rustflags-$(CONFIG_X86_64_V2)	+= -Ctarget-cpu=x86-64-v2
+        rustflags-$(CONFIG_X86_64_V3)	+= -Ctarget-cpu=x86-64-v3
+        rustflags-$(CONFIG_X86_64_V4)	+= -Ctarget-cpu=x86-64-v4
         KBUILD_RUSTFLAGS += $(rustflags-y)
 
         KBUILD_CFLAGS += -mno-red-zone