[tip: x86/urgent] x86/cpu: Fix AMD erratum #1485 on Zen4-based CPUs

Message ID 169701622768.3135.17489375930381616520.tip-bot2@tip-bot2
State New

Commit Message

tip-bot2 for Thomas Gleixner Oct. 11, 2023, 9:23 a.m. UTC
  The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     f454b18e07f518bcd0c05af17a2239138bff52de
Gitweb:        https://git.kernel.org/tip/f454b18e07f518bcd0c05af17a2239138bff52de
Author:        Borislav Petkov (AMD) <bp@alien8.de>
AuthorDate:    Sat, 07 Oct 2023 12:57:02 +02:00
Committer:     Borislav Petkov (AMD) <bp@alien8.de>
CommitterDate: Wed, 11 Oct 2023 11:00:11 +02:00

x86/cpu: Fix AMD erratum #1485 on Zen4-based CPUs

Fix erratum #1485 on Zen4 parts where running with STIBP disabled can
cause an #UD exception. The performance impact of the fix is negligible.

Reported-by: René Rebe <rene@exactcode.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: René Rebe <rene@exactcode.de>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/D99589F4-BC5D-430B-87B2-72C20370CF57@exactcode.com
---
 arch/x86/include/asm/msr-index.h |  9 +++++++--
 arch/x86/kernel/cpu/amd.c        |  8 ++++++++
 2 files changed, 15 insertions(+), 2 deletions(-)
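
The fix amounts to a read-modify-write that sets bit 5 of MSR_ZEN4_BP_CFG.
Note that MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT is a bit index, not a mask. A
minimal userspace sketch of that pattern follows. The fake_rdmsr() and
fake_wrmsr() helpers and the fake_bp_cfg variable are hypothetical stand-ins
for real MSR access, and sketch_msr_set_bit() only imitates the kernel helper
used in the patch; it is not the kernel implementation.

```c
#include <stdint.h>

/* Values taken from the patch. The second one is a bit index, not a mask. */
#define MSR_ZEN4_BP_CFG                    0xc001102e
#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5

/* Hypothetical stand-in for the hardware register (assumption). */
static uint64_t fake_bp_cfg;

/* Hypothetical accessors; kernel code would read and write the real MSR. */
static uint64_t fake_rdmsr(uint32_t msr)
{
	(void)msr;
	return fake_bp_cfg;
}

static void fake_wrmsr(uint32_t msr, uint64_t val)
{
	(void)msr;
	fake_bp_cfg = val;
}

/*
 * Read-modify-write: set one bit and skip the write if it is already set.
 * Returns 1 if the register changed, 0 if the bit was already set.
 */
static int sketch_msr_set_bit(uint32_t msr, uint8_t bit)
{
	uint64_t old = fake_rdmsr(msr);
	uint64_t val = old | (1ULL << bit);

	if (val == old)
		return 0;

	fake_wrmsr(msr, val);
	return 1;
}
```

Calling sketch_msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT)
twice changes the fake register on the first call and is a no-op on the
second, so applying the workaround more than once is harmless in this model.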
  

Comments

Ingo Molnar Oct. 11, 2023, 9:28 p.m. UTC | #1
* tip-bot2 for Borislav Petkov (AMD) <tip-bot2@linutronix.de> wrote:

>  /* AMD Last Branch Record MSRs */
>  #define MSR_AMD64_LBR_SELECT			0xc000010e
>  
> +/* Zen4 */
> +#define MSR_ZEN4_BP_CFG			0xc001102e
> +#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
>  
> +/* Zen 2 */
>  #define MSR_ZEN2_SPECTRAL_CHICKEN	0xc00110e3
>  #define MSR_ZEN2_SPECTRAL_CHICKEN_BIT	BIT_ULL(1)
>  
> +/* Fam 17h MSRs */
> +#define MSR_F17H_IRPERF			0xc00000e9

Yeah, so these latest AMD MSR definitions in <asm/msr-index.h> are pretty 
confused: they list MSRs in the following order:

   Zen 4
   Zen 2
   Fam 19h         // resolution in tip:master
   Fam 17h

where perf/core added a Fam 19h section a couple of days ago ...

While in reality:

   Zen 2 == Fam 17h
   Zen 4 == Fam 19h

So it's confusing to list these separately and out of order.

So in resolving the conflict in perf/core I updated this section to read:

  /* Fam 19h (Zen 4) MSRs */
  #define MSR_F19H_UMC_PERF_CTL		0xc0010800
  #define MSR_F19H_UMC_PERF_CTR		0xc0010801

  #define MSR_ZEN4_BP_CFG		0xc001102e
  #define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5

  /* Fam 17h (Zen 2) MSRs */
  #define MSR_F17H_IRPERF		0xc00000e9

  #define MSR_ZEN2_SPECTRAL_CHICKEN	0xc00110e3
  #define MSR_ZEN2_SPECTRAL_CHICKEN_BIT	BIT_ULL(1)

This doesn't change the definitions themselves, only merges the comments 
and the sections (to keep the Git conflict resolution non-evil), but 
arguably once perf/core goes upstream, we should probably unify the naming 
to follow the existing nomenclature, which is, starting at around F15H, the 
following:

   MSR_F15H_
   MSR_F16H_
   MSR_F17H_
   MSR_F19H_

Or are the MSRs named ZEN2 and ZEN4 in AMD SDMs, which we should follow?

Anyway, something to keep in mind.

Thanks,

	Ingo
  
Borislav Petkov Oct. 12, 2023, 7:40 a.m. UTC | #2
On Wed, Oct 11, 2023 at 11:28:26PM +0200, Ingo Molnar wrote:
> While in reality:
> 
>    Zen 2 == Fam 17h
>    Zen 4 == Fam 19h

If only it were that easy...

Family 0x17 is Zen 1 and 2; family 0x19 is spread across Zen 3 and 4.

> 
> So it's confusing to list these separately and out of order.
> 
> So in resolving the conflict in perf/core I updated this section to read:
> 
>   /* Fam 19h (Zen 4) MSRs */

That's wrong.

>   #define MSR_F19H_UMC_PERF_CTL		0xc0010800
>   #define MSR_F19H_UMC_PERF_CTR		0xc0010801
> 
>   #define MSR_ZEN4_BP_CFG		0xc001102e
>   #define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
> 
>   /* Fam 17h (Zen 2) MSRs */

Ditto.

> This doesn't change the definitions themselves, only merges the comments 
> and the sections (to keep the Git conflict resolution non-evil), but 
> arguably once perf/core goes upstream, we should probably unify the naming 
> to follow the existing nomenclature, which is, starting at around F15H, the 
> following:
> 
>    MSR_F15H_
>    MSR_F16H_
>    MSR_F17H_
>    MSR_F19H_
> 
> Or are the MSRs named ZEN2 and ZEN4 in AMD SDMs, which we should follow?

See above. The ZEN* MSRs are per Zen generation while the F*H ones are
per family. Yes, it is confusing. :-\

IOW, you want to have this as the end product:

/* Zen4 */
#define MSR_ZEN4_BP_CFG                 0xc001102e
#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5

/* Fam 19h MSRs */
#define MSR_F19H_UMC_PERF_CTL           0xc0010800
#define MSR_F19H_UMC_PERF_CTR           0xc0010801

/* Zen 2 */
#define MSR_ZEN2_SPECTRAL_CHICKEN       0xc00110e3
#define MSR_ZEN2_SPECTRAL_CHICKEN_BIT   BIT_ULL(1)

/* Fam 17h MSRs */
#define MSR_F17H_IRPERF			0xc00000e9
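
Because family 0x19 covers more than one Zen generation, the patch does not
key the workaround on the family alone; it matches model ranges within the
family. A minimal sketch of that range check follows, using the two ranges
from the amd_erratum_1485[] table in the patch. The struct and function names
are hypothetical, and the sketch assumes steppings can be ignored because the
patch's ranges span steppings 0x0 to 0xf. It also leaves out the OSVW handling
that the kernel's cpu_has_amd_erratum() performs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified form of an AMD_MODEL_RANGE() entry (no steppings). */
struct sketch_model_range {
	uint8_t family;
	uint8_t model_start;
	uint8_t model_end;	/* inclusive */
};

/* Family/model ranges the patch lists for erratum #1485. */
static const struct sketch_model_range sketch_erratum_1485[] = {
	{ 0x19, 0x10, 0x1f },
	{ 0x19, 0x60, 0xaf },
};

/* True if (family, model) falls inside any of the listed ranges. */
static bool sketch_has_erratum_1485(uint8_t family, uint8_t model)
{
	size_t n = sizeof(sketch_erratum_1485) / sizeof(sketch_erratum_1485[0]);

	for (size_t i = 0; i < n; i++) {
		const struct sketch_model_range *r = &sketch_erratum_1485[i];

		if (family == r->family &&
		    model >= r->model_start && model <= r->model_end)
			return true;
	}
	return false;
}
```

In this sketch a family 0x19 part with model 0x11 matches, while a family
0x19 part with model 0x21 does not, even though both share the same family
number.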
  

Patch

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 1d11135..b37abb5 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -637,12 +637,17 @@ 
 /* AMD Last Branch Record MSRs */
 #define MSR_AMD64_LBR_SELECT			0xc000010e
 
-/* Fam 17h MSRs */
-#define MSR_F17H_IRPERF			0xc00000e9
+/* Zen4 */
+#define MSR_ZEN4_BP_CFG			0xc001102e
+#define MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT 5
 
+/* Zen 2 */
 #define MSR_ZEN2_SPECTRAL_CHICKEN	0xc00110e3
 #define MSR_ZEN2_SPECTRAL_CHICKEN_BIT	BIT_ULL(1)
 
+/* Fam 17h MSRs */
+#define MSR_F17H_IRPERF			0xc00000e9
+
 /* Fam 16h MSRs */
 #define MSR_F16H_L2I_PERF_CTL		0xc0010230
 #define MSR_F16H_L2I_PERF_CTR		0xc0010231
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 03ef962..ece2b5b 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -80,6 +80,10 @@  static const int amd_div0[] =
 	AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x17, 0x00, 0x0, 0x2f, 0xf),
 			   AMD_MODEL_RANGE(0x17, 0x50, 0x0, 0x5f, 0xf));
 
+static const int amd_erratum_1485[] =
+	AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x19, 0x10, 0x0, 0x1f, 0xf),
+			   AMD_MODEL_RANGE(0x19, 0x60, 0x0, 0xaf, 0xf));
+
 static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
 {
 	int osvw_id = *erratum++;
@@ -1149,6 +1153,10 @@  static void init_amd(struct cpuinfo_x86 *c)
 		pr_notice_once("AMD Zen1 DIV0 bug detected. Disable SMT for full protection.\n");
 		setup_force_cpu_bug(X86_BUG_DIV0);
 	}
+
+	if (!cpu_has(c, X86_FEATURE_HYPERVISOR) &&
+	     cpu_has_amd_erratum(c, amd_erratum_1485))
+		msr_set_bit(MSR_ZEN4_BP_CFG, MSR_ZEN4_BP_CFG_SHARED_BTB_FIX_BIT);
 }
 
 #ifdef CONFIG_X86_32