[3/3] x86/speculation: Support Automatic IBRS under virtualization

Message ID 20221104213651.141057-4-kim.phillips@amd.com
State New
Series x86/speculation: Support Automatic IBRS

Commit Message

Kim Phillips Nov. 4, 2022, 9:36 p.m. UTC
  VM Guests may want to use Auto IBRS, so propagate the CPUID to them.

Co-developed-by: Babu Moger <Babu.Moger@amd.com>
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
---
 arch/x86/kvm/cpuid.c         | 5 ++++-
 arch/x86/kvm/reverse_cpuid.h | 1 +
 arch/x86/kvm/svm/svm.c       | 3 +++
 arch/x86/kvm/x86.c           | 3 +++
 4 files changed, 11 insertions(+), 1 deletion(-)
  

Comments

Jim Mattson Nov. 4, 2022, 10 p.m. UTC | #1
On Fri, Nov 4, 2022 at 2:38 PM Kim Phillips <kim.phillips@amd.com> wrote:
>
> VM Guests may want to use Auto IBRS, so propagate the CPUID to them.
>
> Co-developed-by: Babu Moger <Babu.Moger@amd.com>
> Signed-off-by: Kim Phillips <kim.phillips@amd.com>

The APM says that, under AutoIBRS, CPL0 processes "have IBRS
protection." I'm taking this to mean only that indirect branches in
CPL0 are not subject to steering from a less privileged predictor
mode. This would imply that indirect branches executed at CPL0 in L1
could potentially be subject to steering by code running at CPL0 in
L2, since L1 and L2 share hardware predictor modes.

Fortunately, there is an IBPB when switching VMCBs in svm_vcpu_load().
But it might be worth noting that this is necessary for AutoIBRS to
work (unless it actually isn't).
  
Paolo Bonzini Nov. 6, 2022, 8:38 a.m. UTC | #2
On 11/4/22 22:36, Kim Phillips wrote:
> @@ -730,6 +730,8 @@ void kvm_set_cpu_caps(void)
>   		0 /* SME */ | F(SEV) | 0 /* VM_PAGE_FLUSH */ | F(SEV_ES) |
>   		F(SME_COHERENT));
>   
> +	kvm_cpu_cap_mask(CPUID_8000_0021_EAX, F(AUTOIBRS));

This should also include bits 0, 2 and 6.  Feel free to add #defines for
them in cpuid.c if x86 maintainers do not want them in cpufeatures.h.

There should also be something like:

                 if (static_cpu_has(X86_FEATURE_LFENCE_RDTSC))
                         kvm_cpu_cap_set(CPUID_8000_0021_EAX, F(AMD_LFENCE_RDTSC));
                 if (!static_cpu_has_bug(X86_BUG_NULL_SEG))
                         kvm_cpu_cap_set(CPUID_8000_0021_EAX, F(NSCB));

so that...

> @@ -1211,12 +1213,13 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
>  		 *    EAX      0      NNDBP, Processor ignores nested data breakpoints
>  		 *    EAX      2      LAS, LFENCE always serializing
>  		 *    EAX      6      NSCB, Null selector clear base
> +		 *    EAX      8      Automatic IBRS
>  		 *
>  		 * Other defined bits are for MSRs that KVM does not expose:
>  		 *   EAX      3      SPCL, SMM page configuration lock
>  		 *   EAX      13     PCMSR, Prefetch control MSR
>  		 */
> -		entry->eax &= BIT(0) | BIT(2) | BIT(6);
> +		entry->eax &= BIT(0) | BIT(2) | BIT(6) | BIT(8);
>  		if (static_cpu_has(X86_FEATURE_LFENCE_RDTSC))
>  			entry->eax |= BIT(2);
>  		if (!static_cpu_has_bug(X86_BUG_NULL_SEG))
>  			entry->eax |= BIT(6);

... these five lines become simply

	cpuid_entry_override(entry, CPUID_8000_0021_EAX);

In the end these should be two patches:

- kvm, x86: use CPU capabilities for CPUID[0x80000021].EAX
- kvm, x86: support AMD automatic IBRS

Thanks,

Paolo
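
For readers following the CPUID discussion, the filtering that the quoted
hunk performs on CPUID Fn8000_0021 EAX can be modeled in standalone C. This
is only a rough sketch, not KVM code: struct host_state and
guest_8000_0021_eax() are hypothetical names, and the two booleans stand in
for the static_cpu_has() and static_cpu_has_bug() checks in the patch. The
bit positions are the ones listed in the patch comment.

```c
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (UINT32_C(1) << (n))

/* Bit positions in CPUID Fn8000_0021 EAX, as listed in the patch comment. */
#define NNDBP_BIT    0	/* processor ignores nested data breakpoints */
#define LAS_BIT      2	/* LFENCE always serializing */
#define NSCB_BIT     6	/* null selector clears base */
#define AUTOIBRS_BIT 8	/* Automatic IBRS */

/* Hypothetical stand-in for the host feature and bug checks. */
struct host_state {
	bool lfence_rdtsc;	/* models static_cpu_has(X86_FEATURE_LFENCE_RDTSC) */
	bool null_seg_bug;	/* models static_cpu_has_bug(X86_BUG_NULL_SEG) */
};

/*
 * Compute the EAX value a guest would see: keep only the bits that are
 * exposed, then force on the bits that the host guarantees in software.
 */
static uint32_t guest_8000_0021_eax(uint32_t host_eax,
				    const struct host_state *hs)
{
	uint32_t eax = host_eax & (BIT(NNDBP_BIT) | BIT(LAS_BIT) |
				   BIT(NSCB_BIT) | BIT(AUTOIBRS_BIT));

	if (hs->lfence_rdtsc)
		eax |= BIT(LAS_BIT);
	if (!hs->null_seg_bug)
		eax |= BIT(NSCB_BIT);

	return eax;
}
```

With a host value that has bits 0, 3, 8 and 13 set, lfence_rdtsc true and
null_seg_bug false, the function returns 0x145: bits 3 and 13 are masked
off, and bits 2 and 6 are forced on.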
  
Kim Phillips Nov. 7, 2022, 10:29 p.m. UTC | #3
On 11/4/22 5:00 PM, Jim Mattson wrote:
> On Fri, Nov 4, 2022 at 2:38 PM Kim Phillips <kim.phillips@amd.com> wrote:
>>
>> VM Guests may want to use Auto IBRS, so propagate the CPUID to them.
>>
>> Co-developed-by: Babu Moger <Babu.Moger@amd.com>
>> Signed-off-by: Kim Phillips <kim.phillips@amd.com>
> 
> The APM says that, under AutoIBRS, CPL0 processes "have IBRS
> protection." I'm taking this to mean only that indirect branches in
> CPL0 are not subject to steering from a less privileged predictor
> mode. This would imply that indirect branches executed at CPL0 in L1
> could potentially be subject to steering by code running at CPL0 in
> L2, since L1 and L2 share hardware predictor modes.

That's true for AMD processors that don't support Same Mode IBRS, also
documented in the APM.

Processors that support AutoIBRS also support Same Mode IBRS (see
CPUID Fn8000_0008_EBX[IbrsSameMode] (bit 19)).

> Fortunately, there is an IBPB when switching VMCBs in svm_vcpu_load().
> But it might be worth noting that this is necessary for AutoIBRS to
> work (unless it actually isn't).

It is needed, but not for kernel/CPL0 code, rather to protect one
guest's user-space code from another's.

Kim
  
Jim Mattson Nov. 7, 2022, 10:42 p.m. UTC | #4
On Mon, Nov 7, 2022 at 2:29 PM Kim Phillips <kim.phillips@amd.com> wrote:
>
> On 11/4/22 5:00 PM, Jim Mattson wrote:
> > On Fri, Nov 4, 2022 at 2:38 PM Kim Phillips <kim.phillips@amd.com> wrote:
> >>
> >> VM Guests may want to use Auto IBRS, so propagate the CPUID to them.
> >>
> >> Co-developed-by: Babu Moger <Babu.Moger@amd.com>
> >> Signed-off-by: Kim Phillips <kim.phillips@amd.com>
> >
> > The APM says that, under AutoIBRS, CPL0 processes "have IBRS
> > protection." I'm taking this to mean only that indirect branches in
> > CPL0 are not subject to steering from a less privileged predictor
> > mode. This would imply that indirect branches executed at CPL0 in L1
> > could potentially be subject to steering by code running at CPL0 in
> > L2, since L1 and L2 share hardware predictor modes.
>
> That's true for AMD processors that don't support Same Mode IBRS, also
> documented in the APM.
>
> Processors that support AutoIBRS also support Same Mode IBRS (see
> CPUID Fn8000_0008_EBX[IbrsSameMode] (bit 19)).
>
> > Fortunately, there is an IBPB when switching VMCBs in svm_vcpu_load().
> > But it might be worth noting that this is necessary for AutoIBRS to
> > work (unless it actually isn't).
>
> It is needed, but not for kernel/CPL0 code, rather to protect one
> guest's user-space code from another's.

The question is whether it's necessary when switching between L1 and
L2 on the same vCPU of the same VM.

On the Intel side, this was (erroneously) optimized away in commit
5c911beff20a ("KVM: nVMX: Skip IBPB when switching between vmcs01 and
vmcs02").
  
Kim Phillips Nov. 8, 2022, 10:48 p.m. UTC | #5
On 11/7/22 4:42 PM, Jim Mattson wrote:
> On Mon, Nov 7, 2022 at 2:29 PM Kim Phillips <kim.phillips@amd.com> wrote:
>>
>> On 11/4/22 5:00 PM, Jim Mattson wrote:
>>> On Fri, Nov 4, 2022 at 2:38 PM Kim Phillips <kim.phillips@amd.com> wrote:
>>>>
>>>> VM Guests may want to use Auto IBRS, so propagate the CPUID to them.
>>>>
>>>> Co-developed-by: Babu Moger <Babu.Moger@amd.com>
>>>> Signed-off-by: Kim Phillips <kim.phillips@amd.com>
>>>
>>> The APM says that, under AutoIBRS, CPL0 processes "have IBRS
>>> protection." I'm taking this to mean only that indirect branches in
>>> CPL0 are not subject to steering from a less privileged predictor
>>> mode. This would imply that indirect branches executed at CPL0 in L1
>>> could potentially be subject to steering by code running at CPL0 in
>>> L2, since L1 and L2 share hardware predictor modes.
>>
>> That's true for AMD processors that don't support Same Mode IBRS, also
>> documented in the APM.
>>
>> Processors that support AutoIBRS also support Same Mode IBRS (see
>> CPUID Fn8000_0008_EBX[IbrsSameMode] (bit 19)).
>>
>>> Fortunately, there is an IBPB when switching VMCBs in svm_vcpu_load().
>>> But it might be worth noting that this is necessary for AutoIBRS to
>>> work (unless it actually isn't).
>>
>> It is needed, but not for kernel/CPL0 code, rather to protect one
>> guest's user-space code from another's.
> 
> The question is whether it's necessary when switching between L1 and
> L2 on the same vCPU of the same VM.
> 
> On the Intel side, this was (erroneously) optimized away in commit
> 5c911beff20a ("KVM: nVMX: Skip IBPB when switching between vmcs01 and
> vmcs02").

Then why hasn't it been reverted?

Does its rationale not make sense?:

     The IBPB is intended to prevent one guest from attacking another, which
     is unnecessary in the nested case as it's the same guest from KVM's
     perspective.

Thanks,

Kim
  
Jim Mattson Nov. 8, 2022, 10:59 p.m. UTC | #6
On Tue, Nov 8, 2022 at 2:48 PM Kim Phillips <kim.phillips@amd.com> wrote:
>
> On 11/7/22 4:42 PM, Jim Mattson wrote:
> > On Mon, Nov 7, 2022 at 2:29 PM Kim Phillips <kim.phillips@amd.com> wrote:
> >>
> >> On 11/4/22 5:00 PM, Jim Mattson wrote:
> >>> On Fri, Nov 4, 2022 at 2:38 PM Kim Phillips <kim.phillips@amd.com> wrote:
> >>>>
> >>>> VM Guests may want to use Auto IBRS, so propagate the CPUID to them.
> >>>>
> >>>> Co-developed-by: Babu Moger <Babu.Moger@amd.com>
> >>>> Signed-off-by: Kim Phillips <kim.phillips@amd.com>
> >>>
> >>> The APM says that, under AutoIBRS, CPL0 processes "have IBRS
> >>> protection." I'm taking this to mean only that indirect branches in
> >>> CPL0 are not subject to steering from a less privileged predictor
> >>> mode. This would imply that indirect branches executed at CPL0 in L1
> >>> could potentially be subject to steering by code running at CPL0 in
> >>> L2, since L1 and L2 share hardware predictor modes.
> >>
> >> That's true for AMD processors that don't support Same Mode IBRS, also
> >> documented in the APM.
> >>
> >> Processors that support AutoIBRS also support Same Mode IBRS (see
> >> CPUID Fn8000_0008_EBX[IbrsSameMode] (bit 19)).
> >>
> >>> Fortunately, there is an IBPB when switching VMCBs in svm_vcpu_load().
> >>> But it might be worth noting that this is necessary for AutoIBRS to
> >>> work (unless it actually isn't).
> >>
> >> It is needed, but not for kernel/CPL0 code, rather to protect one
> >> guest's user-space code from another's.
> >
> > The question is whether it's necessary when switching between L1 and
> > L2 on the same vCPU of the same VM.
> >
> > On the Intel side, this was (erroneously) optimized away in commit
> > 5c911beff20a ("KVM: nVMX: Skip IBPB when switching between vmcs01 and
> > vmcs02").
>
> Then why hasn't it been reverted?

Sometimes, the wheels turn slowly. See
https://lore.kernel.org/kvm/20221019213620.1953281-1-jmattson@google.com/.

> Does its rationale not make sense?:
>
>      The IBPB is intended to prevent one guest from attacking another, which
>      is unnecessary in the nested case as it's the same guest from KVM's
>      perspective.

No, it doesn't. IBRS promises to protect the host from the guest. To
properly virtualize IBRS, KVM has to provide that protection,
regardless of its "perspective."
  

Patch

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 7065462378e2..2524cd82627b 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -730,6 +730,8 @@  void kvm_set_cpu_caps(void)
 		0 /* SME */ | F(SEV) | 0 /* VM_PAGE_FLUSH */ | F(SEV_ES) |
 		F(SME_COHERENT));
 
+	kvm_cpu_cap_mask(CPUID_8000_0021_EAX, F(AUTOIBRS));
+
 	kvm_cpu_cap_mask(CPUID_C000_0001_EDX,
 		F(XSTORE) | F(XSTORE_EN) | F(XCRYPT) | F(XCRYPT_EN) |
 		F(ACE2) | F(ACE2_EN) | F(PHE) | F(PHE_EN) |
@@ -1211,12 +1213,13 @@  static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 		 *    EAX      0      NNDBP, Processor ignores nested data breakpoints
 		 *    EAX      2      LAS, LFENCE always serializing
 		 *    EAX      6      NSCB, Null selector clear base
+		 *    EAX      8      Automatic IBRS
 		 *
 		 * Other defined bits are for MSRs that KVM does not expose:
 		 *   EAX      3      SPCL, SMM page configuration lock
 		 *   EAX      13     PCMSR, Prefetch control MSR
 		 */
-		entry->eax &= BIT(0) | BIT(2) | BIT(6);
+		entry->eax &= BIT(0) | BIT(2) | BIT(6) | BIT(8);
 		if (static_cpu_has(X86_FEATURE_LFENCE_RDTSC))
 			entry->eax |= BIT(2);
 		if (!static_cpu_has_bug(X86_BUG_NULL_SEG))
diff --git a/arch/x86/kvm/reverse_cpuid.h b/arch/x86/kvm/reverse_cpuid.h
index a19d473d0184..7eeade35a425 100644
--- a/arch/x86/kvm/reverse_cpuid.h
+++ b/arch/x86/kvm/reverse_cpuid.h
@@ -48,6 +48,7 @@  static const struct cpuid_reg reverse_cpuid[] = {
 	[CPUID_7_1_EAX]       = {         7, 1, CPUID_EAX},
 	[CPUID_12_EAX]        = {0x00000012, 0, CPUID_EAX},
 	[CPUID_8000_001F_EAX] = {0x8000001f, 0, CPUID_EAX},
+	[CPUID_8000_0021_EAX] = {0x80000021, 0, CPUID_EAX},
 };
 
 /*
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 58f0077d9357..2add5eb3303f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4993,6 +4993,9 @@  static __init int svm_hardware_setup(void)
 
 	tsc_aux_uret_slot = kvm_add_user_return_msr(MSR_TSC_AUX);
 
+	if (boot_cpu_has(X86_FEATURE_AUTOIBRS))
+		kvm_enable_efer_bits(EFER_AUTOIBRS);
+
 	/* Check for pause filtering support */
 	if (!boot_cpu_has(X86_FEATURE_PAUSEFILTER)) {
 		pause_filter_count = 0;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9cf1ba865562..3dbeda353853 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1687,6 +1687,9 @@  static int do_get_msr_feature(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
 
 static bool __kvm_valid_efer(struct kvm_vcpu *vcpu, u64 efer)
 {
+	if (efer & EFER_AUTOIBRS && !guest_cpuid_has(vcpu, X86_FEATURE_AUTOIBRS))
+		return false;
+
 	if (efer & EFER_FFXSR && !guest_cpuid_has(vcpu, X86_FEATURE_FXSR_OPT))
 		return false;
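
The EFER check added to __kvm_valid_efer() above can be illustrated outside
the kernel with a small model. This is a sketch under stated assumptions,
not the KVM implementation: struct guest_cpuid, guest_has_autoibrs() and
efer_autoibrs_valid() are hypothetical names, and the EFER bit position (21)
is an assumption here, since EFER_AUTOIBRS is defined by an earlier patch in
the series and not shown in this one. The CPUID bit (Fn8000_0021 EAX bit 8)
is taken from the patch comment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position; the real definition is not part of this patch. */
#define EFER_AUTOIBRS (UINT64_C(1) << 21)

/* Hypothetical model of the CPUID state that was configured for a guest. */
struct guest_cpuid {
	uint32_t fn8000_0021_eax;
};

/* AutoIBRS is advertised in CPUID Fn8000_0021 EAX bit 8. */
static bool guest_has_autoibrs(const struct guest_cpuid *cpuid)
{
	return (cpuid->fn8000_0021_eax & (UINT32_C(1) << 8)) != 0;
}

/*
 * Reject an EFER value that sets AUTOIBRS when the guest's CPUID does not
 * advertise the feature; every other value passes this particular check.
 */
static bool efer_autoibrs_valid(const struct guest_cpuid *cpuid, uint64_t efer)
{
	if ((efer & EFER_AUTOIBRS) && !guest_has_autoibrs(cpuid))
		return false;

	return true;
}
```

The check depends on the guest's own CPUID rather than the host's, so a
guest that was never offered AutoIBRS cannot enable it through EFER even on
hardware that supports it.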