[v2,05/21] KVM: x86: Disallow writes to immutable feature MSRs after KVM_RUN

Message ID 20230210003148.2646712-6-seanjc@google.com
State New
Series KVM: x86: Disallow writes to feature MSRs post-KVM_RUN

Commit Message

Sean Christopherson Feb. 10, 2023, 12:31 a.m. UTC
Disallow writes to feature MSRs after KVM_RUN to prevent userspace from
changing the vCPU model after running the vCPU.  Similar to guest CPUID,
KVM uses feature MSRs to configure intercepts, determine what operations
are/aren't allowed, etc.  Changing the capabilities while the vCPU is
active will at best yield unpredictable guest behavior, and at worst
could be dangerous to KVM.

Allow writing the current value, e.g. so that userspace can blindly set
all MSRs when emulating RESET, and unconditionally allow writes to
MSR_IA32_UCODE_REV so that userspace can emulate patch loads.

Special case the VMX MSRs to keep the generic list small, i.e. so that
KVM can do a linear walk of the generic list without incurring meaningful
overhead.

Cc: Like Xu <like.xu.linux@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/x86.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)
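
The policy described in the commit message can be modelled in plain userspace C. What follows is only a sketch under stated assumptions, not the KVM implementation: every `model_*` name, the four-entry MSR table, and the choice of which MSRs to include are hypothetical, and the MSR indices are included purely for illustration. It shows the three cases the message calls out: writes are unrestricted before the first run, same-value writes still succeed afterwards, and the microcode revision MSR stays writable.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only; names and table layout are hypothetical. */
#define MODEL_MSR_TSC               0x00000010u /* ordinary, non-feature MSR */
#define MODEL_MSR_UCODE_REV         0x0000008bu /* feature MSR, stays writable */
#define MODEL_MSR_ARCH_CAPABILITIES 0x0000010au /* feature MSR, immutable */
#define MODEL_FIRST_VMX_MSR         0x00000480u /* start of the VMX range */
#define MODEL_LAST_VMX_MSR          0x00000491u /* end of the VMX range */

/* Stand-in for the generic (non-VMX) feature MSR list. */
static const uint32_t model_feature_msrs[] = {
	MODEL_MSR_UCODE_REV,
	MODEL_MSR_ARCH_CAPABILITIES,
};

struct model_msr {
	uint32_t index;
	uint64_t value;
};

struct model_vcpu {
	int last_vmentry_cpu;		/* -1 until the vCPU has run once */
	struct model_msr msrs[4];
};

static void model_vcpu_init(struct model_vcpu *vcpu)
{
	assert(vcpu != NULL);
	vcpu->last_vmentry_cpu = -1;
	vcpu->msrs[0] = (struct model_msr){ MODEL_MSR_TSC, 0 };
	vcpu->msrs[1] = (struct model_msr){ MODEL_MSR_UCODE_REV, 0 };
	vcpu->msrs[2] = (struct model_msr){ MODEL_MSR_ARCH_CAPABILITIES, 0 };
	vcpu->msrs[3] = (struct model_msr){ MODEL_FIRST_VMX_MSR, 0 };
}

/* Models the first KVM_RUN by recording a CPU number. */
static void model_vcpu_run(struct model_vcpu *vcpu)
{
	vcpu->last_vmentry_cpu = 0;
}

static bool model_vcpu_has_run(const struct model_vcpu *vcpu)
{
	return vcpu->last_vmentry_cpu != -1;
}

static bool model_is_immutable_feature_msr(uint32_t msr)
{
	size_t i;

	/* The VMX range is checked directly instead of being listed. */
	if (msr >= MODEL_FIRST_VMX_MSR && msr <= MODEL_LAST_VMX_MSR)
		return true;

	for (i = 0; i < sizeof(model_feature_msrs) / sizeof(model_feature_msrs[0]); i++) {
		if (msr == model_feature_msrs[i])
			return msr != MODEL_MSR_UCODE_REV;
	}
	return false;
}

static struct model_msr *model_find_msr(struct model_vcpu *vcpu, uint32_t index)
{
	size_t i;

	for (i = 0; i < sizeof(vcpu->msrs) / sizeof(vcpu->msrs[0]); i++) {
		if (vcpu->msrs[i].index == index)
			return &vcpu->msrs[i];
	}
	return NULL;
}

/* Returns 0 on success, -EINVAL for unknown MSRs or rejected writes. */
static int model_set_msr(struct model_vcpu *vcpu, uint32_t index, uint64_t data)
{
	struct model_msr *msr = model_find_msr(vcpu, index);

	if (!msr)
		return -EINVAL;

	/* After the first run, only a write of the current value succeeds. */
	if (model_vcpu_has_run(vcpu) && model_is_immutable_feature_msr(index))
		return data == msr->value ? 0 : -EINVAL;

	msr->value = data;
	return 0;
}
```

In this model, a caller that replays every saved MSR value after `model_vcpu_run()` sees success for each write as long as the feature MSR values are unchanged, which is the "blindly set all MSRs when emulating RESET" case from the commit message.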
  

Comments

Yu Zhang Feb. 10, 2023, 1:01 p.m. UTC | #1
On Fri, Feb 10, 2023 at 12:31:32AM +0000, Sean Christopherson wrote:
> Disallow writes to feature MSRs after KVM_RUN to prevent userspace from
> changing the vCPU model after running the vCPU.  Similar to guest CPUID,
> KVM uses feature MSRs to configure intercepts, determine what operations
> are/aren't allowed, etc.  Changing the capabilities while the vCPU is
> active will at best yield unpredictable guest behavior, and at worst
> could be dangerous to KVM.
> 
> Allow writing the current value, e.g. so that userspace can blindly set
> all MSRs when emulating RESET, and unconditionally allow writes to
> MSR_IA32_UCODE_REV so that userspace can emulate patch loads.
> 
> Special case the VMX MSRs to keep the generic list small, i.e. so that
> KVM can do a linear walk of the generic list without incurring meaningful
> overhead.
> 
> Cc: Like Xu <like.xu.linux@gmail.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>  arch/x86/kvm/x86.c | 36 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 36 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7b73a0b45041..186cb6a81643 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1554,6 +1554,25 @@ static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all_except_vmx) +
>  			      (KVM_LAST_EMULATED_VMX_MSR - KVM_FIRST_EMULATED_VMX_MSR + 1)];
>  static unsigned int num_msr_based_features;
>  
> +/*
> + * All feature MSRs except uCode revID, which tracks the currently loaded uCode
> + * patch, are immutable once the vCPU model is defined.
> + */
> +static bool kvm_is_immutable_feature_msr(u32 msr)
> +{
> +	int i;
> +
> +	if (msr >= KVM_FIRST_EMULATED_VMX_MSR && msr <= KVM_LAST_EMULATED_VMX_MSR)
> +		return true;
> +
> +	for (i = 0; i < ARRAY_SIZE(msr_based_features_all_except_vmx); i++) {
> +		if (msr == msr_based_features_all_except_vmx[i])
> +			return msr != MSR_IA32_UCODE_REV;
> +	}
> +
> +	return false;
> +}
> +
>  /*
>   * Some IA32_ARCH_CAPABILITIES bits have dependencies on MSRs that KVM
>   * does not yet virtualize. These include:
> @@ -2168,6 +2187,23 @@ static int do_get_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
>  
>  static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
>  {
> +	u64 val;
> +
> +	/*
> +	 * Disallow writes to immutable feature MSRs after KVM_RUN.  KVM does
> +	 * not support modifying the guest vCPU model on the fly, e.g. changing
> +	 * the nVMX capabilities while L2 is running is nonsensical.  Ignore
> +	 * writes of the same value, e.g. to allow userspace to blindly stuff
> +	 * all MSRs when emulating RESET.
> +	 */
> +	if (vcpu->arch.last_vmentry_cpu != -1 &&

Use kvm_vcpu_has_run(vcpu) here? 

B.R.
Yu
  
Sean Christopherson Feb. 10, 2023, 4:31 p.m. UTC | #2
On Fri, Feb 10, 2023, Yu Zhang wrote:
> On Fri, Feb 10, 2023 at 12:31:32AM +0000, Sean Christopherson wrote:
> > @@ -2168,6 +2187,23 @@ static int do_get_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
> >  
> >  static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
> >  {
> > +	u64 val;
> > +
> > +	/*
> > +	 * Disallow writes to immutable feature MSRs after KVM_RUN.  KVM does
> > +	 * not support modifying the guest vCPU model on the fly, e.g. changing
> > +	 * the nVMX capabilities while L2 is running is nonsensical.  Ignore
> > +	 * writes of the same value, e.g. to allow userspace to blindly stuff
> > +	 * all MSRs when emulating RESET.
> > +	 */
> > +	if (vcpu->arch.last_vmentry_cpu != -1 &&
> 
> Use kvm_vcpu_has_run(vcpu) here? 

/facepalm

Yes, that was the entire point of adding the helper.

Thanks!
  
Xiaoyao Li Feb. 14, 2023, 6:39 a.m. UTC | #3
Maybe make the title clearer so that it reflects what the patch really does:

KVM: x86: Disallow writes to immutable feature MSRs from user space after KVM_RUN

On 2/10/2023 8:31 AM, Sean Christopherson wrote:
> Disallow writes to feature MSRs after KVM_RUN to prevent userspace from
> changing the vCPU model after running the vCPU.  Similar to guest CPUID,
> KVM uses feature MSRs to configure intercepts, determine what operations
> are/aren't allowed, etc.  Changing the capabilities while the vCPU is
> active will at best yield unpredictable guest behavior, and at worst
> could be dangerous to KVM.
> 
> Allow writing the current value, e.g. so that userspace can blindly set
> all MSRs when emulating RESET, and unconditionally allow writes to
> MSR_IA32_UCODE_REV so that userspace can emulate patch loads.
> 
> Special case the VMX MSRs to keep the generic list small, i.e. so that
> KVM can do a linear walk of the generic list without incurring meaningful
> overhead.
> 
> Cc: Like Xu <like.xu.linux@gmail.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/x86/kvm/x86.c | 36 ++++++++++++++++++++++++++++++++++++
>   1 file changed, 36 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7b73a0b45041..186cb6a81643 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1554,6 +1554,25 @@ static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all_except_vmx) +
>   			      (KVM_LAST_EMULATED_VMX_MSR - KVM_FIRST_EMULATED_VMX_MSR + 1)];
>   static unsigned int num_msr_based_features;
>   
> +/*
> + * All feature MSRs except uCode revID, which tracks the currently loaded uCode
> + * patch, are immutable once the vCPU model is defined.
> + */
> +static bool kvm_is_immutable_feature_msr(u32 msr)
> +{
> +	int i;
> +
> +	if (msr >= KVM_FIRST_EMULATED_VMX_MSR && msr <= KVM_LAST_EMULATED_VMX_MSR)
> +		return true;
> +
> +	for (i = 0; i < ARRAY_SIZE(msr_based_features_all_except_vmx); i++) {
> +		if (msr == msr_based_features_all_except_vmx[i])
> +			return msr != MSR_IA32_UCODE_REV;
> +	}
> +
> +	return false;
> +}
> +
>   /*
>    * Some IA32_ARCH_CAPABILITIES bits have dependencies on MSRs that KVM
>    * does not yet virtualize. These include:
> @@ -2168,6 +2187,23 @@ static int do_get_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
>   
>   static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
>   {
> +	u64 val;
> +
> +	/*
> +	 * Disallow writes to immutable feature MSRs after KVM_RUN.  KVM does
> +	 * not support modifying the guest vCPU model on the fly, e.g. changing
> +	 * the nVMX capabilities while L2 is running is nonsensical.  Ignore
> +	 * writes of the same value, e.g. to allow userspace to blindly stuff
> +	 * all MSRs when emulating RESET.
> +	 */
> +	if (vcpu->arch.last_vmentry_cpu != -1 &&

After this is replaced with the helper,

Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>

> +	    kvm_is_immutable_feature_msr(index)) {
> +		if (do_get_msr(vcpu, index, &val) || *data != val)
> +			return -EINVAL;
> +
> +		return 0;
> +	}
> +
>   	return kvm_set_msr_ignored_check(vcpu, index, *data, true);
>   }
>
  
Like Xu Feb. 14, 2023, 11:39 a.m. UTC | #4
On 10/2/2023 8:31 am, Sean Christopherson wrote:
> Disallow writes to feature MSRs after KVM_RUN to prevent userspace from
> changing the vCPU model after running the vCPU.  Similar to guest CPUID,
> KVM uses feature MSRs to configure intercepts, determine what operations
> are/aren't allowed, etc.  Changing the capabilities while the vCPU is
> active will at best yield unpredictable guest behavior, and at worst
> could be dangerous to KVM.
> 
> Allow writing the current value, e.g. so that userspace can blindly set
> all MSRs when emulating RESET, and unconditionally allow writes to
> MSR_IA32_UCODE_REV so that userspace can emulate patch loads.
> 
> Special case the VMX MSRs to keep the generic list small, i.e. so that
> KVM can do a linear walk of the generic list without incurring meaningful
> overhead.
> 
> Cc: Like Xu <like.xu.linux@gmail.com>
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/x86/kvm/x86.c | 36 ++++++++++++++++++++++++++++++++++++
>   1 file changed, 36 insertions(+)
> 
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7b73a0b45041..186cb6a81643 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1554,6 +1554,25 @@ static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all_except_vmx) +
>   			      (KVM_LAST_EMULATED_VMX_MSR - KVM_FIRST_EMULATED_VMX_MSR + 1)];
>   static unsigned int num_msr_based_features;
>   
> +/*
> + * All feature MSRs except uCode revID, which tracks the currently loaded uCode
> + * patch, are immutable once the vCPU model is defined.
> + */
> +static bool kvm_is_immutable_feature_msr(u32 msr)
> +{
> +	int i;
> +
> +	if (msr >= KVM_FIRST_EMULATED_VMX_MSR && msr <= KVM_LAST_EMULATED_VMX_MSR)
> +		return true;
> +
> +	for (i = 0; i < ARRAY_SIZE(msr_based_features_all_except_vmx); i++) {
> +		if (msr == msr_based_features_all_except_vmx[i])
> +			return msr != MSR_IA32_UCODE_REV;
> +	}
> +
> +	return false;
> +}
> +
>   /*
>    * Some IA32_ARCH_CAPABILITIES bits have dependencies on MSRs that KVM
>    * does not yet virtualize. These include:
> @@ -2168,6 +2187,23 @@ static int do_get_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
>   
>   static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
>   {
> +	u64 val;
> +
> +	/*
> +	 * Disallow writes to immutable feature MSRs after KVM_RUN.  KVM does
> +	 * not support modifying the guest vCPU model on the fly, e.g. changing
> +	 * the nVMX capabilities while L2 is running is nonsensical.  Ignore
> +	 * writes of the same value, e.g. to allow userspace to blindly stuff
> +	 * all MSRs when emulating RESET.
> +	 */
> +	if (vcpu->arch.last_vmentry_cpu != -1 &&

Three concerns come to mind (in case they help you think this through):
- Why not use kvm->created_vcpus?

- What if different vCPUs of the same guest have different feature MSR
values?
(Although the values are not altered after the first run, test cases
(selftests) may be needed to show that it is dangerous for KVM.)

- The point at which "vcpu->arch.last_vmentry_cpu = vcpu->cpu" is set is
still too late, since part of the guest code (an attack window) has already
been executed on the first run of kvm_x86_vcpu_run(), which may run for a
long time.

> +	    kvm_is_immutable_feature_msr(index)) {
> +		if (do_get_msr(vcpu, index, &val) || *data != val)
> +			return -EINVAL;
> +
> +		return 0;
> +	}
> +
>   	return kvm_set_msr_ignored_check(vcpu, index, *data, true);
>   }
>
  
Sean Christopherson Feb. 15, 2023, 10:36 p.m. UTC | #5
On Tue, Feb 14, 2023, Like Xu wrote:
> On 10/2/2023 8:31 am, Sean Christopherson wrote:
> > @@ -2168,6 +2187,23 @@ static int do_get_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
> >   static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
> >   {
> > +	u64 val;
> > +
> > +	/*
> > +	 * Disallow writes to immutable feature MSRs after KVM_RUN.  KVM does
> > +	 * not support modifying the guest vCPU model on the fly, e.g. changing
> > +	 * the nVMX capabilities while L2 is running is nonsensical.  Ignore
> > +	 * writes of the same value, e.g. to allow userspace to blindly stuff
> > +	 * all MSRs when emulating RESET.
> > +	 */
> > +	if (vcpu->arch.last_vmentry_cpu != -1 &&
> 
> Three concerns come to mind (in case they help you think this through):
> - Why not use kvm->created_vcpus?

Because this is a vCPU scoped ioctl().

> - What if different vCPUs of the same guest have different feature MSR
> values?

KVM shouldn't care.  If KVM does care, then that's a completely orthogonal bug
that needs to be fixed separately.

> (Although the values are not altered after the first run, test cases
> (selftests) may be needed to show that it is dangerous for KVM.)
> 
> - The point at which "vcpu->arch.last_vmentry_cpu = vcpu->cpu" is set is
> still too late, since part of the guest code (an attack window) has already
> been executed on the first run of kvm_x86_vcpu_run(), which may run for a
> long time.

Again, this is a vCPU scoped ioctl.  The task doing KVM_RUN holds vcpu->mutex so
there is no race.
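
The serialization argument can be illustrated with a small single-threaded C model. This is a sketch under loud assumptions, not kernel code: the `sketch_*` names are hypothetical, a C11 `atomic_flag` stands in for vcpu->mutex, and the model returns -EBUSY where a real vCPU ioctl would instead wait for the mutex. The point it shows is only that a feature MSR write cannot be evaluated while a run is in flight, and that once the run releases the lock the "has run" state is already visible.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: atomic_flag stands in for vcpu->mutex. */
struct sketch_vcpu {
	atomic_flag mutex;
	int last_vmentry_cpu;	/* -1 until the first entry into the guest */
};

static void sketch_vcpu_init(struct sketch_vcpu *vcpu)
{
	assert(vcpu != NULL);
	atomic_flag_clear(&vcpu->mutex);
	vcpu->last_vmentry_cpu = -1;
}

/* Returns true if the lock was free and is now held by the caller. */
static bool sketch_vcpu_trylock(struct sketch_vcpu *vcpu)
{
	return !atomic_flag_test_and_set(&vcpu->mutex);
}

static void sketch_vcpu_unlock(struct sketch_vcpu *vcpu)
{
	atomic_flag_clear(&vcpu->mutex);
}

/* Start of a modelled KVM_RUN: take the lock, then enter the guest. */
static int sketch_run_begin(struct sketch_vcpu *vcpu)
{
	if (!sketch_vcpu_trylock(vcpu))
		return -EBUSY;
	vcpu->last_vmentry_cpu = 0;	/* recorded while the lock is held */
	return 0;
}

/* End of a modelled KVM_RUN: drop the lock on return to userspace. */
static void sketch_run_end(struct sketch_vcpu *vcpu)
{
	sketch_vcpu_unlock(vcpu);
}

/*
 * Modelled write that changes an immutable feature MSR. The "has run"
 * check is only ever evaluated with the lock held.
 */
static int sketch_set_feature_msr(struct sketch_vcpu *vcpu)
{
	int ret;

	if (!sketch_vcpu_trylock(vcpu))
		return -EBUSY;	/* a real ioctl would wait here instead */

	ret = (vcpu->last_vmentry_cpu != -1) ? -EINVAL : 0;
	sketch_vcpu_unlock(vcpu);
	return ret;
}
```

Calling `sketch_set_feature_msr()` between `sketch_run_begin()` and `sketch_run_end()` never reaches the "has run" check in this model, so there is no window in which the write is accepted while guest code is executing.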
  

Patch

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7b73a0b45041..186cb6a81643 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1554,6 +1554,25 @@  static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all_except_vmx) +
 			      (KVM_LAST_EMULATED_VMX_MSR - KVM_FIRST_EMULATED_VMX_MSR + 1)];
 static unsigned int num_msr_based_features;
 
+/*
+ * All feature MSRs except uCode revID, which tracks the currently loaded uCode
+ * patch, are immutable once the vCPU model is defined.
+ */
+static bool kvm_is_immutable_feature_msr(u32 msr)
+{
+	int i;
+
+	if (msr >= KVM_FIRST_EMULATED_VMX_MSR && msr <= KVM_LAST_EMULATED_VMX_MSR)
+		return true;
+
+	for (i = 0; i < ARRAY_SIZE(msr_based_features_all_except_vmx); i++) {
+		if (msr == msr_based_features_all_except_vmx[i])
+			return msr != MSR_IA32_UCODE_REV;
+	}
+
+	return false;
+}
+
 /*
  * Some IA32_ARCH_CAPABILITIES bits have dependencies on MSRs that KVM
  * does not yet virtualize. These include:
@@ -2168,6 +2187,23 @@  static int do_get_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
 
 static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
 {
+	u64 val;
+
+	/*
+	 * Disallow writes to immutable feature MSRs after KVM_RUN.  KVM does
+	 * not support modifying the guest vCPU model on the fly, e.g. changing
+	 * the nVMX capabilities while L2 is running is nonsensical.  Ignore
+	 * writes of the same value, e.g. to allow userspace to blindly stuff
+	 * all MSRs when emulating RESET.
+	 */
+	if (vcpu->arch.last_vmentry_cpu != -1 &&
+	    kvm_is_immutable_feature_msr(index)) {
+		if (do_get_msr(vcpu, index, &val) || *data != val)
+			return -EINVAL;
+
+		return 0;
+	}
+
 	return kvm_set_msr_ignored_check(vcpu, index, *data, true);
 }