KVM: nVMX: Emulate NOPs in L2, and PAUSE if it's not intercepted

Message ID 20230405002359.418138-1-seanjc@google.com
State New
Headers
Series KVM: nVMX: Emulate NOPs in L2, and PAUSE if it's not intercepted |

Commit Message

Sean Christopherson April 5, 2023, 12:23 a.m. UTC
  Extend VMX's nested intercept logic for emulated instructions to handle
"pause" interception, in quotes because KVM's emulator doesn't filter out
NOPs when checking for nested intercepts.  Failure to allow emulation of
NOPs results in KVM injecting a #UD into L2 on any NOP that collides with
the emulator's definition of PAUSE, i.e. on all single-byte NOPs.

For PAUSE itself, honor L1's PAUSE-exiting control, but ignore PLE to
avoid unnecessarily injecting a #UD into L2.  Per the SDM, the first
execution of PAUSE after VM-Entry is treated as the beginning of a new
loop, i.e. will never trigger a PLE VM-Exit, and so L1 can't expect any
given execution of PAUSE to deterministically exit.

  ... the processor considers this execution to be the first execution of
  PAUSE in a loop. (It also does so for the first execution of PAUSE at
  CPL 0 after VM entry.)

All that said, the PLE side of things is currently a moot point, as KVM
doesn't expose PLE to L1.

Note, vmx_check_intercept() is still wildly broken when L1 wants to
intercept an instruction, as KVM injects a #UD instead of synthesizing a
nested VM-Exit.  That issue extends far beyond NOP/PAUSE and needs far
more effort to fix, i.e. is a problem for the future.

Fixes: 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode")
Cc: Mathias Krause <minipli@grsecurity.net>
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kvm/vmx/vmx.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)


base-commit: 27d6845d258b67f4eb3debe062b7dacc67e0c393
  

Comments

Sean Christopherson April 10, 2023, 11:31 p.m. UTC | #1
On Tue, 04 Apr 2023 17:23:59 -0700, Sean Christopherson wrote:
> Extend VMX's nested intercept logic for emulated instructions to handle
> "pause" interception, in quotes because KVM's emulator doesn't filter out
> NOPs when checking for nested intercepts.  Failure to allow emulation of
> NOPs results in KVM injecting a #UD into L2 on any NOP that collides with
> the emulator's definition of PAUSE, i.e. on all single-byte NOPs.
> 
> For PAUSE itself, honor L1's PAUSE-exiting control, but ignore PLE to
> avoid unnecessarily injecting a #UD into L2.  Per the SDM, the first
> execution of PAUSE after VM-Entry is treated as the beginning of a new
> loop, i.e. will never trigger a PLE VM-Exit, and so L1 can't expect any
> given execution of PAUSE to deterministically exit.
> 
> [...]

Applied to kvm-x86 vmx.  I haven't gotten any reviews, but the FEP changes
in KUT will cause the nVMX test to fail, so I want to get kvm-x86/next fixed
sooner than later.  I'm not expecting anything else for "vmx", so unwinding
should be easy if it turns out this is busted/flawed.

[1/1] KVM: nVMX: Emulate NOPs in L2, and PAUSE if it's not intercepted
      https://github.com/kvm-x86/linux/commit/84f481315b10

--
https://github.com/kvm-x86/linux/tree/next
https://github.com/kvm-x86/linux/tree/fixes
  
Paolo Bonzini April 11, 2023, 6:37 a.m. UTC | #2
On 4/5/23 02:23, Sean Christopherson wrote:
> Extend VMX's nested intercept logic for emulated instructions to handle
> "pause" interception, in quotes because KVM's emulator doesn't filter out
> NOPs when checking for nested intercepts.  Failure to allow emulation of
> NOPs results in KVM injecting a #UD into L2 on any NOP that collides with
> the emulator's definition of PAUSE, i.e. on all single-byte NOPs.
> 
> For PAUSE itself, honor L1's PAUSE-exiting control, but ignore PLE to
> avoid unnecessarily injecting a #UD into L2.  Per the SDM, the first
> execution of PAUSE after VM-Entry is treated as the beginning of a new
> loop, i.e. will never trigger a PLE VM-Exit, and so L1 can't expect any
> given execution of PAUSE to deterministically exit.
> 
>    ... the processor considers this execution to be the first execution of
>    PAUSE in a loop. (It also does so for the first execution of PAUSE at
>    CPL 0 after VM entry.)
> 
> All that said, the PLE side of things is currently a moot point, as KVM
> doesn't expose PLE to L1.
> 
> Note, vmx_check_intercept() is still wildly broken when L1 wants to
> intercept an instruction, as KVM injects a #UD instead of synthesizing a
> nested VM-Exit.  That issue extends far beyond NOP/PAUSE and needs far
> more effort to fix, i.e. is a problem for the future.
> 
> Fixes: 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode")
> Cc: Mathias Krause <minipli@grsecurity.net>
> Cc: stable@vger.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@google.com>
> ---
>   arch/x86/kvm/vmx/vmx.c | 15 +++++++++++++++
>   1 file changed, 15 insertions(+)
> 
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 9ae4044f076f..1e560457bf9a 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7898,6 +7898,21 @@ static int vmx_check_intercept(struct kvm_vcpu *vcpu,
>   		/* FIXME: produce nested vmexit and return X86EMUL_INTERCEPTED.  */
>   		break;
>   
> +	case x86_intercept_pause:
> +		/*
> +		 * PAUSE is a single-byte NOP with a REPE prefix, i.e. collides
> +		 * with vanilla NOPs in the emulator.  Apply the interception
> +		 * check only to actual PAUSE instructions.  Don't check
> +		 * PAUSE-loop-exiting, software can't expect a given PAUSE to
> +		 * exit, i.e. KVM is within its rights to allow L2 to execute
> +		 * the PAUSE.
> +		 */
> +		if ((info->rep_prefix != REPE_PREFIX) ||
> +		    !nested_cpu_has2(vmcs12, CPU_BASED_PAUSE_EXITING))
> +			return X86EMUL_CONTINUE;
> +
> +		break;
> +
>   	/* TODO: check more intercepts... */
>   	default:
>   		break;
> 
> base-commit: 27d6845d258b67f4eb3debe062b7dacc67e0c393

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Would you like me to apply this for 6.3?

Paolo
  
Sean Christopherson April 11, 2023, 4:07 p.m. UTC | #3
On Tue, Apr 11, 2023, Paolo Bonzini wrote:
> On 4/5/23 02:23, Sean Christopherson wrote:
> > Fixes: 07721feee46b ("KVM: nVMX: Don't emulate instructions in guest mode")
> > Cc: Mathias Krause <minipli@grsecurity.net>
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >   arch/x86/kvm/vmx/vmx.c | 15 +++++++++++++++
> >   1 file changed, 15 insertions(+)
> > 
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 9ae4044f076f..1e560457bf9a 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -7898,6 +7898,21 @@ static int vmx_check_intercept(struct kvm_vcpu *vcpu,
> >   		/* FIXME: produce nested vmexit and return X86EMUL_INTERCEPTED.  */
> >   		break;
> > +	case x86_intercept_pause:
> > +		/*
> > +		 * PAUSE is a single-byte NOP with a REPE prefix, i.e. collides
> > +		 * with vanilla NOPs in the emulator.  Apply the interception
> > +		 * check only to actual PAUSE instructions.  Don't check
> > +		 * PAUSE-loop-exiting, software can't expect a given PAUSE to
> > +		 * exit, i.e. KVM is within its rights to allow L2 to execute
> > +		 * the PAUSE.
> > +		 */
> > +		if ((info->rep_prefix != REPE_PREFIX) ||
> > +		    !nested_cpu_has2(vmcs12, CPU_BASED_PAUSE_EXITING))
> > +			return X86EMUL_CONTINUE;
> > +
> > +		break;
> > +
> >   	/* TODO: check more intercepts... */
> >   	default:
> >   		break;
> > 
> > base-commit: 27d6845d258b67f4eb3debe062b7dacc67e0c393
> 
> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> Would you like me to apply this for 6.3?

Nah, I don't think there's a good risk vs. reward ratio.  KVM doesn't enable PLE
for L2, never enables PAUSE-exiting, and won't emulate NOP without forced emulation
or TLB shenanigans from L2.  So other than tests, this really shouldn't matter.

Actually, typing that out is making me rethink the stable tag...
  

Patch

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9ae4044f076f..1e560457bf9a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7898,6 +7898,21 @@  static int vmx_check_intercept(struct kvm_vcpu *vcpu,
 		/* FIXME: produce nested vmexit and return X86EMUL_INTERCEPTED.  */
 		break;
 
+	case x86_intercept_pause:
+		/*
+		 * PAUSE is a single-byte NOP with a REPE prefix, i.e. collides
+		 * with vanilla NOPs in the emulator.  Apply the interception
+		 * check only to actual PAUSE instructions.  Don't check
+		 * PAUSE-loop-exiting, software can't expect a given PAUSE to
+		 * exit, i.e. KVM is within its rights to allow L2 to execute
+		 * the PAUSE.
+		 */
+		if ((info->rep_prefix != REPE_PREFIX) ||
+		    !nested_cpu_has2(vmcs12, CPU_BASED_PAUSE_EXITING))
+			return X86EMUL_CONTINUE;
+
+		break;
+
 	/* TODO: check more intercepts... */
 	default:
 		break;