[1/7] KVM: VMX: remove regs argument of __vmx_vcpu_run

Message ID 20221028230723.3254250-2-pbonzini@redhat.com
State New
Headers
Series KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly |

Commit Message

Paolo Bonzini Oct. 28, 2022, 11:07 p.m. UTC
  Registers are reachable through vcpu_vmx, no need to pass
a separate pointer to the regs[] array.

No functional change intended.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/kernel/asm-offsets.c |  1 +
 arch/x86/kvm/vmx/nested.c     |  3 +-
 arch/x86/kvm/vmx/vmenter.S    | 58 +++++++++++++++--------------------
 arch/x86/kvm/vmx/vmx.c        |  3 +-
 arch/x86/kvm/vmx/vmx.h        |  3 +-
 5 files changed, 29 insertions(+), 39 deletions(-)
  

Comments

Sean Christopherson Oct. 31, 2022, 5:37 p.m. UTC | #1
On Fri, Oct 28, 2022, Paolo Bonzini wrote:
> Registers are reachable through vcpu_vmx, no need to pass
> a separate pointer to the regs[] array.
> 
> No functional change intended.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  arch/x86/kernel/asm-offsets.c |  1 +
>  arch/x86/kvm/vmx/nested.c     |  3 +-
>  arch/x86/kvm/vmx/vmenter.S    | 58 +++++++++++++++--------------------
>  arch/x86/kvm/vmx/vmx.c        |  3 +-
>  arch/x86/kvm/vmx/vmx.h        |  3 +-
>  5 files changed, 29 insertions(+), 39 deletions(-)
> 
> diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> index cb50589a7102..90da275ad223 100644
> --- a/arch/x86/kernel/asm-offsets.c
> +++ b/arch/x86/kernel/asm-offsets.c
> @@ -111,6 +111,7 @@ static void __used common(void)
>  
>  	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
>  		BLANK();
> +		OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);

Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
outside of KVM?  We (Google) want to explore loading multiple instances of KVM,
i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
migration between versions of KVM to upgrade/rollback KVM without changing the
kernel (RFC coming soon-ish).  IIRC, asm-offsets is the only place where I haven't
been able to figure out a simple way to avoid exposing KVM's internal structures
outside of KVM (so that the structures can change across KVM instances without
breaking kernel code).
  
Josh Poimboeuf Nov. 1, 2022, 5:32 p.m. UTC | #2
On Mon, Oct 31, 2022 at 05:37:46PM +0000, Sean Christopherson wrote:
> On Fri, Oct 28, 2022, Paolo Bonzini wrote:
> > Registers are reachable through vcpu_vmx, no need to pass
> > a separate pointer to the regs[] array.
> > 
> > No functional change intended.
> > 
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  arch/x86/kernel/asm-offsets.c |  1 +
> >  arch/x86/kvm/vmx/nested.c     |  3 +-
> >  arch/x86/kvm/vmx/vmenter.S    | 58 +++++++++++++++--------------------
> >  arch/x86/kvm/vmx/vmx.c        |  3 +-
> >  arch/x86/kvm/vmx/vmx.h        |  3 +-
> >  5 files changed, 29 insertions(+), 39 deletions(-)
> > 
> > diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> > index cb50589a7102..90da275ad223 100644
> > --- a/arch/x86/kernel/asm-offsets.c
> > +++ b/arch/x86/kernel/asm-offsets.c
> > @@ -111,6 +111,7 @@ static void __used common(void)
> >  
> >  	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
> >  		BLANK();
> > +		OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
> 
> Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
> outside of KVM?  We (Google) want to explore loading multiple instances of KVM,
> i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
> migration between versions of KVM to upgrade/rollback KVM without changing the
> kernel (RFC coming soon-ish).  IIRC, asm-offsets is the only place where I haven't
> been able to figure out a simple way to avoid exposing KVM's internal structures
> outside of KVM (so that the structures can change across KVM instances without
> breaking kernel code).

Is that really a problem?  Would it even make sense for non-KVM kernel
code to use 'vcpu_vmx' anyway?  It already seems to be private.
asm-offsets.c has to "cheat" to get access to it by including
"../kvm/vmx/vmx.h".

So the only concern is exposing the asm offsets, right?  But it seems
highly unlikely any non-KVM code would be using those either.

And, that would be a bug anyway: module code is subject to change and
could always get recompiled.  The kernel proper shouldn't be making any
assumptions about the layouts of module-owned structs.
  
Sean Christopherson Nov. 1, 2022, 6:03 p.m. UTC | #3
On Tue, Nov 01, 2022, Josh Poimboeuf wrote:
> On Mon, Oct 31, 2022 at 05:37:46PM +0000, Sean Christopherson wrote:
> > > diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
> > > index cb50589a7102..90da275ad223 100644
> > > --- a/arch/x86/kernel/asm-offsets.c
> > > +++ b/arch/x86/kernel/asm-offsets.c
> > > @@ -111,6 +111,7 @@ static void __used common(void)
> > >  
> > >  	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
> > >  		BLANK();
> > > +		OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
> > 
> > Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
> > outside of KVM?  We (Google) want to explore loading multiple instances of KVM,
> > i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
> > migration between versions of KVM to upgrade/rollback KVM without changing the
> > kernel (RFC coming soon-ish).  IIRC, asm-offsets is the only place where I haven't
> > been able to figure out a simple way to avoid exposing KVM's internal structures
> > outside of KVM (so that the structures can change across KVM instances without
> > breaking kernel code).
> 
> Is that really a problem?  Would it even make sense for non-KVM kernel
> code to use 'vcpu_vmx' anyway?

vcpu_vmx itself isn't a problem as non-KVM kernel code _shouldn't_ be using
vcpu_vmx, but I want to go beyond "shouldn't" and make it all-but-impossible for
non-KVM code to reference internal KVM structures/state, e.g. I want to bury all
kvm_host.h headers in kvm/ code instead of exposing them in include/asm/.
  
Paolo Bonzini Nov. 2, 2022, 5:42 p.m. UTC | #4
On 10/31/22 18:37, Sean Christopherson wrote:
>>   	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
>>   		BLANK();
>> +		OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
> Is there an asm-offsets-like solution that doesn't require exposing vcpu_vmx
> outside of KVM?  We (Google) want to explore loading multiple instances of KVM,
> i.e. loading multiple versions of kvm.ko at the same time, to allow intra-host
> migration between versions of KVM to upgrade/rollback KVM without changing the
> kernel (RFC coming soon-ish).  IIRC, asm-offsets is the only place where I haven't
> been able to figure out a simple way to avoid exposing KVM's internal structures
> outside of KVM (so that the structures can change across KVM instances without
> breaking kernel code).
> 

It's possible to create our own asm-offsets.h file using something 
similar to the logic in Kbuild:

#####
# Generate asm-offsets.h

offsets-file := include/generated/asm-offsets.h

always-y += $(offsets-file)
targets += arch/$(SRCARCH)/kernel/asm-offsets.s

arch/$(SRCARCH)/kernel/asm-offsets.s: $(timeconst-file) $(bounds-file)

$(offsets-file): arch/$(SRCARCH)/kernel/asm-offsets.s FORCE
         $(call filechk,offsets,__ASM_OFFSETS_H__)

Paolo
  

Patch

diff --git a/arch/x86/kernel/asm-offsets.c b/arch/x86/kernel/asm-offsets.c
index cb50589a7102..90da275ad223 100644
--- a/arch/x86/kernel/asm-offsets.c
+++ b/arch/x86/kernel/asm-offsets.c
@@ -111,6 +111,7 @@  static void __used common(void)
 
 	if (IS_ENABLED(CONFIG_KVM_INTEL)) {
 		BLANK();
+		OFFSET(VMX_vcpu_arch_regs, vcpu_vmx, vcpu.arch.regs);
 		OFFSET(VMX_spec_ctrl, vcpu_vmx, spec_ctrl);
 	}
 }
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 61a2e551640a..3f62bdaffb0b 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3094,8 +3094,7 @@  static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
 		vmx->loaded_vmcs->host_state.cr4 = cr4;
 	}
 
-	vm_fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
-				 __vmx_vcpu_run_flags(vmx));
+	vm_fail = __vmx_vcpu_run(vmx, __vmx_vcpu_run_flags(vmx));
 
 	if (vmx->msr_autoload.host.nr)
 		vmcs_write32(VM_EXIT_MSR_LOAD_COUNT, vmx->msr_autoload.host.nr);
diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index 63b4ad54331b..1362fe5859f9 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -11,24 +11,24 @@ 
 
 #define WORD_SIZE (BITS_PER_LONG / 8)
 
-#define VCPU_RAX	__VCPU_REGS_RAX * WORD_SIZE
-#define VCPU_RCX	__VCPU_REGS_RCX * WORD_SIZE
-#define VCPU_RDX	__VCPU_REGS_RDX * WORD_SIZE
-#define VCPU_RBX	__VCPU_REGS_RBX * WORD_SIZE
+#define VCPU_RAX	(VMX_vcpu_arch_regs + __VCPU_REGS_RAX * WORD_SIZE)
+#define VCPU_RCX	(VMX_vcpu_arch_regs + __VCPU_REGS_RCX * WORD_SIZE)
+#define VCPU_RDX	(VMX_vcpu_arch_regs + __VCPU_REGS_RDX * WORD_SIZE)
+#define VCPU_RBX	(VMX_vcpu_arch_regs + __VCPU_REGS_RBX * WORD_SIZE)
 /* Intentionally omit RSP as it's context switched by hardware */
-#define VCPU_RBP	__VCPU_REGS_RBP * WORD_SIZE
-#define VCPU_RSI	__VCPU_REGS_RSI * WORD_SIZE
-#define VCPU_RDI	__VCPU_REGS_RDI * WORD_SIZE
+#define VCPU_RBP	(VMX_vcpu_arch_regs + __VCPU_REGS_RBP * WORD_SIZE)
+#define VCPU_RSI	(VMX_vcpu_arch_regs + __VCPU_REGS_RSI * WORD_SIZE)
+#define VCPU_RDI	(VMX_vcpu_arch_regs + __VCPU_REGS_RDI * WORD_SIZE)
 
 #ifdef CONFIG_X86_64
-#define VCPU_R8		__VCPU_REGS_R8  * WORD_SIZE
-#define VCPU_R9		__VCPU_REGS_R9  * WORD_SIZE
-#define VCPU_R10	__VCPU_REGS_R10 * WORD_SIZE
-#define VCPU_R11	__VCPU_REGS_R11 * WORD_SIZE
-#define VCPU_R12	__VCPU_REGS_R12 * WORD_SIZE
-#define VCPU_R13	__VCPU_REGS_R13 * WORD_SIZE
-#define VCPU_R14	__VCPU_REGS_R14 * WORD_SIZE
-#define VCPU_R15	__VCPU_REGS_R15 * WORD_SIZE
+#define VCPU_R8		(VMX_vcpu_arch_regs + __VCPU_REGS_R8  * WORD_SIZE)
+#define VCPU_R9		(VMX_vcpu_arch_regs + __VCPU_REGS_R9  * WORD_SIZE)
+#define VCPU_R10	(VMX_vcpu_arch_regs + __VCPU_REGS_R10 * WORD_SIZE)
+#define VCPU_R11	(VMX_vcpu_arch_regs + __VCPU_REGS_R11 * WORD_SIZE)
+#define VCPU_R12	(VMX_vcpu_arch_regs + __VCPU_REGS_R12 * WORD_SIZE)
+#define VCPU_R13	(VMX_vcpu_arch_regs + __VCPU_REGS_R13 * WORD_SIZE)
+#define VCPU_R14	(VMX_vcpu_arch_regs + __VCPU_REGS_R14 * WORD_SIZE)
+#define VCPU_R15	(VMX_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
 #endif
 
 .section .noinstr.text, "ax"
@@ -36,7 +36,6 @@ 
 /**
  * __vmx_vcpu_run - Run a vCPU via a transition to VMX guest mode
  * @vmx:	struct vcpu_vmx *
- * @regs:	unsigned long * (to guest registers)
  * @flags:	VMX_RUN_VMRESUME:	use VMRESUME instead of VMLAUNCH
  *		VMX_RUN_SAVE_SPEC_CTRL: save guest SPEC_CTRL into vmx->spec_ctrl
  *
@@ -61,22 +60,19 @@  SYM_FUNC_START(__vmx_vcpu_run)
 	push %_ASM_ARG1
 
 	/* Save @flags for SPEC_CTRL handling */
-	push %_ASM_ARG3
-
-	/*
-	 * Save @regs, _ASM_ARG2 may be modified by vmx_update_host_rsp() and
-	 * @regs is needed after VM-Exit to save the guest's register values.
-	 */
 	push %_ASM_ARG2
 
-	/* Copy @flags to BL, _ASM_ARG3 is volatile. */
-	mov %_ASM_ARG3B, %bl
+	/* Copy @flags to BL, _ASM_ARG2 is volatile. */
+	mov %_ASM_ARG2B, %bl
 
 	lea (%_ASM_SP), %_ASM_ARG2
 	call vmx_update_host_rsp
 
 	ALTERNATIVE "jmp .Lspec_ctrl_done", "", X86_FEATURE_MSR_SPEC_CTRL
 
+	/* Reload @vmx, _ASM_ARG1 may be modified by vmx_update_host_rsp().  */
+	mov WORD_SIZE(%_ASM_SP), %_ASM_DI
+
 	/*
 	 * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
 	 * host's, write the MSR.
@@ -85,7 +81,6 @@  SYM_FUNC_START(__vmx_vcpu_run)
 	 * there must not be any returns or indirect branches between this code
 	 * and vmentry.
 	 */
-	mov 2*WORD_SIZE(%_ASM_SP), %_ASM_DI
 	movl VMX_spec_ctrl(%_ASM_DI), %edi
 	movl PER_CPU_VAR(x86_spec_ctrl_current), %esi
 	cmp %edi, %esi
@@ -102,8 +97,8 @@  SYM_FUNC_START(__vmx_vcpu_run)
 	 * an LFENCE to stop speculation from skipping the wrmsr.
 	 */
 
-	/* Load @regs to RAX. */
-	mov (%_ASM_SP), %_ASM_AX
+	/* Load @vmx to RAX. */
+	mov WORD_SIZE(%_ASM_SP), %_ASM_AX
 
 	/* Check if vmlaunch or vmresume is needed */
 	testb $VMX_RUN_VMRESUME, %bl
@@ -125,7 +120,7 @@  SYM_FUNC_START(__vmx_vcpu_run)
 	mov VCPU_R14(%_ASM_AX), %r14
 	mov VCPU_R15(%_ASM_AX), %r15
 #endif
-	/* Load guest RAX.  This kills the @regs pointer! */
+	/* Load guest RAX.  This kills the @vmx pointer! */
 	mov VCPU_RAX(%_ASM_AX), %_ASM_AX
 
 	/* Check EFLAGS.ZF from 'testb' above */
@@ -163,8 +158,8 @@  SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
 	/* Temporarily save guest's RAX. */
 	push %_ASM_AX
 
-	/* Reload @regs to RAX. */
-	mov WORD_SIZE(%_ASM_SP), %_ASM_AX
+	/* Reload @vmx to RAX. */
+	mov 2*WORD_SIZE(%_ASM_SP), %_ASM_AX
 
 	/* Save all guest registers, including RAX from the stack */
 	pop           VCPU_RAX(%_ASM_AX)
@@ -189,9 +184,6 @@  SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
 	xor %ebx, %ebx
 
 .Lclear_regs:
-	/* Discard @regs.  The register is irrelevant, it just can't be RBX. */
-	pop %_ASM_AX
-
 	/*
 	 * Clear all general purpose registers except RSP and RBX to prevent
 	 * speculative use of the guest's values, even those that are reloaded
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 05a747c9a9ff..42cda7a5c009 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7084,8 +7084,7 @@  static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	if (vcpu->arch.cr2 != native_read_cr2())
 		native_write_cr2(vcpu->arch.cr2);
 
-	vmx->fail = __vmx_vcpu_run(vmx, (unsigned long *)&vcpu->arch.regs,
-				   flags);
+	vmx->fail = __vmx_vcpu_run(vmx, flags);
 
 	vcpu->arch.cr2 = native_read_cr2();
 
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index a3da84f4ea45..d90cdbea0e4c 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -422,8 +422,7 @@  void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu);
 void vmx_update_host_rsp(struct vcpu_vmx *vmx, unsigned long host_rsp);
 void vmx_spec_ctrl_restore_host(struct vcpu_vmx *vmx, unsigned int flags);
 unsigned int __vmx_vcpu_run_flags(struct vcpu_vmx *vmx);
-bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned long *regs,
-		    unsigned int flags);
+bool __vmx_vcpu_run(struct vcpu_vmx *vmx, unsigned int flags);
 int vmx_find_loadstore_msr_slot(struct vmx_msrs *m, u32 msr);
 void vmx_ept_load_pdptrs(struct kvm_vcpu *vcpu);