[v5,33/34] KVM: x86/vmx: call external_interrupt() to handle IRQ in IRQ caused VM exits

Message ID 20230307023946.14516-34-xin3.li@intel.com
State New
Series x86: enable FRED for x86-64

Commit Message

Li, Xin3 March 7, 2023, 2:39 a.m. UTC
When FRED is enabled, the IDT is no longer used for event delivery, so call
external_interrupt() to handle IRQs in IRQ-caused VM exits.

Create an event return stack frame with the host context immediately after
a VM exit for calling external_interrupt(). All other fields of the pt_regs
structure are cleared to 0. Refer to the discussion about the register values
in the pt_regs structure at:

  https://lore.kernel.org/kvm/ef2c54f7-14b9-dcbb-c3c4-1533455e7a18@redhat.com/

Tested-by: Shan Kang <shan.kang@intel.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---

Changes since v4:
*) Do NOT use the term "injection", which in the KVM context means to
   reinject an event into the guest (Sean Christopherson).
*) Use cs/ss instead of csx/ssx when initializing the pt_regs structure
   for calling external_interrupt(), otherwise it breaks i386 build.
---
 arch/x86/kvm/vmx/vmx.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)
  

Comments

Sean Christopherson March 22, 2023, 5:57 p.m. UTC | #1
On Mon, Mar 06, 2023, Xin Li wrote:
> @@ -6923,7 +6924,26 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
>  		return;
>  
>  	kvm_before_interrupt(vcpu, KVM_HANDLING_IRQ);
> -	vmx_do_interrupt_irqoff(gate_offset(desc));
> +	if (cpu_feature_enabled(X86_FEATURE_FRED)) {
> +		struct vcpu_vmx *vmx = to_vmx(vcpu);
> +		struct pt_regs regs = {};
> +
> +		/*
> +		 * Create an event return stack frame with the
> +		 * host context immediately after a VM exit.

Why snapshot the context immediately after VM-Exit?  It diverges from what is
done in the non-FRED path, and it seems quite misleading and maybe even dangerous.
The RSP and RIP values are long since gone, e.g. if something explodes, the stack
trace will be outright wrong.

> +		 *
> +		 * All other fields of the pt_regs structure are
> +		 * cleared to 0.
> +		 */
> +		regs.ss		= __KERNEL_DS;
> +		regs.sp		= vmx->loaded_vmcs->host_state.rsp;
> +		regs.flags	= X86_EFLAGS_FIXED;
> +		regs.cs		= __KERNEL_CS;
> +		regs.ip		= (unsigned long)vmx_vmexit;
> +
> +		external_interrupt(&regs, vector);

I assume FRED still uses the stack, so why not do something similar to
vmx_do_interrupt_irqoff() and build @regs after an explicit CALL?  Might even
be possible to share some/all of VMX_DO_EVENT_IRQOFF.

> +	} else

Curly braces needed since the first half has 'em.

> +		vmx_do_interrupt_irqoff(gate_offset(desc));
>  	kvm_after_interrupt(vcpu);
>  
>  	vcpu->arch.at_instruction_boundary = true;
> -- 
> 2.34.1
>
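
The concern raised in the review (a frame filled in from values saved at VM-Exit no longer describes the live stack) can be illustrated with a user-space analogy. This is only a sketch under stated assumptions: struct mock_frame and mock_capture_after_call() are hypothetical names, the GCC/Clang builtins stand in for what an explicit CALL would leave on the stack, and none of this is the kernel's or the author's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-field frame; the real pt_regs has many more fields. */
struct mock_frame {
	uintptr_t ip;	/* where execution resumes after the call */
	uintptr_t sp;	/* a live stack location inside the callee */
};

/*
 * Record the frame *after* a real call has happened, so both values
 * describe the stack as it is now rather than as it was at some
 * earlier snapshot. noinline keeps the call from being optimized away.
 */
__attribute__((noinline))
static void mock_capture_after_call(struct mock_frame *out)
{
	assert(out != NULL);
	out->ip = (uintptr_t)__builtin_return_address(0);
	out->sp = (uintptr_t)__builtin_frame_address(0);
}
```

A caller would declare a struct mock_frame, pass its address to mock_capture_after_call(), and get back values that a stack unwinder could actually follow. That is the property the review asks the FRED path to keep.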
  

Patch

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index bcac3efcde41..3ebeaab34b2e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -47,6 +47,7 @@ 
 #include <asm/mshyperv.h>
 #include <asm/mwait.h>
 #include <asm/spec-ctrl.h>
+#include <asm/traps.h>
 #include <asm/virtext.h>
 #include <asm/vmx.h>
 
@@ -6923,7 +6924,26 @@  static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
 		return;
 
 	kvm_before_interrupt(vcpu, KVM_HANDLING_IRQ);
-	vmx_do_interrupt_irqoff(gate_offset(desc));
+	if (cpu_feature_enabled(X86_FEATURE_FRED)) {
+		struct vcpu_vmx *vmx = to_vmx(vcpu);
+		struct pt_regs regs = {};
+
+		/*
+		 * Create an event return stack frame with the
+		 * host context immediately after a VM exit.
+		 *
+		 * All other fields of the pt_regs structure are
+		 * cleared to 0.
+		 */
+		regs.ss		= __KERNEL_DS;
+		regs.sp		= vmx->loaded_vmcs->host_state.rsp;
+		regs.flags	= X86_EFLAGS_FIXED;
+		regs.cs		= __KERNEL_CS;
+		regs.ip		= (unsigned long)vmx_vmexit;
+
+		external_interrupt(&regs, vector);
+	} else
+		vmx_do_interrupt_irqoff(gate_offset(desc));
 	kvm_after_interrupt(vcpu);
 
 	vcpu->arch.at_instruction_boundary = true;
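
The frame construction in the hunk above can be mimicked in user space to see what "all other fields are cleared to 0" means in practice. The following is a minimal sketch, not kernel code: struct mock_pt_regs, the MOCK_* constants, and both helper functions are hypothetical stand-ins, and the selector and flag values are placeholders chosen for illustration. It uses `= {0}` rather than the patch's `= {}` because the empty initializer is a GNU extension before C23.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder values, assumed for illustration only. */
#define MOCK_KERNEL_CS		0x10UL
#define MOCK_KERNEL_DS		0x18UL
#define MOCK_EFLAGS_FIXED	0x2UL

/* Hypothetical stand-in for pt_regs: general registers plus the return frame. */
struct mock_pt_regs {
	unsigned long gprs[16];
	unsigned long ip, cs, flags, sp, ss;
};

/* Fill only the five return-frame fields; everything else stays zero. */
static struct mock_pt_regs mock_build_frame(unsigned long host_rsp,
					    unsigned long resume_ip)
{
	struct mock_pt_regs regs = {0};

	regs.ss    = MOCK_KERNEL_DS;
	regs.sp    = host_rsp;
	regs.flags = MOCK_EFLAGS_FIXED;
	regs.cs    = MOCK_KERNEL_CS;
	regs.ip    = resume_ip;
	return regs;
}

/* Return 1 if every general-purpose slot is still zero, 0 otherwise. */
static int mock_gprs_all_zero(const struct mock_pt_regs *regs)
{
	size_t i;

	assert(regs != NULL);
	for (i = 0; i < sizeof(regs->gprs) / sizeof(regs->gprs[0]); i++)
		if (regs->gprs[i] != 0)
			return 0;
	return 1;
}
```

Calling mock_build_frame() with any stack pointer and resume address yields a structure whose general-purpose slots are all zero, which is why a handler receiving such a frame cannot recover the interrupted register state from it.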