From patchwork Tue Apr 4 10:27:16 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Xin3"
X-Patchwork-Id: 79023
From: Xin Li
To: linux-kernel@vger.kernel.org, x86@kernel.org, kvm@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org,
 andrew.cooper3@citrix.com, seanjc@google.com, pbonzini@redhat.com,
 ravi.v.shankar@intel.com, jiangshanlai@gmail.com, shan.kang@intel.com
Subject: [PATCH v7 33/33] KVM: x86/vmx: refactor VMX_DO_EVENT_IRQOFF to
 generate FRED stack frames
Date: Tue, 4 Apr 2023 03:27:16 -0700
Message-Id: <20230404102716.1795-34-xin3.li@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230404102716.1795-1-xin3.li@intel.com>
References: <20230404102716.1795-1-xin3.li@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Compared to an IDT stack frame, a FRED stack frame has an extra 16 bytes
of information pushed at the regular stack top and an 8-byte error code
_always_ pushed at the regular stack bottom. VMX_DO_EVENT_IRQOFF can
therefore be refactored to generate FRED stack frames with the event type
and vector properly set.
Thus, IRQ/NMI can be handled with the existing approach when FRED is
enabled.

As a FRED stack frame always contains an error code pushed by hardware,
call a trampoline function first to have the return instruction address
pushed on the regular stack. Then the trampoline function pushes an error
code (0 for both IRQ and NMI) and jumps to fred_entrypoint_kernel() for
NMI handling or calls external_interrupt() for IRQ handling.

The trampoline function for IRQ handling pushes general purpose registers
to form a pt_regs structure and then uses it to call external_interrupt().
As a result, IRQ handling does not execute any noinstr code.

Export fred_entrypoint_kernel() and external_interrupt() for the above
changes.

Tested-by: Shan Kang
Signed-off-by: Xin Li
---

Changes since v6:
* Export fred_entrypoint_kernel(), required when kvm-intel is built as a
  module.
* Reserve a REDZONE for CALL emulation and align RSP to a 64-byte boundary
  before pushing a new FRED stack frame.
---
 arch/x86/entry/entry_64_fred.S        |  1 +
 arch/x86/include/asm/asm-prototypes.h |  1 +
 arch/x86/include/asm/fred.h           |  1 +
 arch/x86/include/asm/traps.h          |  2 +
 arch/x86/kernel/traps.c               |  5 ++
 arch/x86/kvm/vmx/vmenter.S            | 74 ++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/vmx.c                | 16 +++++-
 7 files changed, 96 insertions(+), 4 deletions(-)

diff --git a/arch/x86/entry/entry_64_fred.S b/arch/x86/entry/entry_64_fred.S
index efe2bcd11273..de74ab97ff00 100644
--- a/arch/x86/entry/entry_64_fred.S
+++ b/arch/x86/entry/entry_64_fred.S
@@ -59,3 +59,4 @@ SYM_CODE_START_NOALIGN(fred_entrypoint_kernel)
 	FRED_EXIT
 	ERETS
 SYM_CODE_END(fred_entrypoint_kernel)
+EXPORT_SYMBOL(fred_entrypoint_kernel)
diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h
index b1a98fa38828..076bf8dee702 100644
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #ifndef CONFIG_X86_CMPXCHG64
diff --git a/arch/x86/include/asm/fred.h b/arch/x86/include/asm/fred.h
index f7caf3b2f3f7..d00b9cab6aa6 100644
--- a/arch/x86/include/asm/fred.h
+++ b/arch/x86/include/asm/fred.h
@@ -129,6 +129,7 @@ DECLARE_FRED_HANDLER(fred_exc_machine_check);
  * The actual assembly entry and exit points
  */
 extern __visible void fred_entrypoint_user(void);
+extern __visible void fred_entrypoint_kernel(void);
 
 /*
  * Initialization
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 612b3d6fec53..017b95624325 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -58,4 +58,6 @@ typedef DECLARE_SYSTEM_INTERRUPT_HANDLER((*system_interrupt_handler));
 
 system_interrupt_handler get_system_interrupt_handler(unsigned int i);
 
+int external_interrupt(struct pt_regs *regs);
+
 #endif /* _ASM_X86_TRAPS_H */
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 73471053ed02..0f1fcd53cb52 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -1573,6 +1573,11 @@ int external_interrupt(struct pt_regs *regs)
 	return 0;
 }
 
+#if IS_ENABLED(CONFIG_KVM_INTEL)
+/* For KVM VMX to handle IRQs in IRQ induced VM exits. */
+EXPORT_SYMBOL_GPL(external_interrupt);
+#endif
+
 #endif /* CONFIG_X86_64 */
 
 void __init install_system_interrupt_handler(unsigned int n, const void *asm_addr, const void *addr)
diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S
index 631fd7da2bc3..f64b05b3d775 100644
--- a/arch/x86/kvm/vmx/vmenter.S
+++ b/arch/x86/kvm/vmx/vmenter.S
@@ -2,12 +2,14 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
 #include
 #include "kvm-asm-offsets.h"
 #include "run_flags.h"
+#include "../../entry/calling.h"
 
 #define WORD_SIZE (BITS_PER_LONG / 8)
 
@@ -31,7 +33,7 @@
 #define VCPU_R15	__VCPU_REGS_R15 * WORD_SIZE
 #endif
 
-.macro VMX_DO_EVENT_IRQOFF call_insn call_target
+.macro VMX_DO_EVENT_IRQOFF call_insn call_target fred=0 nmi=0
 	/*
 	 * Unconditionally create a stack frame, getting the correct RSP on the
 	 * stack (for x86-64) would take two instructions anyways, and RBP can
@@ -41,16 +43,56 @@
 	mov %_ASM_SP, %_ASM_BP
 
 #ifdef CONFIG_X86_64
+	.if \fred
+#ifdef CONFIG_X86_FRED
+	/*
+	 * It's not necessary to change current stack level for handling IRQ/NMI
+	 * because the state of the kernel stack is well defined in this place
+	 * in the code, and it is known not to be deep in a bunch of nested I/O
+	 * layer handlers that eat up the stack.
+	 */
+
+	/* Reserve a REDZONE for CALL emulation. */
+	sub $(FRED_CONFIG_REDZONE_AMOUNT << 6), %rsp
+
+	/* Align RSP to a 64-byte boundary before pushing a new stack frame */
+	and $FRED_STACK_FRAME_RSP_MASK, %rsp
+
+	push $0		/* Reserved by FRED, must be 0 */
+	push $0		/* FRED event data, 0 for NMI and external interrupts */
+#endif
+	.else
 	/*
 	 * Align RSP to a 16-byte boundary (to emulate CPU behavior) before
 	 * creating the synthetic interrupt stack frame for the IRQ/NMI.
 	 */
 	and $-16, %rsp
+	.endif
+
+	.if \fred
+	.if \nmi
+	mov $(2 << 32 | 2 << 48), %_ASM_AX	/* NMI event type and vector */
+	.else
+	mov %_ASM_ARG1, %_ASM_AX
+	shl $32, %_ASM_AX			/* external interrupt vector */
+	.endif
+	add $__KERNEL_DS, %_ASM_AX
+	bts $57, %_ASM_AX			/* bit 57: 64-bit mode */
+	push %_ASM_AX
+	.else
 	push $__KERNEL_DS
+	.endif
+
 	push %rbp
 #endif
 	pushf
+	.if \nmi
+	mov $__KERNEL_CS, %_ASM_AX
+	bts $28, %_ASM_AX			/* set the NMI bit */
+	push %_ASM_AX
+	.else
 	push $__KERNEL_CS
+	.endif
 	\call_insn \call_target
 
 	/*
@@ -300,9 +342,19 @@ SYM_INNER_LABEL(vmx_vmexit, SYM_L_GLOBAL)
 SYM_FUNC_END(__vmx_vcpu_run)
 
 SYM_FUNC_START(vmx_do_nmi_irqoff)
-	VMX_DO_EVENT_IRQOFF call asm_exc_nmi_kvm_vmx
+	VMX_DO_EVENT_IRQOFF call asm_exc_nmi_kvm_vmx nmi=1
 SYM_FUNC_END(vmx_do_nmi_irqoff)
 
+#ifdef CONFIG_X86_FRED
+SYM_FUNC_START(vmx_do_fred_nmi_trampoline)
+	push $0		/* FRED error code, 0 for NMI */
+	jmp fred_entrypoint_kernel
+SYM_FUNC_END(vmx_do_fred_nmi_trampoline)
+
+SYM_FUNC_START(vmx_do_fred_nmi_irqoff)
+	VMX_DO_EVENT_IRQOFF call vmx_do_fred_nmi_trampoline fred=1 nmi=1
+SYM_FUNC_END(vmx_do_fred_nmi_irqoff)
+#endif
 
 .section .text, "ax"
 
@@ -360,3 +412,21 @@ SYM_FUNC_END(vmread_error_trampoline)
 SYM_FUNC_START(vmx_do_interrupt_irqoff)
 	VMX_DO_EVENT_IRQOFF CALL_NOSPEC _ASM_ARG1
 SYM_FUNC_END(vmx_do_interrupt_irqoff)
+
+#ifdef CONFIG_X86_FRED
+SYM_FUNC_START(vmx_do_fred_interrupt_trampoline)
+	push $0		/* FRED error code, 0 for NMI and external interrupts */
+	PUSH_REGS
+
+	movq %rsp, %rdi	/* %rdi -> pt_regs */
+	call external_interrupt
+
+	POP_REGS
+	addq $8,%rsp	/* Drop FRED error code */
+	RET
+SYM_FUNC_END(vmx_do_fred_interrupt_trampoline)
+
+SYM_FUNC_START(vmx_do_fred_interrupt_irqoff)
+	VMX_DO_EVENT_IRQOFF call vmx_do_fred_interrupt_trampoline fred=1
+SYM_FUNC_END(vmx_do_fred_interrupt_irqoff)
+#endif
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d2d6e1b6c788..6dfe692dfd6a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6875,7 +6875,9 @@ static void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu)
 }
 
 void vmx_do_interrupt_irqoff(unsigned long entry);
+void vmx_do_fred_interrupt_irqoff(unsigned int vector);
 void vmx_do_nmi_irqoff(void);
+void vmx_do_fred_nmi_irqoff(void);
 
 static void handle_nm_fault_irqoff(struct kvm_vcpu *vcpu)
 {
@@ -6923,7 +6925,12 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
 		return;
 
 	kvm_before_interrupt(vcpu, KVM_HANDLING_IRQ);
-	vmx_do_interrupt_irqoff(gate_offset(desc));
+#ifdef CONFIG_X86_64
+	if (cpu_feature_enabled(X86_FEATURE_FRED))
+		vmx_do_fred_interrupt_irqoff(vector);
+	else
+#endif
+		vmx_do_interrupt_irqoff(gate_offset(desc));
 	kvm_after_interrupt(vcpu);
 
 	vcpu->arch.at_instruction_boundary = true;
@@ -7209,7 +7216,12 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
 	if ((u16)vmx->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI &&
 	    is_nmi(vmx_get_intr_info(vcpu))) {
 		kvm_before_interrupt(vcpu, KVM_HANDLING_NMI);
-		vmx_do_nmi_irqoff();
+#ifdef CONFIG_X86_64
+		if (cpu_feature_enabled(X86_FEATURE_FRED))
+			vmx_do_fred_nmi_irqoff();
+		else
+#endif
+			vmx_do_nmi_irqoff();
 		kvm_after_interrupt(vcpu);
 	}
 
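
Reviewer note (illustration only, not part of the patch): the 64-bit
values that VMX_DO_EVENT_IRQOFF pushes into the SS and CS slots of the
synthetic FRED frame can be sketched in C to make the bit arithmetic
easier to check. The helper names fred_ss_slot() and fred_cs_slot() are
hypothetical, and the selector values 0x10 and 0x18 are assumed stand-ins
for __KERNEL_CS and __KERNEL_DS. The bit positions (vector at bit 32,
event type at bit 48, 64-bit mode at bit 57, NMI at bit 28 of the CS
slot) are taken from the macro above, not from any other reference.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed selector values, hardcoded here for illustration only. */
#define SKETCH_KERNEL_CS 0x10ULL
#define SKETCH_KERNEL_DS 0x18ULL

/* Event types as used by the macro: 0 = external interrupt, 2 = NMI. */
#define SKETCH_EVENT_TYPE_EXTINT 0ULL
#define SKETCH_EVENT_TYPE_NMI    2ULL

/*
 * Build the value pushed in the SS slot, mirroring the macro:
 *   mov $(vector << 32 | type << 48), %rax
 *   add $__KERNEL_DS, %rax
 *   bts $57, %rax
 */
static uint64_t fred_ss_slot(uint64_t type, uint64_t vector)
{
	uint64_t v;

	assert(type <= 0xf);	/* the sketch treats type as a 4-bit field */
	assert(vector <= 0xff);	/* x86 vectors fit in 8 bits */

	v = (vector << 32) | (type << 48);
	v += SKETCH_KERNEL_DS;	/* low bits carry the SS selector */
	v |= 1ULL << 57;	/* bit 57: 64-bit mode */
	return v;
}

/*
 * Build the value pushed in the CS slot, mirroring the macro:
 *   mov $__KERNEL_CS, %rax ; bts $28, %rax   (NMI case only)
 */
static uint64_t fred_cs_slot(int nmi)
{
	uint64_t v = SKETCH_KERNEL_CS;

	if (nmi)
		v |= 1ULL << 28;	/* the NMI bit set by the macro */
	return v;
}
```

Under these assumptions, fred_ss_slot(SKETCH_EVENT_TYPE_NMI, 2)
corresponds to the nmi=1 path, and fred_ss_slot(SKETCH_EVENT_TYPE_EXTINT,
vector) corresponds to the path that shifts the vector passed in
_ASM_ARG1.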