From patchwork Wed Nov 8 18:29:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Li, Xin3"
X-Patchwork-Id: 163139
From: Xin Li
To: kvm@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org,
 linux-kselftest@vger.kernel.org
Cc: seanjc@google.com, pbonzini@redhat.com, corbet@lwn.net,
 kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
 decui@microsoft.com, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
 vkuznets@redhat.com, peterz@infradead.org, ravi.v.shankar@intel.com
Subject: [PATCH v1 12/23] KVM: VMX: Handle FRED event data
Date: Wed, 8 Nov 2023 10:29:52 -0800
Message-ID: <20231108183003.5981-13-xin3.li@intel.com>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20231108183003.5981-1-xin3.li@intel.com>
References: <20231108183003.5981-1-xin3.li@intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Set injected-event data when injecting a #PF, a #DB, or a #NM caused by
extended feature disable (XFD) using FRED event delivery, and save
original-event data for use as injected-event data.

Unlike IDT event delivery, which uses extra CPU registers as part of an
event context, e.g., %cr2 for #PF, FRED saves a complete event context
in its stack frame, e.g., the faulting linear address of a #PF is saved
in the event data field defined in the FRED stack frame.

Thus a new VMX control field called injected-event data is added to
provide the event data that will be pushed into a FRED stack frame for
VM entries that inject an event using FRED event delivery. In addition,
a new VM exit information field called original-event data is added to
store the event data that would have been saved into a FRED stack frame
for VM exits that occur during FRED event delivery. After such a VM exit
is handled to allow the original event to be delivered, the data in the
original-event data VMCS field needs to be copied into the injected-event
data VMCS field for the injection of the original event.

Tested-by: Shan Kang
Signed-off-by: Xin Li
---
 arch/x86/include/asm/vmx.h |  4 ++
 arch/x86/kvm/vmx/vmx.c     | 84 +++++++++++++++++++++++++++++++++++---
 arch/x86/kvm/x86.c         | 10 ++++-
 3 files changed, 91 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index d54a1a1057b0..97729248e844 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -253,8 +253,12 @@ enum vmcs_field {
 	PID_POINTER_TABLE_HIGH          = 0x00002043,
 	SECONDARY_VM_EXIT_CONTROLS      = 0x00002044,
 	SECONDARY_VM_EXIT_CONTROLS_HIGH = 0x00002045,
+	INJECTED_EVENT_DATA             = 0x00002052,
+	INJECTED_EVENT_DATA_HIGH        = 0x00002053,
 	GUEST_PHYSICAL_ADDRESS          = 0x00002400,
 	GUEST_PHYSICAL_ADDRESS_HIGH     = 0x00002401,
+	ORIGINAL_EVENT_DATA             = 0x00002404,
+	ORIGINAL_EVENT_DATA_HIGH        = 0x00002405,
 	VMCS_LINK_POINTER               = 0x00002800,
 	VMCS_LINK_POINTER_HIGH          = 0x00002801,
 	GUEST_IA32_DEBUGCTL             = 0x00002802,
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 58d01e845804..67fd4a56d031 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1880,9 +1880,30 @@ static void vmx_inject_exception(struct kvm_vcpu *vcpu)
 		vmcs_write32(VM_ENTRY_INSTRUCTION_LEN,
 			     vmx->vcpu.arch.event_exit_inst_len);
 		intr_info |= INTR_TYPE_SOFT_EXCEPTION;
-	} else
+	} else {
 		intr_info |= INTR_TYPE_HARD_EXCEPTION;

+		if (kvm_is_fred_enabled(vcpu)) {
+			u64 event_data = 0;
+
+			if (is_debug(intr_info))
+				/*
+				 * Compared to DR6, FRED #DB event data saved on
+				 * the stack frame have bits 4 ~ 11 and 16 ~ 31
+				 * inverted, i.e.,
+				 *   fred_db_event_data = dr6 ^ 0xFFFF0FF0UL
+				 */
+				event_data = vcpu->arch.dr6 ^ DR6_RESERVED;
+			else if (is_page_fault(intr_info))
+				event_data = vcpu->arch.cr2;
+			else if (is_nm_fault(intr_info) &&
+				 vcpu->arch.guest_fpu.fpstate->xfd)
+				event_data = vcpu->arch.guest_fpu.xfd_err;
+
+			vmcs_write64(INJECTED_EVENT_DATA, event_data);
+		}
+	}
+
 	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);

 	vmx_clear_hlt(vcpu);
@@ -7226,7 +7247,8 @@ static void vmx_recover_nmi_blocking(struct vcpu_vmx *vmx)
 static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,
 				      u32 idt_vectoring_info,
 				      int instr_len_field,
-				      int error_code_field)
+				      int error_code_field,
+				      int event_data_field)
 {
 	u8 vector;
 	int type;
@@ -7260,6 +7282,37 @@ static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,
 		vcpu->arch.event_exit_inst_len = vmcs_read32(instr_len_field);
 		fallthrough;
 	case INTR_TYPE_HARD_EXCEPTION:
+		if (kvm_is_fred_enabled(vcpu) && event_data_field) {
+			/*
+			 * Save original-event data for use as injected-event data.
+			 */
+			u64 event_data = vmcs_read64(event_data_field);
+
+			switch (vector) {
+			case DB_VECTOR:
+				get_debugreg(vcpu->arch.dr6, 6);
+				WARN_ON(vcpu->arch.dr6 != (event_data ^ DR6_RESERVED));
+				vcpu->arch.dr6 = event_data ^ DR6_RESERVED;
+				break;
+			case NM_VECTOR:
+				if (vcpu->arch.guest_fpu.fpstate->xfd) {
+					rdmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err);
+					WARN_ON(vcpu->arch.guest_fpu.xfd_err != event_data);
+					vcpu->arch.guest_fpu.xfd_err = event_data;
+				} else {
+					WARN_ON(event_data != 0);
+				}
+				break;
+			case PF_VECTOR:
+				WARN_ON(vcpu->arch.cr2 != event_data);
+				vcpu->arch.cr2 = event_data;
+				break;
+			default:
+				WARN_ON(event_data != 0);
+				break;
+			}
+		}
+
 		if (idt_vectoring_info & VECTORING_INFO_DELIVER_CODE_MASK) {
 			u32 err = vmcs_read32(error_code_field);
 			kvm_requeue_exception_e(vcpu, vector, err);
@@ -7279,9 +7332,11 @@ static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,

 static void vmx_complete_interrupts(struct vcpu_vmx *vmx)
 {
-	__vmx_complete_interrupts(&vmx->vcpu, vmx->idt_vectoring_info,
+	__vmx_complete_interrupts(&vmx->vcpu,
+				  vmx->idt_vectoring_info,
 				  VM_EXIT_INSTRUCTION_LEN,
-				  IDT_VECTORING_ERROR_CODE);
+				  IDT_VECTORING_ERROR_CODE,
+				  ORIGINAL_EVENT_DATA);
 }

 static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
@@ -7289,7 +7344,8 @@ static void vmx_cancel_injection(struct kvm_vcpu *vcpu)
 	__vmx_complete_interrupts(vcpu,
 				  vmcs_read32(VM_ENTRY_INTR_INFO_FIELD),
 				  VM_ENTRY_INSTRUCTION_LEN,
-				  VM_ENTRY_EXCEPTION_ERROR_CODE);
+				  VM_ENTRY_EXCEPTION_ERROR_CODE,
+				  0);

 	vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, 0);
 }
@@ -7406,6 +7462,24 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,

 	vmx_disable_fb_clear(vmx);

+	/*
+	 * %cr2 needs to be saved after a VM exit and restored before a VM
+	 * entry in case a VM exit happens immediately after delivery of a
+	 * guest #PF but before the guest reads %cr2.
+	 *
+	 * A FRED guest should read its #PF faulting linear address from
+	 * the event data field in its FRED stack frame instead of %cr2.
+	 * But the FRED 5.0 spec still requires a FRED CPU to update %cr2
+	 * in the normal way, thus %cr2 is still updated even for a FRED
+	 * guest.
+	 *
+	 * Note, an NMI could interrupt KVM:
+	 *   1) after VM exit but before CR2 is saved.
+	 *   2) after CR2 is restored but before VM entry.
+	 * And a #PF could happen during NMI handling, which overwrites %cr2.
+	 * Thus exc_nmi() should save and restore %cr2 upon entering and
+	 * before leaving to make sure %cr2 is not corrupted.
+	 */
 	if (vcpu->arch.cr2 != native_read_cr2())
 		native_write_cr2(vcpu->arch.cr2);

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index c5a55810647f..d190bfc63fc4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -680,8 +680,14 @@ static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
 			vcpu->arch.exception.injected = true;
 			if (WARN_ON_ONCE(has_payload)) {
 				/*
-				 * A reinjected event has already
-				 * delivered its payload.
+				 * For a reinjected event, KVM delivers its
+				 * payload through:
+				 *   #PF: save %cr2 into arch.cr2 immediately
+				 *        after VM exits.
+				 *   #DB: save %dr6 into arch.dr6 later in
+				 *        sync_dirty_debug_regs().
+				 *
+				 * For a FRED guest, see __vmx_complete_interrupts().
 				 */
 				has_payload = false;
 				payload = 0;
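
A note on the #DB conversion used in the vmx.c hunks: the patch comment
gives fred_db_event_data = dr6 ^ 0xFFFF0FF0UL. The standalone sketch below
is not part of the patch. The helper names and the DR6 sample values are
hypothetical, and the mask is simply copied from that comment. It only
illustrates that the same XOR converts in both directions, which would
explain why the injection path and __vmx_complete_interrupts() can use the
same expression.

```c
#include <assert.h>
#include <stdint.h>

/* Mask quoted in the patch comment (bits 4 ~ 11 and 16 ~ 31). */
#define FRED_DB_INVERT_MASK 0xFFFF0FF0ULL

/* Hypothetical helper: DR6 value -> FRED #DB event data. */
static inline uint64_t dr6_to_fred_event_data(uint64_t dr6)
{
	return dr6 ^ FRED_DB_INVERT_MASK;
}

/* Hypothetical helper: FRED #DB event data -> DR6 value. */
static inline uint64_t fred_event_data_to_dr6(uint64_t event_data)
{
	return event_data ^ FRED_DB_INVERT_MASK;
}

/* Illustrative self-check using made-up sample values. */
static inline void fred_db_self_check(void)
{
	/* All masked bits set plus bit 0: only bit 0 survives the XOR. */
	uint64_t dr6 = 0xFFFF0FF1ULL;
	uint64_t event_data = dr6_to_fred_event_data(dr6);

	assert(event_data == 0x1ULL);

	/* XOR with a constant is its own inverse, so this round-trips. */
	assert(fred_event_data_to_dr6(event_data) == dr6);
}
```

Because XOR with a fixed mask is an involution, a single expression is
enough for both the VM-entry and the VM-exit side.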