From patchwork Sun Oct 30 06:22:02 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12828
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Sean Christopherson, Xiaoyao Li
Subject: [PATCH v10 001/108] KVM: VMX: Move out vmx_x86_ops to 'main.c' to wrap VMX and TDX
Date: Sat, 29 Oct 2022 23:22:02 -0700
Message-Id: <10793e9e974e43e497d00fb57b52c85a3432b45c.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Sean Christopherson

KVM accesses the Virtual Machine Control Structure (VMCS) with VMX
instructions to operate on a VM. TDX doesn't allow the VMM to operate on
the VMCS directly. Instead, TDX has its own data structures, and TDX
SEAMCALL APIs through which the VMM operates on those data structures
indirectly. This means we must have a TDX version of kvm_x86_ops.

The existing global struct kvm_x86_ops already defines an interface which
fits with TDX.
But kvm_x86_ops is a system-wide, not a per-VM, structure. To allow VMX
to coexist with TDs, the kvm_x86_ops callbacks will have wrappers
"if (tdx) tdx_op() else vmx_op()" to switch between VMX and TDX at run time.

To separate the runtime switch, the VMX implementation, and the TDX
implementation, add main.c and move the vmx_x86_ops hooks out of vmx.c in
preparation for adding TDX, which can coexist with VMX, i.e. KVM can run
both VMs and TDs. Use 'vt' for the naming scheme as a nod to VT-x and as a
concatenation of VmxTdx.

The current code looks as follows.
In vmx.c
  static vmx_op() { ... }
  static struct kvm_x86_ops vmx_x86_ops = {
        .op = vmx_op,
        ...
  };
  initialization code

The eventually converted code will look like
In vmx.c, keep the VMX operations.
  vmx_op() { ... }
  VMX initialization
In tdx.c, define the TDX operations.
  tdx_op() { ... }
  TDX initialization
In x86_ops.h, declare the VMX and TDX operations.
  vmx_op();
  tdx_op();
In main.c, define common wrappers for VMX and TDX.
  static vt_op() { if (tdx) tdx_op() else vmx_op() }
  static struct kvm_x86_ops vt_x86_ops = {
        .op = vt_op,
        ...
  };
  initialization to call VMX and TDX initialization

Opportunistically, fix the name inconsistency by renaming
vmx_create_vcpu() and vmx_free_vcpu() to vmx_vcpu_create() and
vmx_vcpu_free().

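As a rough illustration of the wrapper scheme described above, the sketch
below models one system-wide ops table whose callback dispatches per VM to a
VMX or a TDX implementation. It is a simplified user-space model, not code
from this patch: struct vm, its is_td flag, enum backend and struct x86_ops
are hypothetical stand-ins for struct kvm and struct kvm_x86_ops, and
tdx_vm_init() is assumed, since this patch only wires up the VMX side.

```c
#include <stdbool.h>

/* Hypothetical marker recording which backend handled a VM. */
enum backend { BACKEND_NONE, BACKEND_VMX, BACKEND_TDX };

/* Hypothetical, heavily simplified stand-in for struct kvm. */
struct vm {
	bool is_td;            /* assumed flag: true for a TDX guest (TD) */
	enum backend backend;  /* set by whichever vm_init ran */
};

/* Hypothetical stand-in for struct kvm_x86_ops with a single hook. */
struct x86_ops {
	const char *name;
	int (*vm_init)(struct vm *vm);
};

/* Would live in vmx.c: non-static so that main.c can reference it. */
int vmx_vm_init(struct vm *vm)
{
	vm->backend = BACKEND_VMX;
	return 0;
}

/* Would live in tdx.c (assumed; not part of this patch). */
int tdx_vm_init(struct vm *vm)
{
	vm->backend = BACKEND_TDX;
	return 0;
}

/* Would live in main.c: the common wrapper that switches at run time. */
static int vt_vm_init(struct vm *vm)
{
	if (vm->is_td)
		return tdx_vm_init(vm);
	return vmx_vm_init(vm);
}

/* The single system-wide table only ever points at the vt_ wrappers. */
struct x86_ops vt_x86_ops = {
	.name = "kvm_intel",
	.vm_init = vt_vm_init,
};
```

Calling vt_x86_ops.vm_init() on a struct vm with is_td set reaches
tdx_vm_init(); otherwise it reaches vmx_vm_init(). The table itself stays
system-wide, and the per-VM decision lives entirely inside the wrapper.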
Co-developed-by: Xiaoyao Li Signed-off-by: Xiaoyao Li Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata Reviewed-by: Paolo Bonzini --- arch/x86/kvm/Makefile | 2 +- arch/x86/kvm/vmx/main.c | 155 ++++++++++++++++ arch/x86/kvm/vmx/vmx.c | 363 +++++++++++-------------------------- arch/x86/kvm/vmx/x86_ops.h | 125 +++++++++++++ 4 files changed, 386 insertions(+), 259 deletions(-) create mode 100644 arch/x86/kvm/vmx/main.c create mode 100644 arch/x86/kvm/vmx/x86_ops.h diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile index 30f244b64523..ee4d0999f20f 100644 --- a/arch/x86/kvm/Makefile +++ b/arch/x86/kvm/Makefile @@ -22,7 +22,7 @@ kvm-$(CONFIG_X86_64) += mmu/tdp_iter.o mmu/tdp_mmu.o kvm-$(CONFIG_KVM_XEN) += xen.o kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \ - vmx/evmcs.o vmx/nested.o vmx/posted_intr.o + vmx/evmcs.o vmx/nested.o vmx/posted_intr.o vmx/main.o kvm-intel-$(CONFIG_X86_SGX_KVM) += vmx/sgx.o kvm-amd-y += svm/svm.o svm/vmenter.o svm/pmu.o svm/nested.o svm/avic.o svm/sev.o diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c new file mode 100644 index 000000000000..381059631e4b --- /dev/null +++ b/arch/x86/kvm/vmx/main.c @@ -0,0 +1,155 @@ +// SPDX-License-Identifier: GPL-2.0 +#include + +#include "x86_ops.h" +#include "vmx.h" +#include "nested.h" +#include "pmu.h" + +struct kvm_x86_ops vt_x86_ops __initdata = { + .name = "kvm_intel", + + .hardware_unsetup = vmx_hardware_unsetup, + .check_processor_compatibility = vmx_check_processor_compatibility, + + .hardware_enable = vmx_hardware_enable, + .hardware_disable = vmx_hardware_disable, + .has_emulated_msr = vmx_has_emulated_msr, + + .vm_size = sizeof(struct kvm_vmx), + .vm_init = vmx_vm_init, + .vm_destroy = vmx_vm_destroy, + + .vcpu_precreate = vmx_vcpu_precreate, + .vcpu_create = vmx_vcpu_create, + .vcpu_free = vmx_vcpu_free, + .vcpu_reset = vmx_vcpu_reset, + + .prepare_switch_to_guest = vmx_prepare_switch_to_guest, + .vcpu_load = vmx_vcpu_load, + .vcpu_put 
= vmx_vcpu_put, + + .update_exception_bitmap = vmx_update_exception_bitmap, + .get_msr_feature = vmx_get_msr_feature, + .get_msr = vmx_get_msr, + .set_msr = vmx_set_msr, + .get_segment_base = vmx_get_segment_base, + .get_segment = vmx_get_segment, + .set_segment = vmx_set_segment, + .get_cpl = vmx_get_cpl, + .get_cs_db_l_bits = vmx_get_cs_db_l_bits, + .set_cr0 = vmx_set_cr0, + .is_valid_cr4 = vmx_is_valid_cr4, + .set_cr4 = vmx_set_cr4, + .set_efer = vmx_set_efer, + .get_idt = vmx_get_idt, + .set_idt = vmx_set_idt, + .get_gdt = vmx_get_gdt, + .set_gdt = vmx_set_gdt, + .set_dr7 = vmx_set_dr7, + .sync_dirty_debug_regs = vmx_sync_dirty_debug_regs, + .cache_reg = vmx_cache_reg, + .get_rflags = vmx_get_rflags, + .set_rflags = vmx_set_rflags, + .get_if_flag = vmx_get_if_flag, + + .flush_tlb_all = vmx_flush_tlb_all, + .flush_tlb_current = vmx_flush_tlb_current, + .flush_tlb_gva = vmx_flush_tlb_gva, + .flush_tlb_guest = vmx_flush_tlb_guest, + + .vcpu_pre_run = vmx_vcpu_pre_run, + .vcpu_run = vmx_vcpu_run, + .handle_exit = vmx_handle_exit, + .skip_emulated_instruction = vmx_skip_emulated_instruction, + .update_emulated_instruction = vmx_update_emulated_instruction, + .set_interrupt_shadow = vmx_set_interrupt_shadow, + .get_interrupt_shadow = vmx_get_interrupt_shadow, + .patch_hypercall = vmx_patch_hypercall, + .inject_irq = vmx_inject_irq, + .inject_nmi = vmx_inject_nmi, + .inject_exception = vmx_inject_exception, + .cancel_injection = vmx_cancel_injection, + .interrupt_allowed = vmx_interrupt_allowed, + .nmi_allowed = vmx_nmi_allowed, + .get_nmi_mask = vmx_get_nmi_mask, + .set_nmi_mask = vmx_set_nmi_mask, + .enable_nmi_window = vmx_enable_nmi_window, + .enable_irq_window = vmx_enable_irq_window, + .update_cr8_intercept = vmx_update_cr8_intercept, + .set_virtual_apic_mode = vmx_set_virtual_apic_mode, + .set_apic_access_page_addr = vmx_set_apic_access_page_addr, + .refresh_apicv_exec_ctrl = vmx_refresh_apicv_exec_ctrl, + .load_eoi_exitmap = vmx_load_eoi_exitmap, + 
.apicv_post_state_restore = vmx_apicv_post_state_restore, + .check_apicv_inhibit_reasons = vmx_check_apicv_inhibit_reasons, + .hwapic_irr_update = vmx_hwapic_irr_update, + .hwapic_isr_update = vmx_hwapic_isr_update, + .guest_apic_has_interrupt = vmx_guest_apic_has_interrupt, + .sync_pir_to_irr = vmx_sync_pir_to_irr, + .deliver_interrupt = vmx_deliver_interrupt, + .dy_apicv_has_pending_interrupt = pi_has_pending_interrupt, + + .set_tss_addr = vmx_set_tss_addr, + .set_identity_map_addr = vmx_set_identity_map_addr, + .get_mt_mask = vmx_get_mt_mask, + + .get_exit_info = vmx_get_exit_info, + + .vcpu_after_set_cpuid = vmx_vcpu_after_set_cpuid, + + .has_wbinvd_exit = cpu_has_vmx_wbinvd_exit, + + .get_l2_tsc_offset = vmx_get_l2_tsc_offset, + .get_l2_tsc_multiplier = vmx_get_l2_tsc_multiplier, + .write_tsc_offset = vmx_write_tsc_offset, + .write_tsc_multiplier = vmx_write_tsc_multiplier, + + .load_mmu_pgd = vmx_load_mmu_pgd, + + .check_intercept = vmx_check_intercept, + .handle_exit_irqoff = vmx_handle_exit_irqoff, + + .request_immediate_exit = vmx_request_immediate_exit, + + .sched_in = vmx_sched_in, + + .cpu_dirty_log_size = PML_ENTITY_NUM, + .update_cpu_dirty_logging = vmx_update_cpu_dirty_logging, + + .nested_ops = &vmx_nested_ops, + + .pi_update_irte = vmx_pi_update_irte, + .pi_start_assignment = vmx_pi_start_assignment, + +#ifdef CONFIG_X86_64 + .set_hv_timer = vmx_set_hv_timer, + .cancel_hv_timer = vmx_cancel_hv_timer, +#endif + + .setup_mce = vmx_setup_mce, + + .smi_allowed = vmx_smi_allowed, + .enter_smm = vmx_enter_smm, + .leave_smm = vmx_leave_smm, + .enable_smi_window = vmx_enable_smi_window, + + .can_emulate_instruction = vmx_can_emulate_instruction, + .apic_init_signal_blocked = vmx_apic_init_signal_blocked, + .migrate_timers = vmx_migrate_timers, + + .msr_filter_changed = vmx_msr_filter_changed, + .complete_emulated_msr = kvm_complete_insn_gp, + + .vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector, +}; + +struct kvm_x86_init_ops vt_init_ops __initdata = 
{ + .cpu_has_kvm_support = vmx_cpu_has_kvm_support, + .disabled_by_bios = vmx_disabled_by_bios, + .hardware_setup = vmx_hardware_setup, + .handle_intel_pt_intr = NULL, + + .runtime_ops = &vt_x86_ops, + .pmu_ops = &intel_pmu_ops, +}; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 7cc06ca0efc2..0080d88ded20 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -66,6 +66,7 @@ #include "vmcs12.h" #include "vmx.h" #include "x86.h" +#include "x86_ops.h" MODULE_AUTHOR("Qumranet"); MODULE_LICENSE("GPL"); @@ -1386,7 +1387,7 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu, * Switches to specified vcpu, until a matching vcpu_put(), but assumes * vcpu mutex is already taken. */ -static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) +void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -1397,7 +1398,7 @@ static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) vmx->host_debugctlmsr = get_debugctlmsr(); } -static void vmx_vcpu_put(struct kvm_vcpu *vcpu) +void vmx_vcpu_put(struct kvm_vcpu *vcpu) { vmx_vcpu_pi_put(vcpu); @@ -1451,7 +1452,7 @@ void vmx_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags) vmx->emulation_required = vmx_emulation_required(vcpu); } -static bool vmx_get_if_flag(struct kvm_vcpu *vcpu) +bool vmx_get_if_flag(struct kvm_vcpu *vcpu) { return vmx_get_rflags(vcpu) & X86_EFLAGS_IF; } @@ -1557,8 +1558,8 @@ static int vmx_rtit_ctl_check(struct kvm_vcpu *vcpu, u64 data) return 0; } -static bool vmx_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type, - void *insn, int insn_len) +bool vmx_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type, + void *insn, int insn_len) { /* * Emulation of instructions in SGX enclaves is impossible as RIP does @@ -1642,7 +1643,7 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu) * Recognizes a pending MTF VM-exit and records the nested state for later * delivery. 
*/ -static void vmx_update_emulated_instruction(struct kvm_vcpu *vcpu) +void vmx_update_emulated_instruction(struct kvm_vcpu *vcpu) { struct vmcs12 *vmcs12 = get_vmcs12(vcpu); struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -1673,7 +1674,7 @@ static void vmx_update_emulated_instruction(struct kvm_vcpu *vcpu) } } -static int vmx_skip_emulated_instruction(struct kvm_vcpu *vcpu) +int vmx_skip_emulated_instruction(struct kvm_vcpu *vcpu) { vmx_update_emulated_instruction(vcpu); return skip_emulated_instruction(vcpu); @@ -1692,7 +1693,7 @@ static void vmx_clear_hlt(struct kvm_vcpu *vcpu) vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE); } -static void vmx_inject_exception(struct kvm_vcpu *vcpu) +void vmx_inject_exception(struct kvm_vcpu *vcpu) { struct kvm_queued_exception *ex = &vcpu->arch.exception; u32 intr_info = ex->vector | INTR_INFO_VALID_MASK; @@ -1813,12 +1814,12 @@ u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu) return kvm_caps.default_tsc_scaling_ratio; } -static void vmx_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset) +void vmx_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset) { vmcs_write64(TSC_OFFSET, offset); } -static void vmx_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier) +void vmx_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier) { vmcs_write64(TSC_MULTIPLIER, multiplier); } @@ -1842,7 +1843,7 @@ static inline bool vmx_feature_control_msr_valid(struct kvm_vcpu *vcpu, return !(val & ~valid_bits); } -static int vmx_get_msr_feature(struct kvm_msr_entry *msr) +int vmx_get_msr_feature(struct kvm_msr_entry *msr) { switch (msr->index) { case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC: @@ -1862,7 +1863,7 @@ static int vmx_get_msr_feature(struct kvm_msr_entry *msr) * Returns 0 on success, non-0 otherwise. * Assumes vcpu_load() was already called. 
*/ -static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) +int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) { struct vcpu_vmx *vmx = to_vmx(vcpu); struct vmx_uret_msr *msr; @@ -2039,7 +2040,7 @@ static u64 vcpu_supported_debugctl(struct kvm_vcpu *vcpu) * Returns 0 on success, non-0 otherwise. * Assumes vcpu_load() was already called. */ -static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) +int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) { struct vcpu_vmx *vmx = to_vmx(vcpu); struct vmx_uret_msr *msr; @@ -2373,7 +2374,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) return ret; } -static void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg) +void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg) { unsigned long guest_owned_bits; @@ -2416,12 +2417,12 @@ static void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg) } } -static __init int cpu_has_kvm_support(void) +__init int vmx_cpu_has_kvm_support(void) { return cpu_has_vmx(); } -static __init int vmx_disabled_by_bios(void) +__init int vmx_disabled_by_bios(void) { return !boot_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) || !boot_cpu_has(X86_FEATURE_VMX); @@ -2447,7 +2448,7 @@ static int kvm_cpu_vmxon(u64 vmxon_pointer) return -EFAULT; } -static int vmx_hardware_enable(void) +int vmx_hardware_enable(void) { int cpu = raw_smp_processor_id(); u64 phys_addr = __pa(per_cpu(vmxarea, cpu)); @@ -2488,7 +2489,7 @@ static void vmclear_local_loaded_vmcss(void) __loaded_vmcs_clear(v); } -static void vmx_hardware_disable(void) +void vmx_hardware_disable(void) { vmclear_local_loaded_vmcss(); @@ -3025,7 +3026,7 @@ static void exit_lmode(struct kvm_vcpu *vcpu) #endif -static void vmx_flush_tlb_all(struct kvm_vcpu *vcpu) +void vmx_flush_tlb_all(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -3055,7 +3056,7 @@ static inline int vmx_get_current_vpid(struct kvm_vcpu *vcpu) return to_vmx(vcpu)->vpid; } 
-static void vmx_flush_tlb_current(struct kvm_vcpu *vcpu) +void vmx_flush_tlb_current(struct kvm_vcpu *vcpu) { struct kvm_mmu *mmu = vcpu->arch.mmu; u64 root_hpa = mmu->root.hpa; @@ -3071,7 +3072,7 @@ static void vmx_flush_tlb_current(struct kvm_vcpu *vcpu) vpid_sync_context(vmx_get_current_vpid(vcpu)); } -static void vmx_flush_tlb_gva(struct kvm_vcpu *vcpu, gva_t addr) +void vmx_flush_tlb_gva(struct kvm_vcpu *vcpu, gva_t addr) { /* * vpid_sync_vcpu_addr() is a nop if vpid==0, see the comment in @@ -3080,7 +3081,7 @@ static void vmx_flush_tlb_gva(struct kvm_vcpu *vcpu, gva_t addr) vpid_sync_vcpu_addr(vmx_get_current_vpid(vcpu), addr); } -static void vmx_flush_tlb_guest(struct kvm_vcpu *vcpu) +void vmx_flush_tlb_guest(struct kvm_vcpu *vcpu) { /* * vpid_sync_context() is a nop if vpid==0, e.g. if enable_vpid==0 or a @@ -3235,8 +3236,7 @@ u64 construct_eptp(struct kvm_vcpu *vcpu, hpa_t root_hpa, int root_level) return eptp; } -static void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, - int root_level) +void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int root_level) { struct kvm *kvm = vcpu->kvm; bool update_guest_cr3 = true; @@ -3264,8 +3264,7 @@ static void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, vmcs_writel(GUEST_CR3, guest_cr3); } - -static bool vmx_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) +bool vmx_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4) { /* * We operate under the default treatment of SMM, so VMX cannot be @@ -3381,7 +3380,7 @@ void vmx_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg) var->g = (ar >> 15) & 1; } -static u64 vmx_get_segment_base(struct kvm_vcpu *vcpu, int seg) +u64 vmx_get_segment_base(struct kvm_vcpu *vcpu, int seg) { struct kvm_segment s; @@ -3461,14 +3460,14 @@ void __vmx_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg) vmcs_write32(sf->ar_bytes, vmx_segment_access_rights(var)); } -static void vmx_set_segment(struct kvm_vcpu *vcpu, struct 
kvm_segment *var, int seg) +void vmx_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg) { __vmx_set_segment(vcpu, var, seg); to_vmx(vcpu)->emulation_required = vmx_emulation_required(vcpu); } -static void vmx_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l) +void vmx_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l) { u32 ar = vmx_read_guest_seg_ar(to_vmx(vcpu), VCPU_SREG_CS); @@ -3476,25 +3475,25 @@ static void vmx_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l) *l = (ar >> 13) & 1; } -static void vmx_get_idt(struct kvm_vcpu *vcpu, struct desc_ptr *dt) +void vmx_get_idt(struct kvm_vcpu *vcpu, struct desc_ptr *dt) { dt->size = vmcs_read32(GUEST_IDTR_LIMIT); dt->address = vmcs_readl(GUEST_IDTR_BASE); } -static void vmx_set_idt(struct kvm_vcpu *vcpu, struct desc_ptr *dt) +void vmx_set_idt(struct kvm_vcpu *vcpu, struct desc_ptr *dt) { vmcs_write32(GUEST_IDTR_LIMIT, dt->size); vmcs_writel(GUEST_IDTR_BASE, dt->address); } -static void vmx_get_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt) +void vmx_get_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt) { dt->size = vmcs_read32(GUEST_GDTR_LIMIT); dt->address = vmcs_readl(GUEST_GDTR_BASE); } -static void vmx_set_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt) +void vmx_set_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt) { vmcs_write32(GUEST_GDTR_LIMIT, dt->size); vmcs_writel(GUEST_GDTR_BASE, dt->address); @@ -3992,7 +3991,7 @@ void pt_update_intercept_for_msr(struct kvm_vcpu *vcpu) } } -static bool vmx_guest_apic_has_interrupt(struct kvm_vcpu *vcpu) +bool vmx_guest_apic_has_interrupt(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); void *vapic_page; @@ -4012,7 +4011,7 @@ static bool vmx_guest_apic_has_interrupt(struct kvm_vcpu *vcpu) return ((rvi & 0xf0) > (vppr & 0xf0)); } -static void vmx_msr_filter_changed(struct kvm_vcpu *vcpu) +void vmx_msr_filter_changed(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); u32 i; @@ -4153,8 +4152,8 @@ static int 
vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector) return 0; } -static void vmx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode, - int trig_mode, int vector) +void vmx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode, + int trig_mode, int vector) { struct kvm_vcpu *vcpu = apic->vcpu; @@ -4316,7 +4315,7 @@ static u32 vmx_vmexit_ctrl(void) ~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER); } -static void vmx_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu) +void vmx_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -4574,7 +4573,7 @@ static int vmx_alloc_ipiv_pid_table(struct kvm *kvm) return 0; } -static int vmx_vcpu_precreate(struct kvm *kvm) +int vmx_vcpu_precreate(struct kvm *kvm) { return vmx_alloc_ipiv_pid_table(kvm); } @@ -4726,7 +4725,7 @@ static void __vmx_vcpu_reset(struct kvm_vcpu *vcpu) vmx->pi_desc.sn = 1; } -static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) +void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -4785,12 +4784,12 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) vmx_update_fb_clear_dis(vcpu, vmx); } -static void vmx_enable_irq_window(struct kvm_vcpu *vcpu) +void vmx_enable_irq_window(struct kvm_vcpu *vcpu) { exec_controls_setbit(to_vmx(vcpu), CPU_BASED_INTR_WINDOW_EXITING); } -static void vmx_enable_nmi_window(struct kvm_vcpu *vcpu) +void vmx_enable_nmi_window(struct kvm_vcpu *vcpu) { if (!enable_vnmi || vmcs_read32(GUEST_INTERRUPTIBILITY_INFO) & GUEST_INTR_STATE_STI) { @@ -4801,7 +4800,7 @@ static void vmx_enable_nmi_window(struct kvm_vcpu *vcpu) exec_controls_setbit(to_vmx(vcpu), CPU_BASED_NMI_WINDOW_EXITING); } -static void vmx_inject_irq(struct kvm_vcpu *vcpu, bool reinjected) +void vmx_inject_irq(struct kvm_vcpu *vcpu, bool reinjected) { struct vcpu_vmx *vmx = to_vmx(vcpu); uint32_t intr; @@ -4829,7 +4828,7 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu, bool 
reinjected) vmx_clear_hlt(vcpu); } -static void vmx_inject_nmi(struct kvm_vcpu *vcpu) +void vmx_inject_nmi(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -4907,7 +4906,7 @@ bool vmx_nmi_blocked(struct kvm_vcpu *vcpu) GUEST_INTR_STATE_NMI)); } -static int vmx_nmi_allowed(struct kvm_vcpu *vcpu, bool for_injection) +int vmx_nmi_allowed(struct kvm_vcpu *vcpu, bool for_injection) { if (to_vmx(vcpu)->nested.nested_run_pending) return -EBUSY; @@ -4929,7 +4928,7 @@ bool vmx_interrupt_blocked(struct kvm_vcpu *vcpu) (GUEST_INTR_STATE_STI | GUEST_INTR_STATE_MOV_SS)); } -static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection) +int vmx_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection) { if (to_vmx(vcpu)->nested.nested_run_pending) return -EBUSY; @@ -4944,7 +4943,7 @@ static int vmx_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection) return !vmx_interrupt_blocked(vcpu); } -static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr) +int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr) { void __user *ret; @@ -4964,7 +4963,7 @@ static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr) return init_rmode_tss(kvm, ret); } -static int vmx_set_identity_map_addr(struct kvm *kvm, u64 ident_addr) +int vmx_set_identity_map_addr(struct kvm *kvm, u64 ident_addr) { to_kvm_vmx(kvm)->ept_identity_map_addr = ident_addr; return 0; @@ -5245,8 +5244,7 @@ static int handle_io(struct kvm_vcpu *vcpu) return kvm_fast_pio(vcpu, size, port, in); } -static void -vmx_patch_hypercall(struct kvm_vcpu *vcpu, unsigned char *hypercall) +void vmx_patch_hypercall(struct kvm_vcpu *vcpu, unsigned char *hypercall) { /* * Patch in the VMCALL instruction: @@ -5456,7 +5454,7 @@ static int handle_dr(struct kvm_vcpu *vcpu) return kvm_complete_insn_gp(vcpu, err); } -static void vmx_sync_dirty_debug_regs(struct kvm_vcpu *vcpu) +void vmx_sync_dirty_debug_regs(struct kvm_vcpu *vcpu) { get_debugreg(vcpu->arch.db[0], 0); get_debugreg(vcpu->arch.db[1], 1); 
@@ -5475,7 +5473,7 @@ static void vmx_sync_dirty_debug_regs(struct kvm_vcpu *vcpu) set_debugreg(DR6_RESERVED, 6); } -static void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val) +void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val) { vmcs_writel(GUEST_DR7, val); } @@ -5746,7 +5744,7 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu) return 1; } -static int vmx_vcpu_pre_run(struct kvm_vcpu *vcpu) +int vmx_vcpu_pre_run(struct kvm_vcpu *vcpu) { if (vmx_emulation_required_with_pending_exception(vcpu)) { kvm_prepare_emulation_failure_exit(vcpu); @@ -6010,9 +6008,8 @@ static int (*kvm_vmx_exit_handlers[])(struct kvm_vcpu *vcpu) = { static const int kvm_vmx_max_exit_handlers = ARRAY_SIZE(kvm_vmx_exit_handlers); -static void vmx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason, - u64 *info1, u64 *info2, - u32 *intr_info, u32 *error_code) +void vmx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason, + u64 *info1, u64 *info2, u32 *intr_info, u32 *error_code) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -6455,7 +6452,7 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath) return 0; } -static int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath) +int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath) { int ret = __vmx_handle_exit(vcpu, exit_fastpath); @@ -6543,7 +6540,7 @@ static noinstr void vmx_l1d_flush(struct kvm_vcpu *vcpu) : "eax", "ebx", "ecx", "edx"); } -static void vmx_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr) +void vmx_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr) { struct vmcs12 *vmcs12 = get_vmcs12(vcpu); int tpr_threshold; @@ -6613,7 +6610,7 @@ void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu) vmx_update_msr_bitmap_x2apic(vcpu); } -static void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu) +void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu) { struct page *page; @@ -6641,7 +6638,7 @@ static void vmx_set_apic_access_page_addr(struct 
kvm_vcpu *vcpu) put_page(page); } -static void vmx_hwapic_isr_update(int max_isr) +void vmx_hwapic_isr_update(int max_isr) { u16 status; u8 old; @@ -6675,7 +6672,7 @@ static void vmx_set_rvi(int vector) } } -static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr) +void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr) { /* * When running L2, updating RVI is only relevant when @@ -6689,7 +6686,7 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr) vmx_set_rvi(max_irr); } -static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu) +int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); int max_irr; @@ -6735,7 +6732,7 @@ static int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu) return max_irr; } -static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap) +void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap) { if (!kvm_vcpu_apicv_active(vcpu)) return; @@ -6746,7 +6743,7 @@ static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap) vmcs_write64(EOI_EXIT_BITMAP3, eoi_exit_bitmap[3]); } -static void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu) +void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -6819,7 +6816,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu) vcpu->arch.at_instruction_boundary = true; } -static void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu) +void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -6836,7 +6833,7 @@ static void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu) * The kvm parameter can be NULL (module initialization, or invocation before * VM creation). Be sure to check the kvm parameter before using it. 
*/ -static bool vmx_has_emulated_msr(struct kvm *kvm, u32 index) +bool vmx_has_emulated_msr(struct kvm *kvm, u32 index) { switch (index) { case MSR_IA32_SMBASE: @@ -6957,7 +6954,7 @@ static void vmx_complete_interrupts(struct vcpu_vmx *vmx) IDT_VECTORING_ERROR_CODE); } -static void vmx_cancel_injection(struct kvm_vcpu *vcpu) +void vmx_cancel_injection(struct kvm_vcpu *vcpu) { __vmx_complete_interrupts(vcpu, vmcs_read32(VM_ENTRY_INTR_INFO_FIELD), @@ -7091,7 +7088,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu, guest_state_exit_irqoff(); } -static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu) +fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); unsigned long cr3, cr4; @@ -7257,7 +7254,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu) return vmx_exit_handlers_fastpath(vcpu); } -static void vmx_vcpu_free(struct kvm_vcpu *vcpu) +void vmx_vcpu_free(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -7268,7 +7265,7 @@ static void vmx_vcpu_free(struct kvm_vcpu *vcpu) free_loaded_vmcs(vmx->loaded_vmcs); } -static int vmx_vcpu_create(struct kvm_vcpu *vcpu) +int vmx_vcpu_create(struct kvm_vcpu *vcpu) { struct vmx_uret_msr *tsx_ctrl; struct vcpu_vmx *vmx; @@ -7377,7 +7374,7 @@ static int vmx_vcpu_create(struct kvm_vcpu *vcpu) #define L1TF_MSG_SMT "L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n" #define L1TF_MSG_L1D "L1TF CPU bug present and virtualization mitigation disabled, data leak possible. 
See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n" -static int vmx_vm_init(struct kvm *kvm) +int vmx_vm_init(struct kvm *kvm) { if (!ple_gap) kvm->arch.pause_in_guest = true; @@ -7408,7 +7405,7 @@ static int vmx_vm_init(struct kvm *kvm) return 0; } -static int vmx_check_processor_compatibility(void) +int vmx_check_processor_compatibility(void) { struct vmcs_config vmcs_conf; struct vmx_capability vmx_cap; @@ -7433,7 +7430,7 @@ static int vmx_check_processor_compatibility(void) return 0; } -static u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio) +u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio) { u8 cache; @@ -7605,7 +7602,7 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu) vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4)); } -static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) +void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -7715,7 +7712,7 @@ static __init void vmx_set_cpu_caps(void) kvm_cpu_cap_check_and_set(X86_FEATURE_WAITPKG); } -static void vmx_request_immediate_exit(struct kvm_vcpu *vcpu) +void vmx_request_immediate_exit(struct kvm_vcpu *vcpu) { to_vmx(vcpu)->req_immediate_exit = true; } @@ -7754,10 +7751,10 @@ static int vmx_check_intercept_io(struct kvm_vcpu *vcpu, return intercept ? 
X86EMUL_UNHANDLEABLE : X86EMUL_CONTINUE; } -static int vmx_check_intercept(struct kvm_vcpu *vcpu, - struct x86_instruction_info *info, - enum x86_intercept_stage stage, - struct x86_exception *exception) +int vmx_check_intercept(struct kvm_vcpu *vcpu, + struct x86_instruction_info *info, + enum x86_intercept_stage stage, + struct x86_exception *exception) { struct vmcs12 *vmcs12 = get_vmcs12(vcpu); @@ -7822,8 +7819,8 @@ static inline int u64_shl_div_u64(u64 a, unsigned int shift, return 0; } -static int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc, - bool *expired) +int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc, + bool *expired) { struct vcpu_vmx *vmx; u64 tscl, guest_tscl, delta_tsc, lapic_timer_advance_cycles; @@ -7862,13 +7859,13 @@ static int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc, return 0; } -static void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu) +void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu) { to_vmx(vcpu)->hv_deadline_tsc = -1; } #endif -static void vmx_sched_in(struct kvm_vcpu *vcpu, int cpu) +void vmx_sched_in(struct kvm_vcpu *vcpu, int cpu) { if (!kvm_pause_in_guest(vcpu->kvm)) shrink_ple_window(vcpu); @@ -7894,7 +7891,7 @@ void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu) secondary_exec_controls_clearbit(vmx, SECONDARY_EXEC_ENABLE_PML); } -static void vmx_setup_mce(struct kvm_vcpu *vcpu) +void vmx_setup_mce(struct kvm_vcpu *vcpu) { if (vcpu->arch.mcg_cap & MCG_LMCE_P) to_vmx(vcpu)->msr_ia32_feature_control_valid_bits |= @@ -7904,7 +7901,7 @@ static void vmx_setup_mce(struct kvm_vcpu *vcpu) ~FEAT_CTL_LMCE_ENABLED; } -static int vmx_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection) +int vmx_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection) { /* we need a nested vmexit to enter SMM, postpone if run is pending */ if (to_vmx(vcpu)->nested.nested_run_pending) @@ -7912,7 +7909,7 @@ static int vmx_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection) return !is_smm(vcpu); } 
-static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate) +int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -7933,7 +7930,7 @@ static int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate) return 0; } -static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate) +int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate) { struct vcpu_vmx *vmx = to_vmx(vcpu); int ret; @@ -7954,17 +7951,17 @@ static int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate) return 0; } -static void vmx_enable_smi_window(struct kvm_vcpu *vcpu) +void vmx_enable_smi_window(struct kvm_vcpu *vcpu) { /* RSM will cause a vmexit anyway. */ } -static bool vmx_apic_init_signal_blocked(struct kvm_vcpu *vcpu) +bool vmx_apic_init_signal_blocked(struct kvm_vcpu *vcpu) { return to_vmx(vcpu)->nested.vmxon && !is_guest_mode(vcpu); } -static void vmx_migrate_timers(struct kvm_vcpu *vcpu) +void vmx_migrate_timers(struct kvm_vcpu *vcpu) { if (is_guest_mode(vcpu)) { struct hrtimer *timer = &to_vmx(vcpu)->nested.preemption_timer; @@ -7974,7 +7971,7 @@ static void vmx_migrate_timers(struct kvm_vcpu *vcpu) } } -static void vmx_hardware_unsetup(void) +void vmx_hardware_unsetup(void) { kvm_set_posted_intr_wakeup_handler(NULL); @@ -7984,7 +7981,7 @@ static void vmx_hardware_unsetup(void) free_kvm_area(); } -static bool vmx_check_apicv_inhibit_reasons(enum kvm_apicv_inhibit reason) +bool vmx_check_apicv_inhibit_reasons(enum kvm_apicv_inhibit reason) { ulong supported = BIT(APICV_INHIBIT_REASON_DISABLE) | BIT(APICV_INHIBIT_REASON_ABSENT) | @@ -7996,151 +7993,13 @@ static bool vmx_check_apicv_inhibit_reasons(enum kvm_apicv_inhibit reason) return supported & BIT(reason); } -static void vmx_vm_destroy(struct kvm *kvm) +void vmx_vm_destroy(struct kvm *kvm) { struct kvm_vmx *kvm_vmx = to_kvm_vmx(kvm); free_pages((unsigned long)kvm_vmx->pid_table, vmx_get_pid_table_order(kvm)); } -static struct kvm_x86_ops vmx_x86_ops __initdata = { - .name = 
"kvm_intel", - - .hardware_unsetup = vmx_hardware_unsetup, - - .check_processor_compatibility = vmx_check_processor_compatibility, - .hardware_enable = vmx_hardware_enable, - .hardware_disable = vmx_hardware_disable, - .has_emulated_msr = vmx_has_emulated_msr, - - .vm_size = sizeof(struct kvm_vmx), - .vm_init = vmx_vm_init, - .vm_destroy = vmx_vm_destroy, - - .vcpu_precreate = vmx_vcpu_precreate, - .vcpu_create = vmx_vcpu_create, - .vcpu_free = vmx_vcpu_free, - .vcpu_reset = vmx_vcpu_reset, - - .prepare_switch_to_guest = vmx_prepare_switch_to_guest, - .vcpu_load = vmx_vcpu_load, - .vcpu_put = vmx_vcpu_put, - - .update_exception_bitmap = vmx_update_exception_bitmap, - .get_msr_feature = vmx_get_msr_feature, - .get_msr = vmx_get_msr, - .set_msr = vmx_set_msr, - .get_segment_base = vmx_get_segment_base, - .get_segment = vmx_get_segment, - .set_segment = vmx_set_segment, - .get_cpl = vmx_get_cpl, - .get_cs_db_l_bits = vmx_get_cs_db_l_bits, - .set_cr0 = vmx_set_cr0, - .is_valid_cr4 = vmx_is_valid_cr4, - .set_cr4 = vmx_set_cr4, - .set_efer = vmx_set_efer, - .get_idt = vmx_get_idt, - .set_idt = vmx_set_idt, - .get_gdt = vmx_get_gdt, - .set_gdt = vmx_set_gdt, - .set_dr7 = vmx_set_dr7, - .sync_dirty_debug_regs = vmx_sync_dirty_debug_regs, - .cache_reg = vmx_cache_reg, - .get_rflags = vmx_get_rflags, - .set_rflags = vmx_set_rflags, - .get_if_flag = vmx_get_if_flag, - - .flush_tlb_all = vmx_flush_tlb_all, - .flush_tlb_current = vmx_flush_tlb_current, - .flush_tlb_gva = vmx_flush_tlb_gva, - .flush_tlb_guest = vmx_flush_tlb_guest, - - .vcpu_pre_run = vmx_vcpu_pre_run, - .vcpu_run = vmx_vcpu_run, - .handle_exit = vmx_handle_exit, - .skip_emulated_instruction = vmx_skip_emulated_instruction, - .update_emulated_instruction = vmx_update_emulated_instruction, - .set_interrupt_shadow = vmx_set_interrupt_shadow, - .get_interrupt_shadow = vmx_get_interrupt_shadow, - .patch_hypercall = vmx_patch_hypercall, - .inject_irq = vmx_inject_irq, - .inject_nmi = vmx_inject_nmi, - 
.inject_exception = vmx_inject_exception, - .cancel_injection = vmx_cancel_injection, - .interrupt_allowed = vmx_interrupt_allowed, - .nmi_allowed = vmx_nmi_allowed, - .get_nmi_mask = vmx_get_nmi_mask, - .set_nmi_mask = vmx_set_nmi_mask, - .enable_nmi_window = vmx_enable_nmi_window, - .enable_irq_window = vmx_enable_irq_window, - .update_cr8_intercept = vmx_update_cr8_intercept, - .set_virtual_apic_mode = vmx_set_virtual_apic_mode, - .set_apic_access_page_addr = vmx_set_apic_access_page_addr, - .refresh_apicv_exec_ctrl = vmx_refresh_apicv_exec_ctrl, - .load_eoi_exitmap = vmx_load_eoi_exitmap, - .apicv_post_state_restore = vmx_apicv_post_state_restore, - .check_apicv_inhibit_reasons = vmx_check_apicv_inhibit_reasons, - .hwapic_irr_update = vmx_hwapic_irr_update, - .hwapic_isr_update = vmx_hwapic_isr_update, - .guest_apic_has_interrupt = vmx_guest_apic_has_interrupt, - .sync_pir_to_irr = vmx_sync_pir_to_irr, - .deliver_interrupt = vmx_deliver_interrupt, - .dy_apicv_has_pending_interrupt = pi_has_pending_interrupt, - - .set_tss_addr = vmx_set_tss_addr, - .set_identity_map_addr = vmx_set_identity_map_addr, - .get_mt_mask = vmx_get_mt_mask, - - .get_exit_info = vmx_get_exit_info, - - .vcpu_after_set_cpuid = vmx_vcpu_after_set_cpuid, - - .has_wbinvd_exit = cpu_has_vmx_wbinvd_exit, - - .get_l2_tsc_offset = vmx_get_l2_tsc_offset, - .get_l2_tsc_multiplier = vmx_get_l2_tsc_multiplier, - .write_tsc_offset = vmx_write_tsc_offset, - .write_tsc_multiplier = vmx_write_tsc_multiplier, - - .load_mmu_pgd = vmx_load_mmu_pgd, - - .check_intercept = vmx_check_intercept, - .handle_exit_irqoff = vmx_handle_exit_irqoff, - - .request_immediate_exit = vmx_request_immediate_exit, - - .sched_in = vmx_sched_in, - - .cpu_dirty_log_size = PML_ENTITY_NUM, - .update_cpu_dirty_logging = vmx_update_cpu_dirty_logging, - - .nested_ops = &vmx_nested_ops, - - .pi_update_irte = vmx_pi_update_irte, - .pi_start_assignment = vmx_pi_start_assignment, - -#ifdef CONFIG_X86_64 - .set_hv_timer = 
vmx_set_hv_timer, - .cancel_hv_timer = vmx_cancel_hv_timer, -#endif - - .setup_mce = vmx_setup_mce, - - .smi_allowed = vmx_smi_allowed, - .enter_smm = vmx_enter_smm, - .leave_smm = vmx_leave_smm, - .enable_smi_window = vmx_enable_smi_window, - - .can_emulate_instruction = vmx_can_emulate_instruction, - .apic_init_signal_blocked = vmx_apic_init_signal_blocked, - .migrate_timers = vmx_migrate_timers, - - .msr_filter_changed = vmx_msr_filter_changed, - .complete_emulated_msr = kvm_complete_insn_gp, - - .vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector, -}; - static unsigned int vmx_handle_intel_pt_intr(void) { struct kvm_vcpu *vcpu = kvm_get_running_vcpu(); @@ -8206,9 +8065,7 @@ static void __init vmx_setup_me_spte_mask(void) kvm_mmu_set_me_spte_mask(0, me_mask); } -static struct kvm_x86_init_ops vmx_init_ops __initdata; - -static __init int hardware_setup(void) +__init int vmx_hardware_setup(void) { unsigned long host_bndcfgs; struct desc_ptr dt; @@ -8272,16 +8129,16 @@ static __init int hardware_setup(void) * using the APIC_ACCESS_ADDR VMCS field. 
*/ if (!flexpriority_enabled) - vmx_x86_ops.set_apic_access_page_addr = NULL; + vt_x86_ops.set_apic_access_page_addr = NULL; if (!cpu_has_vmx_tpr_shadow()) - vmx_x86_ops.update_cr8_intercept = NULL; + vt_x86_ops.update_cr8_intercept = NULL; #if IS_ENABLED(CONFIG_HYPERV) if (ms_hyperv.nested_features & HV_X64_NESTED_GUEST_MAPPING_FLUSH && enable_ept) { - vmx_x86_ops.tlb_remote_flush = hv_remote_flush_tlb; - vmx_x86_ops.tlb_remote_flush_with_range = + vt_x86_ops.tlb_remote_flush = hv_remote_flush_tlb; + vt_x86_ops.tlb_remote_flush_with_range = hv_remote_flush_tlb_with_range; } #endif @@ -8297,7 +8154,7 @@ static __init int hardware_setup(void) if (!cpu_has_vmx_apicv()) enable_apicv = 0; if (!enable_apicv) - vmx_x86_ops.sync_pir_to_irr = NULL; + vt_x86_ops.sync_pir_to_irr = NULL; if (!enable_apicv || !cpu_has_vmx_ipiv()) enable_ipiv = false; @@ -8333,7 +8190,7 @@ static __init int hardware_setup(void) enable_pml = 0; if (!enable_pml) - vmx_x86_ops.cpu_dirty_log_size = 0; + vt_x86_ops.cpu_dirty_log_size = 0; if (!cpu_has_vmx_preemption_timer()) enable_preemption_timer = false; @@ -8358,9 +8215,9 @@ static __init int hardware_setup(void) } if (!enable_preemption_timer) { - vmx_x86_ops.set_hv_timer = NULL; - vmx_x86_ops.cancel_hv_timer = NULL; - vmx_x86_ops.request_immediate_exit = __kvm_request_immediate_exit; + vt_x86_ops.set_hv_timer = NULL; + vt_x86_ops.cancel_hv_timer = NULL; + vt_x86_ops.request_immediate_exit = __kvm_request_immediate_exit; } kvm_caps.supported_mce_cap |= MCG_LMCE_P; @@ -8371,9 +8228,9 @@ static __init int hardware_setup(void) if (!enable_ept || !enable_pmu || !cpu_has_vmx_intel_pt()) pt_mode = PT_MODE_SYSTEM; if (pt_mode == PT_MODE_HOST_GUEST) - vmx_init_ops.handle_intel_pt_intr = vmx_handle_intel_pt_intr; + vt_init_ops.handle_intel_pt_intr = vmx_handle_intel_pt_intr; else - vmx_init_ops.handle_intel_pt_intr = NULL; + vt_init_ops.handle_intel_pt_intr = NULL; setup_default_sgx_lepubkeyhash(); @@ -8396,16 +8253,6 @@ static __init int 
hardware_setup(void) return r; } -static struct kvm_x86_init_ops vmx_init_ops __initdata = { - .cpu_has_kvm_support = cpu_has_kvm_support, - .disabled_by_bios = vmx_disabled_by_bios, - .hardware_setup = hardware_setup, - .handle_intel_pt_intr = NULL, - - .runtime_ops = &vmx_x86_ops, - .pmu_ops = &intel_pmu_ops, -}; - static void vmx_cleanup_l1d_flush(void) { if (vmx_l1d_flush_pages) { @@ -8483,7 +8330,7 @@ static int __init vmx_init(void) } if (ms_hyperv.nested_features & HV_X64_NESTED_DIRECT_FLUSH) - vmx_x86_ops.enable_direct_tlbflush + vt_x86_ops.enable_direct_tlbflush = hv_enable_direct_tlbflush; } else { @@ -8491,8 +8338,8 @@ static int __init vmx_init(void) } #endif - r = kvm_init(&vmx_init_ops, sizeof(struct vcpu_vmx), - __alignof__(struct vcpu_vmx), THIS_MODULE); + r = kvm_init(&vt_init_ops, sizeof(struct vcpu_vmx), + __alignof__(struct vcpu_vmx), THIS_MODULE); if (r) return r; diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h new file mode 100644 index 000000000000..8cc2182fc6d7 --- /dev/null +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -0,0 +1,125 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __KVM_X86_VMX_X86_OPS_H +#define __KVM_X86_VMX_X86_OPS_H + +#include + +#include + +#include "x86.h" + +__init int vmx_cpu_has_kvm_support(void); +__init int vmx_disabled_by_bios(void); +__init int vmx_hardware_setup(void); + +extern struct kvm_x86_ops vt_x86_ops __initdata; +extern struct kvm_x86_init_ops vt_init_ops __initdata; + +void vmx_hardware_unsetup(void); +int vmx_check_processor_compatibility(void); +int vmx_hardware_enable(void); +void vmx_hardware_disable(void); +int vmx_vm_init(struct kvm *kvm); +void vmx_vm_destroy(struct kvm *kvm); +int vmx_vcpu_precreate(struct kvm *kvm); +int vmx_vcpu_create(struct kvm_vcpu *vcpu); +int vmx_vcpu_pre_run(struct kvm_vcpu *vcpu); +fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu); +void vmx_vcpu_free(struct kvm_vcpu *vcpu); +void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event); +void 
vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu); +void vmx_vcpu_put(struct kvm_vcpu *vcpu); +int vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath); +void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu); +int vmx_skip_emulated_instruction(struct kvm_vcpu *vcpu); +void vmx_update_emulated_instruction(struct kvm_vcpu *vcpu); +int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info); +int vmx_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection); +int vmx_enter_smm(struct kvm_vcpu *vcpu, char *smstate); +int vmx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate); +void vmx_enable_smi_window(struct kvm_vcpu *vcpu); +bool vmx_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type, + void *insn, int insn_len); +int vmx_check_intercept(struct kvm_vcpu *vcpu, + struct x86_instruction_info *info, + enum x86_intercept_stage stage, + struct x86_exception *exception); +bool vmx_apic_init_signal_blocked(struct kvm_vcpu *vcpu); +void vmx_migrate_timers(struct kvm_vcpu *vcpu); +void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu); +void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu); +bool vmx_check_apicv_inhibit_reasons(enum kvm_apicv_inhibit reason); +void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr); +void vmx_hwapic_isr_update(int max_isr); +bool vmx_guest_apic_has_interrupt(struct kvm_vcpu *vcpu); +int vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu); +void vmx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode, + int trig_mode, int vector); +void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu); +bool vmx_has_emulated_msr(struct kvm *kvm, u32 index); +void vmx_msr_filter_changed(struct kvm_vcpu *vcpu); +void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu); +void vmx_update_exception_bitmap(struct kvm_vcpu *vcpu); +int vmx_get_msr_feature(struct kvm_msr_entry *msr); +int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info); +u64 vmx_get_segment_base(struct kvm_vcpu *vcpu, int seg); +void vmx_get_segment(struct 
kvm_vcpu *vcpu, struct kvm_segment *var, int seg); +void vmx_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); +int vmx_get_cpl(struct kvm_vcpu *vcpu); +void vmx_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l); +void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0); +void vmx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int root_level); +void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4); +bool vmx_is_valid_cr4(struct kvm_vcpu *vcpu, unsigned long cr4); +int vmx_set_efer(struct kvm_vcpu *vcpu, u64 efer); +void vmx_get_idt(struct kvm_vcpu *vcpu, struct desc_ptr *dt); +void vmx_set_idt(struct kvm_vcpu *vcpu, struct desc_ptr *dt); +void vmx_get_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt); +void vmx_set_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt); +void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val); +void vmx_sync_dirty_debug_regs(struct kvm_vcpu *vcpu); +void vmx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg); +unsigned long vmx_get_rflags(struct kvm_vcpu *vcpu); +void vmx_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags); +bool vmx_get_if_flag(struct kvm_vcpu *vcpu); +void vmx_flush_tlb_all(struct kvm_vcpu *vcpu); +void vmx_flush_tlb_current(struct kvm_vcpu *vcpu); +void vmx_flush_tlb_gva(struct kvm_vcpu *vcpu, gva_t addr); +void vmx_flush_tlb_guest(struct kvm_vcpu *vcpu); +void vmx_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask); +u32 vmx_get_interrupt_shadow(struct kvm_vcpu *vcpu); +void vmx_patch_hypercall(struct kvm_vcpu *vcpu, unsigned char *hypercall); +void vmx_inject_irq(struct kvm_vcpu *vcpu, bool reinjected); +void vmx_inject_nmi(struct kvm_vcpu *vcpu); +void vmx_inject_exception(struct kvm_vcpu *vcpu); +void vmx_cancel_injection(struct kvm_vcpu *vcpu); +int vmx_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection); +int vmx_nmi_allowed(struct kvm_vcpu *vcpu, bool for_injection); +bool vmx_get_nmi_mask(struct kvm_vcpu *vcpu); +void vmx_set_nmi_mask(struct kvm_vcpu *vcpu, 
bool masked); +void vmx_enable_nmi_window(struct kvm_vcpu *vcpu); +void vmx_enable_irq_window(struct kvm_vcpu *vcpu); +void vmx_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr); +void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu); +void vmx_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu); +void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap); +int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr); +int vmx_set_identity_map_addr(struct kvm *kvm, u64 ident_addr); +u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio); +void vmx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason, + u64 *info1, u64 *info2, u32 *intr_info, u32 *error_code); +u64 vmx_get_l2_tsc_offset(struct kvm_vcpu *vcpu); +u64 vmx_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu); +void vmx_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset); +void vmx_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier); +void vmx_request_immediate_exit(struct kvm_vcpu *vcpu); +void vmx_sched_in(struct kvm_vcpu *vcpu, int cpu); +void vmx_update_cpu_dirty_logging(struct kvm_vcpu *vcpu); +#ifdef CONFIG_X86_64 +int vmx_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc, + bool *expired); +void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu); +#endif +void vmx_setup_mce(struct kvm_vcpu *vcpu); + +#endif /* __KVM_X86_VMX_X86_OPS_H */ From patchwork Sun Oct 30 06:22:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12825 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1665080wru; Sat, 29 Oct 2022 23:24:54 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4M8qUqmvawZYeK1Un9fmAFnS3Q7SE6kMqIUdDOZbG4uqspbCEbEEQqrFGgdOyRuKmKhA9V X-Received: by 2002:a17:906:edc2:b0:7ad:9f04:1c15 with SMTP id sb2-20020a170906edc200b007ad9f041c15mr6690930ejb.559.1667111094317; Sat, 29 Oct 2022 23:24:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; 
t=1667111094; cv=none; d=google.com; s=arc-20160816; b=rcoDNhxmIl9fqZ3YS6rwKYurZ4LS49tcwZXh55U+0ZGBKMiXC8/S5Hmsbhx3Ph1hvq cdQDwzXVK/fcEii1EO1aXRwYgwYsvaw8evxpEzw+dt/HXMdzsM2CYT5fpjJqjYS7PQOT lIECGYIfsjp82wnq9ZlfivZ/Iks33cfrTVLB5jH4B36nnaMBQiKGwnZ9YqrpnU2qsp7W a7tZ9wjPQEuwwkfBFML7tOlOQSsX2ouvS3LruOTtyCMQ4EQNTdnbGnmtqBDQLYE55ERB 3OYY7qaBVVueIvfbIStNo/JGLGxQIZOAsNhjLACRp5ef/AmQitqBP4BgjTSn0Zk0KpzN ITrQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=km+dHvq6aDrzSIdLPU0jgcC/TCpCAPgPJa4ywrHqfys=; b=QH5+ISAT3xzMvikkBEwUKgTeobqfGg1smdCb3GKldPXusmtvFfaBgPmq6oM5qnCJmL Ttio3hnf59nUqask/tqWA7INsfcBGM2+Hqdy/JAWHB/Pk24CyykWZBQ6pWgdPXyHnrqb pCG0WDXO15jYcK9cnxWd/moDriy78NslmTPPxOPRa+AWe7rDjl2ipVPhWaUWJINKS2z5 PKC9QF7LYvbbQs/EtT1HHc0alDXPrRyL0iIjY7ebieblu5VxWdS6WdcYQhdsdanBv+ai Le1Vhzb3Sdrajus9JQ7vJxbcb8f1funp1Vd8z0rTDQh0vswY1iosebIlQa4Vl2zYIxoy +XjQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=FXn13cgl; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id mp22-20020a1709071b1600b007a6843971c0si4638852ejc.190.2022.10.29.23.24.30; Sat, 29 Oct 2022 23:24:54 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=FXn13cgl; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229770AbiJ3GYD (ORCPT + 99 others); Sun, 30 Oct 2022 02:24:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46816 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229549AbiJ3GYA (ORCPT ); Sun, 30 Oct 2022 02:24:00 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F2B5AA8; Sat, 29 Oct 2022 23:23:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111038; x=1698647038; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mIDDpOSSLkQ6ZJq6nWQZVkipSj1876U7znT7QCkvz98=; b=FXn13cgljFfbzCEySnSIGmfcIbbFaX3FbSJrmf6WA+yeORLwrdTkVb4V 5Hf6uwiDcO1Ed2kxa22BvToXBZKR8e0HaegIkDBfsriz5IRuQAM1UwzfJ RQYJKeuZ5bVOyUefM6OQCsP1W4b3Pw12gKKKFMLkQ2fCGFuGEUAci89na vuzXs7yxHk9QP3w1heUyOI2+kTzhe8PiwXlW6qsPFPU0033XIc8IwnIq1 eIG0kk5Uenbu3nusBbbLIS0i1HmAoW6OkyXtc8x0/4ONOmgA7+bfUImS0 gyFE5YO6uQYZfxyuXhZWQeqyXu1uL15N8nQubU5b2u9K5+Qd/Kxm6rk5n w==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037111" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037111" Received: from fmsmga006.fm.intel.com 
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:23:56 -0700
X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392833"
X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392833"
Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:23:56 -0700
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 002/108] KVM: x86: Refactor KVM VMX module init/exit functions
Date: Sat, 29 Oct 2022 23:22:03 -0700
Message-Id:
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
MIME-Version: 1.0
X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092683263034647?=
X-GMAIL-MSGID: =?utf-8?q?1748092683263034647?=

From: Isaku Yamahata

Currently, the KVM VMX module initialization and exit paths are each a
single function. Refactor the module initialization function into a KVM
common part and a VMX part so that the TDX specific part can be added
cleanly. Opportunistically refactor the module exit function as well.

The current module initialization flow is:
1.) calculate the sizes of the VMX kvm structure and the VMX vcpu structure,
2.) perform Hyper-V specific initialization,
3.) report those sizes to the KVM common layer and run the KVM common
    initialization, and
4.) perform VMX specific system-wide initialization.

Refactor the KVM VMX module initialization function into smaller
functions called from a wrapper, so that the VMX specific logic stays in
vmx.c and the logic common to VMX and TDX lives in main.c. The wrapper,
vt_init() in main.c, is

  vt_init() {
          VMX kvm/vcpu size calculation;
          hv_vp_assist_page_init();
          kvm_init();
          vmx_init();
  }

hv_vp_assist_page_init() and vmx_init() are defined in vmx.c.
hv_vp_assist_page_init() initializes the Hyper-V specific assist pages,
kvm_init() does the system-wide initialization of the KVM common layer,
and vmx_init() does the system-wide VMX initialization.

The KVM architecture common layer allocates struct kvm using the size
reported by the architecture-specific code. The KVM VMX module defines
its structure as struct kvm_vmx { struct kvm; VMX specific members; }
and uses it as its VM structure. The vcpu structure is handled
similarly. The TDX KVM patches will define TDX specific kvm and vcpu
structures and add tdx_pre_kvm_init() to report their sizes to the KVM
common layer.

The current module exit function is also a single function that combines
VMX specific logic and common KVM logic. Refactor it into a VMX specific
part and a KVM common part. This is only a refactoring that separates
the VMX specific logic in vmx.c from main.c.

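The size-reporting scheme described above can be sketched in plain
userspace C. This is a minimal illustration under stated assumptions,
not the KVM code: struct vm_common, struct vm_backend, vm_alloc() and
to_vm_backend() are hypothetical stand-ins for struct kvm, struct
kvm_vmx, the common-layer allocation and to_kvm_vmx() respectively.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the structure owned by the common layer. */
struct vm_common {
	int id;
};

/* Hypothetical backend structure: common part first, then private state. */
struct vm_backend {
	struct vm_common common;
	int backend_state;
};

/*
 * The common layer hands back a pointer to the start of the allocation,
 * so this sketch requires the embedded common part to sit at offset 0.
 */
static_assert(offsetof(struct vm_backend, common) == 0,
	      "common part must be the first member");

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Common layer: allocates whatever size the backend reported and only
 * touches the common fields. It never sees struct vm_backend.
 */
struct vm_common *vm_alloc(size_t vm_size, int id)
{
	struct vm_common *vm = calloc(1, vm_size);

	if (vm)
		vm->id = id;
	return vm;
}

/* Backend: convert the common pointer back to its own structure. */
struct vm_backend *to_vm_backend(struct vm_common *vm)
{
	return container_of(vm, struct vm_backend, common);
}
```

A backend would call vm_alloc(sizeof(struct vm_backend), id) and later
use to_vm_backend() to reach its private members. A second backend with
a larger structure only needs to report a different size, which is the
property the commit message relies on for TDX.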
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/main.c    | 37 +++++++++++++++
 arch/x86/kvm/vmx/vmx.c     | 95 ++++++++++++++++++--------------------
 arch/x86/kvm/vmx/x86_ops.h |  5 ++
 3 files changed, 88 insertions(+), 49 deletions(-)

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 381059631e4b..4c7e71ec3e1d 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -153,3 +153,40 @@ struct kvm_x86_init_ops vt_init_ops __initdata = {
 	.runtime_ops = &vt_x86_ops,
 	.pmu_ops = &intel_pmu_ops,
 };
+
+static int __init vt_init(void)
+{
+	unsigned int vcpu_size, vcpu_align;
+	int r;
+
+	vt_x86_ops.vm_size = sizeof(struct kvm_vmx);
+	vcpu_size = sizeof(struct vcpu_vmx);
+	vcpu_align = __alignof__(struct vcpu_vmx);
+
+	hv_vp_assist_page_init();
+
+	r = kvm_init(&vt_init_ops, vcpu_size, vcpu_align, THIS_MODULE);
+	if (r)
+		goto err_vmx_post_exit;
+
+	r = vmx_init();
+	if (r)
+		goto err_kvm_exit;
+
+	return 0;
+
+err_kvm_exit:
+	kvm_exit();
+err_vmx_post_exit:
+	hv_vp_assist_page_exit();
+	return r;
+}
+module_init(vt_init);
+
+static void vt_exit(void)
+{
+	vmx_exit();
+	kvm_exit();
+	hv_vp_assist_page_exit();
+}
+module_exit(vt_exit);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 0080d88ded20..1963b28a2ea5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8263,48 +8263,8 @@ static void vmx_cleanup_l1d_flush(void)
 	l1tf_vmx_mitigation = VMENTER_L1D_FLUSH_AUTO;
 }
 
-static void vmx_exit(void)
+void __init hv_vp_assist_page_init(void)
 {
-#ifdef CONFIG_KEXEC_CORE
-	RCU_INIT_POINTER(crash_vmclear_loaded_vmcss, NULL);
-	synchronize_rcu();
-#endif
-
-	kvm_exit();
-
-#if IS_ENABLED(CONFIG_HYPERV)
-	if (static_branch_unlikely(&enable_evmcs)) {
-		int cpu;
-		struct hv_vp_assist_page *vp_ap;
-		/*
-		 * Reset everything to support using non-enlightened VMCS
-		 * access later (e.g. when we reload the module with
-		 * enlightened_vmcs=0)
-		 */
-		for_each_online_cpu(cpu) {
-			vp_ap = hv_get_vp_assist_page(cpu);
-
-			if (!vp_ap)
-				continue;
-
-			vp_ap->nested_control.features.directhypercall = 0;
-			vp_ap->current_nested_vmcs = 0;
-			vp_ap->enlighten_vmentry = 0;
-		}
-
-		static_branch_disable(&enable_evmcs);
-	}
-#endif
-	vmx_cleanup_l1d_flush();
-
-	allow_smaller_maxphyaddr = false;
-}
-module_exit(vmx_exit);
-
-static int __init vmx_init(void)
-{
-	int r, cpu;
-
 #if IS_ENABLED(CONFIG_HYPERV)
 	/*
 	 * Enlightened VMCS usage should be recommended and the host needs
@@ -8315,6 +8275,7 @@ static int __init vmx_init(void)
 	    ms_hyperv.hints & HV_X64_ENLIGHTENED_VMCS_RECOMMENDED &&
 	    (ms_hyperv.nested_features & HV_X64_ENLIGHTENED_VMCS_VERSION) >=
 	    KVM_EVMCS_VERSION) {
+		int cpu;
 
 		/* Check that we have assist pages on all online CPUs */
 		for_each_online_cpu(cpu) {
@@ -8337,11 +8298,38 @@ static int __init vmx_init(void)
 		enlightened_vmcs = false;
 	}
 #endif
+}
 
-	r = kvm_init(&vt_init_ops, sizeof(struct vcpu_vmx),
-		     __alignof__(struct vcpu_vmx), THIS_MODULE);
-	if (r)
-		return r;
+void hv_vp_assist_page_exit(void)
+{
+#if IS_ENABLED(CONFIG_HYPERV)
+	if (static_branch_unlikely(&enable_evmcs)) {
+		int cpu;
+		struct hv_vp_assist_page *vp_ap;
+		/*
+		 * Reset everything to support using non-enlightened VMCS
+		 * access later (e.g. when we reload the module with
+		 * enlightened_vmcs=0)
+		 */
+		for_each_online_cpu(cpu) {
+			vp_ap = hv_get_vp_assist_page(cpu);
+
+			if (!vp_ap)
+				continue;
+
+			vp_ap->nested_control.features.directhypercall = 0;
+			vp_ap->current_nested_vmcs = 0;
+			vp_ap->enlighten_vmentry = 0;
+		}
+
+		static_branch_disable(&enable_evmcs);
+	}
+#endif
+}
+
+int __init vmx_init(void)
+{
+	int r, cpu;
 
 	/*
 	 * Must be called after kvm_init() so enable_ept is properly set
@@ -8351,10 +8339,8 @@ static int __init vmx_init(void)
 	 * mitigation mode.
 	 */
 	r = vmx_setup_l1d_flush(vmentry_l1d_flush_param);
-	if (r) {
-		vmx_exit();
+	if (r)
 		return r;
-	}
 
 	vmx_setup_fb_clear_ctrl();
 
@@ -8380,4 +8366,15 @@ static int __init vmx_init(void)
 
 	return 0;
 }
-module_init(vmx_init);
+
+void vmx_exit(void)
+{
+#ifdef CONFIG_KEXEC_CORE
+	RCU_INIT_POINTER(crash_vmclear_loaded_vmcss, NULL);
+	synchronize_rcu();
+#endif
+
+	vmx_cleanup_l1d_flush();
+
+	allow_smaller_maxphyaddr = false;
+}
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 8cc2182fc6d7..83efb7f92a58 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -8,6 +8,11 @@
 
 #include "x86.h"
 
+void __init hv_vp_assist_page_init(void);
+void hv_vp_assist_page_exit(void);
+int __init vmx_init(void);
+void vmx_exit(void);
+
 __init int vmx_cpu_has_kvm_support(void);
 __init int vmx_disabled_by_bios(void);
 __init int vmx_hardware_setup(void);

From patchwork Sun Oct 30 06:22:04 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12826
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1665090wru; Sat, 29 Oct 2022 23:24:57 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM63iIBi/Xer3icbEDQfovDrVT9CGKdbwaD1NqM1SlBpVHltvt5TqGDJRyCnE3ryKPJ+/O5k
X-Received: by 2002:a17:906:1c0e:b0:7ad:c648:a4af with SMTP id k14-20020a1709061c0e00b007adc648a4afmr1367120ejg.277.1667111097635; Sat, 29 Oct 2022 23:24:57 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1667111097; cv=none; d=google.com; s=arc-20160816; b=Sou+oGaOfrIcyfAyAagZfDNR+ItgLXuXyGNdK7YD3qDaNg+SHcDjCT+lXzAUU5asDS yez6F8HnGgM48H3s0zbt85DC6Gif2pjSuB1olIjTR2ejw/2TcwYub5egNCoQOnd1rrgi EZ6BynJiIgfqjs586PrFvQxmxuzKeDYvumnI7g5Rrr+6JjxG+cO7yTTR3i0FbOEF/5QH pbF7+SADxFtQ9CQTscPaSa/2sEb+ahpt1cH2W/lhOQPnnGzsSoOVCXPcZmeihQbbLeeq tq1s+GDOn02mBRgrrlgaavKNu5A/h8o+N1d3hIv/Tttnru9QTwbHSJcSK8Q2+sGnMUPU 3tAg==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed;
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 003/108] KVM: TDX: Add placeholders for TDX VM/vcpu structure
Date: Sat, 29 Oct 2022 23:22:04 -0700
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0

From: Isaku Yamahata

Add placeholder TDX VM/vcpu structures that overlay the VMX VM/vcpu
structures.  Initialize the VM structure size and the vcpu size/alignment
so that common x86 KVM code knows those sizes irrespective of VMX or TDX.
The structures will be populated as the guest creation logic develops.

Add helper functions to check whether a VM is a guest TD, and add
conversion functions between the KVM VM/vcpu and the TDX VM/vcpu.

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/main.c |  8 +++---
 arch/x86/kvm/vmx/tdx.h  | 54 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+), 3 deletions(-)
 create mode 100644 arch/x86/kvm/vmx/tdx.h

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 4c7e71ec3e1d..a7d4af73228e 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -5,6 +5,7 @@
 #include "vmx.h"
 #include "nested.h"
 #include "pmu.h"
+#include "tdx.h"

 struct kvm_x86_ops vt_x86_ops __initdata = {
 	.name = "kvm_intel",
@@ -159,9 +160,10 @@ static int __init vt_init(void)
 	unsigned int vcpu_size, vcpu_align;
 	int r;

-	vt_x86_ops.vm_size = sizeof(struct kvm_vmx);
-	vcpu_size = sizeof(struct vcpu_vmx);
-	vcpu_align = __alignof__(struct vcpu_vmx);
+	vt_x86_ops.vm_size = max(sizeof(struct kvm_vmx), sizeof(struct kvm_tdx));
+	vcpu_size = max(sizeof(struct vcpu_vmx), sizeof(struct vcpu_tdx));
+	vcpu_align = max(__alignof__(struct vcpu_vmx),
+			 __alignof__(struct vcpu_tdx));

 	hv_vp_assist_page_init();

diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
new file mode 100644
index 000000000000..060bf48ec3d6
--- /dev/null
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __KVM_X86_TDX_H
+#define __KVM_X86_TDX_H
+
+#ifdef CONFIG_INTEL_TDX_HOST
+struct kvm_tdx {
+	struct kvm kvm;
+	/* TDX specific members follow. */
+};
+
+struct vcpu_tdx {
+	struct kvm_vcpu	vcpu;
+	/* TDX specific members follow. */
+};
+
+static inline bool is_td(struct kvm *kvm)
+{
+	/*
+	 * TDX VM type isn't defined yet.
+	 * return kvm->arch.vm_type == KVM_X86_TDX_VM;
+	 */
+	return false;
+}
+
+static inline bool is_td_vcpu(struct kvm_vcpu *vcpu)
+{
+	return is_td(vcpu->kvm);
+}
+
+static inline struct kvm_tdx *to_kvm_tdx(struct kvm *kvm)
+{
+	return container_of(kvm, struct kvm_tdx, kvm);
+}
+
+static inline struct vcpu_tdx *to_tdx(struct kvm_vcpu *vcpu)
+{
+	return container_of(vcpu, struct vcpu_tdx, vcpu);
+}
+#else
+struct kvm_tdx {
+	struct kvm kvm;
+};
+
+struct vcpu_tdx {
+	struct kvm_vcpu	vcpu;
+};
+
+static inline bool is_td(struct kvm *kvm) { return false; }
+static inline bool is_td_vcpu(struct kvm_vcpu *vcpu) { return false; }
+static inline struct kvm_tdx *to_kvm_tdx(struct kvm *kvm) { return NULL; }
+static inline struct vcpu_tdx *to_tdx(struct kvm_vcpu *vcpu) { return NULL; }
+#endif /* CONFIG_INTEL_TDX_HOST */
+
+#endif /* __KVM_X86_TDX_H */

From patchwork Sun Oct 30 06:22:05 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12827
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 004/108] x86/virt/tdx: Add a helper function to return system wide info about TDX module
Date: Sat, 29 Oct 2022 23:22:05 -0700
Message-Id: <1bae1243e67ed05e3eb7c211dc0ced2e9645c8b6.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0

From: Isaku Yamahata

TDX KVM needs system-wide information about the TDX module, struct
tdsysinfo_struct.  Add a helper function, tdx_get_sysinfo(), that returns
it, so that KVM doesn't have to obtain it itself with various error checks.
Move the struct definition to the common header arch/x86/include/asm/tdx.h.

Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/tdx.h  | 55 +++++++++++++++++++++++++++++++++++++
 arch/x86/virt/vmx/tdx/tdx.c | 16 +++++++++--
 arch/x86/virt/vmx/tdx/tdx.h | 52 -----------------------------------
 3 files changed, 69 insertions(+), 54 deletions(-)

diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index d568f17da742..5cff7ed5b11e 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -131,9 +131,64 @@ static inline long tdx_kvm_hypercall(unsigned int nr, unsigned long p1,
 #endif /* CONFIG_INTEL_TDX_GUEST && CONFIG_KVM_GUEST */

 #ifdef CONFIG_INTEL_TDX_HOST
+struct tdx_cpuid_config {
+	u32	leaf;
+	u32	sub_leaf;
+	u32	eax;
+	u32	ebx;
+	u32	ecx;
+	u32	edx;
+} __packed;
+
+#define TDSYSINFO_STRUCT_SIZE		1024
+#define TDSYSINFO_STRUCT_ALIGNMENT	1024
+
+struct tdsysinfo_struct {
+	/* TDX-SEAM Module Info */
+	u32	attributes;
+	u32	vendor_id;
+	u32	build_date;
+	u16	build_num;
+	u16	minor_version;
+	u16	major_version;
+	u8	reserved0[14];
+	/* Memory Info */
+	u16	max_tdmrs;
+	u16	max_reserved_per_tdmr;
+	u16	pamt_entry_size;
+	u8	reserved1[10];
+	/* Control Struct Info */
+	u16	tdcs_base_size;
+	u8	reserved2[2];
+	u16	tdvps_base_size;
+	u8	tdvps_xfam_dependent_size;
+	u8	reserved3[9];
+	/* TD Capabilities */
+	u64	attributes_fixed0;
+	u64	attributes_fixed1;
+	u64	xfam_fixed0;
+	u64	xfam_fixed1;
+	u8	reserved4[32];
+	u32	num_cpuid_config;
+	/*
+	 * The actual number of CPUID_CONFIG depends on above
+	 * 'num_cpuid_config'.  The size of 'struct tdsysinfo_struct'
+	 * is 1024B defined by TDX architecture.  Use a union with
+	 * specific padding to make 'sizeof(struct tdsysinfo_struct)'
+	 * equal to 1024.
+	 */
+	union {
+		struct tdx_cpuid_config	cpuid_configs[0];
+		u8			reserved5[892];
+	};
+} __packed __aligned(TDSYSINFO_STRUCT_ALIGNMENT);
+
+const struct tdsysinfo_struct *tdx_get_sysinfo(void);
 bool platform_tdx_enabled(void);
 int tdx_enable(void);
 #else	/* !CONFIG_INTEL_TDX_HOST */
+struct tdsysinfo_struct;
+static inline const struct tdsysinfo_struct *tdx_get_sysinfo(void) { return NULL; }
 static inline bool platform_tdx_enabled(void) { return false; }
 static inline int tdx_enable(void)  { return -ENODEV; }
 #endif	/* CONFIG_INTEL_TDX_HOST */
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 68ec1ebecb49..6fb630fa7d09 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -503,7 +503,7 @@ static int check_cmrs(struct cmr_info *cmr_array, int *actual_cmr_num)
 	return 0;
 }

-static int tdx_get_sysinfo(void)
+static int __tdx_get_sysinfo(void)
 {
 	struct tdx_module_output out;
 	int ret;
@@ -530,6 +530,18 @@ static int tdx_get_sysinfo(void)
 	return check_cmrs(tdx_cmr_array, &tdx_cmr_num);
 }

+const struct tdsysinfo_struct *tdx_get_sysinfo(void)
+{
+	const struct tdsysinfo_struct *r = NULL;
+
+	mutex_lock(&tdx_module_lock);
+	if (tdx_module_status == TDX_MODULE_INITIALIZED)
+		r = &tdx_sysinfo;
+	mutex_unlock(&tdx_module_lock);
+	return r;
+}
+EXPORT_SYMBOL_GPL(tdx_get_sysinfo);
+
 /* Check whether the first range is the subrange of the second */
 static bool is_subrange(u64 r1_start, u64 r1_end, u64 r2_start, u64 r2_end)
 {
@@ -1238,7 +1250,7 @@ static int init_tdx_module(void)
 	if (ret)
 		goto out;

-	ret = tdx_get_sysinfo();
+	ret = __tdx_get_sysinfo();
 	if (ret)
 		goto out;

diff --git a/arch/x86/virt/vmx/tdx/tdx.h b/arch/x86/virt/vmx/tdx/tdx.h
index 891691b1ea50..5ce3bd38ce08 100644
--- a/arch/x86/virt/vmx/tdx/tdx.h
+++ b/arch/x86/virt/vmx/tdx/tdx.h
@@ -31,58 +31,6 @@ struct cmr_info {
 #define MAX_CMRS			32
 #define CMR_INFO_ARRAY_ALIGNMENT	512

-struct cpuid_config {
-	u32	leaf;
-	u32	sub_leaf;
-	u32	eax;
-	u32	ebx;
-	u32	ecx;
-	u32	edx;
-} __packed;
-
-#define TDSYSINFO_STRUCT_SIZE		1024
-#define TDSYSINFO_STRUCT_ALIGNMENT	1024
-
-struct tdsysinfo_struct {
-	/* TDX-SEAM Module Info */
-	u32	attributes;
-	u32	vendor_id;
-	u32	build_date;
-	u16	build_num;
-	u16	minor_version;
-	u16	major_version;
-	u8	reserved0[14];
-	/* Memory Info */
-	u16	max_tdmrs;
-	u16	max_reserved_per_tdmr;
-	u16	pamt_entry_size;
-	u8	reserved1[10];
-	/* Control Struct Info */
-	u16	tdcs_base_size;
-	u8	reserved2[2];
-	u16	tdvps_base_size;
-	u8	tdvps_xfam_dependent_size;
-	u8	reserved3[9];
-	/* TD Capabilities */
-	u64	attributes_fixed0;
-	u64	attributes_fixed1;
-	u64	xfam_fixed0;
-	u64	xfam_fixed1;
-	u8	reserved4[32];
-	u32	num_cpuid_config;
-	/*
-	 * The actual number of CPUID_CONFIG depends on above
-	 * 'num_cpuid_config'.  The size of 'struct tdsysinfo_struct'
-	 * is 1024B defined by TDX architecture.  Use a union with
-	 * specific padding to make 'sizeof(struct tdsysinfo_struct)'
-	 * equal to 1024.
-	 */
-	union {
-		struct cpuid_config	cpuid_configs[0];
-		u8			reserved5[892];
-	};
-} __packed __aligned(TDSYSINFO_STRUCT_ALIGNMENT);
-
 struct tdmr_reserved_area {
 	u64 offset;
 	u64 size;

From patchwork Sun Oct 30 06:22:06 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12829
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 005/108] KVM: TDX: Initialize the TDX module when loading the KVM intel kernel module
Date: Sat, 29 Oct 2022 23:22:06 -0700
Message-Id: <99e5fcf2a7127347816982355fd4141ee1038a54.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0

From: Isaku Yamahata

TDX requires several initialization steps before KVM can create guest TDs:
detect the CPU feature, enable VMX (TDX is based on VMX), detect the
availability of the TDX module, and initialize it.  This patch implements
those steps.

There are several options for when to initialize the TDX module: A.) at
kernel module loading time, or B.) at the first guest TD creation time.
A.) was chosen.  With B.), a user may hit a TDX initialization error when
trying to create the first guest TD, and a machine that fails to initialize
the TDX module can't boot any guest TD at all.  Such a failure is
undesirable and surprising because the user expects the machine to be able
to accommodate guest TDs when it actually can't.  So A.) is better than B.).

Introduce a module parameter, enable_tdx, to explicitly enable TDX KVM
support.  It's off by default to keep the same behavior for those who don't
use TDX.  Implement the hardware_setup method to detect the TDX feature of
the CPU.  Because TDX requires all present CPUs to enable VMX (VMXON), the
x86-specific kvm_arch_post_hardware_enable_setup overrides the existing
weak symbol of the same name, which is called at KVM module initialization.

Suggested-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/Makefile      |  1 +
 arch/x86/kvm/vmx/main.c    | 18 ++++++-
 arch/x86/kvm/vmx/tdx.c     | 99 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.c     | 39 +++++++++++++++
 arch/x86/kvm/vmx/x86_ops.h |  9 ++++
 arch/x86/kvm/x86.c         | 32 +++++++-----
 6 files changed, 186 insertions(+), 12 deletions(-)
 create mode 100644 arch/x86/kvm/vmx/tdx.c

diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index ee4d0999f20f..e2c05195cb95 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -24,6 +24,7 @@ kvm-$(CONFIG_KVM_XEN)	+= xen.o
 kvm-intel-y		+= vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \
 			   vmx/evmcs.o vmx/nested.o vmx/posted_intr.o vmx/main.o
 kvm-intel-$(CONFIG_X86_SGX_KVM)	+= vmx/sgx.o
+kvm-intel-$(CONFIG_INTEL_TDX_HOST)	+= vmx/tdx.o

 kvm-amd-y		+= svm/svm.o svm/vmenter.o svm/pmu.o svm/nested.o svm/avic.o svm/sev.o

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index a7d4af73228e..a5cbf3ca2055 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -7,6 +7,22 @@
 #include "pmu.h"
 #include "tdx.h"

+static bool __read_mostly enable_tdx = IS_ENABLED(CONFIG_INTEL_TDX_HOST);
+module_param_named(tdx, enable_tdx, bool, 0444);
+
+static __init int vt_hardware_setup(void)
+{
+	int ret;
+
+	ret = vmx_hardware_setup();
+	if (ret)
+		return ret;
+
+	enable_tdx = enable_tdx && !tdx_hardware_setup(&vt_x86_ops);
+
+	return 0;
+}
+
 struct kvm_x86_ops vt_x86_ops __initdata = {
 	.name = "kvm_intel",

@@ -148,7 +164,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 struct kvm_x86_init_ops vt_init_ops __initdata = {
 	.cpu_has_kvm_support = vmx_cpu_has_kvm_support,
 	.disabled_by_bios = vmx_disabled_by_bios,
-	.hardware_setup = vmx_hardware_setup,
+	.hardware_setup = vt_hardware_setup,
 	.handle_intel_pt_intr = NULL,

 	.runtime_ops = &vt_x86_ops,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
new file mode 100644
index 000000000000..6213a5c6b637
--- /dev/null
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -0,0 +1,99 @@
+// SPDX-License-Identifier: GPL-2.0
+#include
+
+#include
+
+#include "capabilities.h"
+#include "x86_ops.h"
+#include "tdx.h"
+#include "x86.h"
+
+#undef pr_fmt
+#define pr_fmt(fmt) "tdx: " fmt
+
+#define TDX_MAX_NR_CPUID_CONFIGS					\
+	((sizeof(struct tdsysinfo_struct) -				\
+		offsetof(struct tdsysinfo_struct, cpuid_configs))	\
+		/ sizeof(struct tdx_cpuid_config))
+
+struct tdx_capabilities {
+	u8 tdcs_nr_pages;
+	u8 tdvpx_nr_pages;
+
+	u64 attrs_fixed0;
+	u64 attrs_fixed1;
+	u64 xfam_fixed0;
+	u64 xfam_fixed1;
+
+	u32 nr_cpuid_configs;
+	struct tdx_cpuid_config cpuid_configs[TDX_MAX_NR_CPUID_CONFIGS];
+};
+
+/* Capabilities of KVM + the TDX module. */
+static struct tdx_capabilities tdx_caps;
+
+static int __init tdx_module_setup(void)
+{
+	const struct tdsysinfo_struct *tdsysinfo;
+	int ret = 0;
+
+	BUILD_BUG_ON(sizeof(*tdsysinfo) != 1024);
+	BUILD_BUG_ON(TDX_MAX_NR_CPUID_CONFIGS != 37);
+
+	ret = tdx_enable();
+	if (ret) {
+		pr_info("Failed to initialize TDX module.\n");
+		return ret;
+	}
+
+	tdsysinfo = tdx_get_sysinfo();
+	if (tdsysinfo->num_cpuid_config > TDX_MAX_NR_CPUID_CONFIGS)
+		return -EIO;
+
+	tdx_caps = (struct tdx_capabilities) {
+		.tdcs_nr_pages = tdsysinfo->tdcs_base_size / PAGE_SIZE,
+		/*
+		 * TDVPS = TDVPR(4K page) + TDVPX(multiple 4K pages).
+		 * -1 for TDVPR.
+		 */
+		.tdvpx_nr_pages = tdsysinfo->tdvps_base_size / PAGE_SIZE - 1,
+		.attrs_fixed0 = tdsysinfo->attributes_fixed0,
+		.attrs_fixed1 = tdsysinfo->attributes_fixed1,
+		.xfam_fixed0 =	tdsysinfo->xfam_fixed0,
+		.xfam_fixed1 = tdsysinfo->xfam_fixed1,
+		.nr_cpuid_configs = tdsysinfo->num_cpuid_config,
+	};
+	if (!memcpy(tdx_caps.cpuid_configs, tdsysinfo->cpuid_configs,
+			tdsysinfo->num_cpuid_config *
+			sizeof(struct tdx_cpuid_config)))
+		return -EIO;
+
+	pr_info("kvm: TDX is supported. x86 phys bits %d\n",
+		boot_cpu_data.x86_phys_bits);
+
+	return 0;
+}
+
+int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops)
+{
+	int r;
+
+	if (!enable_ept) {
+		pr_warn("Cannot enable TDX with EPT disabled\n");
+		return -EINVAL;
+	}
+
+	/* MOVDIR64B instruction is needed. */
+	if (!static_cpu_has(X86_FEATURE_MOVDIR64B)) {
+		pr_warn("Cannot enable TDX without MOVDIR64B support\n");
+		return -ENODEV;
+	}
+
+	/* TDX requires VMX. */
+	r = vmxon_all();
+	if (!r)
+		r = tdx_module_setup();
+	vmxoff_all();
+
+	return r;
+}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1963b28a2ea5..68aef67c5eb7 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2479,6 +2479,35 @@ int vmx_hardware_enable(void)
 	return 0;
 }

+static void __init vmxon(void *arg)
+{
+	int cpu = raw_smp_processor_id();
+	u64 phys_addr = __pa(per_cpu(vmxarea, cpu));
+	atomic_t *failed = arg;
+	int r;
+
+	if (cr4_read_shadow() & X86_CR4_VMXE) {
+		r = -EBUSY;
+		goto out;
+	}
+
+	r = kvm_cpu_vmxon(phys_addr);
+out:
+	if (r)
+		atomic_inc(failed);
+}
+
+int __init vmxon_all(void)
+{
+	atomic_t failed = ATOMIC_INIT(0);
+
+	on_each_cpu(vmxon, &failed, 1);
+
+	if (atomic_read(&failed))
+		return -EBUSY;
+	return 0;
+}
+
 static void vmclear_local_loaded_vmcss(void)
 {
 	int cpu = raw_smp_processor_id();
@@ -2499,6 +2528,16 @@ void vmx_hardware_disable(void)
 	intel_pt_handle_vmx(0);
 }

+static void __init vmxoff(void *junk)
+{
+	cpu_vmxoff();
+}
+
+void __init vmxoff_all(void)
+{
+	on_each_cpu(vmxoff, NULL, 1);
+}
+
 /*
  * There is no X86_FEATURE for SGX yet, but anyway we need to query CPUID
  * directly instead of going through cpu_has(), to ensure KVM is trapping
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 83efb7f92a58..6196d651a00a 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -17,6 +17,9 @@ __init int vmx_cpu_has_kvm_support(void);
 __init int vmx_disabled_by_bios(void);
 __init int vmx_hardware_setup(void);

+int __init vmxon_all(void);
+void __init vmxoff_all(void);
+
 extern struct kvm_x86_ops vt_x86_ops __initdata;
 extern struct kvm_x86_init_ops vt_init_ops __initdata;

@@ -127,4 +130,10 @@ void vmx_cancel_hv_timer(struct kvm_vcpu *vcpu);
 #endif
 void vmx_setup_mce(struct kvm_vcpu *vcpu);

+#ifdef CONFIG_INTEL_TDX_HOST
+int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops);
+#else
+static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return 0; }
+#endif
+
 #endif /* __KVM_X86_VMX_X86_OPS_H */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 769b3a9a3151..715a53d4fc3d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12244,6 +12244,16 @@ static void hardware_enable(void *arg)
 		atomic_inc(failed);
 }

+static int kvm_hardware_enable_all(void)
+{
+	atomic_t failed = ATOMIC_INIT(0);
+
+	on_each_cpu(hardware_enable, &failed, 1);
+	if (atomic_read(&failed))
+		return -EBUSY;
+	return 0;
+}
+
 static void hardware_disable(void *junk)
 {
 	WARN_ON_ONCE(preemptible());
@@ -12251,29 +12261,29 @@ static void hardware_disable(void *junk)
 	drop_user_return_notifiers();
 }

+static void kvm_hardware_disable_all(void)
+{
+	on_each_cpu(hardware_disable, NULL, 1);
+}
+
 /*
  * Called after the VM is otherwise initialized, but just before adding it to
  * the vm_list.
  */
 int kvm_arch_add_vm(struct kvm *kvm, int usage_count)
 {
-	atomic_t failed = ATOMIC_INIT(0);
-	int r = 0;
+	int r;

 	if (usage_count != 1)
 		return kvm_mmu_post_init_vm(kvm);

-	on_each_cpu(hardware_enable, &failed, 1);
-
-	if (atomic_read(&failed)) {
-		r = -EBUSY;
-		goto err;
-	}
+	r = kvm_hardware_enable_all();
+	if (r)
+		return r;

 	r = kvm_mmu_post_init_vm(kvm);
-err:
 	if (r)
-		on_each_cpu(hardware_disable, NULL, 1);
+		kvm_hardware_disable_all();

 	return r;
 }
@@ -12288,7 +12298,7 @@ int kvm_arch_drop_vm(int usage_count)
 	if (usage_count)
 		return 0;

-	on_each_cpu(hardware_disable, NULL, 1);
+	kvm_hardware_disable_all();

 	return 0;
 }

From patchwork Sun Oct 30 06:22:07 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12830
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com,
	Paolo Bonzini, erdemaktas@google.com, Sean Christopherson,
	Sagi Shahar, David Matlack, Sean Christopherson, Xiaoyao Li
Subject: [PATCH v10 006/108] KVM: x86: Introduce vm_type to differentiate default VMs from confidential VMs
Date: Sat, 29 Oct 2022 23:22:07 -0700

From: Sean Christopherson

Unlike default VMs, confidential VMs (Intel TDX and AMD SEV-ES) don't allow
some operations (e.g., memory read/write, register state access, etc.).

Introduce vm_type to x86 KVM to track the type of the VM. Other arch KVMs
already use vm_type, KVM_CREATE_VM accepts vm_type, and the x86 KVM callback
vm_init accepts vm_type, so follow them. Further, a different policy can be
made based on vm_type. Define KVM_X86_DEFAULT_VM for the default VM type and
KVM_X86_TDX_VM for Intel TDX VMs.
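As an illustration only, the two VM type values defined by this patch relate to the supported-types bitmap that the patch reports through its new KVM_CAP_VM_TYPES capability roughly as sketched below. The constants are copied from the patch; the helper name vm_type_supported() is hypothetical and is not part of the series, and obtaining the bitmap from KVM (for example with the KVM_CHECK_EXTENSION ioctl) is assumed and not shown.

```c
#include <limits.h>
#include <stdbool.h>

/* Values copied from the uapi additions in this patch. */
#define KVM_X86_DEFAULT_VM	0
#define KVM_X86_TDX_VM		1

/*
 * Hypothetical helper, not part of the patch: report whether the VM type
 * with value `type` is advertised in `bitmap`, where a set bit n means
 * that VM type n is supported.
 */
static inline bool vm_type_supported(unsigned long bitmap, unsigned int type)
{
	/* A type beyond the bitmap width can never be advertised. */
	if (type >= sizeof(bitmap) * CHAR_BIT)
		return false;

	return (bitmap >> type) & 1UL;
}
```

A device model could pass the value returned for KVM_CAP_VM_TYPES as `bitmap` and refuse to create a TD when the KVM_X86_TDX_VM bit is clear.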
The wrapper function will be defined as
"bool is_td(kvm) { return vm_type == KVM_X86_TDX_VM; }".

Add a capability, KVM_CAP_VM_TYPES, to allow a device model, e.g. qemu, to
query which VM types are supported by KVM. This approach (introducing a new
capability and adding vm_type) is chosen to align with other arch KVMs that
already have VM types. Other arch KVMs use different names to query the
supported VM types and there is no common name, so a new name was chosen.

Co-developed-by: Xiaoyao Li
Signed-off-by: Xiaoyao Li
Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 Documentation/virt/kvm/api.rst        | 21 +++++++++++++++++++++
 arch/x86/include/asm/kvm-x86-ops.h    |  1 +
 arch/x86/include/asm/kvm_host.h       |  2 ++
 arch/x86/include/uapi/asm/kvm.h       |  3 +++
 arch/x86/kvm/svm/svm.c                |  6 ++++++
 arch/x86/kvm/vmx/main.c               |  1 +
 arch/x86/kvm/vmx/tdx.h                |  6 +-----
 arch/x86/kvm/vmx/vmx.c                |  5 +++++
 arch/x86/kvm/vmx/x86_ops.h            |  1 +
 arch/x86/kvm/x86.c                    |  9 ++++++++-
 include/uapi/linux/kvm.h              |  1 +
 tools/arch/x86/include/uapi/asm/kvm.h |  3 +++
 tools/include/uapi/linux/kvm.h        |  1 +
 13 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 08253cf498d1..b6f08e8a8320 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -147,10 +147,31 @@ described as 'basic' will be available.
 The new VM has no virtual cpus and no memory.
 You probably want to use 0 as machine type.
 
+X86:
+^^^^
+
+Supported VM types can be queried with KVM_CAP_VM_TYPES, which returns a
+bitmap of the supported VM types. If bit @n is set, the VM type with value
+@n is supported.
+
+S390:
+^^^^^
+
 In order to create user controlled virtual machines on S390, check
 KVM_CAP_S390_UCONTROL and use the flag KVM_VM_S390_UCONTROL as
 privileged user (CAP_SYS_ADMIN).
 
+MIPS:
+^^^^^
+
+To use hardware assisted virtualization on MIPS (VZ ASE) rather than
+the default trap & emulate implementation (which changes the virtual
+memory layout to fit in user mode), check KVM_CAP_MIPS_VZ and use the
+flag KVM_VM_MIPS_VZ.
+
+ARM64:
+^^^^^^
+
 On arm64, the physical address size for a VM (IPA Size limit) is limited
 to 40bits by default. The limit can be configured if the host supports the
 extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 15c1515a0760..8a5c5ae70bc5 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -19,6 +19,7 @@ KVM_X86_OP(hardware_disable)
 KVM_X86_OP(hardware_unsetup)
 KVM_X86_OP(has_emulated_msr)
 KVM_X86_OP(vcpu_after_set_cpuid)
+KVM_X86_OP(is_vm_type_supported)
 KVM_X86_OP(vm_init)
 KVM_X86_OP_OPTIONAL(vm_destroy)
 KVM_X86_OP_OPTIONAL_RET0(vcpu_precreate)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 66b1ff1cec61..2a41a93a80f3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1151,6 +1151,7 @@ enum kvm_apicv_inhibit {
 };
 
 struct kvm_arch {
+	unsigned long vm_type;
 	unsigned long n_used_mmu_pages;
 	unsigned long n_requested_mmu_pages;
 	unsigned long n_max_mmu_pages;
@@ -1468,6 +1469,7 @@ struct kvm_x86_ops {
 	bool (*has_emulated_msr)(struct kvm *kvm, u32 index);
 	void (*vcpu_after_set_cpuid)(struct kvm_vcpu *vcpu);
 
+	bool (*is_vm_type_supported)(unsigned long vm_type);
 	unsigned int vm_size;
 	int (*vm_init)(struct kvm *kvm);
 	void (*vm_destroy)(struct kvm *kvm);
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 46de10a809ec..54b08789c402 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -532,4 +532,7 @@ struct kvm_pmu_event_filter {
 #define KVM_VCPU_TSC_CTRL 0 /* control group for the timestamp counter (TSC) */
 #define KVM_VCPU_TSC_OFFSET 0 /* attribute for the TSC offset */
 
+#define KVM_X86_DEFAULT_VM	0
+#define KVM_X86_TDX_VM		1
+
 #endif /* _ASM_X86_KVM_H */
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index aaf5f0623011..2bcf2e1a5271 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4713,6 +4713,11 @@ static void svm_vm_destroy(struct kvm *kvm)
 	sev_vm_destroy(kvm);
 }
 
+static bool svm_is_vm_type_supported(unsigned long type)
+{
+	return type == KVM_X86_DEFAULT_VM;
+}
+
 static int svm_vm_init(struct kvm *kvm)
 {
 	if (!pause_filter_count || !pause_filter_thresh)
@@ -4740,6 +4745,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.vcpu_free = svm_vcpu_free,
 	.vcpu_reset = svm_vcpu_reset,
 
+	.is_vm_type_supported = svm_is_vm_type_supported,
 	.vm_size = sizeof(struct kvm_svm),
 	.vm_init = svm_vm_init,
 	.vm_destroy = svm_vm_destroy,
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index a5cbf3ca2055..22bf49afc761 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -33,6 +33,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.hardware_disable = vmx_hardware_disable,
 	.has_emulated_msr = vmx_has_emulated_msr,
 
+	.is_vm_type_supported = vmx_is_vm_type_supported,
 	.vm_size = sizeof(struct kvm_vmx),
 	.vm_init = vmx_vm_init,
 	.vm_destroy = vmx_vm_destroy,
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index 060bf48ec3d6..473013265bd8 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -15,11 +15,7 @@ struct vcpu_tdx {
 
 static inline bool is_td(struct kvm *kvm)
 {
-	/*
-	 * TDX VM type isn't defined yet.
-	 * return kvm->arch.vm_type == KVM_X86_TDX_VM;
-	 */
-	return false;
+	return kvm->arch.vm_type == KVM_X86_TDX_VM;
 }
 
 static inline bool is_td_vcpu(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 68aef67c5eb7..dc05b78e0a1e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7410,6 +7410,11 @@ int vmx_vcpu_create(struct kvm_vcpu *vcpu)
 	return err;
 }
 
+bool vmx_is_vm_type_supported(unsigned long type)
+{
+	return type == KVM_X86_DEFAULT_VM;
+}
+
 #define L1TF_MSG_SMT "L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n"
 #define L1TF_MSG_L1D "L1TF CPU bug present and virtualization mitigation disabled, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n"
 
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 6196d651a00a..d4877f4f93de 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -27,6 +27,7 @@ void vmx_hardware_unsetup(void);
 int vmx_check_processor_compatibility(void);
 int vmx_hardware_enable(void);
 void vmx_hardware_disable(void);
+bool vmx_is_vm_type_supported(unsigned long type);
 int vmx_vm_init(struct kvm *kvm);
 void vmx_vm_destroy(struct kvm *kvm);
 int vmx_vcpu_precreate(struct kvm *kvm);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 715a53d4fc3d..91053fdc4512 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4520,6 +4520,11 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_X86_NOTIFY_VMEXIT:
 		r = kvm_caps.has_notify_vmexit;
 		break;
+	case KVM_CAP_VM_TYPES:
+		r = BIT(KVM_X86_DEFAULT_VM);
+		if (static_call(kvm_x86_is_vm_type_supported)(KVM_X86_TDX_VM))
+			r |= BIT(KVM_X86_TDX_VM);
+		break;
 	default:
 		break;
 	}
@@ -12507,9 +12512,11 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	int ret;
 	unsigned long flags;
 
-	if (type)
+	if (!static_call(kvm_x86_is_vm_type_supported)(type))
 		return -EINVAL;
 
+	kvm->arch.vm_type = type;
+
 	ret = kvm_page_track_init(kvm);
 	if (ret)
 		goto out;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index fa60b032a405..49386e4de8b8 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1216,6 +1216,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_CPU_TOPOLOGY 222
 #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
 #define KVM_CAP_PRIVATE_MEM 224
+#define KVM_CAP_VM_TYPES 225
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h
index 46de10a809ec..54b08789c402 100644
--- a/tools/arch/x86/include/uapi/asm/kvm.h
+++ b/tools/arch/x86/include/uapi/asm/kvm.h
@@ -532,4 +532,7 @@ struct kvm_pmu_event_filter {
 #define KVM_VCPU_TSC_CTRL 0 /* control group for the timestamp counter (TSC) */
 #define KVM_VCPU_TSC_OFFSET 0 /* attribute for the TSC offset */
 
+#define KVM_X86_DEFAULT_VM	0
+#define KVM_X86_TDX_VM		1
+
 #endif /* _ASM_X86_KVM_H */
diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
index 0d5d4419139a..812b771d8702 100644
--- a/tools/include/uapi/linux/kvm.h
+++ b/tools/include/uapi/linux/kvm.h
@@ -1178,6 +1178,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_S390_ZPCI_OP 221
 #define KVM_CAP_S390_CPU_TOPOLOGY 222
 #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
+#define KVM_CAP_VM_TYPES 225
 
 #ifdef KVM_CAP_IRQ_ROUTING
 

From patchwork Sun Oct 30 06:22:08 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12848
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com,
	Paolo Bonzini, erdemaktas@google.com, Sean Christopherson,
	Sagi Shahar, David Matlack
Subject: [PATCH v10 007/108] KVM: TDX: Make TDX VM type supported
Date: Sat, 29 Oct 2022 23:22:08 -0700

From: Isaku Yamahata

NOTE: This patch is placed at this point in the series so that developers
can test the code partway through the series, although the series doesn't
provide functional features until all of its patches are applied. When
merging this patch series, this patch can be moved to the end.

As the first step of TDX VM support, report to the device model, e.g. qemu,
that the TDX VM type is supported. The callback to create a guest TD is the
vm_init callback for KVM_CREATE_VM.

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/main.c    | 18 ++++++++++++++++--
 arch/x86/kvm/vmx/tdx.c     |  6 ++++++
 arch/x86/kvm/vmx/vmx.c     |  5 -----
 arch/x86/kvm/vmx/x86_ops.h |  3 ++-
 4 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 22bf49afc761..0900ff2f2390 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -10,6 +10,12 @@
 static bool __read_mostly enable_tdx = IS_ENABLED(CONFIG_INTEL_TDX_HOST);
 module_param_named(tdx, enable_tdx, bool, 0444);
 
+static bool vt_is_vm_type_supported(unsigned long type)
+{
+	return type == KVM_X86_DEFAULT_VM ||
+		(enable_tdx && tdx_is_vm_type_supported(type));
+}
+
 static __init int vt_hardware_setup(void)
 {
 	int ret;
@@ -23,6 +29,14 @@ static __init int vt_hardware_setup(void)
 	return 0;
 }
 
+static int vt_vm_init(struct kvm *kvm)
+{
+	if (is_td(kvm))
+		return -EOPNOTSUPP;	/* Not ready to create guest TD yet. */
+
+	return vmx_vm_init(kvm);
+}
+
 struct kvm_x86_ops vt_x86_ops __initdata = {
 	.name = "kvm_intel",
 
@@ -33,9 +47,9 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.hardware_disable = vmx_hardware_disable,
 	.has_emulated_msr = vmx_has_emulated_msr,
 
-	.is_vm_type_supported = vmx_is_vm_type_supported,
+	.is_vm_type_supported = vt_is_vm_type_supported,
 	.vm_size = sizeof(struct kvm_vmx),
-	.vm_init = vmx_vm_init,
+	.vm_init = vt_vm_init,
 	.vm_destroy = vmx_vm_destroy,
 
 	.vcpu_precreate = vmx_vcpu_precreate,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 6213a5c6b637..530e72f85762 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -74,6 +74,12 @@ static int __init tdx_module_setup(void)
 	return 0;
 }
 
+bool tdx_is_vm_type_supported(unsigned long type)
+{
+	/* enable_tdx check is done by the caller. */
+	return type == KVM_X86_TDX_VM;
+}
+
 int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops)
 {
 	int r;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dc05b78e0a1e..68aef67c5eb7 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7410,11 +7410,6 @@ int vmx_vcpu_create(struct kvm_vcpu *vcpu)
 	return err;
 }
 
-bool vmx_is_vm_type_supported(unsigned long type)
-{
-	return type == KVM_X86_DEFAULT_VM;
-}
-
 #define L1TF_MSG_SMT "L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n"
 #define L1TF_MSG_L1D "L1TF CPU bug present and virtualization mitigation disabled, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n"
 
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index d4877f4f93de..ac1688b0b0e3 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -27,7 +27,6 @@ void vmx_hardware_unsetup(void);
 int vmx_check_processor_compatibility(void);
 int vmx_hardware_enable(void);
 void vmx_hardware_disable(void);
-bool vmx_is_vm_type_supported(unsigned long type);
 int vmx_vm_init(struct kvm *kvm);
 void vmx_vm_destroy(struct kvm *kvm);
 int vmx_vcpu_precreate(struct kvm *kvm);
@@ -133,8 +132,10 @@ void vmx_setup_mce(struct kvm_vcpu *vcpu);
 
 #ifdef CONFIG_INTEL_TDX_HOST
 int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops);
+bool tdx_is_vm_type_supported(unsigned long type);
 #else
 static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return 0; }
+static inline bool tdx_is_vm_type_supported(unsigned long type) { return false; }
 #endif
 
 #endif /* __KVM_X86_VMX_X86_OPS_H */

From patchwork Sun Oct 30 06:22:09 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12833
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com,
	Paolo Bonzini, erdemaktas@google.com, Sean Christopherson,
	Sagi Shahar, David Matlack
Subject: [PATCH v10 008/108] [MARKER] The start of TDX KVM patch series: TDX architectural definitions
Date: Sat, 29 Oct 2022 23:22:09 -0700
Message-Id: <534314bb6041345ade0db6c4ddc18ade8d540648.1667110240.git.isaku.yamahata@intel.com>

From: Isaku Yamahata

This empty commit marks the start of the patch series of TDX architectural
definitions.

Signed-off-by: Isaku Yamahata
---
 .../virt/kvm/intel-tdx-layer-status.rst | 29 +++++++++++++++++++
 1 file changed, 29 insertions(+)
 create mode 100644 Documentation/virt/kvm/intel-tdx-layer-status.rst

diff --git a/Documentation/virt/kvm/intel-tdx-layer-status.rst b/Documentation/virt/kvm/intel-tdx-layer-status.rst
new file mode 100644
index 000000000000..b7a14bc73853
--- /dev/null
+++ b/Documentation/virt/kvm/intel-tdx-layer-status.rst
@@ -0,0 +1,29 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================================
+Intel Trust Domain Extensions (TDX)
+===================================
+
+Layer status
+============
+What qemu can do
+----------------
+- TDX VM TYPE is exposed to Qemu.
+- Qemu can try to create a VM of TDX VM type, which then fails.
+
+Patch Layer status
+------------------
+  Patch layer                          Status
+* TDX, VMX coexistence:                 Applied
+* TDX architectural definitions:        Applying
+* TD VM creation/destruction:           Not yet
+* TD vcpu creation/destruction:         Not yet
+* TDX EPT violation:                    Not yet
+* TD finalization:                      Not yet
+* TD vcpu enter/exit:                   Not yet
+* TD vcpu interrupts/exit/hypercall:    Not yet
+
+* KVM MMU GPA shared bits:              Not yet
+* KVM TDP refactoring for TDX:          Not yet
+* KVM TDP MMU hooks:                    Not yet
+* KVM TDP MMU MapGPA:                   Not yet

From patchwork Sun Oct 30 06:22:10 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12831
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:23:58 -0700
X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392856"
X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392856"
Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:23:58 -0700
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Sean Christopherson
Subject: [PATCH v10 009/108] KVM: TDX: Define TDX architectural definitions
Date: Sat, 29 Oct 2022 23:22:10 -0700
Message-Id:
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
MIME-Version: 1.0
X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092734556198662?=
X-GMAIL-MSGID: =?utf-8?q?1748092734556198662?=

From: Isaku Yamahata

Define architectural definitions for KVM to issue the TDX SEAMCALLs.

The structures and values are architecturally defined in the ABI
Reference chapter of the TDX module specification.
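
As an illustration only (not code added by this patch), the field access
codes built by BUILD_TDX_FIELD() in the header below pack a
non-architectural flag, a class code, and a field ID into one 64-bit
value. A minimal user-space sketch of that packing, assuming the layout
stated in tdx_arch.h (flag in bit 63, class code shifted left by 56,
field ID in bits 31:0). uint64_t stands in for the kernel's u64, and
tdx_field_code() and tdx_field_code_selfcheck() are hypothetical names
used only here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Layout copied from tdx_arch.h; plain C stand-ins for BIT_ULL()/GENMASK_ULL(). */
#define TDX_NON_ARCH		(1ULL << 63)
#define TDX_CLASS_SHIFT		56
#define TDX_FIELD_MASK		0xFFFFFFFFULL

/* Class codes and field IDs copied from tdx_arch.h. */
#define TDVPS_CLASS_OTHER_GUEST		17ULL
#define TDVPS_CLASS_MANAGEMENT		32ULL
#define TD_VCPU_STATE_DETAILS_NON_ARCH	0x100
#define TD_VCPU_PEND_NMI		11

/* Hypothetical helper mirroring __BUILD_TDX_FIELD(): flag | class | field. */
static inline uint64_t tdx_field_code(bool non_arch, uint64_t class_code,
				      uint64_t field)
{
	return (non_arch ? TDX_NON_ARCH : 0) |
	       (class_code << TDX_CLASS_SHIFT) |
	       (field & TDX_FIELD_MASK);
}

/* Hypothetical self-check of two encodings derived from the header. */
static inline void tdx_field_code_selfcheck(void)
{
	/* Architectural field: class 32 in bits 61:56, field 11 in bits 31:0. */
	assert(tdx_field_code(false, TDVPS_CLASS_MANAGEMENT, TD_VCPU_PEND_NMI) ==
	       0x200000000000000BULL);
	/* Non-architectural field: bit 63 set in addition to class 17. */
	assert(tdx_field_code(true, TDVPS_CLASS_OTHER_GUEST,
			      TD_VCPU_STATE_DETAILS_NON_ARCH) ==
	       0x9100000000000100ULL);
}
```

Under this layout, TDVPS_MANAGEMENT(TD_VCPU_PEND_NMI) would evaluate to
0x200000000000000B.
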
Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata Reviewed-by: Paolo Bonzini --- arch/x86/kvm/vmx/tdx_arch.h | 166 ++++++++++++++++++++++++++++++++++++ 1 file changed, 166 insertions(+) create mode 100644 arch/x86/kvm/vmx/tdx_arch.h diff --git a/arch/x86/kvm/vmx/tdx_arch.h b/arch/x86/kvm/vmx/tdx_arch.h new file mode 100644 index 000000000000..18604734fb14 --- /dev/null +++ b/arch/x86/kvm/vmx/tdx_arch.h @@ -0,0 +1,166 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* architectural constants/data definitions for TDX SEAMCALLs */ + +#ifndef __KVM_X86_TDX_ARCH_H +#define __KVM_X86_TDX_ARCH_H + +#include + +/* + * TDX SEAMCALL API function leaves + */ +#define TDH_VP_ENTER 0 +#define TDH_MNG_ADDCX 1 +#define TDH_MEM_PAGE_ADD 2 +#define TDH_MEM_SEPT_ADD 3 +#define TDH_VP_ADDCX 4 +#define TDH_MEM_PAGE_RELOCATE 5 +#define TDH_MEM_PAGE_AUG 6 +#define TDH_MEM_RANGE_BLOCK 7 +#define TDH_MNG_KEY_CONFIG 8 +#define TDH_MNG_CREATE 9 +#define TDH_VP_CREATE 10 +#define TDH_MNG_RD 11 +#define TDH_MR_EXTEND 16 +#define TDH_MR_FINALIZE 17 +#define TDH_VP_FLUSH 18 +#define TDH_MNG_VPFLUSHDONE 19 +#define TDH_MNG_KEY_FREEID 20 +#define TDH_MNG_INIT 21 +#define TDH_VP_INIT 22 +#define TDH_VP_RD 26 +#define TDH_MNG_KEY_RECLAIMID 27 +#define TDH_PHYMEM_PAGE_RECLAIM 28 +#define TDH_MEM_PAGE_REMOVE 29 +#define TDH_MEM_SEPT_REMOVE 30 +#define TDH_MEM_TRACK 38 +#define TDH_MEM_RANGE_UNBLOCK 39 +#define TDH_PHYMEM_CACHE_WB 40 +#define TDH_PHYMEM_PAGE_WBINVD 41 +#define TDH_VP_WR 43 +#define TDH_SYS_LP_SHUTDOWN 44 + +#define TDG_VP_VMCALL_GET_TD_VM_CALL_INFO 0x10000 +#define TDG_VP_VMCALL_MAP_GPA 0x10001 +#define TDG_VP_VMCALL_GET_QUOTE 0x10002 +#define TDG_VP_VMCALL_REPORT_FATAL_ERROR 0x10003 +#define TDG_VP_VMCALL_SETUP_EVENT_NOTIFY_INTERRUPT 0x10004 + +/* TDX control structure (TDR/TDCS/TDVPS) field access codes */ +#define TDX_NON_ARCH BIT_ULL(63) +#define TDX_CLASS_SHIFT 56 +#define TDX_FIELD_MASK GENMASK_ULL(31, 0) + +#define 
__BUILD_TDX_FIELD(non_arch, class, field) \
+	(((non_arch) ? TDX_NON_ARCH : 0) | \
+	 ((u64)(class) << TDX_CLASS_SHIFT) | \
+	 ((u64)(field) & TDX_FIELD_MASK))
+
+#define BUILD_TDX_FIELD(class, field) \
+	__BUILD_TDX_FIELD(false, (class), (field))
+
+#define BUILD_TDX_FIELD_NON_ARCH(class, field) \
+	__BUILD_TDX_FIELD(true, (class), (field))
+
+
+/* Class code for TD */
+#define TD_CLASS_EXECUTION_CONTROLS 17ULL
+
+/* Class code for TDVPS */
+#define TDVPS_CLASS_VMCS 0ULL
+#define TDVPS_CLASS_GUEST_GPR 16ULL
+#define TDVPS_CLASS_OTHER_GUEST 17ULL
+#define TDVPS_CLASS_MANAGEMENT 32ULL
+
+enum tdx_tdcs_execution_control {
+	TD_TDCS_EXEC_TSC_OFFSET = 10,
+};
+
+/* @field is any of enum tdx_tdcs_execution_control */
+#define TDCS_EXEC(field) BUILD_TDX_FIELD(TD_CLASS_EXECUTION_CONTROLS, (field))
+
+/* @field is the VMCS field encoding */
+#define TDVPS_VMCS(field) BUILD_TDX_FIELD(TDVPS_CLASS_VMCS, (field))
+
+enum tdx_vcpu_guest_other_state {
+	TD_VCPU_STATE_DETAILS_NON_ARCH = 0x100,
+};
+
+union tdx_vcpu_state_details {
+	struct {
+		u64 vmxip : 1;
+		u64 reserved : 63;
+	};
+	u64 full;
+};
+
+/* @field is any of enum tdx_vcpu_guest_other_state */
+#define TDVPS_STATE(field) BUILD_TDX_FIELD(TDVPS_CLASS_OTHER_GUEST, (field))
+#define TDVPS_STATE_NON_ARCH(field) BUILD_TDX_FIELD_NON_ARCH(TDVPS_CLASS_OTHER_GUEST, (field))
+
+/* Management class fields */
+enum tdx_vcpu_guest_management {
+	TD_VCPU_PEND_NMI = 11,
+};
+
+/* @field is any of enum tdx_vcpu_guest_management */
+#define TDVPS_MANAGEMENT(field) BUILD_TDX_FIELD(TDVPS_CLASS_MANAGEMENT, (field))
+
+#define TDX_EXTENDMR_CHUNKSIZE 256
+
+struct tdx_cpuid_value {
+	u32 eax;
+	u32 ebx;
+	u32 ecx;
+	u32 edx;
+} __packed;
+
+#define TDX_TD_ATTRIBUTE_DEBUG BIT_ULL(0)
+#define TDX_TD_ATTRIBUTE_PKS BIT_ULL(30)
+#define TDX_TD_ATTRIBUTE_KL BIT_ULL(31)
+#define TDX_TD_ATTRIBUTE_PERFMON BIT_ULL(63)
+
+/*
+ * TD_PARAMS is provided as an input to TDH_MNG_INIT, the size of which is 1024B.
+ */
+struct td_params {
+	u64 attributes;
+	u64 xfam;
+	u32 max_vcpus;
+	u32 reserved0;
+
+	u64 eptp_controls;
+	u64 exec_controls;
+	u16 tsc_frequency;
+	u8 reserved1[38];
+
+	u64 mrconfigid[6];
+	u64 mrowner[6];
+	u64 mrownerconfig[6];
+	u64 reserved2[4];
+
+	union {
+		struct tdx_cpuid_value cpuid_values[0];
+		u8 reserved3[768];
+	};
+} __packed __aligned(1024);
+
+/*
+ * Guest uses MAX_PA for GPAW when set.
+ * 0: GPA.SHARED bit is GPA[47]
+ * 1: GPA.SHARED bit is GPA[51]
+ */
+#define TDX_EXEC_CONTROL_MAX_GPAW BIT_ULL(0)
+
+/*
+ * TDX requires the frequency to be defined in units of 25MHz, which is the
+ * frequency of the core crystal clock on TDX-capable platforms, i.e. the TDX
+ * module can only program frequencies that are multiples of 25MHz. The
+ * frequency must be between 100MHz and 10GHz (inclusive).
+ */
+#define TDX_TSC_KHZ_TO_25MHZ(tsc_in_khz) ((tsc_in_khz) / (25 * 1000))
+#define TDX_TSC_25MHZ_TO_KHZ(tsc_in_25mhz) ((tsc_in_25mhz) * (25 * 1000))
+#define TDX_MIN_TSC_FREQUENCY_KHZ (100 * 1000)
+#define TDX_MAX_TSC_FREQUENCY_KHZ (10 * 1000 * 1000)
+
+#endif /* __KVM_X86_TDX_ARCH_H */

From patchwork Sun Oct 30 06:22:11 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12834
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1665461wru; Sat, 29 Oct 2022 23:25:58 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM77x9tuGDExL7pg3Rt4qevqneIHWGzzB2/h2S/mH9oVPhD68IFasAiJcYwtEixSjToVtkZw
X-Received: by 2002:a05:6402:f1e:b0:461:cfd3:48c2 with SMTP id i30-20020a0564020f1e00b00461cfd348c2mr7359270eda.294.1667111158308; Sat, 29 Oct 2022 23:25:58 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1667111158; cv=none; d=google.com; s=arc-20160816; b=f7NHGPt8vhW4ZRvM+5K1SnQ8vNZfvAgXQOJwwtQFW+xTjteewY/Ht8PHyY8GjB6FtZ PpD9wf8+uKhMrdjw/MH6NZPLhKM3dZD6wks+nsvYOuHf/0biE0rbCJNBhaWDur9TowWl
UwTG2YAydcPnKRwT7z7BLCgI6hw/9GgyVVtkD3BZIBHCEJFOwPHaFOssnZE9Jj33MZ7X fIEErWZMTu9C41YosBYnRzbjTOLmREMs4bqmY1pDFsJOyAfrDMNF2ZDrTom60OQ4L6DV DF690YvLv3x5Gju4+yvyZgo96niQNk2X/lOJCnx8DtqTWcC2E75OHL90/7qPk6vK7etv cyzQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=38aFb9ID8rQQJAYk3md0CJgUvdADHHBJoraPiF7VRb0=; b=uhFXxli9JwC9UJRbwndKpzijuo8zGBj2oG93gugKniN5NpcK+rToUaPB0lYQHuCm+7 jWxAQ84KKVcExsvHlmgtwAE1/tQi1JRG5tsim6CGnwSU4HLA0z6o//LKXK3uuUSe5pjq gmUHc2A3/Q/50sCs4zRdg7mNQHYlieJkZ34il9KIlUmz/atW0l8sCOflcxmYKmn7FO1u HrDLujWa8aj8OP5Vpx3oBI8F7IJwtPKO5GIEUuWq2ftMgPbDx7YGziXxnd5L3435S4Z1 s0cEZ5HFEyvZu8Iys2jWXybtBeOnXbOu90F++oFF8uvuHBXdvFam184SLakNLJISJIp0 ZddQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ACJqljr6; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id js2-20020a17090797c200b0078d9f02b452si4615045ejc.861.2022.10.29.23.25.34; Sat, 29 Oct 2022 23:25:58 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ACJqljr6; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230007AbiJ3GYo (ORCPT + 99 others); Sun, 30 Oct 2022 02:24:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46854 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229782AbiJ3GYD (ORCPT ); Sun, 30 Oct 2022 02:24:03 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4B16ABC; Sat, 29 Oct 2022 23:24:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111043; x=1698647043; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=QafbYPMxQOgi9WthXzffod7HReNCaa9JqivlLmXQmJw=; b=ACJqljr61hFNUbgY/tMRS49oDUKuRTFMlFU9zWyu+LniWF0PSwBxLvdS WcPcfVL7q/I4Yg5ODsmsCyCI8VWcEzPwjdSJPJc26Gcg8LWS6YMUsc0jl 2+jxrh6wQAckgKt5rv+m7hSClFRZ2jXQMzUmL/IXq33FjI5ij5XPQMozF A7kWbU+YlrxFJfxfc4BSMTJRju3VU2u05R3LYK/ZYKW7aKbEqeITTbMsK eAtozbJKy8nltrj8hzbpqPM7tt4gNgQZXmRpEyAR0GxWaFqUakDo9gWNI K6X1KM4AXrTg2pl/1dKKAjnzVV92+UyDPuJyDk2w2bAjo8lj/yONwdmb0 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037120" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037120" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) 
by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:23:58 -0700
X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392860"
X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392860"
Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:23:58 -0700
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Sean Christopherson
Subject: [PATCH v10 010/108] KVM: TDX: Add TDX "architectural" error codes
Date: Sat, 29 Oct 2022 23:22:11 -0700
Message-Id: <679bb45187dc54b82ebc9df5381a7d5de0b782d5.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
MIME-Version: 1.0
X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092750411975565?=
X-GMAIL-MSGID: =?utf-8?q?1748092750411975565?=

From: Sean Christopherson

Add error codes for the TDX SEAMCALLs, both for the TDX VMM side (TDH
SEAMCALLs) and for the TDX guest side (TDG.VP.VMCALL). KVM issues the
TDX SEAMCALLs and checks their error codes. KVM also handles hypercalls
from the TDX guest and may return an error, so the error codes for the
TDX guest are needed as well.

TDX SEAMCALL uses bits 31:0 to return more information, so these error
codes will only exactly match RAX[63:32]. Error codes for TDG.VP.VMCALL
are defined by the TDX Guest-Host-Communication Interface spec.
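
As a sketch of what the RAX[63:32] rule means for a caller: the status
class is compared after masking with TDX_SEAMCALL_STATUS_MASK, and the
low 32 bits are read separately as detail information such as an operand
ID. The constants are copied from tdx_errno.h below. The helper names
tdx_status_class() and tdx_status_detail() are hypothetical, and
uint64_t stands in for the kernel's u64, so this is an illustration
under those assumptions rather than the code KVM uses:

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from tdx_errno.h. */
#define TDX_SEAMCALL_STATUS_MASK	0xFFFFFFFF00000000ULL
#define TDX_SUCCESS			0x0000000000000000ULL
#define TDX_OPERAND_BUSY		0x8000020000000000ULL
#define TDX_OPERAND_ID_SEPT		0x92

/* Hypothetical helper: keep only RAX[63:32], the part the error codes match. */
static inline uint64_t tdx_status_class(uint64_t rax)
{
	return rax & TDX_SEAMCALL_STATUS_MASK;
}

/* Hypothetical helper: extract RAX[31:0], the additional detail bits. */
static inline uint32_t tdx_status_detail(uint64_t rax)
{
	return (uint32_t)(rax & ~TDX_SEAMCALL_STATUS_MASK);
}

/* Hypothetical self-check using a made-up "operand busy on SEPT" return value. */
static inline void tdx_status_selfcheck(void)
{
	uint64_t rax = TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT;

	/* A direct comparison with TDX_OPERAND_BUSY fails because of the detail bits. */
	assert(rax != TDX_OPERAND_BUSY);
	/* After masking, the status class matches the defined error code. */
	assert(tdx_status_class(rax) == TDX_OPERAND_BUSY);
	assert(tdx_status_detail(rax) == TDX_OPERAND_ID_SEPT);
}
```

The point of the sketch is that comparing the raw RAX value against a
TDX_* code can miss a match whenever the module fills in bits 31:0.
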
Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata Reviewed-by: Paolo Bonzini --- arch/x86/kvm/vmx/tdx_errno.h | 38 ++++++++++++++++++++++++++++++++++++ 1 file changed, 38 insertions(+) create mode 100644 arch/x86/kvm/vmx/tdx_errno.h diff --git a/arch/x86/kvm/vmx/tdx_errno.h b/arch/x86/kvm/vmx/tdx_errno.h new file mode 100644 index 000000000000..ce246ba62454 --- /dev/null +++ b/arch/x86/kvm/vmx/tdx_errno.h @@ -0,0 +1,38 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* architectural status code for SEAMCALL */ + +#ifndef __KVM_X86_TDX_ERRNO_H +#define __KVM_X86_TDX_ERRNO_H + +#define TDX_SEAMCALL_STATUS_MASK 0xFFFFFFFF00000000ULL + +/* + * TDX SEAMCALL Status Codes (returned in RAX) + */ +#define TDX_SUCCESS 0x0000000000000000ULL +#define TDX_NON_RECOVERABLE_VCPU 0x4000000100000000ULL +#define TDX_INTERRUPTED_RESUMABLE 0x8000000300000000ULL +#define TDX_OPERAND_BUSY 0x8000020000000000ULL +#define TDX_VCPU_NOT_ASSOCIATED 0x8000070200000000ULL +#define TDX_KEY_GENERATION_FAILED 0x8000080000000000ULL +#define TDX_KEY_STATE_INCORRECT 0xC000081100000000ULL +#define TDX_KEY_CONFIGURED 0x0000081500000000ULL +#define TDX_NO_HKID_READY_TO_WBCACHE 0x0000082100000000ULL +#define TDX_EPT_WALK_FAILED 0xC0000B0000000000ULL + +/* + * TDG.VP.VMCALL Status Codes (returned in R10) + */ +#define TDG_VP_VMCALL_SUCCESS 0x0000000000000000ULL +#define TDG_VP_VMCALL_RETRY 0x0000000000000001ULL +#define TDG_VP_VMCALL_INVALID_OPERAND 0x8000000000000000ULL +#define TDG_VP_VMCALL_TDREPORT_FAILED 0x8000000000000001ULL + +/* + * TDX module operand ID, appears in 31:0 part of error code as + * detail information + */ +#define TDX_OPERAND_ID_RCX 0x01 +#define TDX_OPERAND_ID_SEPT 0x92 + +#endif /* __KVM_X86_TDX_ERRNO_H */ From patchwork Sun Oct 30 06:22:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12835 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 
2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1665491wru; Sat, 29 Oct 2022 23:26:03 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7zDoHTaJXlkYFOqAXAOVjVCrbdn9LscoCY/MGVDu61b+wIT9KziEwD0jqF1egKeusRAlnV X-Received: by 2002:a05:6402:2802:b0:43a:9098:55a0 with SMTP id h2-20020a056402280200b0043a909855a0mr7304327ede.179.1667111162986; Sat, 29 Oct 2022 23:26:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111162; cv=none; d=google.com; s=arc-20160816; b=F5rXACDxAHahozkQW8+12W9gcd2oK5cR0P7XLuEnszscs4DNr5Nym6zi5L37++8T2F qF7PZeVoJdX1f0C8vanJpAwa96CqwseMkopW50wN55dlWFtXM2EeyJq2CBhmOVWUsbrb Ke6lztpJ/eICbuMYYtJQnU8voXSQbyuI9yS0PAcIIademnfb/Qx9IQDFhj8eG/8pZHpW qHXkIEkhxnV/5osrOdMAUZgvyl+iXiC4YTR+ShayEysV0rS45KMhVLZRPB9wTbDiAx83 RPUnahqDCqOVRTVzQ+MC29tr0kct4LM7Ya3Hii28cPJsVA1FGR6YmSjzukVDceNrtcgn fasQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=8fqsbAT+TsvPadw/f5lYiDu6gAljdVAMwZ6t/XZByHU=; b=KXbdU9Ccp1mVidT0QAdKily+SLgS3zD05yRdvwh3J3LR0V0dpCNBBdkeVSORC99n9y efWZBmPPLVHUhimqQan8AyIV9MD2J19yWDArVlKyz8Fnv8plBsd3yREoRiHAk+ertely n6sHf0HCW2C30wEn9ubKVTUe1Uu7ncs7D1CbsuPt3VLARhUjcl2+FzLoud64fzdC0aKv Tj4FQ/kAt2vn/V8hki6Mp+tKimvJCtxmnGLYNV2uqstZR1GlsIIiMe2+R2c/XAAphOp6 Be6o+d7l5OC+/no6pvqjLhyISTSVwbKax5aJceQA5tyVTalOdWoeH6X7M3CeMC4M5vST D+HQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=hejtP13M; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id sa29-20020a1709076d1d00b0078dbed58d26si4158425ejc.635.2022.10.29.23.25.39; Sat, 29 Oct 2022 23:26:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=hejtP13M; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230022AbiJ3GYt (ORCPT + 99 others); Sun, 30 Oct 2022 02:24:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46854 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229792AbiJ3GYE (ORCPT ); Sun, 30 Oct 2022 02:24:04 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4B762CD; Sat, 29 Oct 2022 23:24:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111043; x=1698647043; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=y46nqaHpYFWyIzH5auIC5Mf/k7r9rbeh/+n5b6OOuoQ=; b=hejtP13MdFc0P5vOQ+dazdjLyiNFuapspPqKDQz1c6lOYgU9f8gEaV5t YCl91Okpoa4fr33hCvNc+h+R7UbBUdFHPZA32rkepJ65gWbLPi0tlgdgZ IUMJuwSXgJ3f4Q6UXkG/U6e4GmZZuxwDRyR6XTMFpgc7LuXbuf7RT5646 Y8PgxvjvLGUCV6mgZCJp2AKYGvVzy4zCoSQPINbsJ+gs0Q0xWLDCrCyqZ R+8cQhsQ/fmqyA4Zct43QnI1kyavT90IylohKT8FBpeeRLTnP7aRlQsYb ikHzIYGzkJulJzNfBLlXMqo4ZU2qgkivaxr/ByhIUoPzYo1+ky9Fq09tj A==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037121" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037121" Received: from fmsmga006.fm.intel.com 
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:23:58 -0700
X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392863"
X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392863"
Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:23:58 -0700
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Sean Christopherson
Subject: [PATCH v10 011/108] KVM: TDX: Add C wrapper functions for SEAMCALLs to the TDX module
Date: Sat, 29 Oct 2022 23:22:12 -0700
Message-Id: <5f1a80e9ab037fa88d8821a6548638d282070f1d.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
MIME-Version: 1.0
X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092754756931659?=
X-GMAIL-MSGID: =?utf-8?q?1748092754756931659?=

From: Isaku Yamahata

A VMM interacts with the TDX module using a new instruction (SEAMCALL).
A TDX VMM uses SEAMCALLs where a VMX VMM would have used VMX
instructions directly. For instance, a TDX VMM does not have full access
to the VM control structure corresponding to the VMX VMCS. Instead, the
VMM induces the TDX module to act on its behalf via SEAMCALLs.

Export __seamcall and define C wrapper functions for SEAMCALLs for
readability.

Some SEAMCALL APIs donate pages to the TDX module or to a guest TD.
The pages are encrypted with the TDX private host key ID set in the high
bits of the physical address. If any modified cache lines may exist for
these pages, flush them to memory with clflush_cache_range().

Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/tdx.h | 2 +
 arch/x86/kvm/vmx/tdx_ops.h | 185 +++++++++++++++++++++++++++++++
 arch/x86/virt/vmx/tdx/seamcall.S | 2 +
 3 files changed, 189 insertions(+)
 create mode 100644 arch/x86/kvm/vmx/tdx_ops.h

diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h
index 5cff7ed5b11e..ba2e4c69fb9f 100644
--- a/arch/x86/include/asm/tdx.h
+++ b/arch/x86/include/asm/tdx.h
@@ -186,6 +186,8 @@ struct tdsysinfo_struct {
 const struct tdsysinfo_struct *tdx_get_sysinfo(void);
 bool platform_tdx_enabled(void);
 int tdx_enable(void);
+u64 __seamcall(u64 op, u64 rcx, u64 rdx, u64 r8, u64 r9,
+	       struct tdx_module_output *out);
 #else /* !CONFIG_INTEL_TDX_HOST */
 struct tdsysinfo_struct;
 static inline const struct tdsysinfo_struct *tdx_get_sysinfo(void) { return NULL; }
diff --git a/arch/x86/kvm/vmx/tdx_ops.h b/arch/x86/kvm/vmx/tdx_ops.h
new file mode 100644
index 000000000000..85adbf49c277
--- /dev/null
+++ b/arch/x86/kvm/vmx/tdx_ops.h
@@ -0,0 +1,185 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* constants/data definitions for TDX SEAMCALLs */
+
+#ifndef __KVM_X86_TDX_OPS_H
+#define __KVM_X86_TDX_OPS_H
+
+#include
+
+#include
+#include
+#include
+
+#include "tdx_errno.h"
+#include "tdx_arch.h"
+
+#ifdef CONFIG_INTEL_TDX_HOST
+
+static inline u64 tdh_mng_addcx(hpa_t tdr, hpa_t addr)
+{
+	clflush_cache_range(__va(addr), PAGE_SIZE);
+	return __seamcall(TDH_MNG_ADDCX, addr, tdr, 0, 0, NULL);
+}
+
+static inline u64 tdh_mem_page_add(hpa_t tdr, gpa_t gpa, hpa_t hpa, hpa_t source,
+				   struct tdx_module_output *out)
+{
+	clflush_cache_range(__va(hpa), PAGE_SIZE);
+	return __seamcall(TDH_MEM_PAGE_ADD, gpa, tdr, hpa, source, out);
+}
+
+static inline u64 tdh_mem_sept_add(hpa_t tdr, gpa_t gpa, int level, hpa_t page,
+
struct tdx_module_output *out) +{ + clflush_cache_range(__va(page), PAGE_SIZE); + return __seamcall(TDH_MEM_SEPT_ADD, gpa | level, tdr, page, 0, out); +} + +static inline u64 tdh_mem_sept_remove(hpa_t tdr, gpa_t gpa, int level, + struct tdx_module_output *out) +{ + return __seamcall(TDH_MEM_SEPT_REMOVE, gpa | level, tdr, 0, 0, out); +} + +static inline u64 tdh_vp_addcx(hpa_t tdvpr, hpa_t addr) +{ + clflush_cache_range(__va(addr), PAGE_SIZE); + return __seamcall(TDH_VP_ADDCX, addr, tdvpr, 0, 0, NULL); +} + +static inline u64 tdh_mem_page_relocate(hpa_t tdr, gpa_t gpa, hpa_t hpa, + struct tdx_module_output *out) +{ + clflush_cache_range(__va(hpa), PAGE_SIZE); + return __seamcall(TDH_MEM_PAGE_RELOCATE, gpa, tdr, hpa, 0, out); +} + +static inline u64 tdh_mem_page_aug(hpa_t tdr, gpa_t gpa, hpa_t hpa, + struct tdx_module_output *out) +{ + clflush_cache_range(__va(hpa), PAGE_SIZE); + return __seamcall(TDH_MEM_PAGE_AUG, gpa, tdr, hpa, 0, out); +} + +static inline u64 tdh_mem_range_block(hpa_t tdr, gpa_t gpa, int level, + struct tdx_module_output *out) +{ + return __seamcall(TDH_MEM_RANGE_BLOCK, gpa | level, tdr, 0, 0, out); +} + +static inline u64 tdh_mng_key_config(hpa_t tdr) +{ + return __seamcall(TDH_MNG_KEY_CONFIG, tdr, 0, 0, 0, NULL); +} + +static inline u64 tdh_mng_create(hpa_t tdr, int hkid) +{ + clflush_cache_range(__va(tdr), PAGE_SIZE); + return __seamcall(TDH_MNG_CREATE, tdr, hkid, 0, 0, NULL); +} + +static inline u64 tdh_vp_create(hpa_t tdr, hpa_t tdvpr) +{ + clflush_cache_range(__va(tdvpr), PAGE_SIZE); + return __seamcall(TDH_VP_CREATE, tdvpr, tdr, 0, 0, NULL); +} + +static inline u64 tdh_mng_rd(hpa_t tdr, u64 field, struct tdx_module_output *out) +{ + return __seamcall(TDH_MNG_RD, tdr, field, 0, 0, out); +} + +static inline u64 tdh_mr_extend(hpa_t tdr, gpa_t gpa, + struct tdx_module_output *out) +{ + return __seamcall(TDH_MR_EXTEND, gpa, tdr, 0, 0, out); +} + +static inline u64 tdh_mr_finalize(hpa_t tdr) +{ + return __seamcall(TDH_MR_FINALIZE, tdr, 0, 0, 0, 
NULL); +} + +static inline u64 tdh_vp_flush(hpa_t tdvpr) +{ + return __seamcall(TDH_VP_FLUSH, tdvpr, 0, 0, 0, NULL); +} + +static inline u64 tdh_mng_vpflushdone(hpa_t tdr) +{ + return __seamcall(TDH_MNG_VPFLUSHDONE, tdr, 0, 0, 0, NULL); +} + +static inline u64 tdh_mng_key_freeid(hpa_t tdr) +{ + return __seamcall(TDH_MNG_KEY_FREEID, tdr, 0, 0, 0, NULL); +} + +static inline u64 tdh_mng_init(hpa_t tdr, hpa_t td_params, + struct tdx_module_output *out) +{ + return __seamcall(TDH_MNG_INIT, tdr, td_params, 0, 0, out); +} + +static inline u64 tdh_vp_init(hpa_t tdvpr, u64 rcx) +{ + return __seamcall(TDH_VP_INIT, tdvpr, rcx, 0, 0, NULL); +} + +static inline u64 tdh_vp_rd(hpa_t tdvpr, u64 field, + struct tdx_module_output *out) +{ + return __seamcall(TDH_VP_RD, tdvpr, field, 0, 0, out); +} + +static inline u64 tdh_mng_key_reclaimid(hpa_t tdr) +{ + return __seamcall(TDH_MNG_KEY_RECLAIMID, tdr, 0, 0, 0, NULL); +} + +static inline u64 tdh_phymem_page_reclaim(hpa_t page, + struct tdx_module_output *out) +{ + return __seamcall(TDH_PHYMEM_PAGE_RECLAIM, page, 0, 0, 0, out); +} + +static inline u64 tdh_mem_page_remove(hpa_t tdr, gpa_t gpa, int level, + struct tdx_module_output *out) +{ + return __seamcall(TDH_MEM_PAGE_REMOVE, gpa | level, tdr, 0, 0, out); +} + +static inline u64 tdh_sys_lp_shutdown(void) +{ + return __seamcall(TDH_SYS_LP_SHUTDOWN, 0, 0, 0, 0, NULL); +} + +static inline u64 tdh_mem_track(hpa_t tdr) +{ + return __seamcall(TDH_MEM_TRACK, tdr, 0, 0, 0, NULL); +} + +static inline u64 tdh_mem_range_unblock(hpa_t tdr, gpa_t gpa, int level, + struct tdx_module_output *out) +{ + return __seamcall(TDH_MEM_RANGE_UNBLOCK, gpa | level, tdr, 0, 0, out); +} + +static inline u64 tdh_phymem_cache_wb(bool resume) +{ + return __seamcall(TDH_PHYMEM_CACHE_WB, resume ? 
1 : 0, 0, 0, 0, NULL); +} + +static inline u64 tdh_phymem_page_wbinvd(hpa_t page) +{ + return __seamcall(TDH_PHYMEM_PAGE_WBINVD, page, 0, 0, 0, NULL); +} + +static inline u64 tdh_vp_wr(hpa_t tdvpr, u64 field, u64 val, u64 mask, + struct tdx_module_output *out) +{ + return __seamcall(TDH_VP_WR, tdvpr, field, val, mask, out); +} +#endif /* CONFIG_INTEL_TDX_HOST */ + +#endif /* __KVM_X86_TDX_OPS_H */ diff --git a/arch/x86/virt/vmx/tdx/seamcall.S b/arch/x86/virt/vmx/tdx/seamcall.S index f81be6b9c133..b90a7fe05494 100644 --- a/arch/x86/virt/vmx/tdx/seamcall.S +++ b/arch/x86/virt/vmx/tdx/seamcall.S @@ -1,5 +1,6 @@ /* SPDX-License-Identifier: GPL-2.0 */ #include +#include #include #include "tdxcall.S" @@ -50,3 +51,4 @@ SYM_FUNC_START(__seamcall) FRAME_END RET SYM_FUNC_END(__seamcall) +EXPORT_SYMBOL_GPL(__seamcall) From patchwork Sun Oct 30 06:22:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12837 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1665550wru; Sat, 29 Oct 2022 23:26:14 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6+3dUEj+EacS99YyMFRY+zSd9LNqTYVwCmt4O0KVz4CMcPX4oSPGp5eSFRRxgp+TeNA9nJ X-Received: by 2002:a17:906:9bc2:b0:7ad:975f:b567 with SMTP id de2-20020a1709069bc200b007ad975fb567mr7005081ejc.107.1667111174722; Sat, 29 Oct 2022 23:26:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111174; cv=none; d=google.com; s=arc-20160816; b=C67EquhXA5jka5m5J1XjW6sNjrvRVeupMiBSYRPGjfP1oEsuRk1nnkBCnkt0BmvbdH VEfb1ZKp0w/VHfNMhVxo9k3LGLgsLBruMgUqpekwqldatKrlWreqNzN+oUq5JW80SC9V /BjxXpxpB5em5BdaO47wqNMCF7NQ8W+Kk7E/d+qVDEl93WnF/kIol+c4EWpwj7hGSoVe 7IIvn8dqDFy6fiJxyWlGQ1Svr9fSHIncMk267AgdErBWAwyAOxYjM2qpaAuK5ducM0Aw 55rAnaapk93WEz+1DqVTb7kmwzq5+XP5mXfeXAW7ckWRKetElNSe+te66nHOO22owlrV dc5w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; 
h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=Tbt50/AC+RbPsz2NdLvw4UvTZ3u6J3BtdEo2Ll/Dy7I=; b=G3Y/3ohw/NTPPVCqMCRinee1SEZ6Wi4x4Py1YJ8zAGGnohDjeSHTlW45fBO/jfwYH7 iGb+miw4+7l0UfukxijiXGqXMESIyNaPx4+2vHq/7acoUJLkM4HZY6wFD9ZWDn/eq7yk a8sBFfXaWVsjk7wPEr/K2fSQgWX/83JrYW10ZW6qEumtl5pm8Xsr9KDlLkN7HQjqN1L+ 45PmyfrobNLma3ykTzbl4wnoHPXO2KCmAcmCZ51WrnDpZTtetIt0SSv3QRIo7eC6S7eM YaY3LzBlmsU2SMvG2gIQlX2IgZy1/pWlZKi9qLOsKHIgcqvnmBfRarw418xcpLvycrLK eqAA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=QwZdbkst; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id dt2-20020a170907728200b007ada2ec1a28si4229896ejc.165.2022.10.29.23.25.49; Sat, 29 Oct 2022 23:26:14 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=QwZdbkst; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230054AbiJ3GY6 (ORCPT + 99 others); Sun, 30 Oct 2022 02:24:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46870 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229795AbiJ3GYE (ORCPT ); Sun, 30 Oct 2022 02:24:04 -0400 Received: from mga05.intel.com (mga05.intel.com 
From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 012/108] KVM: TDX: Add helper functions to print TDX SEAMCALL error Date: Sat, 29 Oct 2022 23:22:13 -0700 Message-Id: <752b25882092134bd8d117b3e429227bd5a18567.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0
From: Isaku Yamahata Add helper functions to print out errors from the TDX module in a uniform manner. Signed-off-by: Isaku Yamahata --- arch/x86/kvm/Makefile | 2 +- arch/x86/kvm/vmx/tdx_error.c | 21 +++++++++++++++++++++ arch/x86/kvm/vmx/tdx_ops.h | 3 +++ 3 files changed, 25 insertions(+), 1 deletion(-) create mode 100644 arch/x86/kvm/vmx/tdx_error.c diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile index e2c05195cb95..f1ad445df505 100644 --- a/arch/x86/kvm/Makefile +++ b/arch/x86/kvm/Makefile @@ -24,7 +24,7 @@ kvm-$(CONFIG_KVM_XEN) += xen.o kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \ vmx/evmcs.o vmx/nested.o vmx/posted_intr.o vmx/main.o kvm-intel-$(CONFIG_X86_SGX_KVM) += vmx/sgx.o -kvm-intel-$(CONFIG_INTEL_TDX_HOST) += vmx/tdx.o +kvm-intel-$(CONFIG_INTEL_TDX_HOST) += vmx/tdx.o vmx/tdx_error.o kvm-amd-y += svm/svm.o svm/vmenter.o svm/pmu.o svm/nested.o svm/avic.o svm/sev.o diff --git a/arch/x86/kvm/vmx/tdx_error.c b/arch/x86/kvm/vmx/tdx_error.c new file mode 100644 index 000000000000..574b72d34e1e --- /dev/null +++ b/arch/x86/kvm/vmx/tdx_error.c @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0 +/* functions to record TDX SEAMCALL error */ + +#include +#include + +#include "tdx_ops.h" + +void pr_tdx_error(u64 op, u64 error_code, const struct tdx_module_output *out) +{ + if (!out) { + pr_err_ratelimited("SEAMCALL[%lld] failed: 0x%llx\n", + op, error_code); + return; + } + + pr_err_ratelimited("SEAMCALL[%lld] failed: 0x%llx RCX 0x%llx, RDX 0x%llx," + " R8 0x%llx, R9 0x%llx, R10 0x%llx, R11
0x%llx\n", + op, error_code, + out->rcx, out->rdx, out->r8, out->r9, out->r10, out->r11); +} diff --git a/arch/x86/kvm/vmx/tdx_ops.h b/arch/x86/kvm/vmx/tdx_ops.h index 85adbf49c277..8cc2f01c509b 100644 --- a/arch/x86/kvm/vmx/tdx_ops.h +++ b/arch/x86/kvm/vmx/tdx_ops.h @@ -9,12 +9,15 @@ #include #include #include +#include #include "tdx_errno.h" #include "tdx_arch.h" #ifdef CONFIG_INTEL_TDX_HOST +void pr_tdx_error(u64 op, u64 error_code, const struct tdx_module_output *out); + static inline u64 tdh_mng_addcx(hpa_t tdr, hpa_t addr) { clflush_cache_range(__va(addr), PAGE_SIZE);
From patchwork Sun Oct 30 06:22:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12832
From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 013/108] [MARKER] The start of TDX KVM patch series: TD VM creation/destruction Date: Sat, 29 Oct 2022 23:22:14 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0
From: Isaku Yamahata This empty commit marks the start of the patch series for TD VM creation/destruction. Signed-off-by: Isaku Yamahata --- Documentation/virt/kvm/intel-tdx-layer-status.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/virt/kvm/intel-tdx-layer-status.rst b/Documentation/virt/kvm/intel-tdx-layer-status.rst index b7a14bc73853..5e0deaebf843 100644 --- a/Documentation/virt/kvm/intel-tdx-layer-status.rst +++ b/Documentation/virt/kvm/intel-tdx-layer-status.rst @@ -15,8 +15,8 @@ Patch Layer status ------------------ Patch layer Status * TDX, VMX coexistence: Applied -* TDX architectural definitions: Applying -* TD VM creation/destruction: Not yet +* TDX architectural definitions: Applied +* TD VM creation/destruction: Applying * TD vcpu creation/destruction: Not yet * TDX EPT violation: Not yet * TD finalization: Not yet
From patchwork Sun Oct 30 06:22:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12836
From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Sean Christopherson Subject: [PATCH v10 014/108] KVM: TDX: Stub in tdx.h with structs, accessors, and VMCS helpers Date: Sat, 29 Oct 2022 23:22:15 -0700 Message-Id: <75ac959fddbfd057d3ae8ad73e91708a2da60965.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0
From: Sean Christopherson Stub in kvm_tdx, vcpu_tdx, and their various accessors. TDX defines SEAMCALL APIs to access the TDX control structures that correspond to the VMX VMCS. Introduce helper accessors to hide the SEAMCALL ABI details.
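For illustration only (not part of this patch): a minimal userspace sketch of how a token-pasting macro can stamp out width-specific read/write/setbit/clearbit accessors on top of one masked-write primitive. It assumes, based on how the accessors in this patch call tdh_vp_wr(), that the write takes a value and a mask and updates only the masked bits. Every demo_* and DEMO_* name, and the in-memory field array, is hypothetical; no SEAMCALL is issued.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-vCPU fields that the TDX module owns. */
#define DEMO_NR_FIELDS 8
static uint64_t demo_fields[DEMO_NR_FIELDS];

/* All-ones mask covering the low `bits` bits (valid for bits in 1..64). */
#define DEMO_MASK(bits) (~0ULL >> (64 - (bits)))

/* Hypothetical analogue of a field read: returns the raw 64-bit value. */
static uint64_t demo_vp_rd(uint32_t field)
{
	assert(field < DEMO_NR_FIELDS);
	return demo_fields[field];
}

/* Hypothetical analogue of a masked write: only bits set in mask change. */
static void demo_vp_wr(uint32_t field, uint64_t val, uint64_t mask)
{
	assert(field < DEMO_NR_FIELDS);
	demo_fields[field] = (demo_fields[field] & ~mask) | (val & mask);
}

/* Generate demo_read<N>, demo_write<N>, demo_setbit<N>, demo_clearbit<N>. */
#define DEMO_BUILD_ACCESSORS(bits)					\
static inline uint##bits##_t demo_read##bits(uint32_t field)		\
{									\
	/* Truncate the raw value to the accessor width. */		\
	return (uint##bits##_t)demo_vp_rd(field);			\
}									\
static inline void demo_write##bits(uint32_t field,			\
				    uint##bits##_t val)			\
{									\
	/* Write only the low `bits` bits of the field. */		\
	demo_vp_wr(field, val, DEMO_MASK(bits));			\
}									\
static inline void demo_setbit##bits(uint32_t field, uint64_t bit)	\
{									\
	/* value == mask: the selected bits become 1. */		\
	demo_vp_wr(field, bit, bit);					\
}									\
static inline void demo_clearbit##bits(uint32_t field, uint64_t bit)	\
{									\
	/* value == 0 under the mask: the selected bits become 0. */	\
	demo_vp_wr(field, 0, bit);					\
}

DEMO_BUILD_ACCESSORS(16)
DEMO_BUILD_ACCESSORS(32)
DEMO_BUILD_ACCESSORS(64)
```

With these definitions, demo_write16(1, 0x1234) changes only the low 16 bits of field 1, and demo_setbit32(2, 0x4) followed by demo_clearbit32(2, 0x4) sets and then clears bit 2 of field 2 without touching its other bits.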
Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata --- arch/x86/kvm/vmx/tdx.h | 118 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 116 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index 473013265bd8..98999bf3f188 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -3,14 +3,27 @@ #define __KVM_X86_TDX_H #ifdef CONFIG_INTEL_TDX_HOST + +#include "tdx_ops.h" + +struct tdx_td_page { + unsigned long va; + hpa_t pa; + bool added; +}; + struct kvm_tdx { struct kvm kvm; - /* TDX specific members follow. */ + + struct tdx_td_page tdr; + struct tdx_td_page *tdcs; }; struct vcpu_tdx { struct kvm_vcpu vcpu; - /* TDX specific members follow. */ + + struct tdx_td_page tdvpr; + struct tdx_td_page *tdvpx; }; static inline bool is_td(struct kvm *kvm) @@ -32,6 +45,107 @@ static inline struct vcpu_tdx *to_tdx(struct kvm_vcpu *vcpu) { return container_of(vcpu, struct vcpu_tdx, vcpu); } + +static __always_inline void tdvps_vmcs_check(u32 field, u8 bits) +{ +#define VMCS_ENC_ACCESS_TYPE_MASK 0x1UL +#define VMCS_ENC_ACCESS_TYPE_FULL 0x0UL +#define VMCS_ENC_ACCESS_TYPE_HIGH 0x1UL +#define VMCS_ENC_ACCESS_TYPE(field) ((field) & VMCS_ENC_ACCESS_TYPE_MASK) + + /* TDX is 64bit only. HIGH field isn't supported. */ + BUILD_BUG_ON_MSG(__builtin_constant_p(field) && + VMCS_ENC_ACCESS_TYPE(field) == VMCS_ENC_ACCESS_TYPE_HIGH, + "Read/Write to TD VMCS *_HIGH fields not supported"); + + BUILD_BUG_ON(bits != 16 && bits != 32 && bits != 64); + +#define VMCS_ENC_WIDTH_MASK GENMASK(14, 13) +#define VMCS_ENC_WIDTH_16BIT (0UL << 13) +#define VMCS_ENC_WIDTH_64BIT (1UL << 13) +#define VMCS_ENC_WIDTH_32BIT (2UL << 13) +#define VMCS_ENC_WIDTH_NATURAL (3UL << 13) +#define VMCS_ENC_WIDTH(field) ((field) & VMCS_ENC_WIDTH_MASK) + + /* TDX is 64bit only. i.e. natural width = 64bit. 
*/ + BUILD_BUG_ON_MSG(bits != 64 && __builtin_constant_p(field) && + (VMCS_ENC_WIDTH(field) == VMCS_ENC_WIDTH_64BIT || + VMCS_ENC_WIDTH(field) == VMCS_ENC_WIDTH_NATURAL), + "Invalid TD VMCS access for 64-bit field"); + BUILD_BUG_ON_MSG(bits != 32 && __builtin_constant_p(field) && + VMCS_ENC_WIDTH(field) == VMCS_ENC_WIDTH_32BIT, + "Invalid TD VMCS access for 32-bit field"); + BUILD_BUG_ON_MSG(bits != 16 && __builtin_constant_p(field) && + VMCS_ENC_WIDTH(field) == VMCS_ENC_WIDTH_16BIT, + "Invalid TD VMCS access for 16-bit field"); +} + +static __always_inline void tdvps_state_non_arch_check(u64 field, u8 bits) {} +static __always_inline void tdvps_management_check(u64 field, u8 bits) {} + +#define TDX_BUILD_TDVPS_ACCESSORS(bits, uclass, lclass) \ +static __always_inline u##bits td_##lclass##_read##bits(struct vcpu_tdx *tdx, \ + u32 field) \ +{ \ + struct tdx_module_output out; \ + u64 err; \ + \ + tdvps_##lclass##_check(field, bits); \ + err = tdh_vp_rd(tdx->tdvpr.pa, TDVPS_##uclass(field), &out); \ + if (unlikely(err)) { \ + pr_err("TDH_VP_RD["#uclass".0x%x] failed: 0x%llx\n", \ + field, err); \ + return 0; \ + } \ + return (u##bits)out.r8; \ +} \ +static __always_inline void td_##lclass##_write##bits(struct vcpu_tdx *tdx, \ + u32 field, u##bits val) \ +{ \ + struct tdx_module_output out; \ + u64 err; \ + \ + tdvps_##lclass##_check(field, bits); \ + err = tdh_vp_wr(tdx->tdvpr.pa, TDVPS_##uclass(field), val, \ + GENMASK_ULL(bits - 1, 0), &out); \ + if (unlikely(err)) \ + pr_err("TDH_VP_WR["#uclass".0x%x] = 0x%llx failed: 0x%llx\n", \ + field, (u64)val, err); \ +} \ +static __always_inline void td_##lclass##_setbit##bits(struct vcpu_tdx *tdx, \ + u32 field, u64 bit) \ +{ \ + struct tdx_module_output out; \ + u64 err; \ + \ + tdvps_##lclass##_check(field, bits); \ + err = tdh_vp_wr(tdx->tdvpr.pa, TDVPS_##uclass(field), bit, bit, \ + &out); \ + if (unlikely(err)) \ + pr_err("TDH_VP_WR["#uclass".0x%x] |= 0x%llx failed: 0x%llx\n", \ + field, bit, err); \ +} \ +static 
__always_inline void td_##lclass##_clearbit##bits(struct vcpu_tdx *tdx, \ + u32 field, u64 bit) \ +{ \ + struct tdx_module_output out; \ + u64 err; \ + \ + tdvps_##lclass##_check(field, bits); \ + err = tdh_vp_wr(tdx->tdvpr.pa, TDVPS_##uclass(field), 0, bit, \ + &out); \ + if (unlikely(err)) \ + pr_err("TDH_VP_WR["#uclass".0x%x] &= ~0x%llx failed: 0x%llx\n", \ + field, bit, err); \ +} + +TDX_BUILD_TDVPS_ACCESSORS(16, VMCS, vmcs); +TDX_BUILD_TDVPS_ACCESSORS(32, VMCS, vmcs); +TDX_BUILD_TDVPS_ACCESSORS(64, VMCS, vmcs); + +TDX_BUILD_TDVPS_ACCESSORS(64, STATE_NON_ARCH, state_non_arch); +TDX_BUILD_TDVPS_ACCESSORS(8, MANAGEMENT, management); + #else struct kvm_tdx { struct kvm kvm;
From patchwork Sun Oct 30 06:22:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12877
From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 015/108] x86/cpu: Add helper functions to allocate/free TDX private host key id Date: Sat, 29 Oct 2022 23:22:16 -0700 Message-Id: <5ee7c6dc4ba03b5d5166e015c148ef534ee53f8e.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0
From: Isaku Yamahata A TDX private host key ID (HKID) is assigned to a guest TD. The memory controller encrypts guest TD memory with the assigned HKID. Add helper functions to allocate and free TDX private host key IDs so that TDX KVM can manage them. Also export the global TDX private host key ID, which is used to encrypt the TDX module, its memory, and some dynamic data (TDR). When the VMM releases an encrypted page for reuse, the page must be flushed with the host key ID that was used to encrypt it. The VMM needs the global TDX private host key ID to flush the pages that the TDX module accesses with that key ID. Signed-off-by: Isaku Yamahata --- arch/x86/include/asm/tdx.h | 13 +++++++++++++ arch/x86/virt/vmx/tdx/tdx.c | 28 +++++++++++++++++++++++++++- 2 files changed, 40 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index ba2e4c69fb9f..cd304d323d33 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -186,6 +186,17 @@ struct tdsysinfo_struct { const struct tdsysinfo_struct *tdx_get_sysinfo(void); bool platform_tdx_enabled(void); int tdx_enable(void); + +/* + * Key id globally used by TDX module: TDX module maps TDR with this TDX global + * key id. TDR includes key id assigned to the TD. Then TDX module maps other + * TD-related pages with the assigned key id. TDR requires this TDX global key + * id for cache flush unlike other TD-related pages.
+ */ +extern u32 tdx_global_keyid __read_mostly; +int tdx_keyid_alloc(void); +void tdx_keyid_free(int keyid); + u64 __seamcall(u64 op, u64 rcx, u64 rdx, u64 r8, u64 r9, struct tdx_module_output *out); #else /* !CONFIG_INTEL_TDX_HOST */ @@ -193,6 +204,8 @@ struct tdsysinfo_struct; static inline const struct tdsysinfo_struct *tdx_get_sysinfo(void) { return NULL; } static inline bool platform_tdx_enabled(void) { return false; } static inline int tdx_enable(void) { return -ENODEV; } +static inline int tdx_keyid_alloc(void) { return -EOPNOTSUPP; } +static inline void tdx_keyid_free(int keyid) { } #endif /* CONFIG_INTEL_TDX_HOST */ #endif /* !__ASSEMBLY__ */ diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c index 6fb630fa7d09..0625ced219d7 100644 --- a/arch/x86/virt/vmx/tdx/tdx.c +++ b/arch/x86/virt/vmx/tdx/tdx.c @@ -64,7 +64,8 @@ static struct cmr_info tdx_cmr_array[MAX_CMRS] __aligned(CMR_INFO_ARRAY_ALIGNMEN static int tdx_cmr_num; /* TDX module global KeyID. Used in TDH.SYS.CONFIG ABI. */ -static u32 tdx_global_keyid; +u32 tdx_global_keyid __read_mostly; +EXPORT_SYMBOL_GPL(tdx_global_keyid); /* * Detect TDX private KeyIDs to see whether TDX has been enabled by the @@ -113,6 +114,31 @@ static void __init clear_tdx(void) tdx_keyid_start = tdx_keyid_num = 0; } +/* TDX KeyID pool */ +static DEFINE_IDA(tdx_keyid_pool); + +int tdx_keyid_alloc(void) +{ + if (WARN_ON_ONCE(!tdx_keyid_start || !tdx_keyid_num)) + return -EINVAL; + + /* The first keyID is reserved for the global key. */ + return ida_alloc_range(&tdx_keyid_pool, tdx_keyid_start + 1, + tdx_keyid_start + tdx_keyid_num - 1, + GFP_KERNEL); +} +EXPORT_SYMBOL_GPL(tdx_keyid_alloc); + +void tdx_keyid_free(int keyid) +{ + /* keyid = 0 is reserved. 
*/ + if (!keyid || keyid <= 0) + return; + + ida_free(&tdx_keyid_pool, keyid); +} +EXPORT_SYMBOL_GPL(tdx_keyid_free); + static void __init tdx_memory_destroy(void) { while (!list_empty(&tdx_memlist)) {
From patchwork Sun Oct 30 06:22:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12839
FFcbOAqtVIZtAxoonA1FAuqstjaT8EbIL/VGB3XF0pTeOLwakIFy/Q/IS 5wgyArpfgCE7U391todjX2zy3m6fd3H+hbEtBC82WBvvaqdlygzQYud0/ g==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037128" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037128" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:23:59 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392879" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392879" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:23:59 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Sean Christopherson , Kai Huang Subject: [PATCH v10 016/108] KVM: TDX: create/destroy VM structure Date: Sat, 29 Oct 2022 23:22:17 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092780211901160?= X-GMAIL-MSGID: =?utf-8?q?1748092780211901160?= From: Sean Christopherson As the first step to create TDX guest, create/destroy VM struct. Assign TDX private Host Key ID (HKID) to the TDX guest for memory encryption and allocate extra pages for the TDX guest. On destruction, free allocated pages, and HKID. 

Before private page tables are torn down, TDX requires some resources of
the guest TD to be destroyed first (e.g. the keyID must already have been
reclaimed). Add a flush_shadow_all_private callback that is invoked before
the private page tables are torn down to handle this.

Add a second kvm_x86_ops hook, vm_free, in kvm_arch_destroy_vm() to support
TDX's destruction path, which needs to first put the VM into a teardown
state, then free per-vCPU resources, and finally free per-VM resources.

Co-developed-by: Kai Huang
Signed-off-by: Kai Huang
Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/kvm-x86-ops.h |   2 +
 arch/x86/include/asm/kvm_host.h    |   2 +
 arch/x86/kvm/vmx/main.c            |  34 ++-
 arch/x86/kvm/vmx/tdx.c             | 411 +++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/tdx.h             |   2 +
 arch/x86/kvm/vmx/x86_ops.h         |  11 +
 arch/x86/kvm/x86.c                 |   8 +
 7 files changed, 467 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 8a5c5ae70bc5..3a29a6b31ee8 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -21,7 +21,9 @@ KVM_X86_OP(has_emulated_msr)
 KVM_X86_OP(vcpu_after_set_cpuid)
 KVM_X86_OP(is_vm_type_supported)
 KVM_X86_OP(vm_init)
+KVM_X86_OP_OPTIONAL(flush_shadow_all_private)
 KVM_X86_OP_OPTIONAL(vm_destroy)
+KVM_X86_OP_OPTIONAL(vm_free)
 KVM_X86_OP_OPTIONAL_RET0(vcpu_precreate)
 KVM_X86_OP(vcpu_create)
 KVM_X86_OP(vcpu_free)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2a41a93a80f3..2870155ce6fb 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1472,7 +1472,9 @@ struct kvm_x86_ops {
 	bool (*is_vm_type_supported)(unsigned long vm_type);
 	unsigned int vm_size;
 	int (*vm_init)(struct kvm *kvm);
+	void (*flush_shadow_all_private)(struct kvm *kvm);
 	void (*vm_destroy)(struct kvm *kvm);
+	void (*vm_free)(struct kvm *kvm);

 	/* Create, but do not attach this VCPU */
 	int (*vcpu_precreate)(struct kvm *kvm);
diff --git a/arch/x86/kvm/vmx/main.c
b/arch/x86/kvm/vmx/main.c index 0900ff2f2390..d01a946a18cf 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -29,18 +29,44 @@ static __init int vt_hardware_setup(void) return 0; } +static void vt_hardware_unsetup(void) +{ + tdx_hardware_unsetup(); + vmx_hardware_unsetup(); +} + static int vt_vm_init(struct kvm *kvm) { if (is_td(kvm)) - return -EOPNOTSUPP; /* Not ready to create guest TD yet. */ + return tdx_vm_init(kvm); return vmx_vm_init(kvm); } +static void vt_flush_shadow_all_private(struct kvm *kvm) +{ + if (is_td(kvm)) + return tdx_mmu_release_hkid(kvm); +} + +static void vt_vm_destroy(struct kvm *kvm) +{ + if (is_td(kvm)) + return; + + vmx_vm_destroy(kvm); +} + +static void vt_vm_free(struct kvm *kvm) +{ + if (is_td(kvm)) + return tdx_vm_free(kvm); +} + struct kvm_x86_ops vt_x86_ops __initdata = { .name = "kvm_intel", - .hardware_unsetup = vmx_hardware_unsetup, + .hardware_unsetup = vt_hardware_unsetup, .check_processor_compatibility = vmx_check_processor_compatibility, .hardware_enable = vmx_hardware_enable, @@ -50,7 +76,9 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .is_vm_type_supported = vt_is_vm_type_supported, .vm_size = sizeof(struct kvm_vmx), .vm_init = vt_vm_init, - .vm_destroy = vmx_vm_destroy, + .flush_shadow_all_private = vt_flush_shadow_all_private, + .vm_destroy = vt_vm_destroy, + .vm_free = vt_vm_free, .vcpu_precreate = vmx_vcpu_precreate, .vcpu_create = vmx_vcpu_create, diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 530e72f85762..ec88dde0d300 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -32,6 +32,401 @@ struct tdx_capabilities { /* Capabilities of KVM + the TDX module. */ static struct tdx_capabilities tdx_caps; +/* + * Some TDX SEAMCALLs (TDH.MNG.CREATE, TDH.PHYMEM.CACHE.WB, + * TDH.MNG.KEY.RECLAIMID, TDH.MNG.KEY.FREEID etc) tries to acquire a global lock + * internally in TDX module. 
If failed, TDX_OPERAND_BUSY is returned without + * spinning or waiting due to a constraint on execution time. It's caller's + * responsibility to avoid race (or retry on TDX_OPERAND_BUSY). Use this mutex + * to avoid race in TDX module because the kernel knows better about scheduling. + */ +static DEFINE_MUTEX(tdx_lock); +static struct mutex *tdx_mng_key_config_lock; + +static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid) +{ + return pa | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits); +} + +static inline bool is_td_created(struct kvm_tdx *kvm_tdx) +{ + return kvm_tdx->tdr.added; +} + +static inline void tdx_hkid_free(struct kvm_tdx *kvm_tdx) +{ + tdx_keyid_free(kvm_tdx->hkid); + kvm_tdx->hkid = -1; +} + +static inline bool is_hkid_assigned(struct kvm_tdx *kvm_tdx) +{ + return kvm_tdx->hkid > 0; +} + +static void tdx_clear_page(unsigned long page) +{ + const void *zero_page = (const void *) __va(page_to_phys(ZERO_PAGE(0))); + unsigned long i; + + /* + * Zeroing the page is only necessary for systems with MKTME-i: + * when re-assign one page from old keyid to a new keyid, MOVDIR64B is + * required to clear/write the page with new keyid to prevent integrity + * error when read on the page with new keyid. + * + * The cache line could be poisoned (even without MKTME-i), clear the + * poison bit. + */ + for (i = 0; i < PAGE_SIZE; i += 64) + movdir64b((void *)(page + i), zero_page); + /* + * MOVDIR64B store uses WC buffer. Prevent following memory reads + * from seeing potentially poisoned cache. + */ + __mb(); +} + +static int tdx_reclaim_page(unsigned long va, hpa_t pa, bool do_wb, u16 hkid) +{ + struct tdx_module_output out; + u64 err; + + do { + err = tdh_phymem_page_reclaim(pa, &out); + /* + * TDH.PHYMEM.PAGE.RECLAIM is allowed only when TD is shutdown. + * state. i.e. destructing TD. + * TDH.PHYMEM.PAGE.RECLAIM requires TDR and target page. + * Because we're destructing TD, it's rare to contend with TDR. 
+ */ + } while (err == (TDX_OPERAND_BUSY | TDX_OPERAND_ID_RCX)); + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_PHYMEM_PAGE_RECLAIM, err, &out); + return -EIO; + } + + if (do_wb) { + /* + * Only TDR page gets into this path. No contention is expected + * because the last page of TD. + */ + err = tdh_phymem_page_wbinvd(set_hkid_to_hpa(pa, hkid)); + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err, NULL); + return -EIO; + } + } + + tdx_clear_page(va); + return 0; +} + +static int tdx_alloc_td_page(struct tdx_td_page *page) +{ + page->va = __get_free_page(GFP_KERNEL_ACCOUNT); + if (!page->va) + return -ENOMEM; + + page->pa = __pa(page->va); + return 0; +} + +static inline void tdx_mark_td_page_added(struct tdx_td_page *page) +{ + WARN_ON_ONCE(page->added); + page->added = true; +} + +static void tdx_reclaim_td_page(struct tdx_td_page *page) +{ + if (page->added) { + /* + * TDCX are being reclaimed. TDX module maps TDCX with HKID + * assigned to the TD. Here the cache associated to the TD + * was already flushed by TDH.PHYMEM.CACHE.WB before here, So + * cache doesn't need to be flushed again. + */ + if (tdx_reclaim_page(page->va, page->pa, false, 0)) + return; + + page->added = false; + } + if (page->va) { + free_page(page->va); + page->va = 0; + } +} + +static int tdx_do_tdh_phymem_cache_wb(void *param) +{ + u64 err = 0; + + do { + err = tdh_phymem_cache_wb(!!err); + } while (err == TDX_INTERRUPTED_RESUMABLE); + + /* Other thread may have done for us. 
*/ + if (err == TDX_NO_HKID_READY_TO_WBCACHE) + err = TDX_SUCCESS; + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_PHYMEM_CACHE_WB, err, NULL); + return -EIO; + } + + return 0; +} + +void tdx_mmu_release_hkid(struct kvm *kvm) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + cpumask_var_t packages; + bool cpumask_allocated; + u64 err; + int ret; + int i; + + if (!is_hkid_assigned(kvm_tdx)) + return; + + if (!is_td_created(kvm_tdx)) + goto free_hkid; + + cpumask_allocated = zalloc_cpumask_var(&packages, GFP_KERNEL); + cpus_read_lock(); + for_each_online_cpu(i) { + if (cpumask_allocated && + cpumask_test_and_set_cpu(topology_physical_package_id(i), + packages)) + continue; + + /* + * We can destroy multiple the guest TDs simultaneously. + * Prevent tdh_phymem_cache_wb from returning TDX_BUSY by + * serialization. + */ + mutex_lock(&tdx_lock); + ret = smp_call_on_cpu(i, tdx_do_tdh_phymem_cache_wb, NULL, 1); + mutex_unlock(&tdx_lock); + if (ret) + break; + } + cpus_read_unlock(); + free_cpumask_var(packages); + + mutex_lock(&tdx_lock); + err = tdh_mng_key_freeid(kvm_tdx->tdr.pa); + mutex_unlock(&tdx_lock); + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_MNG_KEY_FREEID, err, NULL); + pr_err("tdh_mng_key_freeid failed. HKID %d is leaked.\n", + kvm_tdx->hkid); + return; + } + +free_hkid: + tdx_hkid_free(kvm_tdx); +} + +void tdx_vm_free(struct kvm *kvm) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + int i; + + /* Can't reclaim or free TD pages if teardown failed. */ + if (is_hkid_assigned(kvm_tdx)) + return; + + if (kvm_tdx->tdcs) { + for (i = 0; i < tdx_caps.tdcs_nr_pages; i++) + tdx_reclaim_td_page(&kvm_tdx->tdcs[i]); + kfree(kvm_tdx->tdcs); + } + + /* + * TDX module maps TDR with TDX global HKID. TDX module may access TDR + * while operating on TD (Especially reclaiming TDCS). Cache flush with + * TDX global HKID is needed. 
+ */ + if (kvm_tdx->tdr.added && + tdx_reclaim_page(kvm_tdx->tdr.va, kvm_tdx->tdr.pa, true, + tdx_global_keyid)) + return; + + free_page(kvm_tdx->tdr.va); +} + +static int tdx_do_tdh_mng_key_config(void *param) +{ + hpa_t *tdr_p = param; + u64 err; + + do { + err = tdh_mng_key_config(*tdr_p); + + /* + * If it failed to generate a random key, retry it because this + * is typically caused by an entropy error of the CPU's random + * number generator. + */ + } while (err == TDX_KEY_GENERATION_FAILED); + + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_MNG_KEY_CONFIG, err, NULL); + return -EIO; + } + + return 0; +} + +int tdx_vm_init(struct kvm *kvm) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + cpumask_var_t packages; + int ret, i; + u64 err; + + ret = tdx_keyid_alloc(); + if (ret < 0) + return ret; + kvm_tdx->hkid = ret; + + ret = tdx_alloc_td_page(&kvm_tdx->tdr); + if (ret) + goto free_hkid; + + kvm_tdx->tdcs = kcalloc(tdx_caps.tdcs_nr_pages, sizeof(*kvm_tdx->tdcs), + GFP_KERNEL_ACCOUNT | __GFP_ZERO); + if (!kvm_tdx->tdcs) + goto free_tdr; + for (i = 0; i < tdx_caps.tdcs_nr_pages; i++) { + ret = tdx_alloc_td_page(&kvm_tdx->tdcs[i]); + if (ret) + goto free_tdcs; + } + + if (!zalloc_cpumask_var(&packages, GFP_KERNEL)) { + ret = -ENOMEM; + goto free_tdcs; + } + cpus_read_lock(); + /* + * Need at least one CPU of the package to be online in order to + * program all packages for host key id. Check it. + */ + for_each_present_cpu(i) + cpumask_set_cpu(topology_physical_package_id(i), packages); + for_each_online_cpu(i) + cpumask_clear_cpu(topology_physical_package_id(i), packages); + if (!cpumask_empty(packages)) { + ret = -EIO; + /* + * Because it's hard for human operator to figure out the + * reason, warn it. + */ + pr_warn("All packages need to have online CPU to create TD. 
Online CPU and retry.\n"); + goto free_packages; + } + + /* + * Acquire global lock to avoid TDX_OPERAND_BUSY: + * TDH.MNG.CREATE and other APIs try to lock the global Key Owner + * Table (KOT) to track the assigned TDX private HKID. It doesn't spin + * to acquire the lock, returns TDX_OPERAND_BUSY instead, and let the + * caller to handle the contention. This is because of time limitation + * usable inside the TDX module and OS/VMM knows better about process + * scheduling. + * + * APIs to acquire the lock of KOT: + * TDH.MNG.CREATE, TDH.MNG.KEY.FREEID, TDH.MNG.VPFLUSHDONE, and + * TDH.PHYMEM.CACHE.WB. + */ + mutex_lock(&tdx_lock); + err = tdh_mng_create(kvm_tdx->tdr.pa, kvm_tdx->hkid); + mutex_unlock(&tdx_lock); + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_MNG_CREATE, err, NULL); + ret = -EIO; + goto free_packages; + } + tdx_mark_td_page_added(&kvm_tdx->tdr); + + for_each_online_cpu(i) { + int pkg = topology_physical_package_id(i); + + if (cpumask_test_and_set_cpu(pkg, packages)) + continue; + + /* + * Program the memory controller in the package with an + * encryption key associated to a TDX private host key id + * assigned to this TDR. Concurrent operations on same memory + * controller results in TDX_OPERAND_BUSY. Avoid this race by + * mutex. + */ + mutex_lock(&tdx_mng_key_config_lock[pkg]); + ret = smp_call_on_cpu(i, tdx_do_tdh_mng_key_config, + &kvm_tdx->tdr.pa, true); + mutex_unlock(&tdx_mng_key_config_lock[pkg]); + if (ret) + break; + } + cpus_read_unlock(); + free_cpumask_var(packages); + if (ret) + goto teardown; + + for (i = 0; i < tdx_caps.tdcs_nr_pages; i++) { + err = tdh_mng_addcx(kvm_tdx->tdr.pa, kvm_tdx->tdcs[i].pa); + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_MNG_ADDCX, err, NULL); + ret = -EIO; + goto teardown; + } + tdx_mark_td_page_added(&kvm_tdx->tdcs[i]); + } + + /* + * Note, TDH_MNG_INIT cannot be invoked here. TDH_MNG_INIT requires a dedicated + * ioctl() to define the configure CPUID values for the TD. 
+ */ + return 0; + + /* + * The sequence for freeing resources from a partially initialized TD + * varies based on where in the initialization flow failure occurred. + * Simply use the full teardown and destroy, which naturally play nice + * with partial initialization. + */ +teardown: + tdx_mmu_release_hkid(kvm); + tdx_vm_free(kvm); + return ret; + +free_packages: + cpus_read_unlock(); + free_cpumask_var(packages); +free_tdcs: + for (i = 0; i < tdx_caps.tdcs_nr_pages; i++) { + if (!kvm_tdx->tdcs[i].va) + continue; + free_page(kvm_tdx->tdcs[i].va); + } + kfree(kvm_tdx->tdcs); + kvm_tdx->tdcs = NULL; +free_tdr: + if (kvm_tdx->tdr.va) { + free_page(kvm_tdx->tdr.va); + kvm_tdx->tdr.added = false; + kvm_tdx->tdr.va = 0; + kvm_tdx->tdr.pa = 0; + } +free_hkid: + if (kvm_tdx->hkid != -1) + tdx_hkid_free(kvm_tdx); + return ret; +} + static int __init tdx_module_setup(void) { const struct tdsysinfo_struct *tdsysinfo; @@ -82,6 +477,8 @@ bool tdx_is_vm_type_supported(unsigned long type) int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { + int max_pkgs; + int i; int r; if (!enable_ept) { @@ -95,6 +492,14 @@ int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops) return -ENODEV; } + max_pkgs = topology_max_packages(); + tdx_mng_key_config_lock = kcalloc(max_pkgs, sizeof(*tdx_mng_key_config_lock), + GFP_KERNEL); + if (!tdx_mng_key_config_lock) + return -ENOMEM; + for (i = 0; i < max_pkgs; i++) + mutex_init(&tdx_mng_key_config_lock[i]); + /* TDX requires VMX. */ r = vmxon_all(); if (!r) @@ -103,3 +508,9 @@ int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops) return r; } + +void tdx_hardware_unsetup(void) +{ + /* kfree accepts NULL. 
*/ + kfree(tdx_mng_key_config_lock); +} diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index 98999bf3f188..938314635b47 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -17,6 +17,8 @@ struct kvm_tdx { struct tdx_td_page tdr; struct tdx_td_page *tdcs; + + int hkid; }; struct vcpu_tdx { diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index ac1688b0b0e3..95da978c9aa9 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -133,9 +133,20 @@ void vmx_setup_mce(struct kvm_vcpu *vcpu); #ifdef CONFIG_INTEL_TDX_HOST int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops); bool tdx_is_vm_type_supported(unsigned long type); +void tdx_hardware_unsetup(void); + +int tdx_vm_init(struct kvm *kvm); +void tdx_mmu_release_hkid(struct kvm *kvm); +void tdx_vm_free(struct kvm *kvm); #else static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return 0; } static inline bool tdx_is_vm_type_supported(unsigned long type) { return false; } +static inline void tdx_hardware_unsetup(void) {} + +static inline int tdx_vm_init(struct kvm *kvm) { return -EOPNOTSUPP; } +static inline void tdx_mmu_release_hkid(struct kvm *kvm) {} +static inline void tdx_flush_shadow_all_private(struct kvm *kvm) {} +static inline void tdx_vm_free(struct kvm *kvm) {} #endif #endif /* __KVM_X86_VMX_X86_OPS_H */ diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 91053fdc4512..4b22196cb12c 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -12702,6 +12702,7 @@ void kvm_arch_destroy_vm(struct kvm *kvm) kvm_page_track_cleanup(kvm); kvm_xen_destroy_vm(kvm); kvm_hv_destroy_vm(kvm); + static_call_cond(kvm_x86_vm_free)(kvm); } static void memslot_rmap_free(struct kvm_memory_slot *slot) @@ -13012,6 +13013,13 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, void kvm_arch_flush_shadow_all(struct kvm *kvm) { + /* + * kvm_mmu_zap_all() zaps both private and shared page tables. 
Before
+	 * tearing down private page tables, TDX requires some TD resources to
+	 * be destroyed (e.g. the keyID must have been reclaimed). Invoke
+	 * kvm_x86_flush_shadow_all_private() for this.
+	 */
+	static_call_cond(kvm_x86_flush_shadow_all_private)(kvm);
 	kvm_mmu_zap_all(kvm);
 }

From patchwork Sun Oct 30 06:22:18 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12841
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 017/108] KVM: TDX: Refuse to unplug the last cpu on the package
Date: Sat, 29 Oct 2022 23:22:18 -0700
X-Mailer: git-send-email 2.25.1
X-Mailing-List: linux-kernel@vger.kernel.org

From: Isaku Yamahata

Reclaiming a TDX host key ID (HKID), i.e. when deleting a guest TD,
requires calling TDH.PHYMEM.PAGE.WBINVD on all packages. If any TDX HKID
is in use, refuse to offline the last online CPU of a package.

Signed-off-by: Isaku Yamahata --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/vmx/main.c | 1 + arch/x86/kvm/vmx/tdx.c | 40 +++++++++++++++++++++++++++++- arch/x86/kvm/vmx/x86_ops.h | 2 ++ arch/x86/kvm/x86.c | 27 ++++++++++++-------- 6 files changed, 61 insertions(+), 11 deletions(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index 3a29a6b31ee8..0ceb8e58a6c0 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -17,6 +17,7 @@ BUILD_BUG_ON(1) KVM_X86_OP(hardware_enable) KVM_X86_OP(hardware_disable) KVM_X86_OP(hardware_unsetup) +KVM_X86_OP_OPTIONAL_RET0(offline_cpu) KVM_X86_OP(has_emulated_msr) KVM_X86_OP(vcpu_after_set_cpuid) KVM_X86_OP(is_vm_type_supported) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 2870155ce6fb..50b39d0071ff 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1466,6 +1466,7 @@ struct kvm_x86_ops { int (*hardware_enable)(void); void (*hardware_disable)(void); void (*hardware_unsetup)(void); + int (*offline_cpu)(void); bool (*has_emulated_msr)(struct kvm *kvm, u32 index); void (*vcpu_after_set_cpuid)(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index d01a946a18cf..0918d1e6d2f3 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -67,6 +67,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .name = "kvm_intel", .hardware_unsetup = vt_hardware_unsetup, + .offline_cpu = tdx_offline_cpu, .check_processor_compatibility = vmx_check_processor_compatibility, .hardware_enable = vmx_hardware_enable, diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index ec88dde0d300..64229c3b3c5a 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -42,6 +42,7 @@ static struct tdx_capabilities tdx_caps; */ static DEFINE_MUTEX(tdx_lock); static struct mutex *tdx_mng_key_config_lock; +static 
atomic_t nr_configured_hkid; static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid) { @@ -222,7 +223,8 @@ void tdx_mmu_release_hkid(struct kvm *kvm) pr_err("tdh_mng_key_freeid failed. HKID %d is leaked.\n", kvm_tdx->hkid); return; - } + } else + atomic_dec(&nr_configured_hkid); free_hkid: tdx_hkid_free(kvm_tdx); @@ -371,6 +373,8 @@ int tdx_vm_init(struct kvm *kvm) if (ret) break; } + if (!ret) + atomic_inc(&nr_configured_hkid); cpus_read_unlock(); free_cpumask_var(packages); if (ret) @@ -514,3 +518,37 @@ void tdx_hardware_unsetup(void) /* kfree accepts NULL. */ kfree(tdx_mng_key_config_lock); } + +int tdx_offline_cpu(void) +{ + int curr_cpu = smp_processor_id(); + cpumask_var_t packages; + int ret = 0; + int i; + + if (!atomic_read(&nr_configured_hkid)) + return 0; + + /* + * To reclaim hkid, need to call TDH.PHYMEM.PAGE.WBINVD on all packages. + * If this is the last online cpu on the package, refuse offline. + */ + if (!zalloc_cpumask_var(&packages, GFP_KERNEL)) + return -ENOMEM; + + for_each_online_cpu(i) { + if (i != curr_cpu) + cpumask_set_cpu(topology_physical_package_id(i), packages); + } + if (!cpumask_test_cpu(topology_physical_package_id(curr_cpu), packages)) + ret = -EBUSY; + free_cpumask_var(packages); + if (ret) + /* + * Because it's hard for human operator to understand the + * reason, warn it. + */ + pr_warn("TDX requires all packages to have an online CPU. 
" + "Delete all TDs in order to offline all CPUs of a package.\n"); + return ret; +} diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index 95da978c9aa9..b2cb5786830a 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -134,6 +134,7 @@ void vmx_setup_mce(struct kvm_vcpu *vcpu); int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops); bool tdx_is_vm_type_supported(unsigned long type); void tdx_hardware_unsetup(void); +int tdx_offline_cpu(void); int tdx_vm_init(struct kvm *kvm); void tdx_mmu_release_hkid(struct kvm *kvm); @@ -142,6 +143,7 @@ void tdx_vm_free(struct kvm *kvm); static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return 0; } static inline bool tdx_is_vm_type_supported(unsigned long type) { return false; } static inline void tdx_hardware_unsetup(void) {} +static inline int tdx_offline_cpu(void) { return 0; } static inline int tdx_vm_init(struct kvm *kvm) { return -EOPNOTSUPP; } static inline void tdx_mmu_release_hkid(struct kvm *kvm) {} diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 4b22196cb12c..25c30c8c2d9b 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -12337,16 +12337,23 @@ int kvm_arch_online_cpu(unsigned int cpu, int usage_count) int kvm_arch_offline_cpu(unsigned int cpu, int usage_count) { - if (usage_count) { - /* - * arch callback kvm_arch_hardware_disable() assumes that - * preemption is disabled for historical reason. Disable - * preemption until all arch callbacks are fixed. - */ - preempt_disable(); - hardware_disable(NULL); - preempt_enable(); - } + int ret; + + if (!usage_count) + return 0; + + ret = static_call(kvm_x86_offline_cpu)(); + if (ret) + return ret; + + /* + * arch callback kvm_arch_hardware_disable() assumes that preemption is + * disabled for historical reason. Disable preemption until all arch + * callbacks are fixed. 
+ */ + preempt_disable(); + hardware_disable(NULL); + preempt_enable(); return 0; }

From patchwork Sun Oct 30 06:22:19 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12838
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Sean Christopherson
Subject: [PATCH v10 018/108] KVM: TDX: x86: Add ioctl to get TDX systemwide parameters
Date: Sat, 29 Oct 2022 23:22:19 -0700
Message-Id: <178c7ceb19eace04d2745be0feb7623a5cf56738.1667110240.git.isaku.yamahata@intel.com>

From: Sean Christopherson

Implement a system-scoped ioctl to get system-wide parameters for TDX.
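As a reading aid, here is a minimal userspace sketch of how a caller might prepare the KVM_TDX_CAPABILITIES request described by this patch. The struct and function names below (tdx_cmd_mirror, tdx_capabilities_mirror, prepare_caps_query, caps_alloc_size) are hypothetical local mirrors written for illustration, not the uapi definitions themselves; only the field layout is taken from the diff. The sketch assumes the caller then issues KVM_MEMORY_ENCRYPT_OP on the /dev/kvm file descriptor with the prepared command.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirrors of the structs this patch adds to uapi/asm/kvm.h. */
struct tdx_cpuid_config_mirror {
	uint32_t leaf, sub_leaf, eax, ebx, ecx, edx;
};

struct tdx_capabilities_mirror {
	uint64_t attrs_fixed0, attrs_fixed1;
	uint64_t xfam_fixed0, xfam_fixed1;
	uint32_t nr_cpuid_configs;	/* in: entries the buffer can hold */
	uint32_t padding;
	struct tdx_cpuid_config_mirror cpuid_configs[];
};

struct tdx_cmd_mirror {
	uint32_t id;
	uint32_t flags;
	uint64_t data;
	uint64_t error;
	uint64_t unused;
};

/* Bytes needed for a capabilities buffer with room for nr CPUID entries. */
static size_t caps_alloc_size(uint32_t nr)
{
	return sizeof(struct tdx_capabilities_mirror) +
	       (size_t)nr * sizeof(struct tdx_cpuid_config_mirror);
}

/*
 * Allocate a zeroed capabilities buffer announcing room for nr entries and
 * fill in the command. tdx_dev_ioctl() rejects non-zero flags, error and
 * unused, so the whole command is cleared first. Returns NULL on allocation
 * failure; the caller frees the buffer.
 */
static struct tdx_capabilities_mirror *
prepare_caps_query(struct tdx_cmd_mirror *cmd, uint32_t nr)
{
	struct tdx_capabilities_mirror *caps = calloc(1, caps_alloc_size(nr));

	if (!caps)
		return NULL;
	caps->nr_cpuid_configs = nr;

	memset(cmd, 0, sizeof(*cmd));
	cmd->id = 0;	/* KVM_TDX_CAPABILITIES */
	cmd->data = (uint64_t)(uintptr_t)caps;
	return caps;
}
```

Per the tdx_dev_ioctl() code in this patch, the kernel returns -E2BIG when the caller's nr_cpuid_configs is smaller than the number of entries it has, and it does so before writing anything back. A caller following this sketch would therefore have to retry with a larger buffer rather than read the required count from the failed call.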
Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 1 + arch/x86/include/uapi/asm/kvm.h | 48 +++++++++++++++++++++++++++ arch/x86/kvm/vmx/main.c | 2 ++ arch/x86/kvm/vmx/tdx.c | 46 +++++++++++++++++++++++++ arch/x86/kvm/vmx/x86_ops.h | 2 ++ arch/x86/kvm/x86.c | 6 ++++ tools/arch/x86/include/uapi/asm/kvm.h | 48 +++++++++++++++++++++++++++ 8 files changed, 154 insertions(+) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index 0ceb8e58a6c0..4425564647cb 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -118,6 +118,7 @@ KVM_X86_OP(smi_allowed) KVM_X86_OP(enter_smm) KVM_X86_OP(leave_smm) KVM_X86_OP(enable_smi_window) +KVM_X86_OP_OPTIONAL(dev_mem_enc_ioctl) KVM_X86_OP_OPTIONAL(mem_enc_ioctl) KVM_X86_OP_OPTIONAL(mem_enc_register_region) KVM_X86_OP_OPTIONAL(mem_enc_unregister_region) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 50b39d0071ff..1fced310ec63 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1626,6 +1626,7 @@ struct kvm_x86_ops { int (*leave_smm)(struct kvm_vcpu *vcpu, const char *smstate); void (*enable_smi_window)(struct kvm_vcpu *vcpu); + int (*dev_mem_enc_ioctl)(void __user *argp); int (*mem_enc_ioctl)(struct kvm *kvm, void __user *argp); int (*mem_enc_register_region)(struct kvm *kvm, struct kvm_enc_region *argp); int (*mem_enc_unregister_region)(struct kvm *kvm, struct kvm_enc_region *argp); diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 54b08789c402..2ad9666e02a5 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -535,4 +535,52 @@ struct kvm_pmu_event_filter { #define KVM_X86_DEFAULT_VM 0 #define KVM_X86_TDX_VM 1 +/* Trust Domain eXtension sub-ioctl() commands. 
*/ +enum kvm_tdx_cmd_id { + KVM_TDX_CAPABILITIES = 0, + + KVM_TDX_CMD_NR_MAX, +}; + +struct kvm_tdx_cmd { + /* enum kvm_tdx_cmd_id */ + __u32 id; + /* flags for sub-command. If sub-command doesn't use this, set zero. */ + __u32 flags; + /* + * data for each sub-command. An immediate or a pointer to the actual + * data in process virtual address. If sub-command doesn't use it, + * set zero. + */ + __u64 data; + /* + * Auxiliary error code. The sub-command may return TDX SEAMCALL + * status code in addition to -Exxx. + * Defined for consistency with struct kvm_sev_cmd. + */ + __u64 error; + /* Reserved: Defined for consistency with struct kvm_sev_cmd. */ + __u64 unused; +}; + +struct kvm_tdx_cpuid_config { + __u32 leaf; + __u32 sub_leaf; + __u32 eax; + __u32 ebx; + __u32 ecx; + __u32 edx; +}; + +struct kvm_tdx_capabilities { + __u64 attrs_fixed0; + __u64 attrs_fixed1; + __u64 xfam_fixed0; + __u64 xfam_fixed1; + + __u32 nr_cpuid_configs; + __u32 padding; + struct kvm_tdx_cpuid_config cpuid_configs[0]; +}; + #endif /* _ASM_X86_KVM_H */ diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 0918d1e6d2f3..aedba5acb8eb 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -203,6 +203,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .complete_emulated_msr = kvm_complete_insn_gp, .vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector, + + .dev_mem_enc_ioctl = tdx_dev_ioctl, }; struct kvm_x86_init_ops vt_init_ops __initdata = { diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 64229c3b3c5a..5a3ed8217a54 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -431,6 +431,52 @@ int tdx_vm_init(struct kvm *kvm) return ret; } +int tdx_dev_ioctl(void __user *argp) +{ + struct kvm_tdx_capabilities __user *user_caps; + struct kvm_tdx_capabilities caps; + struct kvm_tdx_cmd cmd; + + BUILD_BUG_ON(sizeof(struct kvm_tdx_cpuid_config) != + sizeof(struct tdx_cpuid_config)); + + if (copy_from_user(&cmd, argp, sizeof(cmd))) + 
return -EFAULT; + if (cmd.flags || cmd.error || cmd.unused) + return -EINVAL; + /* + * Currently only KVM_TDX_CAPABILITIES is defined for system-scoped + * mem_enc_ioctl(). + */ + if (cmd.id != KVM_TDX_CAPABILITIES) + return -EINVAL; + + user_caps = (void __user *)cmd.data; + if (copy_from_user(&caps, user_caps, sizeof(caps))) + return -EFAULT; + + if (caps.nr_cpuid_configs < tdx_caps.nr_cpuid_configs) + return -E2BIG; + + caps = (struct kvm_tdx_capabilities) { + .attrs_fixed0 = tdx_caps.attrs_fixed0, + .attrs_fixed1 = tdx_caps.attrs_fixed1, + .xfam_fixed0 = tdx_caps.xfam_fixed0, + .xfam_fixed1 = tdx_caps.xfam_fixed1, + .nr_cpuid_configs = tdx_caps.nr_cpuid_configs, + .padding = 0, + }; + + if (copy_to_user(user_caps, &caps, sizeof(caps))) + return -EFAULT; + if (copy_to_user(user_caps->cpuid_configs, &tdx_caps.cpuid_configs, + tdx_caps.nr_cpuid_configs * + sizeof(struct tdx_cpuid_config))) + return -EFAULT; + + return 0; +} + static int __init tdx_module_setup(void) { const struct tdsysinfo_struct *tdsysinfo; diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index b2cb5786830a..057f2be3d818 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -135,6 +135,7 @@ int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops); bool tdx_is_vm_type_supported(unsigned long type); void tdx_hardware_unsetup(void); int tdx_offline_cpu(void); +int tdx_dev_ioctl(void __user *argp); int tdx_vm_init(struct kvm *kvm); void tdx_mmu_release_hkid(struct kvm *kvm); @@ -144,6 +145,7 @@ static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return 0; } static inline bool tdx_is_vm_type_supported(unsigned long type) { return false; } static inline void tdx_hardware_unsetup(void) {} static inline int tdx_offline_cpu(void) { return 0; } +static inline int tdx_dev_ioctl(void __user *argp) { return -EOPNOTSUPP; }; static inline int tdx_vm_init(struct kvm *kvm) { return -EOPNOTSUPP; } static inline void tdx_mmu_release_hkid(struct kvm *kvm) 
{} diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 25c30c8c2d9b..ddcbbcf13a55 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -4678,6 +4678,12 @@ long kvm_arch_dev_ioctl(struct file *filp, r = kvm_x86_dev_has_attr(&attr); break; } + case KVM_MEMORY_ENCRYPT_OP: + r = -EINVAL; + if (!kvm_x86_ops.dev_mem_enc_ioctl) + goto out; + r = static_call(kvm_x86_dev_mem_enc_ioctl)(argp); + break; default: r = -EINVAL; break; diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h index 54b08789c402..2ad9666e02a5 100644 --- a/tools/arch/x86/include/uapi/asm/kvm.h +++ b/tools/arch/x86/include/uapi/asm/kvm.h @@ -535,4 +535,52 @@ struct kvm_pmu_event_filter { #define KVM_X86_DEFAULT_VM 0 #define KVM_X86_TDX_VM 1 +/* Trust Domain eXtension sub-ioctl() commands. */ +enum kvm_tdx_cmd_id { + KVM_TDX_CAPABILITIES = 0, + + KVM_TDX_CMD_NR_MAX, +}; + +struct kvm_tdx_cmd { + /* enum kvm_tdx_cmd_id */ + __u32 id; + /* flags for sub-command. If sub-command doesn't use this, set zero. */ + __u32 flags; + /* + * data for each sub-command. An immediate or a pointer to the actual + * data in process virtual address. If sub-command doesn't use it, + * set zero. + */ + __u64 data; + /* + * Auxiliary error code. The sub-command may return TDX SEAMCALL + * status code in addition to -Exxx. + * Defined for consistency with struct kvm_sev_cmd. + */ + __u64 error; + /* Reserved: Defined for consistency with struct kvm_sev_cmd. 
*/ + __u64 unused; +}; + +struct kvm_tdx_cpuid_config { + __u32 leaf; + __u32 sub_leaf; + __u32 eax; + __u32 ebx; + __u32 ecx; + __u32 edx; +}; + +struct kvm_tdx_capabilities { + __u64 attrs_fixed0; + __u64 attrs_fixed1; + __u64 xfam_fixed0; + __u64 xfam_fixed1; + + __u32 nr_cpuid_configs; + __u32 padding; + struct kvm_tdx_cpuid_config cpuid_configs[0]; +}; + #endif /* _ASM_X86_KVM_H */

From patchwork Sun Oct 30 06:22:20 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12840
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 019/108] KVM: TDX: Add place holder for TDX VM specific mem_enc_op ioctl
Date: Sat, 29 Oct 2022 23:22:20 -0700

From: Isaku Yamahata

Add a placeholder function for the TDX-specific, VM-scoped ioctl as mem_enc_op.
TDX-specific sub-commands will be added to retrieve and pass TDX-specific parameters.

KVM_MEMORY_ENCRYPT_OP was introduced for VM-scoped operations specific to guest-state-protected VMs. It defines sub-commands for technology-specific operations. Despite its name, these sub-commands are not limited to memory encryption; various technology-specific operations are defined under it. It is therefore natural to repurpose KVM_MEMORY_ENCRYPT_OP for TDX-specific operations and define TDX sub-commands.

TDX requires VM-scoped, TDX-specific operations for the device model (for example, QEMU), such as getting system-wide parameters and TDX-specific VM initialization.

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/main.c | 9 +++++++++ arch/x86/kvm/vmx/tdx.c | 26 ++++++++++++++++++++++++++ arch/x86/kvm/vmx/x86_ops.h | 4 ++++ 3 files changed, 39 insertions(+) diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index aedba5acb8eb..b4e4c6c677f6 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -63,6 +63,14 @@ static void vt_vm_free(struct kvm *kvm) return tdx_vm_free(kvm); } +static int vt_mem_enc_ioctl(struct kvm *kvm, void __user *argp) +{ + if (!is_td(kvm)) + return -ENOTTY; + + return tdx_vm_ioctl(kvm, argp); +} + struct kvm_x86_ops vt_x86_ops __initdata = { .name = "kvm_intel", @@ -205,6 +213,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector, .dev_mem_enc_ioctl = tdx_dev_ioctl, + .mem_enc_ioctl = vt_mem_enc_ioctl, }; struct kvm_x86_init_ops vt_init_ops __initdata = { diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 5a3ed8217a54..d77709a6da51 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -477,6 +477,32 @@ int tdx_dev_ioctl(void __user *argp) return 0; } +int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) +{ + struct kvm_tdx_cmd tdx_cmd; + int r; + + if (copy_from_user(&tdx_cmd, argp, sizeof(struct kvm_tdx_cmd))) + return -EFAULT; + if (tdx_cmd.error || 
tdx_cmd.unused) + return -EINVAL; + + mutex_lock(&kvm->lock); + + switch (tdx_cmd.id) { + default: + r = -EINVAL; + goto out; + } + + if (copy_to_user(argp, &tdx_cmd, sizeof(struct kvm_tdx_cmd))) + r = -EFAULT; + +out: + mutex_unlock(&kvm->lock); + return r; +} + static int __init tdx_module_setup(void) { const struct tdsysinfo_struct *tdsysinfo; diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index 057f2be3d818..93ffe2deb8e8 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -140,6 +140,8 @@ int tdx_dev_ioctl(void __user *argp); int tdx_vm_init(struct kvm *kvm); void tdx_mmu_release_hkid(struct kvm *kvm); void tdx_vm_free(struct kvm *kvm); + +int tdx_vm_ioctl(struct kvm *kvm, void __user *argp); #else static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return 0; } static inline bool tdx_is_vm_type_supported(unsigned long type) { return false; } @@ -151,6 +153,8 @@ static inline int tdx_vm_init(struct kvm *kvm) { return -EOPNOTSUPP; } static inline void tdx_mmu_release_hkid(struct kvm *kvm) {} static inline void tdx_flush_shadow_all_private(struct kvm *kvm) {} static inline void tdx_vm_free(struct kvm *kvm) {} + +static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; } #endif #endif /* __KVM_X86_VMX_X86_OPS_H */

From patchwork Sun Oct 30 06:22:21 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12845
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 020/108] KVM: Support KVM_CAP_MAX_VCPUS for KVM_ENABLE_CAP
Date: Sat, 29 Oct 2022 23:22:21 -0700
Message-Id: <07daa8dbe1e3d6fec8db47a3ff3422a9bc460548.1667110240.git.isaku.yamahata@intel.com>

From: Isaku Yamahata

TDX attestation includes the maximum number of vCPUs that the guest can accommodate. The maximum therefore needs to be configurable per VM instead of being the constant KVM_MAX_VCPUS. Make KVM_ENABLE_CAP support KVM_CAP_MAX_VCPUS.
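To make the intended semantics easier to follow, here is a small userspace model of the checks this patch adds to kvm_vm_ioctl_enable_cap_generic(). It is only a sketch: struct vm_model and model_set_max_vcpus() are hypothetical names, the arch_limit parameter stands in for kvm_vm_ioctl_check_extension(kvm, KVM_CAP_MAX_VCPUS), and the kvm->lock locking is left out because the model is single-threaded.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of the per-VM state the new case statement touches. */
struct vm_model {
	uint64_t max_vcpus;	/* current per-VM limit */
	uint64_t created_vcpus;	/* vCPUs that already exist */
};

/*
 * Mirrors the order of checks in the KVM_CAP_MAX_VCPUS case:
 *  - flags must be zero and the requested value non-zero (-EINVAL),
 *  - the request may not exceed the architecture limit (-E2BIG),
 *  - the limit can only be decreased, never raised (-E2BIG),
 *  - it cannot change once any vCPU has been created (-EBUSY).
 */
static int model_set_max_vcpus(struct vm_model *vm, uint64_t flags,
			       uint64_t requested, uint64_t arch_limit)
{
	if (flags || requested == 0)
		return -EINVAL;
	if (requested > arch_limit)
		return -E2BIG;

	/* Only decreasing is allowed. */
	if (requested > vm->max_vcpus)
		return -E2BIG;
	if (vm->created_vcpus)
		return -EBUSY;

	vm->max_vcpus = requested;
	return 0;
}
```

In this model, a VM that starts with a limit of 8 can be lowered to 4 before any vCPU exists, but a later request for 6 fails with -E2BIG because the limit can only shrink.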
Suggested-by: Sagi Shahar
Signed-off-by: Isaku Yamahata
---
 virt/kvm/kvm_main.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index b9c270f97c88..3b05a3396f89 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4967,6 +4967,27 @@ static int kvm_vm_ioctl_enable_cap_generic(struct kvm *kvm, case KVM_CAP_DIRTY_LOG_RING: case KVM_CAP_DIRTY_LOG_RING_ACQ_REL: return kvm_vm_ioctl_enable_dirty_log_ring(kvm, cap->args[0]); + case KVM_CAP_MAX_VCPUS: { + int r; + + if (cap->flags || cap->args[0] == 0) + return -EINVAL; + if (cap->args[0] > kvm_vm_ioctl_check_extension(kvm, KVM_CAP_MAX_VCPUS)) + return -E2BIG; + + mutex_lock(&kvm->lock); + /* Only decreasing is allowed. */ + if (cap->args[0] > kvm->max_vcpus) + r = -E2BIG; + else if (kvm->created_vcpus) + r = -EBUSY; + else { + kvm->max_vcpus = cap->args[0]; + r = 0; + } + mutex_unlock(&kvm->lock); + return r; + } default: return kvm_vm_ioctl_enable_cap(kvm, cap); }

From patchwork Sun Oct 30 06:22:22 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12858
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Xiaoyao Li
Subject: [PATCH v10 021/108] KVM: TDX: initialize VM with TDX specific parameters
Date: Sat, 29 Oct 2022 23:22:22 -0700

From: Xiaoyao Li

TDX requires additional parameters for a TDX VM so that confidential execution can protect the confidentiality of the guest's memory contents and CPU state from any other software, including the VMM. When creating a guest TD VM, before creating vCPUs, the following must be supplied: the number of vCPUs, the TSC frequency (which is the same for all vCPUs and cannot be changed), the CPUIDs that are emulated by the TDX module (so the guest can trust them), and the sha384 values used for measurement.

Add a new subcommand, KVM_TDX_INIT_VM, to pass these parameters for the TDX guest. It assigns an encryption key to the TDX guest for memory encryption; TDX encrypts memory on a per-guest basis.
The device model passes per-VM parameters for the TDX guest: the maximum number of vCPUs, the TSC frequency (a TDX guest has a fixed, VM-wide TSC frequency, not a per-vCPU one, and the guest cannot change it), attributes (production or debug), available extended features (reflected in the guest's XCR0 and IA32_XSS MSR), CPUIDs, sha384 measurements, etc.

This subcommand is called before vCPU creation and KVM_SET_CPUID2, i.e. CPUID configurations aren't available yet. CPUID configuration values therefore need to be passed in struct kvm_tdx_init_vm. It is the device model's responsibility to keep the CPUID configuration consistent between KVM_TDX_INIT_VM and KVM_SET_CPUID2.

Signed-off-by: Xiaoyao Li
Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/tdx.h | 3 + arch/x86/include/uapi/asm/kvm.h | 31 +++ arch/x86/kvm/vmx/tdx.c | 296 ++++++++++++++++++++++---- arch/x86/kvm/vmx/tdx.h | 22 ++ tools/arch/x86/include/uapi/asm/kvm.h | 33 +++ 5 files changed, 347 insertions(+), 38 deletions(-) diff --git a/arch/x86/include/asm/tdx.h b/arch/x86/include/asm/tdx.h index cd304d323d33..05ac4bfc8f8a 100644 --- a/arch/x86/include/asm/tdx.h +++ b/arch/x86/include/asm/tdx.h @@ -131,6 +131,9 @@ static inline long tdx_kvm_hypercall(unsigned int nr, unsigned long p1, #endif /* CONFIG_INTEL_TDX_GUEST && CONFIG_KVM_GUEST */ #ifdef CONFIG_INTEL_TDX_HOST + +/* -1 indicates CPUID leaf with no sub-leaves. */ +#define TDX_CPUID_NO_SUBLEAF ((u32)-1) struct tdx_cpuid_config { u32 leaf; u32 sub_leaf; diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 2ad9666e02a5..26661879c031 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -538,6 +538,7 @@ struct kvm_pmu_event_filter { /* Trust Domain eXtension sub-ioctl() commands. 
*/ enum kvm_tdx_cmd_id { KVM_TDX_CAPABILITIES = 0, + KVM_TDX_INIT_VM, KVM_TDX_CMD_NR_MAX, }; @@ -583,4 +584,34 @@ struct kvm_tdx_capabilities { struct kvm_tdx_cpuid_config cpuid_configs[0]; }; +struct kvm_tdx_init_vm { + __u64 attributes; + __u64 mrconfigid[6]; /* sha384 digest */ + __u64 mrowner[6]; /* sha384 digest */ + __u64 mrownerconfig[6]; /* sha348 digest */ + union { + /* + * KVM_TDX_INIT_VM is called before vcpu creation, thus before + * KVM_SET_CPUID2. CPUID configurations needs to be passed. + * + * This configuration supersedes KVM_SET_CPUID{,2}. + * The user space VMM, e.g. qemu, should make them consistent + * with this values. + * sizeof(struct kvm_cpuid_entry2) * KVM_MAX_CPUID_ENTRIES(256) + * = 8KB. + */ + struct { + struct kvm_cpuid2 cpuid; + /* 8KB with KVM_MAX_CPUID_ENTRIES. */ + struct kvm_cpuid_entry2 entries[]; + }; + /* + * For future extensibility. + * The size(struct kvm_tdx_init_vm) = 16KB. + * This should be enough given sizeof(TD_PARAMS) = 1024 + */ + __u64 reserved[2029]; + }; +}; + #endif /* _ASM_X86_KVM_H */ diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index d77709a6da51..54045e0576e7 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -284,6 +284,205 @@ static int tdx_do_tdh_mng_key_config(void *param) int tdx_vm_init(struct kvm *kvm) { struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + + kvm_tdx->hkid = -1; + + /* + * This function initializes only KVM software construct. It doesn't + * initialize TDX stuff, e.g. TDCS, TDR, TDCX, HKID etc. + * It is handled by KVM_TDX_INIT_VM, __tdx_td_init(). 
+ */ + + return 0; +} + +int tdx_dev_ioctl(void __user *argp) +{ + struct kvm_tdx_capabilities __user *user_caps; + struct kvm_tdx_capabilities caps; + struct kvm_tdx_cmd cmd; + + BUILD_BUG_ON(sizeof(struct kvm_tdx_cpuid_config) != + sizeof(struct tdx_cpuid_config)); + + if (copy_from_user(&cmd, argp, sizeof(cmd))) + return -EFAULT; + if (cmd.flags || cmd.error || cmd.unused) + return -EINVAL; + /* + * Currently only KVM_TDX_CAPABILITIES is defined for system-scoped + * mem_enc_ioctl(). + */ + if (cmd.id != KVM_TDX_CAPABILITIES) + return -EINVAL; + + user_caps = (void __user *)cmd.data; + if (copy_from_user(&caps, user_caps, sizeof(caps))) + return -EFAULT; + + if (caps.nr_cpuid_configs < tdx_caps.nr_cpuid_configs) + return -E2BIG; + + caps = (struct kvm_tdx_capabilities) { + .attrs_fixed0 = tdx_caps.attrs_fixed0, + .attrs_fixed1 = tdx_caps.attrs_fixed1, + .xfam_fixed0 = tdx_caps.xfam_fixed0, + .xfam_fixed1 = tdx_caps.xfam_fixed1, + .nr_cpuid_configs = tdx_caps.nr_cpuid_configs, + .padding = 0, + }; + + if (copy_to_user(user_caps, &caps, sizeof(caps))) + return -EFAULT; + if (copy_to_user(user_caps->cpuid_configs, &tdx_caps.cpuid_configs, + tdx_caps.nr_cpuid_configs * + sizeof(struct tdx_cpuid_config))) + return -EFAULT; + + return 0; +} + +/* + * cpuid entry lookup in TDX cpuid config way. + * The difference is how to specify index(subleaves). + * Specify index to TDX_CPUID_NO_SUBLEAF for CPUID leaf with no-subleaves. + */ +static const struct kvm_cpuid_entry2 *tdx_find_cpuid_entry(const struct kvm_cpuid2 *cpuid, + u32 function, u32 index) +{ + int i; + + /* In TDX CPU CONFIG, TDX_CPUID_NO_SUBLEAF means index = 0. 
*/ + if (index == TDX_CPUID_NO_SUBLEAF) + index = 0; + + for (i = 0; i < cpuid->nent; i++) { + const struct kvm_cpuid_entry2 *e = &cpuid->entries[i]; + + if (e->function == function && + (e->index == index || + !(e->flags & KVM_CPUID_FLAG_SIGNIFCANT_INDEX))) + return e; + } + return NULL; +} + +static int setup_tdparams(struct kvm *kvm, struct td_params *td_params, + struct kvm_tdx_init_vm *init_vm) +{ + const struct kvm_cpuid2 *cpuid = &init_vm->cpuid; + const struct kvm_cpuid_entry2 *entry; + u64 guest_supported_xcr0; + u64 guest_supported_xss; + int max_pa; + int i; + + if (kvm->created_vcpus) + return -EBUSY; + td_params->max_vcpus = kvm->max_vcpus; + td_params->attributes = init_vm->attributes; + if (td_params->attributes & TDX_TD_ATTRIBUTE_PERFMON) { + /* + * TODO: save/restore PMU related registers around TDENTER. + * Once it's done, remove this guard. + */ + pr_warn("TD doesn't support perfmon yet. KVM needs to save/restore " + "host perf registers properly.\n"); + return -EOPNOTSUPP; + } + + for (i = 0; i < tdx_caps.nr_cpuid_configs; i++) { + const struct tdx_cpuid_config *config = &tdx_caps.cpuid_configs[i]; + const struct kvm_cpuid_entry2 *entry = + tdx_find_cpuid_entry(cpuid, config->leaf, config->sub_leaf); + struct tdx_cpuid_value *value = &td_params->cpuid_values[i]; + + if (!entry) + continue; + + value->eax = entry->eax & config->eax; + value->ebx = entry->ebx & config->ebx; + value->ecx = entry->ecx & config->ecx; + value->edx = entry->edx & config->edx; + } + + max_pa = 36; + entry = tdx_find_cpuid_entry(cpuid, 0x80000008, 0); + if (entry) + max_pa = entry->eax & 0xff; + + td_params->eptp_controls = VMX_EPTP_MT_WB; + /* + * No CPU supports 4-level && max_pa > 48. + * "5-level paging and 5-level EPT" section 4.1 4-level EPT + * "4-level EPT is limited to translating 48-bit guest-physical + * addresses." + * cpu_has_vmx_ept_5levels() check is just in case. 
+ */ + if (cpu_has_vmx_ept_5levels() && max_pa > 48) { + td_params->eptp_controls |= VMX_EPTP_PWL_5; + td_params->exec_controls |= TDX_EXEC_CONTROL_MAX_GPAW; + } else { + td_params->eptp_controls |= VMX_EPTP_PWL_4; + } + + /* Setup td_params.xfam */ + entry = tdx_find_cpuid_entry(cpuid, 0xd, 0); + if (entry) + guest_supported_xcr0 = (entry->eax | ((u64)entry->edx << 32)); + else + guest_supported_xcr0 = 0; + guest_supported_xcr0 &= kvm_caps.supported_xcr0; + + entry = tdx_find_cpuid_entry(cpuid, 0xd, 1); + if (entry) + guest_supported_xss = (entry->ecx | ((u64)entry->edx << 32)); + else + guest_supported_xss = 0; + /* PT can be exposed to TD guest regardless of KVM's XSS support */ + guest_supported_xss &= (kvm_caps.supported_xss | XFEATURE_MASK_PT); + + td_params->xfam = guest_supported_xcr0 | guest_supported_xss; + if (td_params->xfam & XFEATURE_MASK_LBR) { + /* + * TODO: once KVM supports LBR(save/restore LBR related + * registers around TDENTER), remove this guard. + */ + pr_warn("TD doesn't support LBR yet. KVM needs to save/restore " + "IA32_LBR_DEPTH properly.\n"); + return -EOPNOTSUPP; + } + + if (td_params->xfam & XFEATURE_MASK_XTILE) { + /* + * TODO: once KVM supports AMX(save/restore AMX related + * registers around TDENTER), remove this guard. + */ + pr_warn("TD doesn't support AMX yet. 
KVM needs to save/restore " + "IA32_XFD, IA32_XFD_ERR properly.\n"); + return -EOPNOTSUPP; + } + + td_params->tsc_frequency = + TDX_TSC_KHZ_TO_25MHZ(kvm->arch.default_tsc_khz); + +#define MEMCPY_SAME_SIZE(dst, src) \ + do { \ + BUILD_BUG_ON(sizeof(dst) != sizeof(src)); \ + memcpy((dst), (src), sizeof(dst)); \ + } while (0) + + MEMCPY_SAME_SIZE(td_params->mrconfigid, init_vm->mrconfigid); + MEMCPY_SAME_SIZE(td_params->mrowner, init_vm->mrowner); + MEMCPY_SAME_SIZE(td_params->mrownerconfig, init_vm->mrownerconfig); + + return 0; +} + +static int __tdx_td_init(struct kvm *kvm, struct td_params *td_params) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + struct tdx_module_output out; cpumask_var_t packages; int ret, i; u64 err; @@ -390,10 +589,13 @@ int tdx_vm_init(struct kvm *kvm) tdx_mark_td_page_added(&kvm_tdx->tdcs[i]); } - /* - * Note, TDH_MNG_INIT cannot be invoked here. TDH_MNG_INIT requires a dedicated - * ioctl() to define the configure CPUID values for the TD. - */ + err = tdh_mng_init(kvm_tdx->tdr.pa, __pa(td_params), &out); + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_MNG_INIT, err, &out); + ret = -EIO; + goto teardown; + } + return 0; /* @@ -431,50 +633,65 @@ int tdx_vm_init(struct kvm *kvm) return ret; } -int tdx_dev_ioctl(void __user *argp) +static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd) { - struct kvm_tdx_capabilities __user *user_caps; - struct kvm_tdx_capabilities caps; - struct kvm_tdx_cmd cmd; + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + struct kvm_tdx_init_vm *init_vm = NULL; + struct td_params *td_params = NULL; + void *entries_end; + int ret; - BUILD_BUG_ON(sizeof(struct kvm_tdx_cpuid_config) != - sizeof(struct tdx_cpuid_config)); + BUILD_BUG_ON(sizeof(*init_vm) != 16 * 1024); + BUILD_BUG_ON((sizeof(*init_vm) - offsetof(typeof(*init_vm), entries)) / + sizeof(init_vm->entries[0]) < KVM_MAX_CPUID_ENTRIES); + BUILD_BUG_ON(sizeof(struct td_params) != 1024); - if (copy_from_user(&cmd, argp, sizeof(cmd))) - return -EFAULT; - if 
(cmd.flags || cmd.error || cmd.unused) + if (is_td_initialized(kvm)) return -EINVAL; - /* - * Currently only KVM_TDX_CAPABILITIES is defined for system-scoped - * mem_enc_ioctl(). - */ - if (cmd.id != KVM_TDX_CAPABILITIES) + + if (cmd->flags) return -EINVAL; - user_caps = (void __user *)cmd.data; - if (copy_from_user(&caps, user_caps, sizeof(caps))) - return -EFAULT; + init_vm = kzalloc(sizeof(*init_vm), GFP_KERNEL); + if (copy_from_user(init_vm, (void __user *)cmd->data, sizeof(*init_vm))) { + ret = -EFAULT; + goto out; + } - if (caps.nr_cpuid_configs < tdx_caps.nr_cpuid_configs) - return -E2BIG; + ret = -EINVAL; + if (init_vm->cpuid.padding) + goto out; + /* init_vm->entries shouldn't overrun. */ + entries_end = init_vm->entries + init_vm->cpuid.nent; + if (entries_end > (void *)(init_vm + 1)) + goto out; + /* Unused part must be zero. */ + if (memchr_inv(entries_end, 0, (void *)(init_vm + 1) - entries_end)) + goto out; - caps = (struct kvm_tdx_capabilities) { - .attrs_fixed0 = tdx_caps.attrs_fixed0, - .attrs_fixed1 = tdx_caps.attrs_fixed1, - .xfam_fixed0 = tdx_caps.xfam_fixed0, - .xfam_fixed1 = tdx_caps.xfam_fixed1, - .nr_cpuid_configs = tdx_caps.nr_cpuid_configs, - .padding = 0, - }; + td_params = kzalloc(sizeof(struct td_params), GFP_KERNEL); + if (!td_params) { + ret = -ENOMEM; + goto out; + } - if (copy_to_user(user_caps, &caps, sizeof(caps))) - return -EFAULT; - if (copy_to_user(user_caps->cpuid_configs, &tdx_caps.cpuid_configs, - tdx_caps.nr_cpuid_configs * - sizeof(struct tdx_cpuid_config))) - return -EFAULT; + ret = setup_tdparams(kvm, td_params, init_vm); + if (ret) + goto out; - return 0; + ret = __tdx_td_init(kvm, td_params); + if (ret) + goto out; + + kvm_tdx->tsc_offset = td_tdcs_exec_read64(kvm_tdx, TD_TDCS_EXEC_TSC_OFFSET); + kvm_tdx->attributes = td_params->attributes; + kvm_tdx->xfam = td_params->xfam; + +out: + /* kfree() accepts NULL. 
*/ + kfree(init_vm); + kfree(td_params); + return ret; } int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) @@ -490,6 +707,9 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) mutex_lock(&kvm->lock); switch (tdx_cmd.id) { + case KVM_TDX_INIT_VM: + r = tdx_td_init(kvm, &tdx_cmd); + break; default: r = -EINVAL; goto out; diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index 938314635b47..ff0ea9cad347 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -18,7 +18,11 @@ struct kvm_tdx { struct tdx_td_page tdr; struct tdx_td_page *tdcs; + u64 attributes; + u64 xfam; int hkid; + + u64 tsc_offset; }; struct vcpu_tdx { @@ -48,6 +52,11 @@ static inline struct vcpu_tdx *to_tdx(struct kvm_vcpu *vcpu) return container_of(vcpu, struct vcpu_tdx, vcpu); } +static inline bool is_td_initialized(struct kvm *kvm) +{ + return to_kvm_tdx(kvm)->hkid > 0; +} + static __always_inline void tdvps_vmcs_check(u32 field, u8 bits) { #define VMCS_ENC_ACCESS_TYPE_MASK 0x1UL @@ -148,6 +157,19 @@ TDX_BUILD_TDVPS_ACCESSORS(64, VMCS, vmcs); TDX_BUILD_TDVPS_ACCESSORS(64, STATE_NON_ARCH, state_non_arch); TDX_BUILD_TDVPS_ACCESSORS(8, MANAGEMENT, management); +static __always_inline u64 td_tdcs_exec_read64(struct kvm_tdx *kvm_tdx, u32 field) +{ + struct tdx_module_output out; + u64 err; + + err = tdh_mng_rd(kvm_tdx->tdr.pa, TDCS_EXEC(field), &out); + if (unlikely(err)) { + pr_err("TDH_MNG_RD[EXEC.0x%x] failed: 0x%llx\n", field, err); + return 0; + } + return out.r8; +} + #else struct kvm_tdx { struct kvm kvm; diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h index 2ad9666e02a5..531a0033e530 100644 --- a/tools/arch/x86/include/uapi/asm/kvm.h +++ b/tools/arch/x86/include/uapi/asm/kvm.h @@ -538,6 +538,7 @@ struct kvm_pmu_event_filter { /* Trust Domain eXtension sub-ioctl() commands. 
*/
 enum kvm_tdx_cmd_id {
 	KVM_TDX_CAPABILITIES = 0,
+	KVM_TDX_INIT_VM,
 
 	KVM_TDX_CMD_NR_MAX,
 };
@@ -583,4 +584,36 @@ struct kvm_tdx_capabilities {
 	struct kvm_tdx_cpuid_config cpuid_configs[0];
 };
 
+struct kvm_tdx_init_vm {
+	__u64 attributes;
+	__u32 max_vcpus;
+	__u32 padding;
+	__u64 mrconfigid[6]; /* sha384 digest */
+	__u64 mrowner[6]; /* sha384 digest */
+	__u64 mrownerconfig[6]; /* sha384 digest */
+	union {
+		/*
+		 * KVM_TDX_INIT_VM is called before vcpu creation, thus before
+		 * KVM_SET_CPUID2. CPUID configurations need to be passed.
+		 *
+		 * This configuration supersedes KVM_SET_CPUID{,2}.
+		 * The user space VMM, e.g. qemu, should make them consistent
+		 * with these values.
+		 * sizeof(struct kvm_cpuid_entry2) * KVM_MAX_CPUID_ENTRIES(256)
+		 * = 8KB.
+		 */
+		struct {
+			struct kvm_cpuid2 cpuid;
+			/* 8KB with KVM_MAX_CPUID_ENTRIES. */
+			struct kvm_cpuid_entry2 entries[];
+		};
+		/*
+		 * For future extensibility.
+		 * The size(struct kvm_tdx_init_vm) = 16KB.
+		 * This should be enough given sizeof(TD_PARAMS) = 1024
+		 */
+		__u64 reserved[2028];
+	};
+};
+
 #endif /* _ASM_X86_KVM_H */

From patchwork Sun Oct 30 06:22:23 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12851
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 022/108] KVM: TDX: Make pmu_intel.c ignore guest TD case
Date: Sat, 29 Oct 2022 23:22:23 -0700
Message-Id: <914c3ef854bad539c0b33d195b5019c9941771e9.1667110240.git.isaku.yamahata@intel.com>

From: Isaku Yamahata

TDX KVM doesn't support the PMU yet (that is future work, to be posted
as a separate patch series), but pmu_intel.c touches VMX-specific
structures during vcpu initialization.  As a workaround, add a dummy
lbr_desc structure to struct vcpu_tdx so that pmu_intel.c can ignore
the TD case.

Signed-off-by: Isaku Yamahata --- arch/x86/kvm/vmx/pmu_intel.c | 46 +++++++++++++++++++++++++++++++++++- arch/x86/kvm/vmx/pmu_intel.h | 28 ++++++++++++++++++++++ arch/x86/kvm/vmx/tdx.h | 7 ++++++ arch/x86/kvm/vmx/vmx.c | 2 +- arch/x86/kvm/vmx/vmx.h | 32 +------------------------ 5 files changed, 82 insertions(+), 33 deletions(-) create mode 100644 arch/x86/kvm/vmx/pmu_intel.h diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c index 25b70a85bef5..41e97c21f0d7 100644 --- a/arch/x86/kvm/vmx/pmu_intel.c +++ b/arch/x86/kvm/vmx/pmu_intel.c @@ -17,6 +17,7 @@ #include "lapic.h" #include "nested.h" #include "pmu.h" +#include "tdx.h" #define MSR_PMC_FULL_WIDTH_BIT (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0) @@ -35,6 +36,26 @@ static struct kvm_event_hw_type_mapping intel_arch_events[] = { /* mapping between fixed pmc index and intel_arch_events array */ static int fixed_pmc_events[] = {1, 0, 7}; +struct lbr_desc *vcpu_to_lbr_desc(struct kvm_vcpu *vcpu) +{ +#ifdef CONFIG_INTEL_TDX_HOST + if (is_td_vcpu(vcpu)) + return &to_tdx(vcpu)->lbr_desc; +#endif + + return &to_vmx(vcpu)->lbr_desc; +} + +struct x86_pmu_lbr *vcpu_to_lbr_records(struct kvm_vcpu *vcpu) +{ +#ifdef CONFIG_INTEL_TDX_HOST + if (is_td_vcpu(vcpu)) + return &to_tdx(vcpu)->lbr_desc.records; +#endif + + return &to_vmx(vcpu)->lbr_desc.records; +} + static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data) { struct kvm_pmc *pmc; @@ -167,6 +188,23 @@ static inline struct kvm_pmc *get_fw_gp_pmc(struct kvm_pmu *pmu, u32 msr) return get_gp_pmc(pmu, msr, MSR_IA32_PMC0); } +bool intel_pmu_lbr_is_compatible(struct kvm_vcpu *vcpu) +{ + if (is_td_vcpu(vcpu)) + return false; + return cpuid_model_is_consistent(vcpu); +} + +bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu) +{ + struct x86_pmu_lbr *lbr = vcpu_to_lbr_records(vcpu); + + if (is_td_vcpu(vcpu)) + return false; + + return lbr->nr && (vcpu_get_perf_capabilities(vcpu) & PMU_CAP_LBR_FMT); +} + static bool intel_pmu_is_valid_lbr_msr(struct kvm_vcpu 
*vcpu, u32 index) { struct x86_pmu_lbr *records = vcpu_to_lbr_records(vcpu); @@ -277,6 +315,9 @@ int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu) PERF_SAMPLE_BRANCH_USER, }; + if (WARN_ON_ONCE(is_td_vcpu(vcpu))) + return 0; + if (unlikely(lbr_desc->event)) { __set_bit(INTEL_PMC_IDX_FIXED_VLBR, pmu->pmc_in_use); return 0; @@ -586,7 +627,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu) INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters); perf_capabilities = vcpu_get_perf_capabilities(vcpu); - if (cpuid_model_is_consistent(vcpu) && + if (intel_pmu_lbr_is_compatible(vcpu) && (perf_capabilities & PMU_CAP_LBR_FMT)) x86_perf_get_lbr(&lbr_desc->records); else @@ -643,6 +684,9 @@ static void intel_pmu_reset(struct kvm_vcpu *vcpu) struct kvm_pmc *pmc = NULL; int i; + if (is_td_vcpu(vcpu)) + return; + for (i = 0; i < INTEL_PMC_MAX_GENERIC; i++) { pmc = &pmu->gp_counters[i]; diff --git a/arch/x86/kvm/vmx/pmu_intel.h b/arch/x86/kvm/vmx/pmu_intel.h new file mode 100644 index 000000000000..66bba47c1269 --- /dev/null +++ b/arch/x86/kvm/vmx/pmu_intel.h @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __KVM_X86_VMX_PMU_INTEL_H +#define __KVM_X86_VMX_PMU_INTEL_H + +struct lbr_desc *vcpu_to_lbr_desc(struct kvm_vcpu *vcpu); +struct x86_pmu_lbr *vcpu_to_lbr_records(struct kvm_vcpu *vcpu); + +bool intel_pmu_lbr_is_compatible(struct kvm_vcpu *vcpu); +bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu); +int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu); + +struct lbr_desc { + /* Basic info about guest LBR records. */ + struct x86_pmu_lbr records; + + /* + * Emulate LBR feature via passthrough LBR registers when the + * per-vcpu guest LBR event is scheduled on the current pcpu. + * + * The records may be inaccurate if the host reclaims the LBR. 
+ */ + struct perf_event *event; + + /* True if LBRs are marked as not intercepted in the MSR bitmap */ + bool msr_passthrough; +}; + +#endif /* __KVM_X86_VMX_PMU_INTEL_H */ diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index ff0ea9cad347..5aea69716278 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -4,6 +4,7 @@ #ifdef CONFIG_INTEL_TDX_HOST +#include "pmu_intel.h" #include "tdx_ops.h" struct tdx_td_page { @@ -30,6 +31,12 @@ struct vcpu_tdx { struct tdx_td_page tdvpr; struct tdx_td_page *tdvpx; + + /* + * Dummy to make pmu_intel not corrupt memory. + * TODO: Support PMU for TDX. Future work. + */ + struct lbr_desc lbr_desc; }; static inline bool is_td(struct kvm *kvm) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 68aef67c5eb7..f890191e8580 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -2341,7 +2341,7 @@ int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) if ((data & PMU_CAP_LBR_FMT) != (vmx_get_perf_capabilities() & PMU_CAP_LBR_FMT)) return 1; - if (!cpuid_model_is_consistent(vcpu)) + if (!intel_pmu_lbr_is_compatible(vcpu)) return 1; } if (data & PERF_CAP_PEBS_FORMAT) { diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index a3da84f4ea45..d49d0ace9fb8 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -11,6 +11,7 @@ #include "capabilities.h" #include "../kvm_cache_regs.h" #include "posted_intr.h" +#include "pmu_intel.h" #include "vmcs.h" #include "vmx_ops.h" #include "../cpuid.h" @@ -105,22 +106,6 @@ static inline bool intel_pmu_has_perf_global_ctrl(struct kvm_pmu *pmu) return pmu->version > 1; } -struct lbr_desc { - /* Basic info about guest LBR records. */ - struct x86_pmu_lbr records; - - /* - * Emulate LBR feature via passthrough LBR registers when the - * per-vcpu guest LBR event is scheduled on the current pcpu. - * - * The records may be inaccurate if the host reclaims the LBR. 
-	 */
-	struct perf_event *event;
-
-	/* True if LBRs are marked as not intercepted in the MSR bitmap */
-	bool msr_passthrough;
-};
-
 /*
  * The nested_vmx structure is part of vcpu_vmx, and holds information we need
  * for correct emulation of VMX (i.e., nested VMX) on this vcpu.
@@ -650,21 +635,6 @@ static inline struct vcpu_vmx *to_vmx(struct kvm_vcpu *vcpu)
 	return container_of(vcpu, struct vcpu_vmx, vcpu);
 }
 
-static inline struct lbr_desc *vcpu_to_lbr_desc(struct kvm_vcpu *vcpu)
-{
-	return &to_vmx(vcpu)->lbr_desc;
-}
-
-static inline struct x86_pmu_lbr *vcpu_to_lbr_records(struct kvm_vcpu *vcpu)
-{
-	return &vcpu_to_lbr_desc(vcpu)->records;
-}
-
-static inline bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu)
-{
-	return !!vcpu_to_lbr_records(vcpu)->nr;
-}
-
 void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu);
 int intel_pmu_create_guest_lbr_event(struct kvm_vcpu *vcpu);
 void vmx_passthrough_lbr_msrs(struct kvm_vcpu *vcpu);

From patchwork Sun Oct 30 06:22:24 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12852
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 023/108] [MARKER] The start of TDX KVM patch series: TD vcpu creation/destruction
Date: Sat, 29 Oct 2022 23:22:24 -0700

From: Isaku Yamahata

This empty commit marks the start of the patch series for TD vcpu
creation/destruction.

Signed-off-by: Isaku Yamahata
---
 Documentation/virt/kvm/intel-tdx-layer-status.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/virt/kvm/intel-tdx-layer-status.rst b/Documentation/virt/kvm/intel-tdx-layer-status.rst
index 5e0deaebf843..3e8efde3e3f3 100644
--- a/Documentation/virt/kvm/intel-tdx-layer-status.rst
+++ b/Documentation/virt/kvm/intel-tdx-layer-status.rst
@@ -9,15 +9,15 @@ Layer status
 What qemu can do
 ----------------
 - TDX VM TYPE is exposed to Qemu.
-- Qemu can try to create VM of TDX VM type and then fails. +- Qemu can create/destroy guest of TDX vm type. Patch Layer status ------------------ Patch layer Status * TDX, VMX coexistence: Applied * TDX architectural definitions: Applied -* TD VM creation/destruction: Applying -* TD vcpu creation/destruction: Not yet +* TD VM creation/destruction: Applied +* TD vcpu creation/destruction: Applying * TDX EPT violation: Not yet * TD finalization: Not yet * TD vcpu enter/exit: Not yet From patchwork Sun Oct 30 06:22:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12842 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1665653wru; Sat, 29 Oct 2022 23:26:46 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4Fl3GJAJKBMlBNdqpxcz547LMmmmerSK7wmS+mVziskFdcaBw3HZz501H5F/cZfr2UQyZG X-Received: by 2002:a17:906:dac9:b0:780:ab6f:591f with SMTP id xi9-20020a170906dac900b00780ab6f591fmr6883129ejb.77.1667111206315; Sat, 29 Oct 2022 23:26:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111206; cv=none; d=google.com; s=arc-20160816; b=mEsafUJzSC0X/rJJiwir9f1Ndcq6Hx6pO4u7P+SMoHTk9NMgfe535wFpSDkqFfOqDv qabf+Ms3veo8arO/MrFwKbWNVvorFPY+4I36/Bspf4HbeimoU3kTD2pucI091IGRBHnQ whFgIW0zgZir9Z3t0BY/R44mnqWPO86fT7FroQdFTV4yW7gD/jnelZMuUgy+jLr5vaT4 p6t/IrRLTXeTC1IzVckS8XbKWWn4olnnYNng1JQUyE8InT5PWWfLcXbvr9cUFtGZ0rTb Qp4o3Zj3iQzbw6Rfu7vB2ukucg4MaaU6brxYT/EnPRFtXlzLAPmVWKOst6xSv5VCiclP ca/g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=bBb/NHx6dWTf1DpT3ontY2LzuSblJi/NXG5VXxijaFQ=; b=Yd+60nOUkseXcW8DJCtIFJE/Z0WSi1hgK1Z0W29iGCAKtXt/lBNMwoRq6fMemFtoB0 HcZeaF7Xi8A/YduqSJEF9Im/Mu3EzHnfcrRIjX5CZjFq0OUxRDsLG1jT/GMhIKZwBiCW 
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 024/108] KVM: TDX: allocate/free TDX vcpu structure
Date: Sat, 29 Oct 2022 23:22:25 -0700

From: Isaku Yamahata

The next step of TDX guest creation is to create vcpus. Allocate the TDX
vcpu structures and initialize them. Allocate the pages of the TDX vcpu
for the TDX module.

In the conventional (non-TDX) case, cpuid is empty at vcpu initialization
and is configured afterwards. Because TDX supports only X2APIC mode, cpuid
is forcibly initialized to support X2APIC at vcpu initialization.

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/main.c    |  40 +++++++++--
 arch/x86/kvm/vmx/tdx.c     | 138 +++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/x86_ops.h |   8 +++
 3 files changed, 182 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index b4e4c6c677f6..c125b2e3e8b4 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -63,6 +63,38 @@ static void vt_vm_free(struct kvm *kvm)
 		return tdx_vm_free(kvm);
 }
 
+static int vt_vcpu_precreate(struct kvm *kvm)
+{
+	if (is_td(kvm))
+		return 0;
+
+	return vmx_vcpu_precreate(kvm);
+}
+
+static int vt_vcpu_create(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_vcpu_create(vcpu);
+
+	return vmx_vcpu_create(vcpu);
+}
+
+static void vt_vcpu_free(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_vcpu_free(vcpu);
+
+	return vmx_vcpu_free(vcpu);
+}
+
+static void vt_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_vcpu_reset(vcpu, init_event);
+
+	return vmx_vcpu_reset(vcpu, init_event);
+}
+
 static int vt_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 {
 	if (!is_td(kvm))
@@ -89,10 +121,10 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.vm_destroy = vt_vm_destroy,
 	.vm_free = vt_vm_free,
 
-	.vcpu_precreate = vmx_vcpu_precreate,
-	.vcpu_create = vmx_vcpu_create,
-	.vcpu_free = vmx_vcpu_free,
-	.vcpu_reset = vmx_vcpu_reset,
+	.vcpu_precreate = vt_vcpu_precreate,
+	.vcpu_create = vt_vcpu_create,
+	.vcpu_free = vt_vcpu_free,
+	.vcpu_reset = vt_vcpu_reset,
 
 	.prepare_switch_to_guest = vmx_prepare_switch_to_guest,
 	.vcpu_load = vmx_vcpu_load,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 54045e0576e7..0625c354b341 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -49,6 +49,11 @@ static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid)
 	return pa | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits);
 }
 
+static inline bool is_td_vcpu_created(struct vcpu_tdx *tdx)
+{
+	return tdx->tdvpr.added;
+}
+
 static inline bool is_td_created(struct kvm_tdx *kvm_tdx)
 {
 	return kvm_tdx->tdr.added;
@@ -296,6 +301,139 @@ int tdx_vm_init(struct kvm *kvm)
 	return 0;
 }
 
+int tdx_vcpu_create(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+	int ret, i;
+
+	/* TDX only supports x2APIC, which requires an in-kernel local APIC. */
+	if (!vcpu->arch.apic)
+		return -EINVAL;
+
+	fpstate_set_confidential(&vcpu->arch.guest_fpu);
+
+	ret = tdx_alloc_td_page(&tdx->tdvpr);
+	if (ret)
+		return ret;
+
+	tdx->tdvpx = kcalloc(tdx_caps.tdvpx_nr_pages, sizeof(*tdx->tdvpx),
+			     GFP_KERNEL_ACCOUNT);
+	if (!tdx->tdvpx) {
+		ret = -ENOMEM;
+		goto free_tdvpr;
+	}
+	for (i = 0; i < tdx_caps.tdvpx_nr_pages; i++) {
+		ret = tdx_alloc_td_page(&tdx->tdvpx[i]);
+		if (ret)
+			goto free_tdvpx;
+	}
+
+	vcpu->arch.efer = EFER_SCE | EFER_LME | EFER_LMA | EFER_NX;
+
+	vcpu->arch.cr0_guest_owned_bits = -1ul;
+	vcpu->arch.cr4_guest_owned_bits = -1ul;
+
+	vcpu->arch.tsc_offset = to_kvm_tdx(vcpu->kvm)->tsc_offset;
+	vcpu->arch.l1_tsc_offset = vcpu->arch.tsc_offset;
+	vcpu->arch.guest_state_protected =
+		!(to_kvm_tdx(vcpu->kvm)->attributes & TDX_TD_ATTRIBUTE_DEBUG);
+
+	return 0;
+
+free_tdvpx:
+	/* @i points at the TDVPX page that failed allocation. */
+	for (--i; i >= 0; i--)
+		free_page(tdx->tdvpx[i].va);
+	kfree(tdx->tdvpx);
+	tdx->tdvpx = NULL;
+free_tdvpr:
+	free_page(tdx->tdvpr.va);
+
+	return ret;
+}
+
+void tdx_vcpu_free(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+	int i;
+
+	/* Can't reclaim or free pages if teardown failed. */
+	if (is_hkid_assigned(to_kvm_tdx(vcpu->kvm)))
+		return;
+
+	if (tdx->tdvpx) {
+		for (i = 0; i < tdx_caps.tdvpx_nr_pages; i++)
+			tdx_reclaim_td_page(&tdx->tdvpx[i]);
+		kfree(tdx->tdvpx);
+		tdx->tdvpx = NULL;
+	}
+	tdx_reclaim_td_page(&tdx->tdvpr);
+}
+
+void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+	struct msr_data apic_base_msr;
+	u64 err;
+	int i;
+
+	/* TDX doesn't support INIT event. */
+	if (WARN_ON_ONCE(init_event))
+		goto td_bugged;
+	if (WARN_ON_ONCE(is_td_vcpu_created(tdx)))
+		goto td_bugged;
+
+	err = tdh_vp_create(kvm_tdx->tdr.pa, tdx->tdvpr.pa);
+	if (WARN_ON_ONCE(err)) {
+		pr_tdx_error(TDH_VP_CREATE, err, NULL);
+		goto td_bugged;
+	}
+	tdx_mark_td_page_added(&tdx->tdvpr);
+
+	for (i = 0; i < tdx_caps.tdvpx_nr_pages; i++) {
+		err = tdh_vp_addcx(tdx->tdvpr.pa, tdx->tdvpx[i].pa);
+		if (WARN_ON_ONCE(err)) {
+			pr_tdx_error(TDH_VP_ADDCX, err, NULL);
+			goto td_bugged;
+		}
+		tdx_mark_td_page_added(&tdx->tdvpx[i]);
+	}
+
+	if (!vcpu->arch.cpuid_entries) {
+		/*
+		 * On cpu creation, cpuid entry is blank. Forcibly enable
+		 * X2APIC feature to allow X2APIC.
+		 */
+		struct kvm_cpuid_entry2 *e;
+
+		e = kvmalloc_array(1, sizeof(*e), GFP_KERNEL_ACCOUNT);
+		*e = (struct kvm_cpuid_entry2) {
+			.function = 1,	/* Features for X2APIC */
+			.index = 0,
+			.eax = 0,
+			.ebx = 0,
+			.ecx = 1ULL << 21,	/* X2APIC */
+			.edx = 0,
+		};
+		vcpu->arch.cpuid_entries = e;
+		vcpu->arch.cpuid_nent = 1;
+	}
+	apic_base_msr.data = APIC_DEFAULT_PHYS_BASE | LAPIC_MODE_X2APIC;
+	if (kvm_vcpu_is_reset_bsp(vcpu))
+		apic_base_msr.data |= MSR_IA32_APICBASE_BSP;
+	apic_base_msr.host_initiated = true;
+	if (WARN_ON_ONCE(kvm_set_apic_base(vcpu, &apic_base_msr)))
+		goto td_bugged;
+
+	vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
+
+	return;
+
+td_bugged:
+	vcpu->kvm->vm_bugged = true;
+}
+
 int tdx_dev_ioctl(void __user *argp)
 {
 	struct kvm_tdx_capabilities __user *user_caps;
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 93ffe2deb8e8..f6841c3dd12d 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -141,6 +141,10 @@ int tdx_vm_init(struct kvm *kvm);
 void tdx_mmu_release_hkid(struct kvm *kvm);
 void tdx_vm_free(struct kvm *kvm);
 
+int tdx_vcpu_create(struct kvm_vcpu *vcpu);
+void tdx_vcpu_free(struct kvm_vcpu *vcpu);
+void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
+
 int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
 #else
 static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return 0; }
@@ -154,6 +158,10 @@ static inline void tdx_mmu_release_hkid(struct kvm *kvm) {}
 static inline void tdx_flush_shadow_all_private(struct kvm *kvm) {}
 static inline void tdx_vm_free(struct kvm *kvm) {}
 
+static inline int tdx_vcpu_create(struct kvm_vcpu *vcpu) { return -EOPNOTSUPP; }
+static inline void tdx_vcpu_free(struct kvm_vcpu *vcpu) {}
+static inline void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) {}
+
 static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; }
 #endif
 

From patchwork Sun Oct 30 06:22:26 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12849
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Sean Christopherson
Subject: [PATCH v10 025/108] KVM: TDX: Do TDX specific vcpu initialization
Date: Sat, 29 Oct 2022 23:22:26 -0700

From: Sean Christopherson

A TD guest vcpu needs to be configured before it is ready to run, which
requires additional information from the device model (e.g. qemu): one
64-bit value is passed to the vcpu's RCX as an initial value. Repurpose
KVM_MEMORY_ENCRYPT_OP to vcpu scope and add a new sub-command,
KVM_TDX_INIT_VCPU, under it for this additional vcpu configuration.

Add a callback for vCPU-scoped operations of KVM_MEMORY_ENCRYPT_OP and
add a new sub-command, KVM_TDX_INIT_VCPU, for further vcpu initialization.
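
For illustration only (not part of this patch): the argument checks that
tdx_vcpu_ioctl() applies before accepting KVM_TDX_INIT_VCPU can be modeled
in plain userspace C. This is a sketch under assumptions. The field names
(id, flags, data, error, unused) and the enum order come from this series,
but the struct layout and field widths are guessed here, and struct
model_state and model_vcpu_cmd_check() are hypothetical names invented for
this example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Sub-command IDs, in the order the uapi enum in this patch lists them. */
enum kvm_tdx_cmd_id {
	KVM_TDX_CAPABILITIES = 0,
	KVM_TDX_INIT_VM,
	KVM_TDX_INIT_VCPU,
	KVM_TDX_CMD_NR_MAX,
};

/* Assumed layout: field names follow the patch, widths are a guess. */
struct kvm_tdx_cmd {
	uint32_t id;
	uint32_t flags;
	uint64_t data;   /* for KVM_TDX_INIT_VCPU: initial RCX value */
	uint64_t error;
	uint64_t unused;
};

/* Hypothetical stand-in for the VM and vcpu state the ioctl consults. */
struct model_state {
	bool td_initialized;
	bool td_finalized;
	bool vcpu_initialized;
};

/*
 * Hypothetical helper that mirrors the order of checks in
 * tdx_vcpu_ioctl(): returns 0 if the command would be handed on to
 * vcpu initialization, -EINVAL otherwise.
 */
static int model_vcpu_cmd_check(const struct model_state *s,
				const struct kvm_tdx_cmd *cmd)
{
	/* A vcpu can be initialized only once. */
	if (s->vcpu_initialized)
		return -EINVAL;

	/* The TD must be initialized but not yet finalized. */
	if (!s->td_initialized || s->td_finalized)
		return -EINVAL;

	/* Output-only and reserved fields must be zero on input. */
	if (cmd->error || cmd->unused)
		return -EINVAL;

	/* Only KVM_TDX_INIT_VCPU, with no flags, is accepted. */
	if (cmd->flags || cmd->id != KVM_TDX_INIT_VCPU)
		return -EINVAL;

	return 0;
}
```

In the real interface the request is a vcpu-fd ioctl of
KVM_MEMORY_ENCRYPT_OP, with cmd.id set to KVM_TDX_INIT_VCPU and cmd.data
carrying the value that ends up in the vcpu's RCX.
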

Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/kvm-x86-ops.h    |   1 +
 arch/x86/include/asm/kvm_host.h       |   1 +
 arch/x86/include/uapi/asm/kvm.h       |   1 +
 arch/x86/kvm/vmx/main.c               |   9 ++
 arch/x86/kvm/vmx/tdx.c                | 166 ++++++++++++++++++--------
 arch/x86/kvm/vmx/tdx.h                |   4 +
 arch/x86/kvm/vmx/x86_ops.h            |   2 +
 arch/x86/kvm/x86.c                    |   6 +
 tools/arch/x86/include/uapi/asm/kvm.h |   1 +
 9 files changed, 139 insertions(+), 52 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 4425564647cb..f28c9fd72ac4 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -120,6 +120,7 @@ KVM_X86_OP(leave_smm)
 KVM_X86_OP(enable_smi_window)
 KVM_X86_OP_OPTIONAL(dev_mem_enc_ioctl)
 KVM_X86_OP_OPTIONAL(mem_enc_ioctl)
+KVM_X86_OP_OPTIONAL(vcpu_mem_enc_ioctl)
 KVM_X86_OP_OPTIONAL(mem_enc_register_region)
 KVM_X86_OP_OPTIONAL(mem_enc_unregister_region)
 KVM_X86_OP_OPTIONAL(vm_copy_enc_context_from)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 1fced310ec63..829a07d23909 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1628,6 +1628,7 @@ struct kvm_x86_ops {
 
 	int (*dev_mem_enc_ioctl)(void __user *argp);
 	int (*mem_enc_ioctl)(struct kvm *kvm, void __user *argp);
+	int (*vcpu_mem_enc_ioctl)(struct kvm_vcpu *vcpu, void __user *argp);
 	int (*mem_enc_register_region)(struct kvm *kvm, struct kvm_enc_region *argp);
 	int (*mem_enc_unregister_region)(struct kvm *kvm, struct kvm_enc_region *argp);
 	int (*vm_copy_enc_context_from)(struct kvm *kvm, unsigned int source_fd);
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index 26661879c031..80db152430e4 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -539,6 +539,7 @@ struct kvm_pmu_event_filter {
 enum kvm_tdx_cmd_id {
 	KVM_TDX_CAPABILITIES = 0,
 	KVM_TDX_INIT_VM,
+	KVM_TDX_INIT_VCPU,
 
 	KVM_TDX_CMD_NR_MAX,
 };
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index c125b2e3e8b4..0d5ca65e9997 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -103,6 +103,14 @@ static int vt_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 	return tdx_vm_ioctl(kvm, argp);
 }
 
+static int vt_vcpu_mem_enc_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
+{
+	if (!is_td_vcpu(vcpu))
+		return -EINVAL;
+
+	return tdx_vcpu_ioctl(vcpu, argp);
+}
+
 struct kvm_x86_ops vt_x86_ops __initdata = {
 	.name = "kvm_intel",
 
@@ -246,6 +254,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 
 	.dev_mem_enc_ioctl = tdx_dev_ioctl,
 	.mem_enc_ioctl = vt_mem_enc_ioctl,
+	.vcpu_mem_enc_ioctl = vt_vcpu_mem_enc_ioctl,
 };
 
 struct kvm_x86_init_ops vt_init_ops __initdata = {
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 0625c354b341..fd9210cb4f36 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -70,6 +70,11 @@ static inline bool is_hkid_assigned(struct kvm_tdx *kvm_tdx)
 	return kvm_tdx->hkid > 0;
 }
 
+static inline bool is_td_finalized(struct kvm_tdx *kvm_tdx)
+{
+	return kvm_tdx->finalized;
+}
+
 static void tdx_clear_page(unsigned long page)
 {
 	const void *zero_page = (const void *) __va(page_to_phys(ZERO_PAGE(0)));
@@ -303,31 +308,12 @@ int tdx_vm_init(struct kvm *kvm)
 
 int tdx_vcpu_create(struct kvm_vcpu *vcpu)
 {
-	struct vcpu_tdx *tdx = to_tdx(vcpu);
-	int ret, i;
-
 	/* TDX only supports x2APIC, which requires an in-kernel local APIC. */
 	if (!vcpu->arch.apic)
 		return -EINVAL;
 
 	fpstate_set_confidential(&vcpu->arch.guest_fpu);
 
-	ret = tdx_alloc_td_page(&tdx->tdvpr);
-	if (ret)
-		return ret;
-
-	tdx->tdvpx = kcalloc(tdx_caps.tdvpx_nr_pages, sizeof(*tdx->tdvpx),
-			     GFP_KERNEL_ACCOUNT);
-	if (!tdx->tdvpx) {
-		ret = -ENOMEM;
-		goto free_tdvpr;
-	}
-	for (i = 0; i < tdx_caps.tdvpx_nr_pages; i++) {
-		ret = tdx_alloc_td_page(&tdx->tdvpx[i]);
-		if (ret)
-			goto free_tdvpx;
-	}
-
 	vcpu->arch.efer = EFER_SCE | EFER_LME | EFER_LMA | EFER_NX;
 
 	vcpu->arch.cr0_guest_owned_bits = -1ul;
@@ -339,17 +325,6 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu)
 		!(to_kvm_tdx(vcpu->kvm)->attributes & TDX_TD_ATTRIBUTE_DEBUG);
 
 	return 0;
-
-free_tdvpx:
-	/* @i points at the TDVPX page that failed allocation. */
-	for (--i; i >= 0; i--)
-		free_page(tdx->tdvpx[i].va);
-	kfree(tdx->tdvpx);
-	tdx->tdvpx = NULL;
-free_tdvpr:
-	free_page(tdx->tdvpr.va);
-
-	return ret;
 }
 
 void tdx_vcpu_free(struct kvm_vcpu *vcpu)
@@ -372,34 +347,14 @@ void tdx_vcpu_free(struct kvm_vcpu *vcpu)
 
 void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 {
-	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
-	struct vcpu_tdx *tdx = to_tdx(vcpu);
 	struct msr_data apic_base_msr;
-	u64 err;
-	int i;
 
 	/* TDX doesn't support INIT event. */
 	if (WARN_ON_ONCE(init_event))
 		goto td_bugged;
-	if (WARN_ON_ONCE(is_td_vcpu_created(tdx)))
+	if (WARN_ON_ONCE(is_td_vcpu_created(to_tdx(vcpu))))
 		goto td_bugged;
 
-	err = tdh_vp_create(kvm_tdx->tdr.pa, tdx->tdvpr.pa);
-	if (WARN_ON_ONCE(err)) {
-		pr_tdx_error(TDH_VP_CREATE, err, NULL);
-		goto td_bugged;
-	}
-	tdx_mark_td_page_added(&tdx->tdvpr);
-
-	for (i = 0; i < tdx_caps.tdvpx_nr_pages; i++) {
-		err = tdh_vp_addcx(tdx->tdvpr.pa, tdx->tdvpx[i].pa);
-		if (WARN_ON_ONCE(err)) {
-			pr_tdx_error(TDH_VP_ADDCX, err, NULL);
-			goto td_bugged;
-		}
-		tdx_mark_td_page_added(&tdx->tdvpx[i]);
-	}
-
 	if (!vcpu->arch.cpuid_entries) {
 		/*
 		 * On cpu creation, cpuid entry is blank. Forcibly enable
@@ -419,6 +374,8 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 		vcpu->arch.cpuid_entries = e;
 		vcpu->arch.cpuid_nent = 1;
 	}
+
+	/* TDX requires X2APIC. */
 	apic_base_msr.data = APIC_DEFAULT_PHYS_BASE | LAPIC_MODE_X2APIC;
 	if (kvm_vcpu_is_reset_bsp(vcpu))
 		apic_base_msr.data |= MSR_IA32_APICBASE_BSP;
@@ -426,7 +383,10 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	if (WARN_ON_ONCE(kvm_set_apic_base(vcpu, &apic_base_msr)))
 		goto td_bugged;
 
-	vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
+	/*
+	 * Don't update mp_state to runnable because more initialization
+	 * is needed by KVM_TDX_INIT_VCPU.
+	 */
 
 	return;
 
@@ -861,6 +821,108 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
 	return r;
 }
 
+static int tdx_td_vcpu_init(struct kvm_vcpu *vcpu, u64 vcpu_rcx)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+	int ret, i;
+	u64 err;
+
+	if (is_td_vcpu_created(tdx))
+		return -EINVAL;
+
+	ret = tdx_alloc_td_page(&tdx->tdvpr);
+	if (ret)
+		return ret;
+
+	tdx->tdvpx = kcalloc(tdx_caps.tdvpx_nr_pages, sizeof(*tdx->tdvpx),
+			     GFP_KERNEL_ACCOUNT);
+	if (!tdx->tdvpx) {
+		ret = -ENOMEM;
+		goto free_tdvpr;
+	}
+	for (i = 0; i < tdx_caps.tdvpx_nr_pages; i++) {
+		ret = tdx_alloc_td_page(&tdx->tdvpx[i]);
+		if (ret)
+			goto free_tdvpx;
+	}
+
+	err = tdh_vp_create(kvm_tdx->tdr.pa, tdx->tdvpr.pa);
+	if (WARN_ON_ONCE(err)) {
+		ret = -EIO;
+		pr_tdx_error(TDH_VP_CREATE, err, NULL);
+		goto td_bugged;
+	}
+	tdx_mark_td_page_added(&tdx->tdvpr);
+
+	for (i = 0; i < tdx_caps.tdvpx_nr_pages; i++) {
+		err = tdh_vp_addcx(tdx->tdvpr.pa, tdx->tdvpx[i].pa);
+		if (WARN_ON_ONCE(err)) {
+			ret = -EIO;
+			pr_tdx_error(TDH_VP_ADDCX, err, NULL);
+			goto td_bugged;
+		}
+		tdx_mark_td_page_added(&tdx->tdvpx[i]);
+	}
+
+	err = tdh_vp_init(tdx->tdvpr.pa, vcpu_rcx);
+	if (WARN_ON_ONCE(err)) {
+		ret = -EIO;
+		pr_tdx_error(TDH_VP_INIT, err, NULL);
+		goto td_bugged;
+	}
+
+	vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
+
+	return 0;
+
+td_bugged:
+	vcpu->kvm->vm_bugged = true;
+	return ret;
+
+free_tdvpx:
+	/* @i points at the TDVPX page that failed allocation. */
+	for (--i; i >= 0; i--)
+		free_page(tdx->tdvpx[i].va);
+	kfree(tdx->tdvpx);
+	tdx->tdvpx = NULL;
+free_tdvpr:
+	free_page(tdx->tdvpr.va);
+
+	return ret;
+}
+
+int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+	struct kvm_tdx_cmd cmd;
+	int ret;
+
+	if (tdx->vcpu_initialized)
+		return -EINVAL;
+
+	if (!is_td_initialized(vcpu->kvm) || is_td_finalized(kvm_tdx))
+		return -EINVAL;
+
+	if (copy_from_user(&cmd, argp, sizeof(cmd)))
+		return -EFAULT;
+
+	if (cmd.error || cmd.unused)
+		return -EINVAL;
+
+	/* Currently only KVM_TDX_INIT_VCPU is defined for vcpu operation. */
+	if (cmd.flags || cmd.id != KVM_TDX_INIT_VCPU)
+		return -EINVAL;
+
+	ret = tdx_td_vcpu_init(vcpu, (u64)cmd.data);
+	if (ret)
+		return ret;
+
+	tdx->vcpu_initialized = true;
+	return 0;
+}
+
 static int __init tdx_module_setup(void)
 {
 	const struct tdsysinfo_struct *tdsysinfo;
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index 5aea69716278..a95f25845f24 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -23,6 +23,8 @@ struct kvm_tdx {
 	u64 xfam;
 	int hkid;
 
+	bool finalized;
+
 	u64 tsc_offset;
 };
 
@@ -32,6 +34,8 @@ struct vcpu_tdx {
 	struct tdx_td_page tdvpr;
 	struct tdx_td_page *tdvpx;
 
+	bool vcpu_initialized;
+
 	/*
 	 * Dummy to make pmu_intel not corrupt memory.
 	 * TODO: Support PMU for TDX. Future work.
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index f6841c3dd12d..fda1b2eaebc6 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -146,6 +146,7 @@ void tdx_vcpu_free(struct kvm_vcpu *vcpu);
 void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
 
 int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
+int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
 #else
 static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return 0; }
 static inline bool tdx_is_vm_type_supported(unsigned long type) { return false; }
@@ -163,6 +164,7 @@ static inline void tdx_vcpu_free(struct kvm_vcpu *vcpu) {}
 static inline void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) {}
 
 static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; }
+static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; }
 #endif
 
 #endif /* __KVM_X86_VMX_X86_OPS_H */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ddcbbcf13a55..a811d643f71c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5963,6 +5963,12 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 	case KVM_SET_DEVICE_ATTR:
 		r = kvm_vcpu_ioctl_device_attr(vcpu, ioctl, argp);
 		break;
+	case KVM_MEMORY_ENCRYPT_OP:
+		r = -ENOTTY;
+		if (!kvm_x86_ops.vcpu_mem_enc_ioctl)
+			goto out;
+		r = kvm_x86_ops.vcpu_mem_enc_ioctl(vcpu, argp);
+		break;
 	default:
 		r = -EINVAL;
 	}
diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h
index 531a0033e530..35e3b4aa2e96 100644
--- a/tools/arch/x86/include/uapi/asm/kvm.h
+++ b/tools/arch/x86/include/uapi/asm/kvm.h
@@ -539,6 +539,7 @@ struct kvm_pmu_event_filter {
 enum kvm_tdx_cmd_id {
 	KVM_TDX_CAPABILITIES = 0,
 	KVM_TDX_INIT_VM,
+	KVM_TDX_INIT_VCPU,
 
 	KVM_TDX_CMD_NR_MAX,
 };

From patchwork Sun Oct 30 06:22:27 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12843
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Chao Peng
Subject: [PATCH v10 026/108] KVM: TDX: Use private memory for TDX
Date: Sat, 29 Oct 2022 23:22:27 -0700
Message-Id: <009f75283b0d7084c9eecd0e712d18818d005170.1667110240.git.isaku.yamahata@intel.com>

From: Chao Peng

Override kvm_arch_has_private_mem() to use fd-based private memory.
Return true when a VM has a type of KVM_X86_TDX_VM.

Signed-off-by: Chao Peng
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/x86.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a811d643f71c..ba4a9ce0ee80 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13862,6 +13862,11 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
 }
 EXPORT_SYMBOL_GPL(kvm_sev_es_string_io);
 
+bool kvm_arch_has_private_mem(struct kvm *kvm)
+{
+	return kvm->arch.vm_type == KVM_X86_TDX_VM;
+}
+
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_entry);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_fast_mmio);

From patchwork Sun Oct 30 06:22:28 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12850
b=tKALNiqHNLVuzSAJ+dETF8QVuOnw8mfFhCqY/0/XtpZnflNHUo4CnMlkao7qfrSgWt 1OWyQEfpbsLRK9uHsvrsdMAZcZ1XU/xJeyXl9NNuhT5UYTkiiGDgJFYwYcEpUsmUhsDh xPtySMtWoy0GCtxBcT65KxGrXTv7H8CcCI4FD4VvQO/QIr/wgxdRmVatimVkQLsSo6R6 fcEVaYOWuV2pX+6yG1orvlXpyTEsFdA3iOWxVl47xvB/85GD6czQdPQXTVPXaaKVT2uj wPyS/OhbezW9TUZX7+Yc2z88mgq9q30fyUCfsjDyo1WCc1mN1plc3uHThOJYlItprajK e53Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Trh4Tkqm; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id dr19-20020a170907721300b0079b8cce1170si3780644ejc.950.2022.10.29.23.27.05; Sat, 29 Oct 2022 23:27:28 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Trh4Tkqm; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230292AbiJ3G0A (ORCPT + 99 others); Sun, 30 Oct 2022 02:26:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46966 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229871AbiJ3GYI (ORCPT ); Sun, 30 Oct 2022 02:24:08 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 78AF2AA; Sat, 29 Oct 2022 23:24:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; 
q=dns/txt; s=Intel; t=1667111047; x=1698647047; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GrRrgKMz4riAK/ebZ/sI81D2KhEbTTaxZ+pXBobqoJ4=; b=Trh4Tkqmo6G+MruY22DAVcTXqOWU3+PEAZHCW7fvrsMawsQFM4mifBce rTrm1JQFW1wAYJpfFHbXozwpfmo3FofOTzwuuSGpFQENFgM8W1TSPzp8u Q2T0LtrWiXova6/LnL6UNMKAi/Gck4Y5aNMKtjQLWJ34DA0QUgu2F5JXu cbvtcBzxLO5T/ivjl7F/3/UCJhPKga0QF7MbHw3jRfKPhCiNw7/ZGJOyN JIIs3JLwhIsFk3QcjuMMsC/TDEnym0xFqOoEh87nxjjG/WXKrnuExIno0 BFN973QkE6c9Ydcck9SWFe44r447FNhIJVrnT9r87CB2vE6rY56SfUyLE w==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037139" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037139" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:01 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392922" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392922" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:01 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 027/108] [MARKER] The start of TDX KVM patch series: KVM MMU GPA shared bits Date: Sat, 29 Oct 2022 23:22:28 -0700 Message-Id: <4a423a23e8b1057b203b4ee2ad3280cc6594f654.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk 
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092844901890159?= X-GMAIL-MSGID: =?utf-8?q?1748092844901890159?= From: Isaku Yamahata This empty commit is to mark the start of patch series of KVM MMU GPA shared bits. Signed-off-by: Isaku Yamahata --- Documentation/virt/kvm/intel-tdx-layer-status.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/Documentation/virt/kvm/intel-tdx-layer-status.rst b/Documentation/virt/kvm/intel-tdx-layer-status.rst index 3e8efde3e3f3..6e3f71ab6b59 100644 --- a/Documentation/virt/kvm/intel-tdx-layer-status.rst +++ b/Documentation/virt/kvm/intel-tdx-layer-status.rst @@ -10,6 +10,7 @@ What qemu can do ---------------- - TDX VM TYPE is exposed to Qemu. - Qemu can create/destroy guest of TDX vm type. +- Qemu can create/destroy vcpu of TDX vm type. Patch Layer status ------------------ @@ -17,13 +18,13 @@ Patch Layer status * TDX, VMX coexistence: Applied * TDX architectural definitions: Applied * TD VM creation/destruction: Applied -* TD vcpu creation/destruction: Applying +* TD vcpu creation/destruction: Applied * TDX EPT violation: Not yet * TD finalization: Not yet * TD vcpu enter/exit: Not yet * TD vcpu interrupts/exit/hypercall: Not yet -* KVM MMU GPA shared bits: Not yet +* KVM MMU GPA shared bits: Applying * KVM TDP refactoring for TDX: Not yet * KVM TDP MMU hooks: Not yet * KVM TDP MMU MapGPA: Not yet From patchwork Sun Oct 30 06:22:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12844 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1665693wru; Sat, 29 Oct 2022 23:26:58 -0700 (PDT) X-Google-Smtp-Source: AMsMyM70R/3fBcH87FvmAuBYUp5FhxsFhO8ToE6S3yFzY55Q+kyBlTcmHuu/+tkjYuDUoZPIOm4a X-Received: by 2002:a17:907:7606:b0:7ac:a344:ebea with SMTP id 
jx6-20020a170907760600b007aca344ebeamr7025160ejc.580.1667111217879; Sat, 29 Oct 2022 23:26:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111217; cv=none; d=google.com; s=arc-20160816; b=LEoBCx3/6N+Lxa5NkWb+x6DOBf+YjuTtqloQx/MwXP2FKlnrdKqW5+UMc0Gw5giGid f/faBxMiE+UysilVqkiTQG0dBUChPV6MxXnigizbCwXvogC6X5HU0h4MdGGh4Jc5iE6P 0a1zeVGSsm0eTl1QvP2RlIfyUfhdJjjvtPW1ZA0YiN23OcnWDJ/O4SU58jxQClH8rv5F aStM3CTERmCiM3eOzmZS9C/szM7+eGk+txvtCiJIZDbhqDYZtP20DWXosmIaSUGcLG39 jyy6uIBj9Y+YZTUga1CSUHQ+P3kWF9RPiQQuNDzcd5nm3z9IXhioY7oR0Rb4hCxAkMqi XLjA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=L+x5v5uSIqpA2ITa3J3tCEZhP2fdF1WW3sEk79SJUdM=; b=nxFfiHb0jeQaqxbDRbPkaKKe2+p/NPhBvT7uWdJ5Z9HJ6AwlXLNTl1Uhh3V5KAqx0w SiGM4YfJ9xITAJ3diBcxWROIkRBHjNI0FThQLRF72bpBqCGWnC/G1gFOa1BcR5k+CxNZ CJbzdColGNRUT6KZP7J6l/4RoXCQTFhctO9Xua52rO6hfnWrSX8Mo+YIkjsd+7LFT0uf RFnWzRp5Be676TK5eRE6f1a3TVeu2p2LOwBYj7JH+O6H/qjsji1ji/F7oDrlWE0STrcW d+tK8l18YcU8JAY3fQJgZ8MgEt2VCuy3Cx6cKxkwJ/ESkLJsMCHBXgn+rjgs73PrYGgF g4Bw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=JuwHeMaC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id q16-20020a50aa90000000b00461b5ba9933si3774078edc.132.2022.10.29.23.26.34; Sat, 29 Oct 2022 23:26:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=JuwHeMaC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230209AbiJ3GZi (ORCPT + 99 others); Sun, 30 Oct 2022 02:25:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46972 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229874AbiJ3GYI (ORCPT ); Sun, 30 Oct 2022 02:24:08 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 78C92CD; Sat, 29 Oct 2022 23:24:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111047; x=1698647047; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=WrHckUqFHfiof4poG5sF1RlRME0x5PTeFyG15y9QQLI=; b=JuwHeMaClDFiQPeO/nYmkLcD9YjHSiGqJkjMrMZD3rv8oFmtjQG2l0SC SBW1gik5ze1EMVndryYU2JgOevlTZ1yyJE50uxrRBOUozTyb3x+aLlWhe 3RCoXOtjj3RiOZ5Vd3QJ166sDpBxrkuZr58PVfT/jbgrE8vsfgxVsZwAq fPH4mt5fS/meqWjlMq7zrPwPMQ/+J2rzW7Ozcsgcr56pe9ArL/hw2pCdU eE7RmV3G8f6sWDKf9qiDgg503OaTIJP79Kc5xn2fSc+OMR211tUpPp8Hq TW+rYWnkyvgtiTHy5+q97fzSlPLIyuTJUEu5XKpYDldCOjrfeeXC9YZzR Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037140" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037140" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) 
by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:01 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392928" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392928" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:01 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 028/108] KVM: x86/mmu: introduce config for PRIVATE KVM MMU Date: Sat, 29 Oct 2022 23:22:29 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092812586100876?= X-GMAIL-MSGID: =?utf-8?q?1748092812586100876?= From: Isaku Yamahata To keep the non-TDX case intact, introduce a new config option for private KVM MMU support. At the moment, this is a synonym for CONFIG_INTEL_TDX_HOST && CONFIG_KVM_INTEL. A dedicated option makes it clear that it applies only to the x86 KVM MMU.
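One way such an option might be consumed from C is sketched below, under stated assumptions: `DEMO_KVM_MMU_PRIVATE` is a hypothetical macro standing in for the Kconfig symbol, and the `demo_*` types are simplified stand-ins rather than kernel definitions. The point is that when the option is off, the accessor collapses to 0 and the private-GPA test is always false, so the non-TDX path is unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the Kconfig symbol; a real build would get it
 * from the generated configuration header instead of a local #define. */
#define DEMO_KVM_MMU_PRIVATE 1

#define DEMO_PAGE_SHIFT 12

typedef uint64_t demo_gfn_t;
typedef uint64_t demo_gpa_t;

/* Reduced stand-in for per-VM state; the mask exists only with the option. */
struct demo_vm {
	int id;
#ifdef DEMO_KVM_MMU_PRIVATE
	demo_gfn_t gfn_shared_mask;
#endif
};

/* Returns the shared-bit mask, or 0 when the option is compiled out. */
static inline demo_gfn_t demo_gfn_shared_mask(const struct demo_vm *vm)
{
#ifdef DEMO_KVM_MMU_PRIVATE
	return vm->gfn_shared_mask;
#else
	(void)vm;
	return 0;
#endif
}

/* A GPA is private only if a mask is configured and the shared bit is clear. */
static inline bool demo_is_private_gpa(const struct demo_vm *vm, demo_gpa_t gpa)
{
	demo_gfn_t mask = demo_gfn_shared_mask(vm);

	return mask && !((gpa >> DEMO_PAGE_SHIFT) & mask);
}

/* Self-check using bit 51 as the shared bit (one of the two values in use). */
static inline void demo_shared_mask_selfcheck(void)
{
	struct demo_vm td = { .id = 0, .gfn_shared_mask = 1ULL << (51 - DEMO_PAGE_SHIFT) };
	struct demo_vm legacy = { .id = 1, .gfn_shared_mask = 0 };

	assert(demo_is_private_gpa(&td, 0x1000));
	assert(!demo_is_private_gpa(&td, (1ULL << 51) | 0x1000));
	assert(!demo_is_private_gpa(&legacy, 0x1000));
}
```

Hiding the `#ifdef` inside a single accessor keeps callers free of conditional compilation, which is presumably why a separate option is worth having.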
Signed-off-by: Isaku Yamahata --- arch/x86/kvm/Kconfig | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig index 73fdfa429b20..6bafdb2ce284 100644 --- a/arch/x86/kvm/Kconfig +++ b/arch/x86/kvm/Kconfig @@ -133,4 +133,8 @@ config KVM_XEN config KVM_EXTERNAL_WRITE_TRACKING bool +config KVM_MMU_PRIVATE + def_bool y + depends on INTEL_TDX_HOST && KVM_INTEL + endif # VIRTUALIZATION From patchwork Sun Oct 30 06:22:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12846 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1665703wru; Sat, 29 Oct 2022 23:27:02 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6LAhsBmmxF8y71NG/2FOQ2TKFUGt88kfNYBBlr2Ln0fw80NDmSmsPbwmSGRAChjpUR40Xv X-Received: by 2002:a05:6402:1941:b0:457:13a:cce9 with SMTP id f1-20020a056402194100b00457013acce9mr7609033edz.265.1667111222398; Sat, 29 Oct 2022 23:27:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111222; cv=none; d=google.com; s=arc-20160816; b=tFveBofXqYUjgslltHB6hxsfGQKi7DGsCbz52PuSfFtbJaHfJlRtNwXPxuDpjFgIRy Uwzm5JqiUFJUoqyXUyGEZqAhkVRss4yL73UbdSpec2rFUdyQgDbQSkrIAJ6c0C81K5lm p6r98OlRsWicas0pEZbw3XV8JqYmP4aMUfZl2wQzYZwGRLi/KZJIrsYNoCm33m/5zgAg tRFYy24VuQiHsIPsORdkva+cxCvA8ppoA+fZqM87EU85mTk13lEszKUU/6RgKM1NG/gv wolbJQ9bch4FJmdg3hYAASTfkbk92lZ7s5VwF/shy8y6uNBJh/+2mmpzdVxcCqnPRELD mO4A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=csIWDw8BEohdmkMHrjK8Mvh3bn6pRvV7Y+JNJqJKhd4=; b=FFcL119cZZId+C1Yf76Jzf7OiMSa3CN047nciJZGWPN/8r81DZYPYDOgSRFyf+BaGd zDkbN0CBocGJ2o1DTqmrGrcj87gOatfbsqjGqROEJ1D1YTwoVUuULUXEwbhrc6EzDQ67 4q8v10cG+iqqGU5f3RHQxeikTrgNVO6Xui9vsDGCvtharw+KWr7ori6SlL6mL//qAhW8 
svcpwwlTKis3/6cbHrbFvWT5hpbGmtYaY6VtWc9dmwDt65LIqrCGWkCLifxMuMQrLlc1 vfbUkYiz4Nu47+9Ryjuje9e2kkxiaOlY6041CXYUWCrCz7VJo7he2bnWJpeTHP16Rwoj Vj6g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=J1AoI+Hc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id qw18-20020a1709066a1200b007a7d22b9e0bsi4419872ejc.133.2022.10.29.23.26.37; Sat, 29 Oct 2022 23:27:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=J1AoI+Hc; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230222AbiJ3GZl (ORCPT + 99 others); Sun, 30 Oct 2022 02:25:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46854 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229886AbiJ3GYJ (ORCPT ); Sun, 30 Oct 2022 02:24:09 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 13CB1107; Sat, 29 Oct 2022 23:24:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111048; x=1698647048; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=E8AWVag83/nQIVNiwPvy8uiPIhu/N4cABF2sF0KoCec=; 
b=J1AoI+HcJ32+6gBJO/c7Ych4KmmUv/Hnmt8ahFOJ5k5EdFgZ02mYPFBK irpxRqG9wYi1SKcF8kdH+Mhf8RRi+kTwi0DfbAN8NnGhjaNe6zHBeykUv T6ACcbXtjFhuZzagSjIXNIJCbGLbNfjwDrkTphxkH3vK7j1pJxq8ilEQj lo9fUa/yMhPU2M2Iy6Wv/FKY/Zs+5cSY8U9E4sNHIQx+7opDPfyUiau2q 660Sp2FZp6oL0fd08kxjQXPBDAp1Z3d20YUlXBYTzez0WFN84NY5W63cm fi4r5aQ8ewjZcXLOoBwErfujAO11Y+Zvhxpl0KDuRzoEctBbrVUB5SfoS A==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037141" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037141" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:01 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392933" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392933" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:01 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Rick Edgecombe Subject: [PATCH v10 029/108] KVM: x86/mmu: Add address conversion functions for TDX shared bit of GPA Date: Sat, 29 Oct 2022 23:22:30 -0700 Message-Id: <6e6eafc711f7a174f760b8933c6b8658971c864b.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092817268279067?= X-GMAIL-MSGID: 
=?utf-8?q?1748092817268279067?= From: Isaku Yamahata TDX repurposes one GPA bit (bit 51 or bit 47, depending on configuration) to indicate whether the GPA is private (if cleared) or shared with the VMM (if set). If the GPA.shared bit is set, the GPA is covered by the existing conventional EPT pointed to by EPTP. If the GPA.shared bit is cleared, the GPA is covered by the TDX module, and the VMM has to issue SEAMCALLs to operate on it. Add a member to remember the GPA shared bit for each guest TD, add address conversion functions between private GPA and shared GPA, and add a test for whether a GPA is private. Because struct kvm_arch (or struct kvm, which includes struct kvm_arch; see kvm_arch_alloc_vm(), which passes __GFP_ZERO) is zeroed when allocated, the new member that remembers the GPA shared bit is guaranteed to be zero with this patch unless it is initialized explicitly. Co-developed-by: Rick Edgecombe Signed-off-by: Rick Edgecombe Signed-off-by: Isaku Yamahata --- arch/x86/include/asm/kvm_host.h | 4 ++++ arch/x86/kvm/mmu.h | 32 ++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/tdx.c | 5 +++++ 3 files changed, 41 insertions(+) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 829a07d23909..3374ec0d6d90 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1372,6 +1372,10 @@ struct kvm_arch { */ #define SPLIT_DESC_CACHE_MIN_NR_OBJECTS (SPTE_ENT_PER_PAGE + 1) struct kvm_mmu_memory_cache split_desc_cache; + +#ifdef CONFIG_KVM_MMU_PRIVATE + gfn_t gfn_shared_mask; +#endif }; struct kvm_vm_stat { diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 6bdaacb6faa0..a45f7a96b821 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -278,4 +278,36 @@ static inline gpa_t kvm_translate_gpa(struct kvm_vcpu *vcpu, return gpa; return translate_nested_gpa(vcpu, gpa, access, exception); } + +static inline gfn_t kvm_gfn_shared_mask(const struct kvm *kvm) +{ +#ifdef CONFIG_KVM_MMU_PRIVATE + return kvm->arch.gfn_shared_mask; +#else + return 0; +#endif +} + +static inline gfn_t
kvm_gfn_shared(const struct kvm *kvm, gfn_t gfn) +{ + return gfn | kvm_gfn_shared_mask(kvm); +} + +static inline gfn_t kvm_gfn_private(const struct kvm *kvm, gfn_t gfn) +{ + return gfn & ~kvm_gfn_shared_mask(kvm); +} + +static inline gpa_t kvm_gpa_private(const struct kvm *kvm, gpa_t gpa) +{ + return gpa & ~gfn_to_gpa(kvm_gfn_shared_mask(kvm)); +} + +static inline bool kvm_is_private_gpa(const struct kvm *kvm, gpa_t gpa) +{ + gfn_t mask = kvm_gfn_shared_mask(kvm); + + return mask && !(gpa_to_gfn(gpa) & mask); +} + #endif diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index fd9210cb4f36..e80f9cf79b2e 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -785,6 +785,11 @@ static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd) kvm_tdx->attributes = td_params->attributes; kvm_tdx->xfam = td_params->xfam; + if (td_params->exec_controls & TDX_EXEC_CONTROL_MAX_GPAW) + kvm->arch.gfn_shared_mask = gpa_to_gfn(BIT_ULL(51)); + else + kvm->arch.gfn_shared_mask = gpa_to_gfn(BIT_ULL(47)); + out: /* kfree() accepts NULL. 
*/ kfree(init_vm); From patchwork Sun Oct 30 06:22:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12871 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666201wru; Sat, 29 Oct 2022 23:29:21 -0700 (PDT) X-Google-Smtp-Source: AMsMyM482b3+/s88qbTcP0HNDxUVRl68pfFGag/lA8YVxvwCGv7fXzHC4ngo2IGCuY7WxTTrVUsD X-Received: by 2002:a17:907:3f94:b0:78d:9d2f:3002 with SMTP id hr20-20020a1709073f9400b0078d9d2f3002mr6946619ejc.40.1667111360935; Sat, 29 Oct 2022 23:29:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111360; cv=none; d=google.com; s=arc-20160816; b=X2+uBKE8y15WiXPvvkNYlaOrBICdOaKI2MgW9eAagS+qxi7WilzXx9NJ7bjNpgDtVw nQTgDeDuXg1Zie2DzZOwYCz04wUVbwZjd0uxr/8Huxrs+lTvolGGc4SoH4Q3oaWj5MgT usJHLd1SEMBiCuY3k3loIemh+I07ooHxn12djr9cX2a44o7mq+IJOrlMIcdcu50AcZNP 4c3qvBp4v8ZOtm6/VvaA8nJbjzyKqP9WgGubMqujbX0Bmkui2pL7mjwk9Ln5Rmul1XjK oHJjs91SgUuwCbpZ5fzfbZ/5xO1qMbuPBYeYMwFY5NjV0tiCVv8lyRkt26uw/4r+R3jc eHmg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=DjjDBayMB9AGOFkPTp/YT3VBja42LApZtE30EksRVkQ=; b=pbLRJ1sy2n3aHx4DTfLpCanrqRDU5G9HNRpukD1cpTp6y71Tlmtfto1fUHc8QFzFro BXRfI+bOx6qQ57Fg3kYCe0CrxXlUqsPFbI3IN//qhuQLm0L3h8g9TG+k+4Gb8SSb//X6 GckW2dOW6ybk5ICP9gyG6Kszb379+Fj3wYhlwet3NNuYmNaXp+y0G39dwiZTw4qOPhyY zKtQ4UiGlTqzkjjNYvAdmiHGvN9v1Kc/NAtoGPUxSbno50gJzs5Tmd89PdvHDF8wYxhc iosKHrkjcpA8xv9dgN3YDmgDXocbPx1MpQmhRhvKGWFcovHpflr9fa9ySvK28dQ43aSk QqAQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=NzMggKJv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE 
sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id d6-20020a50cd46000000b004615e2cb81csi3714005edj.244.2022.10.29.23.28.55; Sat, 29 Oct 2022 23:29:20 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=NzMggKJv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230261AbiJ3GZt (ORCPT + 99 others); Sun, 30 Oct 2022 02:25:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46870 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229889AbiJ3GYJ (ORCPT ); Sun, 30 Oct 2022 02:24:09 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 16C81108; Sat, 29 Oct 2022 23:24:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111048; x=1698647048; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=rWTnsxMrIzKafHc5a68BzRinBeuJZkWqHiP4eLXCUc4=; b=NzMggKJvehCRwvQwsgz311scCzl6bUybkj8PxHPk2L4NMNl9k1LStnlr AUXphrxZU4nkX1I8k6nsTdNUouEbnJJPWqa/p2xgmBoQ0OKPot9VBzqo6 MW2NAfs1C7bH3g2l0Lchy9bU9J4U2yxAaQd2fulEtqbwfBcMcFVsIV5dE O5oitRZt5CGrm+8Rxvah0Vy95EuKRPr9tBQwjN0gyn8UH6Lv6OysUj9fH mezx7gtobasKb8pbLepIy37+/pKhiWNENDpAa8DzrQdPwJyb67yk0fwHc ofPV1wcv1ajFtjxFmnLoJU1UQDFF6dFYbQtm9fkWxjg1IeYX/M//edaH0 w==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037142" X-IronPort-AV: 
E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037142" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:01 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392938" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392938" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:01 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 030/108] [MARKER] The start of TDX KVM patch series: KVM TDP refactoring for TDX Date: Sat, 29 Oct 2022 23:22:31 -0700 Message-Id: <5b3e89364c0579f88d36fb0ee39696aeb0270cde.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092962266094009?= X-GMAIL-MSGID: =?utf-8?q?1748092962266094009?= From: Isaku Yamahata This empty commit is to mark the start of patch series of KVM TDP refactoring for TDX. 
Signed-off-by: Isaku Yamahata --- Documentation/virt/kvm/intel-tdx-layer-status.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/virt/kvm/intel-tdx-layer-status.rst b/Documentation/virt/kvm/intel-tdx-layer-status.rst index 6e3f71ab6b59..df003d2ed89e 100644 --- a/Documentation/virt/kvm/intel-tdx-layer-status.rst +++ b/Documentation/virt/kvm/intel-tdx-layer-status.rst @@ -24,7 +24,7 @@ Patch Layer status * TD vcpu enter/exit: Not yet * TD vcpu interrupts/exit/hypercall: Not yet -* KVM MMU GPA shared bits: Applying -* KVM TDP refactoring for TDX: Not yet +* KVM MMU GPA shared bits: Applied +* KVM TDP refactoring for TDX: Applying * KVM TDP MMU hooks: Not yet * KVM TDP MMU MapGPA: Not yet From patchwork Sun Oct 30 06:22:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12862 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666040wru; Sat, 29 Oct 2022 23:28:40 -0700 (PDT) X-Google-Smtp-Source: AMsMyM40to+N5KHOzzYdIBwZeIrNbXhKLdN++BqrI3jxlcRgwoHmz/+7I0VSu7zmiVStVjj7rBl9 X-Received: by 2002:a05:6402:158d:b0:463:2343:b980 with SMTP id c13-20020a056402158d00b004632343b980mr2825444edv.150.1667111320804; Sat, 29 Oct 2022 23:28:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111320; cv=none; d=google.com; s=arc-20160816; b=JXYY601HImhjBe52+i11Rc94JHbVQ9qK+4/FA+pWR/mIvupoSl8mc/UXknA7FRXwhU IqOGuzpzHQfyVzVGobLkr/DtAaAAC5X68HqswPTZq3bsPrQDmf96mPRrKq8OVXCGwd/w vQd68GzuugTziTBpMWJpQN18ZbR2UoXpZ4O7uYT0pWMy62qROwZWVZOtV9e9wAGZ1Cww a+5qydp00wV6BfYIa2/3tkydPXSaQugn59eldNCLB9qVj4Jxcs69fw8ONv0nErrHBY3v nJTLtSz2aTF32oqkQPOYqzebqC9LtSIX20DkVOjH7jY/QXZ5PXLA4gM8rZfrnCCOylvc 09dA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from 
:dkim-signature; bh=y/X5BgPLBhwcctKktmrq1jp1nSnNI1fKIeRySWeLDJk=; b=k4mLJdFeg8U3OrGMoQ+WW2kUF8U6k7n0nb0Mrdq8SEnlubjj9qeDINXbY615FgIFK+ cdro/pBNLDmj+qXbnFVlWzgFprjUjAZssZhAAng8iT6qkPs9U8TJ1w6gu+BftRv8ZwyA UbZ/yE6s8qeTA1LwPQmgdAUfXzcS9MTYRsKN83GyKaE6HflpWNlUQNmFeYCltYopInQZ jRPsgyxg995/IwcRRrNuEkRnA5qU0JX9yrRjsV/2OfGKe7R3HJJ2oMqGzThrHTuJJx3o w6UFufo+d0r7g0pEVYKM8S4R0ynUtTv10fIFj4FKTDk7jaof4vZx4wTQUquygaqrqJQz mf3A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=kzICDaKj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id b12-20020a056402350c00b0045d5b83114esi4172153edd.112.2022.10.29.23.28.17; Sat, 29 Oct 2022 23:28:40 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=kzICDaKj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230353AbiJ3G0Z (ORCPT + 99 others); Sun, 30 Oct 2022 02:26:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47114 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229902AbiJ3GYJ (ORCPT ); Sun, 30 Oct 2022 02:24:09 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 16FEA109; Sat, 29 Oct 2022 23:24:08 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111048; x=1698647048; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=U9taNEK+Sd3FCL0cWVbcEHkBfRflbcnLImJXgQmB6kY=; b=kzICDaKjl6x20djopy8LQzZUe23Ko2ecYjergso9toBT04KajTYY1U6M 30MO5hDmiNLjONRhhnZrw6iWywEs7xz3hDZRv01M41Cu9+cdP4F81CeGH n04KtK1GLl+VBgVJj/qCsF0m1Ptk/2QqgidejL2WcAXa14LGs32m6QqfO Kynx3fQbfLad2RrOD+ick7qaOMxlPs+rYTAf+ilm2gRFFTm/Gn5OlZEkD qVWwNxN5/hbdY7d8S8PMDlxsLbH919ahbXtF7hvsrAzhFDI7UekRUhIcT MZ811XPJzi8BGHlT8w8U0rAg9gM/sUAaG6vk7gLUjX7RnXsKVl3X3t7b9 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037143" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037143" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:02 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392942" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392942" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:01 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Sean Christopherson Subject: [PATCH v10 031/108] KVM: x86/mmu: Replace hardcoded value 0 for the initial value for SPTE Date: Sat, 29 Oct 2022 23:22:32 -0700 Message-Id: <0de1d5dfbce49b5e9d4f93289296b726180b8dd0.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
	lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092920547229546?=
X-GMAIL-MSGID: =?utf-8?q?1748092920547229546?=

From: Isaku Yamahata

The TDX support will need the "suppress #VE" bit (bit 63) set as the
initial value for SPTE.  To reduce code change size, introduce a new macro
SHADOW_NONPRESENT_VALUE for the initial value for the shadow page table
entry (SPTE) and replace hard-coded value 0 for it.  Initialize shadow page
tables with their value.

The plan is to unconditionally set the "suppress #VE" bit for both AMD and
Intel as: 1) AMD hardware doesn't use this bit; 2) for conventional VMX
guests, KVM never enables the "EPT-violation #VE" in VMCS control and
"suppress #VE" bit is ignored by hardware.

Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/mmu/mmu.c     | 50 +++++++++++++++++++++++++++++++++-----
 arch/x86/kvm/mmu/spte.h    |  2 ++
 arch/x86/kvm/mmu/tdp_mmu.c | 15 ++++++------
 3 files changed, 54 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 10017a9f26ee..e7e11f51f8b4 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -538,9 +538,9 @@ static u64 mmu_spte_clear_track_bits(struct kvm *kvm, u64 *sptep)
 
 	if (!is_shadow_present_pte(old_spte) ||
 	    !spte_has_volatile_bits(old_spte))
-		__update_clear_spte_fast(sptep, 0ull);
+		__update_clear_spte_fast(sptep, SHADOW_NONPRESENT_VALUE);
 	else
-		old_spte = __update_clear_spte_slow(sptep, 0ull);
+		old_spte = __update_clear_spte_slow(sptep, SHADOW_NONPRESENT_VALUE);
 
 	if (!is_shadow_present_pte(old_spte))
 		return old_spte;
@@ -574,7 +574,7 @@ static u64 mmu_spte_clear_track_bits(struct kvm *kvm, u64 *sptep)
  */
 static void mmu_spte_clear_no_track(u64 *sptep)
 {
-	__update_clear_spte_fast(sptep, 0ull);
+	__update_clear_spte_fast(sptep, SHADOW_NONPRESENT_VALUE);
 }
 
 static u64 mmu_spte_get_lockless(u64 *sptep)
@@ -642,6 +642,39 @@ static void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
 	}
 }
 
+#ifdef CONFIG_X86_64
+static inline void kvm_init_shadow_page(void *page)
+{
+	memset64(page, SHADOW_NONPRESENT_VALUE, 4096 / 8);
+}
+
+static int mmu_topup_shadow_page_cache(struct kvm_vcpu *vcpu)
+{
+	struct kvm_mmu_memory_cache *mc = &vcpu->arch.mmu_shadow_page_cache;
+	int start, end, i, r;
+
+	start = kvm_mmu_memory_cache_nr_free_objects(mc);
+	r = kvm_mmu_topup_memory_cache(mc, PT64_ROOT_MAX_LEVEL);
+
+	/*
+	 * Note, topup may have allocated objects even if it failed to allocate
+	 * the minimum number of objects required to make forward progress _at
+	 * this time_.  Initialize newly allocated objects even on failure, as
+	 * userspace can free memory and rerun the vCPU in response to -ENOMEM.
+	 */
+	end = kvm_mmu_memory_cache_nr_free_objects(mc);
+	for (i = start; i < end; i++)
+		kvm_init_shadow_page(mc->objects[i]);
+	return r;
+}
+#else
+static int mmu_topup_shadow_page_cache(struct kvm_vcpu *vcpu)
+{
+	return kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_shadow_page_cache,
+					  PT64_ROOT_MAX_LEVEL);
+}
+#endif /* CONFIG_X86_64 */
+
 static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect)
 {
 	int r;
@@ -651,8 +684,7 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect)
 				       1 + PT64_ROOT_MAX_LEVEL + PTE_PREFETCH_NUM);
 	if (r)
 		return r;
-	r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_shadow_page_cache,
-				       PT64_ROOT_MAX_LEVEL);
+	r = mmu_topup_shadow_page_cache(vcpu);
 	if (r)
 		return r;
 	if (maybe_indirect) {
@@ -5870,7 +5902,13 @@ int kvm_mmu_create(struct kvm_vcpu *vcpu)
 	vcpu->arch.mmu_page_header_cache.kmem_cache = mmu_page_header_cache;
 	vcpu->arch.mmu_page_header_cache.gfp_zero = __GFP_ZERO;
 
-	vcpu->arch.mmu_shadow_page_cache.gfp_zero = __GFP_ZERO;
+	/*
+	 * When X86_64, initial SEPT entries are initialized with
+	 * SHADOW_NONPRESENT_VALUE.  Otherwise zeroed.  See
+	 * mmu_topup_shadow_page_cache().
+	 */
+	if (!IS_ENABLED(CONFIG_X86_64))
+		vcpu->arch.mmu_shadow_page_cache.gfp_zero = __GFP_ZERO;
 
 	vcpu->arch.mmu = &vcpu->arch.root_mmu;
 	vcpu->arch.walk_mmu = &vcpu->arch.root_mmu;
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 7670c13ce251..42ecaa75da15 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -148,6 +148,8 @@ static_assert(MMIO_SPTE_GEN_LOW_BITS == 8 && MMIO_SPTE_GEN_HIGH_BITS == 11);
 
 #define MMIO_SPTE_GEN_MASK		GENMASK_ULL(MMIO_SPTE_GEN_LOW_BITS + MMIO_SPTE_GEN_HIGH_BITS - 1, 0)
 
+#define SHADOW_NONPRESENT_VALUE	0ULL
+
 extern u64 __read_mostly shadow_host_writable_mask;
 extern u64 __read_mostly shadow_mmu_writable_mask;
 extern u64 __read_mostly shadow_nx_mask;
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index eab765442d0b..38bc4c2f0f1f 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -694,7 +694,7 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
 	 * here since the SPTE is going from non-present to non-present.  Use
 	 * the raw write helper to avoid an unnecessary check on volatile bits.
 	 */
-	__kvm_tdp_mmu_write_spte(iter->sptep, 0);
+	__kvm_tdp_mmu_write_spte(iter->sptep, SHADOW_NONPRESENT_VALUE);
 
 	return 0;
 }
@@ -871,8 +871,8 @@ static void __tdp_mmu_zap_root(struct kvm *kvm, struct kvm_mmu_page *root,
 			continue;
 
 		if (!shared)
-			tdp_mmu_set_spte(kvm, &iter, 0);
-		else if (tdp_mmu_set_spte_atomic(kvm, &iter, 0))
+			tdp_mmu_set_spte(kvm, &iter, SHADOW_NONPRESENT_VALUE);
+		else if (tdp_mmu_set_spte_atomic(kvm, &iter, SHADOW_NONPRESENT_VALUE))
 			goto retry;
 	}
 }
@@ -928,8 +928,9 @@ bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
 	if (WARN_ON_ONCE(!is_shadow_present_pte(old_spte)))
 		return false;
 
-	__tdp_mmu_set_spte(kvm, kvm_mmu_page_as_id(sp), sp->ptep, old_spte, 0,
-			   sp->gfn, sp->role.level + 1, true, true);
+	__tdp_mmu_set_spte(kvm, kvm_mmu_page_as_id(sp), sp->ptep, old_spte,
+			   SHADOW_NONPRESENT_VALUE, sp->gfn, sp->role.level + 1,
+			   true, true);
 
 	return true;
 }
@@ -963,7 +964,7 @@ static bool tdp_mmu_zap_leafs(struct kvm *kvm, struct kvm_mmu_page *root,
 		    !is_last_spte(iter.old_spte, iter.level))
 			continue;
 
-		tdp_mmu_set_spte(kvm, &iter, 0);
+		tdp_mmu_set_spte(kvm, &iter, SHADOW_NONPRESENT_VALUE);
 		flush = true;
 	}
 
@@ -1328,7 +1329,7 @@ static bool set_spte_gfn(struct kvm *kvm, struct tdp_iter *iter,
 	 * invariant that the PFN of a present * leaf SPTE can never change.
 	 * See __handle_changed_spte().
*/ - tdp_mmu_set_spte(kvm, iter, 0); + tdp_mmu_set_spte(kvm, iter, SHADOW_NONPRESENT_VALUE); if (!pte_write(range->pte)) { new_spte = kvm_mmu_changed_pte_notifier_make_spte(iter->old_spte, From patchwork Sun Oct 30 06:22:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12847 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1665717wru; Sat, 29 Oct 2022 23:27:08 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4X5WlV7bq4rAccFe1sDWnTO4yPdNBpP1bo0U/Sso7CnYa9BMSgwfTqvaIkvD+EGAMLbyes X-Received: by 2002:aa7:c452:0:b0:463:14dd:2093 with SMTP id n18-20020aa7c452000000b0046314dd2093mr4259314edr.48.1667111228318; Sat, 29 Oct 2022 23:27:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111228; cv=none; d=google.com; s=arc-20160816; b=AVYuR5Fxedio6POhQ1qOuyodVbZH9DGnBYWC4aEXYCJmbiBehYdkFK3kVwL2B9TcN5 kUzR9HMMdGWL2N2lcqm5PuuVtpQvfnQ5cMG+xDCd6JWVn+WRAVuJpwUghYn62yCrz5wK I33rTgcYcJVhsrG89lGWYKTSiQ/Syr+9/scp2QHAk35tghfKSNudYSTcspg+ET7N6zlH u+gzI57at/ZH+3m+97svl50NFX9iwLleZv9P9QHOx+GOXY50OMSeCAVf7p5Au1UE4MLR PVYUmAmcIlH4wM1svfbQzqv61c8xUvMYZ2Jx25sp9T/d484eN6x+ct5Zad+hTnjIN3G0 ROEQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=rY1d4DRX+NYMJ+zyfNpbvpstA0n7iAQrKhKtwby2RU4=; b=zmQUK7MmpmIRR/iMktEGG1ZM44TM6jaxQRbxc49ifICrTTNhpbcWXJMPw/FmScc0ZC 0ILWOSIdNVaKn2OwlnOVPO+iFrTNbQFzNe8l7gBwdOoMNIMzd5SGb6ohdYTlrm0QVRNE FZm0DKotzNxAJDVeA3ArYHLDIvOjIavPK6g/hnRo6VMxvMbjt32rTiyK2T+NGJ8kAQNL +5Kll8n2QIvRWFDGEOw1RB1CUtc0IE0Mb0o0wdWp63fmhOa6Q12c9Xs0clbSrcu1AU6O 3yGq/+8qNpgSWL2fsMxUTPdTa2S/pVPpdB+Qauf1wsF07sa4yE0S0s5/qNYWjmjEPtDv lKxw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=kQ0dttyu; spf=pass (google.com: 
domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id v15-20020a056402348f00b00461c74a07c8si1796374edc.343.2022.10.29.23.26.43; Sat, 29 Oct 2022 23:27:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=kQ0dttyu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230252AbiJ3GZo (ORCPT + 99 others); Sun, 30 Oct 2022 02:25:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46882 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229887AbiJ3GYJ (ORCPT ); Sun, 30 Oct 2022 02:24:09 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 170F610A; Sat, 29 Oct 2022 23:24:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111048; x=1698647048; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=NRy7ZqKX91pWNXtRhcY8rjR1ap8KYHXHj+9Nb/db+28=; b=kQ0dttyudbN8Ld5PhggJCy1rtRbVqqDVsRB2xeUXpqyxIgoZNTOsbbdc bnC0iRsvqA77cCWKyxokXKubLB7P738Cg/yC7spRjx0KblNTNaAMuTN8m EnWujYbK2Epp/U6mXj6n7Ag2dJO7F+u1ym+uUQlqx9LbTgikdN4rdXqfk JlTlQV4qpKePYna2M4LxalAdwe7YWyseec+ENeZdKjEby0BeYxW7oC9BF 
 oJjxI7/Vlp1hTU+fp/HPmy3kalMeUpUsWPiGFrg8lXn1LUwvZ8iEu1409
 IS2AUhV5JJ39X1ua6nVKQ8gzC/KridXPyp5KVILGC0ra5z7oLelaxN4kt
 w==;
X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037144"
X-IronPort-AV: E=Sophos;i="5.95,225,1661842800";
   d="scan'208";a="395037144"
Received: from fmsmga006.fm.intel.com ([10.253.24.20])
  by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:02 -0700
X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392948"
X-IronPort-AV: E=Sophos;i="5.95,225,1661842800";
   d="scan'208";a="878392948"
Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54])
  by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:02 -0700
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com,
	Paolo Bonzini, erdemaktas@google.com, Sean Christopherson,
	Sagi Shahar, David Matlack
Subject: [PATCH v10 032/108] KVM: x86/mmu: Make sync_page not use hard-coded 0 as the initial SPTE value
Date: Sat, 29 Oct 2022 23:22:33 -0700
Message-Id:
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
MIME-Version: 1.0
X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,
	SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
	lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092823149192811?=
X-GMAIL-MSGID: =?utf-8?q?1748092823149192811?=

From: Isaku Yamahata

FNAME(sync_page) in arch/x86/kvm/mmu/paging_tmpl.h assumes that the initial
shadow page table entry (SPTE) is zero.  Remove the assumption by comparing
against SHADOW_NONPRESENT_VALUE, which will be changed from 0 to a non-zero
value.
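
For illustration only, a minimal userspace sketch (not kernel code and not
part of this patch) of why the zero test stops working once the initial
value becomes non-zero.  The SYNC_SKETCH_* macro and both helper names are
hypothetical, and bit 63 is assumed as the eventual initial value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the non-zero initial SPTE value has only bit 63 set. */
#define SYNC_SKETCH_NONPRESENT_VALUE	(1ULL << 63)

/* Old-style test: an entry is skipped only when it is exactly zero. */
static size_t sync_sketch_skip_by_zero_test(const uint64_t *spt, size_t n)
{
	size_t i, skipped = 0;

	for (i = 0; i < n; i++)
		if (!spt[i])
			skipped++;
	return skipped;
}

/* New-style test: an entry is skipped when it still holds the initial value. */
static size_t sync_sketch_skip_by_value_test(const uint64_t *spt, size_t n)
{
	size_t i, skipped = 0;

	for (i = 0; i < n; i++)
		if (spt[i] == SYNC_SKETCH_NONPRESENT_VALUE)
			skipped++;
	return skipped;
}
```

With a four-entry table in which three entries still hold the initial
value, the zero test skips none of them, while the value comparison skips
all three.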
Signed-off-by: Isaku Yamahata --- arch/x86/kvm/mmu/paging_tmpl.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 5ab5f94dcb6f..6db3f2b5563a 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -1036,7 +1036,8 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp) gpa_t pte_gpa; gfn_t gfn; - if (!sp->spt[i]) + /* spt[i] has initial value of shadow page table allocation */ + if (sp->spt[i] == SHADOW_NONPRESENT_VALUE) continue; pte_gpa = first_pte_gpa + i * sizeof(pt_element_t); From patchwork Sun Oct 30 06:22:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12853 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1665900wru; Sat, 29 Oct 2022 23:28:01 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5fyS8K28uKwzhbK4/YS0B8bqf9t1wizSLBfIXBU3ie1qkRh9WeXqeea11DGQo7E63pgv3J X-Received: by 2002:a17:906:cc47:b0:7ad:8560:5937 with SMTP id mm7-20020a170906cc4700b007ad85605937mr6998956ejb.445.1667111281421; Sat, 29 Oct 2022 23:28:01 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111281; cv=none; d=google.com; s=arc-20160816; b=vWPGdU4LwwzdHOY9VddtZkXSUBa4jFmnxOLhdmiv7FqGvBmPA5tQFZkuqaoxBBUWlj To/yMpufunpxfVwwHdiCeiLGDS5qRucEeo0TcDGZHajtZAbLoo1v5fmIInykiLAGkK4f H8P/NpKKagKVYkVcUi9xny3AfKrsginPxUwLsZAj4VT7nG4zC7DlngaTywMJjIuOOW+B JZcs6jPf1NsrAwAtkXdzE/DJZO/kdlcoW4wh0KYzhLcJwMrH5io1N8PTrikp0hfF7jEL nAAJjgsCTmYai8PXGVy4/6XF3l9j8nCZGVpBgiFGkhLpETymGhhi5dTl3rEUg+rI3hwS HDYw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=NjLVjJGFZV/lQziTxyvBcbpBxuIh1BZM0oo2PUnCHUg=; 
b=Whtqq0IYyAo5uc4knVrMt9vxcECAtP1CdUIBL/A10xMn8pB3GDeZ65A6jkdmXMHg3M MyLczCAHPo0Doq8ha4ENevGlBHI53xZJLQ6iIVFku7GFqtIXSkkcj15P/xoeHdv2PdWG cU9rzbx6r5m+X+ypaAEYoEdNtVk6RRXNzKYvICqNE+3xFci1E6HYsUqnuM44Irxxe9Xu gEhyQNODTKCQ577CG2GneMb3p3rNvXJgJP2b6IA4glvWXdoB0+ijBNCCTmyvkQ+5DOR7 wu31WuG1EG5tArYZNZ7dYISQBtmvs8gEXLPIds7lvlgMQxuqxleKaevlDxsyxF9BCZQE x68Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=SsgZWJx4; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id x2-20020a170906148200b007800b181a0csi3338218ejc.300.2022.10.29.23.27.38; Sat, 29 Oct 2022 23:28:01 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=SsgZWJx4; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230371AbiJ3G0g (ORCPT + 99 others); Sun, 30 Oct 2022 02:26:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47118 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229904AbiJ3GYJ (ORCPT ); Sun, 30 Oct 2022 02:24:09 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 179BE10B; Sat, 29 Oct 2022 23:24:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; 
q=dns/txt; s=Intel; t=1667111048; x=1698647048; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=HD0oZM0PQkBmnse5VNjuZto1jI4XWFmzeZByIMhGHcg=; b=SsgZWJx4G70prIzMrpPg3N2P0pFlkvfM/xPgKYt66gEc5FS7epPmaCVN cO9l6H85AvvnWzIfalOcFTs8W4ysZ6c2ChF8PQdwITqZYdnWnNfVHDqyz Bn+zqGna4FeLRKS0UGLuYpVGBseDcBl5fCw+xcLTaBg4pWGEE7+pCXzSm aBBfldfvp4qE/eWoxyrzUfNnoXjbg3VVDHvPo6bbctj1kTnUY/wumIVBN zoVutYIY2G6xSIBOEzAFr9WP2wqPAYmpbZSrXTUgzD9Xfdvrw5ApNd5bx UdVPs5tI1uM/NQotUpWSE7Z++xu457F5NQwM33Wr/LjNVjFJNEfjByv5n g==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037145" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037145" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:02 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392951" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392951" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:02 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Sean Christopherson Subject: [PATCH v10 033/108] KVM: x86/mmu: Allow non-zero value for non-present SPTE and removed SPTE Date: Sat, 29 Oct 2022 23:22:34 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092878978509104?=
X-GMAIL-MSGID: =?utf-8?q?1748092878978509104?=

From: Sean Christopherson

For a TD guest, the current way to emulate MMIO doesn't work any more, as
KVM is not able to access the private memory of the TD guest and do the
emulation.  Instead, the TD guest expects to receive #VE when it accesses
the MMIO and then it can explicitly make a hypercall to KVM to get the
expected information.

To achieve this, the TDX module always enables "EPT-violation #VE" in the
VMCS control.  Accordingly, KVM needs to configure the MMIO SPTE to
trigger an EPT violation (instead of a misconfiguration) and, at the same
time, clear the "suppress #VE" bit so the TD guest gets the #VE instead of
causing an actual EPT violation to KVM.

In order for KVM to have a chance to set up the correct SPTE for MMIO for
the TD guest, the default non-present SPTE must have the "suppress #VE" bit
set, so that KVM gets a real EPT violation the first time the TD guest
accesses the MMIO.

Also, when the TD guest accesses the actual shared memory, it should
continue to trigger an EPT violation to KVM instead of receiving the #VE
(the TDX module guarantees KVM will receive an EPT violation for private
memory access).  This means that for shared memory, the non-present SPTE
must also have the "suppress #VE" bit set.

Add the "suppress #VE" bit (bit 63) to SHADOW_NONPRESENT_VALUE and
REMOVED_SPTE.  Unconditionally set the "suppress #VE" bit for both AMD and
Intel as: 1) AMD hardware doesn't use this bit when the present bit is off;
2) for a normal VMX guest, KVM never enables the "EPT-violation #VE" in
VMCS control and the "suppress #VE" bit is ignored by hardware.
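
For illustration only, a minimal userspace sketch (not kernel code and not
part of this patch) of the bit layout described above.  All VE_SKETCH_*
names and helpers are hypothetical.  Bit 63 and the 0x5a0 pattern are taken
from this patch; bit 11 as the MMU-present bit is an assumption about the
definition of SPTE_MMU_PRESENT_MASK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VE_SKETCH_BIT_ULL(n)		(1ULL << (n))

/* From the patch: "suppress #VE" is bit 63 of an EPT entry. */
#define VE_SKETCH_SUPPRESS_VE_BIT	VE_SKETCH_BIT_ULL(63)
/* EPT read/write/execute permissions occupy bits 2:0. */
#define VE_SKETCH_EPT_RWX_MASK		0x7ULL
/* Assumption: KVM's software "MMU present" marker is bit 11. */
#define VE_SKETCH_MMU_PRESENT_MASK	VE_SKETCH_BIT_ULL(11)

#define VE_SKETCH_NONPRESENT_VALUE	VE_SKETCH_SUPPRESS_VE_BIT
#define VE_SKETCH_REMOVED_SPTE		(VE_SKETCH_NONPRESENT_VALUE | 0x5a0ULL)

/* True if the entry would be treated as shadow-present. */
static bool ve_sketch_is_present(uint64_t spte)
{
	return (spte & VE_SKETCH_MMU_PRESENT_MASK) != 0;
}

/* True if a fault on the entry is reported as an EPT violation, not #VE. */
static bool ve_sketch_suppresses_ve(uint64_t spte)
{
	return (spte & VE_SKETCH_SUPPRESS_VE_BIT) != 0;
}

/* True if any of the hardware R/W/X permission bits are set. */
static bool ve_sketch_has_rwx(uint64_t spte)
{
	return (spte & VE_SKETCH_EPT_RWX_MASK) != 0;
}
```

Under these assumptions both the non-present value and the removed value
keep "suppress #VE" set while remaining non-present and without any R/W/X
permission bits.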

Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/vmx.h |  1 +
 arch/x86/kvm/mmu/spte.c    |  4 +++-
 arch/x86/kvm/mmu/spte.h    | 22 +++++++++++++++++++++-
 arch/x86/kvm/mmu/tdp_mmu.c |  8 ++++++++
 4 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 498dc600bd5c..cdbf12c1a83c 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -511,6 +511,7 @@ enum vmcs_field {
 #define VMX_EPT_IPAT_BIT			(1ull << 6)
 #define VMX_EPT_ACCESS_BIT			(1ull << 8)
 #define VMX_EPT_DIRTY_BIT			(1ull << 9)
+#define VMX_EPT_SUPPRESS_VE_BIT			(1ull << 63)
 #define VMX_EPT_RWX_MASK			(VMX_EPT_READABLE_MASK |	\
 						 VMX_EPT_WRITABLE_MASK |	\
 						 VMX_EPT_EXECUTABLE_MASK)
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 2e08b2a45361..0b97a045c5f0 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -419,7 +419,9 @@ void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only)
 	shadow_dirty_mask	= has_ad_bits ? VMX_EPT_DIRTY_BIT : 0ull;
 	shadow_nx_mask		= 0ull;
 	shadow_x_mask		= VMX_EPT_EXECUTABLE_MASK;
-	shadow_present_mask	= has_exec_only ? 0ull : VMX_EPT_READABLE_MASK;
+	/* VMX_EPT_SUPPRESS_VE_BIT is needed for W or X violation. */
+	shadow_present_mask	=
+		(has_exec_only ? 0ull : VMX_EPT_READABLE_MASK) | VMX_EPT_SUPPRESS_VE_BIT;
 	/*
 	 * EPT overrides the host MTRRs, and so KVM must program the desired
 	 * memtype directly into the SPTEs.  Note, this mask is just the mask
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 42ecaa75da15..7e0f79e8f45b 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -148,7 +148,22 @@ static_assert(MMIO_SPTE_GEN_LOW_BITS == 8 && MMIO_SPTE_GEN_HIGH_BITS == 11);
 
 #define MMIO_SPTE_GEN_MASK		GENMASK_ULL(MMIO_SPTE_GEN_LOW_BITS + MMIO_SPTE_GEN_HIGH_BITS - 1, 0)
 
+/*
+ * non-present SPTE value for both VMX and SVM for TDP MMU.
+ * For SVM NPT, for non-present spte (bit 0 = 0), other bits are ignored.
+ * For VMX EPT, bit 63 is ignored if #VE is disabled. (EPT_VIOLATION_VE=0)
+ *              bit 63 is #VE suppress if #VE is enabled. (EPT_VIOLATION_VE=1)
+ * For TDX:
+ *   Secure-EPT: TDX module sets EPT_VIOLATION_VE for Secure-EPT
+ *   private EPT: "suppress #VE" bit is ignored. CPU doesn't walk it.
+ *   conventional EPT: "suppress #VE" bit must be set to get EPT violation
+ */
+#ifdef CONFIG_X86_64
+#define SHADOW_NONPRESENT_VALUE	BIT_ULL(63)
+static_assert(!(SHADOW_NONPRESENT_VALUE & SPTE_MMU_PRESENT_MASK));
+#else
 #define SHADOW_NONPRESENT_VALUE	0ULL
+#endif
 
 extern u64 __read_mostly shadow_host_writable_mask;
 extern u64 __read_mostly shadow_mmu_writable_mask;
@@ -189,13 +204,18 @@ extern u64 __read_mostly shadow_nonpresent_or_rsvd_mask;
  * non-present intermediate value. Other threads which encounter this value
  * should not modify the SPTE.
  *
+ * For X86_64 case, SHADOW_NONPRESENT_VALUE, "suppress #VE" bit, is set because
+ * "EPT violation #VE" in the secondary VM execution control may be enabled.
+ * Because TDX module sets "EPT violation #VE" for TD, "suppress #VE" bit for
+ * the conventional EPT needs to be set.
+ *
  * Use a semi-arbitrary value that doesn't set RWX bits, i.e. is not-present on
  * bot AMD and Intel CPUs, and doesn't set PFN bits, i.e. doesn't create a L1TF
  * vulnerability.  Use only low bits to avoid 64-bit immediates.
  *
  * Only used by the TDP MMU.
  */
-#define REMOVED_SPTE	0x5a0ULL
+#define REMOVED_SPTE	(SHADOW_NONPRESENT_VALUE | 0x5a0ULL)
 
 /* Removed SPTEs must not be misconstrued as shadow present PTEs. */
 static_assert(!(REMOVED_SPTE & SPTE_MMU_PRESENT_MASK));
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 38bc4c2f0f1f..1eee9c159958 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -693,6 +693,14 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
 	 * overwrite the special removed SPTE value. No bookkeeping is needed
 	 * here since the SPTE is going from non-present to non-present.
Use * the raw write helper to avoid an unnecessary check on volatile bits. + * + * Set non-present value to SHADOW_NONPRESENT_VALUE, rather than 0. + * It is because when TDX is enabled, TDX module always + * enables "EPT-violation #VE", so KVM needs to set + * "suppress #VE" bit in EPT table entries, in order to get + * real EPT violation, rather than TDVMCALL. KVM sets + * SHADOW_NONPRESENT_VALUE (which sets "suppress #VE" bit) so it + * can be set when EPT table entries are zapped. */ __kvm_tdp_mmu_write_spte(iter->sptep, SHADOW_NONPRESENT_VALUE); From patchwork Sun Oct 30 06:22:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12876 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666293wru; Sat, 29 Oct 2022 23:29:41 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7KmSqCH/r2+K5UbCR1AZplCPnz5gY/zs8umEdZR0F4OdeJtFodm6Zf//D7I9MZ6OehPJUS X-Received: by 2002:a17:902:8c81:b0:178:1701:cd with SMTP id t1-20020a1709028c8100b00178170100cdmr8106366plo.138.1667111381674; Sat, 29 Oct 2022 23:29:41 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111381; cv=none; d=google.com; s=arc-20160816; b=WW2X4fT2t3ODI53u5X0ubx/gUBjfQKbWgSPrkW/PsWLBCxOc4bgjw1BQsxIB70n4Q1 dFkZGnAkxc6ay8XfyxU6ab37ZNJlJGXRFCNB6Mc5Ykx7fAoE5hyN56aKOkLdwx0y3f8Q 5d1dqFA9RwKiJb3CHq8LWw1J3URZxDNrTp0VubAdoeuYEPr8iR7NJIvkRjPqkyQPhQyU kr64cYr4jb3VPtgshEZBCmnQVTE2j2rSkKAIB+D0S2pzFUd9imXNd1IzhdsKoTesLY5m ec/8h9tKqTQsvjnqliNwgs+E8/hg7hHqjXVUcI08tfKXXq3TZ+ux+D8otAUS+D36l2Jm r0Gw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=eXLWTGp96py4Py2p2JWvtFjNI5py5oSs4PyrPwVC7iQ=; b=Wps0vrjx2XpRNXn7LWQigkn85HtJH29zm1rYFu6bxCjA8GoF77ixuwU/wCMmdfjgbr 
3tJ6a6avFIVOv1dWKv9YuM5Va33PueC2pm3zJPuJQSe//FJg/zHPFuiRbeWJ5WjjDJpX Y/nAxzVLd7YzKalBFNYWwa0EgxPbz2sIk+pOYsYct8AOQrJUf4y7Fx8U8RpkKmR1XkaA WItOgg9Tyn2VzC1WBXQO3p1mX12db625YktDhwG2cIU6yiAHSX8sjqacukaMDd0KqVb8 Soy5wulwZl4YH1MMUNnjnxKfGsOCIRvesdTByWKL1wr3/nhEh+C/KDDoWBDu6AF8b8Dr 06Jw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=CtcR55sk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id s16-20020a170902ea1000b0018686834315si5097202plg.431.2022.10.29.23.29.29; Sat, 29 Oct 2022 23:29:41 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=CtcR55sk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230343AbiJ3G0V (ORCPT + 99 others); Sun, 30 Oct 2022 02:26:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46900 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229890AbiJ3GYJ (ORCPT ); Sun, 30 Oct 2022 02:24:09 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 17B3310C; Sat, 29 Oct 2022 23:24:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111048; x=1698647048; 
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=KuHqjRVUBksdUjeiSc9eKEgyTsBPP0Eliqw02r0hyTk=; b=CtcR55sk/E6tQ2LBiliuRJG6tM63UeX0woNBCd7cOt44JOtXR16TzxlI Cz6Hj3dxBlmzzXytaxgl+EL1HH6jzTC4QhmLt8rVosrdjo7jlJxSlF6m3 8kX5rJrvXVEjxPlicpYL722R78WLQrlNaY0Pbh8BTFzoVARaHK86WR1D3 SFn9jqau8v2a3bBvlP5CvqniHl3m5Rsg0GPv/+uzlbHyB3dZtTjzpHmPC dWUW68wVmK+md2lTyQLWnYDqHUPiiaqrfwWE6T+yVxscb1emJWtMbcImF QqrhYZ1SiboJvsuBYO2HxG0eO18jD181eFYFE/9I/l6nNdIjJ2iuXHEWL w==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037146" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037146" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:02 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392954" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392954" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:02 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 034/108] KVM: x86/mmu: Add Suppress VE bit to shadow_mmio_{value, mask} Date: Sat, 29 Oct 2022 23:22:35 -0700 Message-Id: <1c480a48c2697054b1cfe068fa073f4035648f9a.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092984169717764?=
X-GMAIL-MSGID: =?utf-8?q?1748092984169717764?=

From: Isaku Yamahata

TDX will need shadow_mmio_mask to be VMX_SUPPRESS_VE | RWX and
shadow_mmio_value to be 0.  Make the VMX EPT case use the same
shadow_mmio_mask as TDX.  For VMX, VMX_SUPPRESS_VE doesn't matter, so
adding the bit to shadow_mmio_{value, mask} doesn't affect the VMX logic.
Note that shadow_mmio_value will be a per-VM value.

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/mmu/spte.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 0b97a045c5f0..5d5c06d4fd89 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -437,8 +437,8 @@ void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only)
 	 * EPT Misconfigurations are generated if the value of bits 2:0
 	 * of an EPT paging-structure entry is 110b (write/execute).
 	 */
-	kvm_mmu_set_mmio_spte_mask(VMX_EPT_MISCONFIG_WX_VALUE,
-				   VMX_EPT_RWX_MASK, 0);
+	kvm_mmu_set_mmio_spte_mask(VMX_EPT_MISCONFIG_WX_VALUE | VMX_EPT_SUPPRESS_VE_BIT,
+				   VMX_EPT_RWX_MASK | VMX_EPT_SUPPRESS_VE_BIT, 0);
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_set_ept_masks);
 

From patchwork Sun Oct 30 06:22:36 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12855
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1665942wru;
        Sat, 29 Oct 2022 23:28:12 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM7orpWYZ1TVOBHeE35puA+LxW3cMz7LYuAYuipxCSWM4gYRkU6zlr1PCTOrpR1cipf5wmor
X-Received: by 2002:a05:6402:2937:b0:461:32aa:32da with SMTP id ee55-20020a056402293700b0046132aa32damr7296098edb.78.1667111292031;
        Sat, 29 Oct 2022 23:28:12 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1667111292; cv=none;
        d=google.com; s=arc-20160816;
        b=xVR+/YKPtg6Wsirjf+vwJFrsKNmQuFQP3138ooQy3D43TmbDkgUX7exMSC7UiC29nm
TOnVQmHS3YTgYYXx5hDmiLvKOgavrndXcXNUaJpxlmpUwBznHzkcrYnVjA+djmYequzu CaLPKyh1+HQCiVIkpGiCOvUfL8AShHOdmWSodobvB2Ns63KENo8oMKEtmVlpeGotRUbq 5e/+r/T9LjXclnzLfTeyqRAg6+tdwYJiGN6LvaFqJ7oK8bP1f4SaLChlCae0hOwJWblI r9DXUqt4Pxgf5iFWR8OKFr8cL9nDXQlqfttDvw7fP1iAbdyE/UOBt1z/Ta+jtbe9Uof/ m7NA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=4m/XcRhvQbIUZbGUheQ6e2Go0SypJr1nHnQlapTrNNI=; b=Y7kfe6Z+mOhOEpydqLXI5AKfLX8mklwLw2MlYieQpPMk3wouIaNnn6r3ba79y2+vrF LKG/K1086DThGF2tLMTq68IMEw3WozKmkpNLEyHr5aYh6qAw2PobOKvjet5hF3btnp9M Oh/VfQ+j3v0ar5Xr9u6vQ1m4lefmBdXSHmzT1IgZMRWL6lyVaqHBmRChzLMK7PGWQiIA 6TsPasnl4dZgPBhaQWlBFx86GnOu0aA2xhM8vl0OC4EIHm20W7aNObJtdDmQDV8k+wzC 0WI8+ZVRtY1kbMiWwA/Z6i37QifUNTma1pAin8Nk+j17FpxeOpvn9vycWowEd+fBA+H5 zonA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ck+EAyT3; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id qk32-20020a1709077fa000b0077b2e822b8fsi4340647ejc.76.2022.10.29.23.27.43; Sat, 29 Oct 2022 23:28:12 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ck+EAyT3; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230381AbiJ3G0q (ORCPT + 99 others); Sun, 30 Oct 2022 02:26:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46916 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229912AbiJ3GYK (ORCPT ); Sun, 30 Oct 2022 02:24:10 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0D052D8; Sat, 29 Oct 2022 23:24:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111049; x=1698647049; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=WS28xij+tQGHUc6KvLdGLN5pld0dZz13FOze1Aa2/4Y=; b=ck+EAyT36wqY6kfs4dkCZdB3tMF4jBcATa3AfUxkwti/nCB0uEbib74x AdFXgMziZaG1ZNpwwjkUrJmYerJUXTnKBUWqquHXGFjaa814qf0OmShpo HA+19cABAB8PyW8ttNOmek0VzxdjpEb4S/6aBqUYbvz4Iz30WIJZEmeIH lqW4jsTMWiM9hoLLvoLxF1S+VHhMpzzkEraFk9HxJVi8NUooVAd+wanpy yyhupOAAGl1/2vAS/qulwdui63SDVMatul+1pTVlqRuqtYe6cCPQQ5Gty ibopy0IFItchN0IgAqYiR9joNdtr8SU11ZbfQY2hXrg217PML5Q9iNlsT g==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037147" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037147" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) 
by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:02 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392958" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392958" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:02 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Sean Christopherson Subject: [PATCH v10 035/108] KVM: x86/mmu: Track shadow MMIO value on a per-VM basis Date: Sat, 29 Oct 2022 23:22:36 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092890215548652?= X-GMAIL-MSGID: =?utf-8?q?1748092890215548652?= From: Isaku Yamahata TDX will use a different shadow PTE entry value for MMIO from VMX. Add members to kvm_arch and track value for MMIO per-VM instead of global variables. By using the per-VM EPT entry value for MMIO, the existing VMX logic is kept working. To untangle the logic to initialize shadow_mmio_access_mask, introduce a separate setter function. 
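Reviewer's sketch, not part of the patch: the per-VM comparison described above can be modelled in standalone C as below. All `demo_*` / `DEMO_*` names and the bit positions are illustrative assumptions; the real `is_mmio_spte()` additionally checks `enable_mmio_caching`, which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel constants; values are illustrative. */
#define DEMO_RWX_MASK        0x7ULL        /* EPT bits 2:0 */
#define DEMO_MISCONFIG_WX    0x6ULL        /* 110b: write/execute */
#define DEMO_SUPPRESS_VE_BIT (1ULL << 63)

/* Models only the field this patch adds to struct kvm_arch. */
struct demo_vm {
	uint64_t shadow_mmio_value;
};

/* The mask stays global; only the value it is compared against is per VM. */
static const uint64_t demo_shadow_mmio_mask =
	DEMO_RWX_MASK | DEMO_SUPPRESS_VE_BIT;

static bool demo_is_mmio_spte(const struct demo_vm *vm, uint64_t spte)
{
	return (spte & demo_shadow_mmio_mask) == vm->shadow_mmio_value;
}
```

With one shared mask and a per-VM value, a conventional VM (value `WX | SUPPRESS_VE`) and a TD (value 0, per the previous patch's description) can classify the same bit pattern differently.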
Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/mmu.h              |  1 +
 arch/x86/kvm/mmu/mmu.c          |  7 ++++---
 arch/x86/kvm/mmu/spte.c         | 11 +++++++++--
 arch/x86/kvm/mmu/spte.h         |  4 ++--
 arch/x86/kvm/mmu/tdp_mmu.c      |  6 +++---
 6 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3374ec0d6d90..a1c801ca61d3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1171,6 +1171,8 @@ struct kvm_arch {
 	 */
 	spinlock_t mmu_unsync_pages_lock;

+	u64 shadow_mmio_value;
+
 	struct list_head assigned_dev_head;
 	struct iommu_domain *iommu_domain;
 	bool iommu_noncoherent;
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index a45f7a96b821..50d240d52697 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -101,6 +101,7 @@ static inline u8 kvm_get_shadow_phys_bits(void)
 }

 void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask);
+void kvm_mmu_set_mmio_spte_value(struct kvm *kvm, u64 mmio_value);
 void kvm_mmu_set_me_spte_mask(u64 me_value, u64 me_mask);
 void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only);

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index e7e11f51f8b4..0d3fa29ccccc 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2421,7 +2421,7 @@ static int mmu_page_zap_pte(struct kvm *kvm, struct kvm_mmu_page *sp,
 				return kvm_mmu_prepare_zap_page(kvm, child,
 								invalid_list);
 		}
-	} else if (is_mmio_spte(pte)) {
+	} else if (is_mmio_spte(kvm, pte)) {
 		mmu_spte_clear_no_track(spte);
 	}
 	return 0;
@@ -4081,7 +4081,7 @@ static int handle_mmio_page_fault(struct kvm_vcpu *vcpu, u64 addr, bool direct)
 	if (WARN_ON(reserved))
 		return -EINVAL;

-	if (is_mmio_spte(spte)) {
+	if (is_mmio_spte(vcpu->kvm, spte)) {
 		gfn_t gfn = get_mmio_spte_gfn(spte);
 		unsigned int access = get_mmio_spte_access(spte);

@@ -4578,7 +4578,7 @@ static unsigned long get_cr3(struct kvm_vcpu *vcpu)
 static bool sync_mmio_spte(struct kvm_vcpu *vcpu, u64 *sptep, gfn_t gfn,
 			   unsigned int access)
 {
-	if (unlikely(is_mmio_spte(*sptep))) {
+	if (unlikely(is_mmio_spte(vcpu->kvm, *sptep))) {
 		if (gfn != get_mmio_spte_gfn(*sptep)) {
 			mmu_spte_clear_no_track(sptep);
 			return true;
@@ -6061,6 +6061,7 @@ int kvm_mmu_init_vm(struct kvm *kvm)
 	struct kvm_page_track_notifier_node *node = &kvm->arch.mmu_sp_tracker;
 	int r;

+	kvm->arch.shadow_mmio_value = shadow_mmio_value;
 	INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
 	INIT_LIST_HEAD(&kvm->arch.zapped_obsolete_pages);
 	INIT_LIST_HEAD(&kvm->arch.lpage_disallowed_mmu_pages);
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 5d5c06d4fd89..8f468ee2b985 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -74,10 +74,10 @@ u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access)
 	u64 spte = generation_mmio_spte_mask(gen);
 	u64 gpa = gfn << PAGE_SHIFT;

-	WARN_ON_ONCE(!shadow_mmio_value);
+	WARN_ON_ONCE(!vcpu->kvm->arch.shadow_mmio_value);

 	access &= shadow_mmio_access_mask;
-	spte |= shadow_mmio_value | access;
+	spte |= vcpu->kvm->arch.shadow_mmio_value | access;
 	spte |= gpa | shadow_nonpresent_or_rsvd_mask;
 	spte |= (gpa & shadow_nonpresent_or_rsvd_mask)
 		<< SHADOW_NONPRESENT_OR_RSVD_MASK_LEN;
@@ -352,6 +352,7 @@ u64 mark_spte_for_access_track(u64 spte)
 void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
 {
 	BUG_ON((u64)(unsigned)access_mask != access_mask);
+
 	WARN_ON(mmio_value & shadow_nonpresent_or_rsvd_lower_gfn_mask);

 	/*
@@ -401,6 +402,12 @@ void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask)
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_set_mmio_spte_mask);

+void kvm_mmu_set_mmio_spte_value(struct kvm *kvm, u64 mmio_value)
+{
+	kvm->arch.shadow_mmio_value = mmio_value;
+}
+EXPORT_SYMBOL_GPL(kvm_mmu_set_mmio_spte_value);
+
 void kvm_mmu_set_me_spte_mask(u64 me_value, u64 me_mask)
 {
 	/* shadow_me_value must be a subset of shadow_me_mask */
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 7e0f79e8f45b..82f0d5c08b77 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -241,9 +241,9 @@ static inline int spte_index(u64 *sptep)
  */
 extern u64 __read_mostly shadow_nonpresent_or_rsvd_lower_gfn_mask;

-static inline bool is_mmio_spte(u64 spte)
+static inline bool is_mmio_spte(struct kvm *kvm, u64 spte)
 {
-	return (spte & shadow_mmio_mask) == shadow_mmio_value &&
+	return (spte & shadow_mmio_mask) == kvm->arch.shadow_mmio_value &&
 	       likely(enable_mmio_caching);
 }

diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 1eee9c159958..e07f14351d14 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -580,8 +580,8 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
 		 * impact the guest since both the former and current SPTEs
 		 * are nonpresent.
 		 */
-		if (WARN_ON(!is_mmio_spte(old_spte) &&
-			    !is_mmio_spte(new_spte) &&
+		if (WARN_ON(!is_mmio_spte(kvm, old_spte) &&
+			    !is_mmio_spte(kvm, new_spte) &&
 			    !is_removed_spte(new_spte)))
 			pr_err("Unexpected SPTE change! Nonpresent SPTEs\n"
 			       "should not be replaced with another,\n"
@@ -1105,7 +1105,7 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu,
 	}

 	/* If a MMIO SPTE is installed, the MMIO will need to be emulated. */
-	if (unlikely(is_mmio_spte(new_spte))) {
+	if (unlikely(is_mmio_spte(vcpu->kvm, new_spte))) {
 		vcpu->stat.pf_mmio_spte_created++;
 		trace_mark_mmio_spte(rcu_dereference(iter->sptep), iter->gfn,
 				     new_spte);

From patchwork Sun Oct 30 06:22:37 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12878
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 036/108] KVM: TDX: Enable mmio spte caching always for TDX
Date: Sat, 29 Oct 2022 23:22:37 -0700
Message-Id: <820bac8ce45b92d643630084096dcd7e71038a58.1667110240.git.isaku.yamahata@intel.com>

From: Isaku Yamahata

TDX needs the shared SPTE for an MMIO GFN to be set to
!SUPPRESS_VE_BIT | !RWX so that the guest TD gets a #VE and then issues
TDG.VP.VMCALL. Always enable MMIO caching for TDX, regardless of the
enable_mmio_caching module parameter.

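Reviewer's sketch, not part of the patch: the effective condition this patch introduces is "module parameter set, or the VM has a shared-GFN mask". The standalone model below assumes that a non-zero shared mask identifies a TD, which is how `kvm_gfn_shared_mask()` appears to be used in the diff; the `demo_*` names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: gfn_shared_mask is non-zero only for a TD. */
struct demo_vm {
	uint64_t gfn_shared_mask;
};

/*
 * Mirrors the shape of the new check: MMIO caching is in effect if the
 * module parameter allows it or the VM is a TD.
 */
static bool demo_mmio_caching_in_effect(bool module_param,
					const struct demo_vm *vm)
{
	return module_param || vm->gfn_shared_mask != 0;
}
```

Only the combination "parameter off, conventional VM" leaves MMIO caching disabled.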
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/mmu/mmu.c     | 3 ++-
 arch/x86/kvm/mmu/spte.h    | 2 +-
 arch/x86/kvm/mmu/tdp_mmu.c | 7 +++++++
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 0d3fa29ccccc..9098f77cdaa4 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3229,7 +3229,8 @@ static int handle_abnormal_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fau
 		 * and only if L1's MAXPHYADDR is inaccurate with respect to
 		 * the hardware's).
 		 */
-		if (unlikely(!enable_mmio_caching) ||
+		if (unlikely(!enable_mmio_caching &&
+			     !kvm_gfn_shared_mask(vcpu->kvm)) ||
 		    unlikely(fault->gfn > kvm_mmu_max_gfn()))
 			return RET_PF_EMULATE;
 	}
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 82f0d5c08b77..fecfdcb5f321 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -244,7 +244,7 @@ extern u64 __read_mostly shadow_nonpresent_or_rsvd_lower_gfn_mask;
 static inline bool is_mmio_spte(struct kvm *kvm, u64 spte)
 {
 	return (spte & shadow_mmio_mask) == kvm->arch.shadow_mmio_value &&
-	       likely(enable_mmio_caching);
+	       likely(enable_mmio_caching || kvm_gfn_shared_mask(kvm));
 }

 static inline bool is_shadow_present_pte(u64 pte)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index e07f14351d14..3325633b1cb5 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1875,6 +1875,13 @@ int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,

 	*root_level = vcpu->arch.mmu->root_role.level;

+	/*
+	 * mmio page fault isn't supported for protected guest because
+	 * instructions in protected guest memory can't be parsed by VMM.
+	 */
+	if (WARN_ON_ONCE(kvm_gfn_shared_mask(vcpu->kvm)))
+		return leaf;
+
 	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
 		leaf = iter.level;
 		sptes[leaf] = iter.old_spte;

From patchwork Sun Oct 30 06:22:38 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12861
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 037/108] KVM: x86/mmu: Disallow fast page fault on private GPA
Date: Sat, 29 Oct 2022 23:22:38 -0700
Message-Id: <6e3d747ef224b44ff1f14bcb26424a1a3c210fb9.1667110240.git.isaku.yamahata@intel.com>

From: Isaku Yamahata

TDX requires a TDX SEAMCALL to operate on the Secure EPT instead of
direct memory access, and a TDX SEAMCALL is a heavy operation, so a
fast page fault on a private GPA doesn't make sense. Disallow fast page
faults on private GPAs.

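Reviewer's sketch, not part of the patch: the early bail-out can be modelled as below. How `kvm_is_private_gpa()` decides privacy is not shown in this patch, so the "shared bit clear in a VM that has a shared mask" rule, the bit position, and every `demo_*` name are assumptions. The pre-existing eligibility checks are collapsed into a single flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12

struct demo_vm {
	uint64_t gfn_shared_mask;	/* assumed 0 for a conventional VM */
};

struct demo_fault {
	uint64_t addr;			/* faulting GPA */
	bool otherwise_fast;		/* stand-in for the existing checks */
};

/* Assumption: a GPA is private when the VM has a shared bit and it is clear. */
static bool demo_is_private_gpa(const struct demo_vm *vm, uint64_t gpa)
{
	uint64_t mask = vm->gfn_shared_mask;

	return mask && !((gpa >> DEMO_PAGE_SHIFT) & mask);
}

static bool demo_page_fault_can_be_fast(const struct demo_vm *vm,
					const struct demo_fault *fault)
{
	/* Private mappings are never handled on the fast path. */
	if (demo_is_private_gpa(vm, fault->addr))
		return false;

	return fault->otherwise_fast;
}
```

Under these assumptions a conventional VM is unaffected, and a TD can still take the fast path for shared GPAs.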
Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 arch/x86/kvm/mmu/mmu.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 9098f77cdaa4..09defac49bf0 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3238,8 +3238,16 @@ static int handle_abnormal_pfn(struct kvm_vcpu *vcpu, struct kvm_page_fault *fau
 	return RET_PF_CONTINUE;
 }

-static bool page_fault_can_be_fast(struct kvm_page_fault *fault)
+static bool page_fault_can_be_fast(struct kvm *kvm, struct kvm_page_fault *fault)
 {
+	/*
+	 * TDX private mapping doesn't support fast page fault because the EPT
+	 * entry is read/written with TDX SEAMCALLs instead of direct memory
+	 * access.
+	 */
+	if (kvm_is_private_gpa(kvm, fault->addr))
+		return false;
+
 	/*
 	 * Page faults with reserved bits set, i.e. faults on MMIO SPTEs, only
 	 * reach the common page fault handler if the SPTE has an invalid MMIO
@@ -3349,7 +3357,7 @@ static int fast_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	u64 *sptep = NULL;
 	uint retry_count = 0;

-	if (!page_fault_can_be_fast(fault))
+	if (!page_fault_can_be_fast(vcpu->kvm, fault))
 		return ret;

 	walk_shadow_page_lockless_begin(vcpu);

From patchwork Sun Oct 30 06:22:39 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12854
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Sean Christopherson
Subject: [PATCH v10 038/108] KVM: x86/mmu: Allow per-VM override of the TDP max page level
Date: Sat, 29 Oct 2022 23:22:39 -0700
Message-Id: <9ddbeda4c638ef8211d37f8b89f1adfb2a669959.1667110240.git.isaku.yamahata@intel.com>

From: Sean Christopherson

TDX requires special handling to support large private pages. For
simplicity, support only 4K pages for TD guests for now. Add a per-VM
maximum page level so that TD guests and conventional VMX guests can
use different maximum page sizes.

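Reviewer's sketch, not part of the patch: the per-VM maximum level can be modelled as a default set at VM init that each fault descriptor copies. The `demo_*` names are hypothetical, the level numbering is assumed to follow the kernel's `PG_LEVEL_*` convention (4K = 1, 2M = 2, 1G = 3), and the override to 4K for a TD is assumed to happen in a later patch.

```c
#include <assert.h>

/* Assumed to mirror PG_LEVEL_4K/2M/1G and KVM_MAX_HUGEPAGE_LEVEL. */
enum {
	DEMO_PG_LEVEL_4K = 1,
	DEMO_PG_LEVEL_2M = 2,
	DEMO_PG_LEVEL_1G = 3,
};
#define DEMO_MAX_HUGEPAGE_LEVEL DEMO_PG_LEVEL_1G

struct demo_vm {
	int tdp_max_page_level;
};

struct demo_fault {
	int max_level;
	int req_level;
	int goal_level;
};

/* Every VM starts with the architectural maximum. */
static void demo_vm_init(struct demo_vm *vm)
{
	vm->tdp_max_page_level = DEMO_MAX_HUGEPAGE_LEVEL;
}

/* Each fault picks up the VM's limit instead of a global constant. */
static struct demo_fault demo_make_fault(const struct demo_vm *vm)
{
	struct demo_fault fault = {
		.max_level = vm->tdp_max_page_level,
		.req_level = DEMO_PG_LEVEL_4K,
		.goal_level = DEMO_PG_LEVEL_4K,
	};

	return fault;
}
```

A TD would lower `tdp_max_page_level` to the 4K level after init, while conventional guests keep the default.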
Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata --- arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/mmu/mmu.c | 1 + arch/x86/kvm/mmu/mmu_internal.h | 2 +- 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index a1c801ca61d3..2e0b23422d63 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1156,6 +1156,7 @@ struct kvm_arch { unsigned long n_requested_mmu_pages; unsigned long n_max_mmu_pages; unsigned int indirect_shadow_pages; + int tdp_max_page_level; u8 mmu_valid_gen; struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES]; struct list_head active_mmu_pages; diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 09defac49bf0..0001e921154e 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6092,6 +6092,7 @@ int kvm_mmu_init_vm(struct kvm *kvm) kvm->arch.split_desc_cache.kmem_cache = pte_list_desc_cache; kvm->arch.split_desc_cache.gfp_zero = __GFP_ZERO; + kvm->arch.tdp_max_page_level = KVM_MAX_HUGEPAGE_LEVEL; return 0; } diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index 5cdff5ca546c..57e3ea2b52cc 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -275,7 +275,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, .nx_huge_page_workaround_enabled = is_nx_huge_page_enabled(vcpu->kvm), - .max_level = KVM_MAX_HUGEPAGE_LEVEL, + .max_level = vcpu->kvm->arch.tdp_max_page_level, .req_level = PG_LEVEL_4K, .goal_level = PG_LEVEL_4K, }; From patchwork Sun Oct 30 06:22:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12857 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1665982wru; Sat, 29 Oct 2022 23:28:24 -0700 (PDT) X-Google-Smtp-Source: 
AMsMyM67yi1UfFovhlOon2QJZOoTjVXtiy0I7+eJIa7xQkJcNougD/j/TSKoV2ZpNnCRQFcF/+3A X-Received: by 2002:a17:906:9c83:b0:779:c14c:55e4 with SMTP id fj3-20020a1709069c8300b00779c14c55e4mr6779149ejc.619.1667111304354; Sat, 29 Oct 2022 23:28:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111304; cv=none; d=google.com; s=arc-20160816; b=N6BEMtTHcSoMshq+Q5H8eyE8ImDD7z41lzrLAb1d43uLO/MWvTGj13cDfoA65FOM/d pQ0wmDgWlzXkrURCxE7LOUovIDXnVGQfva6uXxX7EityZOJw65b0H4GIeb2tupi9A412 zlUqbHXjUfmsx1oQbC3Z8Iz+QpMmXOmtJa8E7i1Fjy4su9xsMqb809B6fGaOV0ocKbm8 uBnBqq1OTrshPshKGctCcHG1E3Rx0/s7ZaaHV+zr6L/SK3NL1G5lx2+epcOvCsN7OzCt bemobkh90xcAeuwQJU1Uk4gSFVKRIZ4kYgilaU40vGwb5sRbKbmT6rbjEMqHDuC58XUl UpRg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=81nvyc5luY6S/FmpAeEUpQNjohFg7QFZd7wIZT9wMVo=; b=KZLafouLYRN4IKPvVw952xq6ZepZzXEI2nDCBt6H3V1E93M5f37aVyHEcYogwRmF82 D4DDqij+5VcEcC93dOzg/u69WK7VET3TwumE+XCFtlyEToEqtbARryCFLsrwqiGClP19 OlIiZShVoIRoxH4JS2tr6geYjzaJMm0jQQzMG7psMB9GU1hz3G2iF/CCrVZPHJ6nkb7p szL+oRwJFEQrBAZUPLMm1u+/PRiDmCJQq2gfxeSzriFv69seiQGMDBIU+T/mjvLiA2pH 7xrEZhdUfq6TmdlLNhRKS8saE4x8O6sy1v58RJ5HVzNnk0aqXKDgN6NRKRYap+hlzPd7 /Y9g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=iz7b05at; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id sc36-20020a1709078a2400b0073155abc1b8si3837180ejc.154.2022.10.29.23.27.53; Sat, 29 Oct 2022 23:28:24 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=iz7b05at; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230391AbiJ3G07 (ORCPT + 99 others); Sun, 30 Oct 2022 02:26:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46946 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229923AbiJ3GYL (ORCPT ); Sun, 30 Oct 2022 02:24:11 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B727F119; Sat, 29 Oct 2022 23:24:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111049; x=1698647049; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+7BTlOk9N2U6BnlePCe66dB3SDCA9PryIW++EahJqO8=; b=iz7b05atofNomism7X2NLfU3bdW4Ts5XY8qWzvz/NWCj3PtcdarAu1VZ W06Bz2b8M4CxBI7VOhsW4B7FgmwDXduFdmwDeOcUhIXM1P59VE+U+QWxt L8aQXW3KEk997zHiRW83SSskYZjK2V08JSK31/EMDViDPTvOe1clT23+l q5HHg/o+0by3NQExeqxUGla35pClXfsVxuBmoxblCvN9I9zVNa8cu0CWz M6+XAphV6cVwUX8omKGMC8NiNbQnrtgJouYpxf4x2iZDzI4SNC0jJgsab DW7zrLdRSx5hOUv5zY8Gy5P0LS9E5SRTZIIGg9hpDd3FXP34pwqowbGRZ Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037151" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037151" Received: from fmsmga006.fm.intel.com 
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:03 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392972" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392972" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:03 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 039/108] KVM: VMX: Introduce test mode related to EPT violation VE Date: Sat, 29 Oct 2022 23:22:40 -0700 Message-Id: <8b3101711c5291246653efb50cc2975863d3a8ab.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092903124401550?= X-GMAIL-MSGID: =?utf-8?q?1748092903124401550?= From: Isaku Yamahata To support TDX, KVM is enhanced to operate with #VE. For TDX, KVM programs the CPU to inject #VE conditionally and sets the #VE suppress bit in EPT entries. For the VMX case, #VE isn't used, so a #VE in a VMX guest indicates a bug. To be defensive (to test that the VMX case isn't broken), introduce the module parameter ept_violation_ve_test. When it is set, KVM enables EPT-violation #VE, intercepts #VE, and treats an unexpected #VE as a KVM bug (KVM_BUG_ON(), returning -EIO).
Suggested-by: Paolo Bonzini Signed-off-by: Isaku Yamahata --- arch/x86/include/asm/vmx.h | 12 +++++++ arch/x86/kvm/vmx/vmcs.h | 5 +++ arch/x86/kvm/vmx/vmx.c | 69 +++++++++++++++++++++++++++++++++++++- arch/x86/kvm/vmx/vmx.h | 6 +++- 4 files changed, 90 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h index cdbf12c1a83c..752d53652007 100644 --- a/arch/x86/include/asm/vmx.h +++ b/arch/x86/include/asm/vmx.h @@ -68,6 +68,7 @@ #define SECONDARY_EXEC_ENCLS_EXITING VMCS_CONTROL_BIT(ENCLS_EXITING) #define SECONDARY_EXEC_RDSEED_EXITING VMCS_CONTROL_BIT(RDSEED_EXITING) #define SECONDARY_EXEC_ENABLE_PML VMCS_CONTROL_BIT(PAGE_MOD_LOGGING) +#define SECONDARY_EXEC_EPT_VIOLATION_VE VMCS_CONTROL_BIT(EPT_VIOLATION_VE) #define SECONDARY_EXEC_PT_CONCEAL_VMX VMCS_CONTROL_BIT(PT_CONCEAL_VMX) #define SECONDARY_EXEC_XSAVES VMCS_CONTROL_BIT(XSAVES) #define SECONDARY_EXEC_MODE_BASED_EPT_EXEC VMCS_CONTROL_BIT(MODE_BASED_EPT_EXEC) @@ -223,6 +224,8 @@ enum vmcs_field { VMREAD_BITMAP_HIGH = 0x00002027, VMWRITE_BITMAP = 0x00002028, VMWRITE_BITMAP_HIGH = 0x00002029, + VE_INFORMATION_ADDRESS = 0x0000202A, + VE_INFORMATION_ADDRESS_HIGH = 0x0000202B, XSS_EXIT_BITMAP = 0x0000202C, XSS_EXIT_BITMAP_HIGH = 0x0000202D, ENCLS_EXITING_BITMAP = 0x0000202E, @@ -628,4 +631,13 @@ enum vmx_l1d_flush_state { extern enum vmx_l1d_flush_state l1tf_vmx_mitigation; +struct vmx_ve_information { + u32 exit_reason; + u32 delivery; + u64 exit_qualification; + u64 guest_linear_address; + u64 guest_physical_address; + u16 eptp_index; +}; + #endif diff --git a/arch/x86/kvm/vmx/vmcs.h b/arch/x86/kvm/vmx/vmcs.h index ac290a44a693..9277676057a7 100644 --- a/arch/x86/kvm/vmx/vmcs.h +++ b/arch/x86/kvm/vmx/vmcs.h @@ -140,6 +140,11 @@ static inline bool is_nm_fault(u32 intr_info) return is_exception_n(intr_info, NM_VECTOR); } +static inline bool is_ve_fault(u32 intr_info) +{ + return is_exception_n(intr_info, VE_VECTOR); +} + /* Undocumented: icebp/int1 */ static inline bool 
is_icebp(u32 intr_info) { diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index f890191e8580..dd3fde9d3c32 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -126,6 +126,9 @@ module_param(error_on_inconsistent_vmcs_config, bool, 0444); static bool __read_mostly dump_invalid_vmcs = 0; module_param(dump_invalid_vmcs, bool, 0644); +static bool __read_mostly ept_violation_ve_test; +module_param(ept_violation_ve_test, bool, 0444); + #define MSR_BITMAP_MODE_X2APIC 1 #define MSR_BITMAP_MODE_X2APIC_APICV 2 @@ -783,6 +786,13 @@ void vmx_update_exception_bitmap(struct kvm_vcpu *vcpu) eb = (1u << PF_VECTOR) | (1u << UD_VECTOR) | (1u << MC_VECTOR) | (1u << DB_VECTOR) | (1u << AC_VECTOR); + /* + * #VE isn't used for VMX, but for TDX. To test against unexpected + * change related to #VE for VMX, intercept unexpected #VE and warn on + * it. + */ + if (ept_violation_ve_test) + eb |= 1u << VE_VECTOR; /* * Guest access to VMware backdoor ports could legitimately * trigger #GP because of TSS I/O permission bitmap. 
@@ -2644,6 +2654,9 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf, &_cpu_based_2nd_exec_control)) return -EIO; } + if (!ept_violation_ve_test) + _cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_EPT_VIOLATION_VE; + #ifndef CONFIG_X86_64 if (!(_cpu_based_2nd_exec_control & SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)) @@ -2668,6 +2681,7 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf, return -EIO; vmx_cap->ept = 0; + _cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_EPT_VIOLATION_VE; } if (!(_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_VPID) && vmx_cap->vpid) { @@ -4510,6 +4524,7 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx) exec_control &= ~SECONDARY_EXEC_ENABLE_VPID; if (!enable_ept) { exec_control &= ~SECONDARY_EXEC_ENABLE_EPT; + exec_control &= ~SECONDARY_EXEC_EPT_VIOLATION_VE; enable_unrestricted_guest = 0; } if (!enable_unrestricted_guest) @@ -4637,8 +4652,40 @@ static void init_vmcs(struct vcpu_vmx *vmx) exec_controls_set(vmx, vmx_exec_control(vmx)); - if (cpu_has_secondary_exec_ctrls()) + if (cpu_has_secondary_exec_ctrls()) { secondary_exec_controls_set(vmx, vmx_secondary_exec_control(vmx)); + if (secondary_exec_controls_get(vmx) & + SECONDARY_EXEC_EPT_VIOLATION_VE) { + if (!vmx->ve_info) { + /* ve_info must be page aligned. */ + struct page *page; + + BUILD_BUG_ON(sizeof(*vmx->ve_info) > PAGE_SIZE); + page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO); + if (page) + vmx->ve_info = page_to_virt(page); + } + if (vmx->ve_info) { + /* + * Allow #VE delivery. CPU sets this field to + * 0xFFFFFFFF on #VE delivery. Another #VE can + * occur only if software clears the field. + */ + vmx->ve_info->delivery = 0; + vmcs_write64(VE_INFORMATION_ADDRESS, + __pa(vmx->ve_info)); + } else { + /* + * Because SECONDARY_EXEC_EPT_VIOLATION_VE is + * used only when ept_violation_ve_test is true, + * it's okay to go with the bit disabled. + */ + pr_err("Failed to allocate ve_info.
disabling EPT_VIOLATION_VE.\n"); + secondary_exec_controls_clearbit(vmx, + SECONDARY_EXEC_EPT_VIOLATION_VE); + } + } + } if (cpu_has_tertiary_exec_ctrls()) tertiary_exec_controls_set(vmx, vmx_tertiary_exec_control(vmx)); @@ -5118,6 +5165,12 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) if (is_invalid_opcode(intr_info)) return handle_ud(vcpu); + /* + * #VE isn't supposed to happen. Although vcpu can send + */ + if (KVM_BUG_ON(is_ve_fault(intr_info), vcpu->kvm)) + return -EIO; + error_code = 0; if (intr_info & INTR_INFO_DELIVER_CODE_MASK) error_code = vmcs_read32(VM_EXIT_INTR_ERROR_CODE); @@ -6306,6 +6359,18 @@ void dump_vmcs(struct kvm_vcpu *vcpu) if (secondary_exec_control & SECONDARY_EXEC_ENABLE_VPID) pr_err("Virtual processor ID = 0x%04x\n", vmcs_read16(VIRTUAL_PROCESSOR_ID)); + if (secondary_exec_control & SECONDARY_EXEC_EPT_VIOLATION_VE) { + struct vmx_ve_information *ve_info; + + pr_err("VE info address = 0x%016llx\n", + vmcs_read64(VE_INFORMATION_ADDRESS)); + ve_info = __va(vmcs_read64(VE_INFORMATION_ADDRESS)); + pr_err("ve_info: 0x%08x 0x%08x 0x%016llx 0x%016llx 0x%016llx 0x%04x\n", + ve_info->exit_reason, ve_info->delivery, + ve_info->exit_qualification, + ve_info->guest_linear_address, + ve_info->guest_physical_address, ve_info->eptp_index); + } } /* @@ -7302,6 +7367,8 @@ void vmx_vcpu_free(struct kvm_vcpu *vcpu) free_vpid(vmx->vpid); nested_vmx_free_vcpu(vcpu); free_loaded_vmcs(vmx->loaded_vmcs); + if (vmx->ve_info) + free_page((unsigned long)vmx->ve_info); } int vmx_vcpu_create(struct kvm_vcpu *vcpu) diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index d49d0ace9fb8..1813caeb24d8 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -359,6 +359,9 @@ struct vcpu_vmx { DECLARE_BITMAP(read, MAX_POSSIBLE_PASSTHROUGH_MSRS); DECLARE_BITMAP(write, MAX_POSSIBLE_PASSTHROUGH_MSRS); } shadow_msr_intercept; + + /* ve_info must be page aligned. 
*/ + struct vmx_ve_information *ve_info; }; struct kvm_vmx { @@ -570,7 +573,8 @@ static inline u8 vmx_get_rvi(void) SECONDARY_EXEC_ENABLE_VMFUNC | \ SECONDARY_EXEC_BUS_LOCK_DETECTION | \ SECONDARY_EXEC_NOTIFY_VM_EXITING | \ - SECONDARY_EXEC_ENCLS_EXITING) + SECONDARY_EXEC_ENCLS_EXITING | \ + SECONDARY_EXEC_EPT_VIOLATION_VE) #define KVM_REQUIRED_VMX_TERTIARY_VM_EXEC_CONTROL 0 #define KVM_OPTIONAL_VMX_TERTIARY_VM_EXEC_CONTROL \ From patchwork Sun Oct 30 06:22:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12859 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666025wru; Sat, 29 Oct 2022 23:28:36 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4FIrje4NF6lvE1QEiHvN0y3mfj0jdgNqp2eyL78ObTEfL+ICe2eOgY8rRx4SgJ2avpUCFP X-Received: by 2002:a17:906:1ec5:b0:78d:b3d1:183b with SMTP id m5-20020a1709061ec500b0078db3d1183bmr6953074ejj.709.1667111316275; Sat, 29 Oct 2022 23:28:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111316; cv=none; d=google.com; s=arc-20160816; b=vW/8VTvMWzkCG8hivaIi6sWTh0IWZc7mPW8AJFPwi2qz6wTEz3cHQtTc/rGWFpVCql 1BSGVItBiqzrpMF+LRxfW6zWpBiSZ5EhV9a5kbI8mN++dG29H5hinYYmKHOh8DzBIF/L CjEapvkFzEcLIn42k23J+cgIDzhaq1+V8n45HcAGW71/kObpt9a1MwXfikFr2C+0dZOr +aCiX/EnXPlfWnhY89sLwBA4gFo3f9wmNRYnDK3DmI+vDorTdrOOQBgXQMwwqKl4yKaN hfJyblLVUTcdmVYhmnakVc22CUbxyuBiUquchM4I2ltct3TQkcLkM8szELDmClddQ6vc Wp3w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=efCqntlzrh0FcO85waAlgosIGgiyH/28a2FyRq96624=; b=TYWSwcaYgi/fk/ivzqpU6Wuhf7s0FL1TZHpXOBES/C6iEUFDJo8bVPYmirkpq6mrP5 hrCKTamQIExTFMbH1eE0ucCeZUdfHDy8nCPTd2xzDRn8WbWbLpR6xcx4Fa3pO1VOSYQw dk7dxOfBju9mzzNz09170e0Gds1Zq29TDBz/RkAQssTbQ2xIkOwPgBHHlQI6/+WtBKvn 
eyel2qblYAITxfZQK86x/Zx2P9mlnGndPQ1+glWyJTcxyGRHGwGmugf0V569b0LDRRrB EgTLgDrzRpj4ZHe1HJG0W7luyV9/mupPM+fTSsGKUmaNorfFnU8R0ctgGYPwKrYOmXP0 4gSg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="g4TObN/Z"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id s20-20020aa7d794000000b0045903f0af9dsi3279107edq.111.2022.10.29.23.28.11; Sat, 29 Oct 2022 23:28:36 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="g4TObN/Z"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230400AbiJ3G1F (ORCPT + 99 others); Sun, 30 Oct 2022 02:27:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47210 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229926AbiJ3GYL (ORCPT ); Sun, 30 Oct 2022 02:24:11 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B6E61115; Sat, 29 Oct 2022 23:24:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111049; x=1698647049; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9+zynqxUnbt7L2tmrykvNHmz+Wt/xr8Fj4lA+9l+hKk=; 
b=g4TObN/ZmXLocUM6TmKGs0dTcTrxY/LBI4+S4Hy0x337usVOHhK8N7S2 +YYyCdEFt3DU5q4lz3YnWYj2oFJzqeM+E//Dca+scRXGYrUF55P+5DNRp /+eH6oq8umn5kPdA8CaqAdiOoYpW+Pn/Y+U6U3QFX52a1XcDG7Yu6Bl9b OvKdXUV6D8t2RHIi91XaNvlcWdRewwXtRj3dRKPa3BlYLKTeE5bKwXXHe K0Uv42B3AypOsF3GpiOnM1YLp2B8TSUqWsGUxF7YND7E0/qOjyHy6EjFt c4kNsCYfWhVG3DKEibU7wn8C0OvOZpS2GlTAM6iwZ5k330G6kx9hODIwK w==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037152" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037152" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:03 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392975" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392975" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:03 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 040/108] [MARKER] The start of TDX KVM patch series: KVM TDP MMU hooks Date: Sat, 29 Oct 2022 23:22:41 -0700 Message-Id: <72a30156f43914290e283d5228bd099cd552b062.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092915679426846?= X-GMAIL-MSGID: =?utf-8?q?1748092915679426846?= From: 
Isaku Yamahata This empty commit is to mark the start of patch series of KVM TDP MMU hooks. Signed-off-by: Isaku Yamahata --- Documentation/virt/kvm/intel-tdx-layer-status.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/virt/kvm/intel-tdx-layer-status.rst b/Documentation/virt/kvm/intel-tdx-layer-status.rst index df003d2ed89e..d5cace00c433 100644 --- a/Documentation/virt/kvm/intel-tdx-layer-status.rst +++ b/Documentation/virt/kvm/intel-tdx-layer-status.rst @@ -25,6 +25,6 @@ Patch Layer status * TD vcpu interrupts/exit/hypercall: Not yet * KVM MMU GPA shared bits: Applied -* KVM TDP refactoring for TDX: Applying -* KVM TDP MMU hooks: Not yet +* KVM TDP refactoring for TDX: Applied +* KVM TDP MMU hooks: Applying * KVM TDP MMU MapGPA: Not yet From patchwork Sun Oct 30 06:22:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12856 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1665959wru; Sat, 29 Oct 2022 23:28:18 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6eDqLl5jUIndaRiJFJC3Yf73WfB73DspwYplhbOt0GMVsM2wj17bN+LhGDtvR/Eao1eW8I X-Received: by 2002:a05:6402:366:b0:463:11e8:13cb with SMTP id s6-20020a056402036600b0046311e813cbmr4765463edw.367.1667111298192; Sat, 29 Oct 2022 23:28:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111298; cv=none; d=google.com; s=arc-20160816; b=XdyrJwJGWSLvjGz92hC7fLgy5i9Ks1vF5wPP97C4yP2nN4cRkXW/wIkKqoQGRUnnjS bPuRjucbx2aiqApcXtyGUpy34UhlwjwYg6ZNb6q3wOb9ab1c4TrqxgwE/4fQYkxjmk8u YeLgCCTimGLlWnaaZbJoRO7BxtiFtSWgN9VJv17SCKMXQTlCjeqrrRd777n0c64bO4pS Qy2bAJojkPUHJdeNHqsn5vl3OXvvYAV1Y99W2A6jyi9TPJjOuYIvIRFIsGL02WHFpPIG YRvrPVNsboREw1qZJI/b8dB4iDgCb/R8v5jpPYOWWXyaCHyFwTQJcE7hgl/5qPMTKvux hESg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version 
:references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=y3/7NxdHjMlIhEi2QOKsAbZ7afUPdh3yBKi3BAKb6f8=; b=cYNuMRUHcx+e9FOBU+qbmYuQmFKUW/yDA9reC7P6jRrU8wxPWB2a2DcKEH9FD8fg/N iz0tD+7lrD819v+1t0YPr77vPpLY8SumvN72canOrN5Qu1NKKa5uvozE3dckomkGAJfs WZlCK8hH0TtW9kC4g9cpGOrFgF87bASGt/VgIrI82OiOWIAFL3qbAfQ66ytWmpTpmU6p FDDh3tq4gocE7EPhpvbr4vG+Tq3kOomDSJqJbOqLzNg7di/+mP0gnr2jPTy4MIGtPATX WGKE/6P+lUCTgee+mq7KL72R8iyqGsQ5i1hFw8QFqx3a0kmJ+0ThcMHWYIf9z3gM2f1b tBMg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=E6dgYF4E; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e18-20020a170906505200b007acef6b1f00si3607099ejk.419.2022.10.29.23.27.47; Sat, 29 Oct 2022 23:28:18 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=E6dgYF4E; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230386AbiJ3G0v (ORCPT + 99 others); Sun, 30 Oct 2022 02:26:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46854 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229925AbiJ3GYL (ORCPT ); Sun, 30 Oct 2022 02:24:11 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 
03C81126; Sat, 29 Oct 2022 23:24:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111049; x=1698647049; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=NKuth91WRs/SZEoHEI1aWkm/ClHWtQ+fWSUlO4AXg5k=; b=E6dgYF4EjrMxHT4FncLk7ukclniLDARsZlKM/48gxe4zMcnLbPr65ovR DnDM/bzzfe1T839BbYEBXMmmQeftG0R41wZoYSi6ikSc9APWSm8FXJZv+ 3sekp4mOXlNSMCmik8pc2i7vp1uCionK0d0/u/Y8Ol/6ZiOonq2emZE1L sZCDzRO1xUUagEVs11DCl76b2jJfgSlEM7pZOpwz89UKMxf4H8IV2KDX+ 2yxgmdTCksrFCySFa6+TlwNGgz/RCh3xr0BPGFfxh3hdmwQf6qCJhQ4C3 9iZ4bqdhI/meKMXhe/9q/KMA3GkBPmyKEy6YfJiMCG/s1WEoE4s8hqs/S g==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037153" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037153" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:03 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392978" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392978" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:03 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 041/108] KVM: x86/tdp_mmu: refactor kvm_tdp_mmu_map() Date: Sat, 29 Oct 2022 23:22:42 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net 
Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092896810956169?= X-GMAIL-MSGID: =?utf-8?q?1748092896810956169?= From: Isaku Yamahata Factor out non-leaf SPTE population logic from kvm_tdp_mmu_map(). MapGPA hypercall needs to populate non-leaf SPTE to record which GPA, private or shared, is allowed in the leaf EPT entry. Signed-off-by: Isaku Yamahata --- arch/x86/kvm/mmu/tdp_mmu.c | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 3325633b1cb5..11b0ec8aebe2 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1157,6 +1157,24 @@ static int tdp_mmu_link_sp(struct kvm *kvm, struct tdp_iter *iter, return 0; } +static int tdp_mmu_populate_nonleaf(struct kvm_vcpu *vcpu, struct tdp_iter *iter, + bool account_nx) +{ + struct kvm_mmu_page *sp; + int ret; + + KVM_BUG_ON(is_shadow_present_pte(iter->old_spte), vcpu->kvm); + KVM_BUG_ON(is_removed_spte(iter->old_spte), vcpu->kvm); + + sp = tdp_mmu_alloc_sp(vcpu); + tdp_mmu_init_child_sp(sp, iter); + + ret = tdp_mmu_link_sp(vcpu->kvm, iter, sp, account_nx, true); + if (ret) + tdp_mmu_free_sp(sp); + return ret; +} + /* * Handle a TDP page fault (NPT/EPT violation/misconfiguration) by installing * page tables and SPTEs to translate the faulting guest physical address. 
@@ -1165,7 +1183,6 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) { struct kvm_mmu *mmu = vcpu->arch.mmu; struct tdp_iter iter; - struct kvm_mmu_page *sp; int ret; kvm_mmu_hugepage_adjust(vcpu, fault); @@ -1211,13 +1228,8 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) if (is_removed_spte(iter.old_spte)) break; - sp = tdp_mmu_alloc_sp(vcpu); - tdp_mmu_init_child_sp(sp, &iter); - - if (tdp_mmu_link_sp(vcpu->kvm, &iter, sp, account_nx, true)) { - tdp_mmu_free_sp(sp); + if (tdp_mmu_populate_nonleaf(vcpu, &iter, account_nx)) break; - } } } From patchwork Sun Oct 30 06:22:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12863 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666078wru; Sat, 29 Oct 2022 23:28:48 -0700 (PDT) X-Google-Smtp-Source: AMsMyM79Hc+ki7BjiwjuPN3+2K4vbhgsZL6QCXf+Et6IQTbvIERhpCX0hPZEbY8ArJfZaPrQzlzG X-Received: by 2002:a17:907:e8d:b0:791:a798:7e09 with SMTP id ho13-20020a1709070e8d00b00791a7987e09mr6836340ejc.717.1667111328192; Sat, 29 Oct 2022 23:28:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111328; cv=none; d=google.com; s=arc-20160816; b=1EEdrmp0/UJ7qquXlPevSCvvIKtksGJ3cj/CH+H8GvTQR02bI6kfigU6shDLSYa/hj Qr5OSNLYCyIPH3zZHaErjZhJgBmyEse6NcB6G+G9fYHIRhoUJL4NpDUwX4RaGxz6qhq+ nMEBBZD3vz50B5TNDJQ/rj296rtzqN30267cl+3Ph7acSpGDvdPRC5BBQnx5Xfz9lADH tIktwt7ZicxoGEdRWw7an2FQrHhwzbzeQeR+UJEqKsMjGuGSTa8i4M2FMczWzma5Rh36 iDSlZplQHy30vlyW4aiAKqySnvr014/4GKBiOjZspMhC7AeQicSIuDFaWRLlfPwU1HCD Dk4A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=FKRStfSep/Bw4zZ4KysolErIpuguUlS/7TyUFJgg9PA=; b=lE+goQIH4QmHva8glnc+s2/y/wWYpVq9xJcABVlCRkkOEM+i4z5my39bCYfkeXw73x 
U/i1WTznRl6UQCLjYuHWgCgq9w+hf3Bez9ejj9nW6QQUNcpRiH4wSZyRRBWzt3frJTNw j526/zWZsMK7Ci7MdzxHwceunHah3dW1d2Qu6/o+Xi3NqYNasw4GZdwUfuNGi+OE5rB4 PkMJxALYIt6LZ8iFvxxeEis+FbcPYOzH6DaM0mFejqit8fqn6ICGxL6YXT6Lh1U0/vII QV7nqI1EflHglzUyyoOeb+zjZG7Dxquj1QDGorfGoHBKiPIagnQOda8EC90zz8J7i6sa hPHw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="XLU6/LTB"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id gs36-20020a1709072d2400b0073146b3f95asi4290311ejc.632.2022.10.29.23.28.22; Sat, 29 Oct 2022 23:28:48 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="XLU6/LTB"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230410AbiJ3G1N (ORCPT + 99 others); Sun, 30 Oct 2022 02:27:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229681AbiJ3GYX (ORCPT ); Sun, 30 Oct 2022 02:24:23 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 46B8E13A; Sat, 29 Oct 2022 23:24:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111050; x=1698647050; 
h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=jgOSKQCRPSf9U9FNajp69NpYFLNQCOdQ1keCTt2YExI=; b=XLU6/LTBrsKJpWwiQvvYUR4p5iBu+3UKrENSgazyrLPXpMEvtRjLVXSG AwWNdzsyPW3pipPOMV4XXAPqBDIzjfg6GVafXgWU2nsizmRHdzqxq2D/W ZJiOmBmGl1DiEyiINT1XveWOuW/T1swaBiOSdnFuwdt2x5OvbjYzQv7C+ 9poiBZJoTfgITfJFk/uPhLHnn0i4FrlhfyP1+3rSu/p2+qbN+BEVxkRIc 5vSmCOjAkZnajRpZHVUMSPzhBvm6qqIuLq64FM4mqGsIAQs/Nxs0OAiE3 YF6pMqaxqOS39jTDKG+rg5Qf0IH16znQmTU/j4jqK1Cqwshv9Y0cXYQ0x Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037154" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037154" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:04 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392982" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392982" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:03 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 042/108] KVM: x86/tdp_mmu: Init role member of struct kvm_mmu_page at allocation Date: Sat, 29 Oct 2022 23:22:43 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: 
=?utf-8?q?1748092927815549645?= X-GMAIL-MSGID: =?utf-8?q?1748092927815549645?= From: Isaku Yamahata Refactor tdp_mmu_alloc_sp() and tdp_mmu_init_sp(), and eliminate tdp_mmu_init_child_sp(). Currently tdp_mmu_init_sp() (or tdp_mmu_init_child_sp()) sets kvm_mmu_page.role after tdp_mmu_alloc_sp() allocates struct kvm_mmu_page and its page table page. This patch makes tdp_mmu_alloc_sp() initialize kvm_mmu_page.role instead of tdp_mmu_init_sp(). To handle private page tables, an is_private argument needs to be passed down. Given that the page level is already passed down, it would be cumbersome to add one more parameter about the sp. Instead, replace the level argument with union kvm_mmu_page_role. Thus the number of arguments does not increase and more information about the sp can be passed down. For a private sp, a secure page table will also be allocated in addition to struct kvm_mmu_page and the page table (spt member). The allocation functions (tdp_mmu_alloc_sp() and __tdp_mmu_alloc_sp_for_split()) need to know whether the allocation is for a conventional page table or a private page table. Pass union kvm_mmu_page_role to those functions and initialize the role member of struct kvm_mmu_page.
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/mmu/tdp_iter.h | 12 ++++++++++
 arch/x86/kvm/mmu/tdp_mmu.c  | 44 ++++++++++++++++---------------------
 2 files changed, 31 insertions(+), 25 deletions(-)

diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h
index f0af385c56e0..9e56a5b1024c 100644
--- a/arch/x86/kvm/mmu/tdp_iter.h
+++ b/arch/x86/kvm/mmu/tdp_iter.h
@@ -115,4 +115,16 @@ void tdp_iter_start(struct tdp_iter *iter, struct kvm_mmu_page *root,
 void tdp_iter_next(struct tdp_iter *iter);
 void tdp_iter_restart(struct tdp_iter *iter);

+static inline union kvm_mmu_page_role tdp_iter_child_role(struct tdp_iter *iter)
+{
+	union kvm_mmu_page_role child_role;
+	struct kvm_mmu_page *parent_sp;
+
+	parent_sp = sptep_to_sp(rcu_dereference(iter->sptep));
+
+	child_role = parent_sp->role;
+	child_role.level--;
+	return child_role;
+}
+
 #endif /* __KVM_X86_MMU_TDP_ITER_H */
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 11b0ec8aebe2..c6be76f9849c 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -271,22 +271,28 @@ static struct kvm_mmu_page *tdp_mmu_next_root(struct kvm *kvm,
 		    kvm_mmu_page_as_id(_root) != _as_id) {		\
 		} else

-static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu)
+static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu,
+					     union kvm_mmu_page_role role)
 {
 	struct kvm_mmu_page *sp;

 	sp = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache);
 	sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache);
+	sp->role = role;

 	return sp;
 }

 static void tdp_mmu_init_sp(struct kvm_mmu_page *sp, tdp_ptep_t sptep,
-			    gfn_t gfn, union kvm_mmu_page_role role)
+			    gfn_t gfn)
 {
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);

-	sp->role = role;
+	/*
+	 * role must be set before calling this function.  At least role.level
+	 * is not 0 (PG_LEVEL_NONE).
+	 */
+	WARN_ON_ONCE(!sp->role.word);
 	sp->gfn = gfn;
 	sp->ptep = sptep;
 	sp->tdp_mmu_page = true;
@@ -294,20 +300,6 @@ static void tdp_mmu_init_sp(struct kvm_mmu_page *sp, tdp_ptep_t sptep,
 	trace_kvm_mmu_get_page(sp, true);
 }

-static void tdp_mmu_init_child_sp(struct kvm_mmu_page *child_sp,
-				  struct tdp_iter *iter)
-{
-	struct kvm_mmu_page *parent_sp;
-	union kvm_mmu_page_role role;
-
-	parent_sp = sptep_to_sp(rcu_dereference(iter->sptep));
-
-	role = parent_sp->role;
-	role.level--;
-
-	tdp_mmu_init_sp(child_sp, iter->sptep, iter->gfn, role);
-}
-
 hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu)
 {
 	union kvm_mmu_page_role role = vcpu->arch.mmu->root_role;
@@ -326,8 +318,8 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu)
 		goto out;
 	}

-	root = tdp_mmu_alloc_sp(vcpu);
-	tdp_mmu_init_sp(root, NULL, 0, role);
+	root = tdp_mmu_alloc_sp(vcpu, role);
+	tdp_mmu_init_sp(root, NULL, 0);

 	refcount_set(&root->tdp_mmu_root_count, 1);

@@ -1166,8 +1158,8 @@ static int tdp_mmu_populate_nonleaf(struct kvm_vcpu *vcpu, struct tdp_iter *iter
 	KVM_BUG_ON(is_shadow_present_pte(iter->old_spte), vcpu->kvm);
 	KVM_BUG_ON(is_removed_spte(iter->old_spte), vcpu->kvm);

-	sp = tdp_mmu_alloc_sp(vcpu);
-	tdp_mmu_init_child_sp(sp, iter);
+	sp = tdp_mmu_alloc_sp(vcpu, tdp_iter_child_role(iter));
+	tdp_mmu_init_sp(sp, iter->sptep, iter->gfn);

 	ret = tdp_mmu_link_sp(vcpu->kvm, iter, sp, account_nx, true);
 	if (ret)
@@ -1435,7 +1427,7 @@ bool kvm_tdp_mmu_wrprot_slot(struct kvm *kvm,
 	return spte_set;
 }

-static struct kvm_mmu_page *__tdp_mmu_alloc_sp_for_split(gfp_t gfp)
+static struct kvm_mmu_page *__tdp_mmu_alloc_sp_for_split(gfp_t gfp, union kvm_mmu_page_role role)
 {
 	struct kvm_mmu_page *sp;

@@ -1445,6 +1437,7 @@ static struct kvm_mmu_page *__tdp_mmu_alloc_sp_for_split(gfp_t gfp)
 	if (!sp)
 		return NULL;

+	sp->role = role;
 	sp->spt = (void *)__get_free_page(gfp);
 	if (!sp->spt) {
 		kmem_cache_free(mmu_page_header_cache, sp);
@@ -1458,6 +1451,7 @@ static struct kvm_mmu_page
*tdp_mmu_alloc_sp_for_split(struct kvm *kvm, struct tdp_iter *iter, bool shared) { + union kvm_mmu_page_role role = tdp_iter_child_role(iter); struct kvm_mmu_page *sp; /* @@ -1469,7 +1463,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_split(struct kvm *kvm, * If this allocation fails we drop the lock and retry with reclaim * allowed. */ - sp = __tdp_mmu_alloc_sp_for_split(GFP_NOWAIT | __GFP_ACCOUNT); + sp = __tdp_mmu_alloc_sp_for_split(GFP_NOWAIT | __GFP_ACCOUNT, role); if (sp) return sp; @@ -1481,7 +1475,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_split(struct kvm *kvm, write_unlock(&kvm->mmu_lock); iter->yielded = true; - sp = __tdp_mmu_alloc_sp_for_split(GFP_KERNEL_ACCOUNT); + sp = __tdp_mmu_alloc_sp_for_split(GFP_KERNEL_ACCOUNT, role); if (shared) read_lock(&kvm->mmu_lock); @@ -1500,7 +1494,7 @@ static int tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter, const int level = iter->level; int ret, i; - tdp_mmu_init_child_sp(sp, iter); + tdp_mmu_init_sp(sp, iter->sptep, iter->gfn); /* * No need for atomics when writing to sp->spt since the page table has From patchwork Sun Oct 30 06:22:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12860 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666032wru; Sat, 29 Oct 2022 23:28:39 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7Q9EB5q2jklezRrl8J1hADzW40yEQw7n3JMDc5lrfFWiWLrp2Z1k2hmx7hqEnHU2Gcd3v7 X-Received: by 2002:a05:6402:360d:b0:459:5f40:5b0a with SMTP id el13-20020a056402360d00b004595f405b0amr7265713edb.168.1667111318920; Sat, 29 Oct 2022 23:28:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111318; cv=none; d=google.com; s=arc-20160816; b=Xfysj5Av7/cJsRO1vL/llI60/ewMU5I0YJ2ejcfwlXuTurRTOa8oj5JssG2jAbrq9H qtFtMDSGLJa7hD45l6Jwgm9z37R6FyNSIZmj5zvvK0Wn7zqp1eJ1aRQnkgnMan5XO2Hx 
NkWZGmtiXhZVJPVYazilFFYdruguxQ4Rx43bNLB4000432NJqsOyYKEF2Qgs0peB3CyC +Du+HM8mHxW8KcVSjVPbn69blZ/Rt5cLRuUwye65pyT5/8su7cJZs91A4kCiXTxDiwO2 R4cAXBhGSDblw9QcA8cMtmeonE6ig2Vr7Wia0hZ7ggrygoZAor3RtQQ3YB8lkhWj+8QW 1/ug== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=8WCbndcoPcpcw4iQf5/2TwIvIv/jZiueYLToK+OicnI=; b=xsnjrk9+mr5gNmr9Eb0To/iWKpTkK4T9+4SbsI52U2PG6Ygnlas6zhnLbZXDpPeT+v 0Pck0qbcQoXUccGVnDqfcv2/jOnpf+WBv5UcMK2ovjEKXkEvu72ArICpS/HDF0V5cFPJ f2GMkcHoDosZPZh3d0d8cyd68OfPOXWpVg59OPfFJWda2EHGXozp7hY4cKxvayMHweCy h0vLO30OsOZijNcL+oAbIrYIQEOaXrNx/qlZHRodUC2qgA4+3kzVnPtUzUTjE52dbv2H KzOxJVcohe+dAWhFVkL0/UgLX5K5mMDHlfiDTwDCLJIgFaDrVwIhS3p/JcGXyCRuX027 wyuw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Iw+GFymT; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id tz10-20020a170907c78a00b0078d40f7ef1fsi3042837ejc.330.2022.10.29.23.28.14; Sat, 29 Oct 2022 23:28:38 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Iw+GFymT; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230405AbiJ3G1I (ORCPT + 99 others); Sun, 30 Oct 2022 02:27:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47232 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229929AbiJ3GYL (ORCPT ); Sun, 30 Oct 2022 02:24:11 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 46F0F13C; Sat, 29 Oct 2022 23:24:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111050; x=1698647050; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=XTbYY1fmIyEn8RqQg2g2IUB+pcesS3c7XzjHCDTS0FQ=; b=Iw+GFymTIGkx5EbISqnZrHFdmMhH9/fnUdut9U3Axbyqk4q5ud0sC9Sc XQXy00IpbrYaFEJ/EEZ4nvbdWezEKcZKmqqOcwxlbxwn3m9Hg8IuJzySI ynAS40vjzULeTaheKzwF9miog4GnbiLIu/A03UV4MT3Lcy9ijbnzkqjTF 5h7njKv3xJZan3Q/4QgRUw/o2j0/X11eFqFqMUpKCX4xtZHAJx0oYHmBq ELuxLFjQYATwEtw2TxHZnUEz3xw+SWjKOpBCbkkDJIAreog2442Wx5c+/ yc59erGK+4UrTdmSzxJl4fznTixhgT6O7ebZGNOyKcyvjWL2QyHicyaXN Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037155" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037155" Received: from fmsmga006.fm.intel.com 
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:04 -0700
X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392985"
X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392985"
Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54])
 by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:04 -0700
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack
Subject: [PATCH v10 043/108] KVM: x86/mmu: Require TDP MMU for TDX
Date: Sat, 29 Oct 2022 23:22:44 -0700
Message-Id: <82c25eec5a4c42e4b6a2038c7553bbaf56e87091.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
MIME-Version: 1.0
X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,
 DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,
 SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092918225755562?=
X-GMAIL-MSGID: =?utf-8?q?1748092918225755562?=

From: Isaku Yamahata

Require the TDP MMU for guest TDs.  The so-called "shadow" MMU does not
support mapping guest private memory, i.e. it does not support Secure-EPT.
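The requirement above can be sketched as a small userspace model.  The
demo_* names and the two-value VM type enum are hypothetical; only the
shape of the check (fail with -EOPNOTSUPP for a TDX guest when the TDP MMU
is unavailable, succeed otherwise) follows the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical VM types standing in for kvm->arch.vm_type values. */
enum demo_vm_type { DEMO_VM_DEFAULT, DEMO_VM_TDX };

/*
 * Returns 0 when the VM may proceed, or -EOPNOTSUPP when a TDX guest is
 * requested but the TDP MMU is unavailable.
 */
static int demo_init_tdp_mmu(enum demo_vm_type type, bool tdp_enabled,
			     bool tdp_mmu_enabled)
{
	if (!tdp_enabled || !tdp_mmu_enabled)
		return type == DEMO_VM_TDX ? -EOPNOTSUPP : 0;

	/* The real function would go on to set up TDP MMU state here. */
	return 0;
}
```

A non-TDX VM still falls back silently when the TDP MMU is off; only the
TDX case turns the missing feature into an error.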
Signed-off-by: Isaku Yamahata --- arch/x86/kvm/mmu/tdp_mmu.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index c6be76f9849c..1bf58288ea79 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -18,8 +18,12 @@ int kvm_mmu_init_tdp_mmu(struct kvm *kvm) { struct workqueue_struct *wq; + /* + * Because only the TDP MMU supports TDX, require the TDP MMU for guest + * TDs. + */ if (!tdp_enabled || !READ_ONCE(tdp_mmu_enabled)) - return 0; + return kvm->arch.vm_type == KVM_X86_TDX_VM ? -EOPNOTSUPP : 0; wq = alloc_workqueue("kvm", WQ_UNBOUND|WQ_MEM_RECLAIM|WQ_CPU_INTENSIVE, 0); if (!wq) From patchwork Sun Oct 30 06:22:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12867 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666161wru; Sat, 29 Oct 2022 23:29:12 -0700 (PDT) X-Google-Smtp-Source: AMsMyM64hFjxcedi/8ODp+2WbkddEyXHNgFh/kWN9LE2GDavGp3pKf6u0dtRpCMJocchwoNRnEnp X-Received: by 2002:a17:907:1b12:b0:72f:9b44:f9e with SMTP id mp18-20020a1709071b1200b0072f9b440f9emr6675265ejc.653.1667111352299; Sat, 29 Oct 2022 23:29:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111352; cv=none; d=google.com; s=arc-20160816; b=zinWptojtJhqln830SDHIskOZnX3Nm9yDSmdP+WC/ruRc7S39aFsh7eoSDk0+k9+l7 wjnIhGNjMdkwsYX/O1Dw/fR8Wp1U0rVIzpFGxTPZzTIccO7nNIwSGCJuF3xnuL4IFK1q kw8TlpZN98vl+lALOck1VXn1tNMF8iDr+PV+kHHl7ubAB3c/iYTkdGKPQUJuSCDDU5nQ 6R+RllrM9dmdIjgY1ekS1qWk0lNAKB6xgBRtH94bCZseiC5+F+zW+gokh6GniI6bxhKu R4ifvTcafv6O81AdY7qQ9iB8YZE4rc83fqYyjxYM/L/XvVzDBOX/BzkON4RGou61Ornj TsLQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; 
bh=EbGfeYGIbkVkZrNxeHfPx6G+1ziOlfcXQ5D/vF4BzZk=; b=h4JJJQ0m5ej7bB/DVbxr3DRUbvZsYB6em+GNhPj376dsXnSKMPZnd/zEGnyDbmQkJ2 Px8ub+XW0PAFvEjUMO5TTAuu9JiMh5xuV7lUAM34LNmzXx9aXJy+FH7diUQST8U1XtL1 bu0R4nm0trQfRNMC6M3nhj8iFLO96UoSrG7Er1EQNRtKzvlydkj9iIB2eh7VUNWd5eiD k/omIB2vwFUW6yaXIxV0/gcFniGv7jZv8bAlzZTqBP0WyypRtwMecVjVq3NH6gLuccKf PLle/uaeVexpKzbx6MknhKjH/HyZz7qvU7WVfxgKE2/rOyrGGWQeO9gmuV7q0PFI5Tsx ASzg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="nR5tn4c/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id qb7-20020a1709077e8700b0073fc8e72882si3939386ejc.28.2022.10.29.23.28.48; Sat, 29 Oct 2022 23:29:12 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="nR5tn4c/"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230179AbiJ3G1c (ORCPT + 99 others); Sun, 30 Oct 2022 02:27:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46914 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229944AbiJ3GYY (ORCPT ); Sun, 30 Oct 2022 02:24:24 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 85771102; Sat, 29 Oct 2022 23:24:10 -0700 (PDT) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111050; x=1698647050; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=vqmWa7uBeqHBgssjmDBu2kCsD43GLn+7nDGrKtT7FPM=; b=nR5tn4c/JQ2YBRZN6H96B57JBHazDK9kQqafwyxbcaC/GrQm0deCfXiY cEEyq6Ie7A+j12qNVgBkxGcYliE9rUpPyNL7BiI1Dt2Haln1iNzZJIpBJ mms9ZN4XG/LkqHF7dpHbZHvRLBi7qSXQUQd6aAxURRtuRESF3VBaH2Slk 0G53xHoYd5u12EawnSNhwlJoXh7xW7IdO8kw07dJLN2V7amKRKYPsdwHb WToFEOUaxeMdrYPFAD24oh/yOinrqXXkoa5JNZwvfJPf8iAYhIc9n7HrB 7efUEuIGhjuf/lQFGNNKaUm+GWH2Xp5eOEC1CmQoA+IGGu2nIJ1cc7lMN g==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037156" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037156" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:04 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392988" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392988" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:04 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 044/108] KVM: x86/mmu: Add a new is_private member for union kvm_mmu_page_role Date: Sat, 29 Oct 2022 23:22:45 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: 
linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092953066768983?=
X-GMAIL-MSGID: =?utf-8?q?1748092953066768983?=

From: Isaku Yamahata

Because TDX support introduces private mappings, add a new is_private
member to union kvm_mmu_page_role, with accessor functions to check and
set the member.

Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/kvm_host.h | 27 +++++++++++++++++++++++++++
 arch/x86/kvm/mmu/mmu_internal.h | 11 +++++++++++
 2 files changed, 38 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2e0b23422d63..ee01add57a6b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -335,7 +335,12 @@ union kvm_mmu_page_role {
 		unsigned ad_disabled:1;
 		unsigned guest_mode:1;
 		unsigned passthrough:1;
+#ifdef CONFIG_KVM_MMU_PRIVATE
+		unsigned is_private:1;
+		unsigned :4;
+#else
 		unsigned :5;
+#endif

 		/*
 		 * This is left at the top of the word so that
@@ -347,6 +352,28 @@ union kvm_mmu_page_role {
 	};
 };

+#ifdef CONFIG_KVM_MMU_PRIVATE
+static inline bool kvm_mmu_page_role_is_private(union kvm_mmu_page_role role)
+{
+	return !!role.is_private;
+}
+
+static inline void kvm_mmu_page_role_set_private(union kvm_mmu_page_role *role)
+{
+	role->is_private = 1;
+}
+#else
+static inline bool kvm_mmu_page_role_is_private(union kvm_mmu_page_role role)
+{
+	return false;
+}
+
+static inline void kvm_mmu_page_role_set_private(union kvm_mmu_page_role *role)
+{
+	WARN_ON_ONCE(1);
+}
+#endif
+
 /*
  * kvm_mmu_extended_role complements kvm_mmu_page_role, tracking properties
  * relevant to the current MMU configuration.
When loading CR0, CR4, or EFER, diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index 57e3ea2b52cc..b0c4a78404b2 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -142,6 +142,17 @@ static inline int kvm_mmu_page_as_id(struct kvm_mmu_page *sp) return kvm_mmu_role_as_id(sp->role); } +static inline bool is_private_sp(const struct kvm_mmu_page *sp) +{ + return kvm_mmu_page_role_is_private(sp->role); +} + +static inline bool is_private_sptep(u64 *sptep) +{ + WARN_ON_ONCE(!sptep); + return is_private_sp(sptep_to_sp(sptep)); +} + static inline bool kvm_mmu_page_ad_need_write_protect(struct kvm_mmu_page *sp) { /* From patchwork Sun Oct 30 06:22:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12869 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666179wru; Sat, 29 Oct 2022 23:29:16 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6wh/rmfYOhSndGDkDQcxu6wkhYu18hJs4QBV8pTwUcHU482zQtcq9ysF9G7Cs3eiFpvYiY X-Received: by 2002:a17:907:628a:b0:781:bbff:1d42 with SMTP id nd10-20020a170907628a00b00781bbff1d42mr6640489ejc.375.1667111355912; Sat, 29 Oct 2022 23:29:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111355; cv=none; d=google.com; s=arc-20160816; b=t1pJGPygk+toglFdpYowcCkxs8kG89aKDopVwptwXDG6OqgVJ+g8hGxRsE2WRguonU vlpazKYKXCJvdNRXAc7RT5/uYieKRcQpfk5j2upCED/K1rE+rd6dR+pKG3CcqMYttJyr px6uzZK4Gn+K/V76lVONdzN8hohXry/Fry5qabrt3eCwEQx++wYRd0Ct3kj2ou3FFyj7 Ymb/Mhf06PHWdereblDm37cRU0ubzk9Joq+dAuuFQ94RArztAFEYGNEjk0y4FBzjT+yy G6G9gMEklEjdPnyfvpgKHrPdQUk7MxeUBZg8wLbBghR8v4j7oYvf7VQ03QCqN0wXe8Rm lBJA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; 
bh=Ge9BreJ3CWfUMo6KlI1fRYLMSKwTyY7vn7/tZ5CWm+Y=; b=Bx8B1i/7HQaNFvfLbSLexAHwVXz/+pfK9rKXa8URIWmbgTQ/Zk3nhss/xfxkHScvZD i3vL/QaQ9vgEdK3aKjS80uOIa2ihS1/kaKaJFJRb6DI0svI7eNFBIihUTHAjSAOJJFPi NFAAr3OQjKI80SQmXIReWrsgLcz8p7wORiuwIlgMEMuIaxMU+Jd9WEsHJCBpm3hQdsqG 8xZD/LEmIENvAitHovRjB2DqqGAGr9tjKD/iaDU9A1US4Q3sMHUW5LnfU/gCndrZlK8G Aqu+MK/JBotElAv4BbPhDBsW/Ik8uq5hOdkHTRGfqCOb1eS2vrn1t93tdxYOtRlp6gqr ZXMA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=VtAoiD3I; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y20-20020a170906559400b0078d484e0e7esi2820199ejp.488.2022.10.29.23.28.51; Sat, 29 Oct 2022 23:29:15 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=VtAoiD3I; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230456AbiJ3G1m (ORCPT + 99 others); Sun, 30 Oct 2022 02:27:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47114 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229946AbiJ3GYY (ORCPT ); Sun, 30 Oct 2022 02:24:24 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9320CCE; Sat, 29 Oct 2022 23:24:10 -0700 (PDT) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111050; x=1698647050; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=HWEha6Cy5X2uOYXEIV6NT6az4T/FX1/hriKkax4dNqc=; b=VtAoiD3InT/1lI2yiaUW3+lVPWfsYRlMxa6M4nP8Wb4Dn8gzXnpQtbP9 v3XnJgr48uKXtaqIPDTLGP4HnsmJaer+ZevbEQhO4I2R61iEXeJ2IBO2f 4rNoq+tLhEuiefohVclupFkJNepZholFMcMszK9VjxCJtfM2n6IR/EX33 04xlG8GSLfX5wGoDQjgfB77qrp6G99IUTy6b5foALd6qrTA1BPSJJ+OE2 4joGz+LJw/25sU23a4kytOaQDenbtcCRKiMsccxTPFjtsWL82vESFS0VL BbxhjVmpUHz+MZui06dqQDBPceWYPNvWECMPPftelR/HfY/a/VbbXkPQf g==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037157" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037157" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:04 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392991" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392991" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:04 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 045/108] KVM: x86/mmu: Add a private pointer to struct kvm_mmu_page Date: Sat, 29 Oct 2022 23:22:46 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: 
linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092956896456448?=
X-GMAIL-MSGID: =?utf-8?q?1748092956896456448?=

From: Isaku Yamahata

For a private GPA, the CPU refers to a private page table whose contents
are encrypted.  Dedicated APIs must be used to operate on it (e.g. to
update or read a PTE), and they are expensive.

When KVM resolves a KVM page fault, it walks the page tables.  To reuse
the existing KVM MMU code and mitigate the heavy cost of directly walking
the protected (encrypted) page table, allocate one more page to hold a
copy of the protected page table for the KVM MMU code to walk directly.
Resolve the KVM page fault with the existing code, and do the additional
operations necessary for the protected page table.  To distinguish these
cases, the existing KVM page table is called a shared page table (i.e. not
associated with a protected page table), and the page table that has an
associated protected page table is called a private page table.  The
relationship is depicted below.

Add a private pointer to struct kvm_mmu_page for the protected page table,
and add helper functions to allocate/initialize/free a protected page
table page.

              KVM page fault                     |
                     |                           |
                     V                           |
        -------------+----------                 |
        |                      |                 |
        V                      V                 |
     shared GPA           private GPA            |
        |                      |                 |
        V                      V                 |
    shared PT root      private PT root          |    protected PT root
        |                      |                 |           |
        V                      V                 |           V
     shared PT            private PT ----propagate----> protected PT
        |                      |                 |           |
        |                      \-----------------+------\    |
        |                                        |      |    |
        V                                        |      V    V
  shared guest page                              |    private guest page
                                                 |
                           non-encrypted memory  |    encrypted memory
                                                 |
PT: page table
- A shared PT is visible to KVM and is used by the CPU.
- A protected PT is used by the CPU but is invisible to KVM.
- A private PT is visible to KVM but is not used by the CPU.  It is used
  to propagate PT changes to the actual protected PT, which is used by the
  CPU.
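The relationship between a private PT and its protected PT can be modelled
in userspace roughly as follows.  This is only a sketch under stated
assumptions: every demo_* name is hypothetical, and demo_module_set_pte()
merely stands in for the dedicated, expensive API that updates the
protected page table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PT_ENTRIES 8

/* Stand-in for the protected PT, which only the trusted module may touch. */
struct demo_protected_pt {
	uint64_t entries[DEMO_PT_ENTRIES];
	unsigned long nr_module_calls;	/* counts the expensive operations */
};

/* Hypothetical model of a page table page with an optional protected twin. */
struct demo_sp {
	bool is_private;
	uint64_t spt[DEMO_PT_ENTRIES];		/* walked directly by MMU code */
	struct demo_protected_pt *private_spt;	/* NULL for a shared PT */
};

/* Models the dedicated, costly API used to update the protected PT. */
static void demo_module_set_pte(struct demo_protected_pt *pt, size_t idx,
				uint64_t pte)
{
	pt->entries[idx] = pte;
	pt->nr_module_calls++;
}

/* Update the KVM-visible copy first, then propagate if the page is private. */
static void demo_set_spte(struct demo_sp *sp, size_t idx, uint64_t pte)
{
	assert(idx < DEMO_PT_ENTRIES);
	sp->spt[idx] = pte;
	if (sp->is_private && sp->private_spt)
		demo_module_set_pte(sp->private_spt, idx, pte);
}

/* Reads never leave the KVM-visible copy, so walks stay cheap. */
static uint64_t demo_get_spte(const struct demo_sp *sp, size_t idx)
{
	assert(idx < DEMO_PT_ENTRIES);
	return sp->spt[idx];
}
```

In this model a walk costs nothing extra for private pages; only a change
to an entry pays for a call into the module.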
Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/kvm_host.h |  7 +++
 arch/x86/kvm/mmu/mmu.c          |  8 +++
 arch/x86/kvm/mmu/mmu_internal.h | 90 +++++++++++++++++++++++++++++++--
 arch/x86/kvm/mmu/tdp_mmu.c      |  1 +
 4 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ee01add57a6b..381df2c8136d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -754,6 +754,13 @@ struct kvm_vcpu_arch {
 	struct kvm_mmu_memory_cache mmu_shadow_page_cache;
 	struct kvm_mmu_memory_cache mmu_shadowed_info_cache;
 	struct kvm_mmu_memory_cache mmu_page_header_cache;
+	/*
+	 * This cache is to allocate private page table. E.g.  Secure-EPT used
+	 * by the TDX module.  Because the TDX module doesn't trust VMM and
+	 * initializes the pages itself, KVM doesn't initialize them.  Allocate
+	 * pages with garbage and give them to the TDX module.
+	 */
+	struct kvm_mmu_memory_cache mmu_private_spt_cache;

 	/*
 	 * QEMU userspace and the guest each have their own FPU state.
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 0001e921154e..faf69774c7ce 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -653,6 +653,13 @@ static int mmu_topup_shadow_page_cache(struct kvm_vcpu *vcpu)
 	struct kvm_mmu_memory_cache *mc = &vcpu->arch.mmu_shadow_page_cache;
 	int start, end, i, r;

+	if (kvm_gfn_shared_mask(vcpu->kvm)) {
+		r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_private_spt_cache,
+					       PT64_ROOT_MAX_LEVEL);
+		if (r)
+			return r;
+	}
+
 	start = kvm_mmu_memory_cache_nr_free_objects(mc);

 	r = kvm_mmu_topup_memory_cache(mc, PT64_ROOT_MAX_LEVEL);
@@ -702,6 +709,7 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcpu)
 	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache);
 	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadow_page_cache);
 	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadowed_info_cache);
+	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_private_spt_cache);
 	kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache);
 }

diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
index b0c4a78404b2..4c013124534b 100644
--- a/arch/x86/kvm/mmu/mmu_internal.h
+++ b/arch/x86/kvm/mmu/mmu_internal.h
@@ -87,7 +87,23 @@ struct kvm_mmu_page {
 		int root_count;
 		refcount_t tdp_mmu_root_count;
 	};
-	unsigned int unsync_children;
+	union {
+		struct {
+			unsigned int unsync_children;
+			/*
+			 * Number of writes since the last time traversal
+			 * visited this page.
+			 */
+			atomic_t write_flooding_count;
+		};
+#ifdef CONFIG_KVM_MMU_PRIVATE
+		/*
+		 * Associated private shadow page table, e.g. Secure-EPT page
+		 * passed to the TDX module.
+		 */
+		void *private_spt;
+#endif
+	};
 	union {
 		struct kvm_rmap_head parent_ptes; /* rmap pointers to parent sptes */
 		tdp_ptep_t ptep;
@@ -109,9 +125,6 @@ struct kvm_mmu_page {
 	int clear_spte_count;
 #endif

-	/* Number of writes since the last time traversal visited this page.
*/
-	atomic_t write_flooding_count;
-
 #ifdef CONFIG_X86_64
 	/* Used for freeing the page asynchronously if it is a TDP MMU page. */
 	struct rcu_head rcu_head;
@@ -153,6 +166,75 @@ static inline bool is_private_sptep(u64 *sptep)
 	return is_private_sp(sptep_to_sp(sptep));
 }

+#ifdef CONFIG_KVM_MMU_PRIVATE
+static inline void *kvm_mmu_private_spt(struct kvm_mmu_page *sp)
+{
+	return sp->private_spt;
+}
+
+static inline void kvm_mmu_init_private_spt(struct kvm_mmu_page *sp, void *private_spt)
+{
+	sp->private_spt = private_spt;
+}
+
+static inline void kvm_mmu_alloc_private_spt(struct kvm_vcpu *vcpu,
+					     struct kvm_mmu_memory_cache *private_spt_cache,
+					     struct kvm_mmu_page *sp)
+{
+	/*
+	 * vcpu == NULL means non-root SPT:
+	 * vcpu == NULL is used to split a large SPT into smaller SPT.  Root SPT
+	 * is not a large SPT.
+	 */
+	bool is_root = vcpu &&
+		vcpu->arch.root_mmu.root_role.level == sp->role.level;
+
+	if (vcpu)
+		private_spt_cache = &vcpu->arch.mmu_private_spt_cache;
+	KVM_BUG_ON(!kvm_mmu_page_role_is_private(sp->role), vcpu->kvm);
+	if (is_root)
+		/*
+		 * Because TDX module assigns root Secure-EPT page and set it to
+		 * Secure-EPTP when TD vcpu is created, secure page table for
+		 * root isn't needed.
+		 */
+		sp->private_spt = NULL;
+	else {
+		sp->private_spt = kvm_mmu_memory_cache_alloc(private_spt_cache);
+		/*
+		 * Because mmu_private_spt_cache is topped up before starting kvm
+		 * page fault resolving, the allocation above shouldn't fail.
+ */ + WARN_ON_ONCE(!sp->private_spt); + } +} + +static inline void kvm_mmu_free_private_spt(struct kvm_mmu_page *sp) +{ + if (sp->private_spt) + free_page((unsigned long)sp->private_spt); +} +#else +static inline void *kvm_mmu_private_spt(struct kvm_mmu_page *sp) +{ + return NULL; +} + +static inline void kvm_mmu_init_private_spt(struct kvm_mmu_page *sp, void *private_spt) +{ +} + +static inline void kvm_mmu_alloc_private_spt(struct kvm_vcpu *vcpu, + struct kvm_mmu_memory_cache *private_spt_cache, + struct kvm_mmu_page *sp) +{ +} + +static inline void kvm_mmu_free_private_spt(struct kvm_mmu_page *sp) +{ +} +#endif + static inline bool kvm_mmu_page_ad_need_write_protect(struct kvm_mmu_page *sp) { /* diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 1bf58288ea79..b2f56110d62d 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -71,6 +71,7 @@ void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm) static void tdp_mmu_free_sp(struct kvm_mmu_page *sp) { + kvm_mmu_free_private_spt(sp); free_page((unsigned long)sp->spt); kmem_cache_free(mmu_page_header_cache, sp); } From patchwork Sun Oct 30 06:22:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12870 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666200wru; Sat, 29 Oct 2022 23:29:21 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4rR45uR1mflZ+66qD/CqQ63N2PMF3u8CwRSmZ1AHAEAvJVP9/djg60cUVDE6aK/efm/S05 X-Received: by 2002:aa7:c58e:0:b0:461:77b:7bd with SMTP id g14-20020aa7c58e000000b00461077b07bdmr7503087edq.387.1667111360900; Sat, 29 Oct 2022 23:29:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111360; cv=none; d=google.com; s=arc-20160816; b=vWJXDLLVpr3V6yYtysLUIKY1wQ+iPNfC3NrZuIwgsIfhAzaKCGhfRFFaq2xhojP7o0 WkU8+iguiWxkv1DY8RJPQaFsSYIq/HaV/rN8MLzA0bValFGjDA32kD4zxTKOYBzLiKbX 
MEkrE7MfDZQe6sIBUZVm5kL7uqHdtbYpb6N/rBO+tXdCOar1cEFtpCzp1KEyZKP6Jtca UqseQZxVusNgB6UOnXPjAwJNJUQse4c+EnyiN19Ymq1z97H7EIVNWJICqRjy/WuOZAit A5Wf1Tbzymo9FYleiw/4qdRqAEqROQo2auyAHe1nZDzy5y0y3ija2FInE1R3Ta9cOxRU cYyA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=T0oINi3gjIXL24qBkUQaN/vKL8gJIXP6zHTMPrjC9qQ=; b=I6sxFxIPSwfK3WwFp8IoL5vYfmtXIObACcf5rIQtpNlUlXu7B36xMeG5v2G6ZsOrqt utEBSRBarNiqeWINWI50Wp7ddM+9Iua7i8WVEy15pZCxbwY4ZsI3Ab4aRHTvaZcRARD3 W6eZkG3PlJF/F+Seb9JdBD8i/j2unZFtIIUbT4Uq72NV5Yl3roGzIs/+Zno0ptFAZoaL /e3g7NU3i49PcTO6sy8VxZf7h8F99PvRPJRnoU9+9jykMjEVtQBjx3ScFVBQU8VGga9a 2FSCLu005ikA16J86IRBPhJ3vA6rw1ofr3LWNtcdkGT1JoT2MxNbSnpUDQxmwnZCROlY 2rSw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=DG7kLZrd; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id g14-20020a056402320e00b0045d4f99616dsi3839901eda.456.2022.10.29.23.28.55; Sat, 29 Oct 2022 23:29:20 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=DG7kLZrd; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230467AbiJ3G1r (ORCPT + 99 others); Sun, 30 Oct 2022 02:27:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47140 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229945AbiJ3GYY (ORCPT ); Sun, 30 Oct 2022 02:24:24 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9327013D; Sat, 29 Oct 2022 23:24:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111050; x=1698647050; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=YUg8weR2aBuIZsU1DmkInT5uDP4qgsrNtwkEufPJkfE=; b=DG7kLZrdzmX/DjvqATg8fTlIgD5eIGrClAIqC3yulhxg1irNeDZTWWVP /CtXe0D20v/adFGRmYna4UwiR3gMevSrNtf91Th2Z5KjMiHVQ8lGtfSt+ OKeKWZrSgHDI3UOAgf1SMEiTRV5YfnzqcQDK+qBWVVD2SZGmtcM0fEsbU MegQfEHN2DP0EHCBeaLO1RNeO2Q+fkWajccfGKIhr5qDHIX6H+tis+ya0 ILHIqUpySErN9n2fJ2xK+LY6kwUpJb4ETm0O2z2z3TqUfeenWDlm9NlsP 9qmru/peSYNX4Wayp8R5KzjCHSJcosLhL+xwStmUexbh96BLzh+baOHot Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037158" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037158" Received: from fmsmga006.fm.intel.com 
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:04 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392995" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392995" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:04 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 046/108] KVM: Add flags to struct kvm_gfn_range Date: Sat, 29 Oct 2022 23:22:47 -0700 Message-Id: <880c1016c29624964baee580985b6a736fc7d656.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092962328514107?= X-GMAIL-MSGID: =?utf-8?q?1748092962328514107?=
From: Isaku Yamahata

For TDX, kvm_unmap_gfn_range() needs to know why the callback was invoked: the mmu notifier, the set memattr ioctl, or the restrictedmem notifier. TDX changes its behavior based on the reason. For the mmu notifier, the operation is on a shared memory slot and zaps the shared PTE. For set memattr, it is a private<->shared conversion, so zap the original PTE. For restrictedmem, it is a hint that TDX can ignore.
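As an illustration only (not part of this patch), the three callback reasons could be decoded from the flags field roughly as in the userspace sketch below. The two flag macros mirror the names this patch adds; BIT() is redefined locally, and enum unmap_reason and unmap_reason_from_flags() are hypothetical names introduced for the example.

```c
/* Userspace stand-ins for the kernel definitions; BIT() is local here. */
#define BIT(n) (1U << (n))
#define KVM_GFN_RANGE_FLAGS_RESTRICTED_MEM BIT(0)
#define KVM_GFN_RANGE_FLAGS_SET_MEM_ATTR   BIT(1)

/* Hypothetical enum, not part of the patch. */
enum unmap_reason {
	UNMAP_MMU_NOTIFIER,	/* flags == 0: zap shared PTEs */
	UNMAP_SET_MEM_ATTR,	/* private<->shared conversion */
	UNMAP_RESTRICTED_MEM,	/* hint that TDX can ignore */
};

/* Hypothetical helper: map the flags field to the reason of the callback. */
static enum unmap_reason unmap_reason_from_flags(unsigned int flags)
{
	if (flags & KVM_GFN_RANGE_FLAGS_RESTRICTED_MEM)
		return UNMAP_RESTRICTED_MEM;
	if (flags & KVM_GFN_RANGE_FLAGS_SET_MEM_ATTR)
		return UNMAP_SET_MEM_ATTR;
	/* No flag set: the caller is assumed to be the mmu notifier. */
	return UNMAP_MMU_NOTIFIER;
}
```

Treating "no flag set" as the mmu notifier case keeps the existing __kvm_handle_hva_range() callers unchanged apart from clearing the field.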
Signed-off-by: Isaku Yamahata --- include/linux/kvm_host.h | 8 +++++++- virt/kvm/kvm_main.c | 5 ++++- 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 839d98d56632..b658803ea2c7 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -247,12 +247,18 @@ int kvm_async_pf_wakeup_all(struct kvm_vcpu *vcpu); #if defined(KVM_ARCH_WANT_MMU_NOTIFIER) || defined(CONFIG_HAVE_KVM_RESTRICTED_MEM) +#define KVM_GFN_RANGE_FLAGS_RESTRICTED_MEM BIT(0) +#define KVM_GFN_RANGE_FLAGS_SET_MEM_ATTR BIT(1) struct kvm_gfn_range { struct kvm_memory_slot *slot; gfn_t start; gfn_t end; - pte_t pte; + union { + pte_t pte; + int attr; + }; bool may_block; + unsigned int flags; }; bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range); #endif diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 3b05a3396f89..dda2f2ec4faa 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -676,6 +676,7 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm, gfn_range.start = hva_to_gfn_memslot(hva_start, slot); gfn_range.end = hva_to_gfn_memslot(hva_end + PAGE_SIZE - 1, slot); gfn_range.slot = slot; + gfn_range.flags = 0; if (!locked) { locked = true; @@ -947,8 +948,9 @@ static void kvm_unmap_mem_range(struct kvm *kvm, gfn_t start, gfn_t end, int i; int r = 0; - gfn_range.pte = __pte(0); + gfn_range.attr = attr; gfn_range.may_block = true; + gfn_range.flags = KVM_GFN_RANGE_FLAGS_SET_MEM_ATTR; for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { slots = __kvm_memslots(kvm, i); @@ -1074,6 +1076,7 @@ static void kvm_restrictedmem_invalidate_begin(struct restrictedmem_notifier *no gfn_range.slot = slot; gfn_range.pte = __pte(0); gfn_range.may_block = true; + gfn_range.flags = KVM_GFN_RANGE_FLAGS_RESTRICTED_MEM; if (kvm_unmap_gfn_range(kvm, &gfn_range)) kvm_flush_remote_tlbs(kvm); From patchwork Sun Oct 30 06:22:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12865 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666129wru; Sat, 29 Oct 2022 23:29:02 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4qMnAZMjXB23hfTA7X98c8/445a5tLVcUuJh/5EAgD7nRN8Cib8gJGp4tqrcsR3/3mEVp2 X-Received: by 2002:a05:6402:3890:b0:45c:2b5:b622 with SMTP id fd16-20020a056402389000b0045c02b5b622mr7667745edb.69.1667111342721; Sat, 29 Oct 2022 23:29:02 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111342; cv=none; d=google.com; s=arc-20160816; b=utO8uw4UHLCDhWQIa4pFXOz/Nd7Vx6z0mCnkJfYT0xJnpGjnWYvXGrEdU8JQGxRjS+ r3/PEXkehQ3yy5xVMILAyRaNq2W5k6seqPnTkdzJRnsLXo3VvXpmFOtiEpbeKN8Q03hJ xPv0i14LxYXZ6DG0IgbcRmpmqmKsybQd20lHP/uyKTMQZovvJk3vTljbU2D+2FFuXxp2 BOfnECX4wzFnDCkMsP0U5XpQdHY0NB3UYZ0Fl78ke/iE7VP9A1xlwrlDmBEoU07GZik+ Oxzl+r2njglQ82IFrfq2g4MlygbuDr54UrgiLS2lmSgKCAyzrHZB5OFsyrlrQx1jKJuR XKWA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=l71AqR8qMHyFdRRhWMAH/yQSpuaQHeUs51apbHPgRwA=; b=uKcdEmgGVYgEpiEVXW2OvJVZTbofpBt7eCjUnuAgyVe+ZfZCwDoVpKs3rY9/Ju29UT vbgh5ZlvWVkcs2lM62vNczwRN2XSi+08+qp1doWntHdiyicOZTWezgp11YRFuNSMcdxJ X+B/LzBB3Rpyuf50kdToyGXljKzEgMByhFzjLOe7BPisbPiPelyhf7DYNDPKyv4RAm1E donrhWJZ6vp/Oo7P2udFndcOC5N2PVrOkaE96OQpRPI2x6+5EGo/Y/RaogSj7EIjQ+X1 MGyjUsoyJk5VakIPu+xjZyQWhvSE3jzuuStpWzI9iqJp7u5CEdeL3jvYJFBYFLQhnoA9 HDyQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=f7EcaftI; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id hz4-20020a1709072ce400b007830f14fffesi4608812ejc.375.2022.10.29.23.28.38; Sat, 29 Oct 2022 23:29:02 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=f7EcaftI; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230429AbiJ3G1V (ORCPT + 99 others); Sun, 30 Oct 2022 02:27:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46972 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229955AbiJ3GYY (ORCPT ); Sun, 30 Oct 2022 02:24:24 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 33CF2182; Sat, 29 Oct 2022 23:24:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111051; x=1698647051; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Ui65nPEfZFPgrYsNSO8QLZrUq6LO0/8MC9q/JZO/TvA=; b=f7EcaftIeAIabGGP3LvX+nRY8a1Gs6ISTmrrHTdxmbIdEbC3fo+E/Zjl 6ufEcKZoWxxeIWPrELk74hcKDu0DHq3fDPUn/uqUdmp0dEJB/kBec8AtY EQEbry7gsOh8n54hJZ99vNOZUFhlLZJOcU3FM5aVM8UTdQVxNUXumUkUL hRujVCwMBPjRzo2p0sONsvxRaCo8lJF5I86MQpOmmG/b8kuewJQ+9MksL JsL5LgL2S8VdiqwtFuPQUJJv2busMi1OE/GzuYr+yfwyGy9A1cCxgyEE2 qOM1PxsmMBqY7vOltiAOrpcp8Ty74KsSUF1l4C/UxWxRgNh8suVPJebTj A==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037159" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037159" Received: from fmsmga006.fm.intel.com 
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:04 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878392998" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878392998" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:04 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Sean Christopherson Subject: [PATCH v10 047/108] KVM: x86/tdp_mmu: Don't zap private pages for unsupported cases Date: Sat, 29 Oct 2022 23:22:48 -0700 Message-Id: <9e8346b692eb377576363a028c3688c66f3c0bfe.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092943436923196?= X-GMAIL-MSGID: =?utf-8?q?1748092943436923196?=
From: Sean Christopherson

Architecturally, TDX supports only the write-back (WB) memory type for private memory, so a (virtualized) memory type change doesn't make sense for private memory. Also, page migration isn't supported for TDX yet. (TDX architecturally supports page migration; this is a KVM and kernel implementation issue.)

Regarding memory type changes (MTRR virtualization and lapic page mapping change), pages are zapped by kvm_zap_gfn_range().
On the next KVM page fault, the SPTE entry with a new memory type for the page is populated. Regarding page migration, pages are zapped by the mmu notifier. On the next KVM page fault, the new migrated page is populated. Don't zap private pages on unmapping for those two cases. When deleting/moving a KVM memory slot, zap private pages. Typically tearing down VM. Don't invalidate private page tables. i.e. zap only leaf SPTEs for KVM mmu that has a shared bit mask. The existing kvm_tdp_mmu_invalidate_all_roots() depends on role.invalid with read-lock of mmu_lock so that other vcpu can operate on KVM mmu concurrently. It marks the root page table invalid and zaps SPTEs of the root page tables. The TDX module doesn't allow to unlink a protected root page table from the hardware and then allocate a new one for it. i.e. replacing a protected root page table. Instead, zap only leaf SPTEs for KVM mmu with a shared bit mask set. Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata --- arch/x86/kvm/mmu/mmu.c | 85 ++++++++++++++++++++++++++++++++++++-- arch/x86/kvm/mmu/tdp_mmu.c | 24 ++++++++--- arch/x86/kvm/mmu/tdp_mmu.h | 5 ++- 3 files changed, 103 insertions(+), 11 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index faf69774c7ce..0237e143299c 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1577,8 +1577,38 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range) if (kvm_memslots_have_rmaps(kvm)) flush = kvm_handle_gfn_range(kvm, range, kvm_zap_rmap); - if (is_tdp_mmu_enabled(kvm)) - flush = kvm_tdp_mmu_unmap_gfn_range(kvm, range, flush); + if (is_tdp_mmu_enabled(kvm)) { + bool zap_private; + + if (kvm_slot_can_be_private(range->slot)) { + if (range->flags & KVM_GFN_RANGE_FLAGS_RESTRICTED_MEM) + /* + * For private slot, the callback is triggered + * via falloc. Mode can be allocation or punch + * hole. 
Because the private-shared conversion + * is done via + * KVM_MEMORY_ENCRYPT_REG/UNREG_REGION, we can + * ignore the request from restrictedmem. + */ + return flush; + else if (range->flags & KVM_GFN_RANGE_FLAGS_SET_MEM_ATTR) { + if (range->attr == KVM_MEM_ATTR_SHARED) + zap_private = true; + else { + WARN_ON_ONCE(range->attr != KVM_MEM_ATTR_PRIVATE); + zap_private = false; + } + } else + /* + * kvm_unmap_gfn_range() is called via mmu + * notifier. For now page migration for private + * page isn't supported yet, don't zap private + * pages. + */ + zap_private = false; + } + flush = kvm_tdp_mmu_unmap_gfn_range(kvm, range, flush, zap_private); + } return flush; } @@ -6066,11 +6096,48 @@ static bool kvm_has_zapped_obsolete_pages(struct kvm *kvm) return unlikely(!list_empty_careful(&kvm->arch.zapped_obsolete_pages)); } +static void kvm_mmu_zap_memslot(struct kvm *kvm, struct kvm_memory_slot *slot) +{ + bool flush = false; + + write_lock(&kvm->mmu_lock); + + /* + * Zapping non-leaf SPTEs, a.k.a. not-last SPTEs, isn't required, worst + * case scenario we'll have unused shadow pages lying around until they + * are recycled due to age or when the VM is destroyed. + */ + if (is_tdp_mmu_enabled(kvm)) { + struct kvm_gfn_range range = { + .slot = slot, + .start = slot->base_gfn, + .end = slot->base_gfn + slot->npages, + .may_block = false, + }; + + /* + * this handles both private gfn and shared gfn. + * All private page should be zapped on memslot deletion. 
+ */ + flush = kvm_tdp_mmu_unmap_gfn_range(kvm, &range, flush, true); + } else { + flush = slot_handle_level(kvm, slot, __kvm_zap_rmap, PG_LEVEL_4K, + KVM_MAX_HUGEPAGE_LEVEL, true); + } + if (flush) + kvm_flush_remote_tlbs(kvm); + + write_unlock(&kvm->mmu_lock); +} + static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm, struct kvm_memory_slot *slot, struct kvm_page_track_notifier_node *node) { - kvm_mmu_zap_all_fast(kvm); + if (kvm_gfn_shared_mask(kvm)) + kvm_mmu_zap_memslot(kvm, slot); + else + kvm_mmu_zap_all_fast(kvm); } int kvm_mmu_init_vm(struct kvm *kvm) @@ -6173,8 +6240,18 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end) if (is_tdp_mmu_enabled(kvm)) { for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) + /* + * zap_private = true. Zap both private/shared pages. + * + * kvm_zap_gfn_range() is used when PAT memory type was + * changed. Later on the next kvm page fault, populate + * it with updated spte entry. + * Because only WB is supported for private pages, don't + * care of private pages. + */ flush = kvm_tdp_mmu_zap_leafs(kvm, i, gfn_start, - gfn_end, true, flush); + gfn_end, true, flush, + true); } if (flush) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index b2f56110d62d..85d990ec149e 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -948,7 +948,8 @@ bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp) * operation can cause a soft lockup. 
*/ static bool tdp_mmu_zap_leafs(struct kvm *kvm, struct kvm_mmu_page *root, - gfn_t start, gfn_t end, bool can_yield, bool flush) + gfn_t start, gfn_t end, bool can_yield, bool flush, + bool zap_private) { struct tdp_iter iter; @@ -956,6 +957,10 @@ static bool tdp_mmu_zap_leafs(struct kvm *kvm, struct kvm_mmu_page *root, lockdep_assert_held_write(&kvm->mmu_lock); + WARN_ON_ONCE(zap_private && !is_private_sp(root)); + if (!zap_private && is_private_sp(root)) + return false; + rcu_read_lock(); for_each_tdp_pte_min_level(iter, root, PG_LEVEL_4K, start, end) { @@ -988,12 +993,13 @@ static bool tdp_mmu_zap_leafs(struct kvm *kvm, struct kvm_mmu_page *root, * more SPTEs were zapped since the MMU lock was last acquired. */ bool kvm_tdp_mmu_zap_leafs(struct kvm *kvm, int as_id, gfn_t start, gfn_t end, - bool can_yield, bool flush) + bool can_yield, bool flush, bool zap_private) { struct kvm_mmu_page *root; for_each_tdp_mmu_root_yield_safe(kvm, root, as_id) - flush = tdp_mmu_zap_leafs(kvm, root, start, end, can_yield, flush); + flush = tdp_mmu_zap_leafs(kvm, root, start, end, can_yield, flush, + zap_private && is_private_sp(root)); return flush; } @@ -1053,6 +1059,12 @@ void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm) lockdep_assert_held_write(&kvm->mmu_lock); list_for_each_entry(root, &kvm->arch.tdp_mmu_roots, link) { + /* + * Skip private root since private page table + * is only torn down when VM is destroyed. 
+ */ + if (is_private_sp(root)) + continue; if (!root->role.invalid && !WARN_ON_ONCE(!kvm_tdp_mmu_get_root(root))) { root->role.invalid = true; @@ -1245,11 +1257,13 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) return ret; } +/* Used by mmu notifier via kvm_unmap_gfn_range() */ bool kvm_tdp_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range, - bool flush) + bool flush, bool zap_private) { return kvm_tdp_mmu_zap_leafs(kvm, range->slot->as_id, range->start, - range->end, range->may_block, flush); + range->end, range->may_block, flush, + zap_private); } typedef bool (*tdp_handler_t)(struct kvm *kvm, struct tdp_iter *iter, diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h index c163f7cc23ca..c98c7df449a8 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.h +++ b/arch/x86/kvm/mmu/tdp_mmu.h @@ -16,7 +16,8 @@ void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root, bool shared); bool kvm_tdp_mmu_zap_leafs(struct kvm *kvm, int as_id, gfn_t start, - gfn_t end, bool can_yield, bool flush); + gfn_t end, bool can_yield, bool flush, + bool zap_private); bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp); void kvm_tdp_mmu_zap_all(struct kvm *kvm); void kvm_tdp_mmu_invalidate_all_roots(struct kvm *kvm); @@ -25,7 +26,7 @@ void kvm_tdp_mmu_zap_invalidated_roots(struct kvm *kvm); int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault); bool kvm_tdp_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range, - bool flush); + bool flush, bool zap_private); bool kvm_tdp_mmu_age_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range); bool kvm_tdp_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range); bool kvm_tdp_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range); From patchwork Sun Oct 30 06:22:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12868 Return-Path: 
Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666172wru; Sat, 29 Oct 2022 23:29:14 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4cpKI1JdqdisNkwYN17uNxjA3apdyHW4ufYmMT78zmiUBHglO2+SoCX6sRf+NAO0lYPWVW X-Received: by 2002:a17:907:7243:b0:7ad:88f8:7ef1 with SMTP id ds3-20020a170907724300b007ad88f87ef1mr6828218ejc.12.1667111354102; Sat, 29 Oct 2022 23:29:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111354; cv=none; d=google.com; s=arc-20160816; b=Pgh0tNMV0Y5mZJPzlBW6PKV8nHNSWOhuVziLdrBzWkkpMwP18879dIAu0GZrJpgUpW Dr7XdfW8sUl5fg0giWVogSAkLQ7bmycVja2AKlPNWrG8CYM7IST4p2xZnmcEiCTZP6Mj aZTP3lf0osfD2R31TolaQogQkuhq2hz82vgM1cRPiU5/P5evimfGNcedS+FJ9EWKR2ea RkU6ZL3hrAFyifLMpAgXuRkT3mKarc1wQnlLSsyB4FrzBnOjbVlnyWL4L1a512ACe+HX S2pYQeSsJ1svfGyRNPN/dB3LSrW2Gd4TY0hpKaWI0rhBjsZhTfuLxv3USPuWxfiazneh QVZw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=hZpl5Kj7FcFebNTsXIOuXDVOawsou6pkJag05/C1V1U=; b=VP/Q6CoxqJCo8kptZmIz+qQfgftjQ7DjiAHO2RGuLphHzoknMlWTDrtfzmPnpjPc0p QFbUORAJd37KUm5Z4tT9r5Abo9/cD5HAl1hK1d1eGVf3PuXHcua1lx9E3HInQj49B7VN h3CNY/1l8Q+useq8OoK5mLAOc+EbqHLJljJxJRVPG4t2QUEOpxeDelMEqAfyeEMDche+ OmzdaegRlipU5E0Lz+Al5HsaZm/pu7CjpJHbYy410O9ftSh6E09+thHgHBHb3m7eBUTl ZsGc2hfMnqXE2Iv/RvPXJWrgRNkKSktf2HXkeANSLs2JRx5xcUjH+QnGrpD77IIcw4yz hjJA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=YiabnvSw; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id j27-20020a170906279b00b007adb8d724a7si1961190ejc.928.2022.10.29.23.28.50; Sat, 29 Oct 2022 23:29:14 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=YiabnvSw; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230239AbiJ3G1f (ORCPT + 99 others); Sun, 30 Oct 2022 02:27:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47746 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229719AbiJ3GYY (ORCPT ); Sun, 30 Oct 2022 02:24:24 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3A91BA6; Sat, 29 Oct 2022 23:24:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111051; x=1698647051; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=grz/Kkobq8mpulFiFbnq7H/VmlAjZxMI+BOmbNK2kbU=; b=YiabnvSwLtEceJ/ZDsmJ19WyI5ejlBc9Fp8RAw4M/waXMblpvOVSoEmR NCjqveu3Ztp5s/UaclGz8K6TiLD6+G5dEYbfiMIx2wSIFp8fAdqGi+2ya LLwh//7uBNz1CCahtvS2LzRH6jV/F2pjd5sPpWmAJocCixqN34FbBVl/Y 5Ev0n6KNreksussnbqgkN388hibYz3X5OgFCBDBBoml1cmSVjHJJaPem0 rTfxuuFPbj3r3eM1Mv8QUB3+zCdhjR45s4MjIxn43UxdbA3muD5tZ8xbv luz/1V2iCknSYmn2PI7WTKT2mPOMqK8MzJZzcMwPeW+VHP3QpsNsxu+C2 w==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037160" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037160" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) 
by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:05 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393001" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393001" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:04 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 048/108] KVM: x86/tdp_mmu: Make handle_changed_spte() return value Date: Sat, 29 Oct 2022 23:22:49 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092955033517810?= X-GMAIL-MSGID: =?utf-8?q?1748092955033517810?=
From: Isaku Yamahata

A TDX operation can fail with TDX_OPERAND_BUSY when multiple vcpus try to operate on the same TDX resource, such as the Secure EPT. The TDX module doesn't spin; it returns a busy error to the VMM, so the VMM has to take action, e.g. retry. Because the TDP MMU uses a read lock for scalability, a spinlock around the SEAMCALL would defeat that effort. The other option is to let the SEAMCALL fail and have the page fault handler retry. Make handle_changed_spte() and its callers return a value so that the KVM page fault handler can return in such cases. This patch makes it return only zero.
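The "let the operation fail and have the caller retry" option can be sketched in plain C as below. This is a simplified userspace model, not the KVM code: struct busy_resource, update_resource(), set_entry() and handle_fault() are hypothetical names, -EBUSY stands in for the busy status, and the bounded loop stands in for the guest re-faulting.

```c
#include <errno.h>

/* Hypothetical stand-in for a contended resource such as the Secure EPT. */
struct busy_resource {
	int busy_remaining;	/* how many more calls will report busy */
	int updates;		/* how many updates actually succeeded */
};

/* Models an operation that returns busy instead of spinning. */
static int update_resource(struct busy_resource *res)
{
	if (res->busy_remaining > 0) {
		res->busy_remaining--;
		return -EBUSY;
	}
	res->updates++;
	return 0;
}

/* Models a handle_changed_spte()-style helper that propagates the error. */
static int set_entry(struct busy_resource *res)
{
	int ret = update_resource(res);

	if (ret)
		return ret;	/* skip follow-up bookkeeping on failure */
	return 0;
}

/*
 * Models the fault path: retry up to max_attempts times.
 * Returns the number of attempts used, or a negative errno.
 */
static int handle_fault(struct busy_resource *res, int max_attempts)
{
	for (int attempt = 1; attempt <= max_attempts; attempt++) {
		int ret = set_entry(res);

		if (ret == 0)
			return attempt;
		if (ret != -EBUSY)
			return ret;
	}
	return -EBUSY;
}
```

No lock is held across the failing operation, so other callers are never blocked behind it; the cost is that the caller must tolerate a retry.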
Signed-off-by: Isaku Yamahata --- arch/x86/kvm/mmu/tdp_mmu.c | 72 +++++++++++++++++++++++++------------- 1 file changed, 47 insertions(+), 25 deletions(-) diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 85d990ec149e..bdb50c26849f 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -336,9 +336,9 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu) return __pa(root->spt); } -static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, - u64 old_spte, u64 new_spte, int level, - bool shared); +static int __must_check handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, + u64 old_spte, u64 new_spte, int level, + bool shared); static void handle_changed_spte_acc_track(u64 old_spte, u64 new_spte, int level) { @@ -427,6 +427,7 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared) struct kvm_mmu_page *sp = sptep_to_sp(rcu_dereference(pt)); int level = sp->role.level; gfn_t base_gfn = sp->gfn; + int ret; int i; trace_kvm_mmu_prepare_zap_page(sp); @@ -498,8 +499,14 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared) old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte, REMOVED_SPTE, level); } - handle_changed_spte(kvm, kvm_mmu_page_as_id(sp), gfn, - old_spte, REMOVED_SPTE, level, shared); + ret = handle_changed_spte(kvm, kvm_mmu_page_as_id(sp), gfn, + old_spte, REMOVED_SPTE, level, shared); + /* + * We are removing page tables. Because in TDX case we don't + * zap private page tables except tearing down VM. It means + * no race condition. + */ + WARN_ON_ONCE(ret); } call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback); @@ -520,9 +527,9 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared) * Handle bookkeeping that might result from the modification of a SPTE. * This function must be called for all TDP SPTE modifications. 
*/ -static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, - u64 old_spte, u64 new_spte, int level, - bool shared) +static int __must_check __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, + u64 old_spte, u64 new_spte, int level, + bool shared) { bool was_present = is_shadow_present_pte(old_spte); bool is_present = is_shadow_present_pte(new_spte); @@ -558,7 +565,7 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, } if (old_spte == new_spte) - return; + return 0; trace_kvm_tdp_mmu_spte_changed(as_id, gfn, level, old_spte, new_spte); @@ -587,7 +594,7 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, "a temporary removed SPTE.\n" "as_id: %d gfn: %llx old_spte: %llx new_spte: %llx level: %d", as_id, gfn, old_spte, new_spte, level); - return; + return 0; } if (is_leaf != was_leaf) @@ -606,17 +613,25 @@ static void __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, if (was_present && !was_leaf && (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed))) handle_removed_pt(kvm, spte_to_child_pt(old_spte, level), shared); + + return 0; } -static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, - u64 old_spte, u64 new_spte, int level, - bool shared) +static int __must_check handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, + u64 old_spte, u64 new_spte, int level, + bool shared) { - __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, - shared); + int ret; + + ret = __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, + shared); + if (ret) + return ret; + handle_changed_spte_acc_track(old_spte, new_spte, level); handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte, new_spte, level); + return 0; } /* @@ -635,12 +650,14 @@ static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, * * -EBUSY - If the SPTE cannot be set. 
In this case this function will have * no side-effects other than setting iter->old_spte to the last * known value of the spte. + * * -EAGAIN - Same to -EBUSY. But the source is from callbacks for private spt */ -static inline int tdp_mmu_set_spte_atomic(struct kvm *kvm, - struct tdp_iter *iter, - u64 new_spte) +static inline int __must_check tdp_mmu_set_spte_atomic(struct kvm *kvm, + struct tdp_iter *iter, + u64 new_spte) { u64 *sptep = rcu_dereference(iter->sptep); + int ret; /* * The caller is responsible for ensuring the old SPTE is not a REMOVED @@ -659,15 +676,16 @@ static inline int tdp_mmu_set_spte_atomic(struct kvm *kvm, if (!try_cmpxchg64(sptep, &iter->old_spte, new_spte)) return -EBUSY; - __handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte, - new_spte, iter->level, true); - handle_changed_spte_acc_track(iter->old_spte, new_spte, iter->level); + ret = __handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte, + new_spte, iter->level, true); + if (!ret) + handle_changed_spte_acc_track(iter->old_spte, new_spte, iter->level); - return 0; + return ret; } -static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm, - struct tdp_iter *iter) +static inline int __must_check tdp_mmu_zap_spte_atomic(struct kvm *kvm, + struct tdp_iter *iter) { int ret; @@ -732,6 +750,8 @@ static u64 __tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep, u64 old_spte, u64 new_spte, gfn_t gfn, int level, bool record_acc_track, bool record_dirty_log) { + int ret; + lockdep_assert_held_write(&kvm->mmu_lock); /* @@ -745,7 +765,9 @@ static u64 __tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep, old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte, new_spte, level); - __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false); + ret = __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false); + /* Because write spin lock is held, no race. It should success. 
*/ + WARN_ON_ONCE(ret); if (record_acc_track) handle_changed_spte_acc_track(old_spte, new_spte, level); From patchwork Sun Oct 30 06:22:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12873 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666232wru; Sat, 29 Oct 2022 23:29:28 -0700 (PDT) X-Google-Smtp-Source: AMsMyM66NmtThBKfL5ntFvw5CL6CqmxrmHClT1FE/Iy+mgL1Vf5Y1O7srHN4LF1J7VNp3SeTc7AK X-Received: by 2002:a17:907:160c:b0:78d:b6f5:9f56 with SMTP id hb12-20020a170907160c00b0078db6f59f56mr6992388ejc.325.1667111368010; Sat, 29 Oct 2022 23:29:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111368; cv=none; d=google.com; s=arc-20160816; b=eNuvkr7ejrSSsghwghj+/Kk97l+Pb3AzfYZkzc+6L80hjOfLAliDX5auuuQbLph7CO 7yVzBJPS8fR0QV9p01ladHpE8yNlObQsw/vlBdNd6APMvOM6ykyo4EFWJZhP6RSK1GI6 iC+ZNechvSIdbwcwr+neSMcGUKp4JjT41EGlUZUGOhromXuv5ZFLaQmN6n1cF70QoNk6 lwDlGotyVwbY+dJeUsZgqo5/jtN4L2fnpD2cxgTGQzjY9VzxkwaFkJiGUHYCsyvDEOXl z8mEXxgMKLr5QgkMkr+MS9dTy8LcqLWizBGjraG2XGYb9UNUvxZeZfjqYDUnmZ5gyXPP k32Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=LHEytOeozPD34eCirtPJEXR1tUg50FhdqgkqqkGEiLg=; b=jS6VjOR7SFyx4LflC0BiZSw7RyNeAkQXznY+hCkex0+CRa0eUIjKUVTpb1StjuScPM tK1gMQrmTDKCxIQVsXkbcbNetyiLf8s91tTxY/ahf7F2cvPpJCpaqEX2ubTFCCdA4Yua gAALnlmz5eq3zz4hOTHKk1AWydaTzvWFtPlH/jchuIMX4UtmHKPKn/6ilXSiiWxJsxMa ChsCPANzOEykzLFEPH24lqaM0oGtCe7aseMa5WbtuObnKFa4DfIu0wyWjX9ioxX2L/bC 9BjKcoDN+dhhhyHSqjcMYqtGch9aTx0EPE6q5JgEx1i2s2BWa5/WvLA6aQyDbgQl91kv aGmQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=mpt4ur77; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as 
permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id fk7-20020a056402398700b00459101ecc5esi3529537edb.468.2022.10.29.23.29.03; Sat, 29 Oct 2022 23:29:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=mpt4ur77; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230494AbiJ3G2L (ORCPT + 99 others); Sun, 30 Oct 2022 02:28:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47808 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229915AbiJ3GY1 (ORCPT ); Sun, 30 Oct 2022 02:24:27 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 80D6610A; Sat, 29 Oct 2022 23:24:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111051; x=1698647051; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7BrXdXkST7hvGyewEEBXkVJAn7JTYiYcFSd+4wEDLR4=; b=mpt4ur77CsmXl+qUHjI62MK/coi7cSBRUL6tqYGN1n979HFQHbkGYvtY wrq+szQUDSCyVv1OieoRLpxizDhoJCdn33hwbKBWXuePncxIyQGXiL/to 0fsHDYX+U+SfF+6FPVkKdUaHw9Qbn3O6sVjq699X6DuQVODwspUp4s9Yr FeI7zjoilyZk5SuoMMCrQHmjYQ3ABI/akAuPBKoLf3Ekleig6y40JdIa7 IV9kfcNR40RNZumg5+kq+OZInGXPlDzX1JtFL8rKEUsoHbUkfYqf6RZbt pMydHxyPc7XHif8PaBZTfGAcGFMweaQSJSXOnQKTXR3v3C3oVZzVPHTXF w==; 
X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037161" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037161" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:05 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393004" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393004" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:05 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Kai Huang Subject: [PATCH v10 049/108] KVM: x86/tdp_mmu: Support TDX private mapping for TDP MMU Date: Sat, 29 Oct 2022 23:22:50 -0700 Message-Id: <9d5595dfe1b5ab77bcb5650bc4d940dd977b0a32.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092969774634641?= X-GMAIL-MSGID: =?utf-8?q?1748092969774634641?=

From: Isaku Yamahata

Allocate a protected page table for each private page table, and add hooks to operate on the protected page table. This patch adds allocation/free of protected page tables and hooks. When calling hooks to update an SPTE, freeze the entry, call the hooks, and unfreeze the entry to allow concurrent updates on page tables.
This concurrency is the advantage of the TDP MMU. As kvm_gfn_shared_mask() always returns 0 at this point, those hooks aren't called yet with this patch.

When the faulting GPA is private, the KVM page fault is called private. When resolving a private KVM page fault, allocate a protected page table and call hooks to operate on it. On a change to a private PTE entry, invoke the kvm_x86_ops hook in __handle_changed_spte() to propagate the change to the protected page table. The following depicts the relationship.

  private KVM page fault   |
      |                    |
      V                    |
 private GPA               |     CPU protected EPTP
      |                    |           |
      V                    |           V
 private PT root           |     protected PT root
      |                    |           |
      V                    |           V
   private PT --hook to propagate-->protected PT
      |                    |           |
      \--------------------+------\    |
                           |      |    |
                           |      V    V
                           |    private guest page
                           |
                           |
     non-encrypted memory  |    encrypted memory
                           |
PT: page table

The existing KVM TDP MMU code uses atomic updates of SPTEs. When populating an EPT entry, it atomically sets the entry. Zapping an SPTE, however, requires a TLB shootdown. To handle that, the entry is frozen with a special SPTE value that clears the present bit. After the TLB shootdown, the entry is set to its eventual value (unfrozen).

For the protected page table, hooks are called to update the protected page table in addition to directly accessing the private SPTE. For the zapping case, freezing the SPTE works: the hooks can be called in addition to the TLB shootdown. For populating a private SPTE, there can be a race condition without further protection:

  vcpu 1: populating 2M private SPTE
  vcpu 2: populating 4K private SPTE
  vcpu 2: TDX SEAMCALL to update 4K protected SPTE => error
  vcpu 1: TDX SEAMCALL to update 2M protected SPTE

To avoid the race, the frozen SPTE is utilized. Instead of atomically updating the private entry, freeze the entry, call the hook that updates the protected SPTE, and then set the entry to the final value.

Support 4K pages only at this stage. 2M page support can be done in future patches.
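
As a rough illustration of the freeze, hook, unfreeze sequence described above, the following user-space C sketch may help. It is not the kernel code from this patch: FROZEN_SPTE, private_hook_t, set_private_spte_frozen(), hook_ok() and hook_fail() are hypothetical names, C11 atomics stand in for try_cmpxchg64(), and the hook stands in for the kvm_x86_ops callbacks that update the protected page table.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frozen marker; the patch uses REMOVED_SPTE for this role. */
#define FROZEN_SPTE ((uint64_t)0x5a0)

/* Hypothetical hook type: propagates the new value to the protected table. */
typedef int (*private_hook_t)(uint64_t new_spte);

/* Hypothetical hooks used only to exercise the sketch. */
static int hook_ok(uint64_t new_spte)   { (void)new_spte; return 0; }
static int hook_fail(uint64_t new_spte) { (void)new_spte; return -EIO; }

/*
 * Update a private entry in three steps:
 *   1. freeze:   cmpxchg the expected old value to FROZEN_SPTE
 *   2. hook:     update the protected page table while the entry is frozen
 *   3. unfreeze: write new_spte on success, restore the old value on failure
 * Returns -EBUSY if another updater changed the entry first; in that case
 * *old_spte is refreshed with the value currently in the entry.
 */
static int set_private_spte_frozen(_Atomic uint64_t *sptep, uint64_t *old_spte,
                                   uint64_t new_spte, private_hook_t hook)
{
    int ret;

    assert(sptep != NULL && old_spte != NULL && hook != NULL);

    /* Step 1: only one updater can win the transition to the frozen value. */
    if (!atomic_compare_exchange_strong(sptep, old_spte, FROZEN_SPTE))
        return -EBUSY;

    /* Step 2: concurrent updaters now see FROZEN_SPTE and must retry. */
    ret = hook(new_spte);

    /* Step 3: publish the final value, or roll back if the hook failed. */
    atomic_store(sptep, ret ? *old_spte : new_spte);
    return ret;
}
```

In this sketch a second updater that races with the first fails the compare-and-exchange and gets -EBUSY, so it never reaches its own hook call while the first update is still in flight. That is the property the 2M versus 4K race above depends on.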
Co-developed-by: Kai Huang Signed-off-by: Kai Huang Signed-off-by: Isaku Yamahata --- arch/x86/include/asm/kvm-x86-ops.h | 5 + arch/x86/include/asm/kvm_host.h | 11 ++ arch/x86/kvm/mmu/mmu.c | 15 +- arch/x86/kvm/mmu/mmu_internal.h | 32 ++++ arch/x86/kvm/mmu/tdp_iter.h | 2 +- arch/x86/kvm/mmu/tdp_mmu.c | 244 +++++++++++++++++++++++++---- arch/x86/kvm/mmu/tdp_mmu.h | 2 +- virt/kvm/kvm_main.c | 1 + 8 files changed, 280 insertions(+), 32 deletions(-) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index f28c9fd72ac4..1b01dc2098b0 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -94,6 +94,11 @@ KVM_X86_OP_OPTIONAL_RET0(set_tss_addr) KVM_X86_OP_OPTIONAL_RET0(set_identity_map_addr) KVM_X86_OP_OPTIONAL_RET0(get_mt_mask) KVM_X86_OP(load_mmu_pgd) +KVM_X86_OP_OPTIONAL(link_private_spt) +KVM_X86_OP_OPTIONAL(free_private_spt) +KVM_X86_OP_OPTIONAL(set_private_spte) +KVM_X86_OP_OPTIONAL(remove_private_spte) +KVM_X86_OP_OPTIONAL(zap_private_spte) KVM_X86_OP(has_wbinvd_exit) KVM_X86_OP(get_l2_tsc_offset) KVM_X86_OP(get_l2_tsc_multiplier) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 381df2c8136d..5f9634c130d0 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -467,6 +467,7 @@ struct kvm_mmu { struct kvm_mmu_page *sp); void (*invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa); struct kvm_mmu_root_info root; + hpa_t private_root_hpa; union kvm_cpu_role cpu_role; union kvm_mmu_page_role root_role; @@ -1613,6 +1614,16 @@ struct kvm_x86_ops { void (*load_mmu_pgd)(struct kvm_vcpu *vcpu, hpa_t root_hpa, int root_level); + int (*link_private_spt)(struct kvm *kvm, gfn_t gfn, enum pg_level level, + void *private_spt); + int (*free_private_spt)(struct kvm *kvm, gfn_t gfn, enum pg_level level, + void *private_spt); + int (*set_private_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level, + kvm_pfn_t pfn); + int 
(*remove_private_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level, + kvm_pfn_t pfn); + int (*zap_private_spte)(struct kvm *kvm, gfn_t gfn, enum pg_level level); + bool (*has_wbinvd_exit)(void); u64 (*get_l2_tsc_offset)(struct kvm_vcpu *vcpu); diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 0237e143299c..02e7b5cf3231 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -3646,7 +3646,12 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu) goto out_unlock; if (is_tdp_mmu_enabled(vcpu->kvm)) { - root = kvm_tdp_mmu_get_vcpu_root_hpa(vcpu); + if (kvm_gfn_shared_mask(vcpu->kvm) && + !VALID_PAGE(mmu->private_root_hpa)) { + root = kvm_tdp_mmu_get_vcpu_root_hpa(vcpu, true); + mmu->private_root_hpa = root; + } + root = kvm_tdp_mmu_get_vcpu_root_hpa(vcpu, false); mmu->root.hpa = root; } else if (shadow_root_level >= PT64_ROOT_4LEVEL) { root = mmu_alloc_root(vcpu, 0, 0, shadow_root_level); @@ -4357,7 +4362,7 @@ static int direct_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault unsigned long mmu_seq; int r; - fault->gfn = fault->addr >> PAGE_SHIFT; + fault->gfn = gpa_to_gfn(fault->addr) & ~kvm_gfn_shared_mask(vcpu->kvm); fault->slot = kvm_vcpu_gfn_to_memslot(vcpu, fault->gfn); if (page_fault_handle_page_track(vcpu, fault)) @@ -5893,6 +5898,7 @@ static int __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu) mmu->root.hpa = INVALID_PAGE; mmu->root.pgd = 0; + mmu->private_root_hpa = INVALID_PAGE; for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++) mmu->prev_roots[i] = KVM_MMU_ROOT_INFO_INVALID; @@ -6116,7 +6122,7 @@ static void kvm_mmu_zap_memslot(struct kvm *kvm, struct kvm_memory_slot *slot) }; /* - * this handles both private gfn and shared gfn. + * This handles both private gfn and shared gfn. * All private page should be zapped on memslot deletion. 
*/ flush = kvm_tdp_mmu_unmap_gfn_range(kvm, &range, flush, true); @@ -6919,6 +6925,9 @@ int kvm_mmu_vendor_module_init(void) void kvm_mmu_destroy(struct kvm_vcpu *vcpu) { kvm_mmu_unload(vcpu); + if (is_tdp_mmu_enabled(vcpu->kvm)) + mmu_free_root_page(vcpu->kvm, &vcpu->arch.mmu->private_root_hpa, + NULL); free_mmu_pages(&vcpu->arch.root_mmu); free_mmu_pages(&vcpu->arch.guest_mmu); mmu_free_memory_caches(vcpu); diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index 4c013124534b..508e8402c07a 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -6,6 +6,8 @@ #include #include +#include "mmu.h" + #undef MMU_DEBUG #ifdef MMU_DEBUG @@ -209,11 +211,29 @@ static inline void kvm_mmu_alloc_private_spt(struct kvm_vcpu *vcpu, } } +static inline int kvm_alloc_private_spt_for_split(struct kvm_mmu_page *sp, gfp_t gfp) +{ + gfp &= ~__GFP_ZERO; + sp->private_spt = (void *)__get_free_page(gfp); + if (!sp->private_spt) + return -ENOMEM; + return 0; +} + static inline void kvm_mmu_free_private_spt(struct kvm_mmu_page *sp) { if (sp->private_spt) free_page((unsigned long)sp->private_spt); } + +static inline gfn_t kvm_gfn_for_root(struct kvm *kvm, struct kvm_mmu_page *root, + gfn_t gfn) +{ + if (is_private_sp(root)) + return kvm_gfn_private(kvm, gfn); + else + return kvm_gfn_shared(kvm, gfn); +} #else static inline void *kvm_mmu_private_spt(struct kvm_mmu_page *sp) { @@ -230,9 +250,20 @@ static inline void kvm_mmu_alloc_private_spt(struct kvm_vcpu *vcpu, { } +static inline int kvm_alloc_private_spt_for_split(struct kvm_mmu_page *sp, gfp_t gfp) +{ + return -ENOMEM; +} + static inline void kvm_mmu_free_private_spt(struct kvm_mmu_page *sp) { } + +static inline gfn_t kvm_gfn_for_root(struct kvm *kvm, struct kvm_mmu_page *root, + gfn_t gfn) +{ + return gfn; +} #endif static inline bool kvm_mmu_page_ad_need_write_protect(struct kvm_mmu_page *sp) @@ -367,6 +398,7 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t 
cr2_or_gpa, .is_tdp = likely(vcpu->arch.mmu->page_fault == kvm_tdp_page_fault), .nx_huge_page_workaround_enabled = is_nx_huge_page_enabled(vcpu->kvm), + .is_private = kvm_is_private_gpa(vcpu->kvm, cr2_or_gpa), .max_level = vcpu->kvm->arch.tdp_max_page_level, .req_level = PG_LEVEL_4K, diff --git a/arch/x86/kvm/mmu/tdp_iter.h b/arch/x86/kvm/mmu/tdp_iter.h index 9e56a5b1024c..eab62baf8549 100644 --- a/arch/x86/kvm/mmu/tdp_iter.h +++ b/arch/x86/kvm/mmu/tdp_iter.h @@ -71,7 +71,7 @@ struct tdp_iter { tdp_ptep_t pt_path[PT64_ROOT_MAX_LEVEL]; /* A pointer to the current SPTE */ tdp_ptep_t sptep; - /* The lowest GFN mapped by the current SPTE */ + /* The lowest GFN (shared bits included) mapped by the current SPTE */ gfn_t gfn; /* The level of the root page given to the iterator */ int root_level; diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index bdb50c26849f..0e053b96444a 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -285,6 +285,9 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu, sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); sp->role = role; + if (kvm_mmu_page_role_is_private(role)) + kvm_mmu_alloc_private_spt(vcpu, NULL, sp); + return sp; } @@ -305,7 +308,8 @@ static void tdp_mmu_init_sp(struct kvm_mmu_page *sp, tdp_ptep_t sptep, trace_kvm_mmu_get_page(sp, true); } -hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu) +static struct kvm_mmu_page *kvm_tdp_mmu_get_vcpu_root(struct kvm_vcpu *vcpu, + bool private) { union kvm_mmu_page_role role = vcpu->arch.mmu->root_role; struct kvm *kvm = vcpu->kvm; @@ -317,6 +321,8 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu) * Check for an existing root before allocating a new one. Note, the * role check prevents consuming an invalid root. 
*/ + if (private) + kvm_mmu_page_role_set_private(&role); for_each_tdp_mmu_root(kvm, root, kvm_mmu_role_as_id(role)) { if (root->role.word == role.word && kvm_tdp_mmu_get_root(root)) @@ -333,11 +339,17 @@ hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu) spin_unlock(&kvm->arch.tdp_mmu_pages_lock); out: - return __pa(root->spt); + return root; +} + +hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu, bool private) +{ + return __pa(kvm_tdp_mmu_get_vcpu_root(vcpu, private)->spt); } static int __must_check handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, - u64 old_spte, u64 new_spte, int level, + u64 old_spte, u64 new_spte, + union kvm_mmu_page_role role, bool shared); static void handle_changed_spte_acc_track(u64 old_spte, u64 new_spte, int level) @@ -364,6 +376,8 @@ static void handle_changed_spte_dirty_log(struct kvm *kvm, int as_id, gfn_t gfn, if ((!is_writable_pte(old_spte) || pfn_changed) && is_writable_pte(new_spte)) { + /* For memory slot operations, use GFN without aliasing */ + gfn = gfn & ~kvm_gfn_shared_mask(kvm); slot = __gfn_to_memslot(__kvm_memslots(kvm, as_id), gfn); mark_page_dirty_in_slot(kvm, slot, gfn); } @@ -500,7 +514,8 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared) REMOVED_SPTE, level); } ret = handle_changed_spte(kvm, kvm_mmu_page_as_id(sp), gfn, - old_spte, REMOVED_SPTE, level, shared); + old_spte, REMOVED_SPTE, sp->role, + shared); /* * We are removing page tables. Because in TDX case we don't * zap private page tables except tearing down VM. It means @@ -509,9 +524,81 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared) WARN_ON_ONCE(ret); } + if (is_private_sp(sp) && + WARN_ON(static_call(kvm_x86_free_private_spt)(kvm, sp->gfn, sp->role.level, + kvm_mmu_private_spt(sp)))) { + /* + * Failed to unlink Secure EPT page and there is nothing to do + * further. Intentionally leak the page to prevent the kernel + * from accessing the encrypted page. 
+ */ + kvm_mmu_init_private_spt(sp, NULL); + } + call_rcu(&sp->rcu_head, tdp_mmu_free_sp_rcu_callback); } +static void *get_private_spt(gfn_t gfn, u64 new_spte, int level) +{ + if (is_shadow_present_pte(new_spte) && !is_last_spte(new_spte, level)) { + struct kvm_mmu_page *sp = to_shadow_page(pfn_to_hpa(spte_to_pfn(new_spte))); + void *private_spt = kvm_mmu_private_spt(sp); + + WARN_ON_ONCE(!private_spt); + WARN_ON_ONCE(sp->role.level + 1 != level); + WARN_ON_ONCE(sp->gfn != gfn); + return private_spt; + } + + return NULL; +} + +static int __must_check handle_changed_private_spte(struct kvm *kvm, gfn_t gfn, + u64 old_spte, u64 new_spte, + int level) +{ + bool was_present = is_shadow_present_pte(old_spte); + bool is_present = is_shadow_present_pte(new_spte); + bool was_leaf = was_present && is_last_spte(old_spte, level); + bool is_leaf = is_present && is_last_spte(new_spte, level); + kvm_pfn_t old_pfn = spte_to_pfn(old_spte); + kvm_pfn_t new_pfn = spte_to_pfn(new_spte); + int ret; + + lockdep_assert_held(&kvm->mmu_lock); + if (is_present) { + /* TDP MMU doesn't change present -> present */ + KVM_BUG_ON(was_present, kvm); + + /* + * Use different call to either set up middle level + * private page table, or leaf. + */ + if (is_leaf) + ret = static_call(kvm_x86_set_private_spte)(kvm, gfn, level, new_pfn); + else { + void *private_spt = get_private_spt(gfn, new_spte, level); + + KVM_BUG_ON(!private_spt, kvm); + ret = static_call(kvm_x86_link_private_spt)(kvm, gfn, level, private_spt); + } + } else if (was_leaf) { + /* non-present -> non-present doesn't make sense. */ + KVM_BUG_ON(!was_present, kvm); + /* + * Zap private leaf SPTE. Zapping private table is done + * below in handle_removed_tdp_mmu_page(). 
+ */ + lockdep_assert_held_write(&kvm->mmu_lock); + ret = static_call(kvm_x86_zap_private_spte)(kvm, gfn, level); + if (!ret) { + ret = static_call(kvm_x86_remove_private_spte)(kvm, gfn, level, old_pfn); + WARN_ON_ONCE(ret); + } + } + return ret; +} + /** * __handle_changed_spte - handle bookkeeping associated with an SPTE change * @kvm: kvm instance @@ -519,7 +606,7 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared) * @gfn: the base GFN that was mapped by the SPTE * @old_spte: The value of the SPTE before the change * @new_spte: The value of the SPTE after the change - * @level: the level of the PT the SPTE is part of in the paging structure + * @role: the role of the PT the SPTE is part of in the paging structure * @shared: This operation may not be running under the exclusive use of * the MMU lock and the operation must synchronize with other * threads that might be modifying SPTEs. @@ -528,14 +615,18 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t pt, bool shared) * This function must be called for all TDP SPTE modifications. 
*/ static int __must_check __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, - u64 old_spte, u64 new_spte, int level, - bool shared) + u64 old_spte, u64 new_spte, + union kvm_mmu_page_role role, bool shared) { + bool is_private = kvm_mmu_page_role_is_private(role); + int level = role.level; bool was_present = is_shadow_present_pte(old_spte); bool is_present = is_shadow_present_pte(new_spte); bool was_leaf = was_present && is_last_spte(old_spte, level); bool is_leaf = is_present && is_last_spte(new_spte, level); - bool pfn_changed = spte_to_pfn(old_spte) != spte_to_pfn(new_spte); + kvm_pfn_t old_pfn = spte_to_pfn(old_spte); + kvm_pfn_t new_pfn = spte_to_pfn(new_spte); + bool pfn_changed = old_pfn != new_pfn; WARN_ON(level > PT64_ROOT_MAX_LEVEL); WARN_ON(level < PG_LEVEL_4K); @@ -602,7 +693,7 @@ static int __must_check __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t if (was_leaf && is_dirty_spte(old_spte) && (!is_present || !is_dirty_spte(new_spte) || pfn_changed)) - kvm_set_pfn_dirty(spte_to_pfn(old_spte)); + kvm_set_pfn_dirty(old_pfn); /* * Recursively handle child PTs if the change removed a subtree from @@ -611,26 +702,42 @@ static int __must_check __handle_changed_spte(struct kvm *kvm, int as_id, gfn_t * pages are kernel allocations and should never be migrated. */ if (was_present && !was_leaf && - (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed))) + (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed))) { + KVM_BUG_ON(is_private != is_private_sptep(spte_to_child_pt(old_spte, level)), + kvm); handle_removed_pt(kvm, spte_to_child_pt(old_spte, level), shared); + } + /* + * Special handling for the private mapping. We are either + * setting up new mapping at middle level page table, or leaf, + * or tearing down existing mapping. + * + * This is after handling lower page table by above + * handle_remove_tdp_mmu_page(). Secure-EPT requires to remove + * Secure-EPT tables after removing children. 
+ */ + if (is_private && + /* Ignore change of software only bits. e.g. host_writable */ + (was_leaf != is_leaf || was_present != is_present || pfn_changed)) + return handle_changed_private_spte(kvm, gfn, old_spte, new_spte, role.level); return 0; } static int __must_check handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn, - u64 old_spte, u64 new_spte, int level, + u64 old_spte, u64 new_spte, + union kvm_mmu_page_role role, bool shared) { int ret; - ret = __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, - shared); + ret = __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, role, shared); if (ret) return ret; - handle_changed_spte_acc_track(old_spte, new_spte, level); + handle_changed_spte_acc_track(old_spte, new_spte, role.level); handle_changed_spte_dirty_log(kvm, as_id, gfn, old_spte, - new_spte, level); + new_spte, role.level); return 0; } @@ -656,6 +763,24 @@ static inline int __must_check tdp_mmu_set_spte_atomic(struct kvm *kvm, struct tdp_iter *iter, u64 new_spte) { + /* + * For conventional page table, the update flow is + * - update STPE with atomic operation + * - handle changed SPTE. __handle_changed_spte() + * NOTE: __handle_changed_spte() (and functions) must be safe against + * concurrent update. It is an exception to zap SPTE. See + * tdp_mmu_zap_spte_atomic(). + * + * For private page table, callbacks are needed to propagate SPTE + * change into the protected page table. In order to atomically update + * both the SPTE and the protected page tables with callbacks, utilize + * freezing SPTE. + * - Freeze the SPTE. Set entry to REMOVED_SPTE. + * - Trigger callbacks for protected page tables. __handle_changed_spte() + * - Unfreeze the SPTE. Set the entry to new_spte. + */ + bool freeze_spte = is_private_sptep(iter->sptep) && !is_removed_spte(new_spte); + u64 tmp_spte = freeze_spte ? 
REMOVED_SPTE : new_spte; u64 *sptep = rcu_dereference(iter->sptep); int ret; @@ -673,14 +798,24 @@ static inline int __must_check tdp_mmu_set_spte_atomic(struct kvm *kvm, * Note, fast_pf_fix_direct_spte() can also modify TDP MMU SPTEs and * does not hold the mmu_lock. */ - if (!try_cmpxchg64(sptep, &iter->old_spte, new_spte)) + if (!try_cmpxchg64(sptep, &iter->old_spte, tmp_spte)) return -EBUSY; ret = __handle_changed_spte(kvm, iter->as_id, iter->gfn, iter->old_spte, - new_spte, iter->level, true); + new_spte, sptep_to_sp(sptep)->role, true); if (!ret) handle_changed_spte_acc_track(iter->old_spte, new_spte, iter->level); + if (ret) { + /* + * !freeze_spte means this fault isn't private. No call to + * operation on Secure EPT. + */ + WARN_ON_ONCE(!freeze_spte); + __kvm_tdp_mmu_write_spte(sptep, iter->old_spte); + } else if (freeze_spte) + __kvm_tdp_mmu_write_spte(sptep, new_spte); + return ret; } @@ -750,6 +885,7 @@ static u64 __tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep, u64 old_spte, u64 new_spte, gfn_t gfn, int level, bool record_acc_track, bool record_dirty_log) { + union kvm_mmu_page_role role; int ret; lockdep_assert_held_write(&kvm->mmu_lock); @@ -765,7 +901,9 @@ static u64 __tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep, old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte, new_spte, level); - ret = __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, level, false); + role = sptep_to_sp(sptep)->role; + role.level = level; + ret = __handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, role, false); /* Because write spin lock is held, no race. It should success. 
*/ WARN_ON_ONCE(ret); @@ -819,8 +957,11 @@ static inline void tdp_mmu_set_spte_no_dirty_log(struct kvm *kvm, continue; \ else -#define tdp_mmu_for_each_pte(_iter, _mmu, _start, _end) \ - for_each_tdp_pte(_iter, to_shadow_page(_mmu->root.hpa), _start, _end) +#define tdp_mmu_for_each_pte(_iter, _mmu, _private, _start, _end) \ + for_each_tdp_pte(_iter, \ + to_shadow_page((_private) ? _mmu->private_root_hpa : \ + _mmu->root.hpa), \ + _start, _end) /* * Yield if the MMU lock is contended or this thread needs to return control @@ -983,6 +1124,14 @@ static bool tdp_mmu_zap_leafs(struct kvm *kvm, struct kvm_mmu_page *root, if (!zap_private && is_private_sp(root)) return false; + /* + * start and end doesn't have GFN shared bit. This function zaps + * a region including alias. Adjust shared bit of [start, end) if the + * root is shared. + */ + start = kvm_gfn_for_root(kvm, root, start); + end = kvm_gfn_for_root(kvm, root, end); + rcu_read_lock(); for_each_tdp_pte_min_level(iter, root, PG_LEVEL_4K, start, end) { @@ -1111,10 +1260,19 @@ static int tdp_mmu_map_handle_target_level(struct kvm_vcpu *vcpu, WARN_ON(sp->role.level != fault->goal_level); if (unlikely(!fault->slot)) new_spte = make_mmio_spte(vcpu, iter->gfn, ACC_ALL); - else - wrprot = make_spte(vcpu, sp, fault->slot, ACC_ALL, iter->gfn, - fault->pfn, iter->old_spte, fault->prefetch, true, - fault->map_writable, &new_spte); + else { + unsigned long pte_access = ACC_ALL; + gfn_t gfn_unalias = iter->gfn & ~kvm_gfn_shared_mask(vcpu->kvm); + + /* TDX shared GPAs are no executable, enforce this for the SDV. 
*/ + if (kvm_gfn_shared_mask(vcpu->kvm) && !fault->is_private) + pte_access &= ~ACC_EXEC_MASK; + + wrprot = make_spte(vcpu, sp, fault->slot, pte_access, + gfn_unalias, fault->pfn, iter->old_spte, + fault->prefetch, true, fault->map_writable, + &new_spte); + } if (new_spte == iter->old_spte) ret = RET_PF_SPURIOUS; @@ -1214,6 +1372,8 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) { struct kvm_mmu *mmu = vcpu->arch.mmu; struct tdp_iter iter; + gfn_t raw_gfn; + bool is_private = fault->is_private; int ret; kvm_mmu_hugepage_adjust(vcpu, fault); @@ -1222,7 +1382,17 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) rcu_read_lock(); - tdp_mmu_for_each_pte(iter, mmu, fault->gfn, fault->gfn + 1) { + raw_gfn = gpa_to_gfn(fault->addr); + + if (is_error_noslot_pfn(fault->pfn) || + !kvm_pfn_to_refcounted_page(fault->pfn)) { + if (is_private) { + rcu_read_unlock(); + return -EFAULT; + } + } + + tdp_mmu_for_each_pte(iter, mmu, is_private, raw_gfn, raw_gfn + 1) { if (fault->nx_huge_page_workaround_enabled) disallowed_hugepage_adjust(fault, iter.old_spte, iter.level); @@ -1238,6 +1408,12 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) is_large_pte(iter.old_spte)) { if (tdp_mmu_zap_spte_atomic(vcpu->kvm, &iter)) break; + /* + * TODO: large page support. 
+ * Doesn't support large page for TDX now + */ + KVM_BUG_ON(is_private_sptep(iter.sptep), vcpu->kvm); + /* * The iter must explicitly re-read the spte here @@ -1480,6 +1656,12 @@ static struct kvm_mmu_page *__tdp_mmu_alloc_sp_for_split(gfp_t gfp, union kvm_mm sp->role = role; sp->spt = (void *)__get_free_page(gfp); + if (kvm_mmu_page_role_is_private(role)) { + if (kvm_alloc_private_spt_for_split(sp, gfp)) { + free_page((unsigned long)sp->spt); + sp->spt = NULL; + } + } if (!sp->spt) { kmem_cache_free(mmu_page_header_cache, sp); return NULL; @@ -1495,6 +1677,11 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_split(struct kvm *kvm, union kvm_mmu_page_role role = tdp_iter_child_role(iter); struct kvm_mmu_page *sp; + KVM_BUG_ON(kvm_mmu_page_role_is_private(role) != + is_private_sptep(iter->sptep), kvm); + /* TODO: Large page isn't supported for private SPTE yet. */ + KVM_BUG_ON(kvm_mmu_page_role_is_private(role), kvm); + /* * Since we are allocating while under the MMU lock we have to be * careful about GFP flags. Use GFP_NOWAIT to avoid blocking on direct @@ -1929,7 +2116,7 @@ int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes, if (WARN_ON_ONCE(kvm_gfn_shared_mask(vcpu->kvm))) return leaf; - tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) { + tdp_mmu_for_each_pte(iter, mmu, false, gfn, gfn + 1) { leaf = iter.level; sptes[leaf] = iter.old_spte; } @@ -1956,7 +2143,10 @@ u64 *kvm_tdp_mmu_fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, u64 addr, gfn_t gfn = addr >> PAGE_SHIFT; tdp_ptep_t sptep = NULL; - tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) { + /* fast page fault for private GPA isn't supported. 
*/ + WARN_ON_ONCE(kvm_is_private_gpa(vcpu->kvm, addr)); + + tdp_mmu_for_each_pte(iter, mmu, false, gfn, gfn + 1) { *spte = iter.old_spte; sptep = iter.sptep; } diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h index c98c7df449a8..695175c921a5 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.h +++ b/arch/x86/kvm/mmu/tdp_mmu.h @@ -5,7 +5,7 @@ #include -hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu); +hpa_t kvm_tdp_mmu_get_vcpu_root_hpa(struct kvm_vcpu *vcpu, bool private); __must_check static inline bool kvm_tdp_mmu_get_root(struct kvm_mmu_page *root) { diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index dda2f2ec4faa..8c996f40b544 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -211,6 +211,7 @@ struct page *kvm_pfn_to_refcounted_page(kvm_pfn_t pfn) return NULL; } +EXPORT_SYMBOL_GPL(kvm_pfn_to_refcounted_page); /* * Switches to specified vcpu, until a matching vcpu_put() From patchwork Sun Oct 30 06:22:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12866 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666141wru; Sat, 29 Oct 2022 23:29:06 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7hp9iYGUjqKZR+MIdeaD3syWYuru7JZRUoADw6uWPGoK7kqLmPvF0ZcEFzUi/ts3lkjPz9 X-Received: by 2002:a17:907:1687:b0:78e:4b6:1aff with SMTP id hc7-20020a170907168700b0078e04b61affmr6810357ejc.520.1667111346715; Sat, 29 Oct 2022 23:29:06 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111346; cv=none; d=google.com; s=arc-20160816; b=nUPh3BMZMJYmrLGaEIvE1SFp5/PV+NJ7UAOLsBdnrJ6KQOn3FOZmZ/ZPhtMSZC0z9M S2J8m1akbdopKsmax7danbu/GhPo7UvCCe0J/qAcAA3xyWQYkzIBEuoo7Y9OX9/ryZHG /wxOAUqrn/+FohMb0Ts9qBWuKFSjB0YtFPPlZpAMD5frOURKzp/zue3z1chvqJLN3iLn vM1nvD+Ef6fhyxwUxcZz7C31FxRHYctASKZfIR7rdbnItT2T6MKUWHaNA2Q0HgjPqFYy 4O0b2RCSVi9gpuCJGk6V0YexxPv5r4k7T/P9zVUWhBSg/TMQEdjIhuZlE/SU5iKYZiOl gQ/w== ARC-Message-Signature: i=1; 
a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=FMqdIlTDnSkYKJAe1+jfMyM979spOMULIJbXxJ/HSMA=; b=G/xXOAOTm60YvjNKa0KQ6LDU5Dsj+2nPQEDBJ6lQArnvIQ2FrVMHVdQT6nXIByNlik WJPCvdcRzy8J/nr8KcjsOXXSzd6HbkwiWunXztRVHRoUgT35djVRsdGfN34HapYJmUnj Rc49qExvLELhCMZWP2ms66OhGDN0D08zmoSFM1XnWYyZiDq/McBqRTH8lvhWfC3bF1aP QbMJfT8B/IbRk2M3ZB79xOJJQq3yUbiPQrcj1ZPt00//VoHbjZMIOOJnAPB8GEbZtIjb v742cEqCSiAST9D9qUArUqpC3+gvpZEQgnJuKCT+Mf7a6JIYIy4pQ1JcutMWpErjo70J TThg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=TXj+aMyA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id j10-20020a170906830a00b0078da42195c2si3209033ejx.547.2022.10.29.23.28.43; Sat, 29 Oct 2022 23:29:06 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=TXj+aMyA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230434AbiJ3G11 (ORCPT + 99 others); Sun, 30 Oct 2022 02:27:27 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47724 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229952AbiJ3GYY (ORCPT ); Sun, 30 Oct 2022 02:24:24 
-0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9E06E187; Sat, 29 Oct 2022 23:24:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111051; x=1698647051; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Fb3dBDV4drTH1Irv7AqwIGDFRaog5goehiwWHDo/z9Q=; b=TXj+aMyAYc+3v2ZH/LqcvYmus5axvGcxS9LlsmNl9JvymQvownvdU71F JujWgpoLrQX5JQHBAgvOO5y9vIObN69ayrSZJwN1ykgSkvjmEX7XAxjZu W6uIBcE/Lj4nCAfUoAknR69SInJ8KGQ1I88ysUes+Gg7evypuiAVgujY4 puZin3g6EuggdSM8C5TG1xNTywT9cPJNE0ov7u1ChpBhIJwR7+4rvZ6NM eT0W9hvAbOrCqpuZ7FPifX5AMVGcQ+e1UhrhcrEi7vkfCSUAs2wl4nAYc UQxWgNjOklp8rQ6uBgRfrgc456EaNKZwtkqw9JKwmDQdIHv3Oq+skXqn8 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037162" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037162" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:05 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393007" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393007" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:05 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 050/108] [MARKER] The start of TDX KVM patch series: TDX EPT violation Date: Sat, 29 Oct 2022 23:22:51 -0700 Message-Id: <13cd5623491890659b1eded413678f6a6c2fa50e.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, 
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Isaku Yamahata

This empty commit marks the start of the TDX EPT violation patch series.

Signed-off-by: Isaku Yamahata
---
 Documentation/virt/kvm/intel-tdx-layer-status.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/kvm/intel-tdx-layer-status.rst b/Documentation/virt/kvm/intel-tdx-layer-status.rst
index d5cace00c433..c3e675bea802 100644
--- a/Documentation/virt/kvm/intel-tdx-layer-status.rst
+++ b/Documentation/virt/kvm/intel-tdx-layer-status.rst
@@ -19,12 +19,12 @@ Patch Layer status
 * TDX architectural definitions: Applied
 * TD VM creation/destruction: Applied
 * TD vcpu creation/destruction: Applied
-* TDX EPT violation: Not yet
+* TDX EPT violation: Applying
 * TD finalization: Not yet
 * TD vcpu enter/exit: Not yet
 * TD vcpu interrupts/exit/hypercall: Not yet
 * KVM MMU GPA shared bits: Applied
 * KVM TDP refactoring for TDX: Applied
-* KVM TDP MMU hooks: Applying
+* KVM TDP MMU hooks: Applied
 * KVM TDP MMU MapGPA: Not yet

From patchwork Sun Oct 30 06:22:52 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12864
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Sean Christopherson
Subject: [PATCH v10 051/108] KVM: x86/mmu: Disallow dirty logging for x86 TDX
Date: Sat, 29 Oct 2022 23:22:52 -0700
Message-Id: <4873c5af293116df92eb8da5e1ba4e76df081682.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Isaku Yamahata

TDX doesn't support dirty logging. Report that dirty logging isn't
supported so that the device model, for example QEMU, can handle it
properly.

Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 arch/x86/kvm/x86.c       |  5 +++++
 include/linux/kvm_host.h |  1 +
 virt/kvm/kvm_main.c      | 10 +++++++++-
 3 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ba4a9ce0ee80..24d9bfd4c582 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13862,6 +13862,11 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
 }
 EXPORT_SYMBOL_GPL(kvm_sev_es_string_io);
 
+bool kvm_arch_dirty_log_supported(struct kvm *kvm)
+{
+	return kvm->arch.vm_type != KVM_X86_TDX_VM;
+}
+
 bool kvm_arch_has_private_mem(struct kvm *kvm)
 {
 	return kvm->arch.vm_type == KVM_X86_TDX_VM;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index b658803ea2c7..a0b64308d240 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1477,6 +1477,7 @@ int kvm_arch_drop_vm(int usage_count);
 void kvm_arch_pre_destroy_vm(struct kvm *kvm);
 int kvm_arch_create_vm_debugfs(struct kvm *kvm);
 bool kvm_arch_has_private_mem(struct kvm *kvm);
+bool kvm_arch_dirty_log_supported(struct kvm *kvm);
 
 #ifndef __KVM_HAVE_ARCH_VM_ALLOC
 /*
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8c996f40b544..9f82b03a8118 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1869,10 +1869,18 @@ bool __weak kvm_arch_has_private_mem(struct kvm *kvm)
 	return false;
 }
 
+bool __weak kvm_arch_dirty_log_supported(struct kvm *kvm)
+{
+	return true;
+}
+
 static int check_memory_region_flags(struct kvm *kvm,
 				     const struct kvm_user_mem_region *mem)
 {
-	u32 valid_flags = KVM_MEM_LOG_DIRTY_PAGES;
+	u32 valid_flags = 0;
+
+	if (kvm_arch_dirty_log_supported(kvm))
+		valid_flags |= KVM_MEM_LOG_DIRTY_PAGES;
 
 #ifdef CONFIG_KVM_GENERIC_PRIVATE_MEM
 	if (kvm_arch_has_private_mem(kvm))

From patchwork Sun Oct 30 06:22:53 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12886
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 052/108] KVM: x86/tdp_mmu: Ignore unsupported mmu operation on private GFNs
Date: Sat, 29 Oct 2022 23:22:53 -0700
Message-Id: <32e2f5f567e1af3858e2896d705b66f90a908ff0.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Isaku Yamahata

Some KVM MMU operations (dirty page logging, page migration, page aging)
aren't supported for private GFNs (yet) with the first generation of TDX.
Silently return on unsupported TDX KVM MMU operations.

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/mmu/mmu.c     |  3 ++
 arch/x86/kvm/mmu/tdp_mmu.c | 73 +++++++++++++++++++++++++++++++++++---
 arch/x86/kvm/x86.c         |  3 ++
 3 files changed, 74 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 02e7b5cf3231..efc3b3f2dd12 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6588,6 +6588,9 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm,
 	for_each_rmap_spte(rmap_head, &iter, sptep) {
 		sp = sptep_to_sp(sptep);
 
+		/* Private page dirty logging is not supported yet. */
+		KVM_BUG_ON(is_private_sptep(sptep), kvm);
+
 		/*
 		 * We cannot do huge page mapping for indirect shadow pages,
 		 * which are found on the last rmap (level = 1) when not using
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 0e053b96444a..4b207ce83ffe 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -1469,7 +1469,8 @@ typedef bool (*tdp_handler_t)(struct kvm *kvm, struct tdp_iter *iter,
 
 static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm,
 						   struct kvm_gfn_range *range,
-						   tdp_handler_t handler)
+						   tdp_handler_t handler,
+						   bool only_shared)
 {
 	struct kvm_mmu_page *root;
 	struct tdp_iter iter;
@@ -1480,9 +1481,23 @@ static __always_inline bool kvm_tdp_mmu_handle_gfn(struct kvm *kvm,
 	 * into this helper allow blocking; it'd be dead, wasteful code.
 	 */
 	for_each_tdp_mmu_root(kvm, root, range->slot->as_id) {
+		gfn_t start;
+		gfn_t end;
+
+		if (only_shared && is_private_sp(root))
+			continue;
+
 		rcu_read_lock();
 
-		tdp_root_for_each_leaf_pte(iter, root, range->start, range->end)
+		/*
+		 * For TDX shared mapping, set GFN shared bit to the range,
+		 * so the handler() doesn't need to set it, to avoid duplicated
+		 * code in multiple handler()s.
+		 */
+		start = kvm_gfn_for_root(kvm, root, range->start);
+		end = kvm_gfn_for_root(kvm, root, range->end);
+
+		tdp_root_for_each_leaf_pte(iter, root, start, end)
 			ret |= handler(kvm, &iter, range);
 
 		rcu_read_unlock();
@@ -1526,7 +1541,12 @@ static bool age_gfn_range(struct kvm *kvm, struct tdp_iter *iter,
 
 bool kvm_tdp_mmu_age_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
 {
-	return kvm_tdp_mmu_handle_gfn(kvm, range, age_gfn_range);
+	/*
+	 * First TDX generation doesn't support clearing A bit for private
+	 * mapping, since there's no secure EPT API to support it. However
+	 * it's a legitimate request for TDX guest.
+	 */
+	return kvm_tdp_mmu_handle_gfn(kvm, range, age_gfn_range, true);
 }
 
 static bool test_age_gfn(struct kvm *kvm, struct tdp_iter *iter,
@@ -1537,7 +1557,8 @@ static bool test_age_gfn(struct kvm *kvm, struct tdp_iter *iter,
 
 bool kvm_tdp_mmu_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 {
-	return kvm_tdp_mmu_handle_gfn(kvm, range, test_age_gfn);
+	/* The first TDX generation doesn't support A bit. */
+	return kvm_tdp_mmu_handle_gfn(kvm, range, test_age_gfn, true);
 }
 
 static bool set_spte_gfn(struct kvm *kvm, struct tdp_iter *iter,
@@ -1582,8 +1603,11 @@ bool kvm_tdp_mmu_set_spte_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 	 * No need to handle the remote TLB flush under RCU protection, the
 	 * target SPTE _must_ be a leaf SPTE, i.e. cannot result in freeing a
 	 * shadow page. See the WARN on pfn_changed in __handle_changed_spte().
+	 *
+	 * .change_pte() callback should not happen for private page, because
+	 * for now TDX private pages are pinned during VM's life time.
 	 */
-	return kvm_tdp_mmu_handle_gfn(kvm, range, set_spte_gfn);
+	return kvm_tdp_mmu_handle_gfn(kvm, range, set_spte_gfn, true);
 }
 
 /*
@@ -1637,6 +1661,14 @@ bool kvm_tdp_mmu_wrprot_slot(struct kvm *kvm,
 
 	lockdep_assert_held_read(&kvm->mmu_lock);
 
+	/*
+	 * Because first TDX generation doesn't support write protecting private
+	 * mappings and kvm_arch_dirty_log_supported(kvm) = false, it's a bug
+	 * to reach here for guest TD.
+	 */
+	if (WARN_ON_ONCE(!kvm_arch_dirty_log_supported(kvm)))
+		return false;
+
 	for_each_valid_tdp_mmu_root_yield_safe(kvm, root, slot->as_id, true)
 		spte_set |= wrprot_gfn_range(kvm, root, slot->base_gfn,
 			     slot->base_gfn + slot->npages, min_level);
@@ -1902,6 +1934,14 @@ bool kvm_tdp_mmu_clear_dirty_slot(struct kvm *kvm,
 
 	lockdep_assert_held_read(&kvm->mmu_lock);
 
+	/*
+	 * First TDX generation doesn't support clearing dirty bit,
+	 * since there's no secure EPT API to support it. It is a
+	 * bug to reach here for TDX guest.
+	 */
+	if (WARN_ON_ONCE(!kvm_arch_dirty_log_supported(kvm)))
+		return false;
+
 	for_each_valid_tdp_mmu_root_yield_safe(kvm, root, slot->as_id, true)
 		spte_set |= clear_dirty_gfn_range(kvm, root, slot->base_gfn,
 				slot->base_gfn + slot->npages);
@@ -1968,6 +2008,13 @@ void kvm_tdp_mmu_clear_dirty_pt_masked(struct kvm *kvm,
 	struct kvm_mmu_page *root;
 
 	lockdep_assert_held_write(&kvm->mmu_lock);
+	/*
+	 * First TDX generation doesn't support clearing dirty bit,
+	 * since there's no secure EPT API to support it. For now silently
+	 * ignore KVM_CLEAR_DIRTY_LOG.
+	 */
+	if (!kvm_arch_dirty_log_supported(kvm))
+		return;
 	for_each_tdp_mmu_root(kvm, root, slot->as_id)
 		clear_dirty_pt_masked(kvm, root, gfn, mask, wrprot);
 }
@@ -2034,6 +2081,13 @@ void kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm,
 
 	lockdep_assert_held_read(&kvm->mmu_lock);
 
+	/*
+	 * This should only be reachable when dirty-log is supported. It's a
+	 * bug to reach here.
+	 */
+	if (WARN_ON_ONCE(!kvm_arch_dirty_log_supported(kvm)))
+		return;
+
 	for_each_valid_tdp_mmu_root_yield_safe(kvm, root, slot->as_id, true)
 		zap_collapsible_spte_range(kvm, root, slot);
 }
@@ -2087,6 +2141,15 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm,
 	bool spte_set = false;
 
 	lockdep_assert_held_write(&kvm->mmu_lock);
+
+	/*
+	 * First TDX generation doesn't support write protecting private
+	 * mappings, silently ignore the request. KVM_GET_DIRTY_LOG etc
+	 * can reach here, no warning.
+	 */
+	if (!kvm_arch_dirty_log_supported(kvm))
+		return false;
+
 	for_each_tdp_mmu_root(kvm, root, slot->as_id)
 		spte_set |= write_protect_gfn(kvm, root, gfn, min_level);
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 24d9bfd4c582..3868605462ed 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12897,6 +12897,9 @@ static void kvm_mmu_slot_apply_flags(struct kvm *kvm,
 	u32 new_flags = new ? new->flags : 0;
 	bool log_dirty_pages = new_flags & KVM_MEM_LOG_DIRTY_PAGES;
 
+	if (!kvm_arch_dirty_log_supported(kvm) && log_dirty_pages)
+		return;
+
 	/*
 	 * Update CPU dirty logging if dirty logging is being toggled. This
 	 * applies to all operations.

From patchwork Sun Oct 30 06:22:54 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12872
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Sean Christopherson, Kai Huang
Subject: [PATCH v10 053/108] KVM: VMX: Split out guts of EPT violation to common/exposed function
Date: Sat, 29 Oct 2022 23:22:54 -0700
Message-Id:
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Sean Christopherson

TDX EPT violation handling differs from VMX only in how the GPA and the
exit qualification are retrieved. To share the EPT violation handling
code, split out the guts of the EPT violation handler so that both the
VMX and TDX exit handlers can call it after retrieving the GPA and exit
qualification.

Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
Reviewed-by: Kai Huang
---
 arch/x86/kvm/vmx/common.h | 33 +++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/vmx.c    | 25 +++----------------------
 2 files changed, 36 insertions(+), 22 deletions(-)
 create mode 100644 arch/x86/kvm/vmx/common.h

diff --git a/arch/x86/kvm/vmx/common.h b/arch/x86/kvm/vmx/common.h
new file mode 100644
index 000000000000..235908f3e044
--- /dev/null
+++ b/arch/x86/kvm/vmx/common.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __KVM_X86_VMX_COMMON_H
+#define __KVM_X86_VMX_COMMON_H
+
+#include
+
+#include "mmu.h"
+
+static inline int __vmx_handle_ept_violation(struct kvm_vcpu *vcpu, gpa_t gpa,
+					     unsigned long exit_qualification)
+{
+	u64 error_code;
+
+	/* Is it a read fault? */
+	error_code = (exit_qualification & EPT_VIOLATION_ACC_READ)
+		     ? PFERR_USER_MASK : 0;
+	/* Is it a write fault? */
+	error_code |= (exit_qualification & EPT_VIOLATION_ACC_WRITE)
+		      ? PFERR_WRITE_MASK : 0;
+	/* Is it a fetch fault? */
+	error_code |= (exit_qualification & EPT_VIOLATION_ACC_INSTR)
+		      ? PFERR_FETCH_MASK : 0;
+	/* ept page table entry is present? */
+	error_code |= (exit_qualification & EPT_VIOLATION_RWX_MASK)
+		      ? PFERR_PRESENT_MASK : 0;
+
+	error_code |= (exit_qualification & EPT_VIOLATION_GVA_TRANSLATED) != 0 ?
+	       PFERR_GUEST_FINAL_MASK : PFERR_GUEST_PAGE_MASK;
+
+	return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);
+}
+
+#endif /* __KVM_X86_VMX_COMMON_H */
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dd3fde9d3c32..2ff7af959e30 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -50,6 +50,7 @@
 #include
 
 #include "capabilities.h"
+#include "common.h"
 #include "cpuid.h"
 #include "evmcs.h"
 #include "hyperv.h"
@@ -5702,11 +5703,8 @@ static int handle_task_switch(struct kvm_vcpu *vcpu)
 
 static int handle_ept_violation(struct kvm_vcpu *vcpu)
 {
-	unsigned long exit_qualification;
+	unsigned long exit_qualification = vmx_get_exit_qual(vcpu);
 	gpa_t gpa;
-	u64 error_code;
-
-	exit_qualification = vmx_get_exit_qual(vcpu);
 
 	/*
 	 * EPT violation happened while executing iret from NMI,
@@ -5721,23 +5719,6 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
 
 	gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
 	trace_kvm_page_fault(vcpu, gpa, exit_qualification);
-
-	/* Is it a read fault? */
-	error_code = (exit_qualification & EPT_VIOLATION_ACC_READ)
-		     ? PFERR_USER_MASK : 0;
-	/* Is it a write fault? */
-	error_code |= (exit_qualification & EPT_VIOLATION_ACC_WRITE)
-		      ? PFERR_WRITE_MASK : 0;
-	/* Is it a fetch fault? */
-	error_code |= (exit_qualification & EPT_VIOLATION_ACC_INSTR)
-		      ? PFERR_FETCH_MASK : 0;
-	/* ept page table entry is present? */
-	error_code |= (exit_qualification & EPT_VIOLATION_RWX_MASK)
-		      ? PFERR_PRESENT_MASK : 0;
-
-	error_code |= (exit_qualification & EPT_VIOLATION_GVA_TRANSLATED) != 0 ?
-	       PFERR_GUEST_FINAL_MASK : PFERR_GUEST_PAGE_MASK;
-
 	vcpu->arch.exit_qualification = exit_qualification;
 
 	/*
@@ -5751,7 +5732,7 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
 	if (unlikely(allow_smaller_maxphyaddr && kvm_vcpu_is_illegal_gpa(vcpu, gpa)))
 		return kvm_emulate_instruction(vcpu, 0);
 
-	return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0);
+	return __vmx_handle_ept_violation(vcpu, gpa, exit_qualification);
 }
 
 static int handle_ept_misconfig(struct kvm_vcpu *vcpu)

From patchwork Sun Oct 30 06:22:55 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12874
references:mime-version:content-transfer-encoding; bh=uDvd5aLwQcJPBBxAxWbJDQm02RoATEZ6Bg3FqxNLsrI=; b=UaqGt3yasWfCMwcxtunmOtau4fwckFAoJF4N1ZzztxFPA8WcWhL9VRiu O7OBmXiolbYxZVH4rBq63JMYS9tTcxks89eRaO/q13Eo7RfZvT9oXrx8A XFVQPk8+KrFh+gYPtZ1NQxJIJ+RM6AS3v9jcR+u//yXrL58Tw2eLPdwia gsVdAMFZwOa8Iv6oEYVWSrMWEGWGVIjtokteNXVeh2Njl4sllGTCS6QRe wZRGYBoGPNP25bjE3SfBPdXM49TLffUwvJgrpyTEFpD0JD5fxF18KQCNt eaSwL68/IJN9ZbmZRWZ5ja/a8JHv5h8QTv98ATAe/YgP81bkvGE3SLQEb w==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037167" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037167" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:06 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393021" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393021" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:06 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Sean Christopherson Subject: [PATCH v10 054/108] KVM: VMX: Move setting of EPT MMU masks to common VT-x code Date: Sat, 29 Oct 2022 23:22:55 -0700 Message-Id: <251f22af488caa17dcdbb0227e7fd0b7f61f3f54.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: 
=?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092973043681880?=
X-GMAIL-MSGID: =?utf-8?q?1748092973043681880?=

From: Sean Christopherson

EPT MMU masks are shared by VMX and TDX. The values need to be
initialized in common code before both the VMX-specific and the
TDX-specific initialization code.

Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/main.c | 5 +++++
 arch/x86/kvm/vmx/vmx.c | 4 ----
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 0d5ca65e9997..9fb6eb626a9a 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -4,6 +4,7 @@
 #include "x86_ops.h"
 #include "vmx.h"
 #include "nested.h"
+#include "mmu.h"
 #include "pmu.h"
 #include "tdx.h"
 
@@ -26,6 +27,10 @@ static __init int vt_hardware_setup(void)
 
 	enable_tdx = enable_tdx && !tdx_hardware_setup(&vt_x86_ops);
 
+	if (enable_ept)
+		kvm_mmu_set_ept_masks(enable_ept_ad_bits,
+				      cpu_has_vmx_ept_execute_only());
+
 	return 0;
 }
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2ff7af959e30..b5c3652c3cc4 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -8256,10 +8256,6 @@ __init int vmx_hardware_setup(void)
 
 	set_bit(0, vmx_vpid_bitmap); /* 0 is reserved for host */
 
-	if (enable_ept)
-		kvm_mmu_set_ept_masks(enable_ept_ad_bits,
-				      cpu_has_vmx_ept_execute_only());
-
 	/*
 	 * Setup shadow_me_value/shadow_me_mask to include MKTME KeyID
 	 * bits to shadow_zero_check.
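
As a rough illustration of what the patch above changes, the standalone C
sketch below programs the shared EPT masks once from a common setup function
instead of from the VMX-only path. Every name in it (vt_setup_sketch,
vmx_setup_sketch, tdx_setup_sketch, set_ept_masks_sketch, struct
ept_masks_sketch and the sketch_* knobs) is a hypothetical stand-in and not a
kernel symbol. The stubs only record what was called, and the assumption that
the VMX setup runs first is taken from the general shape of
vt_hardware_setup(), not from the hunks shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared MMU mask state; not a kernel type. */
struct ept_masks_sketch {
	bool configured;	/* true once the common code has run */
	bool ad_bits;		/* accessed/dirty bits requested */
	bool exec_only;		/* execute-only mappings requested */
	int set_count;		/* how many times the masks were programmed */
};

static struct ept_masks_sketch ept_masks;

/* Hypothetical knobs playing the roles of enable_ept and friends. */
static bool sketch_enable_ept = true;
static bool sketch_enable_ept_ad_bits = true;
static bool sketch_enable_tdx = true;

/* Stub: pretend the CPU supports execute-only EPT mappings. */
static bool sketch_cpu_has_ept_execute_only(void)
{
	return true;
}

/* Stub for the role of kvm_mmu_set_ept_masks(): just record the call. */
static void set_ept_masks_sketch(bool ad_bits, bool exec_only)
{
	ept_masks.configured = true;
	ept_masks.ad_bits = ad_bits;
	ept_masks.exec_only = exec_only;
	ept_masks.set_count++;
}

/* Backend-specific setup stubs; neither touches the shared masks. */
static int vmx_setup_sketch(void) { return 0; }
static int tdx_setup_sketch(void) { return 0; }

/* Common setup: run the backends, then program the shared masks once. */
int vt_setup_sketch(void)
{
	int ret = vmx_setup_sketch();

	if (ret)
		return ret;

	/* TDX stays enabled only if its own setup succeeds. */
	sketch_enable_tdx = sketch_enable_tdx && !tdx_setup_sketch();

	if (sketch_enable_ept)
		set_ept_masks_sketch(sketch_enable_ept_ad_bits,
				     sketch_cpu_has_ept_execute_only());

	return 0;
}
```

Calling vt_setup_sketch() once and then checking ept_masks.set_count with
assert() shows the point of the move: the masks are set in exactly one place
no matter which backend ends up enabled.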
From patchwork Sun Oct 30 06:22:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12875 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666285wru; Sat, 29 Oct 2022 23:29:39 -0700 (PDT) X-Google-Smtp-Source: AMsMyM48CWhTMRjKGRwrdYDrxzYNf5edpjFubxQmbgjqX74RK0v1eqt1yFH+aEGrIYsRVUuq6eKI X-Received: by 2002:a17:90b:4a92:b0:213:2421:5f38 with SMTP id lp18-20020a17090b4a9200b0021324215f38mr8466683pjb.10.1667111379669; Sat, 29 Oct 2022 23:29:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111379; cv=none; d=google.com; s=arc-20160816; b=pZUrpD/n9DDerPV7JpbbsKbicgZyBQ0C+oSH9c1VbQIhcSfnvNTgA7D/sBiDzQDIUb I6emLKa5qF8doABPGzAs4Df/ZDeeucymQOug5hOreJylxDch+Lcc8cUDuCa6eViqdE8F bN8FEW4cX449bMX8Ego+A3Oqdh8kng6trsXRatU8oJyO+7/xXbk+cqLG8sjiu19swtQe QoPmOcxALAtIGmrXmYR41cIXpq/cJHGvUYEbKkNOC4Z4xk/Kz6JAbiB0uH+thm60p7Pk 7UukxxUnV3lkpOrSiIcblXsAvt0YMGaLDRGVENUpMYB50eqBpU5azPz+dpbVnrSr9HQ8 8PCQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=pllVE77nBISz1RInXWERmaWgYjBWfa5GD/4CuLWu+Ak=; b=gBflan1HcqKthXdLHfC3BjYcenkkvLjuvCHtZZl8/6No4dPQaKZilI85wvFcMu5VHc oK7pFqAvTIg7GD4+1IxhhkDqR7sja7i7fYIygeUxVA9KXZpnZO9PPAq/v4DiKYFou+Wz BAZ/kN65+mNzf+25WSwgOHkp3290jBRu4wMFKiIgH8D3YCnSW55DA5C19UYmKvSxwGZ1 xj1GMB+EobJiJTaM+6ZAEhQ2Uhnu0lgoH5FfMBRDWbLQmiKOfAWsFb0cbE/9Wc1Y3fcE bdNeDzGdiqA+YXdSTHj5sG7HibpM8aKcCQnrFfIMeaWTz13O4cxZ+2SNGltZLhbHECyG tL5A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=DcjTsKRN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) 
header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id k11-20020aa788cb000000b005631e47ea88si5013296pff.178.2022.10.29.23.29.27; Sat, 29 Oct 2022 23:29:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=DcjTsKRN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230518AbiJ3G2f (ORCPT + 99 others); Sun, 30 Oct 2022 02:28:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229927AbiJ3GZB (ORCPT ); Sun, 30 Oct 2022 02:25:01 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 55476EE; Sat, 29 Oct 2022 23:24:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111053; x=1698647053; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=szdQR9xxXHuaOtlsKWph90ODmbm52oomOuydVveFV2E=; b=DcjTsKRNsIyq77HM19WJrVQV6WgaWwesTV5qn37+2SvpHIJIRsk2fayw M1ikJvAn6vvMLvgx0SSXPvbe7t1gGSEiBj7KHXHKrEaLbPqXh36T5+S0B ud6A2Zz9vwBtk2hB/pwL8bRyfvZcCIjvZK1xUXgIfCCOwbP7rjK9u8nbv KxDWv7fPKi5ulRekvigPhlO3Uzcj679XHPm3sSYtoOylCurIHCrK0PbXh sGfJIQAcOwzao8ZxS6LbYTnP7+PwtylrH0znZEtoTV0edmIBmqSW9VIED kT51lRLH8kFpN5iWdoeO5i4JlJG9vrXCitZCCoVqFrrXrZd/6ou4bpZAM w==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037168" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; 
d="scan'208";a="395037168"
Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:06 -0700
X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393024"
X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393024"
Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:06 -0700
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Sean Christopherson
Subject: [PATCH v10 055/108] KVM: TDX: Add load_mmu_pgd method for TDX
Date: Sat, 29 Oct 2022 23:22:56 -0700
Message-Id:
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
MIME-Version: 1.0
X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092981886829277?=
X-GMAIL-MSGID: =?utf-8?q?1748092981886829277?=

From: Sean Christopherson

For virtual I/O, the guest TD shares guest pages with the VMM without
encryption. A shared EPT is used to map those guest pages in an
unprotected way.

Add the VMCS field encoding for the shared EPTP, which will be used by
TDX to have separate EPT walks for private GPAs (existing EPTP) versus
shared GPAs (new shared EPTP).

Set the shared EPT pointer value for the TDX guest to initialize the
TDX MMU.

Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 arch/x86/include/asm/vmx.h | 1 +
 arch/x86/kvm/vmx/main.c | 11 ++++++++++-
 arch/x86/kvm/vmx/tdx.c | 5 +++++
 arch/x86/kvm/vmx/x86_ops.h | 4 ++++
 4 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 752d53652007..1205018b5b6b 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -234,6 +234,7 @@ enum vmcs_field {
 	TSC_MULTIPLIER_HIGH = 0x00002033,
 	TERTIARY_VM_EXEC_CONTROL = 0x00002034,
 	TERTIARY_VM_EXEC_CONTROL_HIGH = 0x00002035,
+	SHARED_EPT_POINTER = 0x0000203C,
 	PID_POINTER_TABLE = 0x00002042,
 	PID_POINTER_TABLE_HIGH = 0x00002043,
 	GUEST_PHYSICAL_ADDRESS = 0x00002400,
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 9fb6eb626a9a..974e00fd3260 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -100,6 +100,15 @@ static void vt_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	return vmx_vcpu_reset(vcpu, init_event);
 }
 
+static void vt_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa,
+			    int pgd_level)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_load_mmu_pgd(vcpu, root_hpa, pgd_level);
+
+	vmx_load_mmu_pgd(vcpu, root_hpa, pgd_level);
+}
+
 static int vt_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 {
 	if (!is_td(kvm))
@@ -219,7 +228,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.write_tsc_offset = vmx_write_tsc_offset,
 	.write_tsc_multiplier = vmx_write_tsc_multiplier,
 
-	.load_mmu_pgd = vmx_load_mmu_pgd,
+	.load_mmu_pgd = vt_load_mmu_pgd,
 
 	.check_intercept = vmx_check_intercept,
 	.handle_exit_irqoff = vmx_handle_exit_irqoff,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index e80f9cf79b2e..6328eaa65126 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -394,6 +394,11 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	vcpu->kvm->vm_bugged = true;
 }
 
+void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level)
+{
+	td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa & PAGE_MASK);
+}
+
 int tdx_dev_ioctl(void __user *argp)
 {
 	struct kvm_tdx_capabilities __user *user_caps;
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index fda1b2eaebc6..dd05991afbad 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -147,6 +147,8 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
 
 int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
+
+void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int root_level);
 #else
 static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return 0; }
 static inline bool tdx_is_vm_type_supported(unsigned long type) { return false; }
@@ -165,6 +167,8 @@ static inline void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) {}
 
 static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; }
 static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; }
+
+static inline void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int root_level) {}
 #endif
 
 #endif /* __KVM_X86_VMX_X86_OPS_H */

From patchwork Sun Oct 30 06:22:57 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12883
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666361wru; Sat, 29 Oct 2022 23:29:56 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM4AMst/nIpgoVx/vg5VCIIOgjr7Ydsuw/KNS5wuFQykQBDCdQbeNKWKGaML3bL+0iN16fPp
X-Received: by 2002:a17:902:e743:b0:187:1d0:6b with SMTP id p3-20020a170902e74300b0018701d0006bmr7978826plf.119.1667111395960; Sat, 29 Oct 2022 23:29:55 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1667111395; cv=none; d=google.com; s=arc-20160816;
b=lxGDdi+8/FSPA1xGUtTzRictyyTM+JC3FCqDH3K531zXRpm5IxjQm7QC5PXsWDODZJ tGbvf6gWsdCkjafXNRVGBj/+Eb4IqVviaJuY/J8XG6B/MDgJiFSfd1NlbjrRZGg5iNT5 071c/KUqXoFTSmdX71adFdQZOsb9TZsyzggl3akkrWg3d3t2Yy2fzn0sdEinaMsHME3A WYPu/recahOKwuH8ouZddBHKBja30XRcn8TywqA4whRuoGtej6Jh+EApNspe1uNsWbYp JKbdtMu9jiuR4nV9Qos2eVKxPI9BtLLWXJwxC8DCgXdRHwHWVVOBMDC4GIb+npjGO5uG 54LA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=me9bWRiTZ5mTreuBxbpHPwy2YoWGfS4FgLnMaxiHSls=; b=BSgmkXgsm6E5fbP06CpShb6UpTUXpDyasTt1+rVfo0KpEggxbMZt74pWp6fBX6wwBX ZonAsGmQiBdxdYbqXnmUkIIwHeMQD+eklWnOJTD/nMKagWbhYWADmt+skRhey1m8HioM AzQqRTSMc1Oy7zYic1zjt/SFMjBbuWYdRENPxJ6mYhu+S+4bS2vktuMRHYTAn+CnlT4N D5KXLJcQ+Z+0CWQqm/W7GK7LhWWgFq2g0kY9HX5oBE5FwsMa4p55dIUCltLPe1Ca2eRZ gzsldyCP8/Cgj1xRTMui9GbwNJ2r7EjtHvSY2akokl/lnqmTmEzhY8Hyg4R2672waMMG rj7A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=PTtuyg9C; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id q16-20020a632a10000000b0046f56372127si4199986pgq.468.2022.10.29.23.29.43; Sat, 29 Oct 2022 23:29:55 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=PTtuyg9C; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230127AbiJ3G2l (ORCPT + 99 others); Sun, 30 Oct 2022 02:28:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230084AbiJ3GZF (ORCPT ); Sun, 30 Oct 2022 02:25:05 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 91F7B1DD; Sat, 29 Oct 2022 23:24:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111053; x=1698647053; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=elmKS02WiygU5IrlA++TZV5oI8cDn28t+pY/4QpTrh0=; b=PTtuyg9CnidcTcQ70FemTdfk9KIJyp1nl+N8JoRCWwgWMjH6Aif+pY8x axuIwHsPh0q1UaFnkpwqVGl/wbaMVvzEUoAD6bgdQM4yE8gIc0YyiVXPD HCS6SUnkF/bC4p0xscUYSPCX7h7IBrPshxJ+uEQ7TLAInksCkJaRKDHgN WBWdlIdhAZ4SZMMwndjGdEigum6Xlyv9xIHVFAPmrW7keZcgRaCgg4Iiy JN077KBuuXFqxm7rfqubZIUtWo6QhPiTTV4t/6bnsz+jiBhRljdii/RM9 YufRrkZLGwmk6ok7Zuuu9owDXU5jnLHnq2A7w7IHMab5vuGjkVm55hL63 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037169" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037169" Received: from fmsmga006.fm.intel.com 
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:06 -0700
X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393027"
X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393027"
Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:06 -0700
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack
Subject: [PATCH v10 056/108] KVM: TDX: don't request KVM_REQ_APIC_PAGE_RELOAD
Date: Sat, 29 Oct 2022 23:22:57 -0700
Message-Id:
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
MIME-Version: 1.0
X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092999117285416?=
X-GMAIL-MSGID: =?utf-8?q?1748092999117285416?=

From: Isaku Yamahata

TDX doesn't need the APIC page, which depends on vAPIC, and the callback
for it is WARN_ON_ONCE(is_tdx). To avoid the unnecessary overhead and
the WARN_ON_ONCE(), skip requesting KVM_REQ_APIC_PAGE_RELOAD for a TD.

  WARNING: arch/x86/kvm/vmx/main.c:696 vt_set_apic_access_page_addr+0x3c/0x50 [kvm_intel]
  RIP: 0010:vt_set_apic_access_page_addr+0x3c/0x50 [kvm_intel]
  Call Trace:
   vcpu_enter_guest+0x145d/0x24d0 [kvm]
   kvm_arch_vcpu_ioctl_run+0x25d/0xcc0 [kvm]
   kvm_vcpu_ioctl+0x414/0xa30 [kvm]
   __x64_sys_ioctl+0xc0/0x100
   do_syscall_64+0x39/0xc0
   entry_SYSCALL_64_after_hwframe+0x44/0xae

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/x86.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3868605462ed..5dadd0f9a10e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10487,7 +10487,9 @@ void kvm_arch_mmu_notifier_invalidate_range(struct kvm *kvm,
 	 * Update it when it becomes invalid.
 	 */
 	apic_address = gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT);
-	if (start <= apic_address && apic_address < end)
+	/* TDX doesn't need APIC page. */
+	if (kvm->arch.vm_type != KVM_X86_TDX_VM &&
+	    start <= apic_address && apic_address < end)
 		kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);
 }
 

From patchwork Sun Oct 30 06:22:58 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12879
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666296wru; Sat, 29 Oct 2022 23:29:43 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM4WrPKqhnEIghlmdgb/qFxFjAClyH5BQlY5l7DhdDReCBG4SWtvY3Fs6mW+XBfu+c8uYwYU
X-Received: by 2002:a17:903:2344:b0:186:e357:f3ac with SMTP id c4-20020a170903234400b00186e357f3acmr8003440plh.110.1667111383104; Sat, 29 Oct 2022 23:29:43 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1667111383; cv=none; d=google.com; s=arc-20160816;
 b=DlZDYpEHGVQzpTFRpgDnW9yeDjTT+NFxNuuZOhw7dctsmk36QoDCZOxiZU4T0MIPSY YO8qIg+IXRvDIIic+D0IVZSRV+UdIbc7lbRrTvQ7U6lG/Y0KEZ66g0/7kROIPq19QDN9 QjOQbBBq2V2GeDAnPLtm6lT1vQaG9SyUJhGGLbMVABMj5LrRU1+K8FoVIzwpnhAnJM97
zI41RSr/RdovJxeIb9LUIsAtYkiXAn7eEqFwrEhxYJVZ0DrAZxcEzklth1v0pUNFznVK z+TJ6YX0CSygEchbv05MORU9nSYQ874joTxX82kow+NTj8n9US6pWkRbd5QSCC+lgjsy DYNA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=X8nC6z9+OhxBMRRYcBECIigSERSZQMaXnnuJunL8Kek=; b=Tmy4GSqtrKG0mb/MFtRTGG0LYY6yoMtGHdCA8PXwv6FO0IHr82XOrJvzcKozPSlqyk ab6nwdGQZ+LLwxkyPg7Hgo43GjhCCR9qNMaswhziCZLHnC3XsX05+VHHMT2XezZ4SW2a 6bXVmz2aQwpFxafQCE6qs7YmUCng2Y+I6iU4JdgB12jKraNyowh6n5TXwOOCDEBZHpds vcgyT77BOZCO6jr1sWFiDFOzgAEpr1fKgmfMfUS+W8yOHnj37F/1KGbzTZkwzpuVCoZg 8ih9BSgY1WSkPeD/nClK0kDhIm+OqvP3Dt7DY7T86EotRe8MYj6/qyUD8ioXLep1Nq/c qKjw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Bw4uFQ7B; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id m5-20020a170902e40500b00186e8c3782esi3496021ple.386.2022.10.29.23.29.31; Sat, 29 Oct 2022 23:29:43 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Bw4uFQ7B; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231124AbiJ3G2n (ORCPT + 99 others); Sun, 30 Oct 2022 02:28:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49172 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230116AbiJ3GZN (ORCPT ); Sun, 30 Oct 2022 02:25:13 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 047F41EE; Sat, 29 Oct 2022 23:24:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111054; x=1698647054; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ZakXX3Wx8TmNedmlc63R5sE7OXgzgan1HAW4wBcmf/0=; b=Bw4uFQ7BS2IhUVisxC0dzWS2svNUw5wlmjNKsIRXUOzoIKIRw4a6Ts3D 5utLEttVLjhWu4zfbW0d6d753wzyf+SXPRMRJb7P2K5z+hFHBZSfr26QD /XGNmi1x6JiYC8SUjgOzfri2+Epso+iQBj7qZxfJXVgrIwYkq68oMAX8i tS4vuW5ZIV6hW8Kwuv4bylzrdX2L0lQI5U+OIKKKrpUZXKRYg8sVPs51z 1BIYwyFClZHpXKvZba05cKyGdqciqfdjiWWiUCMAs+AbEtYDPty7y27hX M6y503mxcVMNX5pJcwpQNYDkCMl6oS4JFYMk4Qqg0YEfjsUkz+vAxKVYE w==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037170" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037170" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) 
by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:06 -0700
X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393030"
X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393030"
Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:06 -0700
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack
Subject: [PATCH v10 057/108] KVM: x86/VMX: introduce vmx tlb_remote_flush and tlb_remote_flush_with_range
Date: Sat, 29 Oct 2022 23:22:58 -0700
Message-Id: <009a08eb9b20f1c23d37e06a5958eafd05466249.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To:
References:
MIME-Version: 1.0
X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1748092985968695037?=
X-GMAIL-MSGID: =?utf-8?q?1748092985968695037?=

From: Isaku Yamahata

This is preparation for TDX to define its own tlb_remote_flush and
tlb_remote_flush_with_range. Currently the VMX code leaves
tlb_remote_flush and tlb_remote_flush_with_range as NULL by default and
sets them to non-NULL methods only in the nested Hyper-V guest case.

To let the TDX code override those two methods consistently with the
other methods, define vmx_tlb_remote_flush and
vmx_tlb_remote_flush_with_range as nops and call the Hyper-V code only
in the nested Hyper-V guest case.

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/kvm_onhyperv.c | 5 ++++-
 arch/x86/kvm/kvm_onhyperv.h | 1 +
 arch/x86/kvm/mmu/mmu.c | 2 +-
 arch/x86/kvm/svm/svm_onhyperv.h | 1 +
 arch/x86/kvm/vmx/main.c | 2 ++
 arch/x86/kvm/vmx/vmx.c | 34 ++++++++++++++++++++++++++++-----
 arch/x86/kvm/vmx/x86_ops.h | 3 +++
 7 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/kvm_onhyperv.c b/arch/x86/kvm/kvm_onhyperv.c
index ee4f696a0782..d43518da1c0e 100644
--- a/arch/x86/kvm/kvm_onhyperv.c
+++ b/arch/x86/kvm/kvm_onhyperv.c
@@ -93,11 +93,14 @@ int hv_remote_flush_tlb(struct kvm *kvm)
 }
 EXPORT_SYMBOL_GPL(hv_remote_flush_tlb);
 
+bool hv_use_remote_flush_tlb __ro_after_init;
+EXPORT_SYMBOL_GPL(hv_use_remote_flush_tlb);
+
 void hv_track_root_tdp(struct kvm_vcpu *vcpu, hpa_t root_tdp)
 {
 	struct kvm_arch *kvm_arch = &vcpu->kvm->arch;
 
-	if (kvm_x86_ops.tlb_remote_flush == hv_remote_flush_tlb) {
+	if (hv_use_remote_flush_tlb) {
 		spin_lock(&kvm_arch->hv_root_tdp_lock);
 		vcpu->arch.hv_root_tdp = root_tdp;
 		if (root_tdp != kvm_arch->hv_root_tdp)
diff --git a/arch/x86/kvm/kvm_onhyperv.h b/arch/x86/kvm/kvm_onhyperv.h
index 287e98ef9df3..9a07a34666fb 100644
--- a/arch/x86/kvm/kvm_onhyperv.h
+++ b/arch/x86/kvm/kvm_onhyperv.h
@@ -10,6 +10,7 @@
 int hv_remote_flush_tlb_with_range(struct kvm *kvm,
 		struct kvm_tlb_range *range);
 int hv_remote_flush_tlb(struct kvm *kvm);
+extern bool hv_use_remote_flush_tlb __ro_after_init;
 void hv_track_root_tdp(struct kvm_vcpu *vcpu, hpa_t root_tdp);
 #else /* !CONFIG_HYPERV */
 static inline void hv_track_root_tdp(struct kvm_vcpu *vcpu, hpa_t root_tdp)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index efc3b3f2dd12..08923b64dcc8 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -242,7 +242,7 @@ static void kvm_flush_remote_tlbs_with_range(struct kvm *kvm,
 {
 	int ret = -ENOTSUPP;
 
-	if (range && kvm_x86_ops.tlb_remote_flush_with_range)
+	if (range && kvm_available_flush_tlb_with_range())
 		ret = static_call(kvm_x86_tlb_remote_flush_with_range)(kvm, range);
 
 	if (ret)
diff --git a/arch/x86/kvm/svm/svm_onhyperv.h b/arch/x86/kvm/svm/svm_onhyperv.h
index e2fc59380465..b3cd61c62305 100644
--- a/arch/x86/kvm/svm/svm_onhyperv.h
+++ b/arch/x86/kvm/svm/svm_onhyperv.h
@@ -36,6 +36,7 @@ static inline void svm_hv_hardware_setup(void)
 		svm_x86_ops.tlb_remote_flush = hv_remote_flush_tlb;
 		svm_x86_ops.tlb_remote_flush_with_range =
 				hv_remote_flush_tlb_with_range;
+		hv_use_remote_flush_tlb = true;
 	}
 
 	if (ms_hyperv.nested_features & HV_X64_NESTED_DIRECT_FLUSH) {
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 974e00fd3260..fe9583b640fb 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -178,6 +178,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 
 	.flush_tlb_all = vmx_flush_tlb_all,
 	.flush_tlb_current = vmx_flush_tlb_current,
+	.tlb_remote_flush = vmx_tlb_remote_flush,
+	.tlb_remote_flush_with_range = vmx_tlb_remote_flush_with_range,
 	.flush_tlb_gva = vmx_flush_tlb_gva,
 	.flush_tlb_guest = vmx_flush_tlb_guest,
 
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b5c3652c3cc4..f2887dbde700 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -3126,6 +3126,33 @@ void vmx_flush_tlb_current(struct kvm_vcpu *vcpu)
 		vpid_sync_context(vmx_get_current_vpid(vcpu));
 }
 
+int vmx_tlb_remote_flush(struct kvm *kvm)
+{
+#if IS_ENABLED(CONFIG_HYPERV)
+	if (hv_use_remote_flush_tlb)
+		return hv_remote_flush_tlb(kvm);
+#endif
+	/*
+	 * fallback to KVM_REQ_TLB_FLUSH.
+	 * See kvm_arch_flush_remote_tlb() and kvm_flush_remote_tlbs().
+	 */
+	return -EOPNOTSUPP;
+}
+
+int vmx_tlb_remote_flush_with_range(struct kvm *kvm,
+				    struct kvm_tlb_range *range)
+{
+#if IS_ENABLED(CONFIG_HYPERV)
+	if (hv_use_remote_flush_tlb)
+		return hv_remote_flush_tlb_with_range(kvm, range);
+#endif
+	/*
+	 * fallback to tlb_remote_flush. See
+	 * kvm_flush_remote_tlbs_with_range()
+	 */
+	return -EOPNOTSUPP;
+}
+
 void vmx_flush_tlb_gva(struct kvm_vcpu *vcpu, gva_t addr)
 {
 	/*
@@ -8223,11 +8250,8 @@ __init int vmx_hardware_setup(void)
 
 #if IS_ENABLED(CONFIG_HYPERV)
 	if (ms_hyperv.nested_features & HV_X64_NESTED_GUEST_MAPPING_FLUSH
-	    && enable_ept) {
-		vt_x86_ops.tlb_remote_flush = hv_remote_flush_tlb;
-		vt_x86_ops.tlb_remote_flush_with_range =
-				hv_remote_flush_tlb_with_range;
-	}
+	    && enable_ept)
+		hv_use_remote_flush_tlb = true;
 #endif
 
 	if (!cpu_has_vmx_ple()) {
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index dd05991afbad..cf7e0c6c65ac 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -92,6 +92,9 @@ void vmx_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags);
 bool vmx_get_if_flag(struct kvm_vcpu *vcpu);
 void vmx_flush_tlb_all(struct kvm_vcpu *vcpu);
 void vmx_flush_tlb_current(struct kvm_vcpu *vcpu);
+int vmx_tlb_remote_flush(struct kvm *kvm);
+int vmx_tlb_remote_flush_with_range(struct kvm *kvm,
+				    struct kvm_tlb_range *range);
 void vmx_flush_tlb_gva(struct kvm_vcpu *vcpu, gva_t addr);
 void vmx_flush_tlb_guest(struct kvm_vcpu *vcpu);
 void vmx_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask);

From patchwork Sun Oct 30 06:22:59 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12880
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666309wru; Sat, 29 Oct 2022 23:29:45 -0700 (PDT)
X-Google-Smtp-Source: AMsMyM7Rz00Ej33ErK6Dk0qj+ANAi8pqDUUBjGJVOOwHs9CcIxIblwCf7NhhyHmBQ9EsalrWokAI
X-Received: by 2002:a17:902:c7c1:b0:186:b766:5dde with SMTP id r1-20020a170902c7c100b00186b7665ddemr8093065pla.93.1667111385368; Sat, 29 Oct 2022 23:29:45 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1667111385; cv=none; d=google.com; s=arc-20160816;
 b=hTMOAPfEIjfqr/5ONR6YoPolrm0OL2mxEOhemDgGQwp/CGyzTosjAgTXlHNYkwmYGf
0193EExLaImhFe0zm9pcNIdGXyGyMhjPTTOJcVOCwDXKKl6vFx5Q8/lQMxF9eu0O9Rbt /bsXlJcelWzQrOsivuvAQN8OuXHSS2iF5SntmDRa3FqEv17BmS6oVZRy3/BQN1w5P760 Bf7mHgtdMs0Ukslqm8LwMyPizWqutn94+MaXbHU4rU2eqcnvyYU8fnvVWVLi2b5RnAze s2GZsQVxmQr+n5A3vpprbGm0dwRkoHnUCgFg8dYQsSudz5Is6PAXXMHYTWK6amRRJUng HgmQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=0J6NTiI/SIaPgcX6+G6oyLdgu7WOwjbmtYL8kym+eUI=; b=Kg8vcF5xA0+kTfXQEbEysmLcJXkhhVoS6Hv/NFM4C0MDhop3tYGiTdP15NSYj75mkI XUETSpNpO5Zrb/aBMzT5zrRxQbx1V5Qe+dgP7B9HVXSx4bQQCHRXbwSFAf4Cn2gnTrr5 iVhCb7yZI0O9EnkrKVR/T8UK0YesX7lywRiUiiiBRuzGB5zHuJaBv4gyk0wCV7wRaY+n sqQuRfxnx+OGg05vTTF9ILLGlc5EwNC3QKqLB71xnRAbuEEKBckWkn0IE96aQ3z11T/+ 0YPNWwpH/g0F+FFA8iwNP6xgM4QhU5jFuz2PtGVlqXbb4Sy6MJm/sN7t6gfUQlwoZpib 9KwQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ZXbdNmn3; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id z15-20020a170903018f00b0017676f11aebsi4994775plg.5.2022.10.29.23.29.33; Sat, 29 Oct 2022 23:29:45 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ZXbdNmn3; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231134AbiJ3G2r (ORCPT + 99 others); Sun, 30 Oct 2022 02:28:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47746 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230124AbiJ3GZO (ORCPT ); Sun, 30 Oct 2022 02:25:14 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 207AECD; Sat, 29 Oct 2022 23:24:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111054; x=1698647054; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ufYZmSZ96y68rMKiY+g+qpCyP7GuiojMoXyG0WorRl0=; b=ZXbdNmn3FS6afCLIGUTJY2rCVHw4ZiES0txDS3zNfCgYBN86FvXVsdV1 PAazCSEwdcMK3mOirNnEzl+RGaJJ7QLVqSQxIZk90BaPow6CzzqJ7KXQQ +E/bp2IMZQMaSHkGoDoY+E2pQu7z21uctEFHTiTmt0W8vFU7mOVddJHpT quwrGSygmTud6UUIwljXsZ6j1YECuUtNi63bfwooFZBUXq+i/evWenWPd JXZNDHq3cTWDVDnVojlJpj3AoW9LZ89TMsCTtfegKN3lMQh/Eh+65hRc4 EV1Xw2q+mEURjEoMXA1tNywUp9k1xp7UOLDO78pjggt4n4HHfTC705W8O A==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037171" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037171" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) 
by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:07 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393033" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393033" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:06 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 058/108] KVM: TDX: TDP MMU TDX support Date: Sat, 29 Oct 2022 23:22:59 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092988156175105?= X-GMAIL-MSGID: =?utf-8?q?1748092988156175105?= From: Isaku Yamahata Implement hooks of TDP MMU for TDX backend. TLB flush, TLB shootdown, propagating the change private EPT entry to Secure EPT and freeing Secure EPT page. TLB flush handles both shared EPT and private EPT. It flushes shared EPT same as VMX. It also waits for the TDX TLB shootdown. For the hook to free Secure EPT page, unlinks the Secure EPT page from the Secure EPT so that the page can be freed to OS. Propagate the entry change to Secure EPT. The possible entry changes are present -> non-present(zapping) and non-present -> present(population). 
On population, just link the Secure EPT page or the private guest page
to the Secure EPT with a TDX SEAMCALL. Because the TDP MMU allows
concurrent zapping/population, zapping requires a synchronous TLB
shootdown with the EPT entry frozen. It zaps the secure entry,
increments the TLB counter, sends an IPI to remote vcpus to trigger a
TLB flush, and then unlinks the private guest page from the Secure EPT.
For simplicity, batched zapping with the exclusive lock is handled the
same way as concurrent zapping. This is inefficient, but it can be
optimized in the future.

For an MMIO SPTE, the SPTE value changes as follows:
initial value (suppress VE bit is set)
-> Guest issues MMIO and triggers EPT violation
-> KVM updates SPTE value to MMIO value (suppress VE bit is cleared)
-> Guest MMIO resumes. It triggers VE exception in guest TD
-> Guest VE handler issues TDG.VP.VMCALL
-> KVM handles MMIO
-> Guest VE handler resumes its execution after MMIO instruction

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/mmu/spte.c    |   3 +-
 arch/x86/kvm/vmx/main.c    |  61 +++++++-
 arch/x86/kvm/vmx/tdx.c     | 299 ++++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/tdx.h     |   7 +
 arch/x86/kvm/vmx/tdx_ops.h |   2 +
 arch/x86/kvm/vmx/x86_ops.h |   4 +
 6 files changed, 368 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 8f468ee2b985..3167c12d9c74 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -74,7 +74,8 @@ u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access)
 	u64 spte = generation_mmio_spte_mask(gen);
 	u64 gpa = gfn << PAGE_SHIFT;
 
-	WARN_ON_ONCE(!vcpu->kvm->arch.shadow_mmio_value);
+	WARN_ON_ONCE(!vcpu->kvm->arch.shadow_mmio_value &&
+		     !kvm_gfn_shared_mask(vcpu->kvm));
 
 	access &= shadow_mmio_access_mask;
 	spte |= vcpu->kvm->arch.shadow_mmio_value | access;
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index fe9583b640fb..3163915e2e3d 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -100,6 +100,55 @@ static void vt_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	return vmx_vcpu_reset(vcpu, init_event);
 }
 
+static void vt_flush_tlb_all(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_flush_tlb(vcpu);
+
+	vmx_flush_tlb_all(vcpu);
+}
+
+static void vt_flush_tlb_current(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_flush_tlb(vcpu);
+
+	vmx_flush_tlb_current(vcpu);
+}
+
+static int vt_tlb_remote_flush(struct kvm *kvm)
+{
+	if (is_td(kvm))
+		return tdx_sept_tlb_remote_flush(kvm);
+
+	return vmx_tlb_remote_flush(kvm);
+}
+
+static int vt_tlb_remote_flush_with_range(struct kvm *kvm,
+					  struct kvm_tlb_range *range)
+{
+	if (is_td(kvm))
+		return -EOPNOTSUPP; /* fall back to tlb_remote_flush */
+
+	return vmx_tlb_remote_flush_with_range(kvm, range);
+}
+
+static void vt_flush_tlb_gva(struct kvm_vcpu *vcpu, gva_t addr)
+{
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm))
+		return;
+
+	vmx_flush_tlb_gva(vcpu, addr);
+}
+
+static void vt_flush_tlb_guest(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return;
+
+	vmx_flush_tlb_guest(vcpu);
+}
+
 static void vt_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa,
 			    int pgd_level)
 {
@@ -176,12 +225,12 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.set_rflags = vmx_set_rflags,
 	.get_if_flag = vmx_get_if_flag,
 
-	.flush_tlb_all = vmx_flush_tlb_all,
-	.flush_tlb_current = vmx_flush_tlb_current,
-	.tlb_remote_flush = vmx_tlb_remote_flush,
-	.tlb_remote_flush_with_range = vmx_tlb_remote_flush_with_range,
-	.flush_tlb_gva = vmx_flush_tlb_gva,
-	.flush_tlb_guest = vmx_flush_tlb_guest,
+	.flush_tlb_all = vt_flush_tlb_all,
+	.flush_tlb_current = vt_flush_tlb_current,
+	.tlb_remote_flush = vt_tlb_remote_flush,
+	.tlb_remote_flush_with_range = vt_tlb_remote_flush_with_range,
+	.flush_tlb_gva = vt_flush_tlb_gva,
+	.flush_tlb_guest = vt_flush_tlb_guest,
 
 	.vcpu_pre_run = vmx_vcpu_pre_run,
 	.vcpu_run = vmx_vcpu_run,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 6328eaa65126..5378d2c35e27 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -6,7 +6,9 @@
 #include "capabilities.h"
 #include "x86_ops.h"
 #include "tdx.h"
+#include "vmx.h"
 #include "x86.h"
+#include "mmu.h"
 
 #undef pr_fmt
 #define pr_fmt(fmt) "tdx: " fmt
@@ -295,6 +297,22 @@ int tdx_vm_init(struct kvm *kvm)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
 
+	/*
+	 * Because guest TD is protected, VMM can't parse the instruction in TD.
+	 * Instead, guest uses MMIO hypercall. For unmodified device driver,
+	 * #VE needs to be injected for MMIO and #VE handler in TD converts MMIO
+	 * instruction into MMIO hypercall.
+	 *
+	 * SPTE value for MMIO needs to be set up so that #VE is injected into
+	 * TD instead of triggering EPT MISCONFIG.
+	 * - RWX=0 so that EPT violation is triggered.
+	 * - suppress #VE bit is cleared to inject #VE.
+	 */
+	kvm_mmu_set_mmio_spte_value(kvm, 0);
+
+	/* TODO: Enable 2mb and 1gb large page support. */
+	kvm->arch.tdp_max_page_level = PG_LEVEL_4K;
+
 	kvm_tdx->hkid = -1;
 
 	/*
@@ -399,6 +417,258 @@ void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level)
 	td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa & PAGE_MASK);
 }
 
+static void tdx_unpin(struct kvm *kvm, kvm_pfn_t pfn)
+{
+	struct page *page = pfn_to_page(pfn);
+
+	put_page(page);
+}
+
+static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
+				     enum pg_level level, kvm_pfn_t pfn)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	hpa_t hpa = pfn_to_hpa(pfn);
+	gpa_t gpa = gfn_to_gpa(gfn);
+	struct tdx_module_output out;
+	u64 err;
+
+	if (WARN_ON_ONCE(is_error_noslot_pfn(pfn) ||
+			 !kvm_pfn_to_refcounted_page(pfn)))
+		return 0;
+
+	/* TODO: handle large pages. */
+	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
+		return -EINVAL;
+
+	/* To prevent page migration, do nothing on mmu notifier. */
+	get_page(pfn_to_page(pfn));
+
+	if (likely(is_td_finalized(kvm_tdx))) {
+		err = tdh_mem_page_aug(kvm_tdx->tdr.pa, gpa, hpa, &out);
+		if (err == TDX_ERROR_SEPT_BUSY) {
+			tdx_unpin(kvm, pfn);
+			return -EAGAIN;
+		}
+		if (KVM_BUG_ON(err, kvm)) {
+			pr_tdx_error(TDH_MEM_PAGE_AUG, err, &out);
+			tdx_unpin(kvm, pfn);
+		}
+		return 0;
+	}
+
+	/* TODO: tdh_mem_page_add() comes here */
+
+	return 0;
+}
+
+static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
+				      enum pg_level level, kvm_pfn_t pfn)
+{
+	int tdx_level = pg_level_to_tdx_sept_level(level);
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	struct tdx_module_output out;
+	gpa_t gpa = gfn_to_gpa(gfn);
+	hpa_t hpa = pfn_to_hpa(pfn);
+	hpa_t hpa_with_hkid;
+	u64 err;
+
+	/* TODO: handle large pages. */
+	if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm))
+		return -EINVAL;
+
+	if (!is_hkid_assigned(kvm_tdx)) {
+		/*
+		 * The HKID assigned to this TD was already freed and cache
+		 * was already flushed. We don't have to flush again.
+		 */
+		err = tdx_reclaim_page((unsigned long)__va(hpa), hpa, false, 0);
+		if (KVM_BUG_ON(err, kvm))
+			return -EIO;
+		tdx_unpin(kvm, pfn);
+		return 0;
+	}
+
+	do {
+		/*
+		 * When zapping private page, write lock is held. So no race
+		 * condition with other vcpu sept operation. Race only with
+		 * TDH.VP.ENTER.
+		 */
+		err = tdh_mem_page_remove(kvm_tdx->tdr.pa, gpa, tdx_level, &out);
+	} while (err == TDX_ERROR_SEPT_BUSY);
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error(TDH_MEM_PAGE_REMOVE, err, &out);
+		return -EIO;
+	}
+
+	hpa_with_hkid = set_hkid_to_hpa(hpa, (u16)kvm_tdx->hkid);
+	do {
+		/*
+		 * TDX_OPERAND_BUSY can happen on locking PAMT entry. Because
+		 * this page was removed above, other thread shouldn't be
+		 * repeatedly operating on this page. Just retry loop.
+		 */
+		err = tdh_phymem_page_wbinvd(hpa_with_hkid);
+	} while (err == (TDX_OPERAND_BUSY | TDX_OPERAND_ID_RCX));
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error(TDH_PHYMEM_PAGE_WBINVD, err, NULL);
+		return -EIO;
+	}
+	tdx_unpin(kvm, pfn);
+	return 0;
+}
+
+static int tdx_sept_link_private_spt(struct kvm *kvm, gfn_t gfn,
+				     enum pg_level level, void *private_spt)
+{
+	int tdx_level = pg_level_to_tdx_sept_level(level);
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	gpa_t gpa = gfn_to_gpa(gfn);
+	hpa_t hpa = __pa(private_spt);
+	struct tdx_module_output out;
+	u64 err;
+
+	err = tdh_mem_sept_add(kvm_tdx->tdr.pa, gpa, tdx_level, hpa, &out);
+	if (err == TDX_ERROR_SEPT_BUSY)
+		return -EAGAIN;
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error(TDH_MEM_SEPT_ADD, err, &out);
+		return -EIO;
+	}
+
+	return 0;
+}
+
+static int tdx_sept_zap_private_spte(struct kvm *kvm, gfn_t gfn,
+				     enum pg_level level)
+{
+	int tdx_level = pg_level_to_tdx_sept_level(level);
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+	gpa_t gpa = gfn_to_gpa(gfn);
+	struct tdx_module_output out;
+	u64 err;
+
+	/* Large pages aren't supported yet. */
+	WARN_ON_ONCE(level != PG_LEVEL_4K);
+	err = tdh_mem_range_block(kvm_tdx->tdr.pa, gpa, tdx_level, &out);
+	if (err == TDX_ERROR_SEPT_BUSY)
+		return -EAGAIN;
+	if (KVM_BUG_ON(err, kvm)) {
+		pr_tdx_error(TDH_MEM_RANGE_BLOCK, err, &out);
+		return -EIO;
+	}
+	return 0;
+}
+
+/*
+ * TLB shootdown procedure:
+ * There is a global epoch counter and each vcpu has a local epoch counter.
+ * - TDH.MEM.RANGE.BLOCK(TDR, level, range) on one vcpu
+ *   This blocks the subsequent creation of TLB translations on that range.
+ *   This corresponds to clearing the present bits (all RWX) in the EPT entry.
+ * - TDH.MEM.TRACK(TDR): advances the epoch counter which is global.
+ * - IPI to remote vcpus
+ * - TDExit and re-entry with TDH.VP.ENTER on remote vcpus
+ * - On re-entry, TDX module compares the local epoch counter with the global
+ *   epoch counter. If the local epoch counter is older than the global epoch
+ *   counter, it updates the local epoch counter and flushes the TLB.
+ */
+static void tdx_track(struct kvm_tdx *kvm_tdx)
+{
+	u64 err;
+
+	KVM_BUG_ON(!is_hkid_assigned(kvm_tdx), &kvm_tdx->kvm);
+	/* If the TD isn't finalized, no vcpu has run yet. */
+	if (unlikely(!is_td_finalized(kvm_tdx)))
+		return;
+
+	/*
+	 * tdx_flush_tlb() uses this counter to wait until this function has
+	 * issued TDH.MEM.TRACK(). A counter is used instead of a bool because
+	 * multiple TDH_MEM_TRACK() can be issued concurrently by multiple vcpus.
+	 */
+	atomic_inc(&kvm_tdx->tdh_mem_track);
+	/*
+	 * KVM_REQ_TLB_FLUSH waits for the empty IPI handler, ack_flush(), with
+	 * KVM_REQUEST_WAIT.
+	 */
+	kvm_make_all_cpus_request(&kvm_tdx->kvm, KVM_REQ_TLB_FLUSH);
+
+	do {
+		/*
+		 * kvm_flush_remote_tlbs() doesn't allow returning an error and
+		 * retrying.
+		 */
+		err = tdh_mem_track(kvm_tdx->tdr.pa);
+	} while ((err & TDX_SEAMCALL_STATUS_MASK) == TDX_OPERAND_BUSY);
+
+	/* Release remote vcpu waiting for TDH.MEM.TRACK in tdx_flush_tlb(). */
+	atomic_dec(&kvm_tdx->tdh_mem_track);
+
+	if (KVM_BUG_ON(err, &kvm_tdx->kvm))
+		pr_tdx_error(TDH_MEM_TRACK, err, NULL);
+
+}
+
+static int tdx_sept_free_private_spt(struct kvm *kvm, gfn_t gfn,
+				     enum pg_level level, void *private_spt)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm);
+
+	/*
+	 * The HKID assigned to this TD was already freed and cache was
+	 * already flushed. We don't have to flush again.
+	 */
+	if (!is_hkid_assigned(kvm_tdx))
+		return tdx_reclaim_page((unsigned long)private_spt,
+					__pa(private_spt), false, 0);
+
+	/*
+	 * free_private_spt() is (obviously) called when a shadow page is being
+	 * zapped. KVM doesn't (yet) zap private SPs while the TD is active.
+	 * Note: This function is for private shadow pages, not for private
+	 * guest pages. A private guest page can be zapped while the TD is active:
+	 * shared <-> private conversion and slot move/deletion.
+	 */
+	KVM_BUG_ON(is_hkid_assigned(kvm_tdx), kvm);
+	return -EINVAL;
+}
+
+int tdx_sept_tlb_remote_flush(struct kvm *kvm)
+{
+	struct kvm_tdx *kvm_tdx;
+
+	if (!is_td(kvm))
+		return -EOPNOTSUPP;
+
+	kvm_tdx = to_kvm_tdx(kvm);
+	if (is_hkid_assigned(kvm_tdx))
+		tdx_track(kvm_tdx);
+
+	return 0;
+}
+
+static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
+					enum pg_level level, kvm_pfn_t pfn)
+{
+	/*
+	 * TDX requires TLB tracking before dropping private page. Do
+	 * it here, although it is also done later.
+	 * If hkid isn't assigned, the guest is being destroyed and no vcpu
+	 * runs further. TLB shootdown isn't needed.
+	 *
+	 * TODO: implement with_range version for optimization.
+	 * kvm_flush_remote_tlbs_with_address(kvm, gfn, 1);
+	 *   => tdx_sept_tlb_remote_flush_with_range(kvm, gfn,
+	 *                                 KVM_PAGES_PER_HPAGE(level));
+	 */
+	if (is_hkid_assigned(to_kvm_tdx(kvm)))
+		kvm_flush_remote_tlbs(kvm);
+
+	return tdx_sept_drop_private_spte(kvm, gfn, level, pfn);
+}
+
 int tdx_dev_ioctl(void __user *argp)
 {
 	struct kvm_tdx_capabilities __user *user_caps;
@@ -802,6 +1072,25 @@ static int tdx_td_init(struct kvm *kvm, struct kvm_tdx_cmd *cmd)
 	return ret;
 }
 
+void tdx_flush_tlb(struct kvm_vcpu *vcpu)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
+	struct kvm_mmu *mmu = vcpu->arch.mmu;
+	u64 root_hpa = mmu->root.hpa;
+
+	/* Flush the shared EPTP, if it's valid. */
+	if (VALID_PAGE(root_hpa))
+		ept_sync_context(construct_eptp(vcpu, root_hpa,
+						mmu->root_role.level));
+
+	/*
+	 * See tdx_track(). Wait for the TLB shootdown initiator to finish
+	 * TDH_MEM_TRACK() so that TLB is flushed on the next TDENTER.
+	 */
+	while (atomic_read(&kvm_tdx->tdh_mem_track))
+		cpu_relax();
+}
+
 int tdx_vm_ioctl(struct kvm *kvm, void __user *argp)
 {
 	struct kvm_tdx_cmd tdx_cmd;
@@ -1011,8 +1300,16 @@ int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops)
 	if (!r)
 		r = tdx_module_setup();
 	vmxoff_all();
+	if (r)
+		return r;
 
-	return r;
+	x86_ops->link_private_spt = tdx_sept_link_private_spt;
+	x86_ops->free_private_spt = tdx_sept_free_private_spt;
+	x86_ops->set_private_spte = tdx_sept_set_private_spte;
+	x86_ops->remove_private_spte = tdx_sept_remove_private_spte;
+	x86_ops->zap_private_spte = tdx_sept_zap_private_spte;
+
+	return 0;
 }
 
 void tdx_hardware_unsetup(void)
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index a95f25845f24..80d595c5f96f 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -24,6 +24,7 @@ struct kvm_tdx {
 	int hkid;
 
 	bool finalized;
+	atomic_t tdh_mem_track;
 
 	u64 tsc_offset;
 };
@@ -181,6 +182,12 @@ static __always_inline u64 td_tdcs_exec_read64(struct kvm_tdx *kvm_tdx, u32 fiel
 	return out.r8;
 }
 
+static __always_inline int pg_level_to_tdx_sept_level(enum pg_level level)
+{
+	WARN_ON_ONCE(level == PG_LEVEL_NONE);
+	return level - 1;
+}
+
 #else
 struct kvm_tdx {
 	struct kvm kvm;
diff --git a/arch/x86/kvm/vmx/tdx_ops.h b/arch/x86/kvm/vmx/tdx_ops.h
index 8cc2f01c509b..35e285ae6f9e 100644
--- a/arch/x86/kvm/vmx/tdx_ops.h
+++ b/arch/x86/kvm/vmx/tdx_ops.h
@@ -18,6 +18,8 @@
 
 void pr_tdx_error(u64 op, u64 error_code, const struct tdx_module_output *out);
 
+#define TDX_ERROR_SEPT_BUSY	(TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT)
+
 static inline u64 tdh_mng_addcx(hpa_t tdr, hpa_t addr)
 {
 	clflush_cache_range(__va(addr), PAGE_SIZE);
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index cf7e0c6c65ac..5fccd98be06e 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -151,6 +151,8 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
 int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
 
+void tdx_flush_tlb(struct kvm_vcpu *vcpu);
+int tdx_sept_tlb_remote_flush(struct kvm *kvm);
 void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int root_level);
 #else
 static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return 0; }
@@ -171,6 +173,8 @@ static inline void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) {}
 static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; }
 static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; }
 
+static inline void tdx_flush_tlb(struct kvm_vcpu *vcpu) {}
+static inline int tdx_sept_tlb_remote_flush(struct kvm *kvm) { return 0; }
 static inline void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int root_level) {}
 #endif
 

From patchwork Sun Oct 30 06:23:00 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12882
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 059/108] [MARKER] The start of TDX KVM patch series: KVM TDP MMU MapGPA
Date: Sat, 29 Oct 2022 23:23:00 -0700

From: Isaku Yamahata

This empty commit marks the start of the KVM TDP MMU MapGPA patch
series.

Signed-off-by: Isaku Yamahata
---
 Documentation/virt/kvm/intel-tdx-layer-status.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/intel-tdx-layer-status.rst b/Documentation/virt/kvm/intel-tdx-layer-status.rst
index c3e675bea802..5797d172176d 100644
--- a/Documentation/virt/kvm/intel-tdx-layer-status.rst
+++ b/Documentation/virt/kvm/intel-tdx-layer-status.rst
@@ -11,6 +11,7 @@ What qemu can do
 - TDX VM TYPE is exposed to Qemu.
 - Qemu can create/destroy guest of TDX vm type.
 - Qemu can create/destroy vcpu of TDX vm type.
+- Qemu can populate initial guest memory image.
 
 Patch Layer status
 ------------------
@@ -19,7 +20,7 @@ Patch Layer status
 * TDX architectural definitions: Applied
 * TD VM creation/destruction: Applied
 * TD vcpu creation/destruction: Applied
-* TDX EPT violation: Applying
+* TDX EPT violation: Applied
 * TD finalization: Not yet
 * TD vcpu enter/exit: Not yet
 * TD vcpu interrupts/exit/hypercall: Not yet

From patchwork Sun Oct 30 06:23:01 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12884
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 060/108] KVM: Add functions to set GFN to private or shared
Date: Sat, 29 Oct 2022 23:23:01 -0700

From: Isaku Yamahata

TDX KVM support needs to track whether a GFN is private or shared.
Introduce functions to set whether a GFN is private or shared and to
pre-allocate memory for the xarray.

Suggested-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 include/linux/kvm_host.h | 11 ++++++
 virt/kvm/kvm_main.c      | 74 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 85 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a0b64308d240..fac07886ab39 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2307,9 +2307,20 @@ static inline void kvm_account_pgtable_pages(void *virt, int nr)
 #define KVM_MEM_ATTR_PRIVATE	0x0002
 
 #ifdef __KVM_HAVE_ARCH_UPDATE_MEM_ATTR
+/* memory attr on [start, end) */
+int kvm_vm_reserve_mem_attr(struct kvm *kvm, gfn_t start, gfn_t end);
+int kvm_vm_set_mem_attr(struct kvm *kvm, int attr, gfn_t start, gfn_t end);
 void kvm_arch_update_mem_attr(struct kvm *kvm, struct kvm_memory_slot *slot,
 			      unsigned int attr, gfn_t start, gfn_t end);
 #else
+static inline int kvm_vm_reserve_mem_attr(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	return -EOPNOTSUPP;
+}
+static inline int kvm_vm_set_mem_attr(struct kvm *kvm, int attr, gfn_t start, gfn_t end)
+{
+	return -EOPNOTSUPP;
+}
 static inline void kvm_arch_update_mem_attr(struct kvm *kvm,
 					    struct kvm_memory_slot *slot,
 					    unsigned int attr,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9f82b03a8118..f0e77b65939b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1121,6 +1121,80 @@ static inline void kvm_restrictedmem_unregister(struct kvm_memory_slot *slot)
 					  &slot->notifier);
 }
 
+/*
+ * Reserve memory for [start, end) so that the next set operation won't fail
+ * with -ENOMEM.
+ */
+int kvm_vm_reserve_mem_attr(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	int r = 0;
+	gfn_t gfn;
+
+	xa_lock(&kvm->mem_attr_array);
+	for (gfn = start; gfn < end; gfn++) {
+		r = __xa_insert(&kvm->mem_attr_array, gfn, NULL, GFP_KERNEL_ACCOUNT);
+		if (r == -EBUSY)
+			r = 0;
+		if (r)
+			break;
+	}
+	xa_unlock(&kvm->mem_attr_array);
+
+	return r;
+}
+EXPORT_SYMBOL_GPL(kvm_vm_reserve_mem_attr);
+
+/* Set memory attr for [start, end) */
+int kvm_vm_set_mem_attr(struct kvm *kvm, int attr, gfn_t start, gfn_t end)
+{
+	void *entry;
+	gfn_t gfn;
+	int r;
+	int i;
+
+	/* By default, the entry is private. */
+	switch (attr) {
+	case KVM_MEM_ATTR_PRIVATE:
+		entry = NULL;
+		break;
+	case KVM_MEM_ATTR_SHARED:
+		entry = xa_mk_value(KVM_MEM_ATTR_SHARED);
+		break;
+	default:
+		WARN_ON_ONCE(1);
+		return -EINVAL;
+	}
+
+	WARN_ON_ONCE(start >= end);
+	for (gfn = start; gfn < end; gfn++) {
+		r = xa_err(xa_store(&kvm->mem_attr_array, gfn, entry,
+				    GFP_KERNEL_ACCOUNT));
+		if (r)
+			break;
+	}
+	if (start >= gfn)
+		return r;
+
+	end = gfn;
+	for (i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) {
+		struct kvm_memslot_iter iter;
+		struct kvm_memory_slot *slot;
+		struct kvm_memslots *slots;
+
+		slots = __kvm_memslots(kvm, i);
+		kvm_for_each_memslot_in_gfn_range(&iter, slots, start, end) {
+			gfn_t s = max(start, slot->base_gfn);
+			gfn_t e = min(end, slot->base_gfn + slot->npages);
+
+			WARN_ON_ONCE(s >= e);
+			kvm_arch_update_mem_attr(kvm, slot, attr, s, e);
+		}
+	}
+
+	return r;
+}
+EXPORT_SYMBOL_GPL(kvm_vm_set_mem_attr);
+
 #else /* !CONFIG_HAVE_KVM_RESTRICTED_MEM */
 
 static inline void kvm_restrictedmem_register(struct kvm_memory_slot *slot)

From patchwork Sun Oct 30 06:23:02 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12885
[2620:137:e000::1:20]) by mx.google.com with ESMTP id ji14-20020a170903324e00b0018678dab05dsi4167429plb.199.2022.10.29.23.29.46; Sat, 29 Oct 2022 23:29:59 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=VwjAt6OS; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231166AbiJ3G3C (ORCPT + 99 others); Sun, 30 Oct 2022 02:29:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47820 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230177AbiJ3GZ2 (ORCPT ); Sun, 30 Oct 2022 02:25:28 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8F1D7231; Sat, 29 Oct 2022 23:24:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111055; x=1698647055; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7faA3B0OqUMnFM6ysYVSaG6I4oEbGhUujDKJN05XdS0=; b=VwjAt6OS+nj3GmPqO3I1QBbdqImh8IkNm3Hd/RvRGlCOkeoyJnOYgg29 Hznjel3383qCaANoJuu6XPu9syCdgk5estyuYYVYt+LRAwc/lKBg6WH6z wqG9b4/RhRspRks0/KPtImUYOkwk60yu0hG8bjAwcWUOuWjIt6xAh8xK6 BcCIu0oOPTnj/Tju81r4OJMFkVIuXfHyUaV/RdHY0EwGOzJQCge4Qphk+ PZ/Ee9Uua8l3waFRpeY90/yphCgQqTgnibZLCf03PdoqcxsMVL4Jefnj8 G0SPSlY/rqt12SF2c03JNlD9NBwjSdhY8vaLqnL0EYiB9G0QrrCgDniWK A==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037174" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037174" Received: from fmsmga006.fm.intel.com 
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:07 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393042" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393042" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:07 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Sean Christopherson Subject: [PATCH v10 061/108] KVM: x86/mmu: Introduce kvm_mmu_map_tdp_page() for use by TDX Date: Sat, 29 Oct 2022 23:23:02 -0700 Message-Id: <861847305216ba97ab65ad2e0ebe5bf08e2fd71a.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093002427609048?= X-GMAIL-MSGID: =?utf-8?q?1748093002427609048?= From: Sean Christopherson Introduce a helper to directly (pun intended) fault-in a TDP page without having to go through the full page fault path. This allows TDX to get the resulting pfn and also allows the RET_PF_* enums to stay in mmu.c where they belong. 
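For readers who want to see the retry behaviour in isolation, the following is a minimal user-space sketch, not the kernel code. Every mock_* and MOCK_* name is a hypothetical stand-in invented for illustration; only the loop shape (retry while the handler reports "retry" and no error pfn has been recorded) follows the helper introduced by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real RET_PF_* values stay inside the KVM MMU. */
enum mock_ret_pf { MOCK_RET_PF_RETRY, MOCK_RET_PF_FIXED, MOCK_RET_PF_ERROR };

typedef uint64_t mock_pfn_t;
#define MOCK_PFN_ERR		UINT64_MAX	/* stand-in for an error pfn */
#define MOCK_UNMAPPABLE_GPA	UINT64_MAX	/* GPA the mock refuses to map */

struct mock_fault {
	uint64_t gpa;		/* fixed by the caller across all retries */
	mock_pfn_t pfn;		/* filled in by the mock fault handler */
	int retries_left;	/* simulated pending invalidations */
	int calls;		/* how many times the handler ran */
};

/* Mock handler: report RETRY while simulated invalidations are pending. */
static int mock_page_fault(struct mock_fault *f)
{
	f->calls++;
	if (f->gpa == MOCK_UNMAPPABLE_GPA) {
		f->pfn = MOCK_PFN_ERR;
		return MOCK_RET_PF_ERROR;
	}
	if (f->retries_left > 0) {
		f->retries_left--;
		return MOCK_RET_PF_RETRY;
	}
	f->pfn = f->gpa >> 12;	/* pretend identity mapping with 4K pages */
	return MOCK_RET_PF_FIXED;
}

/* Loop until the fault is fixed or an error pfn has been recorded. */
static mock_pfn_t mock_map_page(uint64_t gpa, int pending, int *calls)
{
	struct mock_fault f = { .gpa = gpa, .pfn = 0, .retries_left = pending };
	int r;

	assert(pending >= 0);
	do {
		r = mock_page_fault(&f);
	} while (r == MOCK_RET_PF_RETRY && f.pfn != MOCK_PFN_ERR);

	if (calls)
		*calls = f.calls;
	return f.pfn;
}
```

The caller only ever sees a pfn, which is the point made above: the handler's return codes do not leak out of the helper.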
Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata --- arch/x86/kvm/mmu.h | 3 +++ arch/x86/kvm/mmu/mmu.c | 39 +++++++++++++++++++++++++++++++++++++++ 2 files changed, 42 insertions(+) diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index 50d240d52697..e2a0dfbee56d 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -154,6 +154,9 @@ static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu) vcpu->arch.mmu->root_role.level); } +kvm_pfn_t kvm_mmu_map_tdp_page(struct kvm_vcpu *vcpu, gpa_t gpa, + u32 error_code, int max_level); + /* * Check if a given access (described through the I/D, W/R and U/S bits of a * page fault error code pfec) causes a permission fault with the given PTE diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 08923b64dcc8..168c84c99de3 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -4485,6 +4485,45 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) return direct_page_fault(vcpu, fault); } +kvm_pfn_t kvm_mmu_map_tdp_page(struct kvm_vcpu *vcpu, gpa_t gpa, + u32 error_code, int max_level) +{ + int r; + struct kvm_page_fault fault = (struct kvm_page_fault) { + .addr = gpa, + .error_code = error_code, + .exec = error_code & PFERR_FETCH_MASK, + .write = error_code & PFERR_WRITE_MASK, + .present = error_code & PFERR_PRESENT_MASK, + .rsvd = error_code & PFERR_RSVD_MASK, + .user = error_code & PFERR_USER_MASK, + .prefetch = false, + .is_tdp = true, + .nx_huge_page_workaround_enabled = is_nx_huge_page_enabled(vcpu->kvm), + .is_private = kvm_is_private_gpa(vcpu->kvm, gpa), + }; + + if (mmu_topup_memory_caches(vcpu, false)) + return KVM_PFN_ERR_FAULT; + + /* + * Loop on the page fault path to handle the case where an mmu_notifier + * invalidation triggers RET_PF_RETRY. In the normal page fault path, + * KVM needs to resume the guest in case the invalidation changed any + * of the page fault properties, i.e. the gpa or error code. 
For this + * path, the gpa and error code are fixed by the caller, and the caller + * expects failure if and only if the page fault can't be fixed. + */ + do { + fault.max_level = max_level; + fault.req_level = PG_LEVEL_4K; + fault.goal_level = PG_LEVEL_4K; + r = direct_page_fault(vcpu, &fault); + } while (r == RET_PF_RETRY && !is_error_noslot_pfn(fault.pfn)); + return fault.pfn; +} +EXPORT_SYMBOL_GPL(kvm_mmu_map_tdp_page); + static void nonpaging_init_context(struct kvm_mmu *context) { context->page_fault = nonpaging_page_fault; From patchwork Sun Oct 30 06:23:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12888 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666407wru; Sat, 29 Oct 2022 23:30:05 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7V8319tdcu4ISL8aIwaCv1v7HKNufFD46TSFVlBBIdFDohUO0gXpfrRLN62Jn/EciTqBQX X-Received: by 2002:a63:64d:0:b0:46b:158f:102e with SMTP id 74-20020a63064d000000b0046b158f102emr7214066pgg.150.1667111405461; Sat, 29 Oct 2022 23:30:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111405; cv=none; d=google.com; s=arc-20160816; b=o0qS4bkdyNSGoByStu8WSCJJLelxTnPl5h9llnkP2r9kdlF2WpkpQAFgcqX0vEZHYJ +KgLmLIsmaIJ1wSTbp5RWPf8V0vw8B3zPcjfzJ76Q+LPiRMEu1xQBnWdXokw9ThJ/J9J QeNhhkv8LU7Fx2CAZkm+rKTfdS9MBqVeX7DUfmMldbgEYvuTbkRENjjel//lZUdP6nME qi5O52DEHBA+JAdMsNq50bSffI8TlJ7gVFGDV6OVL3SAUX98VTKubKzQ267Ux2ngTCSG Su+iIVMHRekO9lfPoW2CLEAU8xUtUomBzu55470fzcwp1doTex2xjI3OS4otIST93qVJ cpTA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=EYiiG0FM/35XBdp/xZOrM51iiWitvarThKqHs23YwjI=; b=LA1O1c/MIjb+l5YyhtqfLrkLmCWWbqL9S+GyNQwA9/MLdQa51Dvd+tHjaUIQxkSqlu Hs+cHiByoEXbVoAX+ob+bgBz3E1HIXxTo5l42pKVwITWgQy37HLfdifJ7GRpm8yjW0Us 
nmwNW78Xgai/ri7fG+Kxk4j6PC4i5C+7406CrIf9fyLkxzfQwvbkUH3yJ/e4dVvr/jhK WicWz9SrzO0tVk9LXrIO/MdVL6SF32rlVBWLZ5E0tba8jyRS0moUXPANVmWSoyCEtq/U oFqRwdeyXUxOM1kTGpLMglJ5Ighlbju7E2fBOVFGKQw0z/rxXNT9jzIPD+NFL5QQIvti vokg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=SHq1PChI; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id c29-20020a630d1d000000b0046f57dc1613si4351171pgl.599.2022.10.29.23.29.53; Sat, 29 Oct 2022 23:30:05 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=SHq1PChI; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231190AbiJ3G3N (ORCPT + 99 others); Sun, 30 Oct 2022 02:29:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49126 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230107AbiJ3G0F (ORCPT ); Sun, 30 Oct 2022 02:26:05 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 05F0E275; Sat, 29 Oct 2022 23:24:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111057; x=1698647057; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=pSMPEvDBc8a1ASJVlEuOshAwnXNncrPDSUz+PdeyGII=; b=SHq1PChI9IDuVVegeU84HaMwaXXTdnOxV96+u9+m4K+3ZJelrwdYFGX4 uK3whXOEeCjoNjmIY1TzLEcQ+5d6sS7AD3X2J1TYBUfGIVDfsagDPBq2j CnZuohlHgTLFVbBQ8dcwaC3wifQuYarUqVYp9Ya+ZR1mHwRfQ2ZJgpuVl egdIK1M/VBr471u9QHbwScauhwDLdQpdtXFmCoofwnb564N3CleG3dAmu Qp3qLJ2HTehxaOlgqevLc6GRaNj48ElszkAa3FIEB9saGlgijuUXjJZYQ cS1F1mwMfdk7kFIzoDdCG+HIPAA5A97vLTaMiMkkcoe5Jpxl7IrnBMSZi A==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037175" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037175" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:07 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393045" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393045" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:07 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 062/108] KVM: x86/tdp_mmu: implement MapGPA hypercall for TDX Date: Sat, 29 Oct 2022 23:23:03 -0700 Message-Id: <73ef2bdcdf8ec88bbec9d3780484cecda7a21e6f.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= 
X-GMAIL-THRID: =?utf-8?q?1748093009092969778?= X-GMAIL-MSGID: =?utf-8?q?1748093009092969778?= From: Isaku Yamahata The TDX Guest-Hypervisor Communication Interface (GHCI) specification defines the MapGPA hypercall, with which a guest TD asks the host VMM to map a given GPA range as private or shared. The request means the guest TD will use the GPA range as shared (or private) and will no longer use it as private (or shared). The VMM should enforce this GPA usage, but it doesn't have to map the GPA when handling the hypercall. - Zap the aliased region. If a shared (or private) GPA is requested, zap the private (or shared) GPA (modulo the shared bit). - Record in kvm.mem_attr_array that the requested GPA is shared (or private). - Don't map the GPA. The GPA is mapped on the next EPT violation. Signed-off-by: Isaku Yamahata --- arch/x86/kvm/mmu.h | 5 ++++ arch/x86/kvm/mmu/mmu.c | 60 ++++++++++++++++++++++++++++++++++++++ arch/x86/kvm/mmu/tdp_mmu.c | 35 ++++++++++++++++++++++ arch/x86/kvm/mmu/tdp_mmu.h | 3 ++ 4 files changed, 103 insertions(+) diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h index e2a0dfbee56d..e1641fa5a862 100644 --- a/arch/x86/kvm/mmu.h +++ b/arch/x86/kvm/mmu.h @@ -219,6 +219,11 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end); int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu); +int __kvm_mmu_map_gpa(struct kvm *kvm, gfn_t *startp, gfn_t end, + bool map_private); +int kvm_mmu_map_gpa(struct kvm_vcpu *vcpu, gfn_t *startp, gfn_t end, + bool map_private); + int kvm_mmu_post_init_vm(struct kvm *kvm); void kvm_mmu_pre_destroy_vm(struct kvm *kvm); diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 168c84c99de3..37b378bf60df 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6778,6 +6778,66 @@ void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen) } } +int __kvm_mmu_map_gpa(struct kvm *kvm, gfn_t *startp, gfn_t end, + bool map_private) +{ + gfn_t start = *startp; + int attr; + int ret; + + if (!kvm_gfn_shared_mask(kvm)) + return -EOPNOTSUPP; + + attr = 
map_private ? KVM_MEM_ATTR_PRIVATE : KVM_MEM_ATTR_SHARED; + start = start & ~kvm_gfn_shared_mask(kvm); + end = end & ~kvm_gfn_shared_mask(kvm); + + /* + * Reserve memory up front so that the following kvm_vm_set_mem_attr() + * succeeds under mmu_lock without allocating memory. + */ + ret = kvm_vm_reserve_mem_attr(kvm, start, end); + if (ret) + return ret; + + write_lock(&kvm->mmu_lock); + if (is_tdp_mmu_enabled(kvm)) { + gfn_t s = start; + + ret = kvm_tdp_mmu_map_gpa(kvm, &s, end, map_private); + if (!ret) { + KVM_BUG_ON(kvm_vm_set_mem_attr(kvm, attr, start, end), kvm); + } else if (ret == -EAGAIN) { + KVM_BUG_ON(kvm_vm_set_mem_attr(kvm, attr, start, s), kvm); + start = s; + } + } else { + ret = -EOPNOTSUPP; + } + write_unlock(&kvm->mmu_lock); + + if (ret == -EAGAIN) { + if (map_private) + *startp = kvm_gfn_private(kvm, start); + else + *startp = kvm_gfn_shared(kvm, start); + } + return ret; +} +EXPORT_SYMBOL_GPL(__kvm_mmu_map_gpa); + +int kvm_mmu_map_gpa(struct kvm_vcpu *vcpu, gfn_t *startp, gfn_t end, + bool map_private) +{ + struct kvm_mmu *mmu = vcpu->arch.mmu; + + if (!VALID_PAGE(mmu->root.hpa) || !VALID_PAGE(mmu->private_root_hpa)) + return -EINVAL; + + return __kvm_mmu_map_gpa(vcpu->kvm, startp, end, map_private); +} +EXPORT_SYMBOL_GPL(kvm_mmu_map_gpa); + static unsigned long mmu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) { diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 4b207ce83ffe..d3bab382ceaa 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -2156,6 +2156,41 @@ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm, return spte_set; } +int kvm_tdp_mmu_map_gpa(struct kvm *kvm, + gfn_t *startp, gfn_t end, bool map_private) +{ + struct kvm_mmu_page *root; + gfn_t start = *startp; + bool flush = false; + int i; + + lockdep_assert_held_write(&kvm->mmu_lock); + KVM_BUG_ON(start & kvm_gfn_shared_mask(kvm), kvm); + KVM_BUG_ON(end & kvm_gfn_shared_mask(kvm), kvm); + + kvm_mmu_invalidate_begin(kvm, start, end); + for 
(i = 0; i < KVM_ADDRESS_SPACE_NUM; i++) { + for_each_tdp_mmu_root_yield_safe(kvm, root, i) { + if (is_private_sp(root) == map_private) + continue; + + /* + * TODO: If necessary, return to the caller with -EAGAIN + * instead of yield-and-resume within + * tdp_mmu_zap_leafs(). + */ + flush = tdp_mmu_zap_leafs(kvm, root, start, end, + /*can_yield=*/true, flush, + /*zap_private=*/is_private_sp(root)); + } + } + if (flush) + kvm_flush_remote_tlbs_with_address(kvm, start, end - start); + kvm_mmu_invalidate_end(kvm, start, end); + + return 0; +} + /* * Return the level of the lowest level SPTE added to sptes. * That SPTE may be non-present. diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h index 695175c921a5..cb13bc1c3679 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.h +++ b/arch/x86/kvm/mmu/tdp_mmu.h @@ -51,6 +51,9 @@ void kvm_tdp_mmu_try_split_huge_pages(struct kvm *kvm, gfn_t start, gfn_t end, int target_level, bool shared); +int kvm_tdp_mmu_map_gpa(struct kvm *kvm, + gfn_t *startp, gfn_t end, bool map_private); + static inline void kvm_tdp_mmu_walk_lockless_begin(void) { rcu_read_lock(); From patchwork Sun Oct 30 06:23:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12908 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666886wru; Sat, 29 Oct 2022 23:31:48 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7x96AAOqo0aVbwjUyJTAmOyIWUIW66MwojtDAeUIGAh2mV9nTcO8oOids7mv64D/wKVzS3 X-Received: by 2002:a17:90b:2751:b0:20a:e437:a9e8 with SMTP id qi17-20020a17090b275100b0020ae437a9e8mr24793669pjb.181.1667111508128; Sat, 29 Oct 2022 23:31:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111508; cv=none; d=google.com; s=arc-20160816; b=OL8SsmGzmBV1rL21Op+DMl0qaVItwW4mbdFjclrqzMCMTDtB/7+7aaUSpUC8ylbI4C hR7IkEcBuhuEgPIB9LDJ4tU9ZQlM/sRmX6vzwYyD0sGFs1xYxn4CY15q1Cs3ltrj/51G 
h5yx5oC6fuibKAdy5JDM/2VynBE255IMaA8AMUFft59CFWyNEQAdCvk8BSsSS7e1CEDr rU5dKwV9Ah62YllOAuQObUR3lH8Z7pXiAcvL96XmUEXYxM3OgovHy+/sLiTMz0AwVEuN ros82g7Uqs4moHTjXgxrlK1vV06HB0kzdjz73d9Eduhvo8M0gMmS3FmoheIXmRYPXH6O mi8w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=+OUVIocULe/KmC9B29ibcQcbOj8t+IpUGUQn1yJRPHY=; b=qjpGRrgbwlAn5tCeY5ctJKT+49YCJSNSM6G6Ekm1vIXYwNVKOvyBdLSzdDOA26YenF HBAzS/lfRLlsqUri8cVYz61/7ztIH5c5jmmDjyquRVsqrRY3p/U1vvkuQEV+Bz4kvkBg DWhccadh4rxTrG/V4/+FKMU9CRnPG4KOYkhKq5r0KX3I8Wpc/gHqxFZCIkLFc8oa8gEf AXdZ688wuhAqdamvOzD+vQ9AzKheMHh97Alj/7klk9pBhjgkoiUzanxvpfv8TqTQg6Ab 4i3LVmWU7thzSOhEXoeWzKcY1ea5noglByuY/jVfeGfo5cwGnJUPH9HkLGwCQt3OUvE+ VCkQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=k8fyzWFi; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id pg14-20020a17090b1e0e00b002088ad6d93fsi4949722pjb.49.2022.10.29.23.31.35; Sat, 29 Oct 2022 23:31:48 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=k8fyzWFi; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231196AbiJ3G3S (ORCPT + 99 others); Sun, 30 Oct 2022 02:29:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49128 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230318AbiJ3G0F (ORCPT ); Sun, 30 Oct 2022 02:26:05 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0F9B1277; Sat, 29 Oct 2022 23:24:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111058; x=1698647058; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yzsW3HP34uHnzDykzPATauL2DDgMbwFkjBwFhJu1DX0=; b=k8fyzWFiS3X0eYvdLKfh7md3jy0vAiB/VQKLakRA95P2h18iOGDNRjam YhuL9XumcagyaaR1jKlQoQ4bwf+vX5I8Ic2wu1PpZCajX3TMPSFIIpIg1 kBIFD68rM+taj0e4BAtqEcqkqph3OEdKdyIWx1ilBYUV6OoE1SLmB72Lz NkU6jvILP8oGb8KzWWNmllz3BkaqjnAAuli5yp2wdn2UJQhYal7kqig1u ZQAXRlQZJ/oTxyU2VPMdUk5Xbt70c7ApM/6iru3RFVo8rTWjWv3jfexag sRP54rXISutx3RpQeeodtQsTpUrCf8Q4IeNzldj7MEUlMXT+4MFMkQgo4 Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037176" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037176" Received: from fmsmga006.fm.intel.com 
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:07 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393048" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393048" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:07 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 063/108] [MARKER] The start of TDX KVM patch series: TD finalization Date: Sat, 29 Oct 2022 23:23:04 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093116882104972?= X-GMAIL-MSGID: =?utf-8?q?1748093116882104972?= From: Isaku Yamahata This empty commit is to mark the start of patch series of TD finalization. 
Signed-off-by: Isaku Yamahata --- Documentation/virt/kvm/intel-tdx-layer-status.rst | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/virt/kvm/intel-tdx-layer-status.rst b/Documentation/virt/kvm/intel-tdx-layer-status.rst index 5797d172176d..53897312699f 100644 --- a/Documentation/virt/kvm/intel-tdx-layer-status.rst +++ b/Documentation/virt/kvm/intel-tdx-layer-status.rst @@ -21,11 +21,11 @@ Patch Layer status * TD VM creation/destruction: Applied * TD vcpu creation/destruction: Applied * TDX EPT violation: Applied -* TD finalization: Not yet +* TD finalization: Applying * TD vcpu enter/exit: Not yet * TD vcpu interrupts/exit/hypercall: Not yet * KVM MMU GPA shared bits: Applied * KVM TDP refactoring for TDX: Applied * KVM TDP MMU hooks: Applied -* KVM TDP MMU MapGPA: Not yet +* KVM TDP MMU MapGPA: Applied From patchwork Sun Oct 30 06:23:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12909 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666891wru; Sat, 29 Oct 2022 23:31:49 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6z9/CjXBgsmgpMqgAwUInb6dq+DUoZRBylmkqq0hFxVNQyhnL88BrZRm/ESgxXmTub4FRl X-Received: by 2002:a63:87c7:0:b0:434:883:ea21 with SMTP id i190-20020a6387c7000000b004340883ea21mr7294953pge.152.1667111509622; Sat, 29 Oct 2022 23:31:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111509; cv=none; d=google.com; s=arc-20160816; b=1IYszCQGWcs+kAUsRxPtWaWd42PXNlwlstpZFaTnTq367Q43vP2Fj8j4l8dHYiNjR9 P/EYniopCTUTgKgh1vqK2fuiY7osy/JiI5A2L2jKKRMDNx0woR6i0vwHqgRWIewN09H4 B5coiFxGicC1NE8aQiJOPcR5EIfK+KaGB4WsbfHCJMliweL178DEmLUnZlzC6Z6bVNq4 X5kUVvp5G5OzEnaxeZ4itmOlmXHfbKqGShD74xYKxqpwccCrlhCInwyMl5AB8jRB7jDA EwECbG4z4nXG0HBY26x8xt9EZ1+3eMNe/sSXsebzM8+cKAJpV9gGz508Iv571+TY/w/s wdXg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; 
h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=l6rqXFWP20hcU33kDht7JLooKZHe3XbellDrLYnmeMg=; b=hGKL0kT8YEN5Tc0oKc/DK8gPdeEmWdX0oMIHSWSLDL1WwLNndO08ju0ULsf4r4XsKK K/1TpC6IJeYeHp5dH9zIIHoNBCA9yQDuokSiCZnuPJnry18qTm83T/B9iMjlMLD/1b4T JozioS6aYhkHWHOp2chzWIEdMxoQg9bftMkW+ErZQfhFi1ZBcwsxwBHfRAvUvAtfRhK4 75w0TsLGYS9gJO417MIVNqvVbs2w8JAHhI8sAAX49Sk48olMzYMGzud6QjXL5LmKszfu 3Xw8yURODUlJE5gMKyBzZAZLg04VI21qM1Hc0gU6cY4GJHZJOH7UxtenjFKcZEw9pB/u dP/Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=mzIrpL36; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y192-20020a638ac9000000b0046b3dce845dsi4137987pgd.470.2022.10.29.23.31.36; Sat, 29 Oct 2022 23:31:49 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=mzIrpL36; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231202AbiJ3G3Y (ORCPT + 99 others); Sun, 30 Oct 2022 02:29:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47114 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229941AbiJ3G0K (ORCPT ); Sun, 30 Oct 2022 02:26:10 -0400 Received: from mga05.intel.com (mga05.intel.com 
[192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6F23E282; Sat, 29 Oct 2022 23:24:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111059; x=1698647059; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=OlYTcaXM5SWGvgmPTEWp0KCJKidI9N4c4MKrlDRq2SY=; b=mzIrpL36vOaM0+CN8F+M1RtjsqxDbzMHCbt9YYZkedZJOiM2res4tCvk vlr7aCI+2NopV9RbwQrfbMJia5Rorm4YVV94m50pDgfuJqcmHi+qtAXBx RzRI/iUbPsLD3IVn6i8fHJEaTbox4mnaMbqmvOxxOS8C1aaqfYuvGUbS0 iVSAA68M1xO4kmubU8+MsEOzjQQoqLnL953N3doxMADuRuyWgBmsiBWOe +CsYKkQAlkAYnePw3BPzuXvJYCoxw8hclFiK3ZKZxiMP8howqlCbXUGM0 piLQ5F7pqX7ga2IPYQxb+bKIyUyDtRw74+NOlqyZGj5PtQvB9TcDpTUGo w==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037177" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037177" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:08 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393051" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393051" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:07 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 064/108] KVM: TDX: Create initial guest memory Date: Sat, 29 Oct 2022 23:23:05 -0700 Message-Id: <2b04a33103e12b476a7f3547eb54abd16fb5d21a.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, 
SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093118350652080?= X-GMAIL-MSGID: =?utf-8?q?1748093118350652080?= From: Isaku Yamahata Because guest memory is protected in TDX, creating the initial guest memory requires a dedicated TDX module API, tdh_mem_page_add, rather than copying the memory contents directly into guest memory as is done for the default VM type. The KVM MMU page fault handler callback, private_page_add, handles it. Define a new subcommand, KVM_TDX_INIT_MEM_REGION, of the VM-scoped KVM_MEMORY_ENCRYPT_OP. It assigns the guest page, copies the initial memory contents into guest memory, and encrypts the guest memory. Optionally, it also extends the memory measurement of the TDX guest. It calls the KVM MMU page fault (EPT violation) handler to trigger the callbacks that do this. 
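As an illustration of the new uAPI structure only, here is a user-space sketch of preparing a region descriptor before issuing the subcommand. The struct is a local mirror of struct kvm_tdx_init_mem_region from this patch; the sketch_* names, the helper itself, and the 4K alignment checks are assumptions made for the example, and how the descriptor and the measurement flag are handed to the ioctl is not shown in this part of the series, so it is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL

/* Local mirror of struct kvm_tdx_init_mem_region as laid out in this patch. */
struct sketch_tdx_init_mem_region {
	uint64_t source_addr;	/* user-space address of the initial contents */
	uint64_t gpa;		/* guest physical address to populate */
	uint64_t nr_pages;	/* number of 4K pages to add */
};

/*
 * Hypothetical helper: describe a buffer of len bytes at src that should
 * become guest memory at gpa. Returns 0 on success and -1 if one of the
 * assumed preconditions (non-empty, 4K-aligned addresses and length) fails.
 */
static int sketch_fill_region(struct sketch_tdx_init_mem_region *r,
			      const void *src, uint64_t gpa, size_t len)
{
	uint64_t addr = (uint64_t)(uintptr_t)src;

	assert(r != NULL);
	if (len == 0)
		return -1;
	if (addr % SKETCH_PAGE_SIZE || gpa % SKETCH_PAGE_SIZE ||
	    len % SKETCH_PAGE_SIZE)
		return -1;

	r->source_addr = addr;
	r->gpa = gpa;
	r->nr_pages = len / SKETCH_PAGE_SIZE;
	return 0;
}
```

Counting in 4K pages matches the restriction noted in the code below, where the page-add path accepts only PG_LEVEL_4K.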
Signed-off-by: Isaku Yamahata --- arch/x86/include/uapi/asm/kvm.h | 9 ++ arch/x86/kvm/mmu/mmu.c | 1 + arch/x86/kvm/vmx/tdx.c | 158 +++++++++++++++++++++++++- arch/x86/kvm/vmx/tdx.h | 2 + tools/arch/x86/include/uapi/asm/kvm.h | 9 ++ 5 files changed, 174 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 80db152430e4..6ae52926e05a 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -540,6 +540,7 @@ enum kvm_tdx_cmd_id { KVM_TDX_CAPABILITIES = 0, KVM_TDX_INIT_VM, KVM_TDX_INIT_VCPU, + KVM_TDX_INIT_MEM_REGION, KVM_TDX_CMD_NR_MAX, }; @@ -615,4 +616,12 @@ struct kvm_tdx_init_vm { }; }; +#define KVM_TDX_MEASURE_MEMORY_REGION (1UL << 0) + +struct kvm_tdx_init_mem_region { + __u64 source_addr; + __u64 gpa; + __u64 nr_pages; +}; + #endif /* _ASM_X86_KVM_H */ diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 37b378bf60df..8e24dd0e3c3c 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5492,6 +5492,7 @@ int kvm_mmu_load(struct kvm_vcpu *vcpu) out: return r; } +EXPORT_SYMBOL(kvm_mmu_load); void kvm_mmu_unload(struct kvm_vcpu *vcpu) { diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 5378d2c35e27..7c00f71d42af 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -417,6 +417,21 @@ void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level) td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa & PAGE_MASK); } +static void tdx_measure_page(struct kvm_tdx *kvm_tdx, hpa_t gpa) +{ + struct tdx_module_output out; + u64 err; + int i; + + for (i = 0; i < PAGE_SIZE; i += TDX_EXTENDMR_CHUNKSIZE) { + err = tdh_mr_extend(kvm_tdx->tdr.pa, gpa + i, &out); + if (KVM_BUG_ON(err, &kvm_tdx->kvm)) { + pr_tdx_error(TDH_MR_EXTEND, err, &out); + break; + } + } +} + static void tdx_unpin(struct kvm *kvm, kvm_pfn_t pfn) { struct page *page = pfn_to_page(pfn); @@ -431,20 +446,23 @@ static int 
tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn, hpa_t hpa = pfn_to_hpa(pfn); gpa_t gpa = gfn_to_gpa(gfn); struct tdx_module_output out; + hpa_t source_pa; + bool measure; u64 err; if (WARN_ON_ONCE(is_error_noslot_pfn(pfn) || !kvm_pfn_to_refcounted_page(pfn))) return 0; - /* TODO: handle large pages. */ - if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm)) - return -EINVAL; - /* To prevent page migration, do nothing on mmu notifier. */ get_page(pfn_to_page(pfn)); + /* Build-time faults are induced and handled via TDH_MEM_PAGE_ADD. */ if (likely(is_td_finalized(kvm_tdx))) { + /* TODO: handle large pages. */ + if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm)) + return -EINVAL; + err = tdh_mem_page_aug(kvm_tdx->tdr.pa, gpa, hpa, &out); if (err == TDX_ERROR_SEPT_BUSY) { tdx_unpin(kvm, pfn); @@ -453,11 +471,50 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn, if (KVM_BUG_ON(err, kvm)) { pr_tdx_error(TDH_MEM_PAGE_AUG, err, &out); tdx_unpin(kvm, pfn); + return -EIO; } return 0; } - /* TODO: tdh_mem_page_add() comes here */ + /* + * KVM_INIT_MEM_REGION, tdx_init_mem_region(), supports only 4K page + * because tdh_mem_page_add() supports only 4K page. + */ + if (KVM_BUG_ON(level != PG_LEVEL_4K, kvm)) + return -EINVAL; + + /* + * In case of TDP MMU, fault handler can run concurrently. Note + * 'source_pa' is a TD scope variable, meaning if there are multiple + * threads reaching here with all needing to access 'source_pa', it + * will break. However fortunately this won't happen, because below + * TDH_MEM_PAGE_ADD code path is only used when VM is being created + * before it is running, using KVM_TDX_INIT_MEM_REGION ioctl (which + * always uses vcpu 0's page table and protected by vcpu->mutex). 
+ */ + if (KVM_BUG_ON(kvm_tdx->source_pa == INVALID_PAGE, kvm)) { + tdx_unpin(kvm, pfn); + return -EINVAL; + } + + source_pa = kvm_tdx->source_pa & ~KVM_TDX_MEASURE_MEMORY_REGION; + measure = kvm_tdx->source_pa & KVM_TDX_MEASURE_MEMORY_REGION; + kvm_tdx->source_pa = INVALID_PAGE; + + do { + err = tdh_mem_page_add(kvm_tdx->tdr.pa, gpa, hpa, source_pa, + &out); + /* + * This path is executed during populating initial guest memory + * image. i.e. before running any vcpu. Race is rare. + */ + } while (err == TDX_ERROR_SEPT_BUSY); + if (KVM_BUG_ON(err, kvm)) { + pr_tdx_error(TDH_MEM_PAGE_ADD, err, &out); + tdx_unpin(kvm, pfn); + return -EIO; + } else if (measure) + tdx_measure_page(kvm_tdx, gpa); return 0; } @@ -1091,6 +1148,94 @@ void tdx_flush_tlb(struct kvm_vcpu *vcpu) cpu_relax(); } +#define TDX_SEPT_PFERR PFERR_WRITE_MASK + +static int tdx_init_mem_region(struct kvm *kvm, struct kvm_tdx_cmd *cmd) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + struct kvm_tdx_init_mem_region region; + struct kvm_vcpu *vcpu; + struct page *page; + kvm_pfn_t pfn; + int idx, ret = 0; + + /* The BSP vCPU must be created before initializing memory regions. 
*/ + if (!atomic_read(&kvm->online_vcpus)) + return -EINVAL; + + if (cmd->flags & ~KVM_TDX_MEASURE_MEMORY_REGION) + return -EINVAL; + + if (copy_from_user(&region, (void __user *)cmd->data, sizeof(region))) + return -EFAULT; + + /* Sanity check */ + if (!IS_ALIGNED(region.source_addr, PAGE_SIZE) || + !IS_ALIGNED(region.gpa, PAGE_SIZE) || + !region.nr_pages || + region.gpa + (region.nr_pages << PAGE_SHIFT) <= region.gpa || + !kvm_is_private_gpa(kvm, region.gpa) || + !kvm_is_private_gpa(kvm, region.gpa + (region.nr_pages << PAGE_SHIFT))) + return -EINVAL; + + vcpu = kvm_get_vcpu(kvm, 0); + if (mutex_lock_killable(&vcpu->mutex)) + return -EINTR; + + vcpu_load(vcpu); + idx = srcu_read_lock(&kvm->srcu); + + kvm_mmu_reload(vcpu); + + while (region.nr_pages) { + if (signal_pending(current)) { + ret = -ERESTARTSYS; + break; + } + + if (need_resched()) + cond_resched(); + + + /* Pin the source page. */ + ret = get_user_pages_fast(region.source_addr, 1, 0, &page); + if (ret < 0) + break; + if (ret != 1) { + ret = -ENOMEM; + break; + } + + kvm_tdx->source_pa = pfn_to_hpa(page_to_pfn(page)) | + (cmd->flags & KVM_TDX_MEASURE_MEMORY_REGION); + + pfn = kvm_mmu_map_tdp_page(vcpu, region.gpa, TDX_SEPT_PFERR, + PG_LEVEL_4K); + if (is_error_noslot_pfn(pfn) || kvm->vm_bugged) + ret = -EFAULT; + else + ret = 0; + + put_page(page); + if (ret) + break; + + region.source_addr += PAGE_SIZE; + region.gpa += PAGE_SIZE; + region.nr_pages--; + } + + srcu_read_unlock(&kvm->srcu, idx); + vcpu_put(vcpu); + + mutex_unlock(&vcpu->mutex); + + if (copy_to_user((void __user *)cmd->data, &region, sizeof(region))) + ret = -EFAULT; + + return ret; +} + int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { struct kvm_tdx_cmd tdx_cmd; @@ -1107,6 +1252,9 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) case KVM_TDX_INIT_VM: r = tdx_td_init(kvm, &tdx_cmd); break; + case KVM_TDX_INIT_MEM_REGION: + r = tdx_init_mem_region(kvm, &tdx_cmd); + break; default: r = -EINVAL; goto out; diff --git
a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index 80d595c5f96f..686da2321683 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -23,6 +23,8 @@ struct kvm_tdx { u64 xfam; int hkid; + hpa_t source_pa; + bool finalized; atomic_t tdh_mem_track; diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h index 35e3b4aa2e96..37e713ffab72 100644 --- a/tools/arch/x86/include/uapi/asm/kvm.h +++ b/tools/arch/x86/include/uapi/asm/kvm.h @@ -540,6 +540,7 @@ enum kvm_tdx_cmd_id { KVM_TDX_CAPABILITIES = 0, KVM_TDX_INIT_VM, KVM_TDX_INIT_VCPU, + KVM_TDX_INIT_MEM_REGION, KVM_TDX_CMD_NR_MAX, }; @@ -617,4 +618,12 @@ struct kvm_tdx_init_vm { }; }; +#define KVM_TDX_MEASURE_MEMORY_REGION (1UL << 0) + +struct kvm_tdx_init_mem_region { + __u64 source_addr; + __u64 gpa; + __u64 nr_pages; +}; + #endif /* _ASM_X86_KVM_H */ From patchwork Sun Oct 30 06:23:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12892 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666496wru; Sat, 29 Oct 2022 23:30:24 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4ZCIOG0+wuqNXlOqedUq05tMZiC7nyJ50HQRNzXaXSf+cRsUBjIqgsIU7bJGORGhHPb6Pd X-Received: by 2002:a17:906:2681:b0:783:6a92:4c38 with SMTP id t1-20020a170906268100b007836a924c38mr6851072ejc.75.1667111424219; Sat, 29 Oct 2022 23:30:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111424; cv=none; d=google.com; s=arc-20160816; b=zeQJuF2arsQJ80dpfBi+GS90rnr+gEidfm3N3AKPWM1XgnthfM6EmeEP+mgkfW8Nj5 +4GZhtxY+BC03ZPmt7GfIHRDrDqJzIAZC6UB2bNtanoSAyN1moTCROe0fEUssfR3EL68 lv7MWDZot8hX/V0phPCtoQoPORkVKDVue5j00TmgmeDPHdakM/WWmXsvemowJjNg0H/4 tH1/IR159aJsDVzWVcZPvBF3wpcCTOwHu2DK1dXncDbLQZY7vlezMH9xVc33bbOFjyqX f8BX7dkQAA2ZO204DLFkKr5IqyRCnAS+E+jwKcTvuQii4Z9MsuzoJbkw26iYOouWfkCz 6pfg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; 
h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=j6YXgu1j/cryzUugCp7iFdN0inbCODZhrs2C/1ns6Ck=; b=hmZZ1XrN1NyTBOkgWlJQmT9SgvvFbg7E5dwkubYBrGp5eFh1ysShNMOfqj8ArFzIv6 uyhjWKZCyZORiqPJqIBjv3D+17orwRJ5zzF6XgNHQ//0rhkl+WTvaI186Ofw8lM31JBS 4OVZoUNuv08tPSsc3hd+Uitj+syA4J5DQwKZhc2dtITf99VPJu1WK2/Fnv7+R00tXvGw G1W/ynlnqxHZs+bwXX/F9+znkB+n0RwIA7UrwX6uJWvUVrC5KI9NC7f65kwXa+JA8efz cN5sIlo9XhyLGJZqGEZaFnb4iRF9qnXtxdCW1qGZdauZU7ZY6TWYnUBtxZqGZmDXKQ3h ifBw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=COwfKpvq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id i11-20020a50870b000000b00451e1aae675si3484460edb.547.2022.10.29.23.30.00; Sat, 29 Oct 2022 23:30:24 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=COwfKpvq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230241AbiJ3G32 (ORCPT + 99 others); Sun, 30 Oct 2022 02:29:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47722 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230336AbiJ3G0Q (ORCPT ); Sun, 30 Oct 2022 02:26:16 -0400 Received: from mga05.intel.com (mga05.intel.com 
[192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CB383283; Sat, 29 Oct 2022 23:24:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111059; x=1698647059; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0uHShJOfWkZALFatNyLGZvdWrcXhZtI8MbG7lnSItHY=; b=COwfKpvqTCWZ+NF+RQcj00EKTzlrJ6d76A6AAcSGNPhUlXauvhclrUEI kvEnyTCppBRKGoCTkGkt9JFn9+eOnURjkVdtgCpbp/kXIXBHTYkwt+R0/ TroGb+PYXD/rErQo83O6s2pSh5C2z8cZwIV3K/ngroPyuuVobWC0Wa3KS rpfCA0QbLCt5SG4vKcfHz/C64eqUb3Ea3FTgHOiG/6bqQS6gjynHgMcFF bH/uXfS55jCa2GAk0IVQ6UVM9w/XUA1JUnWDP5YoRfud2A6RjvCrgiXFQ 3CZDCPe2eQNh4MYeVSLcfY7AHCK8FpmeUAMOf1OLy5+rYUNmug/Fz9SfX w==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037178" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037178" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:08 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393054" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393054" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:08 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 065/108] KVM: TDX: Finalize VM initialization Date: Sat, 29 Oct 2022 23:23:06 -0700 Message-Id: <3b1e9bc1488e592faac5ec2df5c94f88f3276ea4.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, 
SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093028759377395?= X-GMAIL-MSGID: =?utf-8?q?1748093028759377395?= From: Isaku Yamahata To protect the initial contents of the guest TD, the TDX module measures the guest TD during the build process using SHA-384. This measurement of the guest TD contents must be completed before the guest TD is ready to run. Add a new subcommand, KVM_TDX_FINALIZE_VM, of the VM-scoped KVM_MEMORY_ENCRYPT_OP to finalize the measurement and mark the TDX VM as ready to run. Signed-off-by: Isaku Yamahata --- arch/x86/include/uapi/asm/kvm.h | 1 + arch/x86/kvm/vmx/tdx.c | 31 +++++++++++++++++++++++++++ tools/arch/x86/include/uapi/asm/kvm.h | 1 + 3 files changed, 33 insertions(+) diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index 6ae52926e05a..a8e3945a3ea2 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -541,6 +541,7 @@ enum kvm_tdx_cmd_id { KVM_TDX_INIT_VM, KVM_TDX_INIT_VCPU, KVM_TDX_INIT_MEM_REGION, + KVM_TDX_FINALIZE_VM, KVM_TDX_CMD_NR_MAX, }; diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 7c00f71d42af..cce6ccd4a0be 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -1236,6 +1236,34 @@ static int tdx_init_mem_region(struct kvm *kvm, struct kvm_tdx_cmd *cmd) return ret; } +static int tdx_td_finalizemr(struct kvm *kvm) +{ + struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); + u64 err; + + if (!is_td_initialized(kvm) || is_td_finalized(kvm_tdx)) + return -EINVAL; + + err = tdh_mr_finalize(kvm_tdx->tdr.pa); + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_MR_FINALIZE, err, NULL); + return -EIO; + } + + /* + * Blindly do TDH_MEM_TRACK after finalizing the measurement to handle + * the case
where SEPT entries were zapped/blocked, e.g. from failed + * NUMA balancing, after they were added to the TD via + * tdx_init_mem_region(). TDX module doesn't allow TDH_MEM_TRACK prior + * to TDH.MR.FINALIZE, and conversely requires TDH.MEM.TRACK for entries + * that were TDH.MEM.RANGE.BLOCK'd prior to TDH.MR.FINALIZE. + */ + (void)tdh_mem_track(to_kvm_tdx(kvm)->tdr.pa); + + kvm_tdx->finalized = true; + return 0; +} + int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { struct kvm_tdx_cmd tdx_cmd; @@ -1255,6 +1283,9 @@ int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) case KVM_TDX_INIT_MEM_REGION: r = tdx_init_mem_region(kvm, &tdx_cmd); break; + case KVM_TDX_FINALIZE_VM: + r = tdx_td_finalizemr(kvm); + break; default: r = -EINVAL; goto out; diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h index 37e713ffab72..0aeb4639be89 100644 --- a/tools/arch/x86/include/uapi/asm/kvm.h +++ b/tools/arch/x86/include/uapi/asm/kvm.h @@ -541,6 +541,7 @@ enum kvm_tdx_cmd_id { KVM_TDX_INIT_VM, KVM_TDX_INIT_VCPU, KVM_TDX_INIT_MEM_REGION, + KVM_TDX_FINALIZE_VM, KVM_TDX_CMD_NR_MAX, }; From patchwork Sun Oct 30 06:23:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12889 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666480wru; Sat, 29 Oct 2022 23:30:20 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5XWdBsG8HFqGKqzJfe3YWoksIrrgGqdFx2ucVdao4cqyP1uBuLy1gmn/evYSgIYfxoB1eO X-Received: by 2002:a63:fd48:0:b0:46e:d8b1:8243 with SMTP id m8-20020a63fd48000000b0046ed8b18243mr6992875pgj.350.1667111420123; Sat, 29 Oct 2022 23:30:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111420; cv=none; d=google.com; s=arc-20160816; b=Yl+u+MaWVV7vfUMhUfWbeC5sHdXjL81IDUij+nKMz5GkPDdz1BgVy8S4NcB/GGYUH9 ZBFA0GvGswgbNCBLnKks2UKxe8ndH6SrKQGWOspBU3UP5Xbi3mN2bvUBkCtJAmoMg1DS 
xzFx+6yP0BLQNNXK4RSRaVHSclKgFkYYRoOCzkJKGRjluK4Ag1xaSO1SaCX7DM8RbgHL hBDUvx4lJpmTw89A+UcFlxXInacZLUqWNpofiyls0QcobABTbSk7R2/ROQ9O0+h32NMR nNzaqYosLAAb4Kfca+Ii732fnygAYEfLRcgJDzLvZOEz4y3uhhvZcfTvcFw50LtNQrPK VHaQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=clH6UDWBYMsTFmSai1aVhEGtQID2TuVGtttScGy9b5M=; b=b86HgIs8rf28VEjKhAaSaQPx0sEiWlwy/hB0WNFgiut2yxhK6ZuRUW7IGGlI95UhfG 0yDz4QNJAXa90yf3wvEhepEIPYRK8Mf0ZVRG0uqwRqzSyRqwyLxx/st7zYjlhIofIU+t fTcwAVFptxAZtX4JLyM+lyDzXWRWE6SIT/UuALUeAw9TUns1xjoFbsouzXPwipep3oiE xNigo9sG7Jo4tsQbKyBQPZBxHVbNMLkqDtjxXVk+ZyGu15wW9AdBQMWd4RI220mCKOVw 3VGcCTZT5jIRa+CDML2R8f1VZtVUsughbUrOdx5UTX6uHEgMfwI8jlR/LNsZuuPwWiVd DA0g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ShleHIpn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id 7-20020a631047000000b0045abcc62064si4493483pgq.695.2022.10.29.23.30.07; Sat, 29 Oct 2022 23:30:20 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ShleHIpn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231214AbiJ3G3d (ORCPT + 99 others); Sun, 30 Oct 2022 02:29:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46972 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229890AbiJ3G0X (ORCPT ); Sun, 30 Oct 2022 02:26:23 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E83A9228; Sat, 29 Oct 2022 23:24:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111060; x=1698647060; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aIudZd3USQowTJaovq/Lv5Tlgdhp2EQL5ZZ1g16g3i0=; b=ShleHIpnPKu+sII5anUU995YeDbt9ghLqjyN685FePR65J65drzhjwhH vBfXWNlzPBxr5C39gJF5xbQ3T8zaBSoPuJXpF/g4e9T6v5MKBpM/hp3n2 Fr0MnVYj9D8QK3FnVRirGfsVhEobazqEacYf0uFzcLLzEpNeGtkSfuaiS FpV6s+t5k6hDf+XRNxNaGUGlnYLx9hRgZUNKvbl4/DuzcdZgdlQe+6bDM giBqELVW48xDxSW6RX/6PlOkyh9UnLBY44+0u2zZUVmrbtk3Z1T/nrZeh sP+GgfBxW4jh4pNxEU6Wazy9hazshKcPlSW/tUx+0XCd/llDbU4LY2+BM w==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037179" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037179" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) 
by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:08 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393057" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393057" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:08 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 066/108] [MARKER] The start of TDX KVM patch series: TD vcpu enter/exit Date: Sat, 29 Oct 2022 23:23:07 -0700 Message-Id: <911804bc3c3773bcdc588d665aa472a76e8856f9.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093024708511778?= X-GMAIL-MSGID: =?utf-8?q?1748093024708511778?= From: Isaku Yamahata This empty commit is to mark the start of patch series of TD vcpu enter/exit. 
Signed-off-by: Isaku Yamahata --- Documentation/virt/kvm/intel-tdx-layer-status.rst | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/Documentation/virt/kvm/intel-tdx-layer-status.rst b/Documentation/virt/kvm/intel-tdx-layer-status.rst index 53897312699f..b51e8e6b1541 100644 --- a/Documentation/virt/kvm/intel-tdx-layer-status.rst +++ b/Documentation/virt/kvm/intel-tdx-layer-status.rst @@ -12,6 +12,7 @@ What qemu can do - Qemu can create/destroy guest of TDX vm type. - Qemu can create/destroy vcpu of TDX vm type. - Qemu can populate initial guest memory image. +- Qemu can finalize guest TD. Patch Layer status ------------------ @@ -21,8 +22,8 @@ Patch Layer status * TD VM creation/destruction: Applied * TD vcpu creation/destruction: Applied * TDX EPT violation: Applied -* TD finalization: Applying -* TD vcpu enter/exit: Not yet +* TD finalization: Applied +* TD vcpu enter/exit: Applying * TD vcpu interrupts/exit/hypercall: Not yet * KVM MMU GPA shared bits: Applied From patchwork Sun Oct 30 06:23:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12890 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666487wru; Sat, 29 Oct 2022 23:30:23 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6xOIK2+T4yXKF9yzYDuyOmmTixtMWFjQn9m4cW6QNTfTSERO4ce9ALzNz/C5ZeFfPXX21d X-Received: by 2002:a17:90b:1252:b0:213:beea:80c0 with SMTP id gx18-20020a17090b125200b00213beea80c0mr4193961pjb.169.1667111422734; Sat, 29 Oct 2022 23:30:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111422; cv=none; d=google.com; s=arc-20160816; b=nVHtZ4O9MknsUtLiPKJ4rJi630PF58lV4F8y6wQWE/Xn0PSLoIpXXlbSR3+T7He3x7 HlCXeP5qkp48+xKWvgmReZLe74JrJ5Ewu/oonhrvZm1TXqlmK9sweMGATVrkzyVvJ/C5 n6NstbdqSx82qzb4qUlAzUg9wiQJps2t6ptK6CmDVn3OIQuIJX+Ljmp4vgJAyKbvmsil JJSZ4vuFgQmEWdsmJOANacmE+M468odXmY8evJgSQcqprKDl82wG1QrijIAh8vRdxGEr 
u440VRVHddsKlSr/d8wuhNPUyOPtjh1XMfS5kourjetrd84rgSeq3P5725/c+j0x7Pjz SL0A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=wNog1bT+fLbiTwkmcOkUXE0+hVn8y2AgjTOSRcyNmAc=; b=Jeoxz82XQlD7nfZxTYq+6absjO7weoYULziKyTVlbm+rWQdZrBGzYDI+lgOc2KqkG/ zbSsD6dxipfmH+/3Et8OJxBrIYlWua0NpLAXVe9pzWa4DlKj5xLsI5e/YjQR6upe22Dx OJNvIxTRFo5XCsKvHo2Ya1nS4ItuSZWV2RlO379rD1n0V/vhAx+E+3/JpuUblDJPjP3M /POFkE3vVMqlkLazB8D3FuK0kSfMTQsn9flcRyhMt4QuLfJ+y3JcEoUNiglwkJMDnx5t bxDR2YAnJ5yzTDGtaWx3fPdC1qdurePMPmUkvPxJO5mLAr43YNYkrHVVifGwJrTC7I14 JPJw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=PDi3L3yn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id w2-20020a634902000000b004473c654f95si4503552pga.653.2022.10.29.23.30.10; Sat, 29 Oct 2022 23:30:22 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=PDi3L3yn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231219AbiJ3G3g (ORCPT + 99 others); Sun, 30 Oct 2022 02:29:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47720 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230419AbiJ3G1P (ORCPT ); Sun, 30 Oct 2022 02:27:15 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E9BF228E; Sat, 29 Oct 2022 23:24:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111062; x=1698647062; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=PZxmuf+iutkFw1axxqZiYqqQxLbEuuA3Sz6U+STmYHk=; b=PDi3L3yn0YIs+UV0Nin3YtlI1D6ZHMhPM3s86SSp2TpxIZt7oyYD/kD7 +/Y6g1h4Dwy7HZWc2g7xLQmqeBHAHug87GnkRsoYep2CXoKGqGILJJyh3 Eb7eelvUSaVF1wUR7rhJX/3gfWk/+6ySyx5hVAwo7GIxj9QcjCSwlMEL3 Xw5UEJyX/uK8R1X5xjsHsyNDAm1YezAL8WQuUDrmLhezgXcwprCn5hq6B XCbCATJhomg5xVb3I8vxreYdWcINKVnuY1WNrOJ3/2rRg1l5Nq5Lx2heK O/dDrUG6RCib4fLDUUigOv2rmbG29RuNuPL9PHHwjmdNm9qirWcMdRWe9 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037180" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037180" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) 
by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:08 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393060" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393060" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:08 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 067/108] KVM: TDX: Add helper assembly function to TDX vcpu Date: Sat, 29 Oct 2022 23:23:08 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093027397589570?= X-GMAIL-MSGID: =?utf-8?q?1748093027397589570?= From: Isaku Yamahata TDX defines an API, with its own ABI, for running a TDX vcpu. Define an assembly helper function that runs a TDX vcpu and hides this special ABI, so that C code can call it using the ordinary function call ABI.
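For illustration only (not part of this patch): a C sketch of how a caller might interpret the 64-bit value that __tdx_vcpu_run returns in RAX. It follows the two checks the assembly in this patch performs itself: bit 63 set means TDENTER failed, and a low 16-bit value of 77 means the TD exit was a TDVMCALL. TDENTER_ERROR_BIT and EXIT_REASON_TDCALL are taken from this patch; the struct, its field names, and sketch_decode_td_exit() are hypothetical. The TDX_SEAMCALL_VMFAILINVALID and TDX_SW_ERROR paths are not modelled, since their values are defined elsewhere in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the vmenter.S hunk of this patch. */
#define TDENTER_ERROR_BIT	63
#define EXIT_REASON_TDCALL	77

/* Hypothetical result type for this sketch. */
struct sketch_td_exit {
	bool failed;		/* bit 63 of the return value was set */
	bool is_tdvmcall;	/* success and low 16 bits == EXIT_REASON_TDCALL */
	uint16_t low16;		/* low 16 bits of the return value */
};

/*
 * Hypothetical helper: decode the return value the same way the assembly
 * does ("bt $TDENTER_ERROR_BIT, %rax" then "cmp $EXIT_REASON_TDCALL, %ax").
 */
void sketch_decode_td_exit(uint64_t ret, struct sketch_td_exit *out)
{
	assert(out != NULL);

	out->failed = (ret >> TDENTER_ERROR_BIT) & 1;
	out->low16 = (uint16_t)ret;
	/* The assembly only compares the exit reason when TDENTER succeeded. */
	out->is_tdvmcall = !out->failed && out->low16 == EXIT_REASON_TDCALL;
}
```

In the helper itself the same distinction decides which guest registers are written back to @regs: RCX, RDX, R8 and R9 are saved on every successful exit, while the remaining GPRs are saved only when the exit was a TDVMCALL.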
Signed-off-by: Isaku Yamahata --- arch/x86/kvm/vmx/vmenter.S | 157 +++++++++++++++++++++++++++++++++++++ 1 file changed, 157 insertions(+) diff --git a/arch/x86/kvm/vmx/vmenter.S b/arch/x86/kvm/vmx/vmenter.S index 8477d8bdd69c..9066eea1ede5 100644 --- a/arch/x86/kvm/vmx/vmenter.S +++ b/arch/x86/kvm/vmx/vmenter.S @@ -3,6 +3,7 @@ #include #include #include +#include #include #include #include @@ -31,6 +32,13 @@ #define VCPU_R15 __VCPU_REGS_R15 * WORD_SIZE #endif +#ifdef CONFIG_INTEL_TDX_HOST +#define TDENTER 0 +#define EXIT_REASON_TDCALL 77 +#define TDENTER_ERROR_BIT 63 +#define seamcall .byte 0x66,0x0f,0x01,0xcf +#endif + .section .noinstr.text, "ax" /** @@ -350,3 +358,152 @@ SYM_FUNC_START(vmx_do_interrupt_nmi_irqoff) pop %_ASM_BP RET SYM_FUNC_END(vmx_do_interrupt_nmi_irqoff) + +#ifdef CONFIG_INTEL_TDX_HOST + +.pushsection .noinstr.text, "ax" + +/** + * __tdx_vcpu_run - Call SEAMCALL(TDENTER) to run a TD vcpu + * @tdvpr: physical address of TDVPR + * @regs: void * (to registers of TDVCPU) + * @gpr_mask: non-zero if guest registers need to be loaded prior to TDENTER + * + * Returns: + * TD-Exit Reason + * + * Note: KVM doesn't support using XMM in its hypercalls, it's the HyperV + * code's responsibility to save/restore XMM registers on TDVMCALL. + */ +SYM_FUNC_START(__tdx_vcpu_run) + push %rbp + mov %rsp, %rbp + + push %r15 + push %r14 + push %r13 + push %r12 + push %rbx + + /* Save @regs, which is needed after TDENTER to capture output. */ + push %rsi + + /* Load @tdvpr to RCX */ + mov %rdi, %rcx + + /* No need to load guest GPRs if the last exit wasn't a TDVMCALL. */ + test %dx, %dx + je 1f + + /* Load @regs to RAX, which will be clobbered with $TDENTER anyways. 
*/ + mov %rsi, %rax + + mov VCPU_RBX(%rax), %rbx + mov VCPU_RDX(%rax), %rdx + mov VCPU_RBP(%rax), %rbp + mov VCPU_RSI(%rax), %rsi + mov VCPU_RDI(%rax), %rdi + + mov VCPU_R8 (%rax), %r8 + mov VCPU_R9 (%rax), %r9 + mov VCPU_R10(%rax), %r10 + mov VCPU_R11(%rax), %r11 + mov VCPU_R12(%rax), %r12 + mov VCPU_R13(%rax), %r13 + mov VCPU_R14(%rax), %r14 + mov VCPU_R15(%rax), %r15 + + /* Load TDENTER to RAX. This kills the @regs pointer! */ +1: mov $TDENTER, %rax + +2: seamcall + + /* + * Use same return value convention to tdxcall.S. + * TDX_SEAMCALL_VMFAILINVALID doesn't conflict with any TDX status code. + */ + jnc 3f + mov $TDX_SEAMCALL_VMFAILINVALID, %rax + jmp 5f +3: + + /* Skip to the exit path if TDENTER failed. */ + bt $TDENTER_ERROR_BIT, %rax + jc 5f + + /* Temporarily save the TD-Exit reason. */ + push %rax + + /* check if TD-exit due to TDVMCALL */ + cmp $EXIT_REASON_TDCALL, %ax + + /* Reload @regs to RAX. */ + mov 8(%rsp), %rax + + /* Jump on non-TDVMCALL */ + jne 4f + + /* Save all output from SEAMCALL(TDENTER) */ + mov %rbx, VCPU_RBX(%rax) + mov %rbp, VCPU_RBP(%rax) + mov %rsi, VCPU_RSI(%rax) + mov %rdi, VCPU_RDI(%rax) + mov %r10, VCPU_R10(%rax) + mov %r11, VCPU_R11(%rax) + mov %r12, VCPU_R12(%rax) + mov %r13, VCPU_R13(%rax) + mov %r14, VCPU_R14(%rax) + mov %r15, VCPU_R15(%rax) + +4: mov %rcx, VCPU_RCX(%rax) + mov %rdx, VCPU_RDX(%rax) + mov %r8, VCPU_R8 (%rax) + mov %r9, VCPU_R9 (%rax) + + /* + * Clear all general purpose registers except RSP and RAX to prevent + * speculative use of the guest's values. + */ + xor %rbx, %rbx + xor %rcx, %rcx + xor %rdx, %rdx + xor %rsi, %rsi + xor %rdi, %rdi + xor %rbp, %rbp + xor %r8, %r8 + xor %r9, %r9 + xor %r10, %r10 + xor %r11, %r11 + xor %r12, %r12 + xor %r13, %r13 + xor %r14, %r14 + xor %r15, %r15 + + /* Restore the TD-Exit reason to RAX for return. */ + pop %rax + + /* "POP" @regs. 
*/ +5: add $8, %rsp + pop %rbx + pop %r12 + pop %r13 + pop %r14 + pop %r15 + + pop %rbp + RET + +6: cmpb $0, kvm_rebooting + je 1f + mov $TDX_SW_ERROR, %r12 + orq %r12, %rax + jmp 5b +1: ud2 + /* Use FAULT version to know what fault happened. */ + _ASM_EXTABLE_FAULT(2b, 6b) + +SYM_FUNC_END(__tdx_vcpu_run) + +.popsection + +#endif From patchwork Sun Oct 30 06:23:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12893 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666521wru; Sat, 29 Oct 2022 23:30:28 -0700 (PDT) X-Google-Smtp-Source: AMsMyM73LAIOIOHUHuXP2uoIji8WKHrmZh4TnmTsb+pKdlkqJ82/kGMH3QHHphd1cz2mhSQIg4If X-Received: by 2002:a17:902:ce0f:b0:187:640:42f with SMTP id k15-20020a170902ce0f00b001870640042fmr6542495plg.115.1667111428224; Sat, 29 Oct 2022 23:30:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111428; cv=none; d=google.com; s=arc-20160816; b=MnC+zT1RWiLDLXqdQoI98S7QQtrLTRKGS84oYn81yOCvHvr1FdrcYxtvmaR5ZW0jun paGcGn3ZAWXrE4yT4RvSaCWkpnsB9X7oc62FkDqTADFil164WJFeu8W/UFGRKDhTuK+4 dw/uE/+0kfcF68HR4i+fDO2cycxh5wTMYHc+JkcMembpAcoPohDm3K6tSKkcCTVyiFVG YhJE2nd++KmR0sbWE5HfXAdWUOINjIMuzO2Qkj2fkmMmq1Qpp7p2qDOOqWdzM2uByC2/ IamsrTsRtdipI9fbrNhfIlnCTUcaC7mey2kUMN/bBsJBrq6cW+1ozhsZ9+mTtpPdn27h oNtw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=3cbdT4T9NPPB4hZ4J1GX4CsqdwSxGE2OGkXYMSatX/U=; b=mdaZnBP234DqwB1Vd1K1J5oIqLemr79Esh85EQgc1fZunqB/uSIPiQ4UktmzjntbEg twqLKub9U4TQno9nKFdD5ZOHuq4OsrSn0kkCemdto6Ccqrp1ypRz73rWsZUgtdB3f/kU gAiYKW2RY/q+7JXAlZojHPCsU6lgrowJF59Vk9ulcMvorHQj5lhGcivN4GGhHiuv4Bnn YIO6TOM4N3wT8s6qqEw9OliIzF6QqTqPMWlERf2K69jupUAzWxhlWB7F1NICg3tI/ydr Gnc+9ty8oMwDbQ91wrBGSOykDR4ospGcG1fObSVLmdc9Qx2B/ciz93m9yrN1f8+ku6B+ 
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 068/108] KVM: TDX: Implement TDX vcpu enter/exit path
Date: Sat, 29 Oct 2022 23:23:09 -0700
Message-Id: <7e4e5bb55c1c346694719078d8ddc3e69c1964b3.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1

From: Isaku Yamahata

This patch implements running a TDX vcpu. Once a vcpu runs on a logical
processor (LP), the TDX vcpu is associated with it. When the TDX vcpu moves
to another LP, it needs to flush its state on the LP it was associated with.
When destroying a TDX vcpu, it needs to complete the flush and flush the CPU
memory cache. Track which LP the TDX vcpu runs on and flush it as necessary.

Do nothing on the sched_in event, as TDX doesn't support pause-loop exiting.

TDX vcpu execution requires restoring the PMU debug store after returning to
KVM because the TDX module unconditionally resets the value. To reuse the
existing code, export perf_restore_debug_store.

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/main.c    | 21 +++++++++++++++++++--
 arch/x86/kvm/vmx/tdx.c     | 32 ++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/tdx.h     | 33 +++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/x86_ops.h |  2 ++
 arch/x86/kvm/x86.c         |  1 +
 5 files changed, 87 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 3163915e2e3d..252c33820271 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -100,6 +100,23 @@ static void vt_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	return vmx_vcpu_reset(vcpu, init_event);
 }
 
+static int vt_vcpu_pre_run(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		/* Unconditionally continue to vcpu_run(). */
+		return 1;
+
+	return vmx_vcpu_pre_run(vcpu);
+}
+
+static fastpath_t vt_vcpu_run(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_vcpu_run(vcpu);
+
+	return vmx_vcpu_run(vcpu);
+}
+
 static void vt_flush_tlb_all(struct kvm_vcpu *vcpu)
 {
 	if (is_td_vcpu(vcpu))
@@ -232,8 +249,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.flush_tlb_gva = vt_flush_tlb_gva,
 	.flush_tlb_guest = vt_flush_tlb_guest,
 
-	.vcpu_pre_run = vmx_vcpu_pre_run,
-	.vcpu_run = vmx_vcpu_run,
+	.vcpu_pre_run = vt_vcpu_pre_run,
+	.vcpu_run = vt_vcpu_run,
 	.handle_exit = vmx_handle_exit,
 	.skip_emulated_instruction = vmx_skip_emulated_instruction,
 	.update_emulated_instruction = vmx_update_emulated_instruction,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index cce6ccd4a0be..2f57f62eb103 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -10,6 +10,9 @@
 #include "x86.h"
 #include "mmu.h"
 
+#include
+#include "trace.h"
+
 #undef pr_fmt
 #define pr_fmt(fmt) "tdx: " fmt
 
@@ -412,6 +415,35 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	vcpu->kvm->vm_bugged = true;
 }
 
+u64 __tdx_vcpu_run(hpa_t tdvpr, void *regs, u32 regs_mask);
+
+static noinstr void tdx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
+					struct vcpu_tdx *tdx)
+{
+	guest_enter_irqoff();
+	tdx->exit_reason.full = __tdx_vcpu_run(tdx->tdvpr.pa, vcpu->arch.regs, 0);
+	guest_exit_irqoff();
+}
+
+fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+
+	if (unlikely(vcpu->kvm->vm_bugged)) {
+		tdx->exit_reason.full = TDX_NON_RECOVERABLE_VCPU;
+		return EXIT_FASTPATH_NONE;
+	}
+
+	trace_kvm_entry(vcpu);
+
+	tdx_vcpu_enter_exit(vcpu, tdx);
+
+	vcpu->arch.regs_avail &= ~VMX_REGS_LAZY_LOAD_SET;
+	trace_kvm_exit(vcpu, KVM_ISA_VMX);
+
+	return EXIT_FASTPATH_NONE;
+}
+
 void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level)
 {
 	td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa & PAGE_MASK);
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index 686da2321683..064e1f2f61d5 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -31,12 +31,45 @@ struct kvm_tdx {
 	u64 tsc_offset;
 };
 
+union tdx_exit_reason {
+	struct {
+		/* 31:0 mirror the VMX Exit Reason format */
+		u64 basic : 16;
+		u64 reserved16 : 1;
+		u64 reserved17 : 1;
+		u64 reserved18 : 1;
+		u64 reserved19 : 1;
+		u64 reserved20 : 1;
+		u64 reserved21 : 1;
+		u64 reserved22 : 1;
+		u64 reserved23 : 1;
+		u64 reserved24 : 1;
+		u64 reserved25 : 1;
+		u64 bus_lock_detected : 1;
+		u64 enclave_mode : 1;
+		u64 smi_pending_mtf : 1;
+		u64 smi_from_vmx_root : 1;
+		u64 reserved30 : 1;
+		u64 failed_vmentry : 1;
+
+		/* 63:32 are TDX specific */
+		u64 details_l1 : 8;
+		u64 class : 8;
+		u64 reserved61_48 : 14;
+		u64 non_recoverable : 1;
+		u64 error : 1;
+	};
+	u64 full;
+};
+
 struct vcpu_tdx {
 	struct kvm_vcpu vcpu;
 
 	struct tdx_td_page tdvpr;
 	struct tdx_td_page *tdvpx;
 
+	union tdx_exit_reason exit_reason;
+
 	bool vcpu_initialized;
 
 	/*
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 5fccd98be06e..ccae338dcfdd 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -147,6 +147,7 @@ void tdx_vm_free(struct kvm *kvm);
 int tdx_vcpu_create(struct kvm_vcpu *vcpu);
 void tdx_vcpu_free(struct kvm_vcpu *vcpu);
 void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
+fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu);
 
 int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
@@ -169,6 +170,7 @@ static inline void tdx_vm_free(struct kvm *kvm) {}
 static inline int tdx_vcpu_create(struct kvm_vcpu *vcpu) { return -EOPNOTSUPP; }
 static inline void tdx_vcpu_free(struct kvm_vcpu *vcpu) {}
 static inline void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) {}
+static inline fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu) { return EXIT_FASTPATH_NONE; }
 
 static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; }
 static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5dadd0f9a10e..3662f64f3b5e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -307,6 +307,7 @@ const struct kvm_stats_header kvm_vcpu_stats_header = {
 };
 
 u64 __read_mostly host_xcr0;
+EXPORT_SYMBOL_GPL(host_xcr0);
 
 static struct kmem_cache *x86_emulator_cache;
 

From patchwork Sun Oct 30 06:23:10 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12898
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 069/108] KVM: TDX: vcpu_run: save/restore host state(host kernel gs)
Date: Sat, 29 Oct 2022 23:23:10 -0700
Message-Id: <4bd691b93fc90f5cad66688e12e671ad12d4c913.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1

From: Isaku Yamahata

On entering/exiting a TDX vcpu, the preserved or clobbered CPU state differs
from the VMX case. Add TDX hooks to save/restore host/guest CPU state.
Save/restore the kernel GS base MSR.

Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/vmx/main.c         | 28 ++++++++++++++++++++++++++--
 arch/x86/kvm/vmx/tdx.c          | 42 +++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/tdx.h          |  4 ++++
 arch/x86/kvm/vmx/x86_ops.h      |  4 ++++
 arch/x86/kvm/x86.c              | 10 ++++++--
 6 files changed, 85 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5f9634c130d0..b225cdfac4bc 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2080,6 +2080,7 @@ int kvm_pv_send_ipi(struct kvm *kvm, unsigned long ipi_bitmap_low,
 
 int kvm_add_user_return_msr(u32 msr);
 int kvm_find_user_return_msr(u32 msr);
+void kvm_user_return_msr_init_cpu(void);
 int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
 
 static inline bool kvm_is_supported_user_return_msr(u32 msr)
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 252c33820271..379b3343557b 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -100,6 +100,30 @@ static void vt_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	return vmx_vcpu_reset(vcpu, init_event);
 }
 
+static void vt_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * All host state is saved/restored across SEAMCALL/SEAMRET, and the
+	 * guest state of a TD is obviously off limits. Deferring MSRs and DRs
+	 * is pointless because the TDX module needs to load *something* so as
+	 * not to expose guest state.
+	 */
+	if (is_td_vcpu(vcpu)) {
+		tdx_prepare_switch_to_guest(vcpu);
+		return;
+	}
+
+	vmx_prepare_switch_to_guest(vcpu);
+}
+
+static void vt_vcpu_put(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_vcpu_put(vcpu);
+
+	return vmx_vcpu_put(vcpu);
+}
+
 static int vt_vcpu_pre_run(struct kvm_vcpu *vcpu)
 {
 	if (is_td_vcpu(vcpu))
@@ -214,9 +238,9 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.vcpu_free = vt_vcpu_free,
 	.vcpu_reset = vt_vcpu_reset,
 
-	.prepare_switch_to_guest = vmx_prepare_switch_to_guest,
+	.prepare_switch_to_guest = vt_prepare_switch_to_guest,
 	.vcpu_load = vmx_vcpu_load,
-	.vcpu_put = vmx_vcpu_put,
+	.vcpu_put = vt_vcpu_put,
 
 	.update_exception_bitmap = vmx_update_exception_bitmap,
 	.get_msr_feature = vmx_get_msr_feature,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 2f57f62eb103..021040fdd630 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 #include
+#include
 
 #include
 
@@ -329,6 +330,8 @@ int tdx_vm_init(struct kvm *kvm)
 
 int tdx_vcpu_create(struct kvm_vcpu *vcpu)
 {
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+
 	/* TDX only supports x2APIC, which requires an in-kernel local APIC. */
 	if (!vcpu->arch.apic)
 		return -EINVAL;
@@ -345,9 +348,46 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu)
 	vcpu->arch.guest_state_protected =
 		!(to_kvm_tdx(vcpu->kvm)->attributes & TDX_TD_ATTRIBUTE_DEBUG);
 
+	tdx->host_state_need_save = true;
+	tdx->host_state_need_restore = false;
+
 	return 0;
 }
 
+void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+
+	kvm_user_return_msr_init_cpu();
+	if (!tdx->host_state_need_save)
+		return;
+
+	if (likely(is_64bit_mm(current->mm)))
+		tdx->msr_host_kernel_gs_base = current->thread.gsbase;
+	else
+		tdx->msr_host_kernel_gs_base = read_msr(MSR_KERNEL_GS_BASE);
+
+	tdx->host_state_need_save = false;
+}
+
+static void tdx_prepare_switch_to_host(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+
+	tdx->host_state_need_save = true;
+	if (!tdx->host_state_need_restore)
+		return;
+
+	wrmsrl(MSR_KERNEL_GS_BASE, tdx->msr_host_kernel_gs_base);
+	tdx->host_state_need_restore = false;
+}
+
+void tdx_vcpu_put(struct kvm_vcpu *vcpu)
+{
+	vmx_vcpu_pi_put(vcpu);
+	tdx_prepare_switch_to_host(vcpu);
+}
+
 void tdx_vcpu_free(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_tdx *tdx = to_tdx(vcpu);
@@ -438,6 +478,8 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu)
 
 	tdx_vcpu_enter_exit(vcpu, tdx);
 
+	tdx->host_state_need_restore = true;
+
 	vcpu->arch.regs_avail &= ~VMX_REGS_LAZY_LOAD_SET;
 	trace_kvm_exit(vcpu, KVM_ISA_VMX);
 
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index 064e1f2f61d5..e5f973b2d752 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -72,6 +72,10 @@ struct vcpu_tdx {
 
 	bool vcpu_initialized;
 
+	bool host_state_need_save;
+	bool host_state_need_restore;
+	u64 msr_host_kernel_gs_base;
+
 	/*
 	 * Dummy to make pmu_intel not corrupt memory.
 	 * TODO: Support PMU for TDX. Future work.
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index ccae338dcfdd..a4e50c5a4bf5 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -148,6 +148,8 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu);
 void tdx_vcpu_free(struct kvm_vcpu *vcpu);
 void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
 fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu);
+void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu);
+void tdx_vcpu_put(struct kvm_vcpu *vcpu);
 
 int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
@@ -171,6 +173,8 @@ static inline int tdx_vcpu_create(struct kvm_vcpu *vcpu) { return -EOPNOTSUPP; }
 static inline void tdx_vcpu_free(struct kvm_vcpu *vcpu) {}
 static inline void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) {}
 static inline fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu) { return EXIT_FASTPATH_NONE; }
+static inline void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) {}
+static inline void tdx_vcpu_put(struct kvm_vcpu *vcpu) {}
 
 static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; }
 static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 3662f64f3b5e..65541bfebb37 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -418,7 +418,7 @@ int kvm_find_user_return_msr(u32 msr)
 }
 EXPORT_SYMBOL_GPL(kvm_find_user_return_msr);
 
-static void kvm_user_return_msr_init_cpu(struct kvm_user_return_msrs *msrs)
+static void __kvm_user_return_msr_init_cpu(struct kvm_user_return_msrs *msrs)
 {
 	u64 value;
 	int i;
@@ -434,12 +434,18 @@ static void kvm_user_return_msr_init_cpu(struct kvm_user_return_msrs *msrs)
 	msrs->initialized = true;
 }
 
+void kvm_user_return_msr_init_cpu(void)
+{
+	__kvm_user_return_msr_init_cpu(this_cpu_ptr(user_return_msrs));
+}
+EXPORT_SYMBOL_GPL(kvm_user_return_msr_init_cpu);
+
 int kvm_set_user_return_msr(unsigned slot, u64 value, u64 mask)
 {
 	struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs);
 	int err;
 
-	kvm_user_return_msr_init_cpu(msrs);
+	__kvm_user_return_msr_init_cpu(msrs);
 
 	value = (value & mask) | (msrs->values[slot].host & ~mask);
 	if (value == msrs->values[slot].curr)

From patchwork Sun Oct 30 06:23:11 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12895
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 070/108] KVM: TDX: restore host xsave state when exit from the guest TD
Date: Sat, 29 Oct 2022 23:23:11 -0700
Message-Id: <1be74c1754ef1401d5413270112651faefa8b41b.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1

From: Isaku Yamahata

On exiting from the guest TD, xsave state is clobbered. Restore xsave state
on TD exit.

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/tdx.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 021040fdd630..3ec465cbaeef 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -2,6 +2,7 @@
 #include
 #include
 
+#include
 #include
 
 #include "capabilities.h"
@@ -455,6 +456,22 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	vcpu->kvm->vm_bugged = true;
 }
 
+static void tdx_restore_host_xsave_state(struct kvm_vcpu *vcpu)
+{
+	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
+
+	if (static_cpu_has(X86_FEATURE_XSAVE) &&
+	    host_xcr0 != (kvm_tdx->xfam & kvm_caps.supported_xcr0))
+		xsetbv(XCR_XFEATURE_ENABLED_MASK, host_xcr0);
+	if (static_cpu_has(X86_FEATURE_XSAVES) &&
+	    /* PT can be exposed to TD guest regardless of KVM's XSS support */
+	    host_xss != (kvm_tdx->xfam & (kvm_caps.supported_xss | XFEATURE_MASK_PT)))
+		wrmsrl(MSR_IA32_XSS, host_xss);
+	if (static_cpu_has(X86_FEATURE_PKU) &&
+	    (kvm_tdx->xfam & XFEATURE_MASK_PKRU))
+		write_pkru(vcpu->arch.host_pkru);
+}
+
 u64 __tdx_vcpu_run(hpa_t tdvpr, void *regs, u32 regs_mask);
 
 static noinstr void tdx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
@@ -478,6 +495,7 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu)
 
 	tdx_vcpu_enter_exit(vcpu, tdx);
 
+	tdx_restore_host_xsave_state(vcpu);
 	tdx->host_state_need_restore = true;
 
 	vcpu->arch.regs_avail &= ~VMX_REGS_LAZY_LOAD_SET;

From patchwork Sun Oct 30 06:23:12 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12899
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Chao Gao
Subject: [PATCH v10 071/108] KVM: x86: Allow to update cached values in kvm_user_return_msrs w/o wrmsr
Date: Sat, 29 Oct 2022 23:23:12 -0700
Message-Id: <238ab2d9a9d2ea71ecacb25203b91abbaf6fbcb4.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1

From: Chao Gao

Several MSRs are constant and only used in userspace (ring 3), but VMs
may have different values.  KVM uses kvm_set_user_return_msr() to switch
to the guest's values and leverages the user return notifier to restore
them when the kernel is about to return to userspace.  To eliminate
unnecessary wrmsr, KVM also caches the value it last wrote to each MSR.

The TDX module unconditionally resets some of these MSRs to their
architectural INIT state on TD exit.  This makes the cached values in
kvm_user_return_msrs inconsistent with the values in hardware.  The
inconsistency needs to be fixed; otherwise, it may mislead
kvm_on_user_return() into skipping the restore of some MSRs to the
host's values.  kvm_set_user_return_msr() can correct this case, but it
is not optimal because it always does a wrmsr.  So, introduce a
variation of kvm_set_user_return_msr() that updates the cached values
and skips the wrmsr.

Signed-off-by: Chao Gao
Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/x86.c              | 26 +++++++++++++++++++++-----
 2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b225cdfac4bc..fdb00d96e954 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2082,6 +2082,7 @@ int kvm_add_user_return_msr(u32 msr);
 int kvm_find_user_return_msr(u32 msr);
 void kvm_user_return_msr_init_cpu(void);
 int kvm_set_user_return_msr(unsigned index, u64 val, u64 mask);
+void kvm_user_return_update_cache(unsigned int index, u64 val);
 
 static inline bool kvm_is_supported_user_return_msr(u32 msr)
 {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 65541bfebb37..4d4b71c4cdb1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -440,6 +440,15 @@ void kvm_user_return_msr_init_cpu(void)
 }
 EXPORT_SYMBOL_GPL(kvm_user_return_msr_init_cpu);
 
+static void kvm_user_return_register_notifier(struct kvm_user_return_msrs *msrs)
+{
+	if (!msrs->registered) {
+		msrs->urn.on_user_return = kvm_on_user_return;
+		user_return_notifier_register(&msrs->urn);
+		msrs->registered = true;
+	}
+}
+
 int kvm_set_user_return_msr(unsigned slot, u64 value, u64 mask)
 {
 	struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs);
@@ -455,15 +464,22 @@ int kvm_set_user_return_msr(unsigned slot, u64 value, u64 mask)
 		return 1;
 
 	msrs->values[slot].curr = value;
-	if (!msrs->registered) {
-		msrs->urn.on_user_return = kvm_on_user_return;
-		user_return_notifier_register(&msrs->urn);
-		msrs->registered = true;
-	}
+	kvm_user_return_register_notifier(msrs);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(kvm_set_user_return_msr);
 
+/* Update the cache, "curr", and register the notifier */
+void kvm_user_return_update_cache(unsigned int slot, u64 value)
+{
+	struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs);
+
+	WARN_ON_ONCE(!msrs->initialized);
+	msrs->values[slot].curr = value;
+	kvm_user_return_register_notifier(msrs);
+}
+EXPORT_SYMBOL_GPL(kvm_user_return_update_cache);
+
 static void drop_user_return_notifiers(void)
 {
 	struct kvm_user_return_msrs *msrs = this_cpu_ptr(user_return_msrs);

From patchwork Sun Oct 30 06:23:13 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12897
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 072/108] KVM: TDX: restore user ret MSRs
Date: Sat, 29 Oct 2022 23:23:13 -0700
Message-Id: <3260994f3d9a036795c81bf06842558afabeeef7.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1

From: Isaku Yamahata

Several user return MSRs are clobbered on TD exit.  Restore those values
on TD exit and before returning to ring 3.

Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 arch/x86/kvm/vmx/tdx.c | 43 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 3ec465cbaeef..f35ccf2b502d 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -456,6 +456,28 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	vcpu->kvm->vm_bugged = true;
 }
 
+struct tdx_uret_msr {
+	u32 msr;
+	unsigned int slot;
+	u64 defval;
+};
+
+static struct tdx_uret_msr tdx_uret_msrs[] = {
+	{.msr = MSR_SYSCALL_MASK,},
+	{.msr = MSR_STAR,},
+	{.msr = MSR_LSTAR,},
+	{.msr = MSR_TSC_AUX,},
+};
+
+static void tdx_user_return_update_cache(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(tdx_uret_msrs); i++)
+		kvm_user_return_update_cache(tdx_uret_msrs[i].slot,
+					     tdx_uret_msrs[i].defval);
+}
+
 static void tdx_restore_host_xsave_state(struct kvm_vcpu *vcpu)
 {
 	struct kvm_tdx *kvm_tdx = to_kvm_tdx(vcpu->kvm);
@@ -495,6 +517,7 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu)
 
 	tdx_vcpu_enter_exit(vcpu, tdx);
 
+	tdx_user_return_update_cache();
 	tdx_restore_host_xsave_state(vcpu);
 	tdx->host_state_need_restore = true;
 
@@ -1558,6 +1581,26 @@ int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops)
 		return -ENODEV;
 	}
 
+	for (i = 0; i < ARRAY_SIZE(tdx_uret_msrs); i++) {
+		/*
+		 * Check that the MSRs in tdx_uret_msrs can be saved/restored
+		 * before returning to user space.
+		 *
+		 * this_cpu_ptr(user_return_msrs)->registered isn't checked
+		 * because the registration is done at vcpu runtime by
+		 * kvm_set_user_return_msr().
+		 * This code sets up cpu features before any vcpu runs, so
+		 * registered is always false.
+		 */
+		tdx_uret_msrs[i].slot = kvm_find_user_return_msr(tdx_uret_msrs[i].msr);
+		if (tdx_uret_msrs[i].slot == -1) {
+			/* If any MSR isn't supported, it is a KVM bug */
+			pr_err("MSR %x isn't included by kvm_find_user_return_msr\n",
+				tdx_uret_msrs[i].msr);
+			return -EIO;
+		}
+	}
+
 	max_pkgs = topology_max_packages();
 	tdx_mng_key_config_lock = kcalloc(max_pkgs, sizeof(*tdx_mng_key_config_lock),
 				   GFP_KERNEL);

From patchwork Sun Oct 30 06:23:14 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12896
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 073/108] [MARKER] The start of TDX KVM patch series: TD vcpu exits/interrupts/hypercalls
Date: Sat, 29 Oct 2022 23:23:14 -0700
X-Mailer: git-send-email 2.25.1

From: Isaku Yamahata

This empty commit marks the start of the patch series for TD vcpu exits,
interrupts, and hypercalls.

Signed-off-by: Isaku Yamahata
---
 Documentation/virt/kvm/intel-tdx-layer-status.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/intel-tdx-layer-status.rst b/Documentation/virt/kvm/intel-tdx-layer-status.rst
index b51e8e6b1541..1cec14213f69 100644
--- a/Documentation/virt/kvm/intel-tdx-layer-status.rst
+++ b/Documentation/virt/kvm/intel-tdx-layer-status.rst
@@ -13,6 +13,7 @@ What qemu can do
 - Qemu can create/destroy vcpu of TDX vm type.
 - Qemu can populate initial guest memory image.
 - Qemu can finalize guest TD.
+- Qemu can start to run vcpu. But vcpu can not make progress yet.
 
 Patch Layer status
 ------------------
@@ -23,7 +24,7 @@ Patch Layer status
 * TD vcpu creation/destruction: Applied
 * TDX EPT violation: Applied
 * TD finalization: Applied
-* TD vcpu enter/exit: Applying
+* TD vcpu enter/exit: Applied
 * TD vcpu interrupts/exit/hypercall: Not yet
 
 * KVM MMU GPA shared bits: Applied

From patchwork Sun Oct 30 06:23:15 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12901
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 074/108] KVM: TDX: complete interrupts after tdexit
Date: Sat, 29 Oct 2022 23:23:15 -0700
Message-Id: <62b2229edec27f04f0f0a7dbc1ff1fd1b1e13378.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1

From: Isaku Yamahata

This corresponds to VMX __vmx_complete_interrupts().  Because TDX
virtualizes vAPIC, KVM only needs to care about NMI injection.
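
For illustration only, the guard in tdx_complete_interrupts() from this
patch can be modelled in plain userspace C.  This is a sketch under
stated assumptions, not kernel code: struct vcpu_model, read_pend_nmi()
and the reads counter are hypothetical stand-ins, and the counter merely
represents the cost of the SEAMCALL behind td_management_read8().

```c
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in KVM. */
struct vcpu_model {
	bool nmi_injected;	/* models vcpu->arch.nmi_injected */
	bool td_pend_nmi;	/* models the TD_VCPU_PEND_NMI field */
	unsigned int reads;	/* counts simulated (costly) reads */
};

/* Stand-in for td_management_read8(); every call is assumed costly. */
static bool read_pend_nmi(struct vcpu_model *v)
{
	v->reads++;
	return v->td_pend_nmi;
}

/* Same shape as the guard in the patch: read only if an NMI was injected. */
static void complete_interrupts(struct vcpu_model *v)
{
	if (v->nmi_injected)
		v->nmi_injected = read_pend_nmi(v);
}
```

In this model no read happens while nmi_injected is false.  When it is
true, nmi_injected ends up mirroring the pending field, so it stays set
for as long as that field is still set.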

Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 arch/x86/kvm/vmx/tdx.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index f35ccf2b502d..af13c19af339 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -456,6 +456,14 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 	vcpu->kvm->vm_bugged = true;
 }
 
+static void tdx_complete_interrupts(struct kvm_vcpu *vcpu)
+{
+	/* Avoid costly SEAMCALL if no nmi was injected */
+	if (vcpu->arch.nmi_injected)
+		vcpu->arch.nmi_injected = td_management_read8(to_tdx(vcpu),
+							      TD_VCPU_PEND_NMI);
+}
+
 struct tdx_uret_msr {
 	u32 msr;
 	unsigned int slot;
@@ -524,6 +532,8 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu)
 	vcpu->arch.regs_avail &= ~VMX_REGS_LAZY_LOAD_SET;
 	trace_kvm_exit(vcpu, KVM_ISA_VMX);
 
+	tdx_complete_interrupts(vcpu);
+
 	return EXIT_FASTPATH_NONE;
 }
 

From patchwork Sun Oct 30 06:23:16 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12902
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 075/108] KVM: TDX: restore debug store when TD exit
Date: Sat, 29 Oct 2022 23:23:16 -0700
Message-Id: <82a23837a7d35c4144c5eeea6327a8443f0d1be5.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1

From: Isaku Yamahata

Because the debug store is clobbered, restore it on TD exit.

Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 arch/x86/events/intel/ds.c | 1 +
 arch/x86/kvm/vmx/tdx.c     | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 7839507b3844..5c310a951a0b 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -2340,3 +2340,4 @@ void perf_restore_debug_store(void)
 
 	wrmsrl(MSR_IA32_DS_AREA, (unsigned long)ds);
 }
+EXPORT_SYMBOL_GPL(perf_restore_debug_store);
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index af13c19af339..5807a2f564af 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -526,6 +526,7 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu)
 	tdx_vcpu_enter_exit(vcpu, tdx);
 
 	tdx_user_return_update_cache();
+	perf_restore_debug_store();
 	tdx_restore_host_xsave_state(vcpu);
 	tdx->host_state_need_restore = true;
 

From patchwork Sun Oct 30 06:23:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12903
(PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111486; cv=none; d=google.com; s=arc-20160816; b=Ww7d+/b8BWF3srvFo1akCPlsRL8WlL4bBsAvsGB4zuOOG5ZIcIulrtbtSR4LXqLkZe OIlFOyryR9ujNHxHX5Pe2XEfEMx4S/Rxijmqrhxqz8ZyV+2OnQICKizsG8Eu0qomDiSQ Gda/gIiGk7mgrYl1QwpMeioxl81xsLgQYioOx4wue0vMzLlALqoagWxVjlT+WO33kuJ9 S6Dx6ntrOunP5Pyyn38eNUTfq81Al2kFN6HOxY+cXkavGVAATm3SSYt8amrrA0lEclAX UvVcBK3UAVdL6crwrbhWPsE5e2N7PjDvROIqtyM6vxFi3XoolaKwc7jUd+TPjIizVTkF qRTA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=SznMTjZKnsci9eDdOTPeD9Ht8kQHMKZ5NhzcEJqIGSs=; b=J/LIz3PaRkn4HfFneOjCfipN4GawBKx4rsKX1z/wVUwQijwNrzJekI6rHbpIG4ppLs 2vMIwUCMGu9ALcR45BdJfYNCbn+EYQ261xUp8lmVC76Az3zcdP7W+WGH4x3cYSJBOJWX WjVbOrukhduav1tSnc+tCpvl91SCZUNNWOMMVmkg98NUA8KaRTQ27tk5dGau/cF71Lhm 1dwhOqtUrJTaHTTV0DkfV+0dtaQeeoQqleRxWb9wbmKl2s5FiyKhA0mC8Y11BLioSpZk xm79dOr5ByqdsyiTKygREVozh2jkpAgw5OvwSj1c/mSxY1l95wAk1xbalrBavfW30tyO 02ow== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ZloAyAvZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id n16-20020a170902f61000b001868bb70fd5si4521299plg.124.2022.10.29.23.31.13; Sat, 29 Oct 2022 23:31:26 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ZloAyAvZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231177AbiJ3Gac (ORCPT + 99 others); Sun, 30 Oct 2022 02:30:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49276 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231132AbiJ3G2q (ORCPT ); Sun, 30 Oct 2022 02:28:46 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3D14E307; Sat, 29 Oct 2022 23:24:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111079; x=1698647079; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=bQM8PfhCx6jh/BP6Xk98zHasvuKf8vficueIPnO0/KY=; b=ZloAyAvZh4I/ggmz8cwL050seY5/zQSOeYJBLurjz7zf3k7+28aizb3I NOitkW7c+7U9fiQ8RP8jEE084tvyl38w2qylZe+R6xgXTpilPGvhSWQ6h l6co5+Jd09+ihfWQ9OVmKLufi7FnCo8ABdB/JtJ4HZ//KhIn5PKyMJmEI Kw4782pr/umdLIzF0w2tXaWXHTW+fXvaX6lafnQdIphhPSjhdV93rMhcZ Z1jJVW74waQiYnyjClsBgP4qqCl0z+ESTMAyX8zxrhDfMnP7Xmk375Xsu n8/ATilvm/E1n6L0OaaMFZcJqcMkNv/RBBBFdwnZB9EW3aLrLA+y0VnrZ w==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037189" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037189" Received: from fmsmga006.fm.intel.com 
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:10 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393089" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393089" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:10 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 076/108] KVM: TDX: handle vcpu migration over logical processor Date: Sat, 29 Oct 2022 23:23:17 -0700 Message-Id: <782f74f7d5375a36b2857be59262c1c4c4cf16a7.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093093824506299?= X-GMAIL-MSGID: =?utf-8?q?1748093093824506299?= From: Isaku Yamahata On vcpu migration, VMX flushes the VMCS on the source pcpu and loads it on the target pcpu. TDX has corresponding SEAMCALL APIs; call them on vcpu migration. The logic is mostly the same as for VMX, except that the TDX SEAMCALLs are used. When shutting down the machine, (VMX or TDX) vcpus need to be shut down on each pcpu. Do the same for TDX with the TDX SEAMCALL APIs.
Signed-off-by: Isaku Yamahata --- arch/x86/kvm/vmx/main.c | 43 +++++++++++-- arch/x86/kvm/vmx/tdx.c | 121 +++++++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/tdx.h | 2 + arch/x86/kvm/vmx/x86_ops.h | 6 ++ 4 files changed, 168 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 379b3343557b..6d46ae9c5dce 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -17,6 +17,25 @@ static bool vt_is_vm_type_supported(unsigned long type) (enable_tdx && tdx_is_vm_type_supported(type)); } +static int vt_hardware_enable(void) +{ + int ret; + + ret = vmx_hardware_enable(); + if (ret) + return ret; + + tdx_hardware_enable(); + return 0; +} + +static void vt_hardware_disable(void) +{ + /* Note, TDX *and* VMX need to be disabled if TDX is enabled. */ + tdx_hardware_disable(); + vmx_hardware_disable(); +} + static __init int vt_hardware_setup(void) { int ret; @@ -141,6 +160,14 @@ static fastpath_t vt_vcpu_run(struct kvm_vcpu *vcpu) return vmx_vcpu_run(vcpu); } +static void vt_vcpu_load(struct kvm_vcpu *vcpu, int cpu) +{ + if (is_td_vcpu(vcpu)) + return tdx_vcpu_load(vcpu, cpu); + + return vmx_vcpu_load(vcpu, cpu); +} + static void vt_flush_tlb_all(struct kvm_vcpu *vcpu) { if (is_td_vcpu(vcpu)) @@ -199,6 +226,14 @@ static void vt_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, vmx_load_mmu_pgd(vcpu, root_hpa, pgd_level); } +static void vt_sched_in(struct kvm_vcpu *vcpu, int cpu) +{ + if (is_td_vcpu(vcpu)) + return; + + vmx_sched_in(vcpu, cpu); +} + static int vt_mem_enc_ioctl(struct kvm *kvm, void __user *argp) { if (!is_td(kvm)) @@ -222,8 +257,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .offline_cpu = tdx_offline_cpu, .check_processor_compatibility = vmx_check_processor_compatibility, - .hardware_enable = vmx_hardware_enable, - .hardware_disable = vmx_hardware_disable, + .hardware_enable = vt_hardware_enable, + .hardware_disable = vt_hardware_disable, .has_emulated_msr = vmx_has_emulated_msr, 
.is_vm_type_supported = vt_is_vm_type_supported, @@ -239,7 +274,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .vcpu_reset = vt_vcpu_reset, .prepare_switch_to_guest = vt_prepare_switch_to_guest, - .vcpu_load = vmx_vcpu_load, + .vcpu_load = vt_vcpu_load, .vcpu_put = vt_vcpu_put, .update_exception_bitmap = vmx_update_exception_bitmap, @@ -327,7 +362,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .request_immediate_exit = vmx_request_immediate_exit, - .sched_in = vmx_sched_in, + .sched_in = vt_sched_in, .cpu_dirty_log_size = PML_ENTITY_NUM, .update_cpu_dirty_logging = vmx_update_cpu_dirty_logging, diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 5807a2f564af..fc4de83a2df8 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -51,6 +51,14 @@ static DEFINE_MUTEX(tdx_lock); static struct mutex *tdx_mng_key_config_lock; static atomic_t nr_configured_hkid; +/* + * A per-CPU list of TD vCPUs associated with a given CPU. Used when a CPU + * is brought down to invoke TDH_VP_FLUSH on the appropriate TD vCPUs. + * Protected by interrupt mask. This list is manipulated in process context + * of vcpu and IPI callback. See tdx_flush_vp_on_cpu(). + */ +static DEFINE_PER_CPU(struct list_head, associated_tdvcpus); + static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid) { return pa | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits); @@ -82,6 +90,36 @@ static inline bool is_td_finalized(struct kvm_tdx *kvm_tdx) return kvm_tdx->finalized; } +static inline void tdx_disassociate_vp(struct kvm_vcpu *vcpu) +{ + list_del(&to_tdx(vcpu)->cpu_list); + + /* + * Ensure tdx->cpu_list is updated before setting vcpu->cpu to -1, + * otherwise, a different CPU can see vcpu->cpu = -1 and add the vCPU + * to its list before it's deleted from this CPU's list.
+ */ + smp_wmb(); + + vcpu->cpu = -1; +} + +void tdx_hardware_enable(void) +{ + INIT_LIST_HEAD(&per_cpu(associated_tdvcpus, raw_smp_processor_id())); +} + +void tdx_hardware_disable(void) +{ + int cpu = raw_smp_processor_id(); + struct list_head *tdvcpus = &per_cpu(associated_tdvcpus, cpu); + struct vcpu_tdx *tdx, *tmp; + + /* Safe variant needed as tdx_disassociate_vp() deletes the entry. */ + list_for_each_entry_safe(tdx, tmp, tdvcpus, cpu_list) + tdx_disassociate_vp(&tdx->vcpu); +} + static void tdx_clear_page(unsigned long page) { const void *zero_page = (const void *) __va(page_to_phys(ZERO_PAGE(0))); @@ -176,6 +214,41 @@ static void tdx_reclaim_td_page(struct tdx_td_page *page) } } +static void tdx_flush_vp(void *arg) +{ + struct kvm_vcpu *vcpu = arg; + u64 err; + + lockdep_assert_irqs_disabled(); + + /* Task migration can race with CPU offlining. */ + if (vcpu->cpu != raw_smp_processor_id()) + return; + + /* + * No need to do TDH_VP_FLUSH if the vCPU hasn't been initialized. The + * list tracking still needs to be updated so that it's correct if/when + * the vCPU does get initialized. 
+ */ + if (is_td_vcpu_created(to_tdx(vcpu))) { + err = tdh_vp_flush(to_tdx(vcpu)->tdvpr.pa); + if (unlikely(err && err != TDX_VCPU_NOT_ASSOCIATED)) { + if (WARN_ON_ONCE(err)) + pr_tdx_error(TDH_VP_FLUSH, err, NULL); + } + } + + tdx_disassociate_vp(vcpu); +} + +static void tdx_flush_vp_on_cpu(struct kvm_vcpu *vcpu) +{ + if (unlikely(vcpu->cpu == -1)) + return; + + smp_call_function_single(vcpu->cpu, tdx_flush_vp, vcpu, 1); +} + static int tdx_do_tdh_phymem_cache_wb(void *param) { u64 err = 0; @@ -200,6 +273,8 @@ void tdx_mmu_release_hkid(struct kvm *kvm) struct kvm_tdx *kvm_tdx = to_kvm_tdx(kvm); cpumask_var_t packages; bool cpumask_allocated; + struct kvm_vcpu *vcpu; + unsigned long j; u64 err; int ret; int i; @@ -210,6 +285,19 @@ void tdx_mmu_release_hkid(struct kvm *kvm) if (!is_td_created(kvm_tdx)) goto free_hkid; + kvm_for_each_vcpu(j, vcpu, kvm) + tdx_flush_vp_on_cpu(vcpu); + + mutex_lock(&tdx_lock); + err = tdh_mng_vpflushdone(kvm_tdx->tdr.pa); + mutex_unlock(&tdx_lock); + if (WARN_ON_ONCE(err)) { + pr_tdx_error(TDH_MNG_VPFLUSHDONE, err, NULL); + pr_err("tdh_mng_vpflushdone failed. HKID %d is leaked.\n", + kvm_tdx->hkid); + return; + } + cpumask_allocated = zalloc_cpumask_var(&packages, GFP_KERNEL); cpus_read_lock(); for_each_online_cpu(i) { @@ -355,6 +443,26 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu) return 0; } +void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) +{ + struct vcpu_tdx *tdx = to_tdx(vcpu); + + if (vcpu->cpu == cpu) + return; + + tdx_flush_vp_on_cpu(vcpu); + + local_irq_disable(); + /* + * Pairs with the smp_wmb() in tdx_disassociate_vp() to ensure + * vcpu->cpu is read before tdx->cpu_list. 
+ */ + smp_rmb(); + + list_add(&tdx->cpu_list, &per_cpu(associated_tdvcpus, cpu)); + local_irq_enable(); +} + void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) { struct vcpu_tdx *tdx = to_tdx(vcpu); @@ -405,6 +513,19 @@ void tdx_vcpu_free(struct kvm_vcpu *vcpu) tdx->tdvpx = NULL; } tdx_reclaim_td_page(&tdx->tdvpr); + + /* + * kvm_free_vcpus() + * -> kvm_unload_vcpu_mmu() + * + * does vcpu_load() for every vcpu after they already disassociated + * from the per cpu list when tdx_vm_teardown(). So we need to + * disassociate them again, otherwise the freed vcpu data will be + * accessed when do list_{del,add}() on associated_tdvcpus list + * later. + */ + tdx_flush_vp_on_cpu(vcpu); + WARN_ON_ONCE(vcpu->cpu != -1); } void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index e5f973b2d752..c02073102a5f 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -68,6 +68,8 @@ struct vcpu_tdx { struct tdx_td_page tdvpr; struct tdx_td_page *tdvpx; + struct list_head cpu_list; + union tdx_exit_reason exit_reason; bool vcpu_initialized; diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index a4e50c5a4bf5..d4fcb6b29ffe 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -138,6 +138,8 @@ int __init tdx_hardware_setup(struct kvm_x86_ops *x86_ops); bool tdx_is_vm_type_supported(unsigned long type); void tdx_hardware_unsetup(void); int tdx_offline_cpu(void); +void tdx_hardware_enable(void); +void tdx_hardware_disable(void); int tdx_dev_ioctl(void __user *argp); int tdx_vm_init(struct kvm *kvm); @@ -150,6 +152,7 @@ void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event); fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu); void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu); void tdx_vcpu_put(struct kvm_vcpu *vcpu); +void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu); int tdx_vm_ioctl(struct kvm *kvm, void __user *argp); int tdx_vcpu_ioctl(struct 
kvm_vcpu *vcpu, void __user *argp); @@ -162,6 +165,8 @@ static inline int tdx_hardware_setup(struct kvm_x86_ops *x86_ops) { return 0; } static inline bool tdx_is_vm_type_supported(unsigned long type) { return false; } static inline void tdx_hardware_unsetup(void) {} static inline int tdx_offline_cpu(void) { return 0; } +static inline void tdx_hardware_enable(void) {} +static inline void tdx_hardware_disable(void) {} static inline int tdx_dev_ioctl(void __user *argp) { return -EOPNOTSUPP; }; static inline int tdx_vm_init(struct kvm *kvm) { return -EOPNOTSUPP; } @@ -175,6 +180,7 @@ static inline void tdx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event) {} static inline fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu) { return EXIT_FASTPATH_NONE; } static inline void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) {} static inline void tdx_vcpu_put(struct kvm_vcpu *vcpu) {} +static inline void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) {} static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; } static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; } From patchwork Sun Oct 30 06:23:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12904 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666813wru; Sat, 29 Oct 2022 23:31:29 -0700 (PDT) X-Google-Smtp-Source: AMsMyM63bMMnbhpynx9RFzRuhyqVPj2KsGHezpbr5UnYFbErpeepWd2+d3c79Lop3IcCIaXw5ExS X-Received: by 2002:a05:6a00:2402:b0:52c:81cf:8df8 with SMTP id z2-20020a056a00240200b0052c81cf8df8mr8156671pfh.60.1667111489300; Sat, 29 Oct 2022 23:31:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111489; cv=none; d=google.com; s=arc-20160816; b=PAhNi0WtVUvsqwBjDHU3y+K8wX2+OYeRBs8h72egccLF5TJkKAgpsMPlv9xO2Uz3XD a7RRKY7pwyoRl/VoB9adgzlZKJkuNIiX5OQB4KxJmRx0Jd5RRuWtVg9E6ifY6Kc3GfM7 
/89goaMCvoWkvIPcQTYZALsljTwG9KRYNi5WpwXopWK3e9ta+K/HqdeF3KKhtTmqyqgj K1IV2qEa5g6+zvBhChMw2mb50F8hDa3mJ1Y8GkxKpNfRQKQEmAnrkBBFJ33rjubNzCrB EyY5IDXI/412AFZjH9VeU27QK+9mRbKap03yPr9eNfnD7tnLKxrM+YGevbsEAFakiMkF We1Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=MGr8Z5qE2R2hjla4RAVL0AKWGvZ6Ww9DrLOX4jxCi0g=; b=T0rYwdVePeZB4hCEJnPDiS3d6G+H/hJH2N2TEHxCWuZFCJgsAVIr+VSSy6deAVllQ0 CN6uP3NPLDCGM5M8cMPg1LrACZYAGKRLJ9Shk/U5QqXZEki4I+d0M+dnVgFrRMKg+3pi LXzOcopFLtYnhRTE3Dbh7HqjGDH+2SxCilA1uZYudpd9neax9McAILufNuFYv3a0NYdu qAzDXulfZQLmUyL3yTqkG5eox0bKMrjcKjJh+FGxVzDFm07x1/LsZTs6kGsgARJEaqON R85JBrykcpeGB9+xqFjDGlmBA3Xfc2w8wmBCIEx/PIUc4Pf9Lp6EOcZKkDtpgYoH4P9j xDCg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=WsFB0jnK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id 32-20020a630c60000000b0044034f2c3b8si4100885pgm.310.2022.10.29.23.31.16; Sat, 29 Oct 2022 23:31:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=WsFB0jnK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231340AbiJ3Gag (ORCPT + 99 others); Sun, 30 Oct 2022 02:30:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47796 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230425AbiJ3G2u (ORCPT ); Sun, 30 Oct 2022 02:28:50 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 51A262E6; Sat, 29 Oct 2022 23:24:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111080; x=1698647080; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=NgDNxL+/7jvkhB9H0lGuHNLykdpBprAtOPXa9QdOf+k=; b=WsFB0jnKu9TVcT8RXQyD9wns76q1T4NMN43jjc/G5zZE+884P7f/Ko2L DzbbiKWljtPTWMoutSb/XHZveqp5KaQS7XrhZspZ+KnSXV2DyMpJVluuo cR8doQrB+JIFoN2paK8ycW6Vg/QMJd33FKuB/iD/awadW58Ta1VtpMu/f 8c9arQjf8T8sP+UKiHM/nXdg7wA/LVDuT4zrvI0e5wWLdmGFioda/8kkC 0nVdJVema4mo37AJFHtr6wXvKXV9C34G5off7JJzdq1LyYKPR66ZqSTJz VKuMHAjTNpgrxkwRrwGJOs87lfMIgRbUTc5my83rCTf9k+QGi+Xxb+oDI g==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037190" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037190" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) 
by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:10 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393092" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393092" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:10 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Xiaoyao Li , Sean Christopherson , Chao Gao Subject: [PATCH v10 077/108] KVM: x86: Add a switch_db_regs flag to handle TDX's auto-switched behavior Date: Sat, 29 Oct 2022 23:23:18 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093096852451939?= X-GMAIL-MSGID: =?utf-8?q?1748093096852451939?= From: Isaku Yamahata Add a flag, KVM_DEBUGREG_AUTO_SWITCH, to skip saving/restoring guest DRs irrespective of any other flags. TDX-SEAM unconditionally saves and restores guest DRs, and resets the DRs to their architectural INIT state on TD exit. So KVM needs to save host DRs before TD entry, without restoring guest DRs, and restore host DRs after TD exit. Opportunistically convert the KVM_DEBUGREG_* definitions to use BIT().
Reported-by: Xiaoyao Li Signed-off-by: Sean Christopherson Co-developed-by: Chao Gao Signed-off-by: Chao Gao Signed-off-by: Isaku Yamahata --- arch/x86/include/asm/kvm_host.h | 9 +++++++-- arch/x86/kvm/vmx/tdx.c | 1 + arch/x86/kvm/x86.c | 11 ++++++++--- 3 files changed, 16 insertions(+), 5 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index fdb00d96e954..082e94f78c66 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -583,8 +583,13 @@ struct kvm_pmu { struct kvm_pmu_ops; enum { - KVM_DEBUGREG_BP_ENABLED = 1, - KVM_DEBUGREG_WONT_EXIT = 2, + KVM_DEBUGREG_BP_ENABLED = BIT(0), + KVM_DEBUGREG_WONT_EXIT = BIT(1), + /* + * Guest debug registers are saved/restored by hardware on exit from + * or enter guest. KVM needn't switch them. + */ + KVM_DEBUGREG_AUTO_SWITCH = BIT(2), }; struct kvm_mtrr_range { diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index fc4de83a2df8..57767ef3353b 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -429,6 +429,7 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu) vcpu->arch.efer = EFER_SCE | EFER_LME | EFER_LMA | EFER_NX; + vcpu->arch.switch_db_regs = KVM_DEBUGREG_AUTO_SWITCH; vcpu->arch.cr0_guest_owned_bits = -1ul; vcpu->arch.cr4_guest_owned_bits = -1ul; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 4d4b71c4cdb1..ad7b227b68dd 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -10779,7 +10779,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) if (vcpu->arch.guest_fpu.xfd_err) wrmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err); - if (unlikely(vcpu->arch.switch_db_regs)) { + if (unlikely(vcpu->arch.switch_db_regs & ~KVM_DEBUGREG_AUTO_SWITCH)) { set_debugreg(0, 7); set_debugreg(vcpu->arch.eff_db[0], 0); set_debugreg(vcpu->arch.eff_db[1], 1); @@ -10822,6 +10822,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) */ if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)) { WARN_ON(vcpu->guest_debug 
& KVM_GUESTDBG_USE_HW_BP); + WARN_ON(vcpu->arch.switch_db_regs & KVM_DEBUGREG_AUTO_SWITCH); static_call(kvm_x86_sync_dirty_debug_regs)(vcpu); kvm_update_dr0123(vcpu); kvm_update_dr7(vcpu); @@ -10834,8 +10835,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu) * care about the messed up debug address registers. But if * we have some of them active, restore the old state. */ - if (hw_breakpoint_active()) - hw_breakpoint_restore(); + if (hw_breakpoint_active()) { + if (!(vcpu->arch.switch_db_regs & KVM_DEBUGREG_AUTO_SWITCH)) + hw_breakpoint_restore(); + else + set_debugreg(__this_cpu_read(cpu_dr7), 7); + } vcpu->arch.last_vmentry_cpu = vcpu->cpu; vcpu->arch.last_guest_tsc = kvm_read_l1_tsc(vcpu, rdtsc()); From patchwork Sun Oct 30 06:23:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12906 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666845wru; Sat, 29 Oct 2022 23:31:37 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4zukLxoSs0RHqEWxn+/1dAVLLSEjk6zMMQsOwe9OetsAHxQVpCw45F16R80dnA/MLTWvfa X-Received: by 2002:a17:90a:c258:b0:20b:23d5:8eb2 with SMTP id d24-20020a17090ac25800b0020b23d58eb2mr8073949pjx.85.1667111497512; Sat, 29 Oct 2022 23:31:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111497; cv=none; d=google.com; s=arc-20160816; b=ea3URnl3iauPG3I5LXwBOqg4k4yuSIyHBWpKGBKTAtL4ZnzW+/PkqLKXykqL7sR1B6 TneqS1jzwDy5q6ofn4uQ0ywd2/dzQjScZz9bWRLl08ky8GGPkn8cc3RZw43rlWHMT4Bm 8jQlS0kN+GrfA7JMBtthhmgGb4BGm5sj6UCr+RPlxxZiIvwk68u/3iqHY8msl6Wzu1iL P1iDiGCZ7aZ62HU1Yp49KOxTbeVKs2H3I4ogFue7gfpVHVA9DkwoRAom6UMqr3+hhU9m /b8xJTMCS76Rap+GSufBfMDqFwLWXRL3xZBT6Ev2NJfz1iKfXT03+jfMgI+SunReLqDH HJ0g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; 
bh=ixBJRpmEAAnRg6o8AfkMfQBnuoKyKFfeWRyYnhycjYc=; b=IuHOhj7p7wuop61CDkkv/pUdm2WebJgL6lqyCc+19FuEhMtpNDYfpfU28OIHCxkpGm O7cx+zHxtMbWxGdKPpqwIy+80ow7CJO+HixlPO7AdVP/Vj05KxJLuX0uoXnm6qMk4gyv 0zhRaSx5ovl60u7e2oH4/nzSZoDp5DHI+CPEx586yN2sT7gz+hlqd5wD5lrouMFTQ1RL PK0Q0VbbYUeV+PRBy+g9XUm2wkQwqZ0rUqynGwUuPJXBdJ/D8u2+vlQ/kpoY1fUcBgUN as9eSOLEp3GiTj3H03vkvcuDzFIvnhDMgGyCo1uQf76NIzGtTmisQ9LMlkxropjCOhg0 DVcg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=EzxKHbEu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id e9-20020a170903240900b00186b45948d1si4507218plo.125.2022.10.29.23.31.25; Sat, 29 Oct 2022 23:31:37 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=EzxKHbEu; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231352AbiJ3Gat (ORCPT + 99 others); Sun, 30 Oct 2022 02:30:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50280 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231228AbiJ3G3j (ORCPT ); Sun, 30 Oct 2022 02:29:39 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DF71E32D; Sat, 29 Oct 2022 23:24:44 -0700 (PDT) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111084; x=1698647084; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=bGICZvfOKjNV1tPqov4L745ZqjquYQLhU/mLJoqw8HQ=; b=EzxKHbEu34kGm2OuCXQa4lCaRy7TbCozlYOX3erEZJbzda9cTeXDB4DW xjTXHEJ/2wrZMuKaKCoRTBNyScl/5uiDT4tUZuGMkX7ot4jGCNarl1CeK 6fO475uynZ+U/sCtKtRyfbRzEGCe+F8tcsJ8wzguvxF3U3XUzeiGWhqeL n8WLeulVyTPw/kpt2yyfhsPhrlqrhpKhQ4PxyL+KU1PMylxcfTw3FlkrO rBLVUDhz2M5oBB+Q92MY1czA9vza2t/Q/oB+crk1pR2TBnibA6nWKYwdB OuswKuGTBcvNN50znw4amWKVbMuB9MG03sQe6Y6UK6a4BfM6s4jQFTvp2 Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037191" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037191" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:10 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393095" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393095" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:10 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 078/108] KVM: TDX: Add support for find pending IRQ in a protected local APIC Date: Sat, 29 Oct 2022 23:23:19 -0700 Message-Id: <46c55abae1c12364c9159ca2ad41c342518fc0f9.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 
(2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093105628544869?= X-GMAIL-MSGID: =?utf-8?q?1748093105628544869?= From: Sean Christopherson Add a flag and a hook to KVM's local APIC management to support determining whether or not a TDX guest has a pending IRQ. For TDX vCPUs, the virtual APIC page is owned by the TDX module and cannot be accessed by KVM. As a result, registers that are virtualized by the CPU, e.g. PPR, cannot be read or written by KVM. To deliver interrupts for TDX guests, KVM must send an IRQ to the CPU on the posted interrupt notification vector. And to determine if a TDX vCPU has a pending interrupt, KVM must check if there is an outstanding notification. Return "no interrupt" in kvm_apic_has_interrupt() if the guest APIC is protected, to short-circuit the various other flows that try to pull an IRQ out of the vAPIC. The only valid operation is querying _if_ an IRQ is pending; KVM can't do anything based on _which_ IRQ is pending. Intentionally omit sanity checks from other flows, e.g. PPR update, so as not to degrade non-TDX guests with unnecessary checks. A well-behaved KVM and userspace will never reach those flows for TDX guests, but reaching them is not fatal if something does go awry. Note, this doesn't handle interrupts that have been delivered to the vCPU but not yet recognized by the core, i.e. interrupts that are sitting in vmcs.GUEST_INTR_STATUS. Querying that state requires a SEAMCALL and will be supported in a future patch.
Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/irq.c | 3 +++ arch/x86/kvm/lapic.c | 3 +++ arch/x86/kvm/lapic.h | 2 ++ arch/x86/kvm/vmx/main.c | 11 +++++++++++ arch/x86/kvm/vmx/tdx.c | 6 ++++++ arch/x86/kvm/vmx/x86_ops.h | 2 ++ 8 files changed, 29 insertions(+) diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h index 1b01dc2098b0..17c3828d42a3 100644 --- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -116,6 +116,7 @@ KVM_X86_OP_OPTIONAL(pi_update_irte) KVM_X86_OP_OPTIONAL(pi_start_assignment) KVM_X86_OP_OPTIONAL(apicv_post_state_restore) KVM_X86_OP_OPTIONAL_RET0(dy_apicv_has_pending_interrupt) +KVM_X86_OP_OPTIONAL(protected_apic_has_interrupt) KVM_X86_OP_OPTIONAL(set_hv_timer) KVM_X86_OP_OPTIONAL(cancel_hv_timer) KVM_X86_OP(setup_mce) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 082e94f78c66..70549018987d 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1671,6 +1671,7 @@ struct kvm_x86_ops { void (*pi_start_assignment)(struct kvm *kvm); void (*apicv_post_state_restore)(struct kvm_vcpu *vcpu); bool (*dy_apicv_has_pending_interrupt)(struct kvm_vcpu *vcpu); + bool (*protected_apic_has_interrupt)(struct kvm_vcpu *vcpu); int (*set_hv_timer)(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc, bool *expired); diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c index f371f1292ca3..56e52eef0269 100644 --- a/arch/x86/kvm/irq.c +++ b/arch/x86/kvm/irq.c @@ -100,6 +100,9 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v) if (kvm_cpu_has_extint(v)) return 1; + if (lapic_in_kernel(v) && v->arch.apic->guest_apic_protected) + return static_call(kvm_x86_protected_apic_has_interrupt)(v); + return kvm_apic_has_interrupt(v) != -1; /* LAPIC */ } EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt); diff --git a/arch/x86/kvm/lapic.c 
b/arch/x86/kvm/lapic.c index d7639d126e6c..bcf339d02c0a 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -2624,6 +2624,9 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu) if (!kvm_apic_present(vcpu)) return -1; + if (apic->guest_apic_protected) + return -1; + __apic_update_ppr(apic, &ppr); return apic_has_interrupt_for_ppr(apic, ppr); } diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h index a5ac4a5a5179..44a9b5131323 100644 --- a/arch/x86/kvm/lapic.h +++ b/arch/x86/kvm/lapic.h @@ -66,6 +66,8 @@ struct kvm_lapic { bool sw_enabled; bool irr_pending; bool lvt0_in_nmi_mode; + /* Select registers in the vAPIC cannot be read/written. */ + bool guest_apic_protected; /* Number of bits set in ISR. */ s16 isr_count; /* The highest vector set in ISR; if -1 - invalid, must scan ISR. */ diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 6d46ae9c5dce..1dfffc6c1533 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -46,6 +46,9 @@ static __init int vt_hardware_setup(void) enable_tdx = enable_tdx && !tdx_hardware_setup(&vt_x86_ops); + if (!enable_tdx) + vt_x86_ops.protected_apic_has_interrupt = NULL; + if (enable_ept) kvm_mmu_set_ept_masks(enable_ept_ad_bits, cpu_has_vmx_ept_execute_only()); @@ -168,6 +171,13 @@ static void vt_vcpu_load(struct kvm_vcpu *vcpu, int cpu) return vmx_vcpu_load(vcpu, cpu); } +static bool vt_protected_apic_has_interrupt(struct kvm_vcpu *vcpu) +{ + KVM_BUG_ON(!is_td_vcpu(vcpu), vcpu->kvm); + + return tdx_protected_apic_has_interrupt(vcpu); +} + static void vt_flush_tlb_all(struct kvm_vcpu *vcpu) { if (is_td_vcpu(vcpu)) @@ -339,6 +349,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .sync_pir_to_irr = vmx_sync_pir_to_irr, .deliver_interrupt = vmx_deliver_interrupt, .dy_apicv_has_pending_interrupt = pi_has_pending_interrupt, + .protected_apic_has_interrupt = vt_protected_apic_has_interrupt, .set_tss_addr = vmx_set_tss_addr, .set_identity_map_addr = vmx_set_identity_map_addr, diff --git 
a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 57767ef3353b..19a9263e5788 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -426,6 +426,7 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu) return -EINVAL; fpstate_set_confidential(&vcpu->arch.guest_fpu); + vcpu->arch.apic->guest_apic_protected = true; vcpu->arch.efer = EFER_SCE | EFER_LME | EFER_LMA | EFER_NX; @@ -464,6 +465,11 @@ void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) local_irq_enable(); } +bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu) +{ + return pi_has_pending_interrupt(vcpu); +} + void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) { struct vcpu_tdx *tdx = to_tdx(vcpu); diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index d4fcb6b29ffe..6bdd956b44c2 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -153,6 +153,7 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu); void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu); void tdx_vcpu_put(struct kvm_vcpu *vcpu); void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu); +bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu); int tdx_vm_ioctl(struct kvm *kvm, void __user *argp); int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp); @@ -181,6 +182,7 @@ static inline fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu) { return EXIT_FASTP static inline void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) {} static inline void tdx_vcpu_put(struct kvm_vcpu *vcpu) {} static inline void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) {} +static inline bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu) { return false; } static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; } static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; } From patchwork Sun Oct 30 06:23:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: 
Isaku Yamahata X-Patchwork-Id: 12907
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 079/108] KVM: x86: Assume timer IRQ was injected if APIC state is protected
Date: Sat, 29 Oct 2022 23:23:20 -0700
Message-Id: <62640c5b69b297151af1e8f512f75f953354f12d.1667110240.git.isaku.yamahata@intel.com>
From: Sean Christopherson

If APIC state is protected, i.e. the vCPU is a TDX guest, assume a timer IRQ was injected when deciding whether or not to busy wait in the "timer advanced" path. The "real" vIRR is not readable/writable, so trying to query for a pending timer IRQ will return garbage.

Note, TDX can scour the PIR if it wants to be more precise and skip the "wait" call entirely.
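
Illustration only, not part of the patch: a small userspace sketch of the decision described above. The model_* names are hypothetical, and a single "pending" bitmap stands in for the vIRR/vISR lookup that lapic_timer_int_injected() performs. It assumes only what the commit message states: when the APIC is protected, KVM's copy of the state is not trustworthy, so the function conservatively reports "injected".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_VECTORS 256

static_assert(MODEL_NR_VECTORS % 8 == 0, "bitmap must be a whole number of bytes");

/* Hypothetical stand-in for struct kvm_lapic. */
struct model_apic {
	bool guest_apic_protected;		/* true for a TDX vCPU */
	uint8_t timer_vector;			/* vector programmed in the LVT timer entry */
	uint8_t pending[MODEL_NR_VECTORS / 8];	/* simplified vIRR/vISR bitmap */
};

/* Mark a vector as pending in the model bitmap. */
static inline void model_set_vector(struct model_apic *apic, uint8_t vec)
{
	apic->pending[vec / 8] |= (uint8_t)(1u << (vec % 8));
}

/* Test whether a vector is pending in the model bitmap. */
static inline bool model_test_vector(const struct model_apic *apic, uint8_t vec)
{
	return (apic->pending[vec / 8] >> (vec % 8)) & 1u;
}

/* Models lapic_timer_int_injected(): should the "timer advanced" path wait? */
static inline bool model_timer_int_injected(const struct model_apic *apic)
{
	/* The bitmap is bogus for a protected APIC; assume the IRQ was injected. */
	if (apic->guest_apic_protected)
		return true;

	return model_test_vector(apic, apic->timer_vector);
}
```

The cost of the conservative answer is a possibly unnecessary busy wait for a protected vCPU, which is why the note above leaves room for a more precise PIR-based check.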
Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata --- arch/x86/kvm/lapic.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index bcf339d02c0a..8d894c3959c8 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -1606,8 +1606,17 @@ static void apic_update_lvtt(struct kvm_lapic *apic) static bool lapic_timer_int_injected(struct kvm_vcpu *vcpu) { struct kvm_lapic *apic = vcpu->arch.apic; - u32 reg = kvm_lapic_get_reg(apic, APIC_LVTT); + u32 reg; + /* + * Assume a timer IRQ was "injected" if the APIC is protected. KVM's + * copy of the vIRR is bogus, it's the responsibility of the caller to + * precisely check whether or not a timer IRQ is pending. + */ + if (apic->guest_apic_protected) + return true; + + reg = kvm_lapic_get_reg(apic, APIC_LVTT); if (kvm_apic_hw_enabled(apic)) { int vec = reg & APIC_VECTOR_MASK; void *bitmap = apic->regs + APIC_ISR; From patchwork Sun Oct 30 06:23:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12912 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1667024wru; Sat, 29 Oct 2022 23:32:18 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6lPepKxi17Dzi/W7ckuO4eUH0on7LaAEK1XyLttCFSW/Q/EiHQoNIn95vBcB7U99bsqTu3 X-Received: by 2002:a17:902:ea03:b0:180:b53f:6da with SMTP id s3-20020a170902ea0300b00180b53f06damr7769902plg.69.1667111538132; Sat, 29 Oct 2022 23:32:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111538; cv=none; d=google.com; s=arc-20160816; b=kSLbDZn+pgXKSiImigs1q8cC6xcu61lOFvu19eMJp/VVDPa2VMOJpQ7zknuDXEH4th 0wdODti1uaaVDgJKUvYRPNEaQMs0UIwjaIohDzYZ9vss8pMBvmNtMQJId1AEGgHjwMDO 6UGbvohmN8aCW1LjeLEV4OaQDqA1WbLrSca6SyaJE9QDYsp9YON0In/szSGK4ZVr7vjV FZOUfQ4p2t6gu3SMBwueIpBkcnXeoDCI6tXo4ccgD3+QFmuZYd3V+SMqPemTqyOn4uYv drMnGd9wZxYgbUqfYaurYfzFoV4oK8e8YW7/CJkORyS2BFLNTSwjatHGPsKe/da33JZ4 P4BQ== 
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 080/108] KVM: TDX: remove use of struct vcpu_vmx from posted_interrupt.c
Date: Sat, 29 Oct 2022 23:23:21 -0700
From: Isaku Yamahata

As TDX will use posted_interrupt.c, its use of struct vcpu_vmx is a blocker. Because the members struct pi_desc pi_desc and struct list_head pi_wakeup_list are only used in posted_interrupt.c, introduce a common structure, struct vcpu_pi, and make vcpu_vmx and vcpu_tdx have the same layout at the top of the structure.

To minimize the diff size, avoid code conversions like vmx->pi_desc => vmx->common->pi_desc. Instead, add compile-time checks that the layout is as expected.
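
Illustration only, not part of the patch: a self-contained sketch of the "shared layout prefix plus compile-time check" technique described above. All model_* types are hypothetical stand-ins for struct kvm_vcpu, struct pi_desc, struct list_head, struct vcpu_pi, struct vcpu_vmx and struct vcpu_tdx, and their fields are invented for the example. As in the patch, the helper only takes the address of a member through the common type; every access then goes through the member's own type.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* offsetof */
#include <stdint.h>

/* Hypothetical stand-ins for struct kvm_vcpu, struct pi_desc, struct list_head. */
struct model_vcpu { int vcpu_id; };
struct model_pi_desc { uint32_t pir[8]; uint64_t control; };
struct model_list_head { struct model_list_head *next, *prev; };

/* Common prefix, analogous to struct vcpu_pi. */
struct model_vcpu_pi {
	struct model_vcpu vcpu;
	struct model_pi_desc pi_desc;
	struct model_list_head pi_wakeup_list;
};

/* Two backends that repeat the prefix and then diverge. */
struct model_vcpu_vmx {
	struct model_vcpu vcpu;
	struct model_pi_desc pi_desc;
	struct model_list_head pi_wakeup_list;
	uint8_t fail;
};

struct model_vcpu_tdx {
	struct model_vcpu vcpu;
	struct model_pi_desc pi_desc;
	struct model_list_head pi_wakeup_list;
	void *tdvpr;
};

/* The build breaks if either backend drifts away from the common prefix. */
static_assert(offsetof(struct model_vcpu_pi, pi_desc) ==
	      offsetof(struct model_vcpu_vmx, pi_desc), "vmx pi_desc offset");
static_assert(offsetof(struct model_vcpu_pi, pi_wakeup_list) ==
	      offsetof(struct model_vcpu_vmx, pi_wakeup_list), "vmx list offset");
static_assert(offsetof(struct model_vcpu_pi, pi_desc) ==
	      offsetof(struct model_vcpu_tdx, pi_desc), "tdx pi_desc offset");
static_assert(offsetof(struct model_vcpu_pi, pi_wakeup_list) ==
	      offsetof(struct model_vcpu_tdx, pi_wakeup_list), "tdx list offset");

/* Backend-agnostic accessor: computes the member address via the common layout. */
static inline struct model_pi_desc *model_vcpu_to_pi_desc(struct model_vcpu *vcpu)
{
	return &((struct model_vcpu_pi *)vcpu)->pi_desc;
}
```

The trade-off is the one the commit message names: existing vmx->pi_desc users stay untouched, at the price of duplicating the prefix in each structure and relying on the static assertions to keep the copies in sync.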
Signed-off-by: Isaku Yamahata --- arch/x86/kvm/vmx/posted_intr.c | 41 ++++++++++++++++++++++++++-------- arch/x86/kvm/vmx/posted_intr.h | 11 +++++++++ arch/x86/kvm/vmx/tdx.c | 1 + arch/x86/kvm/vmx/tdx.h | 8 +++++++ arch/x86/kvm/vmx/vmx.h | 14 +++++++----- 5 files changed, 60 insertions(+), 15 deletions(-) diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c index 1b56c5e5c9fb..62caf74753bc 100644 --- a/arch/x86/kvm/vmx/posted_intr.c +++ b/arch/x86/kvm/vmx/posted_intr.c @@ -9,6 +9,7 @@ #include "posted_intr.h" #include "trace.h" #include "vmx.h" +#include "tdx.h" /* * Maintain a per-CPU list of vCPUs that need to be awakened by wakeup_handler() @@ -29,9 +30,29 @@ static DEFINE_PER_CPU(struct list_head, wakeup_vcpus_on_cpu); */ static DEFINE_PER_CPU(raw_spinlock_t, wakeup_vcpus_on_cpu_lock); +/* + * The layout of the head of struct vcpu_vmx and struct vcpu_tdx must match with + * struct vcpu_pi. + */ +static_assert(offsetof(struct vcpu_pi, pi_desc) == + offsetof(struct vcpu_vmx, pi_desc)); +static_assert(offsetof(struct vcpu_pi, pi_wakeup_list) == + offsetof(struct vcpu_vmx, pi_wakeup_list)); +#ifdef CONFIG_INTEL_TDX_HOST +static_assert(offsetof(struct vcpu_pi, pi_desc) == + offsetof(struct vcpu_tdx, pi_desc)); +static_assert(offsetof(struct vcpu_pi, pi_wakeup_list) == + offsetof(struct vcpu_tdx, pi_wakeup_list)); +#endif + +static inline struct vcpu_pi *vcpu_to_pi(struct kvm_vcpu *vcpu) +{ + return (struct vcpu_pi *)vcpu; +} + static inline struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu) { - return &(to_vmx(vcpu)->pi_desc); + return &vcpu_to_pi(vcpu)->pi_desc; } static int pi_try_set_control(struct pi_desc *pi_desc, u64 *pold, u64 new) @@ -50,8 +71,8 @@ static int pi_try_set_control(struct pi_desc *pi_desc, u64 *pold, u64 new) void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu) { - struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu); - struct vcpu_vmx *vmx = to_vmx(vcpu); + struct vcpu_pi *vcpu_pi = vcpu_to_pi(vcpu); + struct pi_desc 
*pi_desc = &vcpu_pi->pi_desc; struct pi_desc old, new; unsigned long flags; unsigned int dest; @@ -88,7 +109,7 @@ void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu) */ if (pi_desc->nv == POSTED_INTR_WAKEUP_VECTOR) { raw_spin_lock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu)); - list_del(&vmx->pi_wakeup_list); + list_del(&vcpu_pi->pi_wakeup_list); raw_spin_unlock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu)); } @@ -143,15 +164,15 @@ static bool vmx_can_use_vtd_pi(struct kvm *kvm) */ static void pi_enable_wakeup_handler(struct kvm_vcpu *vcpu) { - struct pi_desc *pi_desc = vcpu_to_pi_desc(vcpu); - struct vcpu_vmx *vmx = to_vmx(vcpu); + struct vcpu_pi *vcpu_pi = vcpu_to_pi(vcpu); + struct pi_desc *pi_desc = &vcpu_pi->pi_desc; struct pi_desc old, new; unsigned long flags; local_irq_save(flags); raw_spin_lock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu)); - list_add_tail(&vmx->pi_wakeup_list, + list_add_tail(&vcpu_pi->pi_wakeup_list, &per_cpu(wakeup_vcpus_on_cpu, vcpu->cpu)); raw_spin_unlock(&per_cpu(wakeup_vcpus_on_cpu_lock, vcpu->cpu)); @@ -188,7 +209,8 @@ static bool vmx_needs_pi_wakeup(struct kvm_vcpu *vcpu) * notification vector is switched to the one that calls * back to the pi_wakeup_handler() function. 
*/ - return vmx_can_use_ipiv(vcpu) || vmx_can_use_vtd_pi(vcpu->kvm); + return (vmx_can_use_ipiv(vcpu) && !is_td_vcpu(vcpu)) || + vmx_can_use_vtd_pi(vcpu->kvm); } void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu) @@ -198,7 +220,8 @@ void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu) if (!vmx_needs_pi_wakeup(vcpu)) return; - if (kvm_vcpu_is_blocking(vcpu) && !vmx_interrupt_blocked(vcpu)) + if (kvm_vcpu_is_blocking(vcpu) && + (is_td_vcpu(vcpu) || !vmx_interrupt_blocked(vcpu))) pi_enable_wakeup_handler(vcpu); /* diff --git a/arch/x86/kvm/vmx/posted_intr.h b/arch/x86/kvm/vmx/posted_intr.h index 26992076552e..2fe8222308b2 100644 --- a/arch/x86/kvm/vmx/posted_intr.h +++ b/arch/x86/kvm/vmx/posted_intr.h @@ -94,6 +94,17 @@ static inline bool pi_test_sn(struct pi_desc *pi_desc) (unsigned long *)&pi_desc->control); } +struct vcpu_pi { + struct kvm_vcpu vcpu; + + /* Posted interrupt descriptor */ + struct pi_desc pi_desc; + + /* Used if this vCPU is waiting for PI notification wakeup. */ + struct list_head pi_wakeup_list; + /* Until here common layout betwwn vcpu_vmx and vcpu_tdx. 
*/ +}; + void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu); void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu); void pi_wakeup_handler(void); diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 19a9263e5788..8d0eb1d405d7 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -427,6 +427,7 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu) fpstate_set_confidential(&vcpu->arch.guest_fpu); vcpu->arch.apic->guest_apic_protected = true; + INIT_LIST_HEAD(&tdx->pi_wakeup_list); vcpu->arch.efer = EFER_SCE | EFER_LME | EFER_LMA | EFER_NX; diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index c02073102a5f..64e9b864e20e 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -4,6 +4,7 @@ #ifdef CONFIG_INTEL_TDX_HOST +#include "posted_intr.h" #include "pmu_intel.h" #include "tdx_ops.h" @@ -65,6 +66,13 @@ union tdx_exit_reason { struct vcpu_tdx { struct kvm_vcpu vcpu; + /* Posted interrupt descriptor */ + struct pi_desc pi_desc; + + /* Used if this vCPU is waiting for PI notification wakeup. */ + struct list_head pi_wakeup_list; + /* Until here same layout to struct vcpu_pi. */ + struct tdx_td_page tdvpr; struct tdx_td_page *tdvpx; diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index 1813caeb24d8..0a7ab0a7d604 100644 --- a/arch/x86/kvm/vmx/vmx.h +++ b/arch/x86/kvm/vmx/vmx.h @@ -245,6 +245,14 @@ struct nested_vmx { struct vcpu_vmx { struct kvm_vcpu vcpu; + + /* Posted interrupt descriptor */ + struct pi_desc pi_desc; + + /* Used if this vCPU is waiting for PI notification wakeup. */ + struct list_head pi_wakeup_list; + /* Until here same layout to struct vcpu_pi. */ + u8 fail; u8 x2apic_msr_bitmap_mode; @@ -314,12 +322,6 @@ struct vcpu_vmx { union vmx_exit_reason exit_reason; - /* Posted interrupt descriptor */ - struct pi_desc pi_desc; - - /* Used if this vCPU is waiting for PI notification wakeup. 
*/ - struct list_head pi_wakeup_list; - /* Support for a guest hypervisor (nested VMX) */ struct nested_vmx nested;
From patchwork Sun Oct 30 06:23:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12911
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 081/108] KVM: TDX: Implement interrupt injection
Date: Sat, 29 Oct 2022 23:23:22 -0700
From: Isaku Yamahata

TDX supports injecting interrupts into a vCPU via posted interrupts. Wire up the corresponding KVM x86 operations to the posted-interrupt path. Move kvm_vcpu_trigger_posted_interrupt() from vmx.c to common.h to share the code.

VMX can inject an interrupt by setting the interrupt information field, VM_ENTRY_INTR_INFO_FIELD, of the VMCS. TDX supports interrupt injection only via posted interrupts.
Skip the execution paths that access VM_ENTRY_INTR_INFO_FIELD.

As CPU state is protected and APICv is enabled for the TDX guest, the VMM can inject an interrupt by updating the posted interrupt descriptor. Treat interrupts as always injectable.

Signed-off-by: Isaku Yamahata Reviewed-by: Paolo Bonzini --- arch/x86/kvm/vmx/common.h | 71 ++++++++++++++++++++++++++ arch/x86/kvm/vmx/main.c | 93 ++++++++++++++++++++++++++++++---- arch/x86/kvm/vmx/posted_intr.c | 2 +- arch/x86/kvm/vmx/posted_intr.h | 2 + arch/x86/kvm/vmx/tdx.c | 25 +++++++++ arch/x86/kvm/vmx/vmx.c | 67 +----------------------- arch/x86/kvm/vmx/x86_ops.h | 7 ++- 7 files changed, 190 insertions(+), 77 deletions(-) diff --git a/arch/x86/kvm/vmx/common.h b/arch/x86/kvm/vmx/common.h index 235908f3e044..747f993cf7de 100644 --- a/arch/x86/kvm/vmx/common.h +++ b/arch/x86/kvm/vmx/common.h @@ -4,6 +4,7 @@ #include +#include "posted_intr.h" #include "mmu.h" static inline int __vmx_handle_ept_violation(struct kvm_vcpu *vcpu, gpa_t gpa, @@ -30,4 +31,74 @@ static inline int __vmx_handle_ept_violation(struct kvm_vcpu *vcpu, gpa_t gpa, return kvm_mmu_page_fault(vcpu, gpa, error_code, NULL, 0); } +static inline void kvm_vcpu_trigger_posted_interrupt(struct kvm_vcpu *vcpu, + int pi_vec) +{ +#ifdef CONFIG_SMP + if (vcpu->mode == IN_GUEST_MODE) { + /* + * The vector of the virtual has already been set in the PIR. + * Send a notification event to deliver the virtual interrupt + * unless the vCPU is the currently running vCPU, i.e. the + * event is being sent from a fastpath VM-Exit handler, in + * which case the PIR will be synced to the vIRR before + * re-entering the guest. + * + * When the target is not the running vCPU, the following + * possibilities emerge: + * + * Case 1: vCPU stays in non-root mode. Sending a notification + * event posts the interrupt to the vCPU. + * + * Case 2: vCPU exits to root mode and is still runnable. The + * PIR will be synced to the vIRR before re-entering the guest.
+ * Sending a notification event is ok as the host IRQ handler + * will ignore the spurious event. + * + * Case 3: vCPU exits to root mode and is blocked. vcpu_block() + * has already synced PIR to vIRR and never blocks the vCPU if + * the vIRR is not empty. Therefore, a blocked vCPU here does + * not wait for any requested interrupts in PIR, and sending a + * notification event also results in a benign, spurious event. + */ + + if (vcpu != kvm_get_running_vcpu()) + apic->send_IPI_mask(get_cpu_mask(vcpu->cpu), pi_vec); + return; + } +#endif + /* + * The vCPU isn't in the guest; wake the vCPU in case it is blocking, + * otherwise do nothing as KVM will grab the highest priority pending + * IRQ via ->sync_pir_to_irr() in vcpu_enter_guest(). + */ + kvm_vcpu_wake_up(vcpu); +} + +/* + * Send interrupt to vcpu via posted interrupt way. + * 1. If target vcpu is running(non-root mode), send posted interrupt + * notification to vcpu and hardware will sync PIR to vIRR atomically. + * 2. If target vcpu isn't running(root mode), kick it to pick up the + * interrupt from PIR in next vmentry. + */ +static inline void __vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, + struct pi_desc *pi_desc, int vector) +{ + if (pi_test_and_set_pir(vector, pi_desc)) + return; + + /* If a previous notification has sent the IPI, nothing to do. */ + if (pi_test_and_set_on(pi_desc)) + return; + + /* + * The implied barrier in pi_test_and_set_on() pairs with the smp_mb_*() + * after setting vcpu->mode in vcpu_enter_guest(), thus the vCPU is + * guaranteed to see PID.ON=1 and sync the PIR to IRR if triggering a + * posted interrupt "fails" because vcpu->mode != IN_GUEST_MODE. 
+	 */
+	kvm_vcpu_trigger_posted_interrupt(vcpu, POSTED_INTR_VECTOR);
+}
+
 #endif /* __KVM_X86_VMX_COMMON_H */
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 1dfffc6c1533..13ad1278bef2 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -178,6 +178,34 @@ static bool vt_protected_apic_has_interrupt(struct kvm_vcpu *vcpu)
 	return tdx_protected_apic_has_interrupt(vcpu);
 }
 
+static void vt_apicv_post_state_restore(struct kvm_vcpu *vcpu)
+{
+	struct pi_desc *pi = vcpu_to_pi_desc(vcpu);
+
+	pi_clear_on(pi);
+	memset(pi->pir, 0, sizeof(pi->pir));
+}
+
+static int vt_sync_pir_to_irr(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return -1;
+
+	return vmx_sync_pir_to_irr(vcpu);
+}
+
+static void vt_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
+				 int trig_mode, int vector)
+{
+	if (is_td_vcpu(apic->vcpu)) {
+		tdx_deliver_interrupt(apic, delivery_mode, trig_mode,
+				      vector);
+		return;
+	}
+
+	vmx_deliver_interrupt(apic, delivery_mode, trig_mode, vector);
+}
+
 static void vt_flush_tlb_all(struct kvm_vcpu *vcpu)
 {
 	if (is_td_vcpu(vcpu))
@@ -244,6 +272,53 @@ static void vt_sched_in(struct kvm_vcpu *vcpu, int cpu)
 	vmx_sched_in(vcpu, cpu);
 }
 
+static void vt_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
+{
+	if (is_td_vcpu(vcpu))
+		return;
+	vmx_set_interrupt_shadow(vcpu, mask);
+}
+
+static u32 vt_get_interrupt_shadow(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return 0;
+
+	return vmx_get_interrupt_shadow(vcpu);
+}
+
+static void vt_inject_irq(struct kvm_vcpu *vcpu, bool reinjected)
+{
+	if (is_td_vcpu(vcpu))
+		return;
+
+	vmx_inject_irq(vcpu, reinjected);
+}
+
+static void vt_cancel_injection(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return;
+
+	vmx_cancel_injection(vcpu);
+}
+
+static int vt_interrupt_allowed(struct kvm_vcpu *vcpu, bool for_injection)
+{
+	if (is_td_vcpu(vcpu))
+		return true;
+
+	return vmx_interrupt_allowed(vcpu, for_injection);
+}
+
+static void vt_enable_irq_window(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return;
+
+	vmx_enable_irq_window(vcpu);
+}
+
 static int vt_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 {
 	if (!is_td(kvm))
@@ -323,31 +398,31 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.handle_exit = vmx_handle_exit,
 	.skip_emulated_instruction = vmx_skip_emulated_instruction,
 	.update_emulated_instruction = vmx_update_emulated_instruction,
-	.set_interrupt_shadow = vmx_set_interrupt_shadow,
-	.get_interrupt_shadow = vmx_get_interrupt_shadow,
+	.set_interrupt_shadow = vt_set_interrupt_shadow,
+	.get_interrupt_shadow = vt_get_interrupt_shadow,
 	.patch_hypercall = vmx_patch_hypercall,
-	.inject_irq = vmx_inject_irq,
+	.inject_irq = vt_inject_irq,
 	.inject_nmi = vmx_inject_nmi,
 	.inject_exception = vmx_inject_exception,
-	.cancel_injection = vmx_cancel_injection,
-	.interrupt_allowed = vmx_interrupt_allowed,
+	.cancel_injection = vt_cancel_injection,
+	.interrupt_allowed = vt_interrupt_allowed,
 	.nmi_allowed = vmx_nmi_allowed,
 	.get_nmi_mask = vmx_get_nmi_mask,
 	.set_nmi_mask = vmx_set_nmi_mask,
 	.enable_nmi_window = vmx_enable_nmi_window,
-	.enable_irq_window = vmx_enable_irq_window,
+	.enable_irq_window = vt_enable_irq_window,
 	.update_cr8_intercept = vmx_update_cr8_intercept,
 	.set_virtual_apic_mode = vmx_set_virtual_apic_mode,
 	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
 	.refresh_apicv_exec_ctrl = vmx_refresh_apicv_exec_ctrl,
 	.load_eoi_exitmap = vmx_load_eoi_exitmap,
-	.apicv_post_state_restore = vmx_apicv_post_state_restore,
+	.apicv_post_state_restore = vt_apicv_post_state_restore,
 	.check_apicv_inhibit_reasons = vmx_check_apicv_inhibit_reasons,
 	.hwapic_irr_update = vmx_hwapic_irr_update,
 	.hwapic_isr_update = vmx_hwapic_isr_update,
 	.guest_apic_has_interrupt = vmx_guest_apic_has_interrupt,
-	.sync_pir_to_irr = vmx_sync_pir_to_irr,
-	.deliver_interrupt = vmx_deliver_interrupt,
+	.sync_pir_to_irr = vt_sync_pir_to_irr,
+	.deliver_interrupt = vt_deliver_interrupt,
 	.dy_apicv_has_pending_interrupt = pi_has_pending_interrupt,
 	.protected_apic_has_interrupt = vt_protected_apic_has_interrupt,
 
diff --git a/arch/x86/kvm/vmx/posted_intr.c b/arch/x86/kvm/vmx/posted_intr.c
index 62caf74753bc..91ea3396463d 100644
--- a/arch/x86/kvm/vmx/posted_intr.c
+++ b/arch/x86/kvm/vmx/posted_intr.c
@@ -50,7 +50,7 @@ static inline struct vcpu_pi *vcpu_to_pi(struct kvm_vcpu *vcpu)
 	return (struct vcpu_pi *)vcpu;
 }
 
-static inline struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu)
+struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu)
 {
 	return &vcpu_to_pi(vcpu)->pi_desc;
 }
diff --git a/arch/x86/kvm/vmx/posted_intr.h b/arch/x86/kvm/vmx/posted_intr.h
index 2fe8222308b2..0f9983b6910b 100644
--- a/arch/x86/kvm/vmx/posted_intr.h
+++ b/arch/x86/kvm/vmx/posted_intr.h
@@ -105,6 +105,8 @@ struct vcpu_pi {
 	/* Until here common layout betwwn vcpu_vmx and vcpu_tdx. */
 };
 
+struct pi_desc *vcpu_to_pi_desc(struct kvm_vcpu *vcpu);
+
 void vmx_vcpu_pi_load(struct kvm_vcpu *vcpu, int cpu);
 void vmx_vcpu_pi_put(struct kvm_vcpu *vcpu);
 void pi_wakeup_handler(void);
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 8d0eb1d405d7..7dad75b3b4a6 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -7,6 +7,7 @@
 
 #include "capabilities.h"
 #include "x86_ops.h"
+#include "common.h"
 #include "tdx.h"
 #include "vmx.h"
 #include "x86.h"
@@ -440,6 +441,9 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu)
 	vcpu->arch.guest_state_protected =
 		!(to_kvm_tdx(vcpu->kvm)->attributes & TDX_TD_ATTRIBUTE_DEBUG);
 
+	tdx->pi_desc.nv = POSTED_INTR_VECTOR;
+	tdx->pi_desc.sn = 1;
+
 	tdx->host_state_need_save = true;
 	tdx->host_state_need_restore = false;
 
@@ -450,6 +454,7 @@ void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 {
 	struct vcpu_tdx *tdx = to_tdx(vcpu);
 
+	vmx_vcpu_pi_load(vcpu, cpu);
 	if (vcpu->cpu == cpu)
 		return;
 
@@ -652,6 +657,12 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu)
 
 	trace_kvm_entry(vcpu);
 
+	if (pi_test_on(&tdx->pi_desc)) {
+		apic->send_IPI_self(POSTED_INTR_VECTOR);
+
+		kvm_wait_lapic_expire(vcpu);
+	}
+
 	tdx_vcpu_enter_exit(vcpu, tdx);
 
 	tdx_user_return_update_cache();
@@ -981,6 +992,16 @@ static int tdx_sept_remove_private_spte(struct kvm *kvm, gfn_t gfn,
 	return tdx_sept_drop_private_spte(kvm, gfn, level, pfn);
 }
 
+void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
+			   int trig_mode, int vector)
+{
+	struct kvm_vcpu *vcpu = apic->vcpu;
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+
+	/* TDX supports only posted interrupt. No lapic emulation. */
+	__vmx_deliver_posted_interrupt(vcpu, &tdx->pi_desc, vector);
+}
+
 int tdx_dev_ioctl(void __user *argp)
 {
 	struct kvm_tdx_capabilities __user *user_caps;
@@ -1652,6 +1673,10 @@ int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
 	if (ret)
 		return ret;
 
+	td_vmcs_write16(tdx, POSTED_INTR_NV, POSTED_INTR_VECTOR);
+	td_vmcs_write64(tdx, POSTED_INTR_DESC_ADDR, __pa(&tdx->pi_desc));
+	td_vmcs_setbit32(tdx, PIN_BASED_VM_EXEC_CONTROL, PIN_BASED_POSTED_INTR);
+
 	tdx->vcpu_initialized = true;
 	return 0;
 }
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index f2887dbde700..b4c65eb17cc2 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4120,50 +4120,6 @@ void vmx_msr_filter_changed(struct kvm_vcpu *vcpu)
 	pt_update_intercept_for_msr(vcpu);
 }
 
-static inline void kvm_vcpu_trigger_posted_interrupt(struct kvm_vcpu *vcpu,
-						     int pi_vec)
-{
-#ifdef CONFIG_SMP
-	if (vcpu->mode == IN_GUEST_MODE) {
-		/*
-		 * The vector of the virtual has already been set in the PIR.
-		 * Send a notification event to deliver the virtual interrupt
-		 * unless the vCPU is the currently running vCPU, i.e. the
-		 * event is being sent from a fastpath VM-Exit handler, in
-		 * which case the PIR will be synced to the vIRR before
-		 * re-entering the guest.
-		 *
-		 * When the target is not the running vCPU, the following
-		 * possibilities emerge:
-		 *
-		 * Case 1: vCPU stays in non-root mode. Sending a notification
-		 * event posts the interrupt to the vCPU.
-		 *
-		 * Case 2: vCPU exits to root mode and is still runnable. The
-		 * PIR will be synced to the vIRR before re-entering the guest.
-		 * Sending a notification event is ok as the host IRQ handler
-		 * will ignore the spurious event.
-		 *
-		 * Case 3: vCPU exits to root mode and is blocked. vcpu_block()
-		 * has already synced PIR to vIRR and never blocks the vCPU if
-		 * the vIRR is not empty. Therefore, a blocked vCPU here does
-		 * not wait for any requested interrupts in PIR, and sending a
-		 * notification event also results in a benign, spurious event.
-		 */
-
-		if (vcpu != kvm_get_running_vcpu())
-			apic->send_IPI_mask(get_cpu_mask(vcpu->cpu), pi_vec);
-		return;
-	}
-#endif
-	/*
-	 * The vCPU isn't in the guest; wake the vCPU in case it is blocking,
-	 * otherwise do nothing as KVM will grab the highest priority pending
-	 * IRQ via ->sync_pir_to_irr() in vcpu_enter_guest().
-	 */
-	kvm_vcpu_wake_up(vcpu);
-}
-
 static int vmx_deliver_nested_posted_interrupt(struct kvm_vcpu *vcpu,
 						int vector)
 {
@@ -4216,20 +4172,7 @@ static int vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
 	if (!vcpu->arch.apic->apicv_active)
 		return -1;
 
-	if (pi_test_and_set_pir(vector, &vmx->pi_desc))
-		return 0;
-
-	/* If a previous notification has sent the IPI, nothing to do. */
-	if (pi_test_and_set_on(&vmx->pi_desc))
-		return 0;
-
-	/*
-	 * The implied barrier in pi_test_and_set_on() pairs with the smp_mb_*()
-	 * after setting vcpu->mode in vcpu_enter_guest(), thus the vCPU is
-	 * guaranteed to see PID.ON=1 and sync the PIR to IRR if triggering a
-	 * posted interrupt "fails" because vcpu->mode != IN_GUEST_MODE.
-	 */
-	kvm_vcpu_trigger_posted_interrupt(vcpu, POSTED_INTR_VECTOR);
+	__vmx_deliver_posted_interrupt(vcpu, &vmx->pi_desc, vector);
 	return 0;
 }
 
@@ -6855,14 +6798,6 @@ void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 	vmcs_write64(EOI_EXIT_BITMAP3, eoi_exit_bitmap[3]);
 }
 
-void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu)
-{
-	struct vcpu_vmx *vmx = to_vmx(vcpu);
-
-	pi_clear_on(&vmx->pi_desc);
-	memset(vmx->pi_desc.pir, 0, sizeof(vmx->pi_desc.pir));
-}
-
 void vmx_do_interrupt_nmi_irqoff(unsigned long entry);
 
 static void handle_interrupt_nmi_irqoff(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 6bdd956b44c2..01fac8ba8c50 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -55,7 +55,6 @@ int vmx_check_intercept(struct kvm_vcpu *vcpu,
 bool vmx_apic_init_signal_blocked(struct kvm_vcpu *vcpu);
 void vmx_migrate_timers(struct kvm_vcpu *vcpu);
 void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu);
-void vmx_apicv_post_state_restore(struct kvm_vcpu *vcpu);
 bool vmx_check_apicv_inhibit_reasons(enum kvm_apicv_inhibit reason);
 void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr);
 void vmx_hwapic_isr_update(int max_isr);
@@ -155,6 +154,9 @@ void tdx_vcpu_put(struct kvm_vcpu *vcpu);
 void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
 bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu);
 
+void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
+			   int trig_mode, int vector);
+
 int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
 
@@ -184,6 +186,9 @@ static inline void tdx_vcpu_put(struct kvm_vcpu *vcpu) {}
 static inline void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) {}
 static inline bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu) { return false; }
 
+static inline void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
+					 int trig_mode, int vector) {}
+
 static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; }
 static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; }
 

From patchwork Sun Oct 30 06:23:23 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12910
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 082/108] KVM: TDX: Implements vcpu request_immediate_exit
Date: Sat, 29 Oct 2022 23:23:23 -0700
Message-Id: <6901f181c0bfb99a8f675fdffc6429559edb5732.1667110240.git.isaku.yamahata@intel.com>

From: Isaku Yamahata

Now that interrupts can be injected into a TDX vcpu, the TDX vcpu can be
blocked. Wire up the kvm x86 methods for blocking/unblocking a vcpu for TDX.
To unblock on pending events, the request_immediate_exit method is also needed.
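
The dispatch this patch wires up can be sketched outside the kernel as follows. This is only an illustration under stated assumptions: `struct toy_vcpu`, `enum toy_exit_path` and every `toy_*` function are hypothetical stand-ins, not kernel APIs, and the two backends merely record which path ran instead of forcing a real VM exit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are the kernel's real types. */
enum toy_exit_path { TOY_PATH_NONE, TOY_PATH_GENERIC, TOY_PATH_VMX };

struct toy_vcpu {
	bool is_td;                   /* models is_td_vcpu() */
	enum toy_exit_path last_path; /* records which backend ran */
};

/* Models the generic helper that the TDX branch falls back to. */
static void toy_generic_request_immediate_exit(struct toy_vcpu *v)
{
	v->last_path = TOY_PATH_GENERIC;
}

/* Models the VMX-specific implementation used for ordinary guests. */
static void toy_vmx_request_immediate_exit(struct toy_vcpu *v)
{
	v->last_path = TOY_PATH_VMX;
}

/* Same shape as the vt_* wrappers: check the vCPU type, then dispatch. */
static void toy_vt_request_immediate_exit(struct toy_vcpu *v)
{
	assert(v);
	if (v->is_td) {
		toy_generic_request_immediate_exit(v);
		return;
	}
	toy_vmx_request_immediate_exit(v);
}
```

The wrapper keeps a single ops table for both guest types, so the only TDX-specific decision is which backend the call is forwarded to.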
Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 arch/x86/kvm/vmx/main.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 13ad1278bef2..3e9007a7edfb 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -319,6 +319,14 @@ static void vt_enable_irq_window(struct kvm_vcpu *vcpu)
 	vmx_enable_irq_window(vcpu);
 }
 
+static void vt_request_immediate_exit(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return __kvm_request_immediate_exit(vcpu);
+
+	vmx_request_immediate_exit(vcpu);
+}
+
 static int vt_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 {
 	if (!is_td(kvm))
@@ -446,7 +454,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.check_intercept = vmx_check_intercept,
 	.handle_exit_irqoff = vmx_handle_exit_irqoff,
 
-	.request_immediate_exit = vmx_request_immediate_exit,
+	.request_immediate_exit = vt_request_immediate_exit,
 
 	.sched_in = vt_sched_in,
 

From patchwork Sun Oct 30 06:23:24 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12914
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 083/108] KVM: TDX: Implement methods to inject NMI
Date: Sat, 29 Oct 2022 23:23:24 -0700
Message-Id: <2c94568e26710ba09a4879bcf1601119f3c436be.1667110240.git.isaku.yamahata@intel.com>

From: Isaku Yamahata

The TDX vcpu control structure defines one bit for a pending NMI, which lets
the VMM inject an NMI by setting the bit without knowing the NMI state of the
TDX vcpu. Because the vcpu state is protected, the VMM cannot learn the NMI
state of a TDX vcpu. The TDX module handles the actual injection and the NMI
state transitions. Add methods for NMI and treat NMIs as always injectable.
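
A toy model of the pending-NMI bit described above may make the consequence concrete. It is a sketch under stated assumptions, not the TDX module's behaviour: `struct toy_td_vcpu`, `toy_inject_nmi()` and `toy_module_enter()` are hypothetical names, and the delivery rule stands in for logic the real module keeps hidden from the VMM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the per-vCPU state; not the real TDX layout. */
struct toy_td_vcpu {
	uint8_t pend_nmi;       /* models the single pending-NMI bit */
	bool nmi_blocked;       /* hidden from the VMM in real TDX */
	unsigned int delivered; /* NMIs the toy "module" has delivered */
};

/* VMM side: set the bit unconditionally, without looking at NMI state. */
static void toy_inject_nmi(struct toy_td_vcpu *v)
{
	assert(v);
	v->pend_nmi = 1;
}

/* Module side: on guest entry, deliver a pending NMI if not blocked. */
static void toy_module_enter(struct toy_td_vcpu *v)
{
	assert(v);
	if (v->pend_nmi && !v->nmi_blocked) {
		v->pend_nmi = 0;
		v->delivered++;
	}
}
```

Because there is only one bit, two injections before the next entry collapse into a single delivery in this model, and a blocked NMI simply stays pending until the module decides it can be delivered.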
Signed-off-by: Isaku Yamahata Reviewed-by: Paolo Bonzini --- arch/x86/kvm/vmx/main.c | 62 +++++++++++++++++++++++++++++++++++--- arch/x86/kvm/vmx/tdx.c | 5 +++ arch/x86/kvm/vmx/x86_ops.h | 2 ++ 3 files changed, 64 insertions(+), 5 deletions(-) diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c index 3e9007a7edfb..510bffb3e2f6 100644 --- a/arch/x86/kvm/vmx/main.c +++ b/arch/x86/kvm/vmx/main.c @@ -255,6 +255,58 @@ static void vt_flush_tlb_guest(struct kvm_vcpu *vcpu) vmx_flush_tlb_guest(vcpu); } +static void vt_inject_nmi(struct kvm_vcpu *vcpu) +{ + if (is_td_vcpu(vcpu)) + return tdx_inject_nmi(vcpu); + + vmx_inject_nmi(vcpu); +} + +static int vt_nmi_allowed(struct kvm_vcpu *vcpu, bool for_injection) +{ + /* + * The TDX module manages NMI windows and NMI reinjection, and hides NMI + * blocking, all KVM can do is throw an NMI over the wall. + */ + if (is_td_vcpu(vcpu)) + return true; + + return vmx_nmi_allowed(vcpu, for_injection); +} + +static bool vt_get_nmi_mask(struct kvm_vcpu *vcpu) +{ + /* + * Assume NMIs are always unmasked. KVM could query PEND_NMI and treat + * NMIs as masked if a previous NMI is still pending, but SEAMCALLs are + * expensive and the end result is unchanged as the only relevant usage + * of get_nmi_mask() is to limit the number of pending NMIs, i.e. it + * only changes whether KVM or the TDX module drops an NMI. + */ + if (is_td_vcpu(vcpu)) + return false; + + return vmx_get_nmi_mask(vcpu); +} + +static void vt_set_nmi_mask(struct kvm_vcpu *vcpu, bool masked) +{ + if (is_td_vcpu(vcpu)) + return; + + vmx_set_nmi_mask(vcpu, masked); +} + +static void vt_enable_nmi_window(struct kvm_vcpu *vcpu) +{ + /* Refer the comment in vt_get_nmi_mask(). 
*/ + if (is_td_vcpu(vcpu)) + return; + + vmx_enable_nmi_window(vcpu); +} + static void vt_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level) { @@ -410,14 +462,14 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .get_interrupt_shadow = vt_get_interrupt_shadow, .patch_hypercall = vmx_patch_hypercall, .inject_irq = vt_inject_irq, - .inject_nmi = vmx_inject_nmi, + .inject_nmi = vt_inject_nmi, .inject_exception = vmx_inject_exception, .cancel_injection = vt_cancel_injection, .interrupt_allowed = vt_interrupt_allowed, - .nmi_allowed = vmx_nmi_allowed, - .get_nmi_mask = vmx_get_nmi_mask, - .set_nmi_mask = vmx_set_nmi_mask, - .enable_nmi_window = vmx_enable_nmi_window, + .nmi_allowed = vt_nmi_allowed, + .get_nmi_mask = vt_get_nmi_mask, + .set_nmi_mask = vt_set_nmi_mask, + .enable_nmi_window = vt_enable_nmi_window, .enable_irq_window = vt_enable_irq_window, .update_cr8_intercept = vmx_update_cr8_intercept, .set_virtual_apic_mode = vmx_set_virtual_apic_mode, diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 7dad75b3b4a6..9805308079f1 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -678,6 +678,11 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu) return EXIT_FASTPATH_NONE; } +void tdx_inject_nmi(struct kvm_vcpu *vcpu) +{ + td_management_write8(to_tdx(vcpu), TD_VCPU_PEND_NMI, 1); +} + void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level) { td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa & PAGE_MASK); diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index 01fac8ba8c50..ba4e51446e41 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -156,6 +156,7 @@ bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu); void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode, int trig_mode, int vector); +void tdx_inject_nmi(struct kvm_vcpu *vcpu); int tdx_vm_ioctl(struct kvm *kvm, void __user *argp); int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void 
__user *argp); @@ -188,6 +189,7 @@ static inline bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu) { ret static inline void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode, int trig_mode, int vector) {} +static inline void tdx_inject_nmi(struct kvm_vcpu *vcpu) {} static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; } static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; } From patchwork Sun Oct 30 06:23:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12915 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1667066wru; Sat, 29 Oct 2022 23:32:27 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4VygX4u2t3psHWD5jAUa6A4F+gRCwhsZ82Cl3r+z5VrxBVkYS02HYw0+0yqKKMdprhYr24 X-Received: by 2002:a17:903:22c7:b0:187:190d:da89 with SMTP id y7-20020a17090322c700b00187190dda89mr1568172plg.68.1667111546939; Sat, 29 Oct 2022 23:32:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111546; cv=none; d=google.com; s=arc-20160816; b=TtsEdJpMU8eJCWbhAvbUdxGJiiAtn6/dfRFjeM/5teeebKMSYUjnQfvKApVD+8jO3E HQLnzYXMx0bFTEwXpOqSYjs9Aws6BJzmiUlZvHKZ+0BGiHXH4M63Ef0L8e6Lr4O2kkjo nEHbYa8U+6DWoIFXAeoBpoC3WA8EIJrZOvIkDVciBNIILQyi7VbSaLcmeMZfKKtdgLY7 MjO4XUg2QudGCEefsrruEm+B8jD2uEdtAHUtB3uaJzYO0WPxdcQIBaUhaOmwulXJBhnI Axav5493Ew8t+bcbL53QpZrR7vDxUA9Qrt5tBRqmXgu9N/e++mlWefmFGLlVZl4pIQqz qr2g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=XZsqKe7r7AX6kU4HCDKMcH47PsynUSkeQONzF7VfAQY=; b=oOcI5Z/IB1Kq7p7B8R/64psBROYTQPqj3WR0AUA/HooP3K2JIBKWY45dybcMYDLjLB qH71x7Oqo4NvR/2DupbavHjj+S22MYo8NwAnw0SR74HUP16SICK3LI8DATdtZAkkRIut 
UhuwKsDqwFCWT1L2Zh25lMWTcleju0NtKG0J1SKXos/2bBBwVuUVvX4Qbu7c4ajqVhKR I9l9ye7uUjJ2G7vIyFl2M/ltihS7owERucabLuvNuVMN5uGOMWszG33aR5KXcti85Fw6 zKJHch6+M+3sX6ZD/pH/aL1SEOYAbhj0EMDMLXSGRCemJ+l4M4nPvvAgisI50R6cPdVN OI6g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=dd0dMKtM; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id f192-20020a636ac9000000b004626957c3c7si4355813pgc.193.2022.10.29.23.32.14; Sat, 29 Oct 2022 23:32:26 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=dd0dMKtM; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231422AbiJ3Gbf (ORCPT + 99 others); Sun, 30 Oct 2022 02:31:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54088 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230432AbiJ3G3q (ORCPT ); Sun, 30 Oct 2022 02:29:46 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E1DC239C; Sat, 29 Oct 2022 23:24:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111089; x=1698647089; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=QfilQfz5T6WX7jIiQti9o/c2YET0YrmSZ7uU0IfCFmE=; b=dd0dMKtMGn6x5i9nU/MvAyaG1/4bsVYvt1IH7mIzD1UXTlO7FIaFa0R6 duAA0lWTJeSxdITv+r7hwycCFyj9Uq45O6h5LBPxpYBrOCskOGVu4ZSLz mlbths3ANKQ4aqnng+Ld40ZH1382GsJEIKNSDZWDKbTQFS+5PRjsx4P07 g6Hl11HWAmE0/PKyzKxFYAtQEgJ6tDWSJu8P4zQDFSEiUlU6wtVx3fX1b AFT3A13OCalbufwWNywIj+UXV2ouW0qATnips6b3S7nwfNiFG6ORWn216 7Tvu6Dy568shrTDixJxgkO58VzUKGOnKqwdHxpeVisDbxqD+hk9lx1rpk Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037197" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037197" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:11 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393113" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393113" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:11 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Sean Christopherson Subject: [PATCH v10 084/108] KVM: VMX: Modify NMI and INTR handlers to take intr_info as function argument Date: Sat, 29 Oct 2022 23:23:25 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093157631144412?= 
X-GMAIL-MSGID: =?utf-8?q?1748093157631144412?= From: Sean Christopherson TDX uses a different ABI to get information about a VM exit. Pass intr_info to the NMI and INTR handlers instead of pulling it from vcpu_vmx, in preparation for sharing the bulk of the handlers with TDX. When the guest TD exits to the VMM, RAX holds the status and exit reason, RCX holds the exit qualification, and so on, rather than these values being read from VMCS fields, because the VMM doesn't have access to the VMCS. The eventual code will be: VMX: - get exit reason, intr_info, exit_qualification, etc. from VMCS - call NMI/INTR handlers (common code) TDX: - get exit reason, intr_info, exit_qualification, etc. from guest registers - call NMI/INTR handlers (common code) Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata Reviewed-by: Paolo Bonzini --- arch/x86/kvm/vmx/vmx.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index b4c65eb17cc2..79d8d6a89516 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6830,28 +6830,27 @@ static void handle_nm_fault_irqoff(struct kvm_vcpu *vcpu) rdmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err); } -static void handle_exception_nmi_irqoff(struct vcpu_vmx *vmx) +static void handle_exception_nmi_irqoff(struct kvm_vcpu *vcpu, u32 intr_info) { const unsigned long nmi_entry = (unsigned long)asm_exc_nmi_noist; - u32 intr_info = vmx_get_intr_info(&vmx->vcpu); /* if exit due to PF check for async PF */ if (is_page_fault(intr_info)) - vmx->vcpu.arch.apf.host_apf_flags = kvm_read_and_reset_apf_flags(); + vcpu->arch.apf.host_apf_flags = kvm_read_and_reset_apf_flags(); /* if exit due to NM, handle before interrupts are enabled */ else if (is_nm_fault(intr_info)) - handle_nm_fault_irqoff(&vmx->vcpu); + handle_nm_fault_irqoff(vcpu); /* Handle machine checks before interrupts are enabled */ else if (is_machine_check(intr_info)) kvm_machine_check(); /* We need to handle NMIs before interrupts are enabled 
*/ else if (is_nmi(intr_info)) - handle_interrupt_nmi_irqoff(&vmx->vcpu, nmi_entry); + handle_interrupt_nmi_irqoff(vcpu, nmi_entry); } -static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu) +static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu, + u32 intr_info) { - u32 intr_info = vmx_get_intr_info(vcpu); unsigned int vector = intr_info & INTR_INFO_VECTOR_MASK; gate_desc *desc = (gate_desc *)host_idt_base + vector; @@ -6871,9 +6870,9 @@ void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu) return; if (vmx->exit_reason.basic == EXIT_REASON_EXTERNAL_INTERRUPT) - handle_external_interrupt_irqoff(vcpu); + handle_external_interrupt_irqoff(vcpu, vmx_get_intr_info(vcpu)); else if (vmx->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI) - handle_exception_nmi_irqoff(vmx); + handle_exception_nmi_irqoff(vcpu, vmx_get_intr_info(vcpu)); } /* From patchwork Sun Oct 30 06:23:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12918 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1667124wru; Sat, 29 Oct 2022 23:32:45 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7wjAHbTyWu2V+IceJFkBD4qYYz7PvIs4Nsub5ahkcqE3pdvEDp2426TMJBoxOBHBeKgoVy X-Received: by 2002:a17:90b:50e:b0:213:d7cb:83e3 with SMTP id r14-20020a17090b050e00b00213d7cb83e3mr278988pjz.232.1667111565319; Sat, 29 Oct 2022 23:32:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111565; cv=none; d=google.com; s=arc-20160816; b=CWGdzzw7LLE683UE0Ox6pyvHoe23zRQTYrVvSBNNT/eKPQ0i9ulpjXBrmjS5Dx244f A6PiEsM20OV5MvyaJ7Ik2sWsKfVF9PTeA+RvAafy3bO3PvvYN0T2qakcNXMJX1qs/4p/ qm9OuBbnBGeA58Z+/KgDGRNzaAazSn32dPBlNfz+fuulUciZna0MKyU9rSOEDS5hB0qC 2lTv51XuQn2RFDP0mSIkveYOg1ME3rqm98sv7OXgHKY1Laag876OU4YTLspW8SbMI3rG wne/sp6q7gzbUq8wo2uKqIrSuxE1OF7l/IqAIhOh64FUpwR2XlPdMdGk72L/+QVlQV5T F5/w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; 
h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=PuLixola/pmuUouMVgTh0Pmxg328WE63T8fLiplCd9Q=; b=og4sul5Dw2Qmr4ojTOUmV1URH6S0o7liBiVnBgVIlPg+lQLA+ocP0mpiWslUqJT08x WYEMCHYAQblHEO9aAVsyLJHgLYTwZ68KKkQ215NA2jGERY3UQC/7dWo53QFw+WsvNNIO DwH8Go/MjfGjnqCUYUlzLOLqkP9urkoTkEYnzdvyDWp4kwF+6IhZoPmqRvf0imzV1Kb8 GEDJ8OEgJYhlql4wrKK0AsB+w4VZfUt06t1Db3gwjxUzkl5BX28HW/uqSQCHkeOwCdLf 06TRSMN3hQEYw0yefN2tBblUZ74CU7wjZR8qO2RkBYDNBVNYnDlMrvX3vmc/KrGwRf66 Y1Gg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=daJ4Om+s; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l14-20020a17090b078e00b0020d2d54066csi3782127pjz.171.2022.10.29.23.32.32; Sat, 29 Oct 2022 23:32:45 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=daJ4Om+s; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231434AbiJ3Gbk (ORCPT + 99 others); Sun, 30 Oct 2022 02:31:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58236 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231285AbiJ3GaI (ORCPT ); Sun, 30 Oct 2022 02:30:08 -0400 Received: from mga05.intel.com (mga05.intel.com 
[192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4F4BD3A1; Sat, 29 Oct 2022 23:24:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111090; x=1698647090; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=U1/poVuEMcIZBzQv/9oQmn7JVsou64bidpPaDtiRnlQ=; b=daJ4Om+seASM5P0V4wQAXce6GQO7uWG340Xb8NgPN/+WMz+Ppk1GTdaV 23cHGe1bW2w6TntkznvGV1BJXqnLj5+wWjFGcbG3MxnTXI1X7lFFimGvN 4VQ7wAUby7QHLAeJ02ufbESJXpVEZ6xuhJs5i7UZ+RiarXW2oZEHvuYHE OkN965uaI1k2jCWi79Ehhoa3Yl5hgkVNnjtMD+KhhMsYk704FWpwfYy+r /ZIeqTPiE50FELqNhU5U5pAWgE7KnU3OxSoHtBvCfOvIEtDjoNt36iZ9c vBv1L3h5S/QPaQcoPbT/9CBJWXMs8DEQFNIMzCf/TkwK3oCQPgYgsSma4 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037198" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037198" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:11 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393116" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393116" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:11 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Sean Christopherson Subject: [PATCH v10 085/108] KVM: VMX: Move NMI/exception handler to common helper Date: Sat, 29 Oct 2022 23:23:26 -0700 Message-Id: <2954b502dd4477a439e1ff1ffc299997ba91c5f7.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, 
DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093176225200052?= X-GMAIL-MSGID: =?utf-8?q?1748093176225200052?= From: Sean Christopherson TDX handles NMI/exception exits mostly the same way as VMX. The difference is how the exit qualification is retrieved. To share the code with TDX, move the NMI/exception handlers to a common header, common.h. Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata --- arch/x86/kvm/vmx/common.h | 70 ++++++++++++++++++++++++++++++++++ arch/x86/kvm/vmx/vmx.c | 79 ++++----------------------------------- 2 files changed, 78 insertions(+), 71 deletions(-) diff --git a/arch/x86/kvm/vmx/common.h b/arch/x86/kvm/vmx/common.h index 747f993cf7de..04836d88baaa 100644 --- a/arch/x86/kvm/vmx/common.h +++ b/arch/x86/kvm/vmx/common.h @@ -4,8 +4,78 @@ #include +#include + #include "posted_intr.h" #include "mmu.h" +#include "vmcs.h" +#include "x86.h" + +extern unsigned long vmx_host_idt_base; +void vmx_do_interrupt_nmi_irqoff(unsigned long entry); + +static inline void vmx_handle_interrupt_nmi_irqoff(struct kvm_vcpu *vcpu, + unsigned long entry) +{ + bool is_nmi = entry == (unsigned long)asm_exc_nmi_noist; + + kvm_before_interrupt(vcpu, is_nmi ? KVM_HANDLING_NMI : KVM_HANDLING_IRQ); + vmx_do_interrupt_nmi_irqoff(entry); + kvm_after_interrupt(vcpu); +} + +static inline void vmx_handle_nm_fault_irqoff(struct kvm_vcpu *vcpu) +{ + /* + * Save xfd_err to guest_fpu before interrupt is enabled, so the + * MSR value is not clobbered by the host activity before the guest + * has chance to consume it. 
+ * + * Do not blindly read xfd_err here, since this exception might + * be caused by L1 interception on a platform which doesn't + * support xfd at all. + * + * Do it conditionally upon guest_fpu::xfd. xfd_err matters + * only when xfd contains a non-zero value. + * + * Queuing exception is done in vmx_handle_exit. See comment there. + */ + if (vcpu->arch.guest_fpu.fpstate->xfd) + rdmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err); +} + +static inline void vmx_handle_exception_nmi_irqoff(struct kvm_vcpu *vcpu, + u32 intr_info) +{ + const unsigned long nmi_entry = (unsigned long)asm_exc_nmi_noist; + + /* if exit due to PF check for async PF */ + if (is_page_fault(intr_info)) + vcpu->arch.apf.host_apf_flags = kvm_read_and_reset_apf_flags(); + /* if exit due to NM, handle before interrupts are enabled */ + else if (is_nm_fault(intr_info)) + vmx_handle_nm_fault_irqoff(vcpu); + /* Handle machine checks before interrupts are enabled */ + else if (is_machine_check(intr_info)) + kvm_machine_check(); + /* We need to handle NMIs before interrupts are enabled */ + else if (is_nmi(intr_info)) + vmx_handle_interrupt_nmi_irqoff(vcpu, nmi_entry); +} + +static inline void vmx_handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu, + u32 intr_info) +{ + unsigned int vector = intr_info & INTR_INFO_VECTOR_MASK; + gate_desc *desc = (gate_desc *)vmx_host_idt_base + vector; + + if (KVM_BUG(!is_external_intr(intr_info), vcpu->kvm, + "KVM: unexpected VM-Exit interrupt info: 0x%x", intr_info)) + return; + + vmx_handle_interrupt_nmi_irqoff(vcpu, gate_offset(desc)); + vcpu->arch.at_instruction_boundary = true; +} static inline int __vmx_handle_ept_violation(struct kvm_vcpu *vcpu, gpa_t gpa, unsigned long exit_qualification) diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 79d8d6a89516..ee2705707266 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -525,7 +525,7 @@ static inline void vmx_segment_cache_clear(struct vcpu_vmx *vmx) 
vmx->segment_cache.bitmask = 0; } -static unsigned long host_idt_base; +unsigned long vmx_host_idt_base; #if IS_ENABLED(CONFIG_HYPERV) static bool __read_mostly enlightened_vmcs = true; @@ -4236,7 +4236,7 @@ void vmx_set_constant_host_state(struct vcpu_vmx *vmx) vmcs_write16(HOST_SS_SELECTOR, __KERNEL_DS); /* 22.2.4 */ vmcs_write16(HOST_TR_SELECTOR, GDT_ENTRY_TSS*8); /* 22.2.4 */ - vmcs_writel(HOST_IDTR_BASE, host_idt_base); /* 22.2.4 */ + vmcs_writel(HOST_IDTR_BASE, vmx_host_idt_base); /* 22.2.4 */ vmcs_writel(HOST_RIP, (unsigned long)vmx_vmexit); /* 22.2.5 */ @@ -5120,10 +5120,10 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu) intr_info = vmx_get_intr_info(vcpu); if (is_machine_check(intr_info) || is_nmi(intr_info)) - return 1; /* handled by handle_exception_nmi_irqoff() */ + return 1; /* handled by vmx_handle_exception_nmi_irqoff() */ /* - * Queue the exception here instead of in handle_nm_fault_irqoff(). + * Queue the exception here instead of in vmx_handle_nm_fault_irqoff(). * This ensures the nested_vmx check is not skipped so vmexit can * be reflected to L1 (when it intercepts #NM) before reaching this * point. @@ -6798,70 +6798,6 @@ void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap) vmcs_write64(EOI_EXIT_BITMAP3, eoi_exit_bitmap[3]); } -void vmx_do_interrupt_nmi_irqoff(unsigned long entry); - -static void handle_interrupt_nmi_irqoff(struct kvm_vcpu *vcpu, - unsigned long entry) -{ - bool is_nmi = entry == (unsigned long)asm_exc_nmi_noist; - - kvm_before_interrupt(vcpu, is_nmi ? KVM_HANDLING_NMI : KVM_HANDLING_IRQ); - vmx_do_interrupt_nmi_irqoff(entry); - kvm_after_interrupt(vcpu); -} - -static void handle_nm_fault_irqoff(struct kvm_vcpu *vcpu) -{ - /* - * Save xfd_err to guest_fpu before interrupt is enabled, so the - * MSR value is not clobbered by the host activity before the guest - * has chance to consume it. 
- * - * Do not blindly read xfd_err here, since this exception might - * be caused by L1 interception on a platform which doesn't - * support xfd at all. - * - * Do it conditionally upon guest_fpu::xfd. xfd_err matters - * only when xfd contains a non-zero value. - * - * Queuing exception is done in vmx_handle_exit. See comment there. - */ - if (vcpu->arch.guest_fpu.fpstate->xfd) - rdmsrl(MSR_IA32_XFD_ERR, vcpu->arch.guest_fpu.xfd_err); -} - -static void handle_exception_nmi_irqoff(struct kvm_vcpu *vcpu, u32 intr_info) -{ - const unsigned long nmi_entry = (unsigned long)asm_exc_nmi_noist; - - /* if exit due to PF check for async PF */ - if (is_page_fault(intr_info)) - vcpu->arch.apf.host_apf_flags = kvm_read_and_reset_apf_flags(); - /* if exit due to NM, handle before interrupts are enabled */ - else if (is_nm_fault(intr_info)) - handle_nm_fault_irqoff(vcpu); - /* Handle machine checks before interrupts are enabled */ - else if (is_machine_check(intr_info)) - kvm_machine_check(); - /* We need to handle NMIs before interrupts are enabled */ - else if (is_nmi(intr_info)) - handle_interrupt_nmi_irqoff(vcpu, nmi_entry); -} - -static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu, - u32 intr_info) -{ - unsigned int vector = intr_info & INTR_INFO_VECTOR_MASK; - gate_desc *desc = (gate_desc *)host_idt_base + vector; - - if (KVM_BUG(!is_external_intr(intr_info), vcpu->kvm, - "KVM: unexpected VM-Exit interrupt info: 0x%x", intr_info)) - return; - - handle_interrupt_nmi_irqoff(vcpu, gate_offset(desc)); - vcpu->arch.at_instruction_boundary = true; -} - void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu) { struct vcpu_vmx *vmx = to_vmx(vcpu); @@ -6870,9 +6806,10 @@ void vmx_handle_exit_irqoff(struct kvm_vcpu *vcpu) return; if (vmx->exit_reason.basic == EXIT_REASON_EXTERNAL_INTERRUPT) - handle_external_interrupt_irqoff(vcpu, vmx_get_intr_info(vcpu)); + vmx_handle_external_interrupt_irqoff(vcpu, + vmx_get_intr_info(vcpu)); else if (vmx->exit_reason.basic == 
EXIT_REASON_EXCEPTION_NMI) - handle_exception_nmi_irqoff(vcpu, vmx_get_intr_info(vcpu)); + vmx_handle_exception_nmi_irqoff(vcpu, vmx_get_intr_info(vcpu)); } /* @@ -8120,7 +8057,7 @@ __init int vmx_hardware_setup(void) int r; store_idt(&dt); - host_idt_base = dt.address; + vmx_host_idt_base = dt.address; vmx_setup_user_return_msrs(); From patchwork Sun Oct 30 06:23:27 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12917 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1667109wru; Sat, 29 Oct 2022 23:32:39 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5bS1xEV/Xh8zlt9nKRr5ow0uQ8LLxBctOO2SfAMgrQgMYRXf6J69NV9NR7C9qXWgIAwgAE X-Received: by 2002:a17:90a:6c41:b0:212:fdaf:d79c with SMTP id x59-20020a17090a6c4100b00212fdafd79cmr8141597pjj.134.1667111559309; Sat, 29 Oct 2022 23:32:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111559; cv=none; d=google.com; s=arc-20160816; b=A79GnkJVow8LzpYapM5/gUe0qoEndBofAH/1WQJduLBGkdEYTTBYsrSLe0uYzbPPDx Hff43SvWuTh1Ws11Ndcywf4dbke2yTNlohBjqp2poPyKMjej//ORvZxGH28HmE1ei0LB XgbZ3nEiHTSnAPsHVD6YSeu42+ChZVfNoucodRrFcbmwCRDBLNPo3R2D/RTF5fZwsG0H J5VSwREJs589RvQRg8GbIAqnbNiZQRoImZIzNXTBvdvk+23IieL7Khb5F2LFFVkY8xD2 pHNt2/AmTJyMO61Sl8qgqQZsKqLnI/Lupmk5WUo3fe85WonSCO4Gqg+kUIrF1e4Ppt8C Elqg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=Chq0J6ZS7N54dVQB0feSxO9NZVXC6078nDa3DN0lLsw=; b=judv8aSqO2/W+mryBPVUsytUVJTLhNY9DkpRuvskkcxKpTZDjQzaMLRlbvLYMbHwKB kqRPQXQ3J9NZM+cNXmd3FDScJP3+kmGaxwCVkSwTPEWJ1KP8cszS+U/dh4TfEZ0PiVEe dERkiu6xCIU1abFY9OJUAzblgjRZ+/sUuA3ndjeZnBL2dXeh/ZfFbjyWV7fWQWEY7b1E vvhN5mbnU7bvn6J3Jm1d0qd3kdK10AxQfFM/eNLHy75K068MkLNhtfVB3q8iVY46x/q2 
EACoSFE3wOX8i86KRA0L1M+r62Qg8LiPrityiR3L3efE/TzmpkaWQI9cRpHlPJPjFXY+ nE7A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=h70NMDf2; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id oa6-20020a17090b1bc600b002136a8424e9si4858909pjb.1.2022.10.29.23.32.27; Sat, 29 Oct 2022 23:32:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=h70NMDf2; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231171AbiJ3GcC (ORCPT + 99 others); Sun, 30 Oct 2022 02:32:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58896 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231321AbiJ3Ga2 (ORCPT ); Sun, 30 Oct 2022 02:30:28 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6E54FB0A; Sat, 29 Oct 2022 23:24:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111093; x=1698647093; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SVVFPTYiF9KYWEo+FXLY2P2CyEXga1RHRMRYTXfVETU=; b=h70NMDf2z8z5+VThR5w0zANIaocnyNUR0+wLyWJtmO1Xp6QsCVsDKRwu 
JRe1RRf1wRGDG2VZg9WVzBFaHlgGoViVDO8weiZ7Jn6MojdmAR9k1qd4J DeMD60yFHh8hbzX6IFxONzVKNrMRshDI8rVBh2PyuQEq1GPF5XHZ7OAI4 lxuqZ5t+khOtPNWsvpXqskCa8VirEzVJRjcITetTpudpGaupnlv/FNj0K gkQxbTa8u10regbr1v5eU8SuHH+hYGAW7zSDFe+Br/PUdVNOe8I2gdz6U fieS9pRJVOv5qbKElD0VWkxDUiBryeNTbxx+GGqR/lug1ACfMHC0Vg7dy Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037199" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037199" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:11 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393119" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393119" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:11 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Sean Christopherson Subject: [PATCH v10 086/108] KVM: x86: Split core of hypercall emulation to helper function Date: Sat, 29 Oct 2022 23:23:27 -0700 Message-Id: <906a188cbc382ff2d26954984b72d6e0f617382b.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093170758053989?= X-GMAIL-MSGID: =?utf-8?q?1748093170758053989?= From: Sean Christopherson By necessity, TDX 
will use a different register ABI for hypercalls. Break out the core functionality so that it may be reused for TDX. Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata --- arch/x86/include/asm/kvm_host.h | 4 +++ arch/x86/kvm/x86.c | 54 ++++++++++++++++++++------------- 2 files changed, 37 insertions(+), 21 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 70549018987d..094fff5414e1 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -2004,6 +2004,10 @@ static inline void kvm_clear_apicv_inhibit(struct kvm *kvm, kvm_set_or_clear_apicv_inhibit(kvm, reason, false); } +unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, + unsigned long a0, unsigned long a1, + unsigned long a2, unsigned long a3, + int op_64_bit); int kvm_emulate_hypercall(struct kvm_vcpu *vcpu); int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 error_code, diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index ad7b227b68dd..fad5108dff1e 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9684,26 +9684,15 @@ static int complete_hypercall_exit(struct kvm_vcpu *vcpu) return kvm_skip_emulated_instruction(vcpu); } -int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) +unsigned long __kvm_emulate_hypercall(struct kvm_vcpu *vcpu, unsigned long nr, + unsigned long a0, unsigned long a1, + unsigned long a2, unsigned long a3, + int op_64_bit) { - unsigned long nr, a0, a1, a2, a3, ret; - int op_64_bit; - - if (kvm_xen_hypercall_enabled(vcpu->kvm)) - return kvm_xen_hypercall(vcpu); - - if (kvm_hv_hypercall_enabled(vcpu)) - return kvm_hv_hypercall(vcpu); - - nr = kvm_rax_read(vcpu); - a0 = kvm_rbx_read(vcpu); - a1 = kvm_rcx_read(vcpu); - a2 = kvm_rdx_read(vcpu); - a3 = kvm_rsi_read(vcpu); + unsigned long ret; trace_kvm_hypercall(nr, a0, a1, a2, a3); - op_64_bit = is_64_bit_hypercall(vcpu); if (!op_64_bit) { nr &= 0xFFFFFFFF; a0 &= 0xFFFFFFFF; @@ -9712,11 +9701,6 @@ int 
kvm_emulate_hypercall(struct kvm_vcpu *vcpu) a3 &= 0xFFFFFFFF; } - if (static_call(kvm_x86_get_cpl)(vcpu) != 0) { - ret = -KVM_EPERM; - goto out; - } - ret = -KVM_ENOSYS; switch (nr) { @@ -9775,6 +9759,34 @@ int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) ret = -KVM_ENOSYS; break; } + return ret; +} +EXPORT_SYMBOL_GPL(__kvm_emulate_hypercall); + +int kvm_emulate_hypercall(struct kvm_vcpu *vcpu) +{ + unsigned long nr, a0, a1, a2, a3, ret; + int op_64_bit; + + if (kvm_xen_hypercall_enabled(vcpu->kvm)) + return kvm_xen_hypercall(vcpu); + + if (kvm_hv_hypercall_enabled(vcpu)) + return kvm_hv_hypercall(vcpu); + + nr = kvm_rax_read(vcpu); + a0 = kvm_rbx_read(vcpu); + a1 = kvm_rcx_read(vcpu); + a2 = kvm_rdx_read(vcpu); + a3 = kvm_rsi_read(vcpu); + op_64_bit = is_64_bit_hypercall(vcpu); + + if (static_call(kvm_x86_get_cpl)(vcpu) != 0) { + ret = -KVM_EPERM; + goto out; + } + + ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, op_64_bit); out: if (!op_64_bit) ret = (u32)ret; From patchwork Sun Oct 30 06:23:28 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12921 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1667221wru; Sat, 29 Oct 2022 23:33:06 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6sH53CRxrGEt3OQnp33nMmmksoVTT/w9pkt1sNEK+5VRER6GNZV82gkIUPcpw3v6XeY0IB X-Received: by 2002:a17:90a:9bc7:b0:213:9d21:b0b0 with SMTP id b7-20020a17090a9bc700b002139d21b0b0mr8343817pjw.26.1667111585987; Sat, 29 Oct 2022 23:33:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111585; cv=none; d=google.com; s=arc-20160816; b=aWsRxoeZxpcdt4cZqndzcJoPpADiiOZW01PGe+keXIxeVsjopDjs8T67H6G24d3Er7 LlthBxsFjFRKbtH02wz2lhePL4gFh0MLmF2Xg7Yrsjn0vpoI/CGiivqX16kk8/6y7ids dOg8v5pW4BDhRjGreDyUAzfrs6w75ITYavY6pIPfrz1FHz8HIodi2GsfOUWpKmL7p+hB iHwk+hthpHypjaH7Ocu5alsnto483m1Uf/WfcEVAIwygcERA9Kh1F0FDI2bTX61uSU+a 
AY+EDfbesj54+Wpn4qJ/OATPiNPZ7tSwSreJRWwsrUIRIspl7GR0oohLJACd9QQOJAaQ fKjw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=DWBzWfmxn7jmLYYFM6CIYjJw+1DPxhakSdchUpms+iM=; b=jxZcBGakqnd/v2MDObzC67sR4pV9W6MnkjWoJvB4tniLByQfgp64kz0Tkh6D6FZ8+R fOCWfrYn2qawnzRnPSCapNOM/FW8+Od8OZId/kRwR2QWl88x4JZC5wgVa8YtgO1ZuTL/ BI4Z1bKOF3oO681+COeueBcRnNS3fItA7FWC5JTTwvuk6Adu/WlhRN91FMT57s6Q4gcJ dWu8j0Wwg3NLxh8faWy3Td4ZHEDXgqXvKgyBHORf6qaDj53wtVOl88DdXSlJdsF9+4aC O4V7agGIgnmjsQY6eQJXbI+ZQ4oV6MDNthlGkD/W4ftpfwq7i+RgyVYDCX2nB03RASBZ 2xtQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="Bu4X4R/B"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id x186-20020a6386c3000000b00429946b0642si4297184pgd.469.2022.10.29.23.32.52; Sat, 29 Oct 2022 23:33:05 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b="Bu4X4R/B"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231499AbiJ3GcU (ORCPT + 99 others); Sun, 30 Oct 2022 02:32:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58898 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231322AbiJ3Ga2 (ORCPT ); Sun, 30 Oct 2022 02:30:28 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6EE68B0E; Sat, 29 Oct 2022 23:24:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111093; x=1698647093; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=NuD7xE6fSBLn6MKywGixO5NPoMn/owWdYx5cJ8Y8gi0=; b=Bu4X4R/BvC8S9N8snu4Fr49bXT/SxyMSdytJwp3omw+P9tlsqzf+9AeE 5nBvwLp/DH5XYHin6nz3vtqv/RiAOSzS2oOc0dtpsKwsEYBX1uPUMtp4Y VjDFmXBCidsCH3t82dr1jgKYbUT8++G9GYXEB8G0YjdlZRIt75eUCvipJ Zn6DNIq/iGOPpT2T8JNTdU/8/fdrxIp/5Ak8aQ+p9uKHnEXY+n8cMqNXH wQgxQ27ZNbZ9tOYYRhDKMxYrVPMj9Pk1pMFbWQIpLAvscBoDCKudgWlQN zqDQUHtsLvAQSUFwmUlmif2adyoFf7b4rl/NBln+cOY8F00ihbdIIeu0w g==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037202" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037202" Received: from fmsmga006.fm.intel.com 
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393122" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393122" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:11 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 087/108] KVM: TDX: Add a place holder to handle TDX VM exit Date: Sat, 29 Oct 2022 23:23:28 -0700 Message-Id: <9ff55f3061cb838b883cb317f58c0e26a2e82dd0.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093198312031239?= X-GMAIL-MSGID: =?utf-8?q?1748093198312031239?= From: Isaku Yamahata Wire up handle_exit and handle_exit_irqoff methods and add a place holder to handle VM exit. Add helper functions to get exit info, exit qualification, etc. 
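For illustration only (not part of this patch): the wiring described above can be sketched in userspace C roughly as follows. Every mock_* name, the struct layout, and the narrowing of intr_info to 32 bits are assumptions made for the sketch. Only the register assignment (RCX for the exit qualification, R9 for intr_info) and the is_td_vcpu()-style dispatch are taken from the diff in this mail.

```c
/* Userspace mock. All mock_* identifiers are hypothetical, not KVM APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum mock_reg { MOCK_RCX, MOCK_RDX, MOCK_R8, MOCK_R9, MOCK_NR_REGS };

struct mock_vcpu {
	bool is_td;                   /* stands in for is_td_vcpu() */
	uint64_t regs[MOCK_NR_REGS];  /* TDX: exit info arrives in guest registers */
	uint64_t vmcs_exit_qual;      /* VMX: exit info is read from the VMCS */
	uint32_t vmcs_intr_info;
};

/* TDX flavour: same idea as tdexit_exit_qual() and tdexit_intr_info(). */
static uint64_t mock_tdx_exit_qual(const struct mock_vcpu *v)
{
	return v->regs[MOCK_RCX];
}

static uint32_t mock_tdx_intr_info(const struct mock_vcpu *v)
{
	/* The sketch keeps only the low 32 bits of R9 (an assumption). */
	return (uint32_t)v->regs[MOCK_R9];
}

/* VMX flavour: the values come from (mocked) VMCS fields instead. */
static uint64_t mock_vmx_exit_qual(const struct mock_vcpu *v)
{
	return v->vmcs_exit_qual;
}

static uint32_t mock_vmx_intr_info(const struct mock_vcpu *v)
{
	return v->vmcs_intr_info;
}

/* Dispatcher in the style of the vt_*() wrappers: pick the backend per vCPU. */
static void mock_vt_get_exit_info(const struct mock_vcpu *v,
				  uint64_t *exit_qual, uint32_t *intr_info)
{
	if (v->is_td) {
		*exit_qual = mock_tdx_exit_qual(v);
		*intr_info = mock_tdx_intr_info(v);
		return;
	}

	*exit_qual = mock_vmx_exit_qual(v);
	*intr_info = mock_vmx_intr_info(v);
}
```

A caller fills in a struct mock_vcpu, sets is_td, and calls mock_vt_get_exit_info(). The common handlers then only see the already-extracted values, which is why the earlier patches in the series pass intr_info as a function argument.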

Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 arch/x86/kvm/vmx/main.c    | 33 +++++++++++++--
 arch/x86/kvm/vmx/tdx.c     | 84 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/x86_ops.h | 10 +++++
 3 files changed, 124 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 510bffb3e2f6..74c561e3eb46 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -178,6 +178,23 @@ static bool vt_protected_apic_has_interrupt(struct kvm_vcpu *vcpu)
 	return tdx_protected_apic_has_interrupt(vcpu);
 }
 
+static int vt_handle_exit(struct kvm_vcpu *vcpu,
+			  enum exit_fastpath_completion fastpath)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_handle_exit(vcpu, fastpath);
+
+	return vmx_handle_exit(vcpu, fastpath);
+}
+
+static void vt_handle_exit_irqoff(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_handle_exit_irqoff(vcpu);
+
+	vmx_handle_exit_irqoff(vcpu);
+}
+
 static void vt_apicv_post_state_restore(struct kvm_vcpu *vcpu)
 {
 	struct pi_desc *pi = vcpu_to_pi_desc(vcpu);
@@ -379,6 +396,16 @@ static void vt_request_immediate_exit(struct kvm_vcpu *vcpu)
 	vmx_request_immediate_exit(vcpu);
 }
 
+static void vt_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason,
+			u64 *info1, u64 *info2, u32 *intr_info, u32 *error_code)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_get_exit_info(vcpu, reason, info1, info2, intr_info,
+					 error_code);
+
+	return vmx_get_exit_info(vcpu, reason, info1, info2, intr_info, error_code);
+}
+
 static int vt_mem_enc_ioctl(struct kvm *kvm, void __user *argp)
 {
 	if (!is_td(kvm))
@@ -455,7 +482,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 
 	.vcpu_pre_run = vt_vcpu_pre_run,
 	.vcpu_run = vt_vcpu_run,
-	.handle_exit = vmx_handle_exit,
+	.handle_exit = vt_handle_exit,
 	.skip_emulated_instruction = vmx_skip_emulated_instruction,
 	.update_emulated_instruction = vmx_update_emulated_instruction,
 	.set_interrupt_shadow = vt_set_interrupt_shadow,
@@ -490,7 +517,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.set_identity_map_addr = vmx_set_identity_map_addr,
 	.get_mt_mask = vmx_get_mt_mask,
 
-	.get_exit_info = vmx_get_exit_info,
+	.get_exit_info = vt_get_exit_info,
 
 	.vcpu_after_set_cpuid = vmx_vcpu_after_set_cpuid,
 
@@ -504,7 +531,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.load_mmu_pgd = vt_load_mmu_pgd,
 
 	.check_intercept = vmx_check_intercept,
-	.handle_exit_irqoff = vmx_handle_exit_irqoff,
+	.handle_exit_irqoff = vt_handle_exit_irqoff,
 
 	.request_immediate_exit = vt_request_immediate_exit,
 
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 9805308079f1..b0b193cc180e 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -65,6 +65,26 @@ static __always_inline hpa_t set_hkid_to_hpa(hpa_t pa, u16 hkid)
 	return pa | ((hpa_t)hkid << boot_cpu_data.x86_phys_bits);
 }
 
+static __always_inline unsigned long tdexit_exit_qual(struct kvm_vcpu *vcpu)
+{
+	return kvm_rcx_read(vcpu);
+}
+
+static __always_inline unsigned long tdexit_ext_exit_qual(struct kvm_vcpu *vcpu)
+{
+	return kvm_rdx_read(vcpu);
+}
+
+static __always_inline unsigned long tdexit_gpa(struct kvm_vcpu *vcpu)
+{
+	return kvm_r8_read(vcpu);
+}
+
+static __always_inline unsigned long tdexit_intr_info(struct kvm_vcpu *vcpu)
+{
+	return kvm_r9_read(vcpu);
+}
+
 static inline bool is_td_vcpu_created(struct vcpu_tdx *tdx)
 {
 	return tdx->tdvpr.added;
@@ -683,6 +703,25 @@ void tdx_inject_nmi(struct kvm_vcpu *vcpu)
 	td_management_write8(to_tdx(vcpu), TD_VCPU_PEND_NMI, 1);
 }
 
+void tdx_handle_exit_irqoff(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+	u16 exit_reason = tdx->exit_reason.basic;
+
+	if (exit_reason == EXIT_REASON_EXCEPTION_NMI)
+		vmx_handle_exception_nmi_irqoff(vcpu, tdexit_intr_info(vcpu));
+	else if (exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
+		vmx_handle_external_interrupt_irqoff(vcpu,
+						     tdexit_intr_info(vcpu));
+}
+
+static int tdx_handle_triple_fault(struct kvm_vcpu *vcpu)
+{
+	vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN;
+	vcpu->mmio_needed = 0;
+	return 0;
+}
+
 void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level)
 {
 	td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa & PAGE_MASK);
@@ -1007,6 +1046,51 @@ void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
 	__vmx_deliver_posted_interrupt(vcpu, &tdx->pi_desc, vector);
 }
 
+int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath)
+{
+	union tdx_exit_reason exit_reason = to_tdx(vcpu)->exit_reason;
+
+	if (unlikely(exit_reason.non_recoverable || exit_reason.error)) {
+		if (exit_reason.basic == EXIT_REASON_TRIPLE_FAULT)
+			return tdx_handle_triple_fault(vcpu);
+
+		kvm_pr_unimpl("TD exit 0x%llx, %d hkid 0x%x hkid pa 0x%llx\n",
+			      exit_reason.full, exit_reason.basic,
+			      to_kvm_tdx(vcpu->kvm)->hkid,
+			      set_hkid_to_hpa(0, to_kvm_tdx(vcpu->kvm)->hkid));
+		goto unhandled_exit;
+	}
+
+	WARN_ON_ONCE(fastpath != EXIT_FASTPATH_NONE);
+
+	switch (exit_reason.basic) {
+	default:
+		break;
+	}
+
+unhandled_exit:
+	vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+	vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_UNEXPECTED_EXIT_REASON;
+	vcpu->run->internal.ndata = 2;
+	vcpu->run->internal.data[0] = exit_reason.full;
+	vcpu->run->internal.data[1] = vcpu->arch.last_vmentry_cpu;
+	return 0;
+}
+
+void tdx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason,
+		u64 *info1, u64 *info2, u32 *intr_info, u32 *error_code)
+{
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+
+	*reason = tdx->exit_reason.full;
+
+	*info1 = tdexit_exit_qual(vcpu);
+	*info2 = tdexit_ext_exit_qual(vcpu);
+
+	*intr_info = tdexit_intr_info(vcpu);
+	*error_code = 0;
+}
+
 int tdx_dev_ioctl(void __user *argp)
 {
 	struct kvm_tdx_capabilities __user *user_caps;
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index ba4e51446e41..d02619f64b6e 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -153,10 +153,15 @@ void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu);
 void tdx_vcpu_put(struct kvm_vcpu *vcpu);
 void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
 bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu);
+void tdx_handle_exit_irqoff(struct kvm_vcpu *vcpu);
+int tdx_handle_exit(struct kvm_vcpu *vcpu,
+		enum exit_fastpath_completion fastpath);
 
 void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
 			   int trig_mode, int vector);
 void tdx_inject_nmi(struct kvm_vcpu *vcpu);
+void tdx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason,
+		u64 *info1, u64 *info2, u32 *intr_info, u32 *error_code);
 
 int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
@@ -186,10 +191,15 @@ static inline void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu) {}
 static inline void tdx_vcpu_put(struct kvm_vcpu *vcpu) {}
 static inline void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu) {}
 static inline bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu) { return false; }
+static inline void tdx_handle_exit_irqoff(struct kvm_vcpu *vcpu) {}
+static inline int tdx_handle_exit(struct kvm_vcpu *vcpu,
+		enum exit_fastpath_completion fastpath) { return 0; }
 
 static inline void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
 					 int trig_mode, int vector) {}
 static inline void tdx_inject_nmi(struct kvm_vcpu *vcpu) {}
+static inline void tdx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason, u64 *info1,
+				     u64 *info2, u32 *intr_info, u32 *error_code) {}
 
 static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; }
 static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; }

From patchwork Sun Oct 30 06:23:29 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12919
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Yuan Yao
Subject: [PATCH v10 088/108] KVM: TDX: Retry seamcall when TDX_OPERAND_BUSY with operand SEPT
Date: Sat, 29 Oct 2022 23:23:29 -0700

From: Yuan Yao

The TDX module uses internal locks to protect its resources. It tries to
acquire the lock and, if it fails, returns a TDX_OPERAND_BUSY error without
spinning, because its execution time is limited.

The TDX SEAMCALL API reference describes which resources each SEAMCALL uses,
so it is known which SEAMCALLs can contend for which resources. The VMM can
avoid contention inside the TDX module by serializing the contending
SEAMCALLs itself, for example with a spinlock.
Because the OS knows its own process scheduling and scalability better, a
lock at the OS/VMM layer would work better than simply retrying TDX
SEAMCALLs.

The TDH.MEM.* APIs, except for TDH.MEM.TRACK, operate on the secure EPT tree,
and the TDX module internally tries to acquire the secure EPT tree lock.
They return TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT if they fail to get the
lock. TDX KVM allows the SEPT callbacks to return an error so that the TDP
MMU layer can retry.

TDH.VP.ENTER is an exception because of zero-step attack mitigation.
Normally TDH.VP.ENTER uses only TD vcpu resources and doesn't cause
contention. When a zero-step attack is suspected, it takes the secure EPT
tree lock and tracks the GPAs causing secure EPT faults. Thus TDH.VP.ENTER
may result in TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT, and so may the
TDH.MEM.* SEAMCALLs.

Retry the TDH.MEM.* APIs and TDH.VP.ENTER on this error, because the error is
a rare event caused by zero-step attack mitigation and a spinlock cannot be
used around TDH.VP.ENTER, whose execution time is unbounded.

Signed-off-by: Yuan Yao
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/tdx.c     |  4 ++++
 arch/x86/kvm/vmx/tdx_ops.h | 42 +++++++++++++++++++++++++++++++-------
 2 files changed, 39 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index b0b193cc180e..088e98232227 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1050,6 +1050,10 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath)
 {
 	union tdx_exit_reason exit_reason = to_tdx(vcpu)->exit_reason;
 
+	/* See the comment of tdh_sept_seamcall(). */
+	if (unlikely(exit_reason.full == (TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT)))
+		return 1;
+
 	if (unlikely(exit_reason.non_recoverable || exit_reason.error)) {
 		if (exit_reason.basic == EXIT_REASON_TRIPLE_FAULT)
 			return tdx_handle_triple_fault(vcpu);
diff --git a/arch/x86/kvm/vmx/tdx_ops.h b/arch/x86/kvm/vmx/tdx_ops.h
index 35e285ae6f9e..86330d0e4b22 100644
--- a/arch/x86/kvm/vmx/tdx_ops.h
+++ b/arch/x86/kvm/vmx/tdx_ops.h
@@ -18,7 +18,35 @@
 
 void pr_tdx_error(u64 op, u64 error_code, const struct tdx_module_output *out);
 
-#define TDX_ERROR_SEPT_BUSY    (TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT)
+/*
+ * TDX module acquires its internal lock for resources.  It doesn't spin to get
+ * locks because of its restrictions of allowed execution time.  Instead, it
+ * returns TDX_OPERAND_BUSY with an operand id.
+ *
+ * Multiple VCPUs can operate on SEPT.  Also with zero-step attack mitigation,
+ * TDH.VP.ENTER may rarely acquire SEPT lock and release it when zero-step
+ * attack is suspected.  It results in TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT
+ * with TDH.MEM.* operation.  Note: TDH.MEM.TRACK is an exception.
+ *
+ * Because TDP MMU uses read lock for scalability, spin lock around SEAMCALL
+ * spoils TDP MMU effort.  Retry several times with the assumption that SEPT
+ * lock contention is rare.  But don't loop forever to avoid lockup.  Let TDP
+ * MMU retry.
+ */
+#define TDX_ERROR_SEPT_BUSY    (TDX_OPERAND_BUSY | TDX_OPERAND_ID_SEPT)
+
+static inline u64 seamcall_sept(u64 op, u64 rcx, u64 rdx, u64 r8, u64 r9,
+				struct tdx_module_output *out)
+{
+#define SEAMCALL_RETRY_MAX     16
+	int retry = SEAMCALL_RETRY_MAX;
+	u64 ret;
+
+	do {
+		ret = __seamcall(op, rcx, rdx, r8, r9, out);
+	} while (ret == TDX_ERROR_SEPT_BUSY && retry-- > 0);
+	return ret;
+}
 
 static inline u64 tdh_mng_addcx(hpa_t tdr, hpa_t addr)
 {
@@ -30,14 +58,14 @@ static inline u64 tdh_mem_page_add(hpa_t tdr, gpa_t gpa, hpa_t hpa, hpa_t source
 				   struct tdx_module_output *out)
 {
 	clflush_cache_range(__va(hpa), PAGE_SIZE);
-	return __seamcall(TDH_MEM_PAGE_ADD, gpa, tdr, hpa, source, out);
+	return seamcall_sept(TDH_MEM_PAGE_ADD, gpa, tdr, hpa, source, out);
 }
 
 static inline u64 tdh_mem_sept_add(hpa_t tdr, gpa_t gpa, int level, hpa_t page,
 				   struct tdx_module_output *out)
 {
 	clflush_cache_range(__va(page), PAGE_SIZE);
-	return __seamcall(TDH_MEM_SEPT_ADD, gpa | level, tdr, page, 0, out);
+	return seamcall_sept(TDH_MEM_SEPT_ADD, gpa | level, tdr, page, 0, out);
 }
 
 static inline u64 tdh_mem_sept_remove(hpa_t tdr, gpa_t gpa, int level,
@@ -63,13 +91,13 @@ static inline u64 tdh_mem_page_aug(hpa_t tdr, gpa_t gpa, hpa_t hpa,
 				   struct tdx_module_output *out)
 {
 	clflush_cache_range(__va(hpa), PAGE_SIZE);
-	return __seamcall(TDH_MEM_PAGE_AUG, gpa, tdr, hpa, 0, out);
+	return seamcall_sept(TDH_MEM_PAGE_AUG, gpa, tdr, hpa, 0, out);
 }
 
 static inline u64 tdh_mem_range_block(hpa_t tdr, gpa_t gpa, int level,
 				      struct tdx_module_output *out)
 {
-	return __seamcall(TDH_MEM_RANGE_BLOCK, gpa | level, tdr, 0, 0, out);
+	return seamcall_sept(TDH_MEM_RANGE_BLOCK, gpa | level, tdr, 0, 0, out);
 }
 
 static inline u64 tdh_mng_key_config(hpa_t tdr)
@@ -151,7 +179,7 @@ static inline u64 tdh_phymem_page_reclaim(hpa_t page,
 static inline u64 tdh_mem_page_remove(hpa_t tdr, gpa_t gpa, int level,
 				      struct tdx_module_output *out)
 {
-	return __seamcall(TDH_MEM_PAGE_REMOVE, gpa | level, tdr, 0, 0, out);
+	return seamcall_sept(TDH_MEM_PAGE_REMOVE, gpa | level, tdr, 0, 0, out);
 }
 
 static inline u64 tdh_sys_lp_shutdown(void)
@@ -167,7 +195,7 @@ static inline u64 tdh_mem_track(hpa_t tdr)
 static inline u64 tdh_mem_range_unblock(hpa_t tdr, gpa_t gpa, int level,
 					struct tdx_module_output *out)
 {
-	return __seamcall(TDH_MEM_RANGE_UNBLOCK, gpa | level, tdr, 0, 0, out);
+	return seamcall_sept(TDH_MEM_RANGE_UNBLOCK, gpa | level, tdr, 0, 0, out);
 }
 
 static inline u64 tdh_phymem_cache_wb(bool resume)

From patchwork Sun Oct 30 06:23:30 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12920
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 089/108] KVM: TDX: handle EXIT_REASON_OTHER_SMI
Date: Sat, 29 Oct 2022 23:23:30 -0700
Message-Id: <32d3e497b2e088ef060191f39f3f0495e2c3561a.1667110240.git.isaku.yamahata@intel.com>

From: Isaku Yamahata

When control reaches EXIT_REASON_OTHER_SMI, the #SMI is delivered and handled
right after returning from the TDX module to KVM, so nothing needs to be done
in KVM. Continue TDX vcpu execution.

Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 arch/x86/include/uapi/asm/vmx.h | 1 +
 arch/x86/kvm/vmx/tdx.c          | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h
index a5faf6d88f1b..b3a30ef3efdd 100644
--- a/arch/x86/include/uapi/asm/vmx.h
+++ b/arch/x86/include/uapi/asm/vmx.h
@@ -34,6 +34,7 @@
 #define EXIT_REASON_TRIPLE_FAULT        2
 #define EXIT_REASON_INIT_SIGNAL         3
 #define EXIT_REASON_SIPI_SIGNAL         4
+#define EXIT_REASON_OTHER_SMI           6
 
 #define EXIT_REASON_INTERRUPT_WINDOW    7
 #define EXIT_REASON_NMI_WINDOW          8
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 088e98232227..68d9d590ca8f 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1068,6 +1068,13 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath)
 	WARN_ON_ONCE(fastpath != EXIT_FASTPATH_NONE);
 
 	switch (exit_reason.basic) {
+	case EXIT_REASON_OTHER_SMI:
+		/*
+		 * If reach here, it's not a Machine Check System Management
+		 * Interrupt(MSMI).  #SMI is delivered and handled right after
+		 * SEAMRET, nothing needs to be done in KVM.
+		 */
+		return 1;
 	default:
 		break;
 	}

From patchwork Sun Oct 30 06:23:31 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12923
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 090/108] KVM: TDX: handle ept violation/misconfig exit
Date: Sat, 29 Oct 2022 23:23:31 -0700
Message-Id: <228e05d40e7bbee922c241afa6efb9032cc9e974.1667110240.git.isaku.yamahata@intel.com>

From: Isaku Yamahata

On EPT violation, call the common function __vmx_handle_ept_violation() to
invoke the x86 MMU code. On EPT misconfiguration, exit to ring 3 with
KVM_EXIT_UNKNOWN, because an EPT misconfiguration can't happen: MMIO is
triggered by TDG.VP.VMCALL, so there is no point in setting a
misconfiguration value for the fast path.
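
As a rough illustration only (not part of the patch), the way the EPT
violation handler in this patch picks an exit qualification can be modelled
in standalone C. The names demo_classify_ept_violation, DEMO_ACC_WRITE and
DEMO_ACC_INSTR, and their bit positions, are hypothetical stand-ins for this
sketch; the handler itself relies on kvm_is_private_gpa(),
EPT_VIOLATION_ACC_WRITE and EPT_VIOLATION_ACC_INSTR and then calls into the
MMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical access bits for this sketch; not the kernel's definitions. */
#define DEMO_ACC_WRITE (1ULL << 1)
#define DEMO_ACC_INSTR (1ULL << 2)

enum demo_action {
	DEMO_FORWARD_TO_MMU,    /* hand the fault to the common MMU path */
	DEMO_EXIT_TO_USERSPACE, /* report the fault to userspace instead */
};

/*
 * Model of the decision described in the patch:
 *  - a fault on a private GPA is always treated as a write fault and the
 *    reported qualification is ignored;
 *  - a fault on a shared GPA keeps the reported qualification, and an
 *    instruction fetch from a shared GPA is not forwarded to the MMU.
 */
static enum demo_action demo_classify_ept_violation(bool gpa_is_private,
						    uint64_t reported_qual,
						    uint64_t *qual_out)
{
	assert(qual_out != NULL);

	if (gpa_is_private) {
		*qual_out = DEMO_ACC_WRITE;
		return DEMO_FORWARD_TO_MMU;
	}

	*qual_out = reported_qual;
	if (reported_qual & DEMO_ACC_INSTR)
		return DEMO_EXIT_TO_USERSPACE;

	return DEMO_FORWARD_TO_MMU;
}
```

Keeping the decision in a pure function like this makes the private/shared
split easy to check in isolation; the patch keeps it inline because it needs
the vcpu state for tracing and for filling in the run structure.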

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/tdx.c | 46 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 68d9d590ca8f..4a95db961f82 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1046,6 +1046,48 @@ void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
 	__vmx_deliver_posted_interrupt(vcpu, &tdx->pi_desc, vector);
 }
 
+static int tdx_handle_ept_violation(struct kvm_vcpu *vcpu)
+{
+	unsigned long exit_qual;
+
+	if (kvm_is_private_gpa(vcpu->kvm, tdexit_gpa(vcpu))) {
+		/*
+		 * Always treat SEPT violations as write faults.  Ignore the
+		 * EXIT_QUALIFICATION reported by TDX-SEAM for SEPT violations.
+		 * TD private pages are always RWX in the SEPT tables,
+		 * i.e. they're always mapped writable.  Just as importantly,
+		 * treating SEPT violations as write faults is necessary to
+		 * avoid COW allocations, which will cause TDAUGPAGE failures
+		 * due to aliasing a single HPA to multiple GPAs.
+		 */
+#define TDX_SEPT_VIOLATION_EXIT_QUAL	EPT_VIOLATION_ACC_WRITE
+		exit_qual = TDX_SEPT_VIOLATION_EXIT_QUAL;
+	} else {
+		exit_qual = tdexit_exit_qual(vcpu);
+		if (exit_qual & EPT_VIOLATION_ACC_INSTR) {
+			pr_warn("kvm: TDX instr fetch to shared GPA = 0x%lx @ RIP = 0x%lx\n",
+				tdexit_gpa(vcpu), kvm_rip_read(vcpu));
+			vcpu->run->exit_reason = KVM_EXIT_EXCEPTION;
+			vcpu->run->ex.exception = PF_VECTOR;
+			vcpu->run->ex.error_code = exit_qual;
+			return 0;
+		}
+	}
+
+	trace_kvm_page_fault(vcpu, tdexit_gpa(vcpu), exit_qual);
+	return __vmx_handle_ept_violation(vcpu, tdexit_gpa(vcpu), exit_qual);
+}
+
+static int tdx_handle_ept_misconfig(struct kvm_vcpu *vcpu)
+{
+	WARN_ON_ONCE(1);
+
+	vcpu->run->exit_reason = KVM_EXIT_UNKNOWN;
+	vcpu->run->hw.hardware_exit_reason = EXIT_REASON_EPT_MISCONFIG;
+
+	return 0;
+}
+
 int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath)
 {
 	union tdx_exit_reason exit_reason = to_tdx(vcpu)->exit_reason;
@@ -1068,6 +1110,10 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath)
 	WARN_ON_ONCE(fastpath != EXIT_FASTPATH_NONE);
 
 	switch (exit_reason.basic) {
+	case EXIT_REASON_EPT_VIOLATION:
+		return tdx_handle_ept_violation(vcpu);
+	case EXIT_REASON_EPT_MISCONFIG:
+		return tdx_handle_ept_misconfig(vcpu);
 	case EXIT_REASON_OTHER_SMI:
 		/*
 		 * If reach here, it's not a Machine Check System Management

From patchwork Sun Oct 30 06:23:32 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12924
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393134" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393134" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:12 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 091/108] KVM: TDX: handle EXCEPTION_NMI and EXTERNAL_INTERRUPT Date: Sat, 29 Oct 2022 23:23:32 -0700 Message-Id: <9f0a4c24b769ff117ea5cb48ad6a9ae4d386e997.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093215474690043?= X-GMAIL-MSGID: =?utf-8?q?1748093215474690043?= From: Isaku Yamahata Because guest TD state is protected, exceptions in guest TDs can't be intercepted, so the TDX VMM doesn't need to handle them. NMI and machine check are already handled in tdx_handle_exit_irqoff(), so ignore them here and continue guest TD execution. For external interrupts, increment the stats the same way as in the VMX case.
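For illustration only, the exit handling described above can be modelled in user space. This is a sketch, not the kernel code: struct model_vcpu and the model_* helpers are hypothetical names, and the interruption-information bit layout is assumed to follow the usual VMX encoding.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Interruption-information fields; assumed to follow the VMX encoding. */
#define INTR_INFO_VECTOR_MASK    0x000000ffu
#define INTR_INFO_INTR_TYPE_MASK 0x00000700u
#define INTR_INFO_VALID_MASK     0x80000000u
#define INTR_TYPE_NMI_INTR       (2u << 8)
#define INTR_TYPE_HARD_EXCEPTION (3u << 8)
#define MC_VECTOR                18u

/* Hypothetical stand-in for the per-vCPU exit statistics. */
struct model_vcpu {
	uint64_t irq_exits;
};

/* Valid bit set and event type is NMI; the vector is not checked. */
static int model_is_nmi(uint32_t intr_info)
{
	return (intr_info & (INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK)) ==
	       (INTR_INFO_VALID_MASK | INTR_TYPE_NMI_INTR);
}

/* Valid bit set, hardware exception type, and vector 18 (#MC). */
static int model_is_machine_check(uint32_t intr_info)
{
	const uint32_t mask = INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK |
			      INTR_INFO_VECTOR_MASK;

	return (intr_info & mask) ==
	       (INTR_INFO_VALID_MASK | INTR_TYPE_HARD_EXCEPTION | MC_VECTOR);
}

/* Returns 1 to resume the guest, or -EFAULT for an unexpected exception. */
static int model_handle_exception(uint32_t intr_info)
{
	if (model_is_nmi(intr_info) || model_is_machine_check(intr_info))
		return 1;
	return -EFAULT;
}

/* External interrupts only bump a counter and resume the guest. */
static int model_handle_external_interrupt(struct model_vcpu *vcpu)
{
	assert(vcpu != NULL);
	++vcpu->irq_exits;
	return 1;
}
```

In this model a return value of 1 means "re-enter the guest", matching the convention of the handlers in this patch; any exception other than NMI or machine check is reported as an error because the VMM never expects to see it.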
Signed-off-by: Isaku Yamahata Reviewed-by: Paolo Bonzini --- arch/x86/kvm/vmx/tdx.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+) diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 4a95db961f82..3b973a2f1381 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -715,6 +715,25 @@ void tdx_handle_exit_irqoff(struct kvm_vcpu *vcpu) tdexit_intr_info(vcpu)); } +static int tdx_handle_exception(struct kvm_vcpu *vcpu) +{ + u32 intr_info = tdexit_intr_info(vcpu); + + if (is_nmi(intr_info) || is_machine_check(intr_info)) + return 1; + + kvm_pr_unimpl("unexpected exception 0x%x(exit_reason 0x%llx qual 0x%lx)\n", + intr_info, + to_tdx(vcpu)->exit_reason.full, tdexit_exit_qual(vcpu)); + return -EFAULT; +} + +static int tdx_handle_external_interrupt(struct kvm_vcpu *vcpu) +{ + ++vcpu->stat.irq_exits; + return 1; +} + static int tdx_handle_triple_fault(struct kvm_vcpu *vcpu) { vcpu->run->exit_reason = KVM_EXIT_SHUTDOWN; @@ -1110,6 +1129,10 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath) WARN_ON_ONCE(fastpath != EXIT_FASTPATH_NONE); switch (exit_reason.basic) { + case EXIT_REASON_EXCEPTION_NMI: + return tdx_handle_exception(vcpu); + case EXIT_REASON_EXTERNAL_INTERRUPT: + return tdx_handle_external_interrupt(vcpu); case EXIT_REASON_EPT_VIOLATION: return tdx_handle_ept_violation(vcpu); case EXIT_REASON_EPT_MISCONFIG: From patchwork Sun Oct 30 06:23:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12925 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1667304wru; Sat, 29 Oct 2022 23:33:26 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4kyNk+rfT9Lkn5+UVJJ3HJ6i3C0KTSuA2hT+2qiHUdG3+9x/3Xxyd5CH04VtOv3Znz6eoM X-Received: by 2002:a17:90a:a897:b0:213:a61d:8920 with SMTP id h23-20020a17090aa89700b00213a61d8920mr8152521pjq.207.1667111605812; Sat, 29 Oct 2022 23:33:25 -0700 (PDT) 
ARC-Seal: i=1; a=rsa-sha256; t=1667111605; cv=none; d=google.com; s=arc-20160816; b=XzAZKZS407PcW4WE5Lf6ucRqDfqvodJNymSrhYjhZquFqftucNzMkpRD8MLCgskuKH bPEgEgA59VEto/W8AaHNyfxEnDWZK8YtLSbGbdSgh0raeA2bY6+5+9xiDvr1ybpfHTVK Q/eMu0qjl/f1UqdTbwqPguzxJo/3G7sQtExjkPHSmkfzxC9MTeW44C1xpCg8uDFC3kIP rlYp3RB3fm3yy3CywZEbavADlhLHSaQTkGu7QpyaO4INuPMP8W47S9ZTxDn7mRxxapti tUNmaie8J93rr6wXkBKgYovpNjzsIBwlrqZzf3NYgs1lajLlX1Ud398DLGpTqXQhZ0Ss pRzg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=b129VnIGKNfZzaktyswT3KhkmSldmEvnCV7xEku+fHo=; b=wCHDmMZtAiDyO2eVzXgQDyNwfe9iOIV0ksRFsr0Eb/oxe7tiJ8c/HNDGFJGp/QI9xU 0QkmPWDfKTkiFW+uFwkRqzgmkzb9KQQUWc3XlAYyT1We3P0ddxg2AnP2QdFvjjyETDSV uT2v0ssd3F9VgwaRWeIgnzqQKgwdFN5t9SuYVTJog4+yBJG2qqa11Pj+SpPzwrDNfy5k XpV/qQ7XEW6OpO3H3+tEjacIvN51mq9PjnPXfz5zFxwIbnP4EXsSyJoRax9BoarQXdIE Vkos+AsIdFAbslVQGclBNBkEdfv5le5gpTxS4zuaq80+abApCBsnC0dAi+dA2bSJJ3th mIkQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Bx7Sh1nP; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id s34-20020a632162000000b0046004f18c6csi4885575pgm.456.2022.10.29.23.33.13; Sat, 29 Oct 2022 23:33:25 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Bx7Sh1nP; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231418AbiJ3Gcp (ORCPT + 99 others); Sun, 30 Oct 2022 02:32:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54082 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231232AbiJ3GbT (ORCPT ); Sun, 30 Oct 2022 02:31:19 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7879BB41; Sat, 29 Oct 2022 23:24:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111097; x=1698647097; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aCC+7Jxk4MD8YtemzAcln1lm9ZUTm7trWY2pDlAYJGA=; b=Bx7Sh1nPi5jOH3gH4t0vCnWIs7k/l85/cCII4389dsq10mP5/v2NG/X6 wNUk9CFMqDAPhGIVG4nVu90EDCEg6dm/vEV1HlWi3/m25WK76tsHu7hVt fr4jrc916Kmb0GtG1tpDQdV4ZoHs6Uv1NQcWqoL2ReXXQHDDkjr6bjn15 QJb7B3pYYQzXXvkysmBPlF5duNLG/4dU1J/hdPHC5K3lG26YUcvPd0ffV khfxs/Bi3mVM9qt8jUE8WPlJomlA0c6j6827iURO+7sqxXn9BUc8YlS9G avADw1qwknw0gEIv0cyOTLSvjolJABQaY6o5ehWzJza5bUJXFwwCs2Ft4 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037210" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037210" Received: from fmsmga006.fm.intel.com 
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:12 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393137" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393137" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:12 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Xiaoyao Li , Sean Christopherson Subject: [PATCH v10 092/108] KVM: TDX: Add a place holder for handler of TDX hypercalls (TDG.VP.VMCALL) Date: Sat, 29 Oct 2022 23:23:33 -0700 Message-Id: <25de0d4917d813f901f8c7badecd49860f5e9ca5.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093219535925804?= X-GMAIL-MSGID: =?utf-8?q?1748093219535925804?= From: Isaku Yamahata The TDX module specification defines the TDG.VP.VMCALL API (TDVMCALL for short) for the guest TD to issue a hypercall to the VMM. When the guest TD issues TDG.VP.VMCALL, the guest TD exits to the VMM with a new exit reason of TDVMCALL. The arguments from the guest TD and the values returned by the VMM are passed in the guest registers. The guest RCX register indicates which registers are used. Define helper functions to access those registers according to the ABI.
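The register convention can be sketched as a small user-space model. The register assignment (R10 for the exit type on entry and the return code on exit, R11 for the leaf, R12 to R15 for a0 to a3) mirrors the accessors this patch adds; struct model_vcpu, the model_* names and the numeric value of MODEL_VMCALL_INVALID_OPERAND are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Only the registers the TDVMCALL helpers touch are modelled. */
enum model_reg { REG_R10, REG_R11, REG_R12, REG_R13, REG_R14, REG_R15, REG_COUNT };

/* Hypothetical stand-in for the vCPU register file. */
struct model_vcpu {
	uint64_t regs[REG_COUNT];
};

/* Assumed value, chosen for illustration only. */
#define MODEL_VMCALL_INVALID_OPERAND 0x8000000000000000ULL

/* Generate a read/write pair per argument, one register each. */
#define BUILD_MODEL_ACCESSORS(param, reg)					\
static inline uint64_t model_##param##_read(const struct model_vcpu *v)	\
{										\
	return v->regs[reg];							\
}										\
static inline void model_##param##_write(struct model_vcpu *v, uint64_t val)	\
{										\
	v->regs[reg] = val;							\
}

BUILD_MODEL_ACCESSORS(a0, REG_R12)
BUILD_MODEL_ACCESSORS(a1, REG_R13)
BUILD_MODEL_ACCESSORS(a2, REG_R14)
BUILD_MODEL_ACCESSORS(a3, REG_R15)

static inline uint64_t model_exit_type(const struct model_vcpu *v)
{
	return v->regs[REG_R10];
}

static inline uint64_t model_leaf(const struct model_vcpu *v)
{
	return v->regs[REG_R11];
}

static inline void model_set_return_code(struct model_vcpu *v, uint64_t val)
{
	v->regs[REG_R10] = val;
}

/* Placeholder dispatch: no leaf is handled yet, so every call fails. */
static int model_handle_tdvmcall(struct model_vcpu *v)
{
	assert(v != NULL);

	switch (model_leaf(v)) {
	default:
		break;
	}

	model_set_return_code(v, MODEL_VMCALL_INVALID_OPERAND);
	return 1;
}
```

Generating the accessors from one macro keeps the name-to-register mapping in a single place, so a later handler can refer to a0 to a3 without repeating register numbers. Note that R10 is reused: it carries the exit type into the VMM and the return code back to the guest.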
Define the TDVMCALL exit reason, which is carved out from the VMX exit reason namespace as the TDVMCALL exit from TDX guest to TDX-SEAM is really just a VM-Exit. Add a place holder to handle TDVMCALL exit. Co-developed-by: Xiaoyao Li Signed-off-by: Xiaoyao Li Signed-off-by: Sean Christopherson Signed-off-by: Isaku Yamahata --- arch/x86/include/uapi/asm/vmx.h | 4 ++- arch/x86/kvm/vmx/tdx.c | 56 ++++++++++++++++++++++++++++++++- arch/x86/kvm/vmx/tdx.h | 13 ++++++++ 3 files changed, 71 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h index b3a30ef3efdd..f0f4a4cf84a7 100644 --- a/arch/x86/include/uapi/asm/vmx.h +++ b/arch/x86/include/uapi/asm/vmx.h @@ -93,6 +93,7 @@ #define EXIT_REASON_TPAUSE 68 #define EXIT_REASON_BUS_LOCK 74 #define EXIT_REASON_NOTIFY 75 +#define EXIT_REASON_TDCALL 77 #define VMX_EXIT_REASONS \ { EXIT_REASON_EXCEPTION_NMI, "EXCEPTION_NMI" }, \ @@ -156,7 +157,8 @@ { EXIT_REASON_UMWAIT, "UMWAIT" }, \ { EXIT_REASON_TPAUSE, "TPAUSE" }, \ { EXIT_REASON_BUS_LOCK, "BUS_LOCK" }, \ - { EXIT_REASON_NOTIFY, "NOTIFY" } + { EXIT_REASON_NOTIFY, "NOTIFY" }, \ + { EXIT_REASON_TDCALL, "TDCALL" } #define VMX_EXIT_REASON_FLAGS \ { VMX_EXIT_REASONS_FAILED_VMENTRY, "FAILED_VMENTRY" } diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 3b973a2f1381..65d9b88f1d50 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -85,6 +85,41 @@ static __always_inline unsigned long tdexit_intr_info(struct kvm_vcpu *vcpu) return kvm_r9_read(vcpu); } +#define BUILD_TDVMCALL_ACCESSORS(param, gpr) \ +static __always_inline \ +unsigned long tdvmcall_##param##_read(struct kvm_vcpu *vcpu) \ +{ \ + return kvm_##gpr##_read(vcpu); \ +} \ +static __always_inline void tdvmcall_##param##_write(struct kvm_vcpu *vcpu, \ + unsigned long val) \ +{ \ + kvm_##gpr##_write(vcpu, val); \ +} +BUILD_TDVMCALL_ACCESSORS(a0, r12); +BUILD_TDVMCALL_ACCESSORS(a1, r13); +BUILD_TDVMCALL_ACCESSORS(a2, r14); 
+BUILD_TDVMCALL_ACCESSORS(a3, r15); + +static __always_inline unsigned long tdvmcall_exit_type(struct kvm_vcpu *vcpu) +{ + return kvm_r10_read(vcpu); +} +static __always_inline unsigned long tdvmcall_leaf(struct kvm_vcpu *vcpu) +{ + return kvm_r11_read(vcpu); +} +static __always_inline void tdvmcall_set_return_code(struct kvm_vcpu *vcpu, + long val) +{ + kvm_r10_write(vcpu, val); +} +static __always_inline void tdvmcall_set_return_val(struct kvm_vcpu *vcpu, + unsigned long val) +{ + kvm_r11_write(vcpu, val); +} + static inline bool is_td_vcpu_created(struct vcpu_tdx *tdx) { return tdx->tdvpr.added; @@ -662,7 +697,8 @@ static noinstr void tdx_vcpu_enter_exit(struct kvm_vcpu *vcpu, struct vcpu_tdx *tdx) { guest_enter_irqoff(); - tdx->exit_reason.full = __tdx_vcpu_run(tdx->tdvpr.pa, vcpu->arch.regs, 0); + tdx->exit_reason.full = __tdx_vcpu_run(tdx->tdvpr.pa, vcpu->arch.regs, + tdx->tdvmcall.regs_mask); guest_exit_irqoff(); } @@ -695,6 +731,11 @@ fastpath_t tdx_vcpu_run(struct kvm_vcpu *vcpu) tdx_complete_interrupts(vcpu); + if (tdx->exit_reason.basic == EXIT_REASON_TDCALL) + tdx->tdvmcall.rcx = vcpu->arch.regs[VCPU_REGS_RCX]; + else + tdx->tdvmcall.rcx = 0; + return EXIT_FASTPATH_NONE; } @@ -741,6 +782,17 @@ static int tdx_handle_triple_fault(struct kvm_vcpu *vcpu) return 0; } +static int handle_tdvmcall(struct kvm_vcpu *vcpu) +{ + switch (tdvmcall_leaf(vcpu)) { + default: + break; + } + + tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_INVALID_OPERAND); + return 1; +} + void tdx_load_mmu_pgd(struct kvm_vcpu *vcpu, hpa_t root_hpa, int pgd_level) { td_vmcs_write64(to_tdx(vcpu), SHARED_EPT_POINTER, root_hpa & PAGE_MASK); @@ -1133,6 +1185,8 @@ int tdx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t fastpath) return tdx_handle_exception(vcpu); case EXIT_REASON_EXTERNAL_INTERRUPT: return tdx_handle_external_interrupt(vcpu); + case EXIT_REASON_TDCALL: + return handle_tdvmcall(vcpu); case EXIT_REASON_EPT_VIOLATION: return tdx_handle_ept_violation(vcpu); case 
EXIT_REASON_EPT_MISCONFIG: diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h index 64e9b864e20e..eac341e08c9a 100644 --- a/arch/x86/kvm/vmx/tdx.h +++ b/arch/x86/kvm/vmx/tdx.h @@ -78,6 +78,19 @@ struct vcpu_tdx { struct list_head cpu_list; + union { + struct { + union { + struct { + u16 gpr_mask; + u16 xmm_mask; + }; + u32 regs_mask; + }; + u32 reserved; + }; + u64 rcx; + } tdvmcall; union tdx_exit_reason exit_reason; bool vcpu_initialized; From patchwork Sun Oct 30 06:23:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12926 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1667314wru; Sat, 29 Oct 2022 23:33:28 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7lRe/MAfOcX8EmdmPwif7L1g69dEzoqfi3rSyZcE82AIAJ9EQzzQDGWF35cUC+dQuiF33o X-Received: by 2002:a17:90a:708e:b0:213:6876:330 with SMTP id g14-20020a17090a708e00b0021368760330mr20838536pjk.167.1667111608660; Sat, 29 Oct 2022 23:33:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111608; cv=none; d=google.com; s=arc-20160816; b=b8qgY5z2R6URcZCliRv9DFhrz5kkrCUkoD8dvb4N3fAEEuPyqZSXH+qtHqQNOc1pqO 4Wk/j8PS45s7dslveseLNp+KFLHF6QN+ucBIvtNyxZ8Q7uEeDWglT2ZssVC7fD8p7JQC H06OieQJkwLOEFvlSVj3fvgiAva99fcSV9kfGjThNmuycyOXeFz3d4vMS8aO1Iv8aQkC TuVm8QiRWTRSb1dvmuqrBJAnnUIjpUhJE3cFfLUfuO2Ll+AOPqtNn/yijdm/laRjGAUd 4w+GiMgaL3lQzqVSD47T45Ha5lvGV6UYaSYHyy1C5PZ6uCUsEcdUXNln+eDjCaTOMSnS DyQw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=Vw+MYFv99srb83WSolCJEuuQb/lGNZlzuTBjYgiJrRo=; b=N8TdHu+wf+4/PHNakVYHfwpKxvWrfgb5NqisrhyjieJTN8Jnqi00jeKwF0CXMQLFgg 2Yt/Q6U6SJxv0gakLtbU0ykjO/ASx+aOypf7lTTm4g+AhLzA8BI10oDWW/PVt1mzlst5 CjZyndFlqg8bOM0mx/PzabYwHyyRM3vGBwpNqtZKvMJt4OvaeYLVUr1YrmSYQdreMzxe 
8QUdNFmJC4cdhd6b3lPrD6VwnJWHv4s3FilbpoZhFC7KpgGh3AYyOCbhQWNHTx5/3t0Z kZhKNXAbsipSTt+pyK08nJgnhPG3l1Ue5NyuPsfG2rqWfq6yRy5E6+l5ofvaMm+kb33X M5ig== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ZzKjbI1g; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id j68-20020a625547000000b00567719e34aesi3933326pfb.49.2022.10.29.23.33.16; Sat, 29 Oct 2022 23:33:28 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ZzKjbI1g; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230311AbiJ3Gcs (ORCPT + 99 others); Sun, 30 Oct 2022 02:32:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54090 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231411AbiJ3Gb1 (ORCPT ); Sun, 30 Oct 2022 02:31:27 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A7FC9119; Sat, 29 Oct 2022 23:24:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111098; x=1698647098; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=eBqMoB0m83MuwaAMk2sQDSmkQWX2Pc0er+8L/z4c57c=; 
b=ZzKjbI1gpsjCAC7SDOIamduifeNx4FJfsXDSkQpOPP5t21X+GqaL4br/ LFp5UkoKaX9HXsJywvCk9DYwm7IBBzmmt3yiuZoool53yq1zB/kUONvyO 2qGyi1X1ApHSk3Ihxh68NoTtlaZmj6tJaqVfEaxkOh4XtAsK/U6+KL06u TITfOPO/YtG70UGCzFfDPm3agjf+qiwEJjtTBeUjTLs+fHJlpTFO30wOO +WqJro6JJAd7Ab0o5y3bvq4ed78VIGVVohUUBY4n4bc397225j6bUZQL4 eDVkCmRs7H0ZtA9VtrvzjhH/2LriOwhlr7lfSNZkIlbC/VnQCV71SWQIV g==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037211" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037211" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:13 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393140" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393140" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:13 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 093/108] KVM: TDX: handle KVM hypercall with TDG.VP.VMCALL Date: Sat, 29 Oct 2022 23:23:34 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093222306477266?= X-GMAIL-MSGID: =?utf-8?q?1748093222306477266?= From: Isaku Yamahata The TDX Guest-Hypervisor Communication Interface (GHCI) specification defines the ABI for
the guest TD to issue hypercalls. It reserves vendor-specific arguments for VMM-specific use. Use them for KVM hypercalls and handle them. Signed-off-by: Isaku Yamahata --- arch/x86/kvm/vmx/tdx.c | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 65d9b88f1d50..f6477d577001 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -782,8 +782,39 @@ static int tdx_handle_triple_fault(struct kvm_vcpu *vcpu) return 0; } +static int tdx_emulate_vmcall(struct kvm_vcpu *vcpu) +{ + unsigned long nr, a0, a1, a2, a3, ret; + + /* + * ABI for KVM tdvmcall argument: + * In Guest-Hypervisor Communication Interface(GHCI) specification, + * Non-zero leaf number (R10 != 0) is defined to indicate + * vendor-specific. KVM uses this for KVM hypercall. NOTE: KVM + * hypercall number starts from one. Zero isn't used for KVM hypercall + * number. + * + * R10: KVM hypercall number + * arguments: R11, R12, R13, R14. + */ + nr = kvm_r10_read(vcpu); + a0 = kvm_r11_read(vcpu); + a1 = kvm_r12_read(vcpu); + a2 = kvm_r13_read(vcpu); + a3 = kvm_r14_read(vcpu); + + ret = __kvm_emulate_hypercall(vcpu, nr, a0, a1, a2, a3, true); + + tdvmcall_set_return_code(vcpu, ret); + + return 1; +} + static int handle_tdvmcall(struct kvm_vcpu *vcpu) { + if (tdvmcall_exit_type(vcpu)) + return tdx_emulate_vmcall(vcpu); + switch (tdvmcall_leaf(vcpu)) { default: break; From patchwork Sun Oct 30 06:23:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12927 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1667323wru; Sat, 29 Oct 2022 23:33:34 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7irH7oy2eLSfECrbRjd+3wRTzxUXf82q3lcc5zHwq9ytQKDcbQPd1lYPjHDKSOY1mbb66z X-Received: by 2002:a63:444f:0:b0:464:3985:3c92 with SMTP id
t15-20020a63444f000000b0046439853c92mr7112273pgk.412.1667111614153; Sat, 29 Oct 2022 23:33:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111614; cv=none; d=google.com; s=arc-20160816; b=vgMN/14gjbBCCXX7XD75z8mI2ZvE3zAcKuhbr/eH7cKFPUbpTCIBUBXpJyfNZZ9Lrd 1COy00WDW/vDqo2UFz2u84jtdkuByNgXrZCEkGU1JGxDg7pwJcQM0RKMApjPc7ZMQL80 VeqsAYV31Bs2WxJX5y1hZqCGqjP1U357+uIHoqyu1fwwU7CDPeirpihXxw+Gv+hS80KK 1CDI+b8N9Owf+3dSd6uzM6RmLViyQEUss7G02p0rN8Fs5AvrH8PSBH5GAii2VcKDzLr7 yQSMIvogp9C7CK9KupQLcxVio21TMO98VHqYQsBXgTOgmwtmW+JzpdsFO/mU0LDt//pS xtQQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=ochc0WFFB7dozB9K/UquIlTG+xBj87rBoqENF6sCdXY=; b=qRgoX2sk8FksMHuis58OLx+RLM9+UyykhZNhTkv8/ZfMe8BQkMtDGF7FDuoEnTwQto r5PmmIna6AhiDJgeIJrwzMU41+/m9McrnYOqeGdBchzMp5V2ahjx3KSPweldTX2WAg/u TTCX0nQz/mNI5kVWLmas9yeJG/rC0F3o4v7NS5suPoLK9C7VAbAex55oCyNoVKR+zjde piXN/IIrGGC1jCb+OqSi88qDaxzT2Xy81n95VFqU0ccWsIsv+OISkFOJmIUQo2oEm0bA XEkjqqPxyiw8MA5XYPSkiXaWJg2or3VAHcetBIYwSYP3SZozrp7LNyiyDzHZ2Dd/1OHZ /1Lg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=GpsXxkqw; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id r73-20020a632b4c000000b0046ecbfda052si4437522pgr.389.2022.10.29.23.33.21; Sat, 29 Oct 2022 23:33:34 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=GpsXxkqw; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231445AbiJ3Gcw (ORCPT + 99 others); Sun, 30 Oct 2022 02:32:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58902 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231318AbiJ3Gb5 (ORCPT ); Sun, 30 Oct 2022 02:31:57 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EB92BBA4; Sat, 29 Oct 2022 23:25:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111102; x=1698647102; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=EWRhmPnmbBo5KrcXEgUQVLRg+/G51NCswzjBOF3vJSw=; b=GpsXxkqwZpIpWWwwbctVN92j9qoUrslSdQMWVGUPUH12b5N1WQkpzwTE K09dDw27R6+Tjh3g9ytxDODJxPCQpGOAFXSQTWUswlR2RcgUpLk1SJkDS tDvNys9nV7z42d6/ogVA2aQFT6Ul1D4kMxP+TK74b2tSk4GqlqKn3B1Ei bZmrjDRxWW4fWVFcwJhoQgducAYUXDoixzUKctxx6EervJtt69ynGGR86 M3ZbWWN9/QmmhGwp5+h6cGQjnWpSLhPir+uoTarPrT0Wmdx3cX/ygXZrL xwtDNC041FnHMYJF6V6JGWu1XpYpt7eB0Fv7XYzxQ+apNCuvEazh3uR2t A==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037212" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037212" Received: from fmsmga006.fm.intel.com 
([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:13 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393143" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393143" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:13 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 094/108] KVM: TDX: Handle TDX PV CPUID hypercall Date: Sat, 29 Oct 2022 23:23:35 -0700 Message-Id: <450f6c2c56c08e0676fd0559876e8413502209ae.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093227904955450?= X-GMAIL-MSGID: =?utf-8?q?1748093227904955450?= From: Isaku Yamahata Wire up TDX PV CPUID hypercall to the KVM backend function. 
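A rough user-space sketch of the data flow may help when reading the diff: the guest passes the CPUID leaf and subleaf in a0 and a1, and the four result registers come back in a0 to a3 together with a success return code. struct model_tdvmcall, fake_cpuid() and MODEL_VMCALL_SUCCESS are hypothetical names and values for this example; fake_cpuid() merely stands in for the KVM backend and returns made-up data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value, chosen for illustration only. */
#define MODEL_VMCALL_SUCCESS 0ULL

/* Hypothetical view of the TDVMCALL registers used by the CPUID leaf. */
struct model_tdvmcall {
	uint64_t ret_code;	/* return code slot (R10 in the patch) */
	uint64_t a[4];		/* a0..a3 (R12..R15 in the patch) */
};

typedef void (*model_cpuid_fn)(uint32_t *eax, uint32_t *ebx,
			       uint32_t *ecx, uint32_t *edx);

/* Stand-in backend: echoes leaf/subleaf and returns fixed patterns. */
static void fake_cpuid(uint32_t *eax, uint32_t *ebx,
		       uint32_t *ecx, uint32_t *edx)
{
	uint32_t leaf = *eax, subleaf = *ecx;

	*eax = leaf;
	*ebx = subleaf;
	*ecx = 0x11111111u;
	*edx = 0x22222222u;
}

static int model_emulate_cpuid(struct model_tdvmcall *c, model_cpuid_fn cpuid)
{
	uint32_t eax, ebx = 0, ecx, edx = 0;

	assert(c != NULL && cpuid != NULL);

	/* Inputs: only the low 32 bits of a0 (EAX) and a1 (ECX) are used. */
	eax = (uint32_t)c->a[0];
	ecx = (uint32_t)c->a[1];

	cpuid(&eax, &ebx, &ecx, &edx);

	/* Outputs: EAX, EBX, ECX, EDX zero-extended into a0..a3. */
	c->a[0] = eax;
	c->a[1] = ebx;
	c->a[2] = ecx;
	c->a[3] = edx;

	c->ret_code = MODEL_VMCALL_SUCCESS;
	return 1;	/* resume the guest */
}
```

Note the asymmetry: on entry a1 holds ECX, but on return a1 holds EBX and ECX moves to a2. Any upper bits the guest leaves in a0 or a1 are dropped because the values are narrowed to 32 bits before the lookup.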
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/tdx.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index f6477d577001..4b83d7a81433 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -810,12 +810,34 @@ static int tdx_emulate_vmcall(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+static int tdx_emulate_cpuid(struct kvm_vcpu *vcpu)
+{
+	u32 eax, ebx, ecx, edx;
+
+	/* EAX and ECX for cpuid are stored in R12 and R13. */
+	eax = tdvmcall_a0_read(vcpu);
+	ecx = tdvmcall_a1_read(vcpu);
+
+	kvm_cpuid(vcpu, &eax, &ebx, &ecx, &edx, false);
+
+	tdvmcall_a0_write(vcpu, eax);
+	tdvmcall_a1_write(vcpu, ebx);
+	tdvmcall_a2_write(vcpu, ecx);
+	tdvmcall_a3_write(vcpu, edx);
+
+	tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_SUCCESS);
+
+	return 1;
+}
+
 static int handle_tdvmcall(struct kvm_vcpu *vcpu)
 {
 	if (tdvmcall_exit_type(vcpu))
 		return tdx_emulate_vmcall(vcpu);
 
 	switch (tdvmcall_leaf(vcpu)) {
+	case EXIT_REASON_CPUID:
+		return tdx_emulate_cpuid(vcpu);
 	default:
 		break;
 	}

From patchwork Sun Oct 30 06:23:36 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12930
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 095/108] KVM: TDX: Handle TDX PV HLT hypercall
Date: Sat, 29 Oct 2022 23:23:36 -0700
Message-Id: <9125a3e35d0bfe933bcec5b41a19943d88cdaefd.1667110240.git.isaku.yamahata@intel.com>

From: Isaku Yamahata

Wire up TDX PV HLT hypercall to the KVM backend function.

Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/tdx.c | 42 +++++++++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx/tdx.h |  3 +++
 2 files changed, 44 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 4b83d7a81433..03b08b3f1ff6 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -528,7 +528,32 @@ void tdx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 
 bool tdx_protected_apic_has_interrupt(struct kvm_vcpu *vcpu)
 {
-	return pi_has_pending_interrupt(vcpu);
+	bool ret = pi_has_pending_interrupt(vcpu);
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+
+	if (ret || vcpu->arch.mp_state != KVM_MP_STATE_HALTED)
+		return true;
+
+	if (tdx->interrupt_disabled_hlt)
+		return false;
+
+	/*
+	 * This is for the case where the virtual interrupt is recognized,
+	 * i.e. set in vmcs.RVI, between the STI and "HLT". KVM doesn't have
+	 * access to RVI and the interrupt is no longer in the PID (because it
+	 * was "recognized"). It doesn't get delivered in the guest because the
+	 * TDCALL completes before interrupts are enabled.
+	 *
+	 * TDX module sets RVI while in an STI interrupt shadow.
+	 * - TDExit(typically TDG.VP.VMCALL) from the guest to TDX module.
+	 *   The interrupt shadow at this point is gone.
+	 * - It knows that there is an interrupt that can be delivered
+	 *   (RVI > PPR && EFLAGS.IF=1, the other conditions of 29.2.2 don't
+	 *   matter)
+	 * - It forwards the TDExit nevertheless, to a clueless hypervisor that
+	 *   has no way to glean either RVI or PPR.
+	 */
+	return !!xchg(&tdx->buggy_hlt_workaround, 0);
 }
 
 void tdx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
@@ -830,6 +855,17 @@ static int tdx_emulate_cpuid(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+static int tdx_emulate_hlt(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_tdx *tdx = to_tdx(vcpu);
+
+	/* See tdx_protected_apic_has_interrupt() to avoid heavy seamcall */
+	tdx->interrupt_disabled_hlt = tdvmcall_a0_read(vcpu);
+
+	tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_SUCCESS);
+	return kvm_emulate_halt_noskip(vcpu);
+}
+
 static int handle_tdvmcall(struct kvm_vcpu *vcpu)
 {
 	if (tdvmcall_exit_type(vcpu))
@@ -838,6 +874,8 @@ static int handle_tdvmcall(struct kvm_vcpu *vcpu)
 	switch (tdvmcall_leaf(vcpu)) {
 	case EXIT_REASON_CPUID:
 		return tdx_emulate_cpuid(vcpu);
+	case EXIT_REASON_HLT:
+		return tdx_emulate_hlt(vcpu);
 	default:
 		break;
 	}
@@ -1166,6 +1204,8 @@ void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
 	struct kvm_vcpu *vcpu = apic->vcpu;
 	struct vcpu_tdx *tdx = to_tdx(vcpu);
 
+	/* See comment in tdx_protected_apic_has_interrupt(). */
+	tdx->buggy_hlt_workaround = 1;
 	/* TDX supports only posted interrupt. No lapic emulation. */
 	__vmx_deliver_posted_interrupt(vcpu, &tdx->pi_desc, vector);
 }
diff --git a/arch/x86/kvm/vmx/tdx.h b/arch/x86/kvm/vmx/tdx.h
index eac341e08c9a..685184aa10e9 100644
--- a/arch/x86/kvm/vmx/tdx.h
+++ b/arch/x86/kvm/vmx/tdx.h
@@ -99,6 +99,9 @@ struct vcpu_tdx {
 	bool host_state_need_restore;
 	u64 msr_host_kernel_gs_base;
 
+	bool interrupt_disabled_hlt;
+	unsigned int buggy_hlt_workaround;
+
 	/*
 	 * Dummy to make pmu_intel not corrupt memory.
 	 * TODO: Support PMU for TDX. Future work.
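
Note on the wake-up check in the patch above: the decision made by tdx_protected_apic_has_interrupt() can be tried out in user space with a small model. The sketch below is only an illustration under stated assumptions: struct hlt_model and the model_*() helpers are hypothetical names, C11 atomic_exchange() stands in for the kernel's xchg(), and the "recognized" step (interrupt moved out of the PID into RVI) is simulated by clearing a flag by hand.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-vCPU state the patch consults. */
struct hlt_model {
	bool pi_pending;		/* models pi_has_pending_interrupt() */
	bool halted;			/* models mp_state == KVM_MP_STATE_HALTED */
	bool interrupt_disabled_hlt;	/* guest asked for HLT with interrupts off */
	atomic_uint buggy_hlt_workaround; /* set on every interrupt delivery */
};

/* Models tdx_deliver_interrupt(): post the interrupt and set the flag. */
static void model_deliver_interrupt(struct hlt_model *m)
{
	atomic_store(&m->buggy_hlt_workaround, 1);
	m->pi_pending = true;
}

/* Models the CPU "recognizing" the interrupt: it leaves the PID. */
static void model_recognize_interrupt(struct hlt_model *m)
{
	m->pi_pending = false;
}

/* Models tdx_protected_apic_has_interrupt(). */
static bool model_has_interrupt(struct hlt_model *m)
{
	if (m->pi_pending || !m->halted)
		return true;

	if (m->interrupt_disabled_hlt)
		return false;

	/* Consume the flag so one delivery causes at most one wake-up. */
	return atomic_exchange(&m->buggy_hlt_workaround, 0) != 0;
}
```

In this model, an interrupt that was delivered and then recognized before the halt still wakes the vCPU once, because the flag survives until it is exchanged back to zero.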

From patchwork Sun Oct 30 06:23:37 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12928
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 096/108] KVM: TDX: Handle TDX PV port io hypercall
Date: Sat, 29 Oct 2022 23:23:37 -0700

From: Isaku Yamahata

Wire up TDX PV port IO hypercall to the KVM backend function.

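
The handler in this patch reads the access size, direction, port, and data from the TDVMCALL argument registers a0..a3 and accepts only 1-, 2-, and 4-byte accesses. A minimal user-space sketch of that decode step follows; struct pio_request and pio_decode() are hypothetical names used for illustration only and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded form of the port I/O hypercall arguments. */
struct pio_request {
	int size;		/* a0: access width in bytes */
	bool write;		/* a1: nonzero for OUT, zero for IN */
	unsigned int port;	/* a2: I/O port number */
	uint64_t value;		/* a3: data, meaningful for writes only */
};

/*
 * Decode the four argument registers. Returns false for an unsupported
 * width, the case the patch answers with TDG_VP_VMCALL_INVALID_OPERAND.
 */
static bool pio_decode(uint64_t a0, uint64_t a1, uint64_t a2, uint64_t a3,
		       struct pio_request *req)
{
	int size = (int)a0;

	if (size != 1 && size != 2 && size != 4)
		return false;

	req->size = size;
	req->write = a1 != 0;
	req->port = (unsigned int)a2;
	req->value = req->write ? a3 : 0;
	return true;
}
```

For a read, the value field stays zero here; in the patch the data comes back later, either directly from the in-kernel emulation or through the complete_userspace_io callback.
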
Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 arch/x86/kvm/vmx/tdx.c | 57 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 03b08b3f1ff6..69a3e7007e83 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -866,6 +866,61 @@ static int tdx_emulate_hlt(struct kvm_vcpu *vcpu)
 	return kvm_emulate_halt_noskip(vcpu);
 }
 
+static int tdx_complete_pio_in(struct kvm_vcpu *vcpu)
+{
+	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
+	unsigned long val = 0;
+	int ret;
+
+	WARN_ON_ONCE(vcpu->arch.pio.count != 1);
+
+	ret = ctxt->ops->pio_in_emulated(ctxt, vcpu->arch.pio.size,
+					 vcpu->arch.pio.port, &val, 1);
+	WARN_ON_ONCE(!ret);
+
+	tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_SUCCESS);
+	tdvmcall_set_return_val(vcpu, val);
+
+	return 1;
+}
+
+static int tdx_emulate_io(struct kvm_vcpu *vcpu)
+{
+	struct x86_emulate_ctxt *ctxt = vcpu->arch.emulate_ctxt;
+	unsigned long val = 0;
+	unsigned int port;
+	int size, ret;
+	bool write;
+
+	++vcpu->stat.io_exits;
+
+	size = tdvmcall_a0_read(vcpu);
+	write = tdvmcall_a1_read(vcpu);
+	port = tdvmcall_a2_read(vcpu);
+
+	if (size != 1 && size != 2 && size != 4) {
+		tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_INVALID_OPERAND);
+		return 1;
+	}
+
+	if (write) {
+		val = tdvmcall_a3_read(vcpu);
+		ret = ctxt->ops->pio_out_emulated(ctxt, size, port, &val, 1);
+
+		/* No need for a complete_userspace_io callback. */
+		vcpu->arch.pio.count = 0;
+	} else {
+		ret = ctxt->ops->pio_in_emulated(ctxt, size, port, &val, 1);
+		if (!ret)
+			vcpu->arch.complete_userspace_io = tdx_complete_pio_in;
+		else
+			tdvmcall_set_return_val(vcpu, val);
+	}
+	if (ret)
+		tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_SUCCESS);
+	return ret;
+}
+
 static int handle_tdvmcall(struct kvm_vcpu *vcpu)
 {
 	if (tdvmcall_exit_type(vcpu))
@@ -876,6 +931,8 @@ static int handle_tdvmcall(struct kvm_vcpu *vcpu)
 		return tdx_emulate_cpuid(vcpu);
 	case EXIT_REASON_HLT:
 		return tdx_emulate_hlt(vcpu);
+	case EXIT_REASON_IO_INSTRUCTION:
+		return tdx_emulate_io(vcpu);
 	default:
 		break;
 	}

From patchwork Sun Oct 30 06:23:38 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12931
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Sean Christopherson
Subject: [PATCH v10 097/108] KVM: TDX: Handle TDX PV MMIO hypercall
Date: Sat, 29 Oct 2022 23:23:38 -0700
Message-Id: <24003e50e3424cbf28518dbfc7d5a9da72482316.1667110240.git.isaku.yamahata@intel.com>

From: Sean Christopherson

Export kvm_io_bus_read() and the kvm_mmio tracepoint, and wire up the
TDX PV MMIO hypercall to the KVM backend functions.

kvm_io_bus_read()/kvm_io_bus_write() search for an in-kernel emulated
KVM device at the given MMIO address and emulate the access. TDX PV
MMIO needs both. kvm_io_bus_write() is already exported, so export
kvm_io_bus_read() as well.

TDX PV MMIO emulates some MMIO itself. To emit trace points
consistently with x86 KVM, export the kvm_mmio tracepoint.

Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 arch/x86/kvm/vmx/tdx.c | 114 +++++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/x86.c     |   1 +
 virt/kvm/kvm_main.c    |   2 +
 3 files changed, 117 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 69a3e7007e83..50e9352464a9 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -921,6 +921,118 @@ static int tdx_emulate_io(struct kvm_vcpu *vcpu)
 	return ret;
 }
 
+static int tdx_complete_mmio(struct kvm_vcpu *vcpu)
+{
+	unsigned long val = 0;
+	gpa_t gpa;
+	int size;
+
+	KVM_BUG_ON(vcpu->mmio_needed != 1, vcpu->kvm);
+	vcpu->mmio_needed = 0;
+
+	if (!vcpu->mmio_is_write) {
+		gpa = vcpu->mmio_fragments[0].gpa;
+		size = vcpu->mmio_fragments[0].len;
+
+		memcpy(&val, vcpu->run->mmio.data, size);
+		tdvmcall_set_return_val(vcpu, val);
+		trace_kvm_mmio(KVM_TRACE_MMIO_READ, size, gpa, &val);
+	}
+	return 1;
+}
+
+static inline int tdx_mmio_write(struct kvm_vcpu *vcpu, gpa_t gpa, int size,
+				 unsigned long val)
+{
+	if (kvm_iodevice_write(vcpu, &vcpu->arch.apic->dev, gpa, size, &val) &&
+	    kvm_io_bus_write(vcpu, KVM_MMIO_BUS, gpa, size, &val))
+		return -EOPNOTSUPP;
+
+	trace_kvm_mmio(KVM_TRACE_MMIO_WRITE, size, gpa, &val);
+	return 0;
+}
+
+static inline int tdx_mmio_read(struct kvm_vcpu *vcpu, gpa_t gpa, int size)
+{
+	unsigned long val;
+
+	if (kvm_iodevice_read(vcpu, &vcpu->arch.apic->dev, gpa, size, &val) &&
+	    kvm_io_bus_read(vcpu, KVM_MMIO_BUS, gpa, size, &val))
+		return -EOPNOTSUPP;
+
+	tdvmcall_set_return_val(vcpu, val);
+	trace_kvm_mmio(KVM_TRACE_MMIO_READ, size, gpa, &val);
+	return 0;
+}
+
+static int tdx_emulate_mmio(struct kvm_vcpu *vcpu)
+{
+	struct kvm_memory_slot *slot;
+	int size, write, r;
+	unsigned long val;
+	gpa_t gpa;
+
+	KVM_BUG_ON(vcpu->mmio_needed, vcpu->kvm);
+
+	size = tdvmcall_a0_read(vcpu);
+	write = tdvmcall_a1_read(vcpu);
+	gpa = tdvmcall_a2_read(vcpu);
+	val = write ? tdvmcall_a3_read(vcpu) : 0;
+
+	if (size != 1 && size != 2 && size != 4 && size != 8)
+		goto error;
+	if (write != 0 && write != 1)
+		goto error;
+
+	/* Strip the shared bit, allow MMIO with and without it set. */
+	gpa = gpa & ~gfn_to_gpa(kvm_gfn_shared_mask(vcpu->kvm));
+
+	if (size > 8u || ((gpa + size - 1) ^ gpa) & PAGE_MASK)
+		goto error;
+
+	slot = kvm_vcpu_gfn_to_memslot(vcpu, gpa_to_gfn(gpa));
+	if (slot && !(slot->flags & KVM_MEMSLOT_INVALID))
+		goto error;
+
+	if (!kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, gpa, 0, NULL)) {
+		trace_kvm_fast_mmio(gpa);
+		return 1;
+	}
+
+	if (write)
+		r = tdx_mmio_write(vcpu, gpa, size, val);
+	else
+		r = tdx_mmio_read(vcpu, gpa, size);
+	if (!r) {
+		/* Kernel completed device emulation. */
+		tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_SUCCESS);
+		return 1;
+	}
+
+	/* Request the device emulation to userspace device model. */
+	vcpu->mmio_needed = 1;
+	vcpu->mmio_is_write = write;
+	vcpu->arch.complete_userspace_io = tdx_complete_mmio;
+
+	vcpu->run->mmio.phys_addr = gpa;
+	vcpu->run->mmio.len = size;
+	vcpu->run->mmio.is_write = write;
+	vcpu->run->exit_reason = KVM_EXIT_MMIO;
+
+	if (write) {
+		memcpy(vcpu->run->mmio.data, &val, size);
+	} else {
+		vcpu->mmio_fragments[0].gpa = gpa;
+		vcpu->mmio_fragments[0].len = size;
+		trace_kvm_mmio(KVM_TRACE_MMIO_READ_UNSATISFIED, size, gpa, NULL);
+	}
+	return 0;
+
+error:
+	tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_INVALID_OPERAND);
+	return 1;
+}
+
 static int handle_tdvmcall(struct kvm_vcpu *vcpu)
 {
 	if (tdvmcall_exit_type(vcpu))
@@ -933,6 +1045,8 @@ static int handle_tdvmcall(struct kvm_vcpu *vcpu)
 		return tdx_emulate_hlt(vcpu);
 	case EXIT_REASON_IO_INSTRUCTION:
 		return tdx_emulate_io(vcpu);
+	case EXIT_REASON_EPT_VIOLATION:
+		return tdx_emulate_mmio(vcpu);
 	default:
 		break;
 	}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fad5108dff1e..2eacc4929d5d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13919,6 +13919,7 @@ bool kvm_arch_has_private_mem(struct kvm *kvm)
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_entry);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
+EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_mmio);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_fast_mmio);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index f0e77b65939b..6953da8b74d3 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2815,6 +2815,7 @@ struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn
 
 	return NULL;
 }
+EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_memslot);
 
 bool kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn)
 {
@@ -5822,6 +5823,7 @@ int kvm_io_bus_read(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
 	r = __kvm_io_bus_read(vcpu, bus, &range, val);
 	return r < 0 ? r : 0;
 }
+EXPORT_SYMBOL_GPL(kvm_io_bus_read);
 
 /* Caller must hold slots_lock. */
 int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,

From patchwork Sun Oct 30 06:23:39 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12929
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 098/108] KVM: TDX: Implement callbacks for MSR operations for TDX
Date: Sat, 29 Oct 2022 23:23:39 -0700
Message-Id: <1cacbda18e3c7dcccd92a7390b0ca7f4ba073f85.1667110240.git.isaku.yamahata@intel.com>

From: Isaku Yamahata

Implement the set_msr/get_msr/has_emulated_msr methods for TDX to handle
hypercalls from the guest TD for paravirtualized RDMSR and WRMSR. The TDX
module virtualizes MSRs. For some MSRs, it injects #VE into the guest TD
upon RDMSR or WRMSR. The exact list of such MSRs is defined in the spec.

Upon #VE, the guest TD may execute the hypercalls
TDG.VP.VMCALL<INSTRUCTION.RDMSR> and TDG.VP.VMCALL<INSTRUCTION.WRMSR>,
which are defined in the GHCI (Guest-Host Communication Interface), so
that the host VMM (e.g. KVM) can virtualize the MSRs.

Signed-off-by: Isaku Yamahata
Reviewed-by: Paolo Bonzini
---
 arch/x86/kvm/vmx/main.c    | 34 ++++++++++++++++++--
 arch/x86/kvm/vmx/tdx.c     | 63 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/vmx/x86_ops.h |  6 ++++
 arch/x86/kvm/x86.c         |  1 -
 arch/x86/kvm/x86.h         |  2 ++
 5 files changed, 102 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 74c561e3eb46..3a8a4fdf1ce7 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -195,6 +195,34 @@ static void vt_handle_exit_irqoff(struct kvm_vcpu *vcpu)
 	vmx_handle_exit_irqoff(vcpu);
 }
 
+static int vt_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
+{
+	if (unlikely(is_td_vcpu(vcpu)))
+		return tdx_set_msr(vcpu, msr_info);
+
+	return vmx_set_msr(vcpu, msr_info);
+}
+
+/*
+ * The kvm parameter can be NULL (module initialization, or invocation before
+ * VM creation). Be sure to check the kvm parameter before using it.
+ */ +static bool vt_has_emulated_msr(struct kvm *kvm, u32 index) +{ + if (kvm && is_td(kvm)) + return tdx_is_emulated_msr(index, true); + + return vmx_has_emulated_msr(kvm, index); +} + +static int vt_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info) +{ + if (unlikely(is_td_vcpu(vcpu))) + return tdx_get_msr(vcpu, msr_info); + + return vmx_get_msr(vcpu, msr_info); +} + static void vt_apicv_post_state_restore(struct kvm_vcpu *vcpu) { struct pi_desc *pi = vcpu_to_pi_desc(vcpu); @@ -431,7 +459,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .hardware_enable = vt_hardware_enable, .hardware_disable = vt_hardware_disable, - .has_emulated_msr = vmx_has_emulated_msr, + .has_emulated_msr = vt_has_emulated_msr, .is_vm_type_supported = vt_is_vm_type_supported, .vm_size = sizeof(struct kvm_vmx), @@ -451,8 +479,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .update_exception_bitmap = vmx_update_exception_bitmap, .get_msr_feature = vmx_get_msr_feature, - .get_msr = vmx_get_msr, - .set_msr = vmx_set_msr, + .get_msr = vt_get_msr, + .set_msr = vt_set_msr, .get_segment_base = vmx_get_segment_base, .get_segment = vmx_get_segment, .set_segment = vmx_set_segment, diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 50e9352464a9..b820475ce0ab 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -1489,6 +1489,69 @@ void tdx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason, *error_code = 0; } +bool tdx_is_emulated_msr(u32 index, bool write) +{ + switch (index) { + case MSR_IA32_UCODE_REV: + case MSR_IA32_ARCH_CAPABILITIES: + case MSR_IA32_POWER_CTL: + case MSR_MTRRcap: + case 0x200 ... 
0x26f: + /* IA32_MTRR_PHYS{BASE, MASK}, IA32_MTRR_FIX*_* */ + case MSR_IA32_CR_PAT: + case MSR_MTRRdefType: + case MSR_IA32_TSC_DEADLINE: + case MSR_IA32_MISC_ENABLE: + case MSR_KVM_STEAL_TIME: + case MSR_KVM_POLL_CONTROL: + case MSR_PLATFORM_INFO: + case MSR_MISC_FEATURES_ENABLES: + case MSR_IA32_MCG_CAP: + case MSR_IA32_MCG_STATUS: + case MSR_IA32_MCG_CTL: + case MSR_IA32_MCG_EXT_CTL: + case MSR_IA32_MC0_CTL ... MSR_IA32_MCx_CTL(KVM_MAX_MCE_BANKS) - 1: + case MSR_IA32_MC0_CTL2 ... MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) - 1: + /* MSR_IA32_MCx_{CTL, STATUS, ADDR, MISC, CTL2} */ + return true; + case APIC_BASE_MSR ... APIC_BASE_MSR + 0xff: + /* + * x2APIC registers that are virtualized by the CPU can't be + * emulated, KVM doesn't have access to the virtual APIC page. + */ + switch (index) { + case X2APIC_MSR(APIC_TASKPRI): + case X2APIC_MSR(APIC_PROCPRI): + case X2APIC_MSR(APIC_EOI): + case X2APIC_MSR(APIC_ISR) ... X2APIC_MSR(APIC_ISR + APIC_ISR_NR): + case X2APIC_MSR(APIC_TMR) ... X2APIC_MSR(APIC_TMR + APIC_ISR_NR): + case X2APIC_MSR(APIC_IRR) ... 
X2APIC_MSR(APIC_IRR + APIC_ISR_NR): + return false; + default: + return true; + } + case MSR_IA32_APICBASE: + case MSR_EFER: + return !write; + default: + return false; + } +} + +int tdx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) +{ + if (tdx_is_emulated_msr(msr->index, false)) + return kvm_get_msr_common(vcpu, msr); + return 1; +} + +int tdx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) +{ + if (tdx_is_emulated_msr(msr->index, true)) + return kvm_set_msr_common(vcpu, msr); + return 1; +} + int tdx_dev_ioctl(void __user *argp) { struct kvm_tdx_capabilities __user *user_caps; diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index d02619f64b6e..8b9399b154f3 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -162,6 +162,9 @@ void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode, void tdx_inject_nmi(struct kvm_vcpu *vcpu); void tdx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason, u64 *info1, u64 *info2, u32 *intr_info, u32 *error_code); +bool tdx_is_emulated_msr(u32 index, bool write); +int tdx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr); +int tdx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr); int tdx_vm_ioctl(struct kvm *kvm, void __user *argp); int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp); @@ -200,6 +203,9 @@ static inline void tdx_deliver_interrupt(struct kvm_lapic *apic, int delivery_mo static inline void tdx_inject_nmi(struct kvm_vcpu *vcpu) {} static inline void tdx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason, u64 *info1, u64 *info2, u32 *intr_info, u32 *error_code) {} +static inline bool tdx_is_emulated_msr(u32 index, bool write) { return false; } +static inline int tdx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) { return 1; } +static inline int tdx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) { return 1; } static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; } static inline int 
tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 2eacc4929d5d..5304b27f2566 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -85,7 +85,6 @@ #include "trace.h" #define MAX_IO_MSRS 256 -#define KVM_MAX_MCE_BANKS 32 struct kvm_caps kvm_caps __read_mostly = { .supported_mce_cap = MCG_CTL_P | MCG_SER_P, diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h index 829d3134c1eb..1d77c39821ae 100644 --- a/arch/x86/kvm/x86.h +++ b/arch/x86/kvm/x86.h @@ -8,6 +8,8 @@ #include "kvm_cache_regs.h" #include "kvm_emulate.h" +#define KVM_MAX_MCE_BANKS 32 + struct kvm_caps { /* control of guest tsc rate supported? */ bool has_tsc_control; From patchwork Sun Oct 30 06:23:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12932 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1681792wru; Sun, 30 Oct 2022 00:33:08 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4x/Yul7sszWJ5YtidBsxwkkiTacM348xYxjNEfAt2ATY064dwbtp/Nv5aZjbyelbJtmZkf X-Received: by 2002:a17:906:eeca:b0:730:6880:c398 with SMTP id wu10-20020a170906eeca00b007306880c398mr7231653ejb.706.1667115188538; Sun, 30 Oct 2022 00:33:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667115188; cv=none; d=google.com; s=arc-20160816; b=qS/W1m0Ye7ok4+1JFNnPMxTdOZRIZPx1/eWqEOeOIw9vXYlP5LIP3mto/f7Ne6DNlu lOn220TyU3qyVrWOXXzI6uBU4/0djkWCuGsnc+85KNd3IqYioXt6qWGxgc4+VAMgrRLq ZnvnnzIPzgI+44X8DKpTfY7u+oa+VHlr7vEThz6q5VCYCRUG4juT+JOxvuZi0KGNpBV6 d952RCeDo7HaHwYN250eYvIiI5ICX0Lz8pCxt+RTwxSpt8wKn8PEus5pJ7pFGLgcb1j3 0UCAyXQqCn6LlMM132tVYQNTawdZ1nCr2Poet998B/zO7gHTX8PYMtNtkbgn3cprmb42 B6lQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111105; x=1698647105; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=39PnIeDQnGG7uhnXgi1UbAsJ6LOJDAraBTB2/rPpwig=; b=Hx6r1DF5mVqcybgSDgispdVZ3vnFwWzy+KPw2CpFnVU/nq8x5HuUnYjE a9r5/s0HA90WO8aa7I7uAg1JxHlN9k5dL0Dh66KKp2xwD4g4P2tue9r7y PnB0tiG8BT4itZDcF0/hoEMRP8xzdIQ3bKLT877LlQdnkRKCieOJZU3SW 7SHJ6Mv+g0UPd7r1zRdENgnVUdQ9KAaTCARIgK3kNHJfeubsEm12uRKl8 bo8bVmxAm7Eb8JqETtkAomeSSx/LinN1c5YQGVzB5rZ3oQ3iej0JpcyEq hrnObqJ09whkv1J7rnZh9T7+nHXGd4Qy8AtEqCgJYgjzCSaXsluSZyrQA Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="395037217" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="395037217" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:14 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393159" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393159" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:14 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 099/108] KVM: TDX: Handle TDX PV rdmsr/wrmsr hypercall Date: Sat, 29 Oct 2022 23:23:40 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: 
linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748096976075541378?= X-GMAIL-MSGID: =?utf-8?q?1748096976075541378?= From: Isaku Yamahata Wire up TDX PV rdmsr/wrmsr hypercall to the KVM backend function. Signed-off-by: Isaku Yamahata Reviewed-by: Paolo Bonzini --- arch/x86/kvm/vmx/tdx.c | 39 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+) diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index b820475ce0ab..e3062c245e70 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -1033,6 +1033,41 @@ static int tdx_emulate_mmio(struct kvm_vcpu *vcpu) return 1; } +static int tdx_emulate_rdmsr(struct kvm_vcpu *vcpu) +{ + u32 index = tdvmcall_a0_read(vcpu); + u64 data; + + if (!kvm_msr_allowed(vcpu, index, KVM_MSR_FILTER_READ) || + kvm_get_msr(vcpu, index, &data)) { + trace_kvm_msr_read_ex(index); + tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_INVALID_OPERAND); + return 1; + } + trace_kvm_msr_read(index, data); + + tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_SUCCESS); + tdvmcall_set_return_val(vcpu, data); + return 1; +} + +static int tdx_emulate_wrmsr(struct kvm_vcpu *vcpu) +{ + u32 index = tdvmcall_a0_read(vcpu); + u64 data = tdvmcall_a1_read(vcpu); + + if (!kvm_msr_allowed(vcpu, index, KVM_MSR_FILTER_WRITE) || + kvm_set_msr(vcpu, index, data)) { + trace_kvm_msr_write_ex(index, data); + tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_INVALID_OPERAND); + return 1; + } + + trace_kvm_msr_write(index, data); + tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_SUCCESS); + return 1; +} + static int handle_tdvmcall(struct kvm_vcpu *vcpu) { if (tdvmcall_exit_type(vcpu)) @@ -1047,6 +1082,10 @@ static int handle_tdvmcall(struct kvm_vcpu *vcpu) return tdx_emulate_io(vcpu); case EXIT_REASON_EPT_VIOLATION: return tdx_emulate_mmio(vcpu); + case EXIT_REASON_MSR_READ: + return tdx_emulate_rdmsr(vcpu); + case EXIT_REASON_MSR_WRITE: + return tdx_emulate_wrmsr(vcpu); default: break; } From 
patchwork Sun Oct 30 06:23:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12881 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1666337wru; Sat, 29 Oct 2022 23:29:51 -0700 (PDT) X-Google-Smtp-Source: AMsMyM74SCtK1MbLqQa0zuzmknsZz/8GXSSEu1nu+46/72yLQjJPh8EklxaeOiz/xD5jXP1ZeIoI X-Received: by 2002:a17:90a:ec04:b0:213:60bf:e6f7 with SMTP id l4-20020a17090aec0400b0021360bfe6f7mr22818682pjy.211.1667111390906; Sat, 29 Oct 2022 23:29:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111390; cv=none; d=google.com; s=arc-20160816; b=qUoC7RUEXDSMF6e03XN0uLov+w3TEENBYrOEJ+6l/afoZ+XYhuqu54+9ha+wJJUl5j A0mRRnZKzvci4OKgHJ3/oTPEKWTYwcKlR91xNY7eXyjPEd5ZduPoaPiJ7rXSKGhvGBNr M6Mcc+8So55TSMDQTR48w3q8CZgo9+hN5uCz5WIpXq40AKssIrBIDjb6oqcGoH7bsyTo LUnmGSSeGQesuauoj5Van0XiDqTTUJiVx0xfbHQKpNIqz8Tu9Zl/O7cX5ARvakQ8bZeh RvHHUhw99FaBf4JyzQzeaMRMkgv9GH8E/EASePsqPnZDITmC4p1QZMaS35bYQH1J/nMm b2sg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=2a2e7nI4DndrU64CojP8GWx/BpwQ2l2ruGd0xC2PBac=; b=HmpAShT0nAfh11N7St7UE/zFJB0QiTXK50EDv85kq+CiY7o4iHORsmAntE7N8sf+jT jxrQi7KDuu+VakH/J5NkSFXptHf+2XUycQip1ktdaK/RoVvbuHLd4v1eCaRgyiMknwAx 3DPhjv7W5TORnh9wZ+TQ4rYGrcP8IfZTlo3/vsot2ofbYFbvYh+kMaxTFvKlPHMFenXX 68K/I5ozX0pJ/T+bh3Gk2ed4G3I8lo03eu1cffmhzKkG7qE8SObk5hlkuiuSVUFtbSKn oLe5vajYdx67LJytqYxrixmRgtly7tbdjTkeLYDIhBG0FB1dRnN72Jn7dBK/RzvCIItZ pjjQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=mx9Rtyaa; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) 
d="scan'208";a="288435993" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:14 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393162" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393162" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:14 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 100/108] KVM: TDX: Handle TDX PV report fatal error hypercall Date: Sat, 29 Oct 2022 23:23:41 -0700 Message-Id: <82671e3e811ab5ad423e125186c050f46621dd86.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_PASS,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748092994080509013?= X-GMAIL-MSGID: =?utf-8?q?1748092994080509013?= From: Isaku Yamahata Wire up the TDX PV report fatal error hypercall to a KVM_EXIT_SYSTEM_EVENT exit that carries the new KVM_SYSTEM_EVENT_TDX event type.
Signed-off-by: Isaku Yamahata --- arch/x86/kvm/vmx/tdx.c | 20 ++++++++++++++++++++ include/uapi/linux/kvm.h | 1 + 2 files changed, 21 insertions(+) diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index e3062c245e70..16f168f4f21a 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -1068,6 +1068,24 @@ static int tdx_emulate_wrmsr(struct kvm_vcpu *vcpu) return 1; } +static int tdx_report_fatal_error(struct kvm_vcpu *vcpu) +{ + /* + * Exit to userspace device model for teardown. + * Because guest TD is already panicking, returning an error to guest TD + * doesn't make sense. No argument check is done. + */ + + vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT; + vcpu->run->system_event.type = KVM_SYSTEM_EVENT_TDX; + vcpu->run->system_event.ndata = 3; + vcpu->run->system_event.data[0] = TDG_VP_VMCALL_REPORT_FATAL_ERROR; + vcpu->run->system_event.data[1] = tdvmcall_a0_read(vcpu); + vcpu->run->system_event.data[2] = tdvmcall_a1_read(vcpu); + + return 0; +} + static int handle_tdvmcall(struct kvm_vcpu *vcpu) { if (tdvmcall_exit_type(vcpu)) @@ -1086,6 +1104,8 @@ static int handle_tdvmcall(struct kvm_vcpu *vcpu) return tdx_emulate_rdmsr(vcpu); case EXIT_REASON_MSR_WRITE: return tdx_emulate_wrmsr(vcpu); + case TDG_VP_VMCALL_REPORT_FATAL_ERROR: + return tdx_report_fatal_error(vcpu); default: break; } diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 49386e4de8b8..504a8f73284b 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -478,6 +478,7 @@ struct kvm_run { #define KVM_SYSTEM_EVENT_WAKEUP 4 #define KVM_SYSTEM_EVENT_SUSPEND 5 #define KVM_SYSTEM_EVENT_SEV_TERM 6 +#define KVM_SYSTEM_EVENT_TDX 7 __u32 type; __u32 ndata; union { From patchwork Sun Oct 30 06:23:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12887 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id
([10.253.24.20]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:14 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393165" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393165" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:14 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 101/108] KVM: TDX: Handle TDX PV map_gpa hypercall Date: Sat, 29 Oct 2022 23:23:42 -0700 Message-Id: <89030b9ba53cfe2cc1884941c12168c1d5a838df.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_PASS,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093007129752220?= X-GMAIL-MSGID: =?utf-8?q?1748093007129752220?= From: Isaku Yamahata Wire up TDX PV map_gpa hypercall to the kvm/mmu backend. 
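For readers following the argument checks in this handler, here is a minimal standalone sketch of the same kind of validation, written outside the kernel. It is not the patch's code: sketch_map_gpa_args_ok() and the shared_bit parameter are hypothetical names introduced for illustration, and shared_bit stands in for the GPA bit that the patch derives from kvm_gfn_shared_mask().

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Hypothetical helper, not part of the patch: returns true when a MapGPA
 * request [gpa, gpa + size) passes checks of the kind the handler performs
 * before calling into the MMU. shared_bit is assumed to be the single GPA
 * bit that marks shared memory.
 */
static bool sketch_map_gpa_args_ok(uint64_t gpa, uint64_t size,
				   uint64_t shared_bit)
{
	uint64_t end = gpa + size;

	/* Both the start address and the length must be 4 KiB aligned. */
	if ((gpa | size) & (SKETCH_PAGE_SIZE - 1))
		return false;

	/* Reject wrap-around of the 64-bit address. */
	if (end < gpa)
		return false;

	/* The range must not extend past twice the shared bit. */
	if (end > (shared_bit << 1))
		return false;

	/* Start and end must lie on the same side of the shared bit. */
	if (!(gpa & shared_bit) != !(end & shared_bit))
		return false;

	return true;
}
```

With shared_bit assumed to be bit 47, a request such as (0x1000, 0x2000) passes, while an unaligned start, a wrapping length, or a range that begins private and ends shared is rejected, which is the case where the handler returns TDG_VP_VMCALL_INVALID_OPERAND.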
Signed-off-by: Isaku Yamahata --- arch/x86/kvm/vmx/tdx.c | 33 +++++++++++++++++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 16f168f4f21a..4db552b60271 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -1086,6 +1086,37 @@ static int tdx_report_fatal_error(struct kvm_vcpu *vcpu) return 0; } +static int tdx_map_gpa(struct kvm_vcpu *vcpu) +{ + struct kvm *kvm = vcpu->kvm; + gpa_t gpa = tdvmcall_a0_read(vcpu); + gpa_t size = tdvmcall_a1_read(vcpu); + gpa_t end = gpa + size; + gfn_t s = gpa_to_gfn(gpa) & ~kvm_gfn_shared_mask(kvm); + gfn_t e = gpa_to_gfn(end) & ~kvm_gfn_shared_mask(kvm); + bool map_private = kvm_is_private_gpa(kvm, gpa); + int ret; + + if (!IS_ALIGNED(gpa, 4096) || !IS_ALIGNED(size, 4096) || + end < gpa || + end > kvm_gfn_shared_mask(kvm) << (PAGE_SHIFT + 1) || + kvm_is_private_gpa(kvm, gpa) != kvm_is_private_gpa(kvm, end)) { + tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_INVALID_OPERAND); + return 1; + } + + ret = kvm_mmu_map_gpa(vcpu, &s, e, map_private); + if (ret == -EAGAIN) { + tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_RETRY); + tdvmcall_set_return_val(vcpu, gfn_to_gpa(s)); + } else if (ret) + tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_INVALID_OPERAND); + else + tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_SUCCESS); + + return 1; +} + static int handle_tdvmcall(struct kvm_vcpu *vcpu) { if (tdvmcall_exit_type(vcpu)) @@ -1106,6 +1137,8 @@ static int handle_tdvmcall(struct kvm_vcpu *vcpu) return tdx_emulate_wrmsr(vcpu); case TDG_VP_VMCALL_REPORT_FATAL_ERROR: return tdx_report_fatal_error(vcpu); + case TDG_VP_VMCALL_MAP_GPA: + return tdx_map_gpa(vcpu); default: break; } From patchwork Sun Oct 30 06:23:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12891 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id 
([10.253.24.20]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:14 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393168" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393168" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:14 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 102/108] KVM: TDX: Handle TDG.VP.VMCALL<GetTdVmCallInfo> hypercall Date: Sat, 29 Oct 2022 23:23:43 -0700 Message-Id: X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_PASS,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093027503694852?= X-GMAIL-MSGID: =?utf-8?q?1748093027503694852?= From: Isaku Yamahata Implement the TDG.VP.VMCALL<GetTdVmCallInfo> hypercall. If the input value is zero, return a success code and zero in the output registers. TDG.VP.VMCALL<GetTdVmCallInfo> is a subleaf of TDG.VP.VMCALL that enumerates which TDG.VP.VMCALL subleaves are supported. This hypercall is for future enhancement of the Guest-Host Communication Interface (GHCI) specification. GHCI version 344426-001US requires input R12 to be zero and zero to be returned in output registers R11, R12, R13, and R14, so that the guest TD enumerates no enhancements.
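The register contract described above (zero input in R12, zeroed R11 to R14 on success) can be modelled outside the kernel as follows. This is only a sketch under stated assumptions: struct sketch_tdvmcall_regs and sketch_get_td_vm_call_info() are hypothetical names, R10 is assumed to carry the status returned to the guest, and the two status values are placeholders for the codes the GHCI specification defines.

```c
#include <stdint.h>

/* Placeholder status values; the authoritative ones come from the GHCI spec. */
#define SKETCH_VMCALL_SUCCESS		0x0ULL
#define SKETCH_VMCALL_INVALID_OPERAND	0x8000000000000000ULL

/* Hypothetical register file for one TDG.VP.VMCALL exchange. */
struct sketch_tdvmcall_regs {
	uint64_t r10;	/* status returned to the guest (assumption) */
	uint64_t r11;
	uint64_t r12;	/* input leaf selector, must be zero */
	uint64_t r13;
	uint64_t r14;
};

/*
 * Model of the enumeration call: a non-zero R12 is rejected and the other
 * registers are left alone; a zero R12 succeeds and reports "no
 * enhancements" by zeroing R11 through R14.
 */
static void sketch_get_td_vm_call_info(struct sketch_tdvmcall_regs *regs)
{
	if (regs->r12 != 0) {
		regs->r10 = SKETCH_VMCALL_INVALID_OPERAND;
		return;
	}

	regs->r10 = SKETCH_VMCALL_SUCCESS;
	regs->r11 = 0;
	regs->r12 = 0;
	regs->r13 = 0;
	regs->r14 = 0;
}
```

A guest written against this contract would treat all-zero outputs as "only the base set of subleaves is available" and would not retry with a different R12 value.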
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/tdx.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 4db552b60271..8e8ac1081ce4 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1068,6 +1068,20 @@ static int tdx_emulate_wrmsr(struct kvm_vcpu *vcpu)
 	return 1;
 }
 
+static int tdx_get_td_vm_call_info(struct kvm_vcpu *vcpu)
+{
+	if (tdvmcall_a0_read(vcpu))
+		tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_INVALID_OPERAND);
+	else {
+		tdvmcall_set_return_code(vcpu, TDG_VP_VMCALL_SUCCESS);
+		kvm_r11_write(vcpu, 0);
+		tdvmcall_a0_write(vcpu, 0);
+		tdvmcall_a1_write(vcpu, 0);
+		tdvmcall_a2_write(vcpu, 0);
+	}
+	return 1;
+}
+
 static int tdx_report_fatal_error(struct kvm_vcpu *vcpu)
 {
 	/*
@@ -1135,6 +1149,8 @@ static int handle_tdvmcall(struct kvm_vcpu *vcpu)
 		return tdx_emulate_rdmsr(vcpu);
 	case EXIT_REASON_MSR_WRITE:
 		return tdx_emulate_wrmsr(vcpu);
+	case TDG_VP_VMCALL_GET_TD_VM_CALL_INFO:
+		return tdx_get_td_vm_call_info(vcpu);
 	case TDG_VP_VMCALL_REPORT_FATAL_ERROR:
 		return tdx_report_fatal_error(vcpu);
 	case TDG_VP_VMCALL_MAP_GPA:

From patchwork Sun Oct 30 06:23:44 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12894
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 103/108] KVM: TDX: Silently discard SMI request
Date: Sat, 29 Oct 2022 23:23:44 -0700
X-Mailer: git-send-email 2.25.1

From: Isaku Yamahata

TDX doesn't support system-management mode (SMM) or system-management
interrupts (SMI) in guest TDs. Because guest state (vcpu state, memory
state) is protected, any change to guest state, such as injecting an SMI
or switching the vcpu into SMM, must go through the TDX module APIs. The
TDX module provides no way for the VMM to inject an SMI into a guest TD
or to switch a guest vcpu into SMM.

KVM has two options when the guest TD or the device model (e.g. QEMU)
requests SMM or SMI: 1) silently ignore the request, or 2) return a
meaningful error.

For simplicity, implement option 1).
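
As a rough illustration of option 1), the userspace sketch below models
how an SMI request can be gated on whether the backend emulates
IA32_SMBASE, which is the check the diff adds around KVM_REQ_SMI. All
model_* names are hypothetical, and the rule that a TD reports
IA32_SMBASE as not emulated is an assumption made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_MSR_IA32_SMBASE	0x9eU	/* architectural index of IA32_SMBASE */

struct model_vm {
	bool is_td;		/* true for a TDX guest */
};

struct model_vcpu {
	struct model_vm *vm;
	bool smi_requested;	/* stands in for a pending KVM_REQ_SMI */
};

/* Assumption: a TD never reports SMBASE as emulated; a conventional VM does. */
bool model_has_emulated_msr(const struct model_vm *vm, uint32_t msr)
{
	if (msr == MODEL_MSR_IA32_SMBASE)
		return !vm->is_td;
	return true;
}

/*
 * Same shape as the patched kvm_vcpu_ioctl_smi(): the call always reports
 * success, but the request is only recorded when SMM can be emulated.
 */
int model_vcpu_ioctl_smi(struct model_vcpu *vcpu)
{
	if (model_has_emulated_msr(vcpu->vm, MODEL_MSR_IA32_SMBASE))
		vcpu->smi_requested = true;
	return 0;
}
```

The caller cannot tell from the return value that the SMI was dropped,
which is the trade-off of choosing the silent option over an error.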
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/lapic.c       |  7 +++++--
 arch/x86/kvm/vmx/main.c    | 43 ++++++++++++++++++++++++++++++++++----
 arch/x86/kvm/vmx/tdx.c     | 27 ++++++++++++++++++++++++
 arch/x86/kvm/vmx/x86_ops.h |  8 +++++++
 arch/x86/kvm/x86.c         |  3 ++-
 5 files changed, 81 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 8d894c3959c8..7a1d612bd138 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1171,8 +1171,11 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
 
 	case APIC_DM_SMI:
 		result = 1;
-		kvm_make_request(KVM_REQ_SMI, vcpu);
-		kvm_vcpu_kick(vcpu);
+		if (static_call(kvm_x86_has_emulated_msr)(vcpu->kvm,
+							  MSR_IA32_SMBASE)) {
+			kvm_make_request(KVM_REQ_SMI, vcpu);
+			kvm_vcpu_kick(vcpu);
+		}
 		break;
 
 	case APIC_DM_NMI:
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 3a8a4fdf1ce7..4acba8d8cb27 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -223,6 +223,41 @@ static int vt_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	return vmx_get_msr(vcpu, msr_info);
 }
 
+static int vt_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_smi_allowed(vcpu, for_injection);
+
+	return vmx_smi_allowed(vcpu, for_injection);
+}
+
+static int vt_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
+{
+	if (unlikely(is_td_vcpu(vcpu)))
+		return tdx_enter_smm(vcpu, smstate);
+
+	return vmx_enter_smm(vcpu, smstate);
+}
+
+static int vt_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
+{
+	if (unlikely(is_td_vcpu(vcpu)))
+		return tdx_leave_smm(vcpu, smstate);
+
+	return vmx_leave_smm(vcpu, smstate);
+}
+
+static void vt_enable_smi_window(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu)) {
+		tdx_enable_smi_window(vcpu);
+		return;
+	}
+
+	/* RSM will cause a vmexit anyway. */
+	vmx_enable_smi_window(vcpu);
+}
+
 static void vt_apicv_post_state_restore(struct kvm_vcpu *vcpu)
 {
 	struct pi_desc *pi = vcpu_to_pi_desc(vcpu);
@@ -580,10 +615,10 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 
 	.setup_mce = vmx_setup_mce,
 
-	.smi_allowed = vmx_smi_allowed,
-	.enter_smm = vmx_enter_smm,
-	.leave_smm = vmx_leave_smm,
-	.enable_smi_window = vmx_enable_smi_window,
+	.smi_allowed = vt_smi_allowed,
+	.enter_smm = vt_enter_smm,
+	.leave_smm = vt_leave_smm,
+	.enable_smi_window = vt_enable_smi_window,
 
 	.can_emulate_instruction = vmx_can_emulate_instruction,
 	.apic_init_signal_blocked = vmx_apic_init_signal_blocked,
diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c
index 8e8ac1081ce4..111027724e06 100644
--- a/arch/x86/kvm/vmx/tdx.c
+++ b/arch/x86/kvm/vmx/tdx.c
@@ -1660,6 +1660,33 @@ int tdx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
 	return 1;
 }
 
+int tdx_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection)
+{
+	/* SMI isn't supported for TDX. */
+	WARN_ON_ONCE(1);
+	return false;
+}
+
+int tdx_enter_smm(struct kvm_vcpu *vcpu, char *smstate)
+{
+	/* smi_allowed() is always false for TDX as above. */
+	WARN_ON_ONCE(1);
+	return 0;
+}
+
+int tdx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate)
+{
+	WARN_ON_ONCE(1);
+	return 0;
+}
+
+void tdx_enable_smi_window(struct kvm_vcpu *vcpu)
+{
+	/* SMI isn't supported for TDX. Silently discard SMI request. */
+	WARN_ON_ONCE(1);
+	vcpu->arch.smi_pending = false;
+}
+
 int tdx_dev_ioctl(void __user *argp)
 {
 	struct kvm_tdx_capabilities __user *user_caps;
diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
index 8b9399b154f3..d4ffbf580bc5 100644
--- a/arch/x86/kvm/vmx/x86_ops.h
+++ b/arch/x86/kvm/vmx/x86_ops.h
@@ -165,6 +165,10 @@ void tdx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason,
 bool tdx_is_emulated_msr(u32 index, bool write);
 int tdx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
 int tdx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr);
+int tdx_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection);
+int tdx_enter_smm(struct kvm_vcpu *vcpu, char *smstate);
+int tdx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate);
+void tdx_enable_smi_window(struct kvm_vcpu *vcpu);
 
 int tdx_vm_ioctl(struct kvm *kvm, void __user *argp);
 int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp);
@@ -206,6 +210,10 @@ static inline void tdx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason, u64 *in
 static inline bool tdx_is_emulated_msr(u32 index, bool write) { return false; }
 static inline int tdx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) { return 1; }
 static inline int tdx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) { return 1; }
+static inline int tdx_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection) { return false; }
+static inline int tdx_enter_smm(struct kvm_vcpu *vcpu, char *smstate) { return 0; }
+static inline int tdx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate) { return 0; }
+static inline void tdx_enable_smi_window(struct kvm_vcpu *vcpu) {}
 
 static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; }
 static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; }
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5304b27f2566..d07563d0e204 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4936,7 +4936,8 @@ static int kvm_vcpu_ioctl_nmi(struct kvm_vcpu *vcpu)
 
 static int kvm_vcpu_ioctl_smi(struct kvm_vcpu *vcpu)
 {
-	kvm_make_request(KVM_REQ_SMI, vcpu);
+	if (static_call(kvm_x86_has_emulated_msr)(vcpu->kvm, MSR_IA32_SMBASE))
+		kvm_make_request(KVM_REQ_SMI, vcpu);
 
 	return 0;
 }

From patchwork Sun Oct 30 06:23:45 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12900
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 104/108] KVM: TDX: Silently ignore INIT/SIPI
Date: Sat, 29 Oct 2022 23:23:45 -0700
X-Mailer: git-send-email 2.25.1

From: Isaku Yamahata

The TDX module API doesn't provide a way for the VMM to inject INIT IPI
and SIPI. Instead it defines a different protocol to boot application
processors. Ignore INIT and SIPI events for the TDX guest.

There are two options: 1) (silently) ignore the INIT/SIPI request, or
2) return an error to the guest TD somehow. Given that the TDX guest is
paravirtualized to boot APs, option 1) is chosen for simplicity.

Signed-off-by: Isaku Yamahata
---
 arch/x86/include/asm/kvm-x86-ops.h |  1 +
 arch/x86/include/asm/kvm_host.h    |  2 ++
 arch/x86/kvm/lapic.c               | 19 ++++++++++++-------
 arch/x86/kvm/svm/svm.c             |  1 +
 arch/x86/kvm/vmx/main.c            | 22 +++++++++++++++++++++-
 5 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
index 17c3828d42a3..4e9b96480716 100644
--- a/arch/x86/include/asm/kvm-x86-ops.h
+++ b/arch/x86/include/asm/kvm-x86-ops.h
@@ -140,6 +140,7 @@ KVM_X86_OP_OPTIONAL(migrate_timers)
 KVM_X86_OP(msr_filter_changed)
 KVM_X86_OP(complete_emulated_msr)
 KVM_X86_OP(vcpu_deliver_sipi_vector)
+KVM_X86_OP(vcpu_deliver_init)
 KVM_X86_OP_OPTIONAL_RET0(vcpu_get_apicv_inhibit_reasons);
 KVM_X86_OP(check_processor_compatibility)
 
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 094fff5414e1..df67ca7b23d3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1706,6 +1706,7 @@ struct kvm_x86_ops {
 	int (*complete_emulated_msr)(struct kvm_vcpu *vcpu, int err);
 
 	void (*vcpu_deliver_sipi_vector)(struct kvm_vcpu *vcpu, u8 vector);
+	void (*vcpu_deliver_init)(struct kvm_vcpu *vcpu);
 
 	/*
 	 * Returns vCPU specific APICv inhibit reasons
@@ -1914,6 +1915,7 @@ int kvm_emulate_wbinvd(struct kvm_vcpu *vcpu);
 void kvm_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg);
 int kvm_load_segment_descriptor(struct kvm_vcpu *vcpu, u16 selector, int seg);
 void kvm_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
+void kvm_vcpu_deliver_init(struct kvm_vcpu *vcpu);
 
 int kvm_task_switch(struct kvm_vcpu *vcpu, u16 tss_selector, int idt_index,
 		    int reason, bool has_error_code, u32 error_code);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 7a1d612bd138..7393d858ed72 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -3035,6 +3035,16 @@ int kvm_lapic_set_pv_eoi(struct kvm_vcpu *vcpu, u64 data, unsigned long len)
 	return 0;
 }
 
+void kvm_vcpu_deliver_init(struct kvm_vcpu *vcpu)
+{
+	kvm_vcpu_reset(vcpu, true);
+	if (kvm_vcpu_is_bsp(vcpu))
+		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
+	else
+		vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
+}
+EXPORT_SYMBOL_GPL(kvm_vcpu_deliver_init);
+
 int kvm_apic_accept_events(struct kvm_vcpu *vcpu)
 {
 	struct kvm_lapic *apic = vcpu->arch.apic;
@@ -3066,13 +3076,8 @@ int kvm_apic_accept_events(struct kvm_vcpu *vcpu)
 		return 0;
 	}
 
-	if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events)) {
-		kvm_vcpu_reset(vcpu, true);
-		if (kvm_vcpu_is_bsp(apic->vcpu))
-			vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
-		else
-			vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
-	}
+	if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events))
+		static_call(kvm_x86_vcpu_deliver_init)(vcpu);
 	if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events)) {
 		if (vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
 			/* evaluate pending_events before reading the vector */
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 2bcf2e1a5271..5d56b0f1f595 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4857,6 +4857,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.complete_emulated_msr = svm_complete_emulated_msr,
 
 	.vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector,
+	.vcpu_deliver_init = kvm_vcpu_deliver_init,
 	.vcpu_get_apicv_inhibit_reasons = avic_vcpu_get_apicv_inhibit_reasons,
 };
 
diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index 4acba8d8cb27..d776d5d169d0 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -286,6 +286,25 @@ static void vt_deliver_interrupt(struct kvm_lapic *apic, int delivery_mode,
 	vmx_deliver_interrupt(apic, delivery_mode, trig_mode, vector);
 }
 
+static void vt_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
+{
+	if (is_td_vcpu(vcpu))
+		return;
+
+	kvm_vcpu_deliver_sipi_vector(vcpu, vector);
+}
+
+static void vt_vcpu_deliver_init(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu)) {
+		/* TDX doesn't support INIT. Ignore INIT event */
+		vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
+		return;
+	}
+
+	kvm_vcpu_deliver_init(vcpu);
+}
+
 static void vt_flush_tlb_all(struct kvm_vcpu *vcpu)
 {
 	if (is_td_vcpu(vcpu))
@@ -627,7 +646,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
 	.msr_filter_changed = vmx_msr_filter_changed,
 	.complete_emulated_msr = kvm_complete_insn_gp,
 
-	.vcpu_deliver_sipi_vector = kvm_vcpu_deliver_sipi_vector,
+	.vcpu_deliver_sipi_vector = vt_vcpu_deliver_sipi_vector,
+	.vcpu_deliver_init = vt_vcpu_deliver_init,
 
 	.dev_mem_enc_ioctl = tdx_dev_ioctl,
 	.mem_enc_ioctl = vt_mem_enc_ioctl,

From patchwork Sun Oct 30 06:23:46 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12905
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack, Sean Christopherson
Subject: [PATCH v10 105/108] KVM: TDX: Add methods to ignore accesses to CPU state
Date: Sat, 29 Oct 2022 23:23:46 -0700
Message-Id: <282dd5a8edbee0aa87cdf035088ecd8558b0b999.1667110240.git.isaku.yamahata@intel.com>
X-Mailer: git-send-email 2.25.1

From: Sean Christopherson

TDX protects TDX guest state from the VMM. Implement the access methods
for TDX guest state so that they ignore the access or return zero.

Signed-off-by: Sean Christopherson
Signed-off-by: Isaku Yamahata
---
 arch/x86/kvm/vmx/main.c    | 463 +++++++++++++++++++++++++++++++++----
 arch/x86/kvm/vmx/tdx.c     |  55 ++++-
 arch/x86/kvm/vmx/x86_ops.h |  17 ++
 3 files changed, 490 insertions(+), 45 deletions(-)

diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
index d776d5d169d0..4381ef503540 100644
--- a/arch/x86/kvm/vmx/main.c
+++ b/arch/x86/kvm/vmx/main.c
@@ -258,6 +258,46 @@ static void vt_enable_smi_window(struct kvm_vcpu *vcpu)
 	vmx_enable_smi_window(vcpu);
 }
 
+static bool vt_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
+				       void *insn, int insn_len)
+{
+	if (is_td_vcpu(vcpu))
+		return false;
+
+	return vmx_can_emulate_instruction(vcpu, emul_type, insn, insn_len);
+}
+
+static int vt_check_intercept(struct kvm_vcpu *vcpu,
+			      struct x86_instruction_info *info,
+			      enum x86_intercept_stage stage,
+			      struct x86_exception *exception)
+{
+	/*
+	 * This call back is triggered by the x86 instruction emulator. TDX
+	 * doesn't allow guest memory inspection.
+	 */
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm))
+		return X86EMUL_UNHANDLEABLE;
+
+	return vmx_check_intercept(vcpu, info, stage, exception);
+}
+
+static bool vt_apic_init_signal_blocked(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return true;
+
+	return vmx_apic_init_signal_blocked(vcpu);
+}
+
+static void vt_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_set_virtual_apic_mode(vcpu);
+
+	return vmx_set_virtual_apic_mode(vcpu);
+}
+
 static void vt_apicv_post_state_restore(struct kvm_vcpu *vcpu)
 {
 	struct pi_desc *pi = vcpu_to_pi_desc(vcpu);
@@ -266,6 +306,31 @@ static void vt_apicv_post_state_restore(struct kvm_vcpu *vcpu)
 	memset(pi->pir, 0, sizeof(pi->pir));
 }
 
+static void vt_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
+{
+	if (is_td_vcpu(vcpu))
+		return;
+
+	return vmx_hwapic_irr_update(vcpu, max_irr);
+}
+
+static void vt_hwapic_isr_update(int max_isr)
+{
+	if (is_td_vcpu(kvm_get_running_vcpu()))
+		return;
+
+	return vmx_hwapic_isr_update(max_isr);
+}
+
+static bool vt_guest_apic_has_interrupt(struct kvm_vcpu *vcpu)
+{
+	/* TDX doesn't support L2 at the moment. */
+	if (WARN_ON_ONCE(is_td_vcpu(vcpu)))
+		return false;
+
+	return vmx_guest_apic_has_interrupt(vcpu);
+}
+
 static int vt_sync_pir_to_irr(struct kvm_vcpu *vcpu)
 {
 	if (is_td_vcpu(vcpu))
@@ -305,6 +370,177 @@ static void vt_vcpu_deliver_init(struct kvm_vcpu *vcpu)
 	kvm_vcpu_deliver_init(vcpu);
 }
 
+static void vt_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return;
+
+	return vmx_vcpu_after_set_cpuid(vcpu);
+}
+
+static void vt_update_exception_bitmap(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return;
+
+	vmx_update_exception_bitmap(vcpu);
+}
+
+static u64 vt_get_segment_base(struct kvm_vcpu *vcpu, int seg)
+{
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm))
+		return tdx_get_segment_base(vcpu, seg);
+
+	return vmx_get_segment_base(vcpu, seg);
+}
+
+static void vt_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var,
+			   int seg)
+{
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm))
+		return tdx_get_segment(vcpu, var, seg);
+
+	vmx_get_segment(vcpu, var, seg);
+}
+
+static void vt_set_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var,
+			   int seg)
+{
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm))
+		return;
+
+	vmx_set_segment(vcpu, var, seg);
+}
+
+static int vt_get_cpl(struct kvm_vcpu *vcpu)
+{
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm))
+		return tdx_get_cpl(vcpu);
+
+	return vmx_get_cpl(vcpu);
+}
+
+static void vt_get_cs_db_l_bits(struct kvm_vcpu *vcpu, int *db, int *l)
+{
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm))
+		return;
+
+	vmx_get_cs_db_l_bits(vcpu, db, l);
+}
+
+static void vt_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
+{
+	if (is_td_vcpu(vcpu))
+		return;
+
+	vmx_set_cr0(vcpu, cr0);
+}
+
+static void vt_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
+{
+	if (is_td_vcpu(vcpu))
+		return;
+
+	vmx_set_cr4(vcpu, cr4);
+}
+
+static int vt_set_efer(struct kvm_vcpu *vcpu, u64 efer)
+{
+	if (is_td_vcpu(vcpu))
+		return 0;
+
+	return vmx_set_efer(vcpu, efer);
+}
+
+static void vt_get_idt(struct kvm_vcpu *vcpu, struct desc_ptr *dt)
+{
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm)) {
+		memset(dt, 0, sizeof(*dt));
+		return;
+	}
+
+	vmx_get_idt(vcpu, dt);
+}
+
+static void vt_set_idt(struct kvm_vcpu *vcpu, struct desc_ptr *dt)
+{
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm))
+		return;
+
+	vmx_set_idt(vcpu, dt);
+}
+
+static void vt_get_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt)
+{
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm)) {
+		memset(dt, 0, sizeof(*dt));
+		return;
+	}
+
+	vmx_get_gdt(vcpu, dt);
+}
+
+static void vt_set_gdt(struct kvm_vcpu *vcpu, struct desc_ptr *dt)
+{
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm))
+		return;
+
+	vmx_set_gdt(vcpu, dt);
+}
+
+static void vt_set_dr7(struct kvm_vcpu *vcpu, unsigned long val)
+{
+	if (is_td_vcpu(vcpu))
+		return;
+
+	vmx_set_dr7(vcpu, val);
+}
+
+static void vt_sync_dirty_debug_regs(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * MOV-DR exiting is always cleared for TD guest, even in debug mode.
+	 * Thus KVM_DEBUGREG_WONT_EXIT can never be set and it should never
+	 * reach here for TD vcpu.
+	 */
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm))
+		return;
+
+	vmx_sync_dirty_debug_regs(vcpu);
+}
+
+static void vt_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_cache_reg(vcpu, reg);
+
+	return vmx_cache_reg(vcpu, reg);
+}
+
+static unsigned long vt_get_rflags(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return tdx_get_rflags(vcpu);
+
+	return vmx_get_rflags(vcpu);
+}
+
+static void vt_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
+{
+	if (is_td_vcpu(vcpu))
+		return;
+
+	vmx_set_rflags(vcpu, rflags);
+}
+
+static bool vt_get_if_flag(struct kvm_vcpu *vcpu)
+{
+	if (is_td_vcpu(vcpu))
+		return false;
+
+	return vmx_get_if_flag(vcpu);
+}
+
 static void vt_flush_tlb_all(struct kvm_vcpu *vcpu)
 {
 	if (is_td_vcpu(vcpu))
@@ -438,6 +674,15 @@ static u32 vt_get_interrupt_shadow(struct kvm_vcpu *vcpu)
 	return vmx_get_interrupt_shadow(vcpu);
 }
 
+static void vt_patch_hypercall(struct kvm_vcpu *vcpu,
+			       unsigned char *hypercall)
+{
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm))
+		return;
+
+	vmx_patch_hypercall(vcpu, hypercall);
+}
+
 static void vt_inject_irq(struct kvm_vcpu *vcpu, bool reinjected)
 {
 	if (is_td_vcpu(vcpu))
@@ -446,6 +691,14 @@ static void vt_inject_irq(struct kvm_vcpu *vcpu, bool reinjected)
 	vmx_inject_irq(vcpu, reinjected);
 }
 
+static void vt_inject_exception(struct kvm_vcpu *vcpu)
+{
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm))
+		return;
+
+	vmx_inject_exception(vcpu);
+}
+
 static void vt_cancel_injection(struct kvm_vcpu *vcpu)
 {
 	if (is_td_vcpu(vcpu))
@@ -478,6 +731,130 @@ static void vt_request_immediate_exit(struct kvm_vcpu *vcpu)
 	vmx_request_immediate_exit(vcpu);
 }
 
+static void vt_update_cr8_intercept(struct kvm_vcpu *vcpu, int tpr, int irr)
+{
+	if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm))
+		return;
+
+	vmx_update_cr8_intercept(vcpu, tpr, irr);
+}
+
+static void vt_set_apic_access_page_addr(struct kvm_vcpu *vcpu)
+{
+	if (WARN_ON_ONCE(is_td_vcpu(vcpu)))
+		return;
+
vmx_set_apic_access_page_addr(vcpu); +} + +static void vt_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu) +{ + if (WARN_ON_ONCE(is_td_vcpu(vcpu))) + return; + + vmx_refresh_apicv_exec_ctrl(vcpu); +} + +static void vt_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap) +{ + if (is_td_vcpu(vcpu)) + return; + + vmx_load_eoi_exitmap(vcpu, eoi_exit_bitmap); +} + +static int vt_set_tss_addr(struct kvm *kvm, unsigned int addr) +{ + if (is_td(kvm)) + return 0; + + return vmx_set_tss_addr(kvm, addr); +} + +static int vt_set_identity_map_addr(struct kvm *kvm, u64 ident_addr) +{ + if (is_td(kvm)) + return 0; + + return vmx_set_identity_map_addr(kvm, ident_addr); +} + +static u8 vt_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio) +{ + if (is_td_vcpu(vcpu)) { + if (is_mmio) + return MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT; + return MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT; + } + + return vmx_get_mt_mask(vcpu, gfn, is_mmio); +} + +static u64 vt_get_l2_tsc_offset(struct kvm_vcpu *vcpu) +{ + /* TDX doesn't support L2 guest at the moment. */ + if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm)) + return 0; + + return vmx_get_l2_tsc_offset(vcpu); +} + +static u64 vt_get_l2_tsc_multiplier(struct kvm_vcpu *vcpu) +{ + /* TDX doesn't support L2 guest at the moment. */ + if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm)) + return 0; + + return vmx_get_l2_tsc_multiplier(vcpu); +} + +static void vt_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset) +{ + /* In TDX, tsc offset can't be changed. */ + if (is_td_vcpu(vcpu)) + return; + + vmx_write_tsc_offset(vcpu, offset); +} + +static void vt_write_tsc_multiplier(struct kvm_vcpu *vcpu, u64 multiplier) +{ + /* In TDX, tsc multiplier can't be changed. 
*/ + if (is_td_vcpu(vcpu)) + return; + + vmx_write_tsc_multiplier(vcpu, multiplier); +} + +static void vt_update_cpu_dirty_logging(struct kvm_vcpu *vcpu) +{ + if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm)) + return; + + vmx_update_cpu_dirty_logging(vcpu); +} + +#ifdef CONFIG_X86_64 +static int vt_set_hv_timer(struct kvm_vcpu *vcpu, u64 guest_deadline_tsc, + bool *expired) +{ + /* VMX-preemption timer isn't available for TDX. */ + if (is_td_vcpu(vcpu)) + return -EINVAL; + + return vmx_set_hv_timer(vcpu, guest_deadline_tsc, expired); +} + +static void vt_cancel_hv_timer(struct kvm_vcpu *vcpu) +{ + /* VMX-preemption timer can't be set. See vt_set_hv_timer(). */ + if (KVM_BUG_ON(is_td_vcpu(vcpu), vcpu->kvm)) + return; + + vmx_cancel_hv_timer(vcpu); +} +#endif + static void vt_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason, u64 *info1, u64 *info2, u32 *intr_info, u32 *error_code) { @@ -531,29 +908,29 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .vcpu_load = vt_vcpu_load, .vcpu_put = vt_vcpu_put, - .update_exception_bitmap = vmx_update_exception_bitmap, + .update_exception_bitmap = vt_update_exception_bitmap, .get_msr_feature = vmx_get_msr_feature, .get_msr = vt_get_msr, .set_msr = vt_set_msr, - .get_segment_base = vmx_get_segment_base, - .get_segment = vmx_get_segment, - .set_segment = vmx_set_segment, - .get_cpl = vmx_get_cpl, - .get_cs_db_l_bits = vmx_get_cs_db_l_bits, - .set_cr0 = vmx_set_cr0, + .get_segment_base = vt_get_segment_base, + .get_segment = vt_get_segment, + .set_segment = vt_set_segment, + .get_cpl = vt_get_cpl, + .get_cs_db_l_bits = vt_get_cs_db_l_bits, + .set_cr0 = vt_set_cr0, .is_valid_cr4 = vmx_is_valid_cr4, - .set_cr4 = vmx_set_cr4, - .set_efer = vmx_set_efer, - .get_idt = vmx_get_idt, - .set_idt = vmx_set_idt, - .get_gdt = vmx_get_gdt, - .set_gdt = vmx_set_gdt, - .set_dr7 = vmx_set_dr7, - .sync_dirty_debug_regs = vmx_sync_dirty_debug_regs, - .cache_reg = vmx_cache_reg, - .get_rflags = vmx_get_rflags, - .set_rflags = vmx_set_rflags, - 
.get_if_flag = vmx_get_if_flag, + .set_cr4 = vt_set_cr4, + .set_efer = vt_set_efer, + .get_idt = vt_get_idt, + .set_idt = vt_set_idt, + .get_gdt = vt_get_gdt, + .set_gdt = vt_set_gdt, + .set_dr7 = vt_set_dr7, + .sync_dirty_debug_regs = vt_sync_dirty_debug_regs, + .cache_reg = vt_cache_reg, + .get_rflags = vt_get_rflags, + .set_rflags = vt_set_rflags, + .get_if_flag = vt_get_if_flag, .flush_tlb_all = vt_flush_tlb_all, .flush_tlb_current = vt_flush_tlb_current, @@ -569,10 +946,10 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .update_emulated_instruction = vmx_update_emulated_instruction, .set_interrupt_shadow = vt_set_interrupt_shadow, .get_interrupt_shadow = vt_get_interrupt_shadow, - .patch_hypercall = vmx_patch_hypercall, + .patch_hypercall = vt_patch_hypercall, .inject_irq = vt_inject_irq, .inject_nmi = vt_inject_nmi, - .inject_exception = vmx_inject_exception, + .inject_exception = vt_inject_exception, .cancel_injection = vt_cancel_injection, .interrupt_allowed = vt_interrupt_allowed, .nmi_allowed = vt_nmi_allowed, @@ -580,39 +957,39 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .set_nmi_mask = vt_set_nmi_mask, .enable_nmi_window = vt_enable_nmi_window, .enable_irq_window = vt_enable_irq_window, - .update_cr8_intercept = vmx_update_cr8_intercept, - .set_virtual_apic_mode = vmx_set_virtual_apic_mode, - .set_apic_access_page_addr = vmx_set_apic_access_page_addr, - .refresh_apicv_exec_ctrl = vmx_refresh_apicv_exec_ctrl, - .load_eoi_exitmap = vmx_load_eoi_exitmap, + .update_cr8_intercept = vt_update_cr8_intercept, + .set_virtual_apic_mode = vt_set_virtual_apic_mode, + .set_apic_access_page_addr = vt_set_apic_access_page_addr, + .refresh_apicv_exec_ctrl = vt_refresh_apicv_exec_ctrl, + .load_eoi_exitmap = vt_load_eoi_exitmap, .apicv_post_state_restore = vt_apicv_post_state_restore, .check_apicv_inhibit_reasons = vmx_check_apicv_inhibit_reasons, - .hwapic_irr_update = vmx_hwapic_irr_update, - .hwapic_isr_update = vmx_hwapic_isr_update, - .guest_apic_has_interrupt = 
vmx_guest_apic_has_interrupt, + .hwapic_irr_update = vt_hwapic_irr_update, + .hwapic_isr_update = vt_hwapic_isr_update, + .guest_apic_has_interrupt = vt_guest_apic_has_interrupt, .sync_pir_to_irr = vt_sync_pir_to_irr, .deliver_interrupt = vt_deliver_interrupt, .dy_apicv_has_pending_interrupt = pi_has_pending_interrupt, .protected_apic_has_interrupt = vt_protected_apic_has_interrupt, - .set_tss_addr = vmx_set_tss_addr, - .set_identity_map_addr = vmx_set_identity_map_addr, - .get_mt_mask = vmx_get_mt_mask, + .set_tss_addr = vt_set_tss_addr, + .set_identity_map_addr = vt_set_identity_map_addr, + .get_mt_mask = vt_get_mt_mask, .get_exit_info = vt_get_exit_info, - .vcpu_after_set_cpuid = vmx_vcpu_after_set_cpuid, + .vcpu_after_set_cpuid = vt_vcpu_after_set_cpuid, .has_wbinvd_exit = cpu_has_vmx_wbinvd_exit, - .get_l2_tsc_offset = vmx_get_l2_tsc_offset, - .get_l2_tsc_multiplier = vmx_get_l2_tsc_multiplier, - .write_tsc_offset = vmx_write_tsc_offset, - .write_tsc_multiplier = vmx_write_tsc_multiplier, + .get_l2_tsc_offset = vt_get_l2_tsc_offset, + .get_l2_tsc_multiplier = vt_get_l2_tsc_multiplier, + .write_tsc_offset = vt_write_tsc_offset, + .write_tsc_multiplier = vt_write_tsc_multiplier, .load_mmu_pgd = vt_load_mmu_pgd, - .check_intercept = vmx_check_intercept, + .check_intercept = vt_check_intercept, .handle_exit_irqoff = vt_handle_exit_irqoff, .request_immediate_exit = vt_request_immediate_exit, @@ -620,7 +997,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .sched_in = vt_sched_in, .cpu_dirty_log_size = PML_ENTITY_NUM, - .update_cpu_dirty_logging = vmx_update_cpu_dirty_logging, + .update_cpu_dirty_logging = vt_update_cpu_dirty_logging, .nested_ops = &vmx_nested_ops, @@ -628,8 +1005,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .pi_start_assignment = vmx_pi_start_assignment, #ifdef CONFIG_X86_64 - .set_hv_timer = vmx_set_hv_timer, - .cancel_hv_timer = vmx_cancel_hv_timer, + .set_hv_timer = vt_set_hv_timer, + .cancel_hv_timer = vt_cancel_hv_timer, #endif .setup_mce 
= vmx_setup_mce, @@ -639,8 +1016,8 @@ struct kvm_x86_ops vt_x86_ops __initdata = { .leave_smm = vt_leave_smm, .enable_smi_window = vt_enable_smi_window, - .can_emulate_instruction = vmx_can_emulate_instruction, - .apic_init_signal_blocked = vmx_apic_init_signal_blocked, + .can_emulate_instruction = vt_can_emulate_instruction, + .apic_init_signal_blocked = vt_apic_init_signal_blocked, .migrate_timers = vmx_migrate_timers, .msr_filter_changed = vmx_msr_filter_changed, diff --git a/arch/x86/kvm/vmx/tdx.c b/arch/x86/kvm/vmx/tdx.c index 111027724e06..2ea010e96414 100644 --- a/arch/x86/kvm/vmx/tdx.c +++ b/arch/x86/kvm/vmx/tdx.c @@ -3,6 +3,7 @@ #include #include +#include #include #include "capabilities.h" @@ -493,8 +494,15 @@ int tdx_vcpu_create(struct kvm_vcpu *vcpu) vcpu->arch.tsc_offset = to_kvm_tdx(vcpu->kvm)->tsc_offset; vcpu->arch.l1_tsc_offset = vcpu->arch.tsc_offset; - vcpu->arch.guest_state_protected = - !(to_kvm_tdx(vcpu->kvm)->attributes & TDX_TD_ATTRIBUTE_DEBUG); + /* + * TODO: support off-TD debug. If TD DEBUG is enabled, guest state + * can be accessed. guest_state_protected = false. and kvm ioctl to + * access CPU states should be usable for user space VMM (e.g. qemu). + * + * vcpu->arch.guest_state_protected = + * !(to_kvm_tdx(vcpu->kvm)->attributes & TDX_TD_ATTRIBUTE_DEBUG); + */ + vcpu->arch.guest_state_protected = true; tdx->pi_desc.nv = POSTED_INTR_VECTOR; tdx->pi_desc.sn = 1; @@ -1687,6 +1695,49 @@ void tdx_enable_smi_window(struct kvm_vcpu *vcpu) vcpu->arch.smi_pending = false; } +void tdx_set_virtual_apic_mode(struct kvm_vcpu *vcpu) +{ + /* Only x2APIC mode is supported for TD. 
*/ + WARN_ON_ONCE(kvm_get_apic_mode(vcpu) != LAPIC_MODE_X2APIC); +} + +int tdx_get_cpl(struct kvm_vcpu *vcpu) +{ + return 0; +} + +void tdx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg) +{ + kvm_register_mark_available(vcpu, reg); + switch (reg) { + case VCPU_REGS_RSP: + case VCPU_REGS_RIP: + case VCPU_EXREG_PDPTR: + case VCPU_EXREG_CR0: + case VCPU_EXREG_CR3: + case VCPU_EXREG_CR4: + break; + default: + KVM_BUG_ON(1, vcpu->kvm); + break; + } +} + +unsigned long tdx_get_rflags(struct kvm_vcpu *vcpu) +{ + return 0; +} + +u64 tdx_get_segment_base(struct kvm_vcpu *vcpu, int seg) +{ + return 0; +} + +void tdx_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg) +{ + memset(var, 0, sizeof(*var)); +} + int tdx_dev_ioctl(void __user *argp) { struct kvm_tdx_capabilities __user *user_caps; diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h index d4ffbf580bc5..1341787f1378 100644 --- a/arch/x86/kvm/vmx/x86_ops.h +++ b/arch/x86/kvm/vmx/x86_ops.h @@ -169,6 +169,14 @@ int tdx_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection); int tdx_enter_smm(struct kvm_vcpu *vcpu, char *smstate); int tdx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate); void tdx_enable_smi_window(struct kvm_vcpu *vcpu); +void tdx_set_virtual_apic_mode(struct kvm_vcpu *vcpu); + +int tdx_get_cpl(struct kvm_vcpu *vcpu); +void tdx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg); +unsigned long tdx_get_rflags(struct kvm_vcpu *vcpu); +bool tdx_is_emulated_msr(u32 index, bool write); +u64 tdx_get_segment_base(struct kvm_vcpu *vcpu, int seg); +void tdx_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, int seg); int tdx_vm_ioctl(struct kvm *kvm, void __user *argp); int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp); @@ -210,10 +218,19 @@ static inline void tdx_get_exit_info(struct kvm_vcpu *vcpu, u32 *reason, u64 *in static inline bool tdx_is_emulated_msr(u32 index, bool write) { return false; } static inline int tdx_get_msr(struct kvm_vcpu *vcpu, 
struct msr_data *msr) { return 1; } static inline int tdx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr) { return 1; } + static inline int tdx_smi_allowed(struct kvm_vcpu *vcpu, bool for_injection) { return false; } static inline int tdx_enter_smm(struct kvm_vcpu *vcpu, char *smstate) { return 0; } static inline int tdx_leave_smm(struct kvm_vcpu *vcpu, const char *smstate) { return 0; } static inline void tdx_enable_smi_window(struct kvm_vcpu *vcpu) {} +static inline void tdx_set_virtual_apic_mode(struct kvm_vcpu *vcpu) {} + +static inline int tdx_get_cpl(struct kvm_vcpu *vcpu) { return 0; } +static inline void tdx_cache_reg(struct kvm_vcpu *vcpu, enum kvm_reg reg) {} +static inline unsigned long tdx_get_rflags(struct kvm_vcpu *vcpu) { return 0; } +static inline u64 tdx_get_segment_base(struct kvm_vcpu *vcpu, int seg) { return 0; } +static inline void tdx_get_segment(struct kvm_vcpu *vcpu, struct kvm_segment *var, + int seg) {} static inline int tdx_vm_ioctl(struct kvm *kvm, void __user *argp) { return -EOPNOTSUPP; } static inline int tdx_vcpu_ioctl(struct kvm_vcpu *vcpu, void __user *argp) { return -EOPNOTSUPP; } From patchwork Sun Oct 30 06:23:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12913 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack Subject: [PATCH v10 106/108] Documentation/virt/kvm: Document on Trust Domain Extensions(TDX) Date: Sat, 29 Oct 2022 23:23:47 -0700 X-Mailer: git-send-email 2.25.1 From: Isaku Yamahata Add documentation to Intel Trusted Domain Extensions(TDX) support. 
Signed-off-by: Isaku Yamahata --- Documentation/virt/kvm/api.rst | 9 +- Documentation/virt/kvm/index.rst | 2 + Documentation/virt/kvm/intel-tdx.rst | 345 +++++++++++++++++++++++++++ 3 files changed, 355 insertions(+), 1 deletion(-) create mode 100644 Documentation/virt/kvm/intel-tdx.rst diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index b6f08e8a8320..3d819b2ceb78 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -1426,6 +1426,9 @@ It is recommended to use this API instead of the KVM_SET_MEMORY_REGION ioctl. The KVM_SET_MEMORY_REGION does not allow fine grained control over memory allocation and is deprecated. +For a TDX guest, deleting or moving a memory region loses the guest memory +contents. Read-only regions aren't supported. Only as-id 0 is supported. + 4.36 KVM_SET_TSS_ADDR --------------------- @@ -4714,7 +4717,7 @@ H_GET_CPU_CHARACTERISTICS hypercall. :Capability: basic :Architectures: x86 -:Type: vm +:Type: vm ioctl, vcpu ioctl :Parameters: an opaque platform specific structure (in/out) :Returns: 0 on success; -1 on error @@ -4726,6 +4729,10 @@ Currently, this ioctl is used for issuing Secure Encrypted Virtualization (SEV) commands on AMD Processors. The SEV commands are defined in Documentation/virt/kvm/x86/amd-memory-encryption.rst. +This ioctl is also used for issuing Trust Domain Extensions +(TDX) commands on Intel Processors. The TDX commands are defined in +Documentation/virt/kvm/intel-tdx.rst. 
+ 4.111 KVM_MEMORY_ENCRYPT_REG_REGION ----------------------------------- diff --git a/Documentation/virt/kvm/index.rst b/Documentation/virt/kvm/index.rst index e0a2c74e1043..cdb8b43ce797 100644 --- a/Documentation/virt/kvm/index.rst +++ b/Documentation/virt/kvm/index.rst @@ -18,3 +18,5 @@ KVM locking vcpu-requests review-checklist + + intel-tdx diff --git a/Documentation/virt/kvm/intel-tdx.rst b/Documentation/virt/kvm/intel-tdx.rst new file mode 100644 index 000000000000..6999b0f4f6c2 --- /dev/null +++ b/Documentation/virt/kvm/intel-tdx.rst @@ -0,0 +1,345 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=================================== +Intel Trust Domain Extensions (TDX) +=================================== + +Overview +======== +TDX stands for Trust Domain Extensions, which isolate VMs from +the virtual-machine manager (VMM)/hypervisor and any other software on +the platform. For details, see the specifications [1]_, whitepaper [2]_, +architectural extensions specification [3]_, module documentation [4]_, +loader interface specification [5]_, guest-hypervisor communication +interface [6]_, virtual firmware design guide [7]_, and other resources +([8]_, [9]_, [10]_, [11]_, and [12]_). + + +API description +=============== + +KVM_MEMORY_ENCRYPT_OP +--------------------- +:Type: vm ioctl, vcpu ioctl + +For TDX operations, KVM_MEMORY_ENCRYPT_OP is repurposed as a generic +ioctl with TDX-specific sub-ioctl commands. + +:: + + /* Trust Domain eXtension sub-ioctl() commands. */ + enum kvm_tdx_cmd_id { + KVM_TDX_CAPABILITIES = 0, + KVM_TDX_INIT_VM, + KVM_TDX_INIT_VCPU, + KVM_TDX_INIT_MEM_REGION, + KVM_TDX_FINALIZE_VM, + + KVM_TDX_CMD_NR_MAX, + }; + + struct kvm_tdx_cmd { + /* enum kvm_tdx_cmd_id */ + __u32 id; + /* flags for sub-command. If sub-command doesn't use this, set zero. */ + __u32 flags; + /* + * data for each sub-command. An immediate value or a pointer to the + * actual data in process virtual address. If sub-command doesn't use it, + * set zero. 
+ */ + __u64 data; + /* + * Auxiliary error code. The sub-command may return a TDX SEAMCALL + * status code in addition to -Exxx. + * Defined for consistency with struct kvm_sev_cmd. + */ + __u64 error; + /* Reserved: Defined for consistency with struct kvm_sev_cmd. */ + __u64 unused; + }; + +KVM_TDX_CAPABILITIES +-------------------- +:Type: vm ioctl + +Returns a subset of TDSYSINFO_STRUCT, retrieved by the TDH.SYS.INFO TDX SEAM +call, which describes the Intel TDX module. + +- id: KVM_TDX_CAPABILITIES +- flags: must be 0 +- data: pointer to struct kvm_tdx_capabilities +- error: must be 0 +- unused: must be 0 + +:: + + struct kvm_tdx_cpuid_config { + __u32 leaf; + __u32 sub_leaf; + __u32 eax; + __u32 ebx; + __u32 ecx; + __u32 edx; + }; + + struct kvm_tdx_capabilities { + __u64 attrs_fixed0; + __u64 attrs_fixed1; + __u64 xfam_fixed0; + __u64 xfam_fixed1; + + __u32 nr_cpuid_configs; + struct kvm_tdx_cpuid_config cpuid_configs[0]; + }; + + +KVM_TDX_INIT_VM +--------------- +:Type: vm ioctl + +Performs additional TDX-specific VM initialization, which corresponds to the +TDH.MNG.INIT TDX SEAM call. + +- id: KVM_TDX_INIT_VM +- flags: must be 0 +- data: pointer to struct kvm_tdx_init_vm +- error: must be 0 +- unused: must be 0 + +:: + + struct kvm_tdx_init_vm { + __u32 max_vcpus; + __u32 reserved; + __u64 attributes; + __u64 cpuid; /* pointer to struct kvm_cpuid2 */ + __u64 mrconfigid[6]; /* sha384 digest */ + __u64 mrowner[6]; /* sha384 digest */ + __u64 mrownerconfig[6]; /* sha384 digest */ + __u64 reserved[43]; /* must be zero for future extensibility */ + }; + + +KVM_TDX_INIT_VCPU +----------------- +:Type: vcpu ioctl + +Performs additional TDX-specific VCPU initialization, which corresponds to the +TDH.VP.INIT TDX SEAM call. 
+ +- id: KVM_TDX_INIT_VCPU +- flags: must be 0 +- data: initial value of the guest TD VCPU RCX +- error: must be 0 +- unused: must be 0 + +KVM_TDX_INIT_MEM_REGION +----------------------- +:Type: vm ioctl + +Encrypts a contiguous memory region, which corresponds to the TDH.MEM.PAGE.ADD +TDX SEAM call. +If the KVM_TDX_MEASURE_MEMORY_REGION flag is specified, it also extends the +measurement, which corresponds to the TDH.MR.EXTEND TDX SEAM call. + +- id: KVM_TDX_INIT_MEM_REGION +- flags: flags; + currently only KVM_TDX_MEASURE_MEMORY_REGION is defined +- data: pointer to struct kvm_tdx_init_mem_region +- error: must be 0 +- unused: must be 0 + +:: + + #define KVM_TDX_MEASURE_MEMORY_REGION (1UL << 0) + + struct kvm_tdx_init_mem_region { + __u64 source_addr; + __u64 gpa; + __u64 nr_pages; + }; + + +KVM_TDX_FINALIZE_VM +------------------- +:Type: vm ioctl + +Completes measurement of the initial TD contents and marks it ready to run, +which corresponds to the TDH.MR.FINALIZE TDX SEAM call. + +- id: KVM_TDX_FINALIZE_VM +- flags: must be 0 +- data: must be 0 +- error: must be 0 +- unused: must be 0 + +KVM TDX creation flow +===================== +In addition to the normal KVM flow, new TDX ioctls need to be called. The +control flow is as follows. + +#. system-wide capability check + + * KVM_CAP_VM_TYPES: check whether VM types are supported and whether + TDX_VM_TYPE is supported. + +#. creating VM + + * KVM_CREATE_VM + * KVM_TDX_CAPABILITIES: query if TDX is supported on the platform. + * KVM_TDX_INIT_VM: pass TDX-specific VM parameters. + +#. creating VCPU + + * KVM_CREATE_VCPU + * KVM_TDX_INIT_VCPU: pass TDX-specific VCPU parameters. + +#. initializing guest memory + + * allocate guest memory and initialize pages the same way as in the normal + KVM case. In the TDX case, additionally parse and load TDVF into guest memory. + * KVM_TDX_INIT_MEM_REGION to add and measure guest pages. + If the pages have contents from the step above, those pages need to be added. 
+ Otherwise the contents will be lost and the guest sees zero pages. 
+ * KVM_TDX_FINALIZE_VM: finalize the VM and its measurement. + This must be called after KVM_TDX_INIT_MEM_REGION. + +#. run vcpu + +Design discussion +================= + +Coexistence of normal(VMX) VM and TD VM +--------------------------------------- +Both legacy (normal VMX) VMs and new TD VMs must be able to +coexist. Otherwise the benefits of VM flexibility would be eliminated. +The main issue is that the logic of the kvm_x86_ops callbacks for +TDX is different from that for VMX. On the other hand, the variable +kvm_x86_ops is a single global variable, neither per-VM nor per-vcpu. + +Several points to consider: + + * No or minimal overhead when TDX is disabled (CONFIG_INTEL_TDX_HOST=n). + * Avoid the overhead of indirect calls via function pointers. + * Contain the changes under the arch/x86/kvm/vmx directory and share logic + with VMX for maintenance. + Even though the ways to operate on a VM (VMX instructions vs. TDX + SEAM calls) are different, the basic idea remains the same, so much + of the logic can be shared. + * Future maintenance + A huge change to kvm_x86_ops in the (near) future isn't expected, so + a centralized file is acceptable. + +- Wrapping kvm x86_ops: The current choice + + Introduce a dedicated file, arch/x86/kvm/vmx/main.c (the name + main.c is chosen just to show the main entry points for callbacks), and + wrapper functions around all the callbacks with + "if (is-tdx) tdx-callback() else vmx-callback()". + + Pros: + + - No major change in common x86 KVM code. The change is (mostly) + contained under arch/x86/kvm/vmx/. + - When TDX is disabled (CONFIG_INTEL_TDX_HOST=n), the overhead is + optimized out. + - Micro-optimization by avoiding function pointers. + + Cons: + + - A lot of boilerplate in arch/x86/kvm/vmx/main.c. + +KVM MMU Changes +--------------- +The KVM MMU needs to be enhanced to handle Secure/Shared-EPT. The +high-level execution flow is mostly the same as in the normal EPT case. 
+EPT violation/misconfiguration -> invoke TDP fault handler -> +resolve TDP fault -> resume execution. 
(or emulate MMIO) +The difference is that S-EPT is operated on (read/write) via TDX SEAM +calls, which are expensive, instead of directly reading/writing EPT entries. +One bit of the GPA (bit 51 or 47) is repurposed to indicate whether a page is +shared with the host (if set to 1) or private to the TD (if cleared to 0). + +- The current implementation + + * Reuse the existing MMU code with minimal updates, because the + execution flow is mostly the same. An additional operation, a TDX call + for S-EPT, is needed, so add hooks for it to kvm_x86_ops. + * For performance, minimize TDX SEAM calls to operate on S-EPT. When + getting the corresponding S-EPT pages/entries from a faulting GPA, don't + use a TDX SEAM call to read the S-EPT entry. Instead, create a shadow copy + in host memory. + Repurpose the existing kvm_mmu_page as the shadow copy of S-EPT and + associate S-EPT with it. + * Treat the shared bit as an attribute. Mask/unmask the bit where + necessary to keep the existing traversal code working. + Introduce kvm.arch.gfn_shared_mask and use "if (gfn_shared_mask)" + for the special case. + + * 0: for the non-TDX case + * bit 51 or 47 set: for the TDX case. + + Pros: + + - Large code reuse with minimal new hooks. + - Execution path is the same. + + Cons: + + - Complicates the existing code. + - Repurposing kvm_mmu_page as a shadow of Secure-EPT can be confusing. + +New KVM API, ioctl (sub)command, to manage TD VMs +------------------------------------------------- +Additional KVM APIs are needed to control TD VMs. The operations on TD +VMs are specific to TDX. + +- Piggyback and repurpose KVM_MEMORY_ENCRYPT_OP + + Although not all of the operations are memory encryption, repurpose it to + provide TDX-specific ioctls. + + Pros: + + - No major change in common x86 KVM code. + + Cons: + + - The operations aren't actually memory encryption, but operations + on TD VMs. + +References +========== + +.. 
[1] TDX specification + https://software.intel.com/content/www/us/en/develop/articles/intel-trust-domain-extensions.html +.. 
[2] Intel Trust Domain Extensions (Intel TDX) + https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-whitepaper-final9-17.pdf +.. [3] Intel CPU Architectural Extensions Specification + https://software.intel.com/content/dam/develop/external/us/en/documents/intel-tdx-cpu-architectural-specification.pdf +.. [4] Intel TDX Module 1.0 EAS + https://software.intel.com/content/dam/develop/external/us/en/documents/intel-tdx-module-1eas.pdf +.. [5] Intel TDX Loader Interface Specification + https://software.intel.com/content/dam/develop/external/us/en/documents/intel-tdx-seamldr-interface-specification.pdf +.. [6] Intel TDX Guest-Hypervisor Communication Interface + https://software.intel.com/content/dam/develop/external/us/en/documents/intel-tdx-guest-hypervisor-communication-interface.pdf +.. [7] Intel TDX Virtual Firmware Design Guide + https://software.intel.com/content/dam/develop/external/us/en/documents/tdx-virtual-firmware-design-guide-rev-1. +.. [8] intel public github + + * kvm TDX branch: https://github.com/intel/tdx/tree/kvm + * TDX guest branch: https://github.com/intel/tdx/tree/guest + +.. [9] tdvf + https://github.com/tianocore/edk2-staging/tree/TDVF +.. [10] KVM forum 2020: Intel Virtualization Technology Extensions to + Enable Hardware Isolated VMs + https://osseu2020.sched.com/event/eDzm/intel-virtualization-technology-extensions-to-enable-hardware-isolated-vms-sean-christopherson-intel +.. [11] Linux Security Summit EU 2020: + Architectural Extensions for Hardware Virtual Machine Isolation + to Advance Confidential Computing in Public Clouds - Ravi Sahita + & Jun Nakajima, Intel Corporation + https://osseu2020.sched.com/event/eDOx/architectural-extensions-for-hardware-virtual-machine-isolation-to-advance-confidential-computing-in-public-clouds-ravi-sahita-jun-nakajima-intel-corporation +.. 
[12] [RFCv2,00/16] KVM protected memory extension + https://lkml.org/lkml/2020/10/20/66 From patchwork Sun Oct 30 06:23:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Isaku Yamahata X-Patchwork-Id: 12916 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1667099wru; Sat, 29 Oct 2022 23:32:36 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5NhPIHiAAG9dWD5jg6ob3wcOiwOeQB/6pImyzhYl0ej1DgwseiHT0uFVs0iDiw/3Z812cb X-Received: by 2002:a62:174a:0:b0:56b:9fc2:4ebd with SMTP id 71-20020a62174a000000b0056b9fc24ebdmr8101848pfx.21.1667111555928; Sat, 29 Oct 2022 23:32:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667111555; cv=none; d=google.com; s=arc-20160816; b=oOXoz5OVBpipAw113MqW0gfOMaqgdbZDJVimvJC+9m/uA4jpRrYKOMPuA7rzcKqA5w IURVlsnObkx66nKyrOVtNhiSai7RZ99spAjIf7pIy6CuH2LRKnbk7XFX3JI3yOEaFs/W rWX1gZ/AnE43qwauRQgZWjPw9S7IREEVvTcKyYZQAXV99zzLu3SsW2pjzCvBey8120wC TRCjluegeMq1bsxmg/vnLYqKuqnnyAJuqy+0EbqSgzogIRcX4Yw86pa5TxTm1o0u6hf+ TGg156pmibf91fx0wKbXyWgGui/N+GGpmO62+9ZMbnSCltvwz3tsNFapvdFXq1CUUP93 Ap+A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=4jSIPFcEJz5nkiqSPQ0x/mp22gmwF29drI7DjqL0H3U=; b=pOHCh3jEJbLx+B86hneycGtYaGqxRDQHlfuaf9RSr8Hoe36iazBWgOKd1Cds5BpUO9 25kpubmJc2+HoJy4gSFFLnl+EVpnPd/I94s0j2l26uDZrYl+Lg1DlTnMPCJUWA5hxgL7 /o4BLPEuzduTLJv8xwKoqnE2VzG2aY9kl3dMVGIvCvOXCKA3vfOBXkOAhi41dnRt/pUv Qdb9XR87Y0dL6lNhEWaeFaVZ+rx21tbkH/Oz34fpWyolwqnJlEB5ZksK6FQ2Z4KHgkqJ VO4vi68Eldsq+0HMJrxDCaoB6HPzd72pSG/imjQp+N6w2hTnu+EdVzUtGnxobuq8Gq0i oGgQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=L4guKt9Y; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) 
smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id n3-20020a1709026a8300b0017a0e7aaf6bsi4063550plk.128.2022.10.29.23.32.23; Sat, 29 Oct 2022 23:32:35 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=L4guKt9Y; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231167AbiJ3Gb5 (ORCPT + 99 others); Sun, 30 Oct 2022 02:31:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49640 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231305AbiJ3GaU (ORCPT ); Sun, 30 Oct 2022 02:30:20 -0400 Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D7E443B2; Sat, 29 Oct 2022 23:24:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1667111091; x=1698647091; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/iL/5HKwTLM57fOI1n7l4Rg3A8WpxBIbgeC/p+evIPo=; b=L4guKt9YvSa9iSj8qXgOIJPe0LyGjg5VuvCrBzjQRSYrYaTrHRo2ZSKY s111XbVjd3ha5EylICYj5pCa6cJMPWJbQ+eOP4XXIbHMGpOIyUmRnUBjL /7ibfJwAk2CN9Cqljs8wiOJnutlWhPSZab5A2SMs1Fp5xg/K/XVJ6Ah1Z CuaqqkGYUPOwkmvsFdTMIuD2EaUToGdCLqga8wGI4c2lzRxRzrfjBXe4k w6SWY6dCRGWr1fH1ZS0jj6QSc0FT1kxJe7GzSJONZFdsx5xCyqxDLu+59 3MfuHYN3R3wOl+FWv0GUzg1E58YL7pUHk1e2OktVTq17U+bKw4R3U9UFY w==; X-IronPort-AV: 
E=McAfee;i="6500,9779,10515"; a="288436007" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="288436007" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:15 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10515"; a="878393185" X-IronPort-AV: E=Sophos;i="5.95,225,1661842800"; d="scan'208";a="878393185" Received: from ls.sc.intel.com (HELO localhost) ([143.183.96.54]) by fmsmga006-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Oct 2022 23:24:15 -0700 From: isaku.yamahata@intel.com To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini , erdemaktas@google.com, Sean Christopherson , Sagi Shahar , David Matlack , Bagas Sanjaya Subject: [PATCH v10 107/108] KVM: x86: design documentation on TDX support of x86 KVM TDP MMU Date: Sat, 29 Oct 2022 23:23:48 -0700 Message-Id: <91062ba1b723d5b866b17447e3f8f8addaa334ee.1667110240.git.isaku.yamahata@intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: References: MIME-Version: 1.0 X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_PASS,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748093167025740824?= X-GMAIL-MSGID: =?utf-8?q?1748093167025740824?= From: Isaku Yamahata Add a high level design document on TDX changes to TDP MMU. 
Signed-off-by: Isaku Yamahata
Co-developed-by: Bagas Sanjaya
Signed-off-by: Bagas Sanjaya
---
 Documentation/virt/kvm/tdx-tdp-mmu.rst | 417 +++++++++++++++++++++++++
 1 file changed, 417 insertions(+)
 create mode 100644 Documentation/virt/kvm/tdx-tdp-mmu.rst

diff --git a/Documentation/virt/kvm/tdx-tdp-mmu.rst b/Documentation/virt/kvm/tdx-tdp-mmu.rst
new file mode 100644
index 000000000000..2d91c94e6d8f
--- /dev/null
+++ b/Documentation/virt/kvm/tdx-tdp-mmu.rst
@@ -0,0 +1,417 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+Design of TDP MMU for TDX support
+=================================
+This document describes a (high level) design for TDX support in the TDP MMU of
+x86 KVM.
+
+In this document, we use "TD" or "guest TD" to differentiate it from the current
+"VM" (Virtual Machine), which is supported by KVM today.
+
+
+Background of TDX
+=================
+TD private memory is designed to hold TD private content, encrypted by the CPU
+using the TD ephemeral key. An encryption engine holds a table of encryption
+keys, and an encryption key is selected for each memory transaction based on a
+Host Key Identifier (HKID). By design, the host VMM does not have access to the
+encryption keys.
+
+In the first generation of MKTME, HKID is "stolen" from the physical address by
+allocating a configurable number of bits from the top of the physical address.
+The HKID space is partitioned into shared HKIDs for legacy MKTME accesses and
+private HKIDs for SEAM-mode-only accesses. We use 0 for the shared HKID on the
+host so that MKTME can be opaque or bypassed on the host.
+
+During TDX non-root operation (i.e. guest TD), memory accesses can be qualified
+as either shared or private, based on the value of a new SHARED bit in the Guest
+Physical Address (GPA). The CPU translates shared GPAs using the usual VMX EPT
+(Extended Page Table) or "Shared EPT" (in this document), which resides in the
+host VMM memory.
+The Shared EPT is directly managed by the host VMM - the same
+as with the current VMX. Since guest TDs usually require I/O, and the data
+exchange needs to be done via shared memory, KVM needs to use the current
+EPT functionality even for TDs.
+
+The CPU translates private GPAs using a separate Secure EPT. The Secure EPT
+pages are encrypted and integrity-protected with the TD's ephemeral private key.
+Secure EPT can be managed _indirectly_ by the host VMM, using the TDX interface
+functions (SEAMCALLs), and thus conceptually Secure EPT is a subset of EPT
+because not all functionalities are available.
+
+Since the execution of such interface functions takes much longer than
+accessing memory directly, in KVM we use the existing TDP code to mirror the
+Secure EPT for the TD. There are at least two options today in
+terms of the timing for executing such SEAMCALLs:
+
+1. synchronous, i.e. while walking the TDP page tables, or
+2. post-walk, i.e. record what needs to be done to the real Secure EPT during
+   the walk, and execute SEAMCALLs later.
+
+Option 1 seems to be more intuitive and simpler, but the Secure EPT
+concurrency rules are different from the ones of the TDP or EPT. For example,
+MEM.SEPT.RD acquires shared access to the whole Secure EPT tree of the target
+
+Secure EPT(SEPT) operations
+---------------------------
+Secure EPT is an Extended Page Table for GPA-to-HPA translation of TD private
+HPA. A Secure EPT is designed to be encrypted with the TD's ephemeral private
+key. SEPT pages are allocated by the host VMM via Intel TDX functions, but their
+content is intended to be hidden and is not architectural.
+
+Unlike the conventional EPT, the CPU can't directly read/write its entry.
+Instead, the TDX SEAMCALL API is used. Several SEAMCALLs correspond to operations
+on the EPT entry.
+
+* TDH.MEM.SEPT.ADD():
+
+  Add a secure EPT page to the secure EPT tree.
+  This corresponds to updating
+  the non-leaf EPT entry with the present bit set.
+
+* TDH.MEM.SEPT.REMOVE():
+
+  Remove the secure EPT page from the secure EPT tree. There is no
+  corresponding EPT operation.
+
+* TDH.MEM.SEPT.RD():
+
+  Read the secure EPT entry. This corresponds to reading the EPT entry as
+  memory. Please note that this is much slower than direct memory reading.
+
+* TDH.MEM.PAGE.ADD() and TDH.MEM.PAGE.AUG():
+
+  Add a private page to the secure EPT tree. This corresponds to updating the
+  leaf EPT entry with the present bit set.
+
+* TDH.MEM.PAGE.REMOVE():
+
+  Remove a private page from the secure EPT tree. There is no
+  corresponding EPT operation.
+
+* TDH.MEM.RANGE.BLOCK():
+
+  This (mostly) corresponds to clearing the present bit of the leaf EPT entry.
+  Note that the private page is still linked in the secure EPT. To remove it
+  from the secure EPT, TDH.MEM.SEPT.REMOVE() and TDH.MEM.PAGE.REMOVE() need to
+  be called.
+
+* TDH.MEM.TRACK():
+
+  Increment the TLB epoch counter. This (mostly) corresponds to EPT TLB flush.
+  Note that the private page is still linked in the secure EPT. To remove it
+  from the secure EPT, tdh_mem_page_remove() needs to be called.
+
+
+Adding private page
+-------------------
+The procedure of populating the private page looks as follows.
+
+1. TDH.MEM.SEPT.ADD(512G level)
+2. TDH.MEM.SEPT.ADD(1G level)
+3. TDH.MEM.SEPT.ADD(2M level)
+4. TDH.MEM.PAGE.AUG(4K level)
+
+Those operations correspond to updating the EPT entries.
+
+Dropping private page and TLB shootdown
+---------------------------------------
+The procedure of dropping the private page looks as follows.
+
+1. TDH.MEM.RANGE.BLOCK(4K level)
+
+   This mostly corresponds to clearing the present bit in the EPT entry. This
+   prevents (or blocks) TLB entries from being created in the future. Note that
+   the private page is still linked in the secure EPT tree and the existing
+   cache entry in the TLB isn't flushed.
+
+2. TDH.MEM.TRACK(range) and TLB shootdown
+
+   This mostly corresponds to the EPT TLB shootdown. Because all vcpus share
+   the same Secure EPT, all vcpus need to flush TLB.
+
+   * TDH.MEM.TRACK(range) by one vcpu. It increments the global internal TLB
+     epoch counter.
+
+   * send IPI to remote vcpus
+   * Other vcpus exit to the VMM from the guest TD and then re-enter with
+     TDH.VP.ENTER().
+   * TDH.VP.ENTER() checks the TLB epoch counter and, if its TLB is old,
+     flushes the TLB.
+
+   Note that only a single vcpu issues tdh_mem_track().
+
+   Note that the private page is still linked in the secure EPT tree, unlike the
+   conventional EPT.
+
+3. TDH.MEM.PAGE.PROMOTE(), TDH.MEM.PAGE.DEMOTE(), TDH.MEM.PAGE.RELOCATE(), or
+   TDH.MEM.PAGE.REMOVE()
+
+   There is no corresponding operation to the conventional EPT.
+
+   * When changing page size (e.g. 4K <-> 2M) TDH.MEM.PAGE.PROMOTE() or
+     TDH.MEM.PAGE.DEMOTE() is used. During those operations, the guest page is
+     kept referenced in the Secure EPT.
+
+   * When migrating a page, TDH.MEM.PAGE.RELOCATE() is used. This requires both
+     the source page and the destination page.
+   * When destroying the TD, TDH.MEM.PAGE.REMOVE() removes the private page from
+     the secure EPT tree. In this case TLB shootdown is not needed because vcpus
+     don't run any more.
+
+The basic idea for TDX support
+==============================
+Because shared EPT is the same as the existing EPT, use the existing logic for
+shared EPT. On the other hand, secure EPT requires additional operations
+instead of directly reading/writing of the EPT entry.
+
+On EPT violation, the KVM MMU walks down the EPT tree from the root, determines
+the EPT entry to operate on, and updates the entry. If necessary, a TLB shootdown
+is done. Because it's very slow to directly walk secure EPT by TDX SEAMCALL,
+TDH.MEM.SEPT.RD(), the mirror of secure EPT is created and maintained. Add
+hooks to KVM MMU to reuse the existing code.
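
As a rough illustration of the mirror-plus-hooks idea described above, the
following self-contained C sketch models the page-table path of a single
private GPA. It is only a sketch under stated assumptions, not the KVM
implementation: every name in it (mirror_path, mirror_hooks, fake_backend,
mirror_handle_fault) is hypothetical, and the fake backend merely counts calls
where a real backend would presumably issue TDH.MEM.SEPT.ADD() or
TDH.MEM.PAGE.AUG().

```c
#include <stdbool.h>

/* Hypothetical names throughout; none of these are KVM or TDX module symbols. */
enum sept_level { LVL_512G, LVL_1G, LVL_2M, LVL_4K, LVL_COUNT };

/* Hooks the backend provides; they stand in for the new MMU hooks. */
struct mirror_hooks {
	int (*link_table)(void *backend, enum sept_level level); /* non-leaf */
	int (*map_page)(void *backend);                           /* leaf */
};

/* Software-only mirror of one Secure EPT path for a single private GPA. */
struct mirror_path {
	bool present[LVL_COUNT];
};

/* Fake backend: only counts the "SEAMCALLs" a real backend would issue. */
struct fake_backend {
	int sept_add;
	int page_aug;
};

static int fake_link_table(void *backend, enum sept_level level)
{
	(void)level;
	((struct fake_backend *)backend)->sept_add++;
	return 0;
}

static int fake_map_page(void *backend)
{
	((struct fake_backend *)backend)->page_aug++;
	return 0;
}

/*
 * Walk the mirror from the root. Levels already present are read from the
 * mirror in host memory, so no slow read of the real Secure EPT is needed.
 * For each missing level, call the hook and then mark the mirror entry
 * present. Returns the number of entries changed, or -1 if a hook failed.
 */
static int mirror_handle_fault(struct mirror_path *m,
			       const struct mirror_hooks *hooks, void *backend)
{
	int changed = 0;

	for (int lvl = LVL_512G; lvl < LVL_COUNT; lvl++) {
		int ret;

		if (m->present[lvl])
			continue;
		if (lvl == LVL_4K)
			ret = hooks->map_page(backend);
		else
			ret = hooks->link_table(backend, (enum sept_level)lvl);
		if (ret)
			return -1;
		m->present[lvl] = true;
		changed++;
	}
	return changed;
}
```

On a first fault the sketch issues three table-link calls and one page-map
call, matching the four-step sequence in "Adding private page". A repeated
fault on the same GPA finds every level present in the mirror and issues none.
The sketch is single-threaded and ignores locking and entry freezing.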
+
+EPT violation on shared GPA
+---------------------------
+(1) EPT violation on shared GPA or zapping shared GPA
+    ::
+
+       walk down shared EPT tree (the existing code)
+                |
+                |
+                V
+       shared EPT tree (CPU refers.)
+
+(2) update the EPT entry. (the existing code)
+
+    TLB shootdown in the case of zapping.
+
+
+EPT violation on private GPA
+----------------------------
+(1) EPT violation on private GPA or zapping private GPA
+    ::
+
+       walk down the mirror of secure EPT tree (mostly same as the existing code)
+                |
+                |
+                V
+       mirror of secure EPT tree (KVM MMU software only. reuse of the existing code)
+
+(2) update the (mirrored) EPT entry. (mostly same as the existing code)
+
+(3) call the hooks with what EPT entry is changed
+    ::
+
+                |
+       NEW: hooks in KVM MMU
+                |
+                V
+       secure EPT root(CPU refers)
+
+(4) the TDX backend calls necessary TDX SEAMCALLs to update real secure EPT.
+
+The major modification is to add hooks for the TDX backend for additional
+operations, to pass down which EPT (shared or private) is used, and
+to adjust the behavior when operating on the private EPT.
+
+The following depicts the relationship.
+::
+
+                KVM                                |    TDX module
+                    |                              |          |
+       -------------+----------                    |          |
+       |                      |                    |          |
+       V                      V                    |          |
+    shared GPA           private GPA               |          |
+    CPU shared EPT pointer KVM private EPT pointer | CPU secure EPT pointer
+       |                      |                    |          |
+       |                      |                    |          |
+       V                      V                    |          V
+    shared EPT           private EPT<-------mirror----->Secure EPT
+       |                      |                    |          |
+       |                      \--------------------+------\   |
+       |                                           |      |   |
+       V                                           |      V   V
+    shared guest page                              |  private guest page
+                                                   |
+                                                   |
+    non-encrypted memory                           |  encrypted memory
+                                                   |
+
+shared EPT: CPU and KVM walk with shared GPA
+            Maintained by the existing code
+private EPT: KVM walks with private GPA
+             Maintained by the adapted existing code
+secure EPT: CPU walks with private GPA.
+            Maintained by TDX module with TDX SEAMCALLs via hooks
+
+
+Tracking private EPT page
+=========================
+Shared EPT pages are managed by struct kvm_mmu_page. They are linked in a list
+structure.
+When necessary, the list is traversed to operate on them. Private EPT
+pages have different characteristics. For example, private pages can't be
+swapped out. When shrinking memory, we'd like to traverse only shared EPT pages
+and skip private EPT pages. Likewise, page migration isn't supported for
+private pages (yet). Introduce an additional list to track shared EPT pages and
+track private EPT pages independently.
+
+At the beginning of EPT violation, the fault handler knows the faulting GPA, thus
+it knows which EPT to operate on, private or shared. If it's private EPT,
+an additional task is done. Something like "if (private) { callback a hook }".
+Since the fault handler has deep function calls, it's cumbersome to carry the
+information of which EPT is being operated on. Options to mitigate it are
+
+1. Pass the information as an argument for the function call.
+2. Record the information in struct kvm_mmu_page somehow.
+3. Record the information in vcpu structure.
+
+Option 2 was chosen, because option 1 requires modifying all the functions and
+would badly affect the normal case. Option 3 doesn't work well because in
+some cases, we need to walk both private and shared EPT.
+
+The role of the EPT page can be utilized and one bit can be carved out from
+unused bits in struct kvm_mmu_page_role. When allocating the EPT page,
+initialize the information. Mostly struct kvm_mmu_page is available because
+we're operating on EPT pages.
+
+
+The conversion of private GPA and shared GPA
+============================================
+A page of a given GPA can be assigned to only private GPA xor shared GPA at one
+time. The GPA can't be accessed simultaneously via both private GPA and shared
+GPA. On guest startup, all the GPAs are assigned as private. Guest converts
+the range of GPA to shared (or private) from private (or shared) by MapGPA
+hypercall. MapGPA hypercall takes the start GPA and the size of the region.
+If the given start GPA is shared, VMM converts the region into shared (if it's
+already shared, nop). If the start GPA is private, VMM converts the region into
+private. It implies the guest won't access the unmapped region, i.e. the private
+(or shared) region after converting to shared (or private).
+
+If the guest TD triggers an EPT violation on the already converted region, the
+access won't be allowed (loop in EPT violation) until another vcpu converts
+the region back.
+
+KVM MMU records in an xarray which GPA is allowed to be accessed, private or
+shared.
+
+
+The original TDP MMU and race condition
+=======================================
+Because vcpus share the EPT, once the EPT entry is zapped, we need to shoot down
+the TLB: send IPIs to remote vcpus, and the remote vcpus flush their own TLBs.
+Until the TLB shootdown is done, vcpus may reference the zapped guest page.
+
+TDP MMU uses the read lock of mmu_lock to mitigate vcpu contention. When the read
+lock is held, it depends on the atomic update of the EPT entry. (On the other
+hand, the legacy MMU uses the write lock.) When a vcpu is populating/zapping the
+EPT entry with the read lock held, another vcpu may be populating or zapping the
+same EPT entry at the same time.
+
+To avoid the race condition, the entry is frozen. It means the EPT entry is set
+to the special value, REMOVED_SPTE, which clears the present bit. Then, after the
+TLB shootdown, the EPT entry is updated to the final value.
+
+Concurrent zapping
+------------------
+1. read lock
+2. freeze the EPT entry (atomically set the value to REMOVED_SPTE)
+   If another vcpu froze the entry, restart page fault.
+3. TLB shootdown
+
+   * send IPI to remote vcpus
+   * TLB flush (local and remote)
+
+   For each entry update, TLB shootdown is needed because of the
+   concurrency.
+4. atomically set the EPT entry to the final value
+5. read unlock
+
+Concurrent populating
+---------------------
+In the case of populating the non-present EPT entry, atomically update the EPT
+entry.
+
+1. read lock
+
+2. atomically update the EPT entry
+   If another vcpu froze the entry or updated the entry, restart page fault.
+
+3. read unlock
+
+In the case of updating the present EPT entry (e.g. page migration), the
+operation is split into two: zapping the entry and populating the entry.
+
+1. read lock
+2. zap the EPT entry. follow the concurrent zapping case.
+3. populate the non-present EPT entry.
+4. read unlock
+
+Non-concurrent batched zapping
+------------------------------
+In some cases, zapping the ranges is done exclusively with a write lock held.
+In this case, the TLB shootdown is batched into one.
+
+1. write lock
+2. zap the EPT entries by traversing them
+3. TLB shootdown
+4. write unlock
+
+For Secure EPT, TDX SEAMCALLs are needed in addition to updating the mirrored
+EPT entry.
+
+TDX concurrent zapping
+----------------------
+Add a hook for TDX SEAMCALLs at the step of the TLB shootdown.
+
+1. read lock
+2. freeze the EPT entry (set the value to REMOVED_SPTE)
+3. TLB shootdown via a hook
+
+   * TDH.MEM.RANGE.BLOCK()
+   * TDH.MEM.TRACK()
+   * send IPI to remote vcpus
+
+4. set the EPT entry to the final value
+5. read unlock
+
+TDX concurrent populating
+-------------------------
+TDX SEAMCALLs are required in addition to operating the mirrored EPT entry. The
+frozen entry is utilized by following the zapping case to avoid the race
+condition. A hook can be added.
+
+1. read lock
+2. freeze the EPT entry
+3. hook
+
+   * TDH.MEM.SEPT.ADD() for non-leaf or TDH.MEM.PAGE.AUG() for leaf.
+
+4. set the EPT entry to the final value
+5. read unlock
+
+Without freezing the entry, the following race can happen. Suppose two vcpus
+are faulting on the same GPA and the 2M and 4K level entries aren't populated
+yet.
+
+* vcpu 1: update 2M level EPT entry
+* vcpu 2: update 4K level EPT entry
+* vcpu 2: TDX SEAMCALL to update 4K secure EPT entry => error
+* vcpu 1: TDX SEAMCALL to update 2M secure EPT entry
+
+
+TDX non-concurrent batched zapping
+----------------------------------
+For simplicity, the procedure of concurrent populating is utilized. The
+procedure can be optimized later.
+
+
+Co-existing with unmapping guest private memory
+===============================================
+TODO. This needs to be addressed.
+
+
+Restrictions or future work
+===========================
+The following features aren't supported yet at the moment.
+
+* optimizing non-concurrent zap
+* Large page
+* Page migration

From patchwork Sun Oct 30 06:23:49 2022
X-Patchwork-Submitter: Isaku Yamahata
X-Patchwork-Id: 12922
From: isaku.yamahata@intel.com
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: isaku.yamahata@intel.com, isaku.yamahata@gmail.com, Paolo Bonzini, erdemaktas@google.com, Sean Christopherson, Sagi Shahar, David Matlack
Subject: [PATCH v10 108/108] [MARKER] the end of (the first phase of) TDX KVM patch series
Date: Sat, 29 Oct 2022 23:23:49 -0700

From: Isaku Yamahata

This empty commit is to mark the end of (the first phase of) patch series
of TDX KVM support.

Signed-off-by: Isaku Yamahata
---
 .../virt/kvm/intel-tdx-layer-status.rst | 33 -------------------
 1 file changed, 33 deletions(-)
 delete mode 100644 Documentation/virt/kvm/intel-tdx-layer-status.rst

diff --git a/Documentation/virt/kvm/intel-tdx-layer-status.rst b/Documentation/virt/kvm/intel-tdx-layer-status.rst
deleted file mode 100644
index 1cec14213f69..000000000000
--- a/Documentation/virt/kvm/intel-tdx-layer-status.rst
+++ /dev/null
@@ -1,33 +0,0 @@
-.. SPDX-License-Identifier: GPL-2.0
-
-===================================
-Intel Trust Dodmain Extensions(TDX)
-===================================
-
-Layer status
-============
-What qemu can do
-----------------
-- TDX VM TYPE is exposed to Qemu.
-- Qemu can create/destroy guest of TDX vm type.
-- Qemu can create/destroy vcpu of TDX vm type.
-- Qemu can populate initial guest memory image.
-- Qemu can finalize guest TD.
-- Qemu can start to run vcpu. But vcpu can not make progress yet.
-
-Patch Layer status
-------------------
-  Patch layer                          Status
-* TDX, VMX coexistence:                 Applied
-* TDX architectural definitions:        Applied
-* TD VM creation/destruction:           Applied
-* TD vcpu creation/destruction:         Applied
-* TDX EPT violation:                    Applied
-* TD finalization:                      Applied
-* TD vcpu enter/exit:                   Applied
-* TD vcpu interrupts/exit/hypercall:    Not yet
-
-* KVM MMU GPA shared bits:              Applied
-* KVM TDP refactoring for TDX:          Applied
-* KVM TDP MMU hooks:                    Applied
-* KVM TDP MMU MapGPA:                   Applied