Message ID | 20230509135006.1604-1-yan.y.zhao@intel.com |
---|---|
State | New |
Headers |
From: Yan Zhao <yan.y.zhao@intel.com>
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: pbonzini@redhat.com, seanjc@google.com, Yan Zhao <yan.y.zhao@intel.com>
Subject: [PATCH v2 1/6] KVM: x86/mmu: add a new mmu zap helper to indicate memtype changes
Date: Tue, 9 May 2023 21:50:06 +0800
Message-Id: <20230509135006.1604-1-yan.y.zhao@intel.com>
In-Reply-To: <20230509134825.1523-1-yan.y.zhao@intel.com>
References: <20230509134825.1523-1-yan.y.zhao@intel.com>
X-Mailer: git-send-email 2.17.1 |
Series | KVM: x86/mmu: refine memtype related mmu zap |
Commit Message
Yan Zhao
May 9, 2023, 1:50 p.m. UTC
Add a helper to indicate that a kvm_zap_gfn_range() request is being made
to update memory type.

The zap can then be avoided in two cases:
1. TDP is not enabled.
2. EPT is not enabled.

This is because only the memory type of EPT leaf entries is subject to
change when noncoherent DMA, guest CR0.CD, or guest MTRR settings change.
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
 arch/x86/kvm/mmu.h     |  1 +
 arch/x86/kvm/mmu/mmu.c | 16 ++++++++++++++++
 2 files changed, 17 insertions(+)
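[Editor's note: as context for the review below, a minimal sketch of the
intended call-site pattern. The function example_update_mtrr() is
hypothetical, loosely modeled on update_mtrr() in arch/x86/kvm/mtrr.c; it is
not part of this patch.]

/*
 * Illustrative sketch only: a memtype-change path calls the new helper in
 * place of a bare kvm_zap_gfn_range(), letting the helper skip the zap
 * entirely when EPT leaf memtypes cannot be affected (shadow_memtype_mask
 * is zero, i.e. TDP/EPT is not in use).
 */
static void example_update_mtrr(struct kvm_vcpu *vcpu, gfn_t start, gfn_t end)
{
	/* Before this patch: the range was zapped unconditionally. */
	/* kvm_zap_gfn_range(vcpu->kvm, start, end); */

	/* After this patch: the zap is elided unless memtype can change. */
	kvm_zap_gfn_for_memtype(vcpu->kvm, start, end);
}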
Comments
Chao Gao
May 10, 2023, 5:30 a.m. UTC

On Tue, May 09, 2023 at 09:50:06PM +0800, Yan Zhao wrote:
>Add a helper to indicate that the kvm_zap_gfn_range() request is to update
>memory type.
>
>Then the zap can be avoided in cases:
>1. TDP is not enabled.
>2. EPT is not enabled.
>
>This is because only memory type of EPT leaf entries are subjected to
>change when noncoherent DMA/guest CR0.CD/guest MTRR settings change.
>
>Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
>---
> arch/x86/kvm/mmu.h     |  1 +
> arch/x86/kvm/mmu/mmu.c | 16 ++++++++++++++++
> 2 files changed, 17 insertions(+)
>
>diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
>index 92d5a1924fc1..a04577afbc71 100644
>--- a/arch/x86/kvm/mmu.h
>+++ b/arch/x86/kvm/mmu.h
>@@ -236,6 +236,7 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
> }
> 
> void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
>+void kvm_zap_gfn_for_memtype(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
> 
> int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu);
> 
>diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
>index c8961f45e3b1..2706754794d1 100644
>--- a/arch/x86/kvm/mmu/mmu.c
>+++ b/arch/x86/kvm/mmu/mmu.c
>@@ -6272,6 +6272,22 @@ static bool kvm_rmap_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_e
> 	return flush;
> }
> 
>+/*
>+ * Invalidate (zap) TDP SPTEs that cover GFNs from gfn_start and up to gfn_end
>+ * (not including it) for reason of memory type being updated.
>+ */
>+void kvm_zap_gfn_for_memtype(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
>+{
>+	/* Currently only memory type of EPT leaf entries are affected by
>+	 * guest CR0.CD and guest MTRR.
>+	 * So skip invalidation (zap) in other cases
>+	 */
>+	if (!shadow_memtype_mask)

Do you need to check kvm_arch_has_noncoherent_dma()? if yes, maybe just extract
the first if() statement and its associated comment from kvm_tdp_page_fault()?

>+		return;
>+
>+	kvm_zap_gfn_range(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL));

This should be:

	kvm_zap_gfn_range(kvm, start, end);

>+}
>+
> /*
>  * Invalidate (zap) SPTEs that cover GFNs from gfn_start and up to gfn_end
>  * (not including it)
>--
>2.17.1
Yan Zhao
May 10, 2023 UTC

On Wed, May 10, 2023 at 01:30:29PM +0800, Chao Gao wrote:
> On Tue, May 09, 2023 at 09:50:06PM +0800, Yan Zhao wrote:

[... quoted patch trimmed ...]

> >+	if (!shadow_memtype_mask)
> 
> Do you need to check kvm_arch_has_noncoherent_dma()? if yes, maybe just extract
> the first if() statement and its associated comment from kvm_tdp_page_fault()?
> 
This kvm_zap_gfn_for_memtype() helper will be called in
kvm_arch_unregister_noncoherent_dma() as well when noncoherent_dma_count
goes from 1 to 0.
So, checking kvm_arch_has_noncoherent_dma() is not desired.

> >+		return;
> >+
> >+	kvm_zap_gfn_range(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
> 
> This should be:
> 
> 	kvm_zap_gfn_range(kvm, start, end);
> 
Oops. sorry about this mistake. Will fix it in next version.
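[Editor's note: to make the ordering issue above concrete, a sketch of the
call site Yan describes. On the 1 -> 0 transition the count has already been
decremented by the time the zap runs, so kvm_arch_has_noncoherent_dma()
reports false at exactly the moment the zap is needed; a check inside the
helper would therefore wrongly skip it. The code below is an assumption
based on Yan's plan for patch 4, shown later in this thread, not the posted
patch.]

void kvm_arch_unregister_noncoherent_dma(struct kvm *kvm)
{
	/*
	 * The count is already zero here on the 1 -> 0 transition, so the
	 * helper cannot rely on kvm_arch_has_noncoherent_dma() internally.
	 */
	if (!atomic_dec_return(&kvm->arch.noncoherent_dma_count))
		kvm_zap_gfn_for_memtype(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
}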
Sean Christopherson
May 23, 2023, 10:51 p.m. UTC

On Wed, May 10, 2023, Yan Zhao wrote:
> On Wed, May 10, 2023 at 01:30:29PM +0800, Chao Gao wrote:
> > On Tue, May 09, 2023 at 09:50:06PM +0800, Yan Zhao wrote:

[... quoted patch trimmed ...]

> > >+	if (!shadow_memtype_mask)
> > 
> > Do you need to check kvm_arch_has_noncoherent_dma()? if yes, maybe just extract
> > the first if() statement and its associated comment from kvm_tdp_page_fault()?
> > 
> This kvm_zap_gfn_for_memtype() helper will be called in
> kvm_arch_unregister_noncoherent_dma() as well when noncoherent_dma_count
> goes from 1 to 0.
> So, checking kvm_arch_has_noncoherent_dma() is not desired.
> 
> > >+		return;
> > >+
> > >+	kvm_zap_gfn_range(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
> > 
> > This should be:
> > 
> > 	kvm_zap_gfn_range(kvm, start, end);
> > 
> Oops. sorry about this mistake. Will fix it in next version.

Rather than add a helper to zap "for memtype", add a helper to query if KVM honors
guest MTRRs.  The "for memtype" isn't intuitive and ends up being incomplete.
That will also allow for deduplicating other code (and a comment).  Waiting until
the zap also wastes cycles in the MTRR case, there's no need to compute start and
end if the zap will ultimately be skipped.

E.g. over two or three patches:

---
 arch/x86/kvm/mmu.h     |  1 +
 arch/x86/kvm/mmu/mmu.c | 19 ++++++++++++++-----
 arch/x86/kvm/mtrr.c    |  2 +-
 arch/x86/kvm/x86.c     |  2 +-
 4 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 92d5a1924fc1..c3c50ec9d9db 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -103,6 +103,7 @@ static inline u8 kvm_get_shadow_phys_bits(void)
 void kvm_mmu_set_mmio_spte_mask(u64 mmio_value, u64 mmio_mask, u64 access_mask);
 void kvm_mmu_set_me_spte_mask(u64 me_value, u64 me_mask);
 void kvm_mmu_set_ept_masks(bool has_ad_bits, bool has_exec_only);
+bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm);
 
 void kvm_init_mmu(struct kvm_vcpu *vcpu);
 void kvm_init_shadow_npt_mmu(struct kvm_vcpu *vcpu, unsigned long cr0,
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c8961f45e3b1..a2b1723bc6a4 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4512,12 +4512,10 @@ static int kvm_tdp_mmu_page_fault(struct kvm_vcpu *vcpu,
 }
 #endif
 
-int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
+bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm)
 {
 	/*
-	 * If the guest's MTRRs may be used to compute the "real" memtype,
-	 * restrict the mapping level to ensure KVM uses a consistent memtype
-	 * across the entire mapping.  If the host MTRRs are ignored by TDP
+	 * If the TDP is enabled, the host MTRRs are ignored by TDP
 	 * (shadow_memtype_mask is non-zero), and the VM has non-coherent DMA
 	 * (DMA doesn't snoop CPU caches), KVM's ABI is to honor the memtype
 	 * from the guest's MTRRs so that guest accesses to memory that is
@@ -4526,7 +4524,18 @@ int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 	 * Note, KVM may still ultimately ignore guest MTRRs for certain PFNs,
 	 * e.g. KVM will force UC memtype for host MMIO.
 	 */
-	if (shadow_memtype_mask && kvm_arch_has_noncoherent_dma(vcpu->kvm)) {
+	return tdp_enabled && shadow_memtype_mask &&
+	       kvm_arch_has_noncoherent_dma(kvm);
+}
+
+int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
+{
+	/*
+	 * If the guest's MTRRs may be used to compute the "real" memtype,
+	 * restrict the mapping level to ensure KVM uses a consistent memtype
+	 * across the entire mapping.
+	 */
+	if (kvm_mmu_honors_guest_mtrrs(vcpu->kvm)) {
 		for ( ; fault->max_level > PG_LEVEL_4K; --fault->max_level) {
 			int page_num = KVM_PAGES_PER_HPAGE(fault->max_level);
 			gfn_t base = gfn_round_for_level(fault->gfn,
diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
index 3eb6e7f47e96..a67c28a56417 100644
--- a/arch/x86/kvm/mtrr.c
+++ b/arch/x86/kvm/mtrr.c
@@ -320,7 +320,7 @@ static void update_mtrr(struct kvm_vcpu *vcpu, u32 msr)
 	struct kvm_mtrr *mtrr_state = &vcpu->arch.mtrr_state;
 	gfn_t start, end;
 
-	if (!tdp_enabled || !kvm_arch_has_noncoherent_dma(vcpu->kvm))
+	if (!kvm_mmu_honors_guest_mtrrs(vcpu->kvm))
 		return;
 
 	if (!mtrr_is_enabled(mtrr_state) && msr != MSR_MTRRdefType)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 076d50f6321b..977dceb7ba7e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -942,7 +942,7 @@ void kvm_post_set_cr0(struct kvm_vcpu *vcpu, unsigned long old_cr0, unsigned lon
 	kvm_mmu_reset_context(vcpu);
 
 	if (((cr0 ^ old_cr0) & X86_CR0_CD) &&
-	    kvm_arch_has_noncoherent_dma(vcpu->kvm) &&
+	    kvm_mmu_honors_guest_mtrrs(vcpu->kvm) &&
 	    !kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED))
 		kvm_zap_gfn_range(vcpu->kvm, 0, ~0ULL);
 }

base-commit: 8d4a1b4011c125b1510254b724433aaae746ce14
--
Yan Zhao
May 24, 2023 UTC

On Tue, May 23, 2023 at 03:51:49PM -0700, Sean Christopherson wrote:
> Rather than add a helper to zap "for memtype", add a helper to query if KVM honors
> guest MTRRs.  The "for memtype" isn't intuitive and ends up being incomplete.
> That will also allow for deduplicating other code (and a comment).  Waiting until
> the zap also wastes cycles in the MTRR case, there's no need to compute start and
> end if the zap will ultimately be skipped.
Agreed.

> E.g. over two or three patches:

[... quoted mmu.h and mmu.c hunks trimmed ...]

> diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
> index 3eb6e7f47e96..a67c28a56417 100644
> --- a/arch/x86/kvm/mtrr.c
> +++ b/arch/x86/kvm/mtrr.c
> @@ -320,7 +320,7 @@ static void update_mtrr(struct kvm_vcpu *vcpu, u32 msr)
> 	struct kvm_mtrr *mtrr_state = &vcpu->arch.mtrr_state;
> 	gfn_t start, end;
> 
> -	if (!tdp_enabled || !kvm_arch_has_noncoherent_dma(vcpu->kvm))
> +	if (!kvm_mmu_honors_guest_mtrrs(vcpu->kvm))
Could we also add another helper kvm_mmu_cap_honors_guest_mtrrs(), which
does not check kvm_arch_has_noncoherent_dma()?

+static inline bool kvm_mmu_cap_honors_guest_mtrrs(struct kvm *kvm)
+{
+	return !!shadow_memtype_mask;
+}

This is because in patch 4 I plan to do the EPT zap when
noncoherent_dma_count goes from 1 to 0.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 41d7bb51a297..ad0c43d7f532 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13146,13 +13146,19 @@ EXPORT_SYMBOL_GPL(kvm_arch_has_assigned_device);
 
 void kvm_arch_register_noncoherent_dma(struct kvm *kvm)
 {
-	atomic_inc(&kvm->arch.noncoherent_dma_count);
+	if (atomic_inc_return(&kvm->arch.noncoherent_dma_count) == 1) {
+		if (kvm_mmu_cap_honors_guest_mtrrs(kvm))
+			kvm_zap_gfn_range(kvm, 0, ~0ULL);
+	}
 }
 EXPORT_SYMBOL_GPL(kvm_arch_register_noncoherent_dma);
 
 void kvm_arch_unregister_noncoherent_dma(struct kvm *kvm)
 {
-	atomic_dec(&kvm->arch.noncoherent_dma_count);
+	if (!atomic_dec_return(&kvm->arch.noncoherent_dma_count)) {
+		if (kvm_mmu_cap_honors_guest_mtrrs(kvm))
+			kvm_zap_gfn_range(kvm, 0, ~0ULL);
+	}
 }
 EXPORT_SYMBOL_GPL(kvm_arch_unregister_noncoherent_dma);

> 		return;
> 
> 	if (!mtrr_is_enabled(mtrr_state) && msr != MSR_MTRRdefType)

[... remainder of quoted diff trimmed ...]
Sean Christopherson
May 24, 2023, 2:50 p.m. UTC

On Wed, May 24, 2023, Yan Zhao wrote:
> On Tue, May 23, 2023 at 03:51:49PM -0700, Sean Christopherson wrote:
> > diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
> > index 3eb6e7f47e96..a67c28a56417 100644
> > --- a/arch/x86/kvm/mtrr.c
> > +++ b/arch/x86/kvm/mtrr.c
> > @@ -320,7 +320,7 @@ static void update_mtrr(struct kvm_vcpu *vcpu, u32 msr)
> > 	struct kvm_mtrr *mtrr_state = &vcpu->arch.mtrr_state;
> > 	gfn_t start, end;
> > 
> > -	if (!tdp_enabled || !kvm_arch_has_noncoherent_dma(vcpu->kvm))
> > +	if (!kvm_mmu_honors_guest_mtrrs(vcpu->kvm))
> Could we also add another helper kvm_mmu_cap_honors_guest_mtrrs(), which
> does not check kvm_arch_has_noncoherent_dma()?
> 
> +static inline bool kvm_mmu_cap_honors_guest_mtrrs(struct kvm *kvm)
> +{
> +	return !!shadow_memtype_mask;
> +}
> 
> This is because in patch 4 I plan to do the EPT zap when
> noncoherent_dma_count goes from 1 to 0.

Hrm, the 1->0 transition is annoying.  Rather than trying to capture the "everything
except non-coherent DMA" aspect, what about this?

mmu.c:

bool __kvm_mmu_honors_guest_mtrrs(struct kvm *kvm, bool vm_has_noncoherent_dma)
{
	/*
	 * If the TDP is enabled, the host MTRRs are ignored by TDP
	 * (shadow_memtype_mask is non-zero), and the VM has non-coherent DMA
	 * (DMA doesn't snoop CPU caches), KVM's ABI is to honor the memtype
	 * from the guest's MTRRs so that guest accesses to memory that is
	 * DMA'd aren't cached against the guest's wishes.
	 *
	 * Note, KVM may still ultimately ignore guest MTRRs for certain PFNs,
	 * e.g. KVM will force UC memtype for host MMIO.
	 */
	return vm_has_noncoherent_dma && tdp_enabled && shadow_memtype_mask;
}

mmu.h:

bool __kvm_mmu_honors_guest_mtrrs(struct kvm *kvm, bool vm_has_noncoherent_dma);

static inline bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm)
{
	return __kvm_mmu_honors_guest_mtrrs(kvm, kvm_arch_has_noncoherent_dma(kvm));
}

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 41d7bb51a297..ad0c43d7f532 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -13146,13 +13146,19 @@ EXPORT_SYMBOL_GPL(kvm_arch_has_assigned_device);
> 
>  void kvm_arch_register_noncoherent_dma(struct kvm *kvm)
>  {
> -	atomic_inc(&kvm->arch.noncoherent_dma_count);
> +	if (atomic_inc_return(&kvm->arch.noncoherent_dma_count) == 1) {
> +		if (kvm_mmu_cap_honors_guest_mtrrs(kvm))
> +			kvm_zap_gfn_range(kvm, 0, ~0ULL);

No need for multiple if statements.  Though rather than have identical code in
both the start/end paths, how about this?  That provides a single location for a
comment.  Or maybe first/last instead of start/end?

static void kvm_noncoherent_dma_start_or_end(struct kvm *kvm)
{
	/* comment goes here. */
	if (__kvm_mmu_honors_guest_mtrrs(kvm, true))
		kvm_zap_gfn_range(kvm, 0, ~0ULL);
}

void kvm_arch_register_noncoherent_dma(struct kvm *kvm)
{
	if (atomic_inc_return(&kvm->arch.noncoherent_dma_count) == 1)
		kvm_noncoherent_dma_start_or_end(kvm);
}
EXPORT_SYMBOL_GPL(kvm_arch_register_noncoherent_dma);

void kvm_arch_unregister_noncoherent_dma(struct kvm *kvm)
{
	if (!atomic_dec_return(&kvm->arch.noncoherent_dma_count))
		kvm_noncoherent_dma_start_or_end(kvm);
}
EXPORT_SYMBOL_GPL(kvm_arch_unregister_noncoherent_dma);
Yan Zhao
May 25, 2023 UTC

On Wed, May 24, 2023 at 07:50:24AM -0700, Sean Christopherson wrote:
> Hrm, the 1->0 transition is annoying.  Rather than trying to capture the "everything
> except non-coherent DMA" aspect, what about this?
> 
> mmu.c:
> 
[... quoted __kvm_mmu_honors_guest_mtrrs() body trimmed ...]
> 
> mmu.h:
> 
> bool __kvm_mmu_honors_guest_mtrrs(struct kvm *kvm, bool vm_has_noncoherent_dma);
> 
> static inline bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm)
> {
> 	return __kvm_mmu_honors_guest_mtrrs(kvm, kvm_arch_has_noncoherent_dma(kvm));
> }
This should work and it centralizes the comments into one place, though I dislike
having to pass true as vm_has_noncoherent_dma in case of 1->0 transition. :)

> No need for multiple if statements.  Though rather than have identical code in
> both the start/end paths, how about this?  That provides a single location for a
> comment.  Or maybe first/last instead of start/end?
> 
> static void kvm_noncoherent_dma_start_or_end(struct kvm *kvm)
What does start_or_end or first_or_last stand for?

> {
> 	/* comment goes here. */
> 	if (__kvm_mmu_honors_guest_mtrrs(kvm, true))
> 		kvm_zap_gfn_range(kvm, 0, ~0ULL);
> }
> 
> void kvm_arch_register_noncoherent_dma(struct kvm *kvm)
> {
> 	if (atomic_inc_return(&kvm->arch.noncoherent_dma_count) == 1)
> 		kvm_noncoherent_dma_start_or_end(kvm);
> }
> EXPORT_SYMBOL_GPL(kvm_arch_register_noncoherent_dma);
> 
> void kvm_arch_unregister_noncoherent_dma(struct kvm *kvm)
> {
> 	if (!atomic_dec_return(&kvm->arch.noncoherent_dma_count))
> 		kvm_noncoherent_dma_start_or_end(kvm);
> }
> EXPORT_SYMBOL_GPL(kvm_arch_unregister_noncoherent_dma);
Sean Christopherson
May 25, 2023, 3:54 p.m. UTC

On Thu, May 25, 2023, Yan Zhao wrote:
> On Wed, May 24, 2023 at 07:50:24AM -0700, Sean Christopherson wrote:
> > bool __kvm_mmu_honors_guest_mtrrs(struct kvm *kvm, bool vm_has_noncoherent_dma);
> > 
> > static inline bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm)
> > {
> > 	return __kvm_mmu_honors_guest_mtrrs(kvm, kvm_arch_has_noncoherent_dma(kvm));
> > }
> 
> This should work and it centralizes the comments into one place, though I dislike
> having to pass true as vm_has_noncoherent_dma in case of 1->0 transition. :)

Yeah, I don't love it either, but the whole 1=>0 transition is awkward.  FWIW,
KVM doesn't strictly need to zap in that case since the guest isn't relying on
WB for functionality, i.e. we could just skip it.

[... quoted x86.c diff trimmed ...]

> > No need for multiple if statements.  Though rather than have identical code in
> > both the start/end paths, how about this?  That provides a single location for a
> > comment.  Or maybe first/last instead of start/end?
> > 
> > static void kvm_noncoherent_dma_start_or_end(struct kvm *kvm)
> What does start_or_end or first_or_last stand for?

Start/End of device (un)assignment, or First/Last device (un)assigned.  Definitely
feel free to pick a better name.
Yan Zhao

On Thu, May 25, 2023 at 08:54:15AM -0700, Sean Christopherson wrote:
> On Thu, May 25, 2023, Yan Zhao wrote:
> > On Wed, May 24, 2023 at 07:50:24AM -0700, Sean Christopherson wrote:
> > > bool __kvm_mmu_honors_guest_mtrrs(struct kvm *kvm, bool vm_has_noncoherent_dma);
> > > 
> > > static inline bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm)
> > > {
> > > 	return __kvm_mmu_honors_guest_mtrrs(kvm, kvm_arch_has_noncoherent_dma(kvm));
> > > }
> > 
> > This should work and it centralizes the comments into one place, though I dislike
> > having to pass true as vm_has_noncoherent_dma in case of 1->0 transition. :)
> 
> Yeah, I don't love it either, but the whole 1=>0 transition is awkward.  FWIW,
> KVM doesn't strictly need to zap in that case since the guest isn't relying on
> WB for functionality, i.e. we could just skip it.

I think zap when 1=>0 transition is still useful.
E.g. if this non-coherent DMA is unassigned
(1) when CR0.CD=1 (KVM_X86_QUIRK_CD_NW_CLEARED is not enabled), or
(2) when CR0.CD=0 and MTRRs are disabled,
it's better to zap the UC ranges for better performance.
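[Editor's note: Yan's two cases map directly onto the EPT memtype selection
logic. The sketch below is a simplified paraphrase of vmx_get_mt_mask() from
arch/x86/kvm/vmx/vmx.c of roughly this era, not a verbatim copy; details of
the real function may differ. It shows why UC SPTEs installed while
noncoherent DMA was active linger after the last device is unassigned until
they are zapped.]

static u8 example_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
{
	if (is_mmio)
		return MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT;

	/* No noncoherent DMA: force WB, ignoring guest CD/MTRR state. */
	if (!kvm_arch_has_noncoherent_dma(vcpu->kvm))
		return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) |
		       VMX_EPT_IPAT_BIT;

	/* Noncoherent DMA + CR0.CD=1: UC unless the quirk forces WB (case 1). */
	if (kvm_read_cr0(vcpu) & X86_CR0_CD) {
		u8 cache = kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED) ?
			   MTRR_TYPE_WRBACK : MTRR_TYPE_UNCACHABLE;

		return (cache << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
	}

	/* Otherwise honor guest MTRRs; disabled MTRRs yield UC (case 2). */
	return kvm_mtrr_get_guest_memory_type(vcpu, gfn) << VMX_EPT_MT_EPTE_SHIFT;
}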
Yan Zhao
May 30, 2023 UTC

On Thu, May 25, 2023 at 08:54:15AM -0700, Sean Christopherson wrote:
> > > static void kvm_noncoherent_dma_start_or_end(struct kvm *kvm)
> > What does start_or_end or first_or_last stand for?
> 
> Start/End of device (un)assignment, or First/Last device (un)assigned.  Definitely
> feel free to pick a better name.
Hi Sean,
start_or_end gave me a first impression that it stands for gfn_start, gfn_end.
so currently I changed it to start/stop to avoid confusion.

+static void kvm_noncoherent_dma_assignment_start_or_stop(struct kvm *kvm)
+{
+	/*
+	 * Only necessary to zap KVM TDP if guest MTRR is honored.
+	 * As whether a VM has noncoherent DMA can affect whether
+	 * KVM honors guest MTRR and the resulting memory type in TDP,
+	 * specify its value as true here to test if guest MTRR
+	 * is honored after the assignment starts or
+	 * was honored before the assignment stops.
+	 */
+	if (kvm_mmu_honors_guest_mtrrs(kvm, true))
+		kvm_zap_gfn_range(kvm, 0, ~0ULL);
+}

And I combined the __kvm_mmu_honors_guest_mtrrs() into
kvm_mmu_honors_guest_mtrrs(). Not sure if you like it :)

+/*
+ * Returns if KVM honors guest MTRRs
+ * @override_vm_has_noncoherent_dma: Allow caller to override non-coherent DMA
+ *                                   status returned from
+ *                                   kvm_arch_has_noncoherent_dma()
+ */
+bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm,
+				bool override_vm_has_noncoherent_dma)
+{
+	bool noncoherent_dma = override_vm_has_noncoherent_dma ? true :
+				kvm_arch_has_noncoherent_dma(kvm);
+
+	/*
+	 * If the TDP is enabled, the host MTRRs are ignored by TDP
+	 * (shadow_memtype_mask is non-zero), and the VM has non-coherent DMA
+	 * (DMA doesn't snoop CPU caches), KVM's ABI is to honor the memtype
+	 * from the guest's MTRRs so that guest accesses to memory that is
+	 * DMA'd aren't cached against the guest's wishes.
+	 *
+	 * Note, KVM may still ultimately ignore guest MTRRs for certain PFNs,
+	 * e.g. KVM will force UC memtype for host MMIO.
+	 */
+	return noncoherent_dma && tdp_enabled && shadow_memtype_mask;
+}
+

Thanks
Yan
Sean Christopherson
May 30, 2023, 11:51 p.m. UTC

On Tue, May 30, 2023, Yan Zhao wrote:
> On Thu, May 25, 2023 at 08:54:15AM -0700, Sean Christopherson wrote:
> And I combined the __kvm_mmu_honors_guest_mtrrs() into
> kvm_mmu_honors_guest_mtrrs(). Not sure if you like it :)

I would prefer to provide a separate inner helper, mainly so that the common
case callers don't need to pass %false.  I don't love passing bools, but it's
tolerable for a one-off use of an inner helper.

> +/*
> + * Returns if KVM honors guest MTRRs
> + * @override_vm_has_noncoherent_dma: Allow caller to override non-coherent DMA
> + *                                   status returned from
> + *                                   kvm_arch_has_noncoherent_dma()
> + */
> +bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm,
> +				bool override_vm_has_noncoherent_dma)
> +{
> +	bool noncoherent_dma = override_vm_has_noncoherent_dma ? true :
> +				kvm_arch_has_noncoherent_dma(kvm);

The "override" name is confusing, e.g. it won't be clear when it's safe/correct
for a new caller to override kvm_arch_has_noncoherent_dma().  If we go with a
single helper, I could live with:

bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm, bool stopping_noncoherent_dma)
{
	bool noncoherent_dma = stopping_noncoherent_dma ||
			       kvm_arch_has_noncoherent_dma(kvm);

	...
}

but that makes it awkward to use common code for start+stop assignment, and as
above there are three "normal" callers that would have to pass magic %false
values regardless of the name.
Yan Zhao

On Tue, May 30, 2023 at 04:51:25PM -0700, Sean Christopherson wrote:
> On Tue, May 30, 2023, Yan Zhao wrote:
> > On Thu, May 25, 2023 at 08:54:15AM -0700, Sean Christopherson wrote:
> > And I combined the __kvm_mmu_honors_guest_mtrrs() into
> > kvm_mmu_honors_guest_mtrrs(). Not sure if you like it :)
> 
> I would prefer to provide a separate inner helper, mainly so that the common
> case callers don't need to pass %false.  I don't love passing bools, but it's
> tolerable for a one-off use of an inner helper.
Ok. Will follow this way. It's reasonable :)

> The "override" name is confusing, e.g. it won't be clear when it's safe/correct
> for a new caller to override kvm_arch_has_noncoherent_dma().  If we go with a
> single helper, I could live with:
> 
> bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm, bool stopping_noncoherent_dma)
> {
> 	bool noncoherent_dma = stopping_noncoherent_dma ||
> 			       kvm_arch_has_noncoherent_dma(kvm);
> 
> 	...
> }
> 
> but that makes it awkward to use common code for start+stop assignment, and as
> above there are three "normal" callers that would have to pass magic %false
> values regardless of the name.
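[Editor's note: pulling the thread together, a sketch of the shape the
discussion converged on: an inner helper taking an explicit non-coherent DMA
flag, a common-case wrapper, and a single start/stop handler for device
(un)assignment. The names follow the exchange above, but the next revision
of the series may differ.]

/* mmu.h: inner helper plus the common-case wrapper. */
bool __kvm_mmu_honors_guest_mtrrs(struct kvm *kvm, bool vm_has_noncoherent_dma);

static inline bool kvm_mmu_honors_guest_mtrrs(struct kvm *kvm)
{
	return __kvm_mmu_honors_guest_mtrrs(kvm, kvm_arch_has_noncoherent_dma(kvm));
}

/* x86.c: zap on both the 0 -> 1 and 1 -> 0 noncoherent DMA transitions. */
static void kvm_noncoherent_dma_assignment_start_or_stop(struct kvm *kvm)
{
	/*
	 * Pass true for the non-coherent DMA flag: the transition itself is
	 * what makes guest MTRRs start or stop being honored, and the count
	 * no longer reflects that on the 1 -> 0 path.
	 */
	if (__kvm_mmu_honors_guest_mtrrs(kvm, true))
		kvm_zap_gfn_range(kvm, 0, ~0ULL);
}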
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 92d5a1924fc1..a04577afbc71 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -236,6 +236,7 @@ static inline u8 permission_fault(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
 }
 
 void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
+void kvm_zap_gfn_for_memtype(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end);
 
 int kvm_arch_write_log_dirty(struct kvm_vcpu *vcpu);
 
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c8961f45e3b1..2706754794d1 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6272,6 +6272,22 @@ static bool kvm_rmap_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_e
 	return flush;
 }
 
+/*
+ * Invalidate (zap) TDP SPTEs that cover GFNs from gfn_start and up to gfn_end
+ * (not including it) for reason of memory type being updated.
+ */
+void kvm_zap_gfn_for_memtype(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
+{
+	/* Currently only memory type of EPT leaf entries are affected by
+	 * guest CR0.CD and guest MTRR.
+	 * So skip invalidation (zap) in other cases
+	 */
+	if (!shadow_memtype_mask)
+		return;
+
+	kvm_zap_gfn_range(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
+}
+
 /*
  * Invalidate (zap) SPTEs that cover GFNs from gfn_start and up to gfn_end
  * (not including it)