[v3,08/11] KVM: x86: move vmx code to get EPT memtype when CR0.CD=1 to x86 common code
Message ID | 20230616023858.7503-1-yan.y.zhao@intel.com |
---|---|
State | New |
Headers | show |
Series | KVM: x86/mmu: refine memtype related mmu zap |
Commit Message
Yan Zhao
June 16, 2023, 2:38 a.m. UTC
Move the code in vmx.c that determines the cache-disabled memory type when non-coherent DMA is present into x86 common code.
This is a preparation patch for the later implementation of fine-grained GFN zap on CR0.CD toggles when guest MTRRs are honored.
No functional change intended.
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
arch/x86/kvm/mtrr.c | 19 +++++++++++++++++++
arch/x86/kvm/vmx/vmx.c | 10 +++++-----
arch/x86/kvm/x86.h | 1 +
3 files changed, 25 insertions(+), 5 deletions(-)
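For readers following the series, the decision the new helper encodes can be modeled as a small standalone table. The sketch below is illustrative only, not kernel code: cd_memory_type(), honors_mtrrs, and has_quirk are stand-ins for kvm_mtrr_get_cd_memory_type(), kvm_mmu_honors_guest_mtrrs(), and kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED).

	#include <stdbool.h>
	#include <stdint.h>

	/* x86 MTRR memory type encodings. */
	#define MTRR_TYPE_UNCACHABLE 0
	#define MTRR_TYPE_WRBACK     6

	/*
	 * Standalone model of kvm_mtrr_get_cd_memory_type(): given whether
	 * guest MTRRs are honored and whether KVM_X86_QUIRK_CD_NW_CLEARED is
	 * enabled, pick the memtype and "ignore PAT" flag used while CR0.CD=1.
	 */
	static void cd_memory_type(bool honors_mtrrs, bool has_quirk,
				   uint8_t *type, bool *ipat)
	{
		if (!honors_mtrrs) {
			/* No non-coherent DMA: WB, ignore guest PAT. */
			*type = MTRR_TYPE_WRBACK;
			*ipat = true;
		} else if (!has_quirk) {
			/* Faithful "No Fill" emulation: UC, ignore guest PAT. */
			*type = MTRR_TYPE_UNCACHABLE;
			*ipat = true;
		} else {
			/* Quirk enabled: WB, and guest PAT is still honored. */
			*type = MTRR_TYPE_WRBACK;
			*ipat = false;
		}
	}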
Comments
On Fri, Jun 16, 2023, Yan Zhao wrote:
> Move code in vmx.c to get cache disabled memtype when non-coherent DMA
> present to x86 common code.
>
> This is the preparation patch for later implementation of fine-grained gfn
> zap for CR0.CD toggles when guest MTRRs are honored.
>
> No functional change intended.
>
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> ---
>  arch/x86/kvm/mtrr.c    | 19 +++++++++++++++++++
>  arch/x86/kvm/vmx/vmx.c | 10 +++++-----
>  arch/x86/kvm/x86.h     |  1 +
>  3 files changed, 25 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
> index 3ce58734ad22..b35dd0bc9cad 100644
> --- a/arch/x86/kvm/mtrr.c
> +++ b/arch/x86/kvm/mtrr.c
> @@ -721,3 +721,22 @@ bool kvm_mtrr_check_gfn_range_consistency(struct kvm_vcpu *vcpu, gfn_t gfn,
>
>  	return type == mtrr_default_type(mtrr_state);
>  }
> +
> +void kvm_mtrr_get_cd_memory_type(struct kvm_vcpu *vcpu, u8 *type, bool *ipat)

Hmm, I'm not convinced that this logic is subtle enough to warrant a common
helper with out params (I *really* don't like out params :-) ).

UC, or more specifically CR0.CD=1 on VMX without the quirk, is a super
special case, because to faithfully emulate "No Fill" mode, KVM needs to
ignore guest PAT (stupid WC).

I don't love having the same logic/assumptions in multiple places, but the
CR0.CD=1 behavior is so rigidly tied to what KVM must do that I think trying
to provide a common helper makes the code more complex than it needs to be.

If we open code the logic in the MTRR helper, then I think it can be
distilled down to:

	struct kvm_mtrr *mtrr_state = &vcpu->arch.mtrr_state;
	bool mtrr_enabled = mtrr_is_enabled(mtrr_state);
	u8 default_type;

	/*
	 * Faithfully emulating CR0.CD=1 on VMX requires ignoring guest PAT, as
	 * WC in the PAT overrides UC in the MTRRs.  Zap all SPTEs so that KVM
	 * will once again start honoring guest PAT.
	 */
	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED))
		return kvm_mtrr_zap_gfn_range(vcpu, gpa_to_gfn(0), gpa_to_gfn(~0ULL));

	default_type = mtrr_enabled ? mtrr_default_type(mtrr_state) :
				      mtrr_disabled_type(vcpu);

	if (default_type != MTRR_TYPE_WRBACK)
		return kvm_mtrr_zap_gfn_range(vcpu, gpa_to_gfn(0), gpa_to_gfn(~0ULL));

	if (mtrr_enabled) {
		if (gather_non_wb_fixed_mtrrs(vcpu, MTRR_TYPE_WRBACK))
			goto fail;

		if (gather_non_wb_var_mtrrs(vcpu, MTRR_TYPE_WRBACK))
			goto fail;

		kvm_zap_or_wait_mtrrs(vcpu->kvm);
	}

and this patch goes away.

> +{
> +	/*
> +	 * this routine is supposed to be called when guest mtrrs are honored
> +	 */
> +	if (unlikely(!kvm_mmu_honors_guest_mtrrs(vcpu->kvm))) {

I don't think this is worth checking, e.g. this would be WARN-worthy if it
weren't for an otherwise benign race with device (un)assignment.

> +		*type = MTRR_TYPE_WRBACK;
> +		*ipat = true;
> +	} else if (unlikely(!kvm_check_has_quirk(vcpu->kvm,

Eh, drop the "unlikely()" annotations.  AIUI, they almost never provide
actual performance benefits, and I dislike unnecessarily speculating on what
userspace is doing when it comes to code (though I 100% agree that this is
definitely unlikely).
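As background for the IPAT discussion above: the EPT memtype byte packs the cache type and the "ignore PAT" bit as in the sketch below. The constant values are assumptions mirroring arch/x86/include/asm/vmx.h at the time of writing (VMX_EPT_MT_EPTE_SHIFT is 3, VMX_EPT_IPAT_BIT is bit 6).

	#include <stdbool.h>
	#include <stdint.h>

	/* Assumed values, mirroring arch/x86/include/asm/vmx.h. */
	#define VMX_EPT_MT_EPTE_SHIFT	3
	#define VMX_EPT_IPAT_BIT	(1ULL << 6)

	/*
	 * Compose EPT memtype bits the way the patched vmx_get_mt_mask()
	 * does: the memtype occupies EPT entry bits 5:3, IPAT is bit 6.
	 */
	static uint64_t ept_mt_bits(uint8_t cache, bool ipat)
	{
		return ((uint64_t)cache << VMX_EPT_MT_EPTE_SHIFT) |
		       (ipat ? VMX_EPT_IPAT_BIT : 0);
	}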
On Wed, Jun 28, 2023 at 03:57:09PM -0700, Sean Christopherson wrote:
> On Fri, Jun 16, 2023, Yan Zhao wrote:
> > Move code in vmx.c to get cache disabled memtype when non-coherent DMA
> > present to x86 common code.
...
> > +void kvm_mtrr_get_cd_memory_type(struct kvm_vcpu *vcpu, u8 *type, bool *ipat)
>
> Hmm, I'm not convinced that this logic is subtle enough to warrant a common

I added this patch because the memtype to use under CR0.CD=1 is determined by
vmx specific code (i.e. vmx.c), while mtrr.c is common code for x86.

I don't know if it's good to assume what vmx.c will return, as in the open
code below.  (e.g. if someone added the IPAT bit for CR0.CD=1 under the quirk
and forgot to update the code here, we would actually need to zap everything
rather than zap only non-WB ranges).

That's why I want to introduce a helper and let vmx.c call into it.
(sorry, I didn't write a good commit message to explain the real intent).

> helper with out params (I *really* don't like out params :-) ).

I don't like the out params either. :)
I just don't know a better way to return the ipat in the helper.

> UC, or more specifically CR0.CD=1 on VMX without the quirk, is a super
> special case, because to faithfully emulate "No Fill" mode, KVM needs to
> ignore guest PAT (stupid WC).
>
> I don't love having the same logic/assumptions in multiple places, but the
> CR0.CD=1 behavior is so rigidly tied to what KVM must do that I think trying
> to provide a common helper makes the code more complex than it needs to be.
>
> If we open code the logic in the MTRR helper, then I think it can be
> distilled down to:
>
...
>
> and this patch goes away.
>
> > +{
> > +	/*
> > +	 * this routine is supposed to be called when guest mtrrs are honored
> > +	 */
> > +	if (unlikely(!kvm_mmu_honors_guest_mtrrs(vcpu->kvm))) {
>
> I don't think this is worth checking, e.g. this would be WARN-worthy if it
> weren't for an otherwise benign race with device (un)assignment.

Yes, WARN_ON is a better way.

> > +		*type = MTRR_TYPE_WRBACK;
> > +		*ipat = true;
> > +	} else if (unlikely(!kvm_check_has_quirk(vcpu->kvm,
>
> Eh, drop the "unlikely()" annotations.  AIUI, they almost never provide
> actual performance benefits, and I dislike unnecessarily speculating on what
> userspace is doing when it comes to code (though I 100% agree that this is
> definitely unlikely).

It makes sense! Thanks!
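For concreteness, the WARN_ON variant being agreed on here would presumably replace the helper's opening check with something like the fragment below. This is a sketch only; the exact form shipped in a later revision of the series.

	/*
	 * Guard, not expected to fire: callers must only use this helper
	 * while guest MTRRs are honored (non-coherent DMA present).
	 */
	if (WARN_ON_ONCE(!kvm_mmu_honors_guest_mtrrs(vcpu->kvm))) {
		*type = MTRR_TYPE_WRBACK;
		*ipat = true;
		return;
	}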
On Thu, Jun 29, 2023, Yan Zhao wrote:
> On Wed, Jun 28, 2023 at 03:57:09PM -0700, Sean Christopherson wrote:
> > On Fri, Jun 16, 2023, Yan Zhao wrote:
...
> > > +void kvm_mtrr_get_cd_memory_type(struct kvm_vcpu *vcpu, u8 *type, bool *ipat)
> >
> > Hmm, I'm not convinced that this logic is subtle enough to warrant a common
>
> I added this patch because the memtype to use under CR0.CD=1 is determined by
> vmx specific code (i.e. vmx.c), while mtrr.c is common code for x86.
>
> I don't know if it's good to assume what vmx.c will return, as in the open
> code below.  (e.g. if someone added the IPAT bit for CR0.CD=1 under the quirk
> and forgot to update the code here, we would actually need to zap everything
> rather than zap only non-WB ranges).
>
> That's why I want to introduce a helper and let vmx.c call into it.
> (sorry, I didn't write a good commit message to explain the real intent).

No need to apologize, I fully understood the intent.  I'm just not convinced
that the risk of us screwing up this particular case is worth the extra layers
of crud that are necessary to let VMX and MTRRs share the core logic.

Absent emulating CR0.CD=1 with UC, setting IPAT is complete nonsense when KVM
is honoring the guest memtype.

I 100% agree that splitting the logic is less than ideal, but providing a
common helper feels forced and IMO yields significantly less readable code.
And exporting kvm_mtrr_get_cd_memory_type() only adds to the confusion because
calling it on SVM, which can't fully ignore gPAT, is also nonsensical.
On Thu, Jun 29, 2023 at 01:42:46PM -0700, Sean Christopherson wrote:
> On Thu, Jun 29, 2023, Yan Zhao wrote:
> > On Wed, Jun 28, 2023 at 03:57:09PM -0700, Sean Christopherson wrote:
> > > On Fri, Jun 16, 2023, Yan Zhao wrote:
...
> No need to apologize, I fully understood the intent.  I'm just not convinced
> that the risk of us screwing up this particular case is worth the extra
> layers of crud that are necessary to let VMX and MTRRs share the core logic.
>
> Absent emulating CR0.CD=1 with UC, setting IPAT is complete nonsense when KVM
> is honoring the guest memtype.

Yes, I'm just paranoid :)

> I 100% agree that splitting the logic is less than ideal, but providing a
> common helper feels forced and IMO yields significantly less readable code.
> And exporting kvm_mtrr_get_cd_memory_type() only adds to the confusion
> because calling it on SVM, which can't fully ignore gPAT, is also
> nonsensical.

Ok. I get your concern now. You are right.
Looks like the easiest way now is to add some comments in KVM to caution that
changes in memtype when non-coherent DMA is present and CR0.CD=1 may require
updating the code for GFN zap.
Or, do you think it's worth adding a new callback in kvm_x86_ops, e.g.
static_call_cond(kvm_x86_get_cd_mt_honor_guest_mtrr)()?
On Fri, Jun 30, 2023 at 03:49:10PM +0800, Yan Zhao wrote:
> On Thu, Jun 29, 2023 at 01:42:46PM -0700, Sean Christopherson wrote:
> > On Thu, Jun 29, 2023, Yan Zhao wrote:
> > > On Wed, Jun 28, 2023 at 03:57:09PM -0700, Sean Christopherson wrote:
> > > > On Fri, Jun 16, 2023, Yan Zhao wrote:
...
> > I 100% agree that splitting the logic is less than ideal, but providing a
> > common helper feels forced and IMO yields significantly less readable
> > code.  And exporting

What about renaming it to kvm_honors_guest_mtrrs_get_cd_memtype()?

Then it only needs to be called when guest MTRRs are honored, and the name
provides a kind of enforcement, so that if there are other x86 participants
(besides VMX/SVM) who want to honor guest MTRRs, the same memtype is used
with CR0.CD=1.  (I know there might never be such participants, or you may
want to update the code only when they appear.)

I tried this in v4 here:
https://lore.kernel.org/all/20230714065356.20620-1-yan.y.zhao@intel.com/
Feel free to ask me to drop it if you still don't like it :)

> > kvm_mtrr_get_cd_memory_type() only adds to the confusion because calling
> > it on SVM, which can't fully ignore gPAT, is also nonsensical.
>
> Ok. I get your concern now. You are right.
> Looks like the easiest way now is to add some comments in KVM to caution
> that changes in memtype when non-coherent DMA is present and CR0.CD=1 may
> require updating the code for GFN zap.
> Or, do you think it's worth adding a new callback in kvm_x86_ops, e.g.
> static_call_cond(kvm_x86_get_cd_mt_honor_guest_mtrr)()?
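For reference, the rename floated above keeps the same signature; a sketch assuming only the name changes (the actual v4 is at the lore link above):

	/*
	 * "honors_guest_mtrrs" in the name documents the calling constraint:
	 * the helper is only meaningful while guest MTRRs are honored.
	 */
	void kvm_honors_guest_mtrrs_get_cd_memtype(struct kvm_vcpu *vcpu,
						   u8 *type, bool *ipat);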
diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c
index 3ce58734ad22..b35dd0bc9cad 100644
--- a/arch/x86/kvm/mtrr.c
+++ b/arch/x86/kvm/mtrr.c
@@ -721,3 +721,22 @@ bool kvm_mtrr_check_gfn_range_consistency(struct kvm_vcpu *vcpu, gfn_t gfn,
 
 	return type == mtrr_default_type(mtrr_state);
 }
+
+void kvm_mtrr_get_cd_memory_type(struct kvm_vcpu *vcpu, u8 *type, bool *ipat)
+{
+	/*
+	 * this routine is supposed to be called when guest mtrrs are honored
+	 */
+	if (unlikely(!kvm_mmu_honors_guest_mtrrs(vcpu->kvm))) {
+		*type = MTRR_TYPE_WRBACK;
+		*ipat = true;
+	} else if (unlikely(!kvm_check_has_quirk(vcpu->kvm,
+					KVM_X86_QUIRK_CD_NW_CLEARED))) {
+		*type = MTRR_TYPE_UNCACHABLE;
+		*ipat = true;
+	} else {
+		*type = MTRR_TYPE_WRBACK;
+		*ipat = false;
+	}
+}
+EXPORT_SYMBOL_GPL(kvm_mtrr_get_cd_memory_type);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c1e93678cea4..6414c5a6e892 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7573,11 +7573,11 @@ static u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
 		return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
 
 	if (kvm_read_cr0_bits(vcpu, X86_CR0_CD)) {
-		if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED))
-			return MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT;
-		else
-			return (MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT) |
-				VMX_EPT_IPAT_BIT;
+		bool ipat;
+		u8 cache;
+
+		kvm_mtrr_get_cd_memory_type(vcpu, &cache, &ipat);
+		return cache << VMX_EPT_MT_EPTE_SHIFT | (ipat ? VMX_EPT_IPAT_BIT : 0);
 	}
 
 	return kvm_mtrr_get_guest_memory_type(vcpu, gfn) << VMX_EPT_MT_EPTE_SHIFT;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 82e3dafc5453..9781b4b32d68 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -313,6 +313,7 @@ int kvm_mtrr_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data);
 int kvm_mtrr_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata);
 bool kvm_mtrr_check_gfn_range_consistency(struct kvm_vcpu *vcpu, gfn_t gfn,
 					  int page_num);
+void kvm_mtrr_get_cd_memory_type(struct kvm_vcpu *vcpu, u8 *type, bool *ipat);
 bool kvm_vector_hashing_enabled(void);
 void kvm_fixup_and_inject_pf_error(struct kvm_vcpu *vcpu, gva_t gva, u16 error_code);
 int x86_decode_emulated_instruction(struct kvm_vcpu *vcpu, int emulation_type,
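Tying the discussion back to the diff: assuming the two standalone sketches earlier on this page are compiled together, the three CR0.CD=1 cases can be sanity-checked as below. Illustrative only; the expected values 0x70/0x40/0x30 follow from MTRR_TYPE_WRBACK = 6, MTRR_TYPE_UNCACHABLE = 0, and the assumed EPT bit layout.

	#include <assert.h>

	int main(void)
	{
		uint8_t type;
		bool ipat;

		/* Guest MTRRs not honored (no non-coherent DMA): WB + ignore PAT. */
		cd_memory_type(false, true, &type, &ipat);
		assert(ept_mt_bits(type, ipat) == 0x70);	/* (6 << 3) | bit 6 */

		/* Quirk disabled: faithful "No Fill" emulation, UC + ignore PAT. */
		cd_memory_type(true, false, &type, &ipat);
		assert(ept_mt_bits(type, ipat) == 0x40);	/* (0 << 3) | bit 6 */

		/* Quirk enabled: WB, and guest PAT is still honored (no IPAT). */
		cd_memory_type(true, true, &type, &ipat);
		assert(ept_mt_bits(type, ipat) == 0x30);	/* 6 << 3 */

		return 0;
	}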