Message ID | 20230511233351.635053-6-seanjc@google.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp4735522vqo; Thu, 11 May 2023 16:36:19 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ58JwjXP4jTAFTf+M51dtLVYMh5XIXCHfYtNDDck9uOp0zhl4l5JANwsnvAE9JW5v7eL09c X-Received: by 2002:a17:902:bd85:b0:1a6:9ec2:a48f with SMTP id q5-20020a170902bd8500b001a69ec2a48fmr22542738pls.34.1683848178794; Thu, 11 May 2023 16:36:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1683848178; cv=none; d=google.com; s=arc-20160816; b=XozyNrVxm2U2763IvVPap+u0pmxJyTgD+7MjhuYUsraPY8EC+vpY16aU/L/DmCUsd9 bYf6OgZPKW2WF9rx1UAfaBqbqdJAZ6Q90K9XuY4WF09srjSNRmu+NEt7G3aYx47og9JH 2LrrNBvHAuLHCV68H3EpSm+laSHPnuuv4eC7D8I6OMbaOCCu+lRwVo6k1CG9ZglTTx6c U0PjR71HpJNI753sgmjaHTAefyOeDZ1SFlpGar99DoPLQqpvjvTApRhu4rGXHfkepbU6 NEi/BTVC+/EKwSfkLrj0ibXRlo893aWfVL0B2F0ruGIQ9zQv2QgA5ShePppcgfbHhq/4 BLiA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:reply-to:dkim-signature; bh=3SaHwYEbuDBGE2GkNVn0nUXWR/dUffqKjw9nfTur2e0=; b=ObVHMqILs2JmybM3/woH8fe8aJgWR0RIQGBMW1b8B7TBXHKCQCRLBBeEy1itXEbi4p MatABHInTpeqVIoQJdno3Fxwzx5PJhQfqQA6xY+wV0fbxtsEPffXNvnPHepwsaIf25hG tfnTAkMo3cQi0pejLvt2UP+nYYCz/GDUEyIjwRXRQ3AS2QYVEDn+0yAqH4nlUq6v46Rr Wbci9Muej5wRgbAWh0stb6xUmw6s2fPgeb1IUzp3Dz/6idSwKZPPxbAYl1+ooNCB8QbR U+umvmlwcBDvZETuMsnDzzN+Mj4fIpOhG3FdpRef/DgxlGxBbzrofjOVlnSktE851uGp GS1w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=azTYuOeM; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y23-20020a17090a105700b0025289f6e346si2977613pjd.13.2023.05.11.16.36.06; Thu, 11 May 2023 16:36:18 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=azTYuOeM; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239640AbjEKXeR (ORCPT <rfc822;peekingduck44@gmail.com> + 99 others); Thu, 11 May 2023 19:34:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40134 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239601AbjEKXeF (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 11 May 2023 19:34:05 -0400 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B87D47D82 for <linux-kernel@vger.kernel.org>; Thu, 11 May 2023 16:34:03 -0700 (PDT) Received: by mail-yb1-xb49.google.com with SMTP id 3f1490d57ef6-ba237aec108so8383595276.3 for <linux-kernel@vger.kernel.org>; Thu, 11 May 2023 16:34:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1683848043; x=1686440043; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=3SaHwYEbuDBGE2GkNVn0nUXWR/dUffqKjw9nfTur2e0=; b=azTYuOeMwkjRdRGvY2tkEm8mIWO3uiXKb8TEZWxqgdMKxgvivF0rVoBMQv3TkBtBBs dsI2mSzVifPR+40xtIyraTQ5C04mCD/anwqN5rq9NYANplpBl7X4I2y10mT4kBuTyNSC MzkAmn2QGmnpNG/HphgKY6OR2FA86nlHdjWo2y/KVpJI1VZ7RUtl7glezHZtZeKTEjsB YOmirNZvwfUB99+a1JZQdbC3/ghfRhtMqZPOX6Bc+dqpO7zn2aMka9BvX6rEl1S570Bt mX6pB+8e0Aava0hcMaK2IRqnwSvg/+u1gqrLnBM/dWxJAvLfjbk7oyf561ur8fxmpXZX 4p7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683848043; x=1686440043; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=3SaHwYEbuDBGE2GkNVn0nUXWR/dUffqKjw9nfTur2e0=; b=C79P5k+eEXkuggak41jdb8sSJT2MrB2QnhElBJzijK5ahooPNaFR+9YmMoQv5R3Hsk tjA9YfNqglkDbIqAr7zrNlOa4CP8ZqxahUWJAWO7uOsp/vk6geg1ySKIrZt4uMRpnuvQ uGG44+o1n00FMcYyViQ+c6ScNtQYUXWdyYH8oyrDwrQnYFieKNMH5Q25EvNSpmwaglqY 8IqOsYA+MqTc7IT180hSplYqcpT+dZfvRXunYBDuxAavWfEk8jtyRZJJ9qAfZW2GssJp aSo5Rc+HzrrM0B4HGHnN4+fNiGwO5l15tVx/3ljPsbuV3Iv/PCTF3afwi+CM10R2k89p bEFg== X-Gm-Message-State: AC+VfDycBwRV9N0SXI22EPjvfP3t4V+EChPQGnvLuxX/3a0fD6G4m/4t t3dqtJU8XeTzfwr8rBXrBdnrsMOVEEo= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a05:6902:11c9:b0:b9d:ed0f:b9db with SMTP id n9-20020a05690211c900b00b9ded0fb9dbmr14291341ybu.6.1683848043038; Thu, 11 May 2023 16:34:03 -0700 (PDT) Reply-To: Sean Christopherson <seanjc@google.com> Date: Thu, 11 May 2023 16:33:48 -0700 In-Reply-To: <20230511233351.635053-1-seanjc@google.com> Mime-Version: 1.0 References: <20230511233351.635053-1-seanjc@google.com> X-Mailer: git-send-email 2.40.1.606.ga4b1b128d6-goog Message-ID: <20230511233351.635053-6-seanjc@google.com> Subject: [PATCH v2 5/8] KVM: x86: Use MTRR macros to define possible MTRR MSR ranges From: Sean Christopherson <seanjc@google.com> To: Sean Christopherson <seanjc@google.com>, Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Kai Huang <kai.huang@intel.com>, Wenyao Hai <haiwenyao@uniontech.com>, Ke Guo <guoke@uniontech.com> Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1765642788253888625?= X-GMAIL-MSGID: =?utf-8?q?1765642788253888625?= |
Series |
KVM: x86: Clean up MSR PAT handling
|
|
Commit Message
Sean Christopherson
May 11, 2023, 11:33 p.m. UTC
Use the MTRR macros to identify the ranges of possible MTRR MSRs instead
of bounding the ranges with a mismash of open coded values and unrelated
MSR indices. Carving out the gap for the machine check MSRs in particular
is confusing, as it's easy to incorrectly think the case statement handles
MCE MSRs instead of skipping them.
Drop the range-based funneling of MSRs between the end of the MCE MSRs
and MTRR_DEF_TYPE, i.e. 0x2A0-0x2FF, and instead handle MTTR_DEF_TYPE as
the one-off case that it is.
Extract PAT (0x277) as well in anticipation of dropping PAT "handling"
from the MTRR code.
Keep the range-based handling for the variable+fixed MTRRs even though
capturing unknown MSRs 0x214-0x24F is arguably "wrong". There is a gap in
the fixed MTRRs, 0x260-0x267, i.e. the MTRR code needs to filter out
unknown MSRs anyways, and using a single range generates marginally better
code for the big switch statement.
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/mtrr.c | 7 ++++---
arch/x86/kvm/x86.c | 10 ++++++----
2 files changed, 10 insertions(+), 7 deletions(-)
Comments
On Thu, 2023-05-11 at 16:33 -0700, Sean Christopherson wrote: > Use the MTRR macros to identify the ranges of possible MTRR MSRs instead > of bounding the ranges with a mismash of open coded values and unrelated ^ mishmash? > MSR indices. Carving out the gap for the machine check MSRs in particular > is confusing, as it's easy to incorrectly think the case statement handles > MCE MSRs instead of skipping them. > > Drop the range-based funneling of MSRs between the end of the MCE MSRs > and MTRR_DEF_TYPE, i.e. 0x2A0-0x2FF, and instead handle MTTR_DEF_TYPE as > the one-off case that it is. > > Extract PAT (0x277) as well in anticipation of dropping PAT "handling" > from the MTRR code. > > Keep the range-based handling for the variable+fixed MTRRs even though > capturing unknown MSRs 0x214-0x24F is arguably "wrong". There is a gap in > the fixed MTRRs, 0x260-0x267, i.e. the MTRR code needs to filter out > unknown MSRs anyways, and using a single range generates marginally better > code for the big switch statement. > > Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Kai Huang <kai.huang@intel.com> One Nit below ... > --- > arch/x86/kvm/mtrr.c | 7 ++++--- > arch/x86/kvm/x86.c | 10 ++++++---- > 2 files changed, 10 insertions(+), 7 deletions(-) > > diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c > index 59851dbebfea..dc213b940141 100644 > --- a/arch/x86/kvm/mtrr.c > +++ b/arch/x86/kvm/mtrr.c > @@ -34,7 +34,7 @@ static bool is_mtrr_base_msr(unsigned int msr) > static struct kvm_mtrr_range *var_mtrr_msr_to_range(struct kvm_vcpu *vcpu, > unsigned int msr) > { > - int index = (msr - 0x200) / 2; > + int index = (msr - MTRRphysBase_MSR(0)) / 2; > > return &vcpu->arch.mtrr_state.var_ranges[index]; > } > @@ -42,7 +42,7 @@ static struct kvm_mtrr_range *var_mtrr_msr_to_range(struct kvm_vcpu *vcpu, > static bool msr_mtrr_valid(unsigned msr) > { > switch (msr) { > - case 0x200 ... 0x200 + 2 * KVM_NR_VAR_MTRR - 1: > + case MTRRphysBase_MSR(0) ... MTRRphysMask_MSR(KVM_NR_VAR_MTRR - 1): > case MSR_MTRRfix64K_00000: > case MSR_MTRRfix16K_80000: > case MSR_MTRRfix16K_A0000: > @@ -88,7 +88,8 @@ bool kvm_mtrr_valid(struct kvm_vcpu *vcpu, u32 msr, u64 data) > } > > /* variable MTRRs */ > - WARN_ON(!(msr >= 0x200 && msr < 0x200 + 2 * KVM_NR_VAR_MTRR)); > + WARN_ON(!(msr >= MTRRphysBase_MSR(0) && > + msr <= MTRRphysMask_MSR(KVM_NR_VAR_MTRR - 1))); > > mask = kvm_vcpu_reserved_gpa_bits_raw(vcpu); > if ((msr & 1) == 0) { > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index e7f78fe79b32..8b356c9d8a81 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -3700,8 +3700,9 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > return 1; > } > break; > - case 0x200 ... MSR_IA32_MC0_CTL2 - 1: > - case MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) ... 0x2ff: > + case MSR_IA32_CR_PAT: > + case MTRRphysBase_MSR(0) ... MSR_MTRRfix4K_F8000: > + case MSR_MTRRdefType: > return kvm_mtrr_set_msr(vcpu, msr, data); > case MSR_IA32_APICBASE: > return kvm_set_apic_base(vcpu, msr_info); > @@ -4108,9 +4109,10 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > msr_info->data = kvm_scale_tsc(rdtsc(), ratio) + offset; > break; > } > + case MSR_IA32_CR_PAT: > case MSR_MTRRcap: ... Should we put MSR_IA32_CR_PAT after MSR_MTRRcap so it can be symmetric to kvm_set_msr_common()? Looks there's no reason to put it before MSR_MTRRcap. > - case 0x200 ... MSR_IA32_MC0_CTL2 - 1: > - case MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) ... 0x2ff: > + case MTRRphysBase_MSR(0) ... MSR_MTRRfix4K_F8000: > + case MSR_MTRRdefType: > return kvm_mtrr_get_msr(vcpu, msr_info->index, &msr_info->data); > case 0xcd: /* fsb frequency */ > msr_info->data = 3; > -- > 2.40.1.606.ga4b1b128d6-goog >
On Fri, May 12, 2023, Kai Huang wrote: > On Thu, 2023-05-11 at 16:33 -0700, Sean Christopherson wrote: > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index e7f78fe79b32..8b356c9d8a81 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -3700,8 +3700,9 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > > return 1; > > } > > break; > > - case 0x200 ... MSR_IA32_MC0_CTL2 - 1: > > - case MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) ... 0x2ff: > > + case MSR_IA32_CR_PAT: > > + case MTRRphysBase_MSR(0) ... MSR_MTRRfix4K_F8000: > > + case MSR_MTRRdefType: > > return kvm_mtrr_set_msr(vcpu, msr, data); > > case MSR_IA32_APICBASE: > > return kvm_set_apic_base(vcpu, msr_info); > > @@ -4108,9 +4109,10 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > > msr_info->data = kvm_scale_tsc(rdtsc(), ratio) + offset; > > break; > > } > > + case MSR_IA32_CR_PAT: > > case MSR_MTRRcap: > > ... Should we put MSR_IA32_CR_PAT after MSR_MTRRcap so it can be symmetric to > kvm_set_msr_common()? > > Looks there's no reason to put it before MSR_MTRRcap. No, it's above MTRRcap for two reasons: 1. When PAT is moved out of mtrr.c, PAT doesn't get bunded with the other MTRRs and so would just need to be hoisted back up. The end code looks like: case MSR_IA32_CR_PAT: msr_info->data = vcpu->arch.pat; break; case MSR_MTRRcap: case MTRRphysBase_MSR(0) ... MSR_MTRRfix4K_F8000: case MSR_MTRRdefType: return kvm_mtrr_get_msr(vcpu, msr_info->index, &msr_info->data); 2. There is no MSR_MTRRcap case statement in kvm_set_msr_common() because it's a read-only MSR, i.e. the two can't be symmetric :-) > > - case 0x200 ... MSR_IA32_MC0_CTL2 - 1: > > - case MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) ... 0x2ff: > > + case MTRRphysBase_MSR(0) ... MSR_MTRRfix4K_F8000: > > + case MSR_MTRRdefType: > > return kvm_mtrr_get_msr(vcpu, msr_info->index, &msr_info->data); > > case 0xcd: /* fsb frequency */ > > msr_info->data = 3; > > -- > > 2.40.1.606.ga4b1b128d6-goog > > >
On Fri, 2023-05-12 at 09:35 -0700, Sean Christopherson wrote: > On Fri, May 12, 2023, Kai Huang wrote: > > On Thu, 2023-05-11 at 16:33 -0700, Sean Christopherson wrote: > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > index e7f78fe79b32..8b356c9d8a81 100644 > > > --- a/arch/x86/kvm/x86.c > > > +++ b/arch/x86/kvm/x86.c > > > @@ -3700,8 +3700,9 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > > > return 1; > > > } > > > break; > > > - case 0x200 ... MSR_IA32_MC0_CTL2 - 1: > > > - case MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) ... 0x2ff: > > > + case MSR_IA32_CR_PAT: > > > + case MTRRphysBase_MSR(0) ... MSR_MTRRfix4K_F8000: > > > + case MSR_MTRRdefType: > > > return kvm_mtrr_set_msr(vcpu, msr, data); > > > case MSR_IA32_APICBASE: > > > return kvm_set_apic_base(vcpu, msr_info); > > > @@ -4108,9 +4109,10 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > > > msr_info->data = kvm_scale_tsc(rdtsc(), ratio) + offset; > > > break; > > > } > > > + case MSR_IA32_CR_PAT: > > > case MSR_MTRRcap: > > > > ... Should we put MSR_IA32_CR_PAT after MSR_MTRRcap so it can be symmetric to > > kvm_set_msr_common()? > > > > Looks there's no reason to put it before MSR_MTRRcap. > > No, it's above MTRRcap for two reasons: > > 1. When PAT is moved out of mtrr.c, PAT doesn't get bunded with the other MTRRs > and so would just need to be hoisted back up. The end code looks like: > > case MSR_IA32_CR_PAT: > msr_info->data = vcpu->arch.pat; > break; > case MSR_MTRRcap: > case MTRRphysBase_MSR(0) ... MSR_MTRRfix4K_F8000: > case MSR_MTRRdefType: > return kvm_mtrr_get_msr(vcpu, msr_info->index, &msr_info->data); Sorry I mistakenly thought MSR_MTRRcap wasn't handled in kvm_mtrr_get_msr(). Yes looks good to me. > > 2. There is no MSR_MTRRcap case statement in kvm_set_msr_common() because it's > a read-only MSR, i.e. the two can't be symmetric :-) Do you know why it is a read-only MSR, which enables both FIXED ranges and (fixed number of) dynamic ranges? I am asking because there's a x86 series to fake a simple synthetic MTRR which neither has fixed nor dynamic ranges but only has a default MSR_MTRRdefType: https://lore.kernel.org/lkml/20230509233641.GGZFrZCTDH7VwUMp5R@fat_crate.local/T/ The main use cases are Xen PV guests and SEV-SNP guests running under Hyper-V, but it appears TDX guest is also desired to have similar handling, because: 1) TDX module exposes MTRR in CPUID to guest, but handles nothing about MTRR MSRs but only injects #VE. 2) TDX module always maps guest private memory as WB (and ignores guest's PAT IIUC); 3) For shared memory, w/o non-coherent DMA guest's MTRRs are ignored by KVM anyway. TDX doesn't officially support non-trusted device assignment AFAICT. Even we want to consider non-coherent DMA, it would only add confusion to honor guest's MTRRs since they can point to private memory, which is always mapped as WB. So basically looks there's no value to exposing FIXED and dynamic MTRR ranges to TDX guest.
On Mon, May 15, 2023, Kai Huang wrote: > On Fri, 2023-05-12 at 09:35 -0700, Sean Christopherson wrote: > > On Fri, May 12, 2023, Kai Huang wrote: > > > On Thu, 2023-05-11 at 16:33 -0700, Sean Christopherson wrote: > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > > index e7f78fe79b32..8b356c9d8a81 100644 > > > > --- a/arch/x86/kvm/x86.c > > > > +++ b/arch/x86/kvm/x86.c > > > > @@ -3700,8 +3700,9 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > > > > return 1; > > > > } > > > > break; > > > > - case 0x200 ... MSR_IA32_MC0_CTL2 - 1: > > > > - case MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) ... 0x2ff: > > > > + case MSR_IA32_CR_PAT: > > > > + case MTRRphysBase_MSR(0) ... MSR_MTRRfix4K_F8000: > > > > + case MSR_MTRRdefType: > > > > return kvm_mtrr_set_msr(vcpu, msr, data); > > > > case MSR_IA32_APICBASE: > > > > return kvm_set_apic_base(vcpu, msr_info); > > > > @@ -4108,9 +4109,10 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > > > > msr_info->data = kvm_scale_tsc(rdtsc(), ratio) + offset; > > > > break; > > > > } > > > > + case MSR_IA32_CR_PAT: > > > > case MSR_MTRRcap: > > > > > > ... Should we put MSR_IA32_CR_PAT after MSR_MTRRcap so it can be symmetric to > > > kvm_set_msr_common()? > > > > > > Looks there's no reason to put it before MSR_MTRRcap. > > > > No, it's above MTRRcap for two reasons: > > > > 1. When PAT is moved out of mtrr.c, PAT doesn't get bunded with the other MTRRs > > and so would just need to be hoisted back up. The end code looks like: > > > > case MSR_IA32_CR_PAT: > > msr_info->data = vcpu->arch.pat; > > break; > > case MSR_MTRRcap: > > case MTRRphysBase_MSR(0) ... MSR_MTRRfix4K_F8000: > > case MSR_MTRRdefType: > > return kvm_mtrr_get_msr(vcpu, msr_info->index, &msr_info->data); > > Sorry I mistakenly thought MSR_MTRRcap wasn't handled in kvm_mtrr_get_msr(). > Yes looks good to me. > > > > > 2. There is no MSR_MTRRcap case statement in kvm_set_msr_common() because it's > > a read-only MSR, i.e. the two can't be symmetric :-) > > Do you know why it is a read-only MSR, which enables both FIXED ranges and > (fixed number of) dynamic ranges? MTTRcap doesn't "enable" anything, it's a capabilities MSR (MTRR Capability is its given name in the SDM), similar to ARCH_CAPABILITIES, PERF_CAPABILITIES, etc. They're all essentially CPUID leafs, but presumably are MSRs due to being relevant only to CPL0. Or maybe some higher ups at Intel just spin a wheel to determine whether to use a CPUID leaf or an MSR. :-) > I am asking because there's a x86 series to fake a simple synthetic MTRR which > neither has fixed nor dynamic ranges but only has a default MSR_MTRRdefType: > > https://lore.kernel.org/lkml/20230509233641.GGZFrZCTDH7VwUMp5R@fat_crate.local/T/ > > The main use cases are Xen PV guests and SEV-SNP guests running under > Hyper-V, but it appears TDX guest is also desired to have similar handling, > because:� > > 1) TDX module exposes MTRR in CPUID to guest, but handles nothing about MTRR > MSRs but only injects #VE. > > 2) TDX module always maps guest private memory as WB (and ignores guest's PAT > IIUC); > > 3) For shared memory, w/o non-coherent DMA guest's MTRRs are ignored by KVM > anyway. TDX doesn't officially support non-trusted device assignment AFAICT. > Even we want to consider non-coherent DMA, it would only add confusion to honor > guest's MTRRs since they can point to private memory, which is always mapped as > WB. Yeah, I think the best option is for KVM to disallow attaching non-coherent DMA to TDX VMs. AFAIK, there's no use case for such a setup. > So basically looks there's no value to exposing FIXED and dynamic MTRR ranges to > TDX guest. Agreed.
On Mon, 2023-05-15 at 10:49 -0700, Sean Christopherson wrote: > On Mon, May 15, 2023, Kai Huang wrote: > > On Fri, 2023-05-12 at 09:35 -0700, Sean Christopherson wrote: > > > On Fri, May 12, 2023, Kai Huang wrote: > > > > On Thu, 2023-05-11 at 16:33 -0700, Sean Christopherson wrote: > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > > > index e7f78fe79b32..8b356c9d8a81 100644 > > > > > --- a/arch/x86/kvm/x86.c > > > > > +++ b/arch/x86/kvm/x86.c > > > > > @@ -3700,8 +3700,9 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > > > > > return 1; > > > > > } > > > > > break; > > > > > - case 0x200 ... MSR_IA32_MC0_CTL2 - 1: > > > > > - case MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) ... 0x2ff: > > > > > + case MSR_IA32_CR_PAT: > > > > > + case MTRRphysBase_MSR(0) ... MSR_MTRRfix4K_F8000: > > > > > + case MSR_MTRRdefType: > > > > > return kvm_mtrr_set_msr(vcpu, msr, data); > > > > > case MSR_IA32_APICBASE: > > > > > return kvm_set_apic_base(vcpu, msr_info); > > > > > @@ -4108,9 +4109,10 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) > > > > > msr_info->data = kvm_scale_tsc(rdtsc(), ratio) + offset; > > > > > break; > > > > > } > > > > > + case MSR_IA32_CR_PAT: > > > > > case MSR_MTRRcap: > > > > > > > > ... Should we put MSR_IA32_CR_PAT after MSR_MTRRcap so it can be symmetric to > > > > kvm_set_msr_common()? > > > > > > > > Looks there's no reason to put it before MSR_MTRRcap. > > > > > > No, it's above MTRRcap for two reasons: > > > > > > 1. When PAT is moved out of mtrr.c, PAT doesn't get bunded with the other MTRRs > > > and so would just need to be hoisted back up. The end code looks like: > > > > > > case MSR_IA32_CR_PAT: > > > msr_info->data = vcpu->arch.pat; > > > break; > > > case MSR_MTRRcap: > > > case MTRRphysBase_MSR(0) ... MSR_MTRRfix4K_F8000: > > > case MSR_MTRRdefType: > > > return kvm_mtrr_get_msr(vcpu, msr_info->index, &msr_info->data); > > > > Sorry I mistakenly thought MSR_MTRRcap wasn't handled in kvm_mtrr_get_msr(). > > Yes looks good to me. > > > > > > > > 2. There is no MSR_MTRRcap case statement in kvm_set_msr_common() because it's > > > a read-only MSR, i.e. the two can't be symmetric :-) > > > > Do you know why it is a read-only MSR, which enables both FIXED ranges and > > (fixed number of) dynamic ranges? > > MTTRcap doesn't "enable" anything, it's a capabilities MSR (MTRR Capability is > its given name in the SDM), similar to ARCH_CAPABILITIES, PERF_CAPABILITIES, etc. > They're all essentially CPUID leafs, but presumably are MSRs due to being relevant > only to CPL0. Or maybe some higher ups at Intel just spin a wheel to determine > whether to use a CPUID leaf or an MSR. :-) I meant it may make sense to allow Qemu to configure it. Anyway thanks! > > > I am asking because there's a x86 series to fake a simple synthetic MTRR which > > neither has fixed nor dynamic ranges but only has a default MSR_MTRRdefType: > > > > https://lore.kernel.org/lkml/20230509233641.GGZFrZCTDH7VwUMp5R@fat_crate.local/T/ > > > > The main use cases are Xen PV guests and SEV-SNP guests running under > > Hyper-V, but it appears TDX guest is also desired to have similar handling, > > because:� > > > > 1) TDX module exposes MTRR in CPUID to guest, but handles nothing about MTRR > > MSRs but only injects #VE. > > > > 2) TDX module always maps guest private memory as WB (and ignores guest's PAT > > IIUC); > > > > 3) For shared memory, w/o non-coherent DMA guest's MTRRs are ignored by KVM > > anyway. TDX doesn't officially support non-trusted device assignment AFAICT. > > Even we want to consider non-coherent DMA, it would only add confusion to honor > > guest's MTRRs since they can point to private memory, which is always mapped as > > WB. > > Yeah, I think the best option is for KVM to disallow attaching non-coherent DMA > to TDX VMs. AFAIK, there's no use case for such a setup. +Isaku, Presumably the only case is assigning GPU to TDX VMs, but again not sure we need to consider this as AFAICT TDX *officially* doesn't support untrusted device passthrough. Isaku may have more information. And Iskau, Do you have any comments here, especially considering TDX 2.0 support for TEE- IO? We probably need a solution that is future extensible. > > > So basically looks there's no value to exposing FIXED and dynamic MTRR ranges to > > TDX guest. > > Agreed.
diff --git a/arch/x86/kvm/mtrr.c b/arch/x86/kvm/mtrr.c index 59851dbebfea..dc213b940141 100644 --- a/arch/x86/kvm/mtrr.c +++ b/arch/x86/kvm/mtrr.c @@ -34,7 +34,7 @@ static bool is_mtrr_base_msr(unsigned int msr) static struct kvm_mtrr_range *var_mtrr_msr_to_range(struct kvm_vcpu *vcpu, unsigned int msr) { - int index = (msr - 0x200) / 2; + int index = (msr - MTRRphysBase_MSR(0)) / 2; return &vcpu->arch.mtrr_state.var_ranges[index]; } @@ -42,7 +42,7 @@ static struct kvm_mtrr_range *var_mtrr_msr_to_range(struct kvm_vcpu *vcpu, static bool msr_mtrr_valid(unsigned msr) { switch (msr) { - case 0x200 ... 0x200 + 2 * KVM_NR_VAR_MTRR - 1: + case MTRRphysBase_MSR(0) ... MTRRphysMask_MSR(KVM_NR_VAR_MTRR - 1): case MSR_MTRRfix64K_00000: case MSR_MTRRfix16K_80000: case MSR_MTRRfix16K_A0000: @@ -88,7 +88,8 @@ bool kvm_mtrr_valid(struct kvm_vcpu *vcpu, u32 msr, u64 data) } /* variable MTRRs */ - WARN_ON(!(msr >= 0x200 && msr < 0x200 + 2 * KVM_NR_VAR_MTRR)); + WARN_ON(!(msr >= MTRRphysBase_MSR(0) && + msr <= MTRRphysMask_MSR(KVM_NR_VAR_MTRR - 1))); mask = kvm_vcpu_reserved_gpa_bits_raw(vcpu); if ((msr & 1) == 0) { diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e7f78fe79b32..8b356c9d8a81 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3700,8 +3700,9 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) return 1; } break; - case 0x200 ... MSR_IA32_MC0_CTL2 - 1: - case MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) ... 0x2ff: + case MSR_IA32_CR_PAT: + case MTRRphysBase_MSR(0) ... MSR_MTRRfix4K_F8000: + case MSR_MTRRdefType: return kvm_mtrr_set_msr(vcpu, msr, data); case MSR_IA32_APICBASE: return kvm_set_apic_base(vcpu, msr_info); @@ -4108,9 +4109,10 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info) msr_info->data = kvm_scale_tsc(rdtsc(), ratio) + offset; break; } + case MSR_IA32_CR_PAT: case MSR_MTRRcap: - case 0x200 ... MSR_IA32_MC0_CTL2 - 1: - case MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) ... 0x2ff: + case MTRRphysBase_MSR(0) ... MSR_MTRRfix4K_F8000: + case MSR_MTRRdefType: return kvm_mtrr_get_msr(vcpu, msr_info->index, &msr_info->data); case 0xcd: /* fsb frequency */ msr_info->data = 3;