Message ID | 20221020031615.890400-1-xiaoyao.li@intel.com |
---|---|
State | New |
Headers |
From: Xiaoyao Li <xiaoyao.li@intel.com> To: Sean Christopherson <seanjc@google.com>, Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Xiaoyao Li <xiaoyao.li@intel.com> Subject: [PATCH] KVM: x86: Fix the initial value of mcg_cap Date: Thu, 20 Oct 2022 11:16:15 +0800 Message-Id: <20221020031615.890400-1-xiaoyao.li@intel.com> X-Mailer: git-send-email 2.27.0 |
Series |
KVM: x86: Fix the initial value of mcg_cap
|
|
Commit Message
Xiaoyao Li
Oct. 20, 2022, 3:16 a.m. UTC
vcpu->arch.mcg_cap represents the value of MSR_IA32_MCG_CAP. It's
set via ioctl(KVM_X86_SETUP_MCE) from userspace when exposing and
configuring MCE to guest.
It's wrong to leave the default value as KVM_MAX_MCE_BANKS.
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
---
arch/x86/kvm/x86.c | 1 -
1 file changed, 1 deletion(-)
Comments
On Thu, Oct 20, 2022, Xiaoyao Li wrote:
> vcpu->arch.mcg_cap represents the value of MSR_IA32_MCG_CAP. It's
> set via ioctl(KVM_X86_SETUP_MCE) from userspace when exposing and
> configuring MCE to guest.
>
> It's wrong to leave the default value as KVM_MAX_MCE_BANKS.

Why? I agree it's an odd default, but the whole MCE API is odd. Functionally,
I don't see anything that's broken by allowing the guest to access the MCx_CTL MSRs
by default.
On 10/20/2022 10:27 PM, Sean Christopherson wrote:
> On Thu, Oct 20, 2022, Xiaoyao Li wrote:
>> vcpu->arch.mcg_cap represents the value of MSR_IA32_MCG_CAP. It's
>> set via ioctl(KVM_X86_SETUP_MCE) from userspace when exposing and
>> configuring MCE to guest.
>>
>> It's wrong to leave the default value as KVM_MAX_MCE_BANKS.
>
> Why? I agree it's an odd default, but the whole MCE API is odd. Functionally,
> I don't see anything that's broken by allowing the guest to access the MCx_CTL MSRs
> by default.

Yes. Allowing the access doesn't cause any issue for a VM.

However, from the perspective of virtualization, it virtualizes a magic
hardware: even when CPUID.MCA/MCE is not advertised and MCE is not set up by
userspace, the guest is told there are 32 banks and all the banks can be
accessed.

The patch doesn't fix any issue but tries to make the code more reasonable.
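The "32 banks" point above follows from how the bank count is read out of MCG_CAP. A minimal user-space sketch, assuming the count lives in the low 8 bits (as in the `mcg_cap & 0xff` line of get_msr_mce() quoted later in this thread) and that KVM_MAX_MCE_BANKS is 32 (as stated above); `mcg_cap_bank_count` and `MCG_CAP_BANKS_MASK` are hypothetical names used only for illustration:

```c
#include <stdint.h>

/* Value taken from the thread ("guest is told there are 32 banks"). */
#define KVM_MAX_MCE_BANKS  32

/* Assumption: the bank count is the low 8 bits of MCG_CAP. */
#define MCG_CAP_BANKS_MASK 0xffu

/* Hypothetical helper: number of banks a given MCG_CAP value advertises. */
static unsigned int mcg_cap_bank_count(uint64_t mcg_cap)
{
	return (unsigned int)(mcg_cap & MCG_CAP_BANKS_MASK);
}
```

With the current default, `mcg_cap_bank_count(KVM_MAX_MCE_BANKS)` yields 32; with the patch applied the initial value would be 0, so no banks are advertised until userspace calls KVM_X86_SETUP_MCE.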
On Thu, Oct 20, 2022, Xiaoyao Li wrote:
> On 10/20/2022 10:27 PM, Sean Christopherson wrote:
> > On Thu, Oct 20, 2022, Xiaoyao Li wrote:
> > > vcpu->arch.mcg_cap represents the value of MSR_IA32_MCG_CAP. It's
> > > set via ioctl(KVM_X86_SETUP_MCE) from userspace when exposing and
> > > configuring MCE to guest.
> > >
> > > It's wrong to leave the default value as KVM_MAX_MCE_BANKS.
> >
> > Why? I agree it's an odd default, but the whole MCE API is odd. Functionally,
> > I don't see anything that's broken by allowing the guest to access the MCx_CTL MSRs
> > by default.
>
> Yes. Allowing the access doesn't cause any issue for a VM.
>
> However, from the perspective of virtualization, it virtualizes a magic
> hardware: even when CPUID.MCA/MCE is not advertised and MCE is not set up by
> userspace, the guest is told there are 32 banks and all the banks can be
> accessed.

'0' isn't necessarily better though, e.g. if userspace parrots back KVM's "supported"
CPUID without invoking KVM_X86_SETUP_MCE, then it's equally odd that the guest will
see no supported MCE MSRs.

Older versions of the SDM also state (or at least very strongly imply) that banks
0-3 are always available on P6.

Bank 0 is an especially weird case, as several of the MSRs are aliased to other
MSRs that predate the machine check architecture.

Anyways, if this were newly introduced code I'd be all for defaulting to '0', but
KVM has defaulted to KVM_MAX_MCE_BANKS since KVM_X86_SETUP_MCE was added way back
in 2009. Unless there's a bug that's fixed by this, I'm inclined to keep the
current behavior even though it's weird, as hiding all MCE MSRs by default could
theoretically cause a regression, e.g. by triggering #GP on MSRs that an older
guest expects to always exist.

If we really want to clean up this code, I think the correct approach would be to
inject #GP on all relevant MSRs if CPUID.MCA==0, e.g.

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4bd5f8a751de..97fafd851d8d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3260,6 +3260,9 @@ static int set_msr_mce(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	u64 data = msr_info->data;
 	u32 offset, last_msr;
 
+	if (!msr_info->host_initiated && !guest_cpuid_has(X86_FEATURE_MCA))
+		return 1;
+
 	switch (msr) {
 	case MSR_IA32_MCG_STATUS:
 		vcpu->arch.mcg_status = data;
@@ -3891,6 +3894,14 @@ static int get_msr_mce(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host)
 	unsigned bank_num = mcg_cap & 0xff;
 	u32 offset, last_msr;
 
+	if (msr == MSR_IA32_P5_MC_ADDR || msr == MSR_IA32_P5_MC_TYPE) {
+		*pdata = 0;
+		return 0;
+	}
+
+	if (!host && !guest_cpuid_has(X86_FEATURE_MCA))
+		return 1;
+
 	switch (msr) {
 	case MSR_IA32_P5_MC_ADDR:
 	case MSR_IA32_P5_MC_TYPE:

Or alternatively, this should work too:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4bd5f8a751de..e4a44d7af0a6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3774,6 +3774,9 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 	case MSR_IA32_MCG_STATUS:
 	case MSR_IA32_MC0_CTL ... MSR_IA32_MCx_CTL(KVM_MAX_MCE_BANKS) - 1:
 	case MSR_IA32_MC0_CTL2 ... MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) - 1:
+		if (!msr_info->host_initiated &&
+		    !guest_cpuid_has(X86_FEATURE_MCA))
+			return 1;
 		return set_msr_mce(vcpu, msr_info);
 
 	case MSR_K7_PERFCTR0 ... MSR_K7_PERFCTR3:
@@ -4142,13 +4145,17 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 
 		msr_info->data = vcpu->arch.msr_kvm_poll_control;
 		break;
-	case MSR_IA32_P5_MC_ADDR:
-	case MSR_IA32_P5_MC_TYPE:
 	case MSR_IA32_MCG_CAP:
 	case MSR_IA32_MCG_CTL:
 	case MSR_IA32_MCG_STATUS:
 	case MSR_IA32_MC0_CTL ... MSR_IA32_MCx_CTL(KVM_MAX_MCE_BANKS) - 1:
 	case MSR_IA32_MC0_CTL2 ... MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) - 1:
+		if (!msr_info->host_initiated &&
+		    !guest_cpuid_has(X86_FEATURE_MCA))
+			return 1;
+		fallthrough;
+	case MSR_IA32_P5_MC_ADDR:
+	case MSR_IA32_P5_MC_TYPE:
 		return get_msr_mce(vcpu, msr_info->index, &msr_info->data,
 				   msr_info->host_initiated);
 	case MSR_IA32_XSS:
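The read-side gating proposed in the diffs above can be modelled outside the kernel. This is only a sketch of the decision logic, not KVM code: `struct mce_model` and `mce_read_check` are hypothetical names, the MSR numbers are the architectural indices from the SDM, and the return convention (0 = allow, 1 = fault with #GP) mirrors the `return 1` used in the diffs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural MSR indices (Intel SDM). */
#define MSR_IA32_P5_MC_ADDR 0x000u
#define MSR_IA32_P5_MC_TYPE 0x001u
#define MSR_IA32_MCG_CAP    0x179u
#define MSR_IA32_MCG_STATUS 0x17au
#define MSR_IA32_MCG_CTL    0x17bu

/* Hypothetical stand-in for the vCPU state the check depends on. */
struct mce_model {
	bool guest_has_mca;	/* models guest CPUID.MCA */
};

/*
 * Returns 0 if the read may proceed, 1 if it should fault (#GP).
 * The P5 MSRs predate the machine check architecture, so they stay
 * readable regardless of CPUID.MCA; host-initiated accesses are
 * never blocked.
 */
static int mce_read_check(const struct mce_model *m, uint32_t msr,
			  bool host_initiated)
{
	if (msr == MSR_IA32_P5_MC_ADDR || msr == MSR_IA32_P5_MC_TYPE)
		return 0;

	if (!host_initiated && !m->guest_has_mca)
		return 1;

	return 0;
}
```

Exempting host-initiated accesses matters because userspace may save and restore these MSRs before it has set the guest's CPUID.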
On 10/21/2022 12:32 AM, Sean Christopherson wrote:
> On Thu, Oct 20, 2022, Xiaoyao Li wrote:
>> On 10/20/2022 10:27 PM, Sean Christopherson wrote:
>>> On Thu, Oct 20, 2022, Xiaoyao Li wrote:
>>>> vcpu->arch.mcg_cap represents the value of MSR_IA32_MCG_CAP. It's
>>>> set via ioctl(KVM_X86_SETUP_MCE) from userspace when exposing and
>>>> configuring MCE to guest.
>>>>
>>>> It's wrong to leave the default value as KVM_MAX_MCE_BANKS.
>>>
>>> Why? I agree it's an odd default, but the whole MCE API is odd. Functionally,
>>> I don't see anything that's broken by allowing the guest to access the MCx_CTL MSRs
>>> by default.
>>
>> Yes. Allowing the access doesn't cause any issue for a VM.
>>
>> However, from the perspective of virtualization, it virtualizes a magic
>> hardware: even when CPUID.MCA/MCE is not advertised and MCE is not set up by
>> userspace, the guest is told there are 32 banks and all the banks can be
>> accessed.
>
> '0' isn't necessarily better though, e.g. if userspace parrots back KVM's "supported"
> CPUID without invoking KVM_X86_SETUP_MCE, then it's equally odd that the guest will
> see no supported MCE MSRs.
>
> Older versions of the SDM also state (or at least very strongly imply) that banks
> 0-3 are always available on P6.
>
> Bank 0 is an especially weird case, as several of the MSRs are aliased to other
> MSRs that predate the machine check architecture.
>
> Anyways, if this were newly introduced code I'd be all for defaulting to '0', but
> KVM has defaulted to KVM_MAX_MCE_BANKS since KVM_X86_SETUP_MCE was added way back
> in 2009. Unless there's a bug that's fixed by this, I'm inclined to keep the
> current behavior even though it's weird, as hiding all MCE MSRs by default could
> theoretically cause a regression, e.g. by triggering #GP on MSRs that an older
> guest expects to always exist.

fair enough.

> If we really want to clean up this code, I think the correct approach would be to
> inject #GP on all relevant MSRs if CPUID.MCA==0, e.g.

It's what I thought of as well. But I didn't find any statement in SDM of
"Accessing Machine Check MSRs gets #GP if no CPUID.MCA"

> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4bd5f8a751de..97fafd851d8d 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3260,6 +3260,9 @@ static int set_msr_mce(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	u64 data = msr_info->data;
>  	u32 offset, last_msr;
>  
> +	if (!msr_info->host_initiated && !guest_cpuid_has(X86_FEATURE_MCA))
> +		return 1;
> +
>  	switch (msr) {
>  	case MSR_IA32_MCG_STATUS:
>  		vcpu->arch.mcg_status = data;
> @@ -3891,6 +3894,14 @@ static int get_msr_mce(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata, bool host)
>  	unsigned bank_num = mcg_cap & 0xff;
>  	u32 offset, last_msr;
>  
> +	if (msr == MSR_IA32_P5_MC_ADDR || msr == MSR_IA32_P5_MC_TYPE) {
> +		*pdata = 0;
> +		return 0;
> +	}
> +
> +	if (!host && !guest_cpuid_has(X86_FEATURE_MCA))
> +		return 1;
> +
>  	switch (msr) {
>  	case MSR_IA32_P5_MC_ADDR:
>  	case MSR_IA32_P5_MC_TYPE:
>
> Or alternatively, this should work too:
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 4bd5f8a751de..e4a44d7af0a6 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3774,6 +3774,9 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  	case MSR_IA32_MCG_STATUS:
>  	case MSR_IA32_MC0_CTL ... MSR_IA32_MCx_CTL(KVM_MAX_MCE_BANKS) - 1:
>  	case MSR_IA32_MC0_CTL2 ... MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) - 1:
> +		if (!msr_info->host_initiated &&
> +		    !guest_cpuid_has(X86_FEATURE_MCA))
> +			return 1;
>  		return set_msr_mce(vcpu, msr_info);
>  
>  	case MSR_K7_PERFCTR0 ... MSR_K7_PERFCTR3:
> @@ -4142,13 +4145,17 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>  
>  		msr_info->data = vcpu->arch.msr_kvm_poll_control;
>  		break;
> -	case MSR_IA32_P5_MC_ADDR:
> -	case MSR_IA32_P5_MC_TYPE:
>  	case MSR_IA32_MCG_CAP:
>  	case MSR_IA32_MCG_CTL:
>  	case MSR_IA32_MCG_STATUS:
>  	case MSR_IA32_MC0_CTL ... MSR_IA32_MCx_CTL(KVM_MAX_MCE_BANKS) - 1:
>  	case MSR_IA32_MC0_CTL2 ... MSR_IA32_MCx_CTL2(KVM_MAX_MCE_BANKS) - 1:
> +		if (!msr_info->host_initiated &&
> +		    !guest_cpuid_has(X86_FEATURE_MCA))
> +			return 1;
> +		fallthrough;
> +	case MSR_IA32_P5_MC_ADDR:
> +	case MSR_IA32_P5_MC_TYPE:
>  		return get_msr_mce(vcpu, msr_info->index, &msr_info->data,
>  				   msr_info->host_initiated);
>  	case MSR_IA32_XSS:
>
On Fri, Oct 21, 2022, Xiaoyao Li wrote:
> On 10/21/2022 12:32 AM, Sean Christopherson wrote:
> > If we really want to clean up this code, I think the correct approach would be to
> > inject #GP on all relevant MSRs if CPUID.MCA==0, e.g.
>
> It's what I thought of as well. But I didn't find any statement in SDM of
> "Accessing Machine Check MSRs gets #GP if no CPUID.MCA"

Ugh, stupid SDM. Really old SDMs, e.g. circa 1997, explicitly state in the
CPUID.MCA entry that:

  Processor supports the MCG_CAP MSR.

But, when Intel introduced the "Architectural MSRs" section (2001 or so), the
wording was changed to be less explicit:

  The Machine Check Architecture, which provides a compatible mechanism for error
  reporting in P6 family, Pentium 4, and Intel Xeon processors, and future processors,
  is supported. The MCG_CAP MSR contains feature bits describing how many banks of
  error reporting MSRs are supported.

and the entry in the MSR index just lists P6 as the dependency:

  IA32_MCG_CAP (MCG_CAP) Global Machine Check Capability (R/O) 06_01H

So I think it's technically true that MCG_CAP is supposed to exist iff CPUID.MCA=1,
but we'd probably need an SDM change to really be able to enforce that :-(
On 10/22/2022 2:35 AM, Sean Christopherson wrote:
> On Fri, Oct 21, 2022, Xiaoyao Li wrote:
>> On 10/21/2022 12:32 AM, Sean Christopherson wrote:
>>> If we really want to clean up this code, I think the correct approach would be to
>>> inject #GP on all relevant MSRs if CPUID.MCA==0, e.g.
>>
>> It's what I thought of as well. But I didn't find any statement in SDM of
>> "Accessing Machine Check MSRs gets #GP if no CPUID.MCA"
>
> Ugh, stupid SDM. Really old SDMs, e.g. circa 1997, explicitly state in the
> CPUID.MCA entry that:
>
>   Processor supports the MCG_CAP MSR.
>
> But, when Intel introduced the "Architectural MSRs" section (2001 or so), the
> wording was changed to be less explicit:
>
>   The Machine Check Architecture, which provides a compatible mechanism for error
>   reporting in P6 family, Pentium 4, and Intel Xeon processors, and future processors,
>   is supported. The MCG_CAP MSR contains feature bits describing how many banks of
>   error reporting MSRs are supported.
>
> and the entry in the MSR index just lists P6 as the dependency:
>
>   IA32_MCG_CAP (MCG_CAP) Global Machine Check Capability (R/O) 06_01H
>
> So I think it's technically true that MCG_CAP is supposed to exist iff CPUID.MCA=1,
> but we'd probably need an SDM change to really be able to enforce that :-(

I'll talk to Intel architects for this. :)
On Mon, Oct 24, 2022 at 09:37:59AM +0800, Xiaoyao Li wrote:
> On 10/22/2022 2:35 AM, Sean Christopherson wrote:
> > On Fri, Oct 21, 2022, Xiaoyao Li wrote:
> > > On 10/21/2022 12:32 AM, Sean Christopherson wrote:
> > > > If we really want to clean up this code, I think the correct approach would be to
> > > > inject #GP on all relevant MSRs if CPUID.MCA==0, e.g.
> > >
> > > It's what I thought of as well. But I didn't find any statement in SDM of
> > > "Accessing Machine Check MSRs gets #GP if no CPUID.MCA"
> >
> > Ugh, stupid SDM. Really old SDMs, e.g. circa 1997, explicitly state in the
> > CPUID.MCA entry that:
> >
> >   Processor supports the MCG_CAP MSR.
> >
> > But, when Intel introduced the "Architectural MSRs" section (2001 or so), the
> > wording was changed to be less explicit:
> >
> >   The Machine Check Architecture, which provides a compatible mechanism for error
> >   reporting in P6 family, Pentium 4, and Intel Xeon processors, and future processors,
> >   is supported. The MCG_CAP MSR contains feature bits describing how many banks of
> >   error reporting MSRs are supported.
> >
> > and the entry in the MSR index just lists P6 as the dependency:
> >
> >   IA32_MCG_CAP (MCG_CAP) Global Machine Check Capability (R/O) 06_01H
> >
> > So I think it's technically true that MCG_CAP is supposed to exist iff CPUID.MCA=1,
> > but we'd probably need an SDM change to really be able to enforce that :-(
>
> I'll talk to Intel architects for this. :)

[I'm not a h/w architect ... but I do write/support the Linux machine check code]

Current edition of the SDM describes the MCA bit in CPUID(EAX=1).EDX in volume 2,
Table 3-11:

  Machine Check Architecture. A value of 1 indicates the Machine Check
  Architecture of reporting machine errors is supported. The MCG_CAP MSR
  contains feature bits describing how many banks of error reporting MSRs
  are supported

So a value of 0 would mean Machine check architecture is NOT supported.

The only rational meaning for "Machine check architecture is supported" is
you get everything in Vol3B chapter 15 if MCA is supported, and you don't
get it if it isn't.

The unsupported behaviour is not explicitly defined ... so if you want to do
something other than #GP, you could do so ... but that sounds like a silly
choice.

Ditto for accessing a machine check bank with number greater than that specified
in IA32_MCG_CAP.count. SDM doesn't say that this must #GP, but #GP would be a
sane and reasonable response. You could also read as all zero and drop writes.

-Tony
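The bank-range point in the mail above can be illustrated with a small user-space sketch. It assumes the architectural layout of four MSRs per bank starting at IA32_MC0_CTL (0x400) and the count in the low 8 bits of MCG_CAP; `mce_bank_for_msr` is a hypothetical helper, and what a caller does with an out-of-range result (#GP, or read-as-zero and dropped writes) is left open, as in the mail.

```c
#include <stdint.h>

/* Architectural layout: four MSRs (CTL, STATUS, ADDR, MISC) per bank. */
#define MSR_IA32_MC0_CTL    0x400u
#define MSR_IA32_MCx_CTL(x) (MSR_IA32_MC0_CTL + 4u * (x))

/*
 * Hypothetical helper: returns the bank index that @msr belongs to,
 * or -1 if @msr lies outside the banks advertised by @mcg_cap
 * (count taken from the low 8 bits).
 */
static int mce_bank_for_msr(uint64_t mcg_cap, uint32_t msr)
{
	uint32_t banks = (uint32_t)(mcg_cap & 0xffu);

	if (msr < MSR_IA32_MC0_CTL || msr >= MSR_IA32_MCx_CTL(banks))
		return -1;

	return (int)((msr - MSR_IA32_MC0_CTL) / 4u);
}
```

With a count of 0, every bank MSR is out of range, which is the regression risk raised earlier in the thread for guests that expect some banks to always exist.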
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4bd5f8a751de..ca8f4a3e698d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11801,7 +11801,6 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
 				       GFP_KERNEL_ACCOUNT);
 	if (!vcpu->arch.mce_banks || !vcpu->arch.mci_ctl2_banks)
 		goto fail_free_mce_banks;
-	vcpu->arch.mcg_cap = KVM_MAX_MCE_BANKS;
 
 	if (!zalloc_cpumask_var(&vcpu->arch.wbinvd_dirty_mask,
 				GFP_KERNEL_ACCOUNT))