Message ID | 9fbf5b4022d67157d6305bc1811f36d9096c26fc.1680179693.git.houwenlong.hwl@antgroup.com |
---|---|
State | New |
Headers |
From: "Hou Wenlong" <houwenlong.hwl@antgroup.com>
To: kvm@vger.kernel.org
Cc: Sean Christopherson <seanjc@google.com>, Paolo Bonzini <pbonzini@redhat.com>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>, linux-kernel@vger.kernel.org
Subject: [PATCH 2/3] KVM: x86: Don't update KVM PV feature CPUID during vCPU running
Date: Thu, 30 Mar 2023 20:35:53 +0800
Message-Id: <9fbf5b4022d67157d6305bc1811f36d9096c26fc.1680179693.git.houwenlong.hwl@antgroup.com>
In-Reply-To: <9227068821b275ac547eb2ede09ec65d2281fe07.1680179693.git.houwenlong.hwl@antgroup.com>
References: <9227068821b275ac547eb2ede09ec65d2281fe07.1680179693.git.houwenlong.hwl@antgroup.com> |
Series |
[1/3] KVM: x86: Disallow enable KVM_CAP_X86_DISABLE_EXITS capability after vCPUs have been created
|
|
Commit Message
Hou Wenlong
March 30, 2023, 12:35 p.m. UTC
__kvm_update_cpuid_runtime() may be called while a vCPU is running, and
it updates the KVM PV feature CPUID as well. However, the cached KVM PV
feature bitmap is not updated along with it. In fact, the KVM PV feature
CPUID shouldn't be updated at runtime at all; otherwise, KVM PV features
would be broken in the guest. Currently, only KVM_FEATURE_PV_UNHALT is
updated, and even that can no longer happen once disabling HLT exits is
disallowed after vCPUs have been created. Regardless, the KVM PV feature
CPUID should be updated only in the KVM_SET_CPUID{,2} ioctls.
Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
---
arch/x86/kvm/cpuid.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
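The inconsistency described in the commit message can be modeled with a small, self-contained C sketch. This is not kernel code: struct toy_vcpu and every toy_* helper are hypothetical stand-ins invented for illustration, and only the bit number of KVM_FEATURE_PV_UNHALT (7) is taken from the kernel's uapi header. The sketch shows how clearing the bit in the guest-visible CPUID entry at runtime, without refreshing the cached bitmap, leaves the two copies out of sync.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit number of the PV unhalt feature in the KVM features leaf. */
#define KVM_FEATURE_PV_UNHALT 7

/* Hypothetical, simplified stand-in for the per-vCPU state. */
struct toy_vcpu {
	uint32_t pv_cpuid_eax;       /* what the guest reads via CPUID */
	uint32_t pv_cached_features; /* copy cached when CPUID is set */
	bool hlt_in_guest;           /* true if HLT exits are disabled */
};

/* Models setting CPUID from userspace: entry and cache are updated together. */
static void toy_set_cpuid(struct toy_vcpu *v, uint32_t eax)
{
	if (v->hlt_in_guest)
		eax &= ~(1u << KVM_FEATURE_PV_UNHALT);
	v->pv_cpuid_eax = eax;
	v->pv_cached_features = eax;
}

/* Models the pre-patch runtime update: the entry changes, the cache does not. */
static void toy_runtime_update_old(struct toy_vcpu *v)
{
	if (v->hlt_in_guest)
		v->pv_cpuid_eax &= ~(1u << KVM_FEATURE_PV_UNHALT);
}

/* True if the guest-visible entry and the cached copy disagree. */
static bool toy_cache_is_stale(const struct toy_vcpu *v)
{
	return v->pv_cpuid_eax != v->pv_cached_features;
}

/* Walks through the problematic sequence; the asserts document each state. */
static void toy_demo(void)
{
	struct toy_vcpu v = { .hlt_in_guest = false };

	toy_set_cpuid(&v, 1u << KVM_FEATURE_PV_UNHALT);
	assert(!toy_cache_is_stale(&v));

	v.hlt_in_guest = true;       /* HLT exits disabled after CPUID was set */
	toy_runtime_update_old(&v);
	assert(toy_cache_is_stale(&v));
}
```

In this model, moving the PV_UNHALT adjustment into toy_set_cpuid() only, as the patch does for kvm_set_cpuid(), means there is no path left that changes the entry without also changing the cache.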
Comments
+Kechen

On Thu, Mar 30, 2023, Hou Wenlong wrote:
> __kvm_update_cpuid_runtime() may be called during vCPU running and KVM
> PV feature CPUID is updated too. But the cached KVM PV feature bitmap is
> not updated. Actually, KVM PV feature CPUID shouldn't be updated,
> otherwise, KVM PV feature would be broken in guest. Currently, only
> KVM_FEATURE_PV_UNHALT is updated, and it's impossible after disallow
> disable HLT exits. However, KVM PV feature CPUID should be updated only
> in KVM_SET_CPUID{,2} ioctl.
>
> Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> ---
>  arch/x86/kvm/cpuid.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 6972e0be60fa..af92d3422c79 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -222,6 +222,17 @@ static struct kvm_cpuid_entry2 *kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcp
>  					      vcpu->arch.cpuid_nent);
>  }
>
> +static void kvm_update_pv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> +				int nent)
> +{
> +	struct kvm_cpuid_entry2 *best;
> +
> +	best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
> +	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
> +		(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
> +		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
> +}
> +
>  void kvm_update_pv_runtime(struct kvm_vcpu *vcpu)
>  {
>  	struct kvm_cpuid_entry2 *best = kvm_find_kvm_cpuid_features(vcpu);
> @@ -280,11 +291,6 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
>  		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
>  		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
>
> -	best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
> -	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
> -		(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
> -		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
> -
>  	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
>  		best = cpuid_entry2_find(entries, nent, 0x1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
>  		if (best)
> @@ -402,6 +408,7 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
>  	int r;
>
>  	__kvm_update_cpuid_runtime(vcpu, e2, nent);
> +	kvm_update_pv_cpuid(vcpu, e2, nent);

Hrm, this will silently conflict with the proposed per-vCPU controls[*]. Though
arguably that patch is buggy and "needs" to toggle PV_UNHALT when userspace
messes with HLT passthrough. But that doesn't really make sense either because
no guest will react kindly to KVM_FEATURE_PV_UNHALT disappearing.

I really wish this code didn't exist, i.e. that KVM let/forced userspace deal
with correctly defining guest CPUID.

Kechen, is it feasible for your userspace to clear PV_UNHALT when it (might)
use the per-vCPU control? I.e. can KVM do as this series proposes and update
guest CPUID only on KVM_SET_CPUID{2}? Dropping the behavior for the per-VM
control is probably not an option as I gotta assume that'd break userspace, but
I would really like to avoid carrying that over to the per-vCPU control, which
would get quite messy and probably can't work anyways.

[*] https://lkml.kernel.org/r/20230121020738.2973-6-kechenl%40nvidia.com
Hi Sean,

> -----Original Message-----
> From: Sean Christopherson <seanjc@google.com>
> Sent: Wednesday, April 5, 2023 8:29 PM
> To: Hou Wenlong <houwenlong.hwl@antgroup.com>
> Cc: kvm@vger.kernel.org; Paolo Bonzini <pbonzini@redhat.com>; Thomas
> Gleixner <tglx@linutronix.de>; Ingo Molnar <mingo@redhat.com>; Borislav
> Petkov <bp@alien8.de>; Dave Hansen <dave.hansen@linux.intel.com>;
> x86@kernel.org; H. Peter Anvin <hpa@zytor.com>;
> linux-kernel@vger.kernel.org; Kechen Lu <kechenl@nvidia.com>
> Subject: Re: [PATCH 2/3] KVM: x86: Don't update KVM PV feature CPUID
> during vCPU running
>
> +Kechen
>
> On Thu, Mar 30, 2023, Hou Wenlong wrote:
> > __kvm_update_cpuid_runtime() may be called during vCPU running and KVM
> > PV feature CPUID is updated too. But the cached KVM PV feature bitmap
> > is not updated. Actually, KVM PV feature CPUID shouldn't be updated,
> > otherwise, KVM PV feature would be broken in guest. Currently, only
> > KVM_FEATURE_PV_UNHALT is updated, and it's impossible after disallow
> > disable HLT exits. However, KVM PV feature CPUID should be updated
> > only in KVM_SET_CPUID{,2} ioctl.
> >
> > Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
> > ---
> >  arch/x86/kvm/cpuid.c | 17 ++++++++++++-----
> >  1 file changed, 12 insertions(+), 5 deletions(-)
> >
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 6972e0be60fa..af92d3422c79 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -222,6 +222,17 @@ static struct kvm_cpuid_entry2 *kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcp
> >  					      vcpu->arch.cpuid_nent);
> >  }
> >
> > +static void kvm_update_pv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
> > +				int nent)
> > +{
> > +	struct kvm_cpuid_entry2 *best;
> > +
> > +	best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
> > +	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
> > +		(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
> > +		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
> > +}
> > +
> >  void kvm_update_pv_runtime(struct kvm_vcpu *vcpu)
> >  {
> >  	struct kvm_cpuid_entry2 *best = kvm_find_kvm_cpuid_features(vcpu);
> > @@ -280,11 +291,6 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
> >  		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
> >  		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
> >
> > -	best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
> > -	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
> > -		(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
> > -		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
> > -
> >  	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
> >  		best = cpuid_entry2_find(entries, nent, 0x1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
> >  		if (best)
> > @@ -402,6 +408,7 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
> >  	int r;
> >
> >  	__kvm_update_cpuid_runtime(vcpu, e2, nent);
> > +	kvm_update_pv_cpuid(vcpu, e2, nent);
>
> Hrm, this will silently conflict with the proposed per-vCPU controls[*].
> Though arguably that patch is buggy and "needs" to toggle PV_UNHALT
> when userspace messes with HLT passthrough. But that doesn't really make
> sense either because no guest will react kindly to
> KVM_FEATURE_PV_UNHALT disappearing.

Yes, agreed. Toggling PV_UNHALT with the per-vCPU control doesn't make sense
to me either. And since the PV feature is on a per-VM basis, having the
current per-vCPU control toggle the PV feature would probably cause a lot of
mess.

>
> I really wish this code didn't exist, i.e. that KVM let/forced userspace deal
> with correctly defining guest CPUID.
>
> Kechen, is it feasible for your userspace to clear PV_UNHALT when it (might)
> use the per-vCPU control? I.e. can KVM do as this series proposes and
> update guest CPUID only on KVM_SET_CPUID{2}? Dropping the behavior for
> the per-VM control is probably not an option as I gotta assume that'd break
> userspace, but I would really like to avoid carrying that over to the per-vCPU
> control, which would get quite messy and probably can't work anyways.

Yes, in our use cases, it's feasible to clear PV_UNHALT while using the
per-vCPU control. I think it makes sense for userspace to be responsible for
clearing the PV_UNHALT bit when it uses the per-vCPU control for HLT
passthrough. We may add a note/requirement after this line in
Documentation/virt/kvm/api.rst:
"Do not enable KVM_FEATURE_PV_UNHALT if you disable HLT exits."

Best Regards,
Kechen

>
> [*] https://lkml.kernel.org/r/20230121020738.2973-6-kechenl%40nvidia.com
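The userspace responsibility discussed in this reply can be sketched in plain C. This is only an illustration under stated assumptions: struct toy_cpuid_entry is a hypothetical mirror of the fields of struct kvm_cpuid_entry2, toy_clear_pv_unhalt() is an invented helper, and the sketch assumes the KVM features leaf sits at its default location (0x40000001) rather than at a relocated base. A real VMM would run the equivalent logic on its CPUID table before passing it to KVM_SET_CPUID2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Default KVM features leaf and the PV unhalt bit number within EAX. */
#define TOY_KVM_CPUID_FEATURES    0x40000001u
#define TOY_KVM_FEATURE_PV_UNHALT 7

/* Hypothetical mirror of the fields a CPUID table entry carries. */
struct toy_cpuid_entry {
	uint32_t function;
	uint32_t index;
	uint32_t flags;
	uint32_t eax, ebx, ecx, edx;
};

/*
 * Clear PV_UNHALT in every KVM features leaf of the table.
 * Returns the number of entries that were actually modified.
 */
static size_t toy_clear_pv_unhalt(struct toy_cpuid_entry *entries, size_t nent)
{
	const uint32_t mask = 1u << TOY_KVM_FEATURE_PV_UNHALT;
	size_t changed = 0;

	for (size_t i = 0; i < nent; i++) {
		if (entries[i].function != TOY_KVM_CPUID_FEATURES)
			continue;
		if (entries[i].eax & mask) {
			entries[i].eax &= ~mask;
			changed++;
		}
	}
	return changed;
}

/* Example: a VMM that intends to pass HLT through sanitizes its table first. */
static void toy_example(void)
{
	struct toy_cpuid_entry table[1] = {
		{ .function = TOY_KVM_CPUID_FEATURES,
		  .eax = 1u << TOY_KVM_FEATURE_PV_UNHALT },
	};

	assert(toy_clear_pv_unhalt(table, 1) == 1);
	assert(table[0].eax == 0);
}
```

Doing this once, before the CPUID table is handed to KVM, keeps the guest-visible CPUID stable for the lifetime of the vCPU, which is the property the review asks for.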
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 6972e0be60fa..af92d3422c79 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -222,6 +222,17 @@ static struct kvm_cpuid_entry2 *kvm_find_kvm_cpuid_features(struct kvm_vcpu *vcp
 					      vcpu->arch.cpuid_nent);
 }
 
+static void kvm_update_pv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *entries,
+				int nent)
+{
+	struct kvm_cpuid_entry2 *best;
+
+	best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
+	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
+		(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
+		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
+}
+
 void kvm_update_pv_runtime(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best = kvm_find_kvm_cpuid_features(vcpu);
@@ -280,11 +291,6 @@ static void __kvm_update_cpuid_runtime(struct kvm_vcpu *vcpu, struct kvm_cpuid_e
 		     cpuid_entry_has(best, X86_FEATURE_XSAVEC)))
 		best->ebx = xstate_required_size(vcpu->arch.xcr0, true);
 
-	best = __kvm_find_kvm_cpuid_features(vcpu, entries, nent);
-	if (kvm_hlt_in_guest(vcpu->kvm) && best &&
-		(best->eax & (1 << KVM_FEATURE_PV_UNHALT)))
-		best->eax &= ~(1 << KVM_FEATURE_PV_UNHALT);
-
 	if (!kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT)) {
 		best = cpuid_entry2_find(entries, nent, 0x1, KVM_CPUID_INDEX_NOT_SIGNIFICANT);
 		if (best)
@@ -402,6 +408,7 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
 	int r;
 
 	__kvm_update_cpuid_runtime(vcpu, e2, nent);
+	kvm_update_pv_cpuid(vcpu, e2, nent);
 
 	/*
 	 * KVM does not correctly handle changing guest CPUID after KVM_RUN, as