Message ID | 20221213062306.667649-4-seanjc@google.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp2662032wrr; Mon, 12 Dec 2022 22:25:31 -0800 (PST) X-Google-Smtp-Source: AA0mqf5BIz56AWvthlVEl++Jj+mQFpoTiTqC7Hw10uQDvyxmfl3dlykMZTPYaOZiL2Nqi+KRLot3 X-Received: by 2002:a17:906:a057:b0:79d:e7d3:4bc8 with SMTP id bg23-20020a170906a05700b0079de7d34bc8mr16014891ejb.54.1670912731825; Mon, 12 Dec 2022 22:25:31 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1670912731; cv=none; d=google.com; s=arc-20160816; b=FPgztcFaOHvJkmv5iEvX1XgdHaIH79/m3Ht4rgyOj52tReBJjnLrXAWUkNnAOOg3kC lOARKce33qBxDQLd4u2GDiDrYLouZ9a6p+6lxV9Igi+foYdSt0J2OGS9nxc8RIWLetQ3 pWHojcs97ojCtbwE3HHDLkN19+5dVmqETzD3EcnI40jprAeVFuu9hWBgYkaoZHntJrxa q3f6QVXRE8TYv5YroK+CQdXogb4jdlvz6g1QGBRm6srMyCSR/efi1Y8hYPXNr/w1oA0O N60qqvhjPw4fTKhpvNfj5yYvRCP/OSUJDBi9ILKod4IQYypFWysCKX9slMfV9cx66tLi 0yfA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:reply-to:dkim-signature; bh=zBUvBTND7Qb+8AkTIHs8rpB4FKmhZMCK3qWfQPPeOpc=; b=hhr3jZmXPsnv3eYHNFUK+bobOCSU7YYuo0kABn7m5FscPg+tZcIvjHmqctKbm6dY+z ZH9+c/ZjXTzMLQQp8/Fsk5DbjvojMAgCoxTK+r8/+QCnLHKXDj1tTyyQTQPIRFnJlP9U VSZaeL92YFWbi6IHaVvxFkrb54YgXSiHWulER5lzqSGzQTZG1p7BMrRsmkBZLLgw073P lEsUGYCRIIxfk1kdGe4pZLOdBz6SODmda82tDN/a/nQk9E9dz9JopVJ4TJhFSwMEaBxl ExrLtUTHQ4/fC7Lbp2vX+An77QOp148i7FPqXsEUKD7d+AmgTr9CJMcuDVMnQkcokAWc KVCw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=e6C6OFLO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t8-20020a170906948800b007c0889e324esi6291851ejx.366.2022.12.12.22.25.07; Mon, 12 Dec 2022 22:25:31 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=e6C6OFLO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234530AbiLMGXX (ORCPT <rfc822;jeantsuru.cumc.mandola@gmail.com> + 99 others); Tue, 13 Dec 2022 01:23:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54890 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234526AbiLMGXP (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 13 Dec 2022 01:23:15 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E0EFC1EC77 for <linux-kernel@vger.kernel.org>; Mon, 12 Dec 2022 22:23:13 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id n5-20020a170902d2c500b00189e5b86fe2so12395929plc.16 for <linux-kernel@vger.kernel.org>; Mon, 12 Dec 2022 22:23:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=zBUvBTND7Qb+8AkTIHs8rpB4FKmhZMCK3qWfQPPeOpc=; b=e6C6OFLOb8WPWhN4v2ULSzWrbjXx56ED4i5lMniYzFZp7iU3GKnhN/XE9SJhra6oAu 4M6IVE6RT6irQ3FusAwo03VBo8RoBsEhfrG/JVYMjeMh4cRWTLI0MeGa60/EL9J1a6WI H9VjDdo11T3r9/r19L1LpaRIQy7Rqli6DwWd8P+ey0p6Xj3ykK8hGAGIWDqN/0DqTUpv qLZe1+yrvhEYd/D/lrAq0auvklLbq9H443bCbmTaRU0iWp7tZqDiB84bctRqvvJv86xU J3rIlUEdsPBlx5Rg9jUArb7oZMNgi3YAy9X2YQyfWZtxIITo+NpSTbt1kkWCv0MiaCJt vk4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=zBUvBTND7Qb+8AkTIHs8rpB4FKmhZMCK3qWfQPPeOpc=; b=ducbsVSgzbUjzuWIW6JBef+PBwTfIJ6jUshlzK6SpmF4Xq7EUfeBsUTCtT0FSYqv2g l3w/xp9KhKfDqzzeFM8V2LuCs4TMvH6tyoVvQHDbZsTL0xxk6C7rBoUvWXeOYaJwWGkr 5yPt3LXdPU3aFl64p+nSESqS6F/+2Qd3wbBrnaSQU9kqSEoWcy5KpASSTC0W7VoDrDsj AqZvp2+HhDGIM0LThPvhQgNW9ug9FOlkvK0bUvWRZUHmtqp6m6ROXxSYrAZ6auzt1qQF B+G0T/TrvlnWMY9m2lKn+RUIhWZVArzuXJclBgd7aY+bYW8GQLXzO2tEVtMZDl0FKDqV 9rUQ== X-Gm-Message-State: ANoB5pm0jEX3lkn28nLmF9IL8Dr1zHnM/X5KKQK25T5TOUh3s8vcJheX thxcUNpDV3Ebrjxa7m3Pk6NylRpNOOQ= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a62:2785:0:b0:576:bb84:7b50 with SMTP id n127-20020a622785000000b00576bb847b50mr22811996pfn.71.1670912593481; Mon, 12 Dec 2022 22:23:13 -0800 (PST) Reply-To: Sean Christopherson <seanjc@google.com> Date: Tue, 13 Dec 2022 06:23:05 +0000 In-Reply-To: <20221213062306.667649-1-seanjc@google.com> Mime-Version: 1.0 References: <20221213062306.667649-1-seanjc@google.com> X-Mailer: git-send-email 2.39.0.rc1.256.g54fd8350bd-goog Message-ID: <20221213062306.667649-4-seanjc@google.com> Subject: [PATCH v2 3/4] KVM: nVMX: Don't muck with allowed sec exec controls on CPUID changes From: Sean Christopherson <seanjc@google.com> To: Sean Christopherson <seanjc@google.com>, Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Aaron Lewis <aaronlewis@google.com>, Yu Zhang <yu.c.zhang@linux.intel.com> Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1752078988636602253?= X-GMAIL-MSGID: =?utf-8?q?1752078988636602253?= |
Series |
KVM: nVMX: Fix 2nd exec controls override goofs
|
|
Commit Message
Sean Christopherson
Dec. 13, 2022, 6:23 a.m. UTC
Don't modify the set of allowed secondary execution controls, i.e. the
virtual MSR_IA32_VMX_PROCBASED_CTLS2, in response to guest CPUID changes.
To avoid breaking old userspace that never sets the VMX MSRs, i.e. relies
on KVM to provide a consistent vCPU model, keep the existing behavior if
userspace has never written MSR_IA32_VMX_PROCBASED_CTLS2.
KVM should not modify the VMX capabilities presented to L1 based on CPUID
as doing so may discard explicit settings provided by userspace. E.g. if
userspace does KVM_SET_MSRS => KVM_SET_CPUID and disables a feature in
the VMX MSRs but not CPUID (to prevent exposing the feature to L2), then
stuffing the VMX MSRs during KVM_SET_CPUID will expose the feature to L2
against userspace's wishes.
Alternatively, KVM could add a quirk, but that's less than ideal as a VMM
that is affected by the bug would need to be updated in order to opt out
of the buggy behavior. The "has the MSR ever been written" logic handles
both the care where an enlightened userspace sets the MSR during setup,
and the case where userspace blindly migrates the MSR, as the migrated
value will already have been sanitized by the source KVM.
Reported-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/vmx/capabilities.h | 1 +
arch/x86/kvm/vmx/nested.c | 3 +++
arch/x86/kvm/vmx/vmx.c | 7 +++++--
3 files changed, 9 insertions(+), 2 deletions(-)
Comments
On 12/13/22 07:23, Sean Christopherson wrote: > Don't modify the set of allowed secondary execution controls, i.e. the > virtual MSR_IA32_VMX_PROCBASED_CTLS2, in response to guest CPUID changes. > To avoid breaking old userspace that never sets the VMX MSRs, i.e. relies > on KVM to provide a consistent vCPU model, keep the existing behavior if > userspace has never written MSR_IA32_VMX_PROCBASED_CTLS2. > > KVM should not modify the VMX capabilities presented to L1 based on CPUID > as doing so may discard explicit settings provided by userspace. E.g. if > userspace does KVM_SET_MSRS => KVM_SET_CPUID and disables a feature in > the VMX MSRs but not CPUID (to prevent exposing the feature to L2), then > stuffing the VMX MSRs during KVM_SET_CPUID will expose the feature to L2 > against userspace's wishes. The commit message doesn't explain *why* KVM_SET_CPUID would be done before KVM_SET_MSRS. The presence of certain MSRs or bits within is signaled by CPUID bits, and even though KVM is more lenient on host-initiated MSR writes it still verifies in general that the bits are valid. For now I applied patch 1 and (with a reworded comment) patch 2. I'm not opposed to the rest, but I would like to better understand the reason for them. (If it has been reported to the mailing list, please add a "Link" trailer too). Paolo
On Fri, Dec 23, 2022, Paolo Bonzini wrote: > On 12/13/22 07:23, Sean Christopherson wrote: > > Don't modify the set of allowed secondary execution controls, i.e. the > > virtual MSR_IA32_VMX_PROCBASED_CTLS2, in response to guest CPUID changes. > > To avoid breaking old userspace that never sets the VMX MSRs, i.e. relies > > on KVM to provide a consistent vCPU model, keep the existing behavior if > > userspace has never written MSR_IA32_VMX_PROCBASED_CTLS2. > > > > KVM should not modify the VMX capabilities presented to L1 based on CPUID > > as doing so may discard explicit settings provided by userspace. E.g. if > > userspace does KVM_SET_MSRS => KVM_SET_CPUID and disables a feature in > > the VMX MSRs but not CPUID (to prevent exposing the feature to L2), then > > stuffing the VMX MSRs during KVM_SET_CPUID will expose the feature to L2 > > against userspace's wishes. > > The commit message doesn't explain *why* KVM_SET_CPUID would be done before > KVM_SET_MSRS. I assume you mean why KVM_SET_MSRS would be done before KVM_SET_CPUID2? This patch is mostly paranoia, AFAIK there is no userspace that is negatively affected by KVM's manipulations. The only case I can think of is if userspace wanted to emulate dynamic CPUID updates, e.g. set an MSR filter to intercept writes to MISC_ENABLES to update MONITOR/MWAIT support, but that behavior isn't allowed since commit feb627e8d6f6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN"). There are scenarios where userspace might do KVM_SET_MSRS before KVM_SET_CPUID, e.g. QEMU's reuse of a vCPU for CPU hotplug, but in those cases I would expect userspace to follow up with another KVM_SET_MSRS.
On Wed, Jan 04, 2023, Sean Christopherson wrote: > On Fri, Dec 23, 2022, Paolo Bonzini wrote: > > On 12/13/22 07:23, Sean Christopherson wrote: > > > Don't modify the set of allowed secondary execution controls, i.e. the > > > virtual MSR_IA32_VMX_PROCBASED_CTLS2, in response to guest CPUID changes. > > > To avoid breaking old userspace that never sets the VMX MSRs, i.e. relies > > > on KVM to provide a consistent vCPU model, keep the existing behavior if > > > userspace has never written MSR_IA32_VMX_PROCBASED_CTLS2. > > > > > > KVM should not modify the VMX capabilities presented to L1 based on CPUID > > > as doing so may discard explicit settings provided by userspace. E.g. if > > > userspace does KVM_SET_MSRS => KVM_SET_CPUID and disables a feature in > > > the VMX MSRs but not CPUID (to prevent exposing the feature to L2), then > > > stuffing the VMX MSRs during KVM_SET_CPUID will expose the feature to L2 > > > against userspace's wishes. > > > > The commit message doesn't explain *why* KVM_SET_CPUID would be done before > > KVM_SET_MSRS. > > I assume you mean why KVM_SET_MSRS would be done before KVM_SET_CPUID2? > > This patch is mostly paranoia, AFAIK there is no userspace that is negatively > affected by KVM's manipulations. The only case I can think of is if userspace > wanted to emulate dynamic CPUID updates, e.g. set an MSR filter to intercept writes > to MISC_ENABLES to update MONITOR/MWAIT support, but that behavior isn't allowed > since commit feb627e8d6f6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN"). > > There are scenarios where userspace might do KVM_SET_MSRS before KVM_SET_CPUID, > e.g. QEMU's reuse of a vCPU for CPU hotplug, but in those cases I would expect > userspace to follow up with another KVM_SET_MSRS. An argument for taking this patch is that it might be necessary to disallow KVM_SET_MSRS after KVM_RUN[*]. If KVM manipulates VMX MSRs during KVM_SET_CPUID2, reusing a vCPU with sequence: SET_CPUID2 => SET_MSRS => RUN => unplug => hotplug => SET_CPUID2 => SET_MSRS sequence will cause the second SET_MSRS to fail due to userspace "changing" the MSR value. [*] https://lore.kernel.org/all/20220805172945.35412-4-seanjc@google.com
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h index cd2ac9536c99..7b08d6006f52 100644 --- a/arch/x86/kvm/vmx/capabilities.h +++ b/arch/x86/kvm/vmx/capabilities.h @@ -51,6 +51,7 @@ struct nested_vmx_msrs { u64 cr4_fixed1; u64 vmcs_enum; u64 vmfunc_controls; + bool secondary_set_by_userspace; }; struct vmcs_config { diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c index d131375f347a..0140893412b7 100644 --- a/arch/x86/kvm/vmx/nested.c +++ b/arch/x86/kvm/vmx/nested.c @@ -1271,6 +1271,9 @@ vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data) if (!is_bitwise_subset(supported, data, GENMASK_ULL(63, 32))) return -EINVAL; + if (msr_index == MSR_IA32_VMX_PROCBASED_CTLS2) + vmx->nested.msrs.secondary_set_by_userspace = true; + vmx_get_control_msr(&vmx->nested.msrs, msr_index, &lowp, &highp); *lowp = data; *highp = data >> 32; diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 13d3f5eb4c32..dd0247bc7193 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -4456,9 +4456,12 @@ vmx_adjust_secondary_exec_control(struct vcpu_vmx *vmx, u32 *exec_control, /* * Update the nested MSR settings so that a nested VMM can/can't set - * controls for features that are/aren't exposed to the guest. + * controls for features that are/aren't exposed to the guest. Stuff + * the MSR if and only if userspace hasn't explicitly set the MSR, i.e. + * to avoid ABI breakage if userspace might be relying on KVM's flawed + * behavior to expose features to L1. */ - if (nested) { + if (nested && !vmx->nested.msrs.secondary_set_by_userspace) { /* * All features that got grandfathered into KVM's flawed CPUID- * induced manipulation of VMX MSRs are unconditionally exposed