Message ID | 20230914063325.85503-15-weijiang.yang@intel.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org>
From: Yang Weijiang <weijiang.yang@intel.com>
To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com
Subject: [PATCH v6 14/25] KVM: x86: Load guest FPU state when access XSAVE-managed MSRs
Date: Thu, 14 Sep 2023 02:33:14 -0400
Message-Id: <20230914063325.85503-15-weijiang.yang@intel.com>
In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com>
References: <20230914063325.85503-1-weijiang.yang@intel.com> |
Series |
Enable CET Virtualization
|
Commit Message
Yang, Weijiang
Sept. 14, 2023, 6:33 a.m. UTC
From: Sean Christopherson <seanjc@google.com>

Load the guest's FPU state if userspace is accessing MSRs whose values are
managed by XSAVES. Introduce two helpers, kvm_{get,set}_xstate_msr(), to
facilitate access to such MSRs.

If MSRs supported in kvm_caps.supported_xss are passed through to the guest,
the guest MSRs are swapped with the host's before the vCPU exits to
userspace and after it re-enters the kernel before the next VM-entry.

Because the modified code is also used for the KVM_GET_MSRS device ioctl(),
explicitly check that @vcpu is non-null before attempting to load guest
state. The XSS-supporting MSRs cannot be retrieved via the device ioctl()
without loading guest FPU state (which doesn't exist).

Note that guest_cpuid_has() is not queried, as host userspace is allowed to
access MSRs that have not been exposed to the guest, e.g. it might do
KVM_SET_MSRS prior to KVM_SET_CPUID2.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Co-developed-by: Yang Weijiang <weijiang.yang@intel.com>
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
 arch/x86/kvm/x86.c | 30 +++++++++++++++++++++++++++++-
 arch/x86/kvm/x86.h | 24 ++++++++++++++++++++++++
 2 files changed, 53 insertions(+), 1 deletion(-)
Comments
On Thu, 2023-09-14 at 02:33 -0400, Yang Weijiang wrote:
> From: Sean Christopherson <seanjc@google.com>
>
> Load the guest's FPU state if userspace is accessing MSRs whose values
> are managed by XSAVES. Introduce two helpers, kvm_{get,set}_xstate_msr(),
> to facilitate access to such kind of MSRs.
[...]
> +static const u32 xstate_msrs[] = {
> +	MSR_IA32_U_CET, MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP,
> +	MSR_IA32_PL2_SSP, MSR_IA32_PL3_SSP,
> +};
> +
> +static bool is_xstate_msr(u32 index)
> +{
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(xstate_msrs); i++) {
> +		if (index == xstate_msrs[i])
> +			return true;
> +	}
> +	return false;
> +}

The name 'xstate_msr' IMHO is not clear.

How about naming it 'guest_fpu_state_msrs', together with adding a comment
like this:

"These MSRs are context switched together with the rest of the guest FPU
state, on exit/entry to/from userspace. There is also an assumption that
loading guest values while the host kernel runs doesn't cause harm to the
host kernel."

But if you prefer something else, that's fine with me; I would just
appreciate having some comment attached to 'xstate_msr' at least.

> @@ -4382,11 +4401,20 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
> 		    int (*do_msr)(struct kvm_vcpu *vcpu,
> 				  unsigned index, u64 *data))
> {
> +	bool fpu_loaded = false;
> 	int i;
>
> -	for (i = 0; i < msrs->nmsrs; ++i)
> +	for (i = 0; i < msrs->nmsrs; ++i) {
> +		if (vcpu && !fpu_loaded && kvm_caps.supported_xss &&
> +		    is_xstate_msr(entries[i].index)) {

A comment here about why this is done will also be appreciated:

"Userspace requested us to read an MSR whose value resides in the guest FPU
state. Load this state temporarily to the CPU to read/update it."

> +			kvm_load_guest_fpu(vcpu);
> +			fpu_loaded = true;
> +		}
> 		if (do_msr(vcpu, entries[i].index, &entries[i].data))
> 			break;
> +	}

And maybe here too:

"If KVM loaded the guest FPU state, unload it to restore the original
userspace FPU state and to update the guest FPU state in case it was
modified."

> +	if (fpu_loaded)
> +		kvm_put_guest_fpu(vcpu);
>
> 	return i;
> }
[...]
> +static inline void kvm_get_xstate_msr(struct kvm_vcpu *vcpu,
> +				      struct msr_data *msr_info)
> +{
> +	KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm);
> +	kvm_fpu_get();
> +	rdmsrl(msr_info->index, msr_info->data);
> +	kvm_fpu_put();
> +}
> +
> +static inline void kvm_set_xstate_msr(struct kvm_vcpu *vcpu,
> +				      struct msr_data *msr_info)
> +{
> +	KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm);
> +	kvm_fpu_get();
> +	wrmsrl(msr_info->index, msr_info->data);
> +	kvm_fpu_put();
> +}

These functions are not used in this patch; I think they should be added
later, when they are used.

Best regards,
	Maxim Levitsky
On Tue, Oct 31, 2023, Maxim Levitsky wrote:
> On Thu, 2023-09-14 at 02:33 -0400, Yang Weijiang wrote:
[...]
> The name 'xstate_msr' IMHO is not clear.
>
> How about naming it 'guest_fpu_state_msrs', together with adding a comment
> like that:

Maybe xstate_managed_msrs? I'd prefer not to include "guest" because the
behavior is more a property of the architecture and/or the host kernel. I
understand where you're coming from, but it's the MSR *values* that are part
of guest state, whereas the check is a query on how KVM manages the MSR
value, if that makes sense.

And I really don't like "FPU". I get why the kernel uses the "FPU"
terminology, but for this check in particular I want to tie the behavior
back to the architecture, i.e. provide the hint that the reason why these
MSRs are special is because Intel defined them to be context switched via
XSTATE.

Actually, this is unnecessary bikeshedding to some extent; using an array is
silly. It's easier and likely far more performant (not that that matters in
this case) to use a switch statement.

Is this better?

/*
 * Returns true if the MSR in question is managed via XSTATE, i.e. is context
 * switched with the rest of guest FPU state.
 */
static bool is_xstate_managed_msr(u32 index)
{
	switch (index) {
	case MSR_IA32_U_CET:
	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
		return true;
	default:
		return false;
	}
}

/*
 * Read or write a bunch of msrs. All parameters are kernel addresses.
 *
 * @return number of msrs set successfully.
 */
static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
		    struct kvm_msr_entry *entries,
		    int (*do_msr)(struct kvm_vcpu *vcpu,
				  unsigned index, u64 *data))
{
	bool fpu_loaded = false;
	int i;

	for (i = 0; i < msrs->nmsrs; ++i) {
		/*
		 * If userspace is accessing one or more XSTATE-managed MSRs,
		 * temporarily load the guest's FPU state so that the guest's
		 * MSR value(s) is resident in hardware, i.e. so that KVM can
		 * get/set the MSR via RDMSR/WRMSR.
		 */
		if (vcpu && !fpu_loaded && kvm_caps.supported_xss &&
		    is_xstate_managed_msr(entries[i].index)) {
			kvm_load_guest_fpu(vcpu);
			fpu_loaded = true;
		}
		if (do_msr(vcpu, entries[i].index, &entries[i].data))
			break;
	}
	if (fpu_loaded)
		kvm_put_guest_fpu(vcpu);

	return i;
}
On Wed, 2023-11-01 at 11:05 -0700, Sean Christopherson wrote:
> On Tue, Oct 31, 2023, Maxim Levitsky wrote:
[...]
> Maybe xstate_managed_msrs? I'd prefer not to include "guest" because the
> behavior is more a property of the architecture and/or the host kernel. I
> understand where you're coming from, but it's the MSR *values* that are
> part of guest state, whereas the check is a query on how KVM manages the
> MSR value, if that makes sense.

Makes sense.

> And I really don't like "FPU". I get why the kernel uses the "FPU"
> terminology, but for this check in particular I want to tie the behavior
> back to the architecture, i.e. provide the hint that the reason why these
> MSRs are special is because Intel defined them to be context switched via
> XSTATE.
>
> Actually, this is unnecessary bikeshedding to some extent; using an array
> is silly. It's easier and likely far more performant (not that that
> matters in this case) to use a switch statement.
>
> Is this better?
>
> /*
>  * Returns true if the MSR in question is managed via XSTATE, i.e. is context
>  * switched with the rest of guest FPU state.
>  */
> static bool is_xstate_managed_msr(u32 index)
> {
> 	switch (index) {
> 	case MSR_IA32_U_CET:
> 	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> 		return true;
> 	default:
> 		return false;
> 	}
> }

Reasonable.

> 	for (i = 0; i < msrs->nmsrs; ++i) {
> 		/*
> 		 * If userspace is accessing one or more XSTATE-managed MSRs,
> 		 * temporarily load the guest's FPU state so that the guest's
> 		 * MSR value(s) is resident in hardware, i.e. so that KVM can
> 		 * get/set the MSR via RDMSR/WRMSR.
> 		 */

Reasonable as well.

[...]

Best regards,
	Maxim Levitsky
On 11/2/2023 2:05 AM, Sean Christopherson wrote:
> On Tue, Oct 31, 2023, Maxim Levitsky wrote:
[...]
> Actually, this is unnecessary bikeshedding to some extent; using an array
> is silly. It's easier and likely far more performant (not that that
> matters in this case) to use a switch statement.
>
> Is this better?

The change looks good to me! Thanks!

> /*
>  * Returns true if the MSR in question is managed via XSTATE, i.e. is context
>  * switched with the rest of guest FPU state.
>  */
> static bool is_xstate_managed_msr(u32 index)

How about is_xfeature_msr()? "xfeature" is an XSAVE-supported feature, just
to align with the SDM convention.

> {
> 	switch (index) {
> 	case MSR_IA32_U_CET:
> 	case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> 		return true;
> 	default:
> 		return false;
> 	}
> }
[...]
On Fri, Nov 03, 2023, Weijiang Yang wrote:
> On 11/2/2023 2:05 AM, Sean Christopherson wrote:
> > /*
> >  * Returns true if the MSR in question is managed via XSTATE, i.e. is context
> >  * switched with the rest of guest FPU state.
> >  */
> > static bool is_xstate_managed_msr(u32 index)
>
> How about is_xfeature_msr()? xfeature is XSAVE-Supported-Feature, just to
> align with SDM convention.

My vote remains for is_xstate_managed_msr(). is_xfeature_msr() could also
refer to MSRs that control XSTATE features, e.g. XSS.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 66edbed25db8..a091764bf1d2 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -133,6 +133,9 @@ static int __set_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
 static void __get_sregs2(struct kvm_vcpu *vcpu, struct kvm_sregs2 *sregs2);
 
 static DEFINE_MUTEX(vendor_module_lock);
+static void kvm_load_guest_fpu(struct kvm_vcpu *vcpu);
+static void kvm_put_guest_fpu(struct kvm_vcpu *vcpu);
+
 struct kvm_x86_ops kvm_x86_ops __read_mostly;
 
 #define KVM_X86_OP(func) \
@@ -4372,6 +4375,22 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
 }
 EXPORT_SYMBOL_GPL(kvm_get_msr_common);
 
+static const u32 xstate_msrs[] = {
+	MSR_IA32_U_CET, MSR_IA32_PL0_SSP, MSR_IA32_PL1_SSP,
+	MSR_IA32_PL2_SSP, MSR_IA32_PL3_SSP,
+};
+
+static bool is_xstate_msr(u32 index)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(xstate_msrs); i++) {
+		if (index == xstate_msrs[i])
+			return true;
+	}
+	return false;
+}
+
 /*
  * Read or write a bunch of msrs. All parameters are kernel addresses.
  *
@@ -4382,11 +4401,20 @@ static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
 		    int (*do_msr)(struct kvm_vcpu *vcpu,
 				  unsigned index, u64 *data))
 {
+	bool fpu_loaded = false;
 	int i;
 
-	for (i = 0; i < msrs->nmsrs; ++i)
+	for (i = 0; i < msrs->nmsrs; ++i) {
+		if (vcpu && !fpu_loaded && kvm_caps.supported_xss &&
+		    is_xstate_msr(entries[i].index)) {
+			kvm_load_guest_fpu(vcpu);
+			fpu_loaded = true;
+		}
 		if (do_msr(vcpu, entries[i].index, &entries[i].data))
 			break;
+	}
+	if (fpu_loaded)
+		kvm_put_guest_fpu(vcpu);
 
 	return i;
 }
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 1e7be1f6ab29..9a8e3a84eaf4 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -540,4 +540,28 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
 			 unsigned int port, void *data, unsigned int count,
 			 int in);
 
+/*
+ * Lock and/or reload guest FPU and access xstate MSRs. For accesses initiated
+ * by host, guest FPU is loaded in __msr_io(). For accesses initiated by guest,
+ * guest FPU should have been loaded already.
+ */
+
+static inline void kvm_get_xstate_msr(struct kvm_vcpu *vcpu,
+				      struct msr_data *msr_info)
+{
+	KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm);
+	kvm_fpu_get();
+	rdmsrl(msr_info->index, msr_info->data);
+	kvm_fpu_put();
+}
+
+static inline void kvm_set_xstate_msr(struct kvm_vcpu *vcpu,
+				      struct msr_data *msr_info)
+{
+	KVM_BUG_ON(!vcpu->arch.guest_fpu.fpstate->in_use, vcpu->kvm);
+	kvm_fpu_get();
+	wrmsrl(msr_info->index, msr_info->data);
+	kvm_fpu_put();
+}
+
 #endif