Message ID | 20230914063325.85503-22-weijiang.yang@intel.com |
---|---|
State | New |
Headers |
From: Yang Weijiang <weijiang.yang@intel.com>
To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: dave.hansen@intel.com, peterz@infradead.org, chao.gao@intel.com, rick.p.edgecombe@intel.com, weijiang.yang@intel.com, john.allen@amd.com
Subject: [PATCH v6 21/25] KVM: VMX: Set up interception for CET MSRs
Date: Thu, 14 Sep 2023 02:33:21 -0400
Message-Id: <20230914063325.85503-22-weijiang.yang@intel.com>
In-Reply-To: <20230914063325.85503-1-weijiang.yang@intel.com>
References: <20230914063325.85503-1-weijiang.yang@intel.com>
Series | Enable CET Virtualization |
Commit Message
Yang, Weijiang
Sept. 14, 2023, 6:33 a.m. UTC
Enable/disable CET MSR interception per the associated feature configuration.
The Shadow Stack feature requires all CET MSRs to be passed through to the
guest to support it in both user and supervisor mode, while the IBT feature
only depends on MSR_IA32_{U,S}_CET to enable user and supervisor IBT.

Note, this MSR design introduces an architectural limitation on SHSTK and
IBT control for the guest, i.e., when SHSTK is exposed, IBT is also
available to the guest from an architectural perspective, since IBT relies
on a subset of the SHSTK-relevant MSRs.
Signed-off-by: Yang Weijiang <weijiang.yang@intel.com>
---
arch/x86/kvm/vmx/vmx.c | 42 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 42 insertions(+)
Comments
On Thu, 2023-09-14 at 02:33 -0400, Yang Weijiang wrote:
> Enable/disable CET MSRs interception per associated feature configuration.
> Shadow Stack feature requires all CET MSRs passed through to guest to make
> it supported in user and supervisor mode while IBT feature only depends on
> MSR_IA32_{U,S}_CET to enable user and supervisor IBT.

I don't think that this statement is 100% true. KVM can still technically
intercept wrmsr/rdmsr access to all CET msrs, because they should not be
used often by the guest, and this way show the guest different values than
the actual hardware values. For example, KVM can hide (and maybe it should)
the indirect branch tracking bits in MSR_IA32_S_CET if only the shadow
stack is enabled and indirect branch tracking is disabled.

The real problem is that MSR_IA32_U_CET is indirectly allowed to be
read/written unintercepted, because of XSAVES (CET_U state component 11).
Note that, on the other hand, MSR_IA32_S_CET is not saved/restored by
XSAVES.

So this is what I think would be the best effort that KVM can do to
separate the two features:

1. If the supported state of shadow stack and indirect branch tracking
   matches the host (the common case), then it is simple:
   - allow both CET_S and CET_U XSAVES components
   - allow unintercepted access to all CET msrs

2. If only indirect branch tracking is enabled in the guest CPUID, but the
   *host also supports shadow stacks*:
   - don't expose either the CET_S or the CET_U XSAVES component to the
     guest
   - only support the IA32_S_CET/IA32_U_CET msrs, intercept them, and hide
     the shadow stack bits from the guest

3. If only shadow stacks are enabled in the guest CPUID, but the *host
   also supports indirect branch tracking*:
   - intercept access to IA32_S_CET and IA32_U_CET and disallow the
     indirect branch tracking bits from being set there
   - for the sake of performance, allow both CET_S and CET_U XSAVES
     components, and accept the fact that these instructions can enable
     the hidden indirect branch tracking bits there (this causes no harm
     to the host, and will likely let the guest keep both pieces -- fair
     for using undocumented features)
     -or-
     don't enable the CET_U XSAVES component and hope that the guest can
     cope by context switching the msrs instead. Yet another solution is
     to enable the intercept of XSAVES and adjust the saved/restored bits
     of the CET_U msrs in the image after its emulation/execution. (This
     can't be done on AMD, but at least it can be done on Intel, and AMD
     so far doesn't support indirect branch tracking at all.)

Another, much simpler option is to fail guest creation if the shadow
stack + indirect branch tracking state differs between host and guest,
unless both are disabled in the guest (in essence, don't let the guest be
created if (2) or (3) happens).

Best regards,
	Maxim Levitsky
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9f4b56337251..30373258573d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -699,6 +699,10 @@ static bool is_valid_passthrough_msr(u32 msr)
 	case MSR_LBR_CORE_TO ... MSR_LBR_CORE_TO + 8:
 		/* LBR MSRs. These are handled in vmx_update_intercept_for_lbr_msrs() */
 		return true;
+	case MSR_IA32_U_CET:
+	case MSR_IA32_S_CET:
+	case MSR_IA32_PL0_SSP ... MSR_IA32_INT_SSP_TAB:
+		return true;
 	}
 
 	r = possible_passthrough_msr_slot(msr) != -ENOENT;
@@ -7769,6 +7773,42 @@ static void update_intel_pt_cfg(struct kvm_vcpu *vcpu)
 		vmx->pt_desc.ctl_bitmask &= ~(0xfULL << (32 + i * 4));
 }
 
+static void vmx_update_intercept_for_cet_msr(struct kvm_vcpu *vcpu)
+{
+	bool incpt;
+
+	if (kvm_cpu_cap_has(X86_FEATURE_SHSTK)) {
+		incpt = !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
+
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_U_CET,
+					  MSR_TYPE_RW, incpt);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET,
+					  MSR_TYPE_RW, incpt);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL0_SSP,
+					  MSR_TYPE_RW, incpt);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL1_SSP,
+					  MSR_TYPE_RW, incpt);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL2_SSP,
+					  MSR_TYPE_RW, incpt);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_PL3_SSP,
+					  MSR_TYPE_RW, incpt);
+		if (guest_cpuid_has(vcpu, X86_FEATURE_LM))
+			vmx_set_intercept_for_msr(vcpu, MSR_IA32_INT_SSP_TAB,
+						  MSR_TYPE_RW, incpt);
+		if (!incpt)
+			return;
+	}
+
+	if (kvm_cpu_cap_has(X86_FEATURE_IBT)) {
+		incpt = !guest_cpuid_has(vcpu, X86_FEATURE_IBT);
+
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_U_CET,
+					  MSR_TYPE_RW, incpt);
+		vmx_set_intercept_for_msr(vcpu, MSR_IA32_S_CET,
+					  MSR_TYPE_RW, incpt);
+	}
+}
+
 static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -7846,6 +7886,8 @@ static void vmx_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 
 	/* Refresh #PF interception to account for MAXPHYADDR changes. */
 	vmx_update_exception_bitmap(vcpu);
+
+	vmx_update_intercept_for_cet_msr(vcpu);
 }
 
 static u64 vmx_get_perf_capabilities(void)