Message ID | 20230201194604.11135-5-minipli@grsecurity.net |
---|---|
State | New |
Headers |
From: Mathias Krause <minipli@grsecurity.net>
To: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Sean Christopherson <seanjc@google.com>, Paolo Bonzini <pbonzini@redhat.com>, Mathias Krause <minipli@grsecurity.net>
Subject: [PATCH v3 4/6] KVM: x86: Make use of kvm_read_cr*_bits() when testing bits
Date: Wed, 1 Feb 2023 20:46:02 +0100
Message-Id: <20230201194604.11135-5-minipli@grsecurity.net>
In-Reply-To: <20230201194604.11135-1-minipli@grsecurity.net>
References: <20230201194604.11135-1-minipli@grsecurity.net>
X-Mailer: git-send-email 2.39.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit |
Series | KVM: MMU: performance tweaks for heavy CR0.WP users |
Commit Message
Mathias Krause
Feb. 1, 2023, 7:46 p.m. UTC
Make use of the kvm_read_cr{0,4}_bits() helper functions when we only
want to know the state of certain bits instead of the whole register.
This not only makes the intend cleaner, it also avoids a VMREAD in case
the tested bits aren't guest owned.
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
arch/x86/kvm/pmu.c | 4 ++--
arch/x86/kvm/vmx/vmx.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
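
For context, the helper behind the commit message's claim is quoted verbatim in the review discussion below; here is an annotated copy (the comments are editorial, added for illustration) showing where the VMREAD can be skipped:

	static inline ulong kvm_read_cr0_bits(struct kvm_vcpu *vcpu, ulong mask)
	{
		ulong tmask = mask & KVM_POSSIBLE_CR0_GUEST_BITS;

		/*
		 * Only if the caller asks about bits the guest may own *and*
		 * the cached CR0 value is stale does this fall through to the
		 * static call, which on VMX ends up performing the VMREAD.
		 */
		if ((tmask & vcpu->arch.cr0_guest_owned_bits) &&
		    !kvm_register_is_available(vcpu, VCPU_EXREG_CR0))
			static_call(kvm_x86_cache_reg)(vcpu, VCPU_EXREG_CR0);

		/* All other bits are served from the cached register value. */
		return vcpu->arch.cr0 & mask;
	}

When the mask is a compile-time constant containing no possible guest-owned bits, the whole helper folds down to a plain read of vcpu->arch.cr0, which is the optimization the commit message alludes to.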
Comments
On Wed, 1 Feb 2023 20:46:02 +0100 Mathias Krause <minipli@grsecurity.net> wrote:

> Make use of the kvm_read_cr{0,4}_bits() helper functions when we only
> want to know the state of certain bits instead of the whole register.
>
> This not only makes the intend cleaner, it also avoids a VMREAD in case
> the tested bits aren't guest owned.
  ^
The patch comment is a little confusing. Not sure if I misunderstood here:

Check the code of kvm_read_cr0_bits:

static inline ulong kvm_read_cr0_bits(struct kvm_vcpu *vcpu, ulong mask)
{
	ulong tmask = mask & KVM_POSSIBLE_CR0_GUEST_BITS;
	if ((tmask & vcpu->arch.cr0_guest_owned_bits) &&
	    !kvm_register_is_available(vcpu, VCPU_EXREG_CR0))
		static_call(kvm_x86_cache_reg)(vcpu, VCPU_EXREG_CR0);
	return vcpu->arch.cr0 & mask;
}

I suppose the condition that avoids a VMREAD is the one guarding the
vmread in static_call(kvm_x86_cache_reg):

Conditions not triggering the vmread:

1) The tested bits are guest-owned bits and the cached register is available.
2) The tested bits are *not* guest-owned bits.

I agree that this makes the intent cleaner, but I'm not sure the latter
statement in the patch comment is true. If the tested bits are not guest
owned, it will not reach static_call(kvm_x86_cache_reg).

>
> Signed-off-by: Mathias Krause <minipli@grsecurity.net>
> ---
>  arch/x86/kvm/pmu.c     | 4 ++--
>  arch/x86/kvm/vmx/vmx.c | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
[...]
On 07.02.23 14:05, Zhi Wang wrote:
> On Wed, 1 Feb 2023 20:46:02 +0100
> Mathias Krause <minipli@grsecurity.net> wrote:
>
>> Make use of the kvm_read_cr{0,4}_bits() helper functions when we only
>> want to know the state of certain bits instead of the whole register.
>>
>> This not only makes the intend cleaner, it also avoids a VMREAD in case
                         ~~~~~~
Oh, this should have been "intent". Will fix in v4, if there's a need for
one.

>> the tested bits aren't guest owned.
> ^
> The patch comment is a little confusing. Not sure if I misunderstood here:

Sorry, let's try to clarify.

> Check the code of kvm_read_cr0_bits:
>
> static inline ulong kvm_read_cr0_bits(struct kvm_vcpu *vcpu, ulong mask)
> {
> 	ulong tmask = mask & KVM_POSSIBLE_CR0_GUEST_BITS;
> 	if ((tmask & vcpu->arch.cr0_guest_owned_bits) &&
> 	    !kvm_register_is_available(vcpu, VCPU_EXREG_CR0))
> 		static_call(kvm_x86_cache_reg)(vcpu, VCPU_EXREG_CR0);
> 	return vcpu->arch.cr0 & mask;
> }
>
> I suppose the condition that avoids a VMREAD is the one guarding the
> vmread in static_call(kvm_x86_cache_reg):

Correct, that's what this patch is trying to do: It tries to avoid the
static_call(kvm_x86_cache_reg)(...) by making the compiler aware of the
actually used bits in 'mask'. If those don't intersect with the guest
owned bits, the first part of the condition won't be true and we can
simply make use of 'vcpu->arch.cr0'.

Maybe it gets clearer when looking at kvm_read_cr0() too, which is just
this:

static inline ulong kvm_read_cr0(struct kvm_vcpu *vcpu)
{
	return kvm_read_cr0_bits(vcpu, ~0UL);
}

So the 'mask' passed to kvm_read_cr0_bits() will always include all
(possible) guest owned bits (KVM_POSSIBLE_CR0_GUEST_BITS & ~0UL ==
KVM_POSSIBLE_CR0_GUEST_BITS) and the compiler cannot do the optimization
mentioned above.

If we, however, use kvm_read_cr0_bits(..., MASK) directly instead of
kvm_read_cr0() & MASK, it can -- namely for all bits not in
KVM_POSSIBLE_CR0_GUEST_BITS & vcpu->arch.cr0_guest_owned_bits.

> Conditions not triggering the vmread:
>
> 1) The tested bits are guest-owned bits and the cached register is available.
> 2) The tested bits are *not* guest-owned bits.

For case 1 the patch would make only a minor difference, by concluding
earlier that it can simply make use of vcpu->arch.cr0. But it's case 2
I'm after.

If you look up KVM_POSSIBLE_CR0_GUEST_BITS, which is the upper bound for
guest owned CR0 bits, you'll find before patch 6:

#define KVM_POSSIBLE_CR0_GUEST_BITS X86_CR0_TS

and after patch 6:

#define KVM_LAZY_CR0_GUEST_BITS X86_CR0_WP
#define KVM_POSSIBLE_CR0_GUEST_BITS (X86_CR0_TS|KVM_LAZY_CR0_GUEST_BITS)

So the upper bound would be 'X86_CR0_TS|X86_CR0_WP'. Every bit outside
that set can directly be read from the 'vcpu' cached register value and
that's (mostly) the case for the users this patch is changing, see below.

> I agree that this makes the intent cleaner, but I'm not sure the latter
> statement in the patch comment is true. If the tested bits are not guest
> owned, it will not reach static_call(kvm_x86_cache_reg).

Correct, but that's no different from what I'm saying. My description
just set 'static_call(kvm_x86_cache_reg)' mentally equivalent to VMREAD,
which abstracts the static_call quite well, IMHO.

But maybe I should clarify that 'tested bits' means the bits used at the
changed call site? Though, I think that's rather obvious from the change
itself. I can factor in the caching aspect, though. Maybe something like
this?:

  This not only makes the intent cleaner, it also avoids a potential
  VMREAD in case the tested bits aren't guest owned.

I've added "potential" but left the remainder as is.

>>
>> Signed-off-by: Mathias Krause <minipli@grsecurity.net>
>> ---
>>  arch/x86/kvm/pmu.c     | 4 ++--
>>  arch/x86/kvm/vmx/vmx.c | 4 ++--
>>  2 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
>> index d939d3b84e6f..d9922277df67 100644
>> --- a/arch/x86/kvm/pmu.c
>> +++ b/arch/x86/kvm/pmu.c
>> @@ -439,9 +439,9 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
>>  	if (!pmc)
>>  		return 1;
>>
>> -	if (!(kvm_read_cr4(vcpu) & X86_CR4_PCE) &&
>> +	if (!(kvm_read_cr4_bits(vcpu, X86_CR4_PCE)) &&

X86_CR4_PCE & KVM_POSSIBLE_CR4_GUEST_BITS == X86_CR4_PCE, therefore this
can only be optimized if X86_CR4_PCE would be dropped from
'vcpu->arch.cr4_guest_owned_bits' as well. But AFAICS we don't do that.
So here you're right that this only clears up the intent, not the actual
behavior at runtime.

>>  	    (static_call(kvm_x86_get_cpl)(vcpu) != 0) &&
>> -	    (kvm_read_cr0(vcpu) & X86_CR0_PE))
>> +	    (kvm_read_cr0_bits(vcpu, X86_CR0_PE)))

X86_CR0_PE & KVM_POSSIBLE_CR0_GUEST_BITS == 0, therefore this can be
optimized.

>>  		return 1;
>>
>>  	*data = pmc_read_counter(pmc) & mask;
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index c8198c8a9b55..d3b49e0b6c32 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -5487,7 +5487,7 @@ static int handle_cr(struct kvm_vcpu *vcpu)
>>  		break;
>>  	case 3: /* lmsw */
>>  		val = (exit_qualification >> LMSW_SOURCE_DATA_SHIFT) & 0x0f;
>> -		trace_kvm_cr_write(0, (kvm_read_cr0(vcpu) & ~0xful) | val);
>> +		trace_kvm_cr_write(0, (kvm_read_cr0_bits(vcpu, ~0xful) | val));

~0xful & KVM_POSSIBLE_CR0_GUEST_BITS is 0 prior to patch 6 and
X86_CR0_WP afterwards, therefore this might be optimized, depending on
the runtime setting of 'enable_lazy_cr0', which may cap the guest owned
CR0 bits to exclude X86_CR0_WP again.

>>  		kvm_lmsw(vcpu, val);
>>
>>  		return kvm_skip_emulated_instruction(vcpu);
>> @@ -7547,7 +7547,7 @@ static u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
>>  	if (!kvm_arch_has_noncoherent_dma(vcpu->kvm))
>>  		return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
>>
>> -	if (kvm_read_cr0(vcpu) & X86_CR0_CD) {
>> +	if (kvm_read_cr0_bits(vcpu, X86_CR0_CD)) {

X86_CR0_CD & KVM_POSSIBLE_CR0_GUEST_BITS == 0, therefore this can be
optimized as well.

>>  		if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED))
>>  			cache = MTRR_TYPE_WRBACK;
>>  		else

Thanks,
Mathias
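
To make the constant folding Mathias describes concrete, here is a sketch (an illustrative expansion, not literal compiler output) of one changed call site under the pre-patch-6 definition KVM_POSSIBLE_CR0_GUEST_BITS == X86_CR0_TS quoted above:

	/* Call site after this patch: */
	if (kvm_read_cr0_bits(vcpu, X86_CR0_CD))
		...

	/* What the compiler effectively sees after inlining the helper: */
	ulong tmask = X86_CR0_CD & X86_CR0_TS;		/* == 0, known at compile time */
	if ((tmask & vcpu->arch.cr0_guest_owned_bits) &&	/* constant false */
	    !kvm_register_is_available(vcpu, VCPU_EXREG_CR0))
		static_call(kvm_x86_cache_reg)(vcpu, VCPU_EXREG_CR0);	/* dead code, dropped */

	/* ...so the whole call collapses to a read of the cached value: */
	if (vcpu->arch.cr0 & X86_CR0_CD)
		...

With the old kvm_read_cr0(vcpu) & X86_CR0_CD spelling, the helper's mask is ~0UL, tmask == X86_CR0_TS != 0, and the branch -- and with it the potential VMREAD -- survives.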
On Wed, 8 Feb 2023 10:11:30 +0100 Mathias Krause <minipli@grsecurity.net> wrote:

> On 07.02.23 14:05, Zhi Wang wrote:
> > On Wed, 1 Feb 2023 20:46:02 +0100
> > Mathias Krause <minipli@grsecurity.net> wrote:
[...]
> > I suppose the condition that avoids a VMREAD is the one guarding the
> > vmread in static_call(kvm_x86_cache_reg):
>
> Correct, that's what this patch is trying to do: It tries to avoid the
> static_call(kvm_x86_cache_reg)(...) by making the compiler aware of the
> actually used bits in 'mask'. If those don't intersect with the guest
> owned bits, the first part of the condition won't be true and we can
> simply make use of 'vcpu->arch.cr0'.
[...]
> > Conditions not triggering the vmread:
> >
> > 1) The tested bits are guest-owned bits and the cached register is available.
> > 2) The tested bits are *not* guest-owned bits.
>
> For case 1 the patch would make only a minor difference, by concluding
> earlier that it can simply make use of vcpu->arch.cr0. But it's case 2
> I'm after.

Thanks for the explanation. Now I got it.

> If you look up KVM_POSSIBLE_CR0_GUEST_BITS, which is the upper bound for
> guest owned CR0 bits, you'll find before patch 6:
>
> #define KVM_POSSIBLE_CR0_GUEST_BITS X86_CR0_TS
>
> and after patch 6:
>
> #define KVM_LAZY_CR0_GUEST_BITS X86_CR0_WP
> #define KVM_POSSIBLE_CR0_GUEST_BITS (X86_CR0_TS|KVM_LAZY_CR0_GUEST_BITS)
>
> So the upper bound would be 'X86_CR0_TS|X86_CR0_WP'. Every bit outside
> that set can directly be read from the 'vcpu' cached register value and
> that's (mostly) the case for the users this patch is changing, see below.
[...]
On Wed, Feb 01, 2023, Mathias Krause wrote:
> Make use of the kvm_read_cr{0,4}_bits() helper functions when we only
> want to know the state of certain bits instead of the whole register.
>
> This not only makes the intend cleaner, it also avoids a VMREAD in case
> the tested bits aren't guest owned.
>
> Signed-off-by: Mathias Krause <minipli@grsecurity.net>
> ---
>  arch/x86/kvm/pmu.c     | 4 ++--
>  arch/x86/kvm/vmx/vmx.c | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index d939d3b84e6f..d9922277df67 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -439,9 +439,9 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
>  	if (!pmc)
>  		return 1;
>
> -	if (!(kvm_read_cr4(vcpu) & X86_CR4_PCE) &&
> +	if (!(kvm_read_cr4_bits(vcpu, X86_CR4_PCE)) &&

Purely as an FYI, I proposed adding helpers to query single CR0/CR4 bits in a
separate thread[*]. No need to do anything on your end, I'll plan on applying
this patch first and will handle whatever conflicts arise.

[*] https://lore.kernel.org/all/ZAuRec2NkC3+4jvD@google.com
On 15.03.23 23:18, Sean Christopherson wrote:
> On Wed, Feb 01, 2023, Mathias Krause wrote:
>> Make use of the kvm_read_cr{0,4}_bits() helper functions when we only
>> want to know the state of certain bits instead of the whole register.
>>
>> This not only makes the intend cleaner, it also avoids a VMREAD in case
>> the tested bits aren't guest owned.
>>
>> Signed-off-by: Mathias Krause <minipli@grsecurity.net>
>> ---
>>  arch/x86/kvm/pmu.c     | 4 ++--
>>  arch/x86/kvm/vmx/vmx.c | 4 ++--
>>  2 files changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
>> index d939d3b84e6f..d9922277df67 100644
>> --- a/arch/x86/kvm/pmu.c
>> +++ b/arch/x86/kvm/pmu.c
>> @@ -439,9 +439,9 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
>>  	if (!pmc)
>>  		return 1;
>>
>> -	if (!(kvm_read_cr4(vcpu) & X86_CR4_PCE) &&
>> +	if (!(kvm_read_cr4_bits(vcpu, X86_CR4_PCE)) &&
>
> Purely as an FYI, I proposed adding helpers to query single CR0/CR4 bits in a
> separate thread[*]. No need to do anything on your end, I'll plan on applying
> this patch first and will handle whatever conflicts arise.
>
> [*] https://lore.kernel.org/all/ZAuRec2NkC3+4jvD@google.com

Unfortunately, not all users of kvm_read_cr*_bits() only want to read a
single bit. There are a very few that read bit masks -- but you're
probably fully aware of this.

Thanks,
Mathias
On Mon, Mar 20, 2023, Mathias Krause wrote:
> On 15.03.23 23:18, Sean Christopherson wrote:
> > On Wed, Feb 01, 2023, Mathias Krause wrote:
[...]
> > Purely as an FYI, I proposed adding helpers to query single CR0/CR4 bits in a
> > separate thread[*]. No need to do anything on your end, I'll plan on applying
> > this patch first and will handle whatever conflicts arise.
> >
> > [*] https://lore.kernel.org/all/ZAuRec2NkC3+4jvD@google.com
>
> Unfortunately, not all users of kvm_read_cr*_bits() only want to read a
> single bit. There are a very few that read bit masks -- but you're
> probably fully aware of this.

Yeah, we won't be able to get rid of kvm_read_cr*_bits() entirely, the goal
is purely to make code that does query a single bit slightly more
readable/obvious.
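
A single-bit wrapper of the kind Sean's linked thread proposes might look like the sketch below; the name kvm_is_cr0_bit_set() and the power-of-two sanity check are assumptions based on that discussion, not part of this series:

	/*
	 * Hypothetical single-bit query helper layered on kvm_read_cr0_bits();
	 * the BUILD_BUG_ON documents that callers must pass exactly one bit.
	 */
	static __always_inline bool kvm_is_cr0_bit_set(struct kvm_vcpu *vcpu,
						       unsigned long cr0_bit)
	{
		BUILD_BUG_ON(!is_power_of_2(cr0_bit));

		return !!kvm_read_cr0_bits(vcpu, cr0_bit);
	}

The kvm_pmu_rdpmc() check above would then read, e.g., !kvm_is_cr4_bit_set(vcpu, X86_CR4_PCE), with kvm_read_cr*_bits() kept for the few callers that test multi-bit masks.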
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index d939d3b84e6f..d9922277df67 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -439,9 +439,9 @@ int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
 	if (!pmc)
 		return 1;
 
-	if (!(kvm_read_cr4(vcpu) & X86_CR4_PCE) &&
+	if (!(kvm_read_cr4_bits(vcpu, X86_CR4_PCE)) &&
 	    (static_call(kvm_x86_get_cpl)(vcpu) != 0) &&
-	    (kvm_read_cr0(vcpu) & X86_CR0_PE))
+	    (kvm_read_cr0_bits(vcpu, X86_CR0_PE)))
 		return 1;
 
 	*data = pmc_read_counter(pmc) & mask;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c8198c8a9b55..d3b49e0b6c32 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5487,7 +5487,7 @@ static int handle_cr(struct kvm_vcpu *vcpu)
 		break;
 	case 3: /* lmsw */
 		val = (exit_qualification >> LMSW_SOURCE_DATA_SHIFT) & 0x0f;
-		trace_kvm_cr_write(0, (kvm_read_cr0(vcpu) & ~0xful) | val);
+		trace_kvm_cr_write(0, (kvm_read_cr0_bits(vcpu, ~0xful) | val));
 		kvm_lmsw(vcpu, val);
 
 		return kvm_skip_emulated_instruction(vcpu);
@@ -7547,7 +7547,7 @@ static u8 vmx_get_mt_mask(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio)
 	if (!kvm_arch_has_noncoherent_dma(vcpu->kvm))
 		return (MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT;
 
-	if (kvm_read_cr0(vcpu) & X86_CR0_CD) {
+	if (kvm_read_cr0_bits(vcpu, X86_CR0_CD)) {
 		if (kvm_check_has_quirk(vcpu->kvm, KVM_X86_QUIRK_CD_NW_CLEARED))
 			cache = MTRR_TYPE_WRBACK;
 		else