Message ID | 20230201194604.11135-3-minipli@grsecurity.net |
---|---|
State | New |
Headers |
From: Mathias Krause <minipli@grsecurity.net>
To: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Sean Christopherson <seanjc@google.com>, Paolo Bonzini <pbonzini@redhat.com>, Mathias Krause <minipli@grsecurity.net>
Subject: [PATCH v3 2/6] KVM: VMX: Avoid retpoline call for control register caused exits
Date: Wed, 1 Feb 2023 20:46:00 +0100
Message-Id: <20230201194604.11135-3-minipli@grsecurity.net>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230201194604.11135-1-minipli@grsecurity.net>
References: <20230201194604.11135-1-minipli@grsecurity.net>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org |
Series |
KVM: MMU: performance tweaks for heavy CR0.WP users
|
|
Commit Message
Mathias Krause
Feb. 1, 2023, 7:46 p.m. UTC
Complement commit 4289d2728664 ("KVM: retpolines: x86: eliminate
retpoline from vmx.c exit handlers") and avoid a retpoline call for
control register accesses as well.
This speeds up guests that make heavy use of it, like grsecurity
kernels toggling CR0.WP to implement kernel W^X.
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
---
Meanwhile I got my hands on an AMD system and while doing a similar change
for SVM gives a small measurable win (1.1% faster for grsecurity guests),
it would provide nothing for other guests, as the change I was testing was
specifically targeting CR0 caused exits.
A more general approach would instead cover CR3 and, maybe, CR4 as well.
However, that would require a lot more exit code compares, likely
vanishing the gains in the general case. So this tweak is VMX only.
arch/x86/kvm/vmx/vmx.c | 2 ++
1 file changed, 2 insertions(+)
Comments
On Wed, Feb 01, 2023, Mathias Krause wrote:
> Complement commit 4289d2728664 ("KVM: retpolines: x86: eliminate
> retpoline from vmx.c exit handlers") and avoid a retpoline call for
> control register accesses as well.
>
> This speeds up guests that make heavy use of it, like grsecurity
> kernels toggling CR0.WP to implement kernel W^X.

I would rather drop this patch for VMX and instead unconditionally make CR0.WP
guest owned when TDP (EPT) is enabled, i.e. drop the module param from patch 6.

> Signed-off-by: Mathias Krause <minipli@grsecurity.net>
> ---
>
> Meanwhile I got my hands on an AMD system and while doing a similar change
> for SVM gives a small measurable win (1.1% faster for grsecurity guests),

Mostly out of curiosity...

Is the 1.1% roughly aligned with the gains for VMX? If VMX sees a significantly
larger improvement, any idea why SVM doesn't benefit as much? E.g. did you double
check that the kernel was actually using RETPOLINE?

> it would provide nothing for other guests, as the change I was testing was
> specifically targeting CR0 caused exits.
>
> A more general approach would instead cover CR3 and, maybe, CR4 as well.
> However, that would require a lot more exit code compares, likely
> vanishing the gains in the general case. So this tweak is VMX only.

I don't think targeting only CR0 exits is a reason to not do this for SVM. With
NPT enabled, CR3 isn't intercepted, and CR4 exits should be very rare. If the
performance benefits are marginal (I don't have a good frame of reference for the
1.1%), then _that's_ a good reason to leave SVM alone. But not giving CR3 and CR4
priority is a non-issue.

> arch/x86/kvm/vmx/vmx.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index c788aa382611..c8198c8a9b55 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6538,6 +6538,8 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
>  		return handle_external_interrupt(vcpu);
>  	else if (exit_reason.basic == EXIT_REASON_HLT)
>  		return kvm_emulate_halt(vcpu);
> +	else if (exit_reason.basic == EXIT_REASON_CR_ACCESS)
> +		return handle_cr(vcpu);
>  	else if (exit_reason.basic == EXIT_REASON_EPT_MISCONFIG)
>  		return handle_ept_misconfig(vcpu);
>  #endif
> --
> 2.39.1
>
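For readers unfamiliar with what "guest owned" means in the reply above, here is a rough sketch of the idea, not KVM code. It assumes a simplified model of the VMX CR0 guest/host mask: a MOV to CR0 exits only if it changes a bit the host owns, compared against the read shadow. The names `toy_vmcs` and `mov_to_cr0_exits()` are made up for illustration; only the bit position of CR0.WP (bit 16) is architectural.

```c
#include <stdbool.h>
#include <stdint.h>

#define X86_CR0_WP (1ULL << 16)	/* architectural position of CR0.WP */

/* Hypothetical, heavily simplified stand-in for the relevant VMCS fields. */
struct toy_vmcs {
	uint64_t cr0_guest_host_mask;	/* set bit = host owned, clear bit = guest owned */
	uint64_t cr0_read_shadow;	/* value the guest believes is in CR0 */
};

/*
 * Simplified model: a guest write to CR0 causes a VM exit only if it
 * differs from the read shadow in at least one host-owned bit.
 */
static bool mov_to_cr0_exits(const struct toy_vmcs *vmcs, uint64_t new_cr0)
{
	return ((new_cr0 ^ vmcs->cr0_read_shadow) & vmcs->cr0_guest_host_mask) != 0;
}
```

Under this model, once the WP bit is cleared in the mask, a guest toggling CR0.WP would not exit at all, so there would be no exit handler call (retpoline or not) left to optimize for that case.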
On Wed, Mar 15, 2023 at 02:38:33PM -0700, Sean Christopherson wrote:
> On Wed, Feb 01, 2023, Mathias Krause wrote:
> > Complement commit 4289d2728664 ("KVM: retpolines: x86: eliminate
> > retpoline from vmx.c exit handlers") and avoid a retpoline call for
> > control register accesses as well.
> >
> > This speeds up guests that make heavy use of it, like grsecurity
> > kernels toggling CR0.WP to implement kernel W^X.
>
> I would rather drop this patch for VMX and instead unconditionally make CR0.WP
> guest owned when TDP (EPT) is enabled, i.e. drop the module param from patch 6.

That's fine with me. As EPT usually implies TDP (unless either was
explicitly disabled), that should be no limitation, and as the non-EPT case
only saw a very small gain from this change anyway (less than 1%), we can
drop it.

> > Signed-off-by: Mathias Krause <minipli@grsecurity.net>
> > ---
> >
> > Meanwhile I got my hands on an AMD system and while doing a similar change
> > for SVM gives a small measurable win (1.1% faster for grsecurity guests),
>
> Mostly out of curiosity...
>
> Is the 1.1% roughly aligned with the gains for VMX? If VMX sees a significantly
> larger improvement, any idea why SVM doesn't benefit as much? E.g. did you double
> check that the kernel was actually using RETPOLINE?

I measured the runtime of the ssdd test I used before and got 3.98s for a
kernel with the whole series applied and 3.94s with the below change on top:

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d13cf53e7390..2a471eae11c6 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -3369,6 +3369,10 @@ int svm_invoke_exit_handler(struct kvm_vcpu *vcpu, u64 exit_code)
 		return intr_interception(vcpu);
 	else if (exit_code == SVM_EXIT_HLT)
 		return kvm_emulate_halt(vcpu);
+	else if (exit_code == SVM_EXIT_READ_CR0 ||
+		 exit_code == SVM_EXIT_WRITE_CR0 ||
+		 exit_code == SVM_EXIT_CR0_SEL_WRITE)
+		return cr_interception(vcpu);
 	else if (exit_code == SVM_EXIT_NPF)
 		return npf_interception(vcpu);
 #endif

Inspecting svm_invoke_exit_handler() on the host with perf confirmed it
could use the direct call of cr_interception() most of the time and thereby
avoid the retpoline for it:

(My version of perf is, apparently, unable to detect tail calls properly
and therefore lacks symbol information for the jump targets in the below
assembly dump. I therefore added these manually.)

Percent│
       │
       │     ffffffffc194c410 <load0>:   # svm_invoke_exit_handler
  5.00 │       nop
  7.44 │       push   %rbp
 10.43 │       cmp    $0x403,%rsi
  5.86 │       mov    %rdi,%rbp
  1.23 │       push   %rbx
  2.11 │       mov    %rsi,%rbx
  4.60 │       jbe    7a
       │ 16:   [svm_handle_invalid_exit() path removed]
  4.59 │ 7a:   mov    -0x3e6a5b00(,%rsi,8),%rax
  4.52 │       test   %rax,%rax
       │       je     16
  6.25 │       cmp    $0x7c,%rsi
       │       je     dd
  4.18 │       cmp    $0x64,%rsi
       │       je     f2
  3.26 │       cmp    $0x60,%rsi
       │       je     ca
  4.57 │       cmp    $0x78,%rsi
       │       je     f9
  1.27 │       test   $0xffffffffffffffef,%rsi
       │       je     c3
  1.67 │       cmp    $0x65,%rsi
       │       je     c3
       │       cmp    $0x400,%rsi
       │       je     13d
       │       pop    %rbx
       │       pop    %rbp
       │       jmp    0xffffffffa0487d80   # __x86_indirect_thunk_rax
       │       int3
 11.68 │ c3:   pop    %rbx
 10.01 │       pop    %rbp
 10.47 │       jmp    0xffffffffc19482a0   # cr_interception
       │ ca:   incq   0x1940(%rdi)
       │       mov    $0x1,%eax
       │       pop    %rbx
  0.42 │       pop    %rbp
       │       ret
       │       int3
       │       int3
       │       int3
       │       int3
       │ dd:   mov    0x1a20(%rdi),%rax
       │       cmpq   $0x0,0x78(%rax)
       │       je     100
       │       pop    %rbx
       │       pop    %rbp
       │       jmp    0xffffffffc185af20   # kvm_emulate_wrmsr
       │ f2:   pop    %rbx
       │       pop    %rbp
  0.42 │       jmp    0xffffffffc19472b0   # interrupt_window_interception
       │ f9:   pop    %rbx
       │       pop    %rbp
       │       jmp    0xffffffffc185a6a0   # kvm_emulate_halt
       │100:   pop    %rbx
       │       pop    %rbp
       │       jmp    0xffffffffc18602a0   # kvm_emulate_rdmsr
       │107:   mov    %rbp,%rdi
       │       mov    $0x10,%esi
       │       call   kvm_register_read_raw
       │       mov    0x24(%rbp),%edx
       │       mov    %rax,%rcx
       │       mov    %rbx,%r8
       │       mov    %gs:0x2ac00,%rax
       │       mov    0x95c(%rax),%esi
       │       mov    $0xffffffffc195dc28,%rdi
       │       call   _printk
       │       jmp    31
       │13d:   pop    %rbx
       │       pop    %rbp
       │       jmp    0xffffffffc1946b90   # npf_interception

What's clear from the above (or so I hope!) is that cr_interception() is
*the* reason for VM exits in my test run, and by taking the shortcut via a
direct call it doesn't have to do the retpoline dance, which might explain
the ~1.1% performance gain (even in the face of three additional compare
instructions). However! As I realized that these three more instructions
probably "hurt" all other workloads (that don't toggle CR0.WP as often as a
grsecurity kernel would do), I didn't include the above change as a patch
of the series. If you think it's worth it nonetheless, as VM exits shouldn't
happen often anyway, I can do a proper patch.

> > it would provide nothing for other guests, as the change I was testing was
> > specifically targeting CR0 caused exits.
> >
> > A more general approach would instead cover CR3 and, maybe, CR4 as well.
> > However, that would require a lot more exit code compares, likely
> > vanishing the gains in the general case. So this tweak is VMX only.
>
> I don't think targeting only CR0 exits is a reason to not do this for SVM. With
> NPT enabled, CR3 isn't intercepted, and CR4 exits should be very rare. If the
> performance benefits are marginal (I don't have a good frame of reference for the
> 1.1%), then _that's_ a good reason to leave SVM alone. But not giving CR3 and CR4
> priority is a non-issue.

Ok. But yeah, the win isn't all that big either, less so in real workloads
that won't exercise this code path so often.

> > arch/x86/kvm/vmx/vmx.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index c788aa382611..c8198c8a9b55 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -6538,6 +6538,8 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
> >  		return handle_external_interrupt(vcpu);
> >  	else if (exit_reason.basic == EXIT_REASON_HLT)
> >  		return kvm_emulate_halt(vcpu);
> > +	else if (exit_reason.basic == EXIT_REASON_CR_ACCESS)
> > +		return handle_cr(vcpu);
> >  	else if (exit_reason.basic == EXIT_REASON_EPT_MISCONFIG)
> >  		return handle_ept_misconfig(vcpu);
> >  #endif
> > --
> > 2.39.1
> >
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c788aa382611..c8198c8a9b55 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6538,6 +6538,8 @@ static int __vmx_handle_exit(struct kvm_vcpu *vcpu, fastpath_t exit_fastpath)
 		return handle_external_interrupt(vcpu);
 	else if (exit_reason.basic == EXIT_REASON_HLT)
 		return kvm_emulate_halt(vcpu);
+	else if (exit_reason.basic == EXIT_REASON_CR_ACCESS)
+		return handle_cr(vcpu);
 	else if (exit_reason.basic == EXIT_REASON_EPT_MISCONFIG)
 		return handle_ept_misconfig(vcpu);
 #endif
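
The pattern this patch extends can be sketched in isolation roughly as follows. This is a standalone illustration under stated assumptions, not the kernel code: the exit reasons, the `toy_vcpu` struct, its counters and the handler return values are all invented here. The point is only that hot exit reasons are compared explicitly and dispatched with a direct call, while everything else goes through the handler table, which on a retpoline build means an indirect call routed through a thunk.

```c
#include <stddef.h>

/* Hypothetical exit reasons; the values are illustrative, not the VMX encodings. */
enum toy_exit_reason { TOY_EXIT_HLT, TOY_EXIT_CR_ACCESS, TOY_EXIT_OTHER, TOY_NR_EXITS };

/* Hypothetical vCPU that only counts which dispatch path was taken. */
struct toy_vcpu {
	int direct_calls;
	int indirect_calls;
};

typedef int (*toy_exit_handler_t)(struct toy_vcpu *);

static int toy_handle_hlt(struct toy_vcpu *v)   { (void)v; return 1; }
static int toy_handle_cr(struct toy_vcpu *v)    { (void)v; return 2; }
static int toy_handle_other(struct toy_vcpu *v) { (void)v; return 3; }

static const toy_exit_handler_t toy_handlers[TOY_NR_EXITS] = {
	[TOY_EXIT_HLT]       = toy_handle_hlt,
	[TOY_EXIT_CR_ACCESS] = toy_handle_cr,
	[TOY_EXIT_OTHER]     = toy_handle_other,
};

static int toy_handle_exit(struct toy_vcpu *v, enum toy_exit_reason reason)
{
	/* Hot paths: one compare each, then a direct call the compiler can resolve. */
	if (reason == TOY_EXIT_HLT) {
		v->direct_calls++;
		return toy_handle_hlt(v);
	} else if (reason == TOY_EXIT_CR_ACCESS) {
		v->direct_calls++;
		return toy_handle_cr(v);
	}

	/* Cold path: indirect call through the table (a thunk on retpoline builds). */
	v->indirect_calls++;
	return toy_handlers[reason](v);
}
```

The trade-off discussed in the thread shows up directly in this shape: every compare added in front of the table lookup is paid by all exits that do not match, so only exit reasons that are frequent enough to outweigh that cost are worth listing.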