[v2,03/16] KVM: x86: Always use non-compat vcpu_runstate_info size for gfn=>pfn cache
Message ID | 20221013211234.1318131-4-seanjc@google.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4ac7:0:0:0:0:0 with SMTP id y7csp483892wrs; Thu, 13 Oct 2022 14:14:38 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6IyPOGnvxV8PALR3NPOjHC+hcvTTHGrWWa30pJDzwdEyhdMKOfVm8CJgAG01Bp3HOLUynG X-Received: by 2002:a63:5f54:0:b0:462:1149:f3b3 with SMTP id t81-20020a635f54000000b004621149f3b3mr1579319pgb.445.1665695678530; Thu, 13 Oct 2022 14:14:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1665695678; cv=none; d=google.com; s=arc-20160816; b=iyp7RUJdYZUI97dq8mGoHM/0k2p6ahauQZIKqHS3MPjVCcBcXxhPP0h8q9SYMEo6Z5 UvTG0zh+k0eyJceD8ahB/yFaxwl0/kRgh59exqpJfTkDHAbAMQbxhiy2KndJ/F8BKV3u izIcjEho0w5iMySSi7DmalVYnVHr9pShNa9BOhvRTSIk5dxURgkbaVfWPXcExvXlVDe8 CSQVlcOocPk2krT2VNZeQiU9eOPOBnuRKCAGg2CFqqfyMNinLONnQefxaxo7tmkB9OwJ 5HDo3sRBIhMa3ZbDVuzV13b47vUUIVJeSeaVvXLEEHhfTiLDwYovObjSjOf2D5Hjn00b /2XA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:reply-to:dkim-signature; bh=dWPjjwmCXdw1FoLWbTCqb8PkUleSoznDfVrorISf0Bs=; b=lksa5oo0HQpWqgBMKWoecSlkRaEjXhrKIz0RLcn5SNPeiuGpwMB7hDNbG+PKr9TbKz YHzkKH4OdJ1tp3o+gwX5hRh8Vk1+O27aD4z4CHp9wyHd4UMhrBmhKNx/qfWhWMOUSvU3 QN1/IL0rdurJdAFzmWBwY712wPGCGZ69EWz5JnlhWvFIMtmRh9K8fAgVzAsvCchSj76n rB2XDWD/N99NiENv3fYgB7c7IqSvY/wH3mYdBJWkcJVNdjgBZRlL0BxHbb/Zr1yyz1ma pzeVLXdWIn27vtmQmXfGAZ1FaLkm7H8G5g58xVOpcPJG83zw7mnZRIrjGxAaf7NnG6JB Cl6Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=Wl3SxD4R; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id b27-20020a631b1b000000b0045a3e1d1a6csi508268pgb.712.2022.10.13.14.14.25; Thu, 13 Oct 2022 14:14:38 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=Wl3SxD4R; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230175AbiJMVNp (ORCPT <rfc822;ouuuleilei@gmail.com> + 99 others); Thu, 13 Oct 2022 17:13:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53916 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230072AbiJMVNV (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 13 Oct 2022 17:13:21 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 93EF4191D7A for <linux-kernel@vger.kernel.org>; Thu, 13 Oct 2022 14:13:05 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id y65-20020a25c844000000b006bb773548d5so2573887ybf.5 for <linux-kernel@vger.kernel.org>; Thu, 13 Oct 2022 14:13:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=dWPjjwmCXdw1FoLWbTCqb8PkUleSoznDfVrorISf0Bs=; b=Wl3SxD4RgiI8mdPWO8AkxLNNNuFvmx2TVKUslqBgYcFOtlxKo0PibzHbprgwD8PCQH oq2O2wmIg5XJC70e4y7GFFVrlusSXIq4WR8SveMxveZzg0/6WISqHpZ7OzBkjr01nnp7 C6uTtRPcGcj5EF+MFcNS3KIOsZAX3AR665gsokStcFaoOmibFpyvlUqT4tzvPCCyECxs lpfUQE8m3YQVfaZP6hMuCSAug86mpg81Q/U0VR/UF1XvD1AAX+FStgqMrwKT8N4FNF6y lcR5IGwRUJE7UaFfSWZucL1k1lLGHEBhkfck+fYHsFX0SS+sCTzjwGEKQ62XJ1/7pdsi UJ+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=dWPjjwmCXdw1FoLWbTCqb8PkUleSoznDfVrorISf0Bs=; b=vFNojcWjMkE0q1CRiZMkmZA8bgIVvOyppbwYOb5N2A2jejWTdRSw91Lt0Y3M1lfUjU MUf+aClhJ0L7xHexF4cNVfcWZiH3hSuqgrluHNqotQlKg7GsPxyyuUpD3A8Bn2PTCFSy BH2hdiLCFfYKYSKuHFsmkJdlNJVThJNIRYZTVCPyhmz06UF3WaqqdY+r6nHnkOBINK/C 1IVlSaWcK3iyhGZlsKfoQaeAYkhzawdHHTJKF2LCpgF1mM3mf7I5bHJuZypTTkSY/l12 tFQ+v+tcr6v/3wGKOe8xn8yWdMGmmYTVYuzCDpllHXrBDlbyNa4/2QVv3lb+c4SViEzf 1tug== X-Gm-Message-State: ACrzQf26NvmPWEt0TtoOc1g7fPV044b924n+dX5Ye6gEgd6/wOK1yEwP JRdKzYIz731MjIBxUY1WOPZkVuEf4KU= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a25:bc4b:0:b0:6bc:a469:468c with SMTP id d11-20020a25bc4b000000b006bca469468cmr1922360ybk.199.1665695564234; Thu, 13 Oct 2022 14:12:44 -0700 (PDT) Reply-To: Sean Christopherson <seanjc@google.com> Date: Thu, 13 Oct 2022 21:12:21 +0000 In-Reply-To: <20221013211234.1318131-1-seanjc@google.com> Mime-Version: 1.0 References: <20221013211234.1318131-1-seanjc@google.com> X-Mailer: git-send-email 2.38.0.413.g74048e4d9e-goog Message-ID: <20221013211234.1318131-4-seanjc@google.com> Subject: [PATCH v2 03/16] KVM: x86: Always use non-compat vcpu_runstate_info size for gfn=>pfn cache From: Sean Christopherson <seanjc@google.com> To: Sean Christopherson <seanjc@google.com>, Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Michal Luczaj <mhal@rbox.co>, David Woodhouse <dwmw2@infradead.org> Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1746608512263343525?= X-GMAIL-MSGID: =?utf-8?q?1746608512263343525?= |
Series |
KVM: x86: gfn_to_pfn_cache fixes and cleanups
|
|
Commit Message
Sean Christopherson
Oct. 13, 2022, 9:12 p.m. UTC
Always use the size of Xen's non-compat vcpu_runstate_info struct when
checking that the GPA+size doesn't cross a page boundary. Conceptually,
using the current mode is more correct, but KVM isn't consistent with
itself as kvm_xen_vcpu_set_attr() unconditionally uses the "full" size
when activating the cache. More importantly, prior to the introduction
of the gfn_to_pfn_cache, KVM _always_ used the full size, i.e. allowing
the guest (userspace?) to use a poorly aligned GPA in 32-bit mode but not
64-bit mode is more of a bug than a feature, and fixing the bug doesn't
break KVM's historical ABI.
Always using the non-compat size will allow for future gfn_to_pfn_cache
clenups as this is (was) the only case where KVM uses a different size at
check()+refresh() than at activate(). E.g. the length/size of the cache
can be made immutable and dropped from check()+refresh(), which yields a
cleaner set of APIs and avoids potential bugs that could occur if check()
where invoked with a different size than refresh().
Fixes: a795cd43c5b5 ("KVM: x86/xen: Use gfn_to_pfn_cache for runstate area")
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
arch/x86/kvm/xen.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
Comments
On 10/13/22 23:12, Sean Christopherson wrote: > Always use the size of Xen's non-compat vcpu_runstate_info struct when > checking that the GPA+size doesn't cross a page boundary. Conceptually, > using the current mode is more correct, but KVM isn't consistent with > itself as kvm_xen_vcpu_set_attr() unconditionally uses the "full" size > when activating the cache. More importantly, prior to the introduction > of the gfn_to_pfn_cache, KVM _always_ used the full size, i.e. allowing > the guest (userspace?) to use a poorly aligned GPA in 32-bit mode but not > 64-bit mode is more of a bug than a feature, and fixing the bug doesn't > break KVM's historical ABI. I'd rather not introduce additional restrictions in KVM, mostly because it's actually easy to avoid this patch by instead enforcing that attributes are set in a sensible order: - long mode cannot be changed after the shared info page is enabled (which makes sense because the shared info page also has a compat version) - the caches must be activated after the shared info page (which enforces that the vCPU attributes are set after the VM attributes) This is technically a userspace API break, but nobody is really using this API outside Amazon so... Patches coming after I finish testing. Paolo > Always using the non-compat size will allow for future gfn_to_pfn_cache > clenups as this is (was) the only case where KVM uses a different size at > check()+refresh() than at activate(). E.g. the length/size of the cache > can be made immutable and dropped from check()+refresh(), which yields a > cleaner set of APIs and avoids potential bugs that could occur if check() > where invoked with a different size than refresh(). > > Fixes: a795cd43c5b5 ("KVM: x86/xen: Use gfn_to_pfn_cache for runstate area") > Cc: David Woodhouse <dwmw2@infradead.org> > Signed-off-by: Sean Christopherson <seanjc@google.com> > --- > arch/x86/kvm/xen.c | 5 +---- > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c > index b2be60c6efa4..9e79ef2cca99 100644 > --- a/arch/x86/kvm/xen.c > +++ b/arch/x86/kvm/xen.c > @@ -212,10 +212,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state) > if (!vx->runstate_cache.active) > return; > > - if (IS_ENABLED(CONFIG_64BIT) && v->kvm->arch.xen.long_mode) > - user_len = sizeof(struct vcpu_runstate_info); > - else > - user_len = sizeof(struct compat_vcpu_runstate_info); > + user_len = sizeof(struct vcpu_runstate_info); > > read_lock_irqsave(&gpc->lock, flags); > while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa,
On Thu, Oct 27, 2022, Paolo Bonzini wrote: > On 10/13/22 23:12, Sean Christopherson wrote: > > Always use the size of Xen's non-compat vcpu_runstate_info struct when > > checking that the GPA+size doesn't cross a page boundary. Conceptually, > > using the current mode is more correct, but KVM isn't consistent with > > itself as kvm_xen_vcpu_set_attr() unconditionally uses the "full" size > > when activating the cache. More importantly, prior to the introduction > > of the gfn_to_pfn_cache, KVM _always_ used the full size, i.e. allowing > > the guest (userspace?) to use a poorly aligned GPA in 32-bit mode but not > > 64-bit mode is more of a bug than a feature, and fixing the bug doesn't > > break KVM's historical ABI. > > I'd rather not introduce additional restrictions in KVM, But KVM already has this restriction. "struct vcpu_info" is always checked for the non-compat size, and as above, "struct vcpu_runstate_info" is checked for the non-compat size during its initialization. > actually easy to avoid this patch by instead enforcing that attributes are > set in a sensible order: I don't care about fixing XEN support, I care about forcing "length" to be immutable in order to simplify the gfn_to_pfn_cache implementation. Avoiding this patch prevents doing that later in this series. > - long mode cannot be changed after the shared info page is enabled (which > makes sense because the shared info page also has a compat version) How is this not introducing an additional restriction? This seems way more onerous than what is effectively a revert. > - the caches must be activated after the shared info page (which enforces > that the vCPU attributes are set after the VM attributes) > > This is technically a userspace API break, but nobody is really using this > API outside Amazon so... Patches coming after I finish testing. It's not just userspace break, it affects the guest ABI as well. arch.xen.long_mode isn't set just by userspace, it's also snapshot when the guest changes the hypercall page. Maybe there's something in the XEN ABI that says the hypercall page can never be changed, but barring that I don't see how to prevent ending up with a misaligned cache due to the guest enabling long mode. int kvm_xen_write_hypercall_page(struct kvm_vcpu *vcpu, u64 data) { struct kvm *kvm = vcpu->kvm; u32 page_num = data & ~PAGE_MASK; u64 page_addr = data & PAGE_MASK; bool lm = is_long_mode(vcpu); /* Latch long_mode for shared_info pages etc. */ vcpu->kvm->arch.xen.long_mode = lm;
On 10/27/22 16:44, Sean Christopherson wrote: >> - long mode cannot be changed after the shared info page is enabled (which >> makes sense because the shared info page also has a compat version) > > How is this not introducing an additional restriction? This seems way more > onerous than what is effectively a revert. > >> - the caches must be activated after the shared info page (which enforces >> that the vCPU attributes are set after the VM attributes) >> >> This is technically a userspace API break, but nobody is really using this >> API outside Amazon so... Patches coming after I finish testing. > > It's not just userspace break, it affects the guest ABI as well. Yes, I was talking of the VMM here; additional restrictions are fine there. The guests however should be compatible with Xen, so you also need to re-activate the cache after the hypercall page is written, but that's two lines of code. Paolo
On Thu, Oct 27, 2022, Paolo Bonzini wrote: > On 10/27/22 16:44, Sean Christopherson wrote: > > > - long mode cannot be changed after the shared info page is enabled (which > > > makes sense because the shared info page also has a compat version) > > > > How is this not introducing an additional restriction? This seems way more > > onerous than what is effectively a revert. > > > > > - the caches must be activated after the shared info page (which enforces > > > that the vCPU attributes are set after the VM attributes) > > > > > > This is technically a userspace API break, but nobody is really using this > > > API outside Amazon so... Patches coming after I finish testing. > > > > It's not just userspace break, it affects the guest ABI as well. > > Yes, I was talking of the VMM here; additional restrictions are fine there. Additional restrictions are fine where? > The guests however should be compatible with Xen, so you also need to > re-activate the cache after the hypercall page is written, but that's two > lines of code. And do what if the guest transitions from 32-bit => 64-bit and the cache isn't aligned for 64-bit? E.g. kvm_xen_set_evtchn() will silently drop events no matter what KVM does. In other words, I don't see how KVM can provide a same ABI without forcing the cached pages to be aligned for the largets possible size.
On Thu, Oct 27, 2022, Sean Christopherson wrote: > On Thu, Oct 27, 2022, Paolo Bonzini wrote: > > On 10/13/22 23:12, Sean Christopherson wrote: > > > Always use the size of Xen's non-compat vcpu_runstate_info struct when > > > checking that the GPA+size doesn't cross a page boundary. Conceptually, > > > using the current mode is more correct, but KVM isn't consistent with > > > itself as kvm_xen_vcpu_set_attr() unconditionally uses the "full" size > > > when activating the cache. More importantly, prior to the introduction > > > of the gfn_to_pfn_cache, KVM _always_ used the full size, i.e. allowing > > > the guest (userspace?) to use a poorly aligned GPA in 32-bit mode but not > > > 64-bit mode is more of a bug than a feature, and fixing the bug doesn't > > > break KVM's historical ABI. > > > > I'd rather not introduce additional restrictions in KVM, > > But KVM already has this restriction. "struct vcpu_info" is always checked for > the non-compat size, and as above, "struct vcpu_runstate_info" is checked for the > non-compat size during its initialization. Ah, I forgot those are the same size: BUILD_BUG_ON(sizeof(struct vcpu_info) != sizeof(struct compat_vcpu_info));
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c index b2be60c6efa4..9e79ef2cca99 100644 --- a/arch/x86/kvm/xen.c +++ b/arch/x86/kvm/xen.c @@ -212,10 +212,7 @@ void kvm_xen_update_runstate_guest(struct kvm_vcpu *v, int state) if (!vx->runstate_cache.active) return; - if (IS_ENABLED(CONFIG_64BIT) && v->kvm->arch.xen.long_mode) - user_len = sizeof(struct vcpu_runstate_info); - else - user_len = sizeof(struct compat_vcpu_runstate_info); + user_len = sizeof(struct vcpu_runstate_info); read_lock_irqsave(&gpc->lock, flags); while (!kvm_gfn_to_pfn_cache_check(v->kvm, gpc, gpc->gpa,