Message ID | 20231002095740.1472907-12-paul@xen.org |
---|---|
State | New |
Headers |
From: Paul Durrant <paul@xen.org>
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Paul Durrant <pdurrant@amazon.com>, David Woodhouse <dwmw@amazon.co.uk>, David Woodhouse <dwmw2@infradead.org>, Sean Christopherson <seanjc@google.com>, Paolo Bonzini <pbonzini@redhat.com>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, "H. Peter Anvin" <hpa@zytor.com>, x86@kernel.org
Subject: [PATCH v7 11/11] KVM: xen: allow vcpu_info content to be 'safely' copied
Date: Mon, 2 Oct 2023 09:57:40 +0000
Message-Id: <20231002095740.1472907-12-paul@xen.org>
In-Reply-To: <20231002095740.1472907-1-paul@xen.org>
References: <20231002095740.1472907-1-paul@xen.org>
Series |
KVM: xen: update shared_info and vcpu_info handling
|
|
Commit Message
Paul Durrant
Oct. 2, 2023, 9:57 a.m. UTC
From: Paul Durrant <pdurrant@amazon.com>

If the guest sets an explicit vcpu_info GPA then, for any of the first 32
vCPUs, the content of the default vcpu_info in the shared_info page must be
copied into the new location. Because this copy may race with event delivery
(which updates the 'evtchn_pending_sel' field in vcpu_info) we need a way to
defer that until the copy is complete.

Happily there is already a shadow of 'evtchn_pending_sel' in kvm_vcpu_xen
that is used in atomic context if the vcpu_info PFN cache has been
invalidated so that the update of vcpu_info can be deferred until the cache
can be refreshed (on the vCPU thread's way back into guest context). So
let's also use this shadow if the vcpu_info cache has been *deactivated*, so
that the VMM can safely copy the vcpu_info content and then re-activate the
cache with the new GPA. To do this, all we need to do is stop considering an
inactive vcpu_info cache as a hard error in kvm_xen_set_evtchn_fast().

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
---
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org

v6:
 - New in this version.
---
 arch/x86/kvm/xen.c | 3 ---
 1 file changed, 3 deletions(-)
Comments
On Mon, Oct 02, 2023, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
>
> If the guest sets an explicit vcpu_info GPA then, for any of the first 32
> vCPUs, the content of the default vcpu_info in the shared_info page must be
> copied into the new location. Because this copy may race with event
> delivery (which updates the 'evtchn_pending_sel' field in vcpu_info) we
> need a way to defer that until the copy is complete.

Nit, add a blank line between paragraphs.

> Happily there is already a shadow of 'evtchn_pending_sel' in kvm_vcpu_xen
> that is used in atomic context if the vcpu_info PFN cache has been
> invalidated so that the update of vcpu_info can be deferred until the
> cache can be refreshed (on the vCPU thread's way back into guest context).
> So let's also use this shadow if the vcpu_info cache has been
> *deactivated*, so that the VMM can safely copy the vcpu_info content and
> then re-activate the cache with the new GPA. To do this, all we need to do
> is stop considering an inactive vcpu_info cache as a hard error in
> kvm_xen_set_evtchn_fast().

Please, please try to write changelogs that adhere to the preferred style.
I get that the preferred style likely doesn't align with what you're used
to, but the preferred style really does help me get through reviews quicker.

> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> index aafc794940e4..e645066217bb 100644
> --- a/arch/x86/kvm/xen.c
> +++ b/arch/x86/kvm/xen.c
> @@ -1606,9 +1606,6 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
>  		WRITE_ONCE(xe->vcpu_idx, vcpu->vcpu_idx);
>  	}
>
> -	if (!vcpu->arch.xen.vcpu_info_cache.active)
> -		return -EINVAL;
> -

Hmm, maybe move this check after the "hard" error checks and explicitly do:

	return -EWOULDBLOCK

That way it's much more obvious that this patch is safe. Alternatively,
briefly explain what happens if the cache is invalid in the changelog.

>  	if (xe->port >= max_evtchn_port(kvm))
>  		return -EINVAL;
>
> --
> 2.39.2
>
On Tue, 2023-10-31 at 16:58 -0700, Sean Christopherson wrote:
>
> > diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> > index aafc794940e4..e645066217bb 100644
> > --- a/arch/x86/kvm/xen.c
> > +++ b/arch/x86/kvm/xen.c
> > @@ -1606,9 +1606,6 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
> >  		WRITE_ONCE(xe->vcpu_idx, vcpu->vcpu_idx);
> >  	}
> >
> > -	if (!vcpu->arch.xen.vcpu_info_cache.active)
> > -		return -EINVAL;
> > -
>
> Hmm, maybe move this check after the "hard" error checks and explicitly do:
>
> 	return -EWOULDBLOCK
>
> That way it's much more obvious that this patch is safe. Alternatively, briefly
> explain what happens if the cache is invalid in the changelog.

Hm, I thought we dropped that? It doesn't want to return -EWOULDBLOCK.
That would cause a "fast" irqfd delivery to defer to the slow (workqueue)
path, but then the same thing would happen on that path *too*.

I actually think this check needs to be dropped completely. The
evtchn_pending_sel bit will get set in the *shadow* copy in
vcpu->arch.xen.evtchn_pending_sel, and will subsequently be set in the real
vcpu_info when that is configured again. We do already handle that case
correctly, I believe:

	if (!kvm_gpc_check(gpc, sizeof(struct vcpu_info))) {
		/*
		 * Could not access the vcpu_info. Set the bit in-kernel
		 * and prod the vCPU to deliver it for itself.
		 */
		if (!test_and_set_bit(port_word_bit, &vcpu->arch.xen.evtchn_pending_sel))
			kick_vcpu = true;
		goto out_rcu;
	}

If we leave this check in place as it is, I think we're going to lose
incoming events (and IPIs) directed at a vCPU while it has its vcpu_info
temporarily removed.

(We do want to "temporarily" remove the vcpu_info so that we can adjust
memslots, because there's no way to atomically break apart a large memslot
and overlay a single page in the middle of it. And also when a guest is
paused for migration, it's nice to be sure that no changes will happen to
the guest memory and trigger sanity checks that the memory on the
destination precisely matches the source.)
On Wed, Nov 08, 2023, David Woodhouse wrote:
> On Tue, 2023-10-31 at 16:58 -0700, Sean Christopherson wrote:
> >
> > > diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> > > index aafc794940e4..e645066217bb 100644
> > > --- a/arch/x86/kvm/xen.c
> > > +++ b/arch/x86/kvm/xen.c
> > > @@ -1606,9 +1606,6 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
> > >  		WRITE_ONCE(xe->vcpu_idx, vcpu->vcpu_idx);
> > >  	}
> > >
> > > -	if (!vcpu->arch.xen.vcpu_info_cache.active)
> > > -		return -EINVAL;
> > > -
> >
> > Hmm, maybe move this check after the "hard" error checks and explicitly do:
> >
> > 	return -EWOULDBLOCK
> >
> > That way it's much more obvious that this patch is safe. Alternatively, briefly
> > explain what happens if the cache is invalid in the changelog.
>
> Hm, I thought we dropped that? It doesn't want to return -EWOULDBLOCK.
> That would cause a "fast" irqfd delivery to defer to the slow
> (workqueue) path, but then the same thing would happen on that path
> *too*.
>
> I actually think this check needs to be dropped completely. The
> evtchn_pending_sel bit will get set in the *shadow* copy in
> vcpu->arch.xen.evtchn_pending_sel, and will subsequently be set in the
> real vcpu_info when that is configured again. We do already handle that
> case correctly, I believe:
>
> 	if (!kvm_gpc_check(gpc, sizeof(struct vcpu_info))) {
> 		/*
> 		 * Could not access the vcpu_info. Set the bit in-kernel
> 		 * and prod the vCPU to deliver it for itself.
> 		 */
> 		if (!test_and_set_bit(port_word_bit, &vcpu->arch.xen.evtchn_pending_sel))
> 			kick_vcpu = true;
> 		goto out_rcu;
> 	}
>
> If we leave this check in place as it is, I think we're going to lose
> incoming events (and IPIs) directed at a vCPU while it has its
> vcpu_info temporarily removed.

Oh, dagnabbit, there are two different gfn_to_pfn_cache objects. I assumed
"gpc" was the same thing as vcpu_info_cache.active, i.e. thought the
explicit active check here was just to pre-check the validity before
acquiring the lock.

	if (!vcpu->arch.xen.vcpu_info_cache.active)
		return -EINVAL;

	rc = -EWOULDBLOCK;

	idx = srcu_read_lock(&kvm->srcu);
	read_lock_irqsave(&gpc->lock, flags);
	if (!kvm_gpc_check(gpc, PAGE_SIZE))    <== I thought this was the same check
		goto out_rcu;

That's why my comment was nonsensical, and also why I was super confused by
your response for a few minutes :-)

Good gravy, this even conditionally switches gpc and makes the out path drop
different locks. That's just mean.

Any objection to splitting out the vcpu_info chunk to a separate function?
That'd eliminate the subtle gpc+lock switch, and provide a convenient and
noticeable location to explain why it's ok for setting the vcpu_info to
fail. E.g.

---
 arch/x86/kvm/xen.c | 102 ++++++++++++++++++++++++++-------------------
 1 file changed, 58 insertions(+), 44 deletions(-)

diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index d65e89e062e4..cd2021ba0ae3 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -1621,6 +1621,57 @@ static void kvm_xen_check_poller(struct kvm_vcpu *vcpu, int port)
 	}
 }
 
+static int kvm_xen_needs_a_name(struct kvm_vcpu *vcpu, int port_word_bit)
+{
+	struct gfn_to_pfn_cache *gpc = &vcpu->arch.xen.vcpu_info_cache;
+	bool kick_vcpu = false;
+	unsigned long flags;
+	int r = 0;
+
+	read_lock_irqsave(&gpc->lock, flags);
+	if (!kvm_gpc_check(gpc, sizeof(struct vcpu_info))) {
+		/*
+		 * Could not access the vcpu_info. Set the bit in-kernel and
+		 * prod the vCPU to deliver it for itself.
+		 */
+		if (!test_and_set_bit(port_word_bit, &vcpu->arch.xen.evtchn_pending_sel))
+			kick_vcpu = true;
+
+		r = -EWOULDBLOCK;
+		goto out;
+	}
+
+	if (IS_ENABLED(CONFIG_64BIT) && vcpu->kvm->arch.xen.long_mode) {
+		struct vcpu_info *vcpu_info = gpc->khva;
+
+		if (!test_and_set_bit(port_word_bit, &vcpu_info->evtchn_pending_sel)) {
+			WRITE_ONCE(vcpu_info->evtchn_upcall_pending, 1);
+			kick_vcpu = true;
+		}
+	} else {
+		struct compat_vcpu_info *vcpu_info = gpc->khva;
+
+		if (!test_and_set_bit(port_word_bit,
+				      (unsigned long *)&vcpu_info->evtchn_pending_sel)) {
+			WRITE_ONCE(vcpu_info->evtchn_upcall_pending, 1);
+			kick_vcpu = true;
+		}
+	}
+
+	/* For the per-vCPU lapic vector, deliver it as MSI. */
+	if (kick_vcpu && vcpu->arch.xen.upcall_vector) {
+		kvm_xen_inject_vcpu_vector(vcpu);
+		kick_vcpu = false;
+	}
+
+out:
+	read_unlock_irqrestore(&gpc->lock, flags);
+
+	if (kick_vcpu) {
+		kvm_make_request(KVM_REQ_UNBLOCK, vcpu);
+		kvm_vcpu_kick(vcpu);
+	}
+
+	return r;
+}
+
 /*
  * The return value from this function is propagated to kvm_set_irq() API,
  * so it returns:
@@ -1638,7 +1689,6 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 	unsigned long *pending_bits, *mask_bits;
 	unsigned long flags;
 	int port_word_bit;
-	bool kick_vcpu = false;
 	int vcpu_idx, idx, rc;
 
 	vcpu_idx = READ_ONCE(xe->vcpu_idx);
@@ -1660,7 +1710,7 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 
 	read_lock_irqsave(&gpc->lock, flags);
 	if (!kvm_gpc_check(gpc, PAGE_SIZE))
-		goto out_rcu;
+		goto out_unlock;
 
 	if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
 		struct shared_info *shinfo = gpc->khva;
@@ -1688,52 +1738,16 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 		kvm_xen_check_poller(vcpu, xe->port);
 	} else {
 		rc = 1; /* Delivered to the bitmap in shared_info. */
-		/* Now switch to the vCPU's vcpu_info to set the index and pending_sel */
-		read_unlock_irqrestore(&gpc->lock, flags);
-		gpc = &vcpu->arch.xen.vcpu_info_cache;
-
-		read_lock_irqsave(&gpc->lock, flags);
-		if (!kvm_gpc_check(gpc, sizeof(struct vcpu_info))) {
-			/*
-			 * Could not access the vcpu_info. Set the bit in-kernel
-			 * and prod the vCPU to deliver it for itself.
-			 */
-			if (!test_and_set_bit(port_word_bit, &vcpu->arch.xen.evtchn_pending_sel))
-				kick_vcpu = true;
-			goto out_rcu;
-		}
-
-		if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
-			struct vcpu_info *vcpu_info = gpc->khva;
-			if (!test_and_set_bit(port_word_bit, &vcpu_info->evtchn_pending_sel)) {
-				WRITE_ONCE(vcpu_info->evtchn_upcall_pending, 1);
-				kick_vcpu = true;
-			}
-		} else {
-			struct compat_vcpu_info *vcpu_info = gpc->khva;
-			if (!test_and_set_bit(port_word_bit,
-					      (unsigned long *)&vcpu_info->evtchn_pending_sel)) {
-				WRITE_ONCE(vcpu_info->evtchn_upcall_pending, 1);
-				kick_vcpu = true;
-			}
-		}
-
-		/* For the per-vCPU lapic vector, deliver it as MSI. */
-		if (kick_vcpu && vcpu->arch.xen.upcall_vector) {
-			kvm_xen_inject_vcpu_vector(vcpu);
-			kick_vcpu = false;
-		}
 	}
 
- out_rcu:
+ out_unlock:
 	read_unlock_irqrestore(&gpc->lock, flags);
+
+	/* Paul to add comment explaining why it's ok kvm_xen_needs_a_name() fails */
+	if (rc == 1)
+		rc = kvm_xen_needs_a_name(vcpu, port_word_bit);
+
 	srcu_read_unlock(&kvm->srcu, idx);
-
-	if (kick_vcpu) {
-		kvm_make_request(KVM_REQ_UNBLOCK, vcpu);
-		kvm_vcpu_kick(vcpu);
-	}
-
 	return rc;
 }

base-commit: 0f34262cf6fab8163456c7740c7ee1888a8af93c
--
On Wed, 2023-11-08 at 12:55 -0800, Sean Christopherson wrote:
>
> Any objection to splitting out the vcpu_info chunk to a separate function? That'd
> eliminate the subtle gpc+lock switch, and provide a convenient and noticeable
> location to explain why it's ok for setting the vcpu_info to fail.

No objection, that looks entirely sane to me.

> E.g.
>
> ---
>  arch/x86/kvm/xen.c | 102 ++++++++++++++++++++++++++-------------------
>  1 file changed, 58 insertions(+), 44 deletions(-)
>
> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> index d65e89e062e4..cd2021ba0ae3 100644
> --- a/arch/x86/kvm/xen.c
> +++ b/arch/x86/kvm/xen.c
> @@ -1621,6 +1621,57 @@ static void kvm_xen_check_poller(struct kvm_vcpu *vcpu, int port)
>  	}
>  }
>
> +static int kvm_xen_needs_a_name(struct kvm_vcpu *vcpu, int port_word_bit)

kvm_xen_deliver_vcpu_evtchn_pending_sel() ?

...

> - out_rcu:
> + out_unlock:
>  	read_unlock_irqrestore(&gpc->lock, flags);
> +
> +	/* Paul to add comment explaining why it's ok kvm_xen_needs_a_name() fails */

Basically the point here is that it *doesn't* fail. Either it delivers
directly to the vcpu_info at the appropriate GPA via the GPC, or it will set
the corresponding bit in the 'shadow' vcpu->arch.xen.evtchn_pending_sel to
be delivered later by kvm_xen_inject_pending_events(). Either way, it's
'success'.

I do want to vet the code path where userspace forgets to reinstate the
vcpu_info before running the vCPU again. Looking at
kvm_xen_inject_pending_events(), I think the kvm_gpc_check() for the
vcpu_info will fail, so it'll just return, and KVM will enter the vCPU with
vcpu->arch.xen.evtchn_pending_sel still unchanged? The bits won't get
delivered until the next time the vCPU is entered after the vcpu_info is
set.

Which I *think* is fine. Userspace has to stop running the vCPU in order to
actually set the vcpu_info again, when it finally decides to do so. And if
it never does, that works out as well as can be expected. Probably does
want adding to the selftest.
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index aafc794940e4..e645066217bb 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -1606,9 +1606,6 @@ int kvm_xen_set_evtchn_fast(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 		WRITE_ONCE(xe->vcpu_idx, vcpu->vcpu_idx);
 	}
 
-	if (!vcpu->arch.xen.vcpu_info_cache.active)
-		return -EINVAL;
-
 	if (xe->port >= max_evtchn_port(kvm))
 		return -EINVAL;