[v13,18/21] KVM: x86/xen: don't block on pfncache locks in kvm_xen_set_evtchn_fast()
Commit Message
From: Paul Durrant <pdurrant@amazon.com>
As described in [1], compiling with CONFIG_PROVE_RAW_LOCK_NESTING shows that
kvm_xen_set_evtchn_fast() may block on pfncache locks in IRQ context.
It only actually blocks with PREEMPT_RT, because the locks are then
turned into mutexes. There is no 'raw' version of rwlock_t that can be used
to avoid that, so use read_trylock() and treat failure to acquire the lock
the same as an invalid cache.
[1] https://lore.kernel.org/lkml/99771ef3a4966a01fefd3adbb2ba9c3a75f97cf2.camel@infradead.org/T/#mbd06e5a04534ce9c0ee94bd8f1e8d942b2d45bd6
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
---
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: x86@kernel.org
v13:
- Patch title change.
v11:
- Amended the commit comment.
v10:
- New in this version.
---
arch/x86/kvm/xen.c | 30 ++++++++++++++++++++----------
1 file changed, 20 insertions(+), 10 deletions(-)
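To make the change easier to follow, here is a minimal sketch of the locking
pattern the patch applies (not the patch itself; the function name
evtchn_fast_path() is hypothetical, while gpc->lock, kvm_gpc_check() and the
-EWOULDBLOCK fallback mirror the real code):

/*
 * On PREEMPT_RT an rwlock_t is a sleeping lock, so read_lock_irqsave()
 * must not be used in IRQ context. Instead, disable interrupts by hand
 * and use read_trylock(); failure to take the lock is treated exactly
 * like an invalid cache, i.e. the caller falls back to the slow path.
 */
static int evtchn_fast_path(struct gfn_to_pfn_cache *gpc)
{
	unsigned long flags;
	int rc = -EWOULDBLOCK;

	local_irq_save(flags);
	if (!read_trylock(&gpc->lock))
		goto out;		/* contended: treat as invalid */

	if (!kvm_gpc_check(gpc, PAGE_SIZE))
		goto out_unlock;	/* genuinely invalid cache */

	/* ... deliver the event via gpc->khva ... */
	rc = 1;

out_unlock:
	read_unlock(&gpc->lock);
out:
	local_irq_restore(flags);
	return rc;
}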
Comments
On Thu, Feb 15, 2024, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
>
> As described in [1], compiling with CONFIG_PROVE_RAW_LOCK_NESTING shows that
> kvm_xen_set_evtchn_fast() may block on pfncache locks in IRQ context.
> It only actually blocks with PREEMPT_RT, because the locks are then
> turned into mutexes. There is no 'raw' version of rwlock_t that can be used
> to avoid that, so use read_trylock() and treat failure to acquire the lock
> the same as an invalid cache.
>
> [1] https://lore.kernel.org/lkml/99771ef3a4966a01fefd3adbb2ba9c3a75f97cf2.camel@infradead.org/T/#mbd06e5a04534ce9c0ee94bd8f1e8d942b2d45bd6
>
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: David Woodhouse <dwmw2@infradead.org>
> Cc: x86@kernel.org
>
> v13:
> - Patch title change.
>
> v11:
> - Amended the commit comment.
>
> v10:
> - New in this version.
> ---
> arch/x86/kvm/xen.c | 30 ++++++++++++++++++++----------
> 1 file changed, 20 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> index 59073642c078..8650141b266e 100644
> --- a/arch/x86/kvm/xen.c
> +++ b/arch/x86/kvm/xen.c
> @@ -1678,10 +1678,13 @@ static int set_shinfo_evtchn_pending(struct kvm_vcpu *vcpu, u32 port)
> unsigned long flags;
> int rc = -EWOULDBLOCK;
>
> - read_lock_irqsave(&gpc->lock, flags);
> - if (!kvm_gpc_check(gpc, PAGE_SIZE))
> + local_irq_save(flags);
> + if (!read_trylock(&gpc->lock))
> goto out;
I am not comfortable applying this patch. As shown by the need for the next patch
to optimize unrelated invalidations, switching to read_trylock() is more subtle
than it seems at first glance. Specifically, there are no fairness guarantees.
I am not dead set against this change, but I don't want to put my SoB on what I
consider to be a hack.
I've zero objections if you can convince Paolo to take this directly, i.e. this
isn't a NAK. I just don't want to take it through my tree.
On 19/02/2024 22:04, Sean Christopherson wrote:
> On Thu, Feb 15, 2024, Paul Durrant wrote:
>> From: Paul Durrant <pdurrant@amazon.com>
>>
>> As described in [1], compiling with CONFIG_PROVE_RAW_LOCK_NESTING shows that
>> kvm_xen_set_evtchn_fast() may block on pfncache locks in IRQ context.
>> It only actually blocks with PREEMPT_RT, because the locks are then
>> turned into mutexes. There is no 'raw' version of rwlock_t that can be used
>> to avoid that, so use read_trylock() and treat failure to acquire the lock
>> the same as an invalid cache.
>>
>> [1] https://lore.kernel.org/lkml/99771ef3a4966a01fefd3adbb2ba9c3a75f97cf2.camel@infradead.org/T/#mbd06e5a04534ce9c0ee94bd8f1e8d942b2d45bd6
>>
>> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
>> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
>> ---
>> Cc: Sean Christopherson <seanjc@google.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Borislav Petkov <bp@alien8.de>
>> Cc: Dave Hansen <dave.hansen@linux.intel.com>
>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>> Cc: David Woodhouse <dwmw2@infradead.org>
>> Cc: x86@kernel.org
>>
>> v13:
>> - Patch title change.
>>
>> v11:
>> - Amended the commit comment.
>>
>> v10:
>> - New in this version.
>> ---
>> arch/x86/kvm/xen.c | 30 ++++++++++++++++++++----------
>> 1 file changed, 20 insertions(+), 10 deletions(-)
>>
>> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
>> index 59073642c078..8650141b266e 100644
>> --- a/arch/x86/kvm/xen.c
>> +++ b/arch/x86/kvm/xen.c
>> @@ -1678,10 +1678,13 @@ static int set_shinfo_evtchn_pending(struct kvm_vcpu *vcpu, u32 port)
>> unsigned long flags;
>> int rc = -EWOULDBLOCK;
>>
>> - read_lock_irqsave(&gpc->lock, flags);
>> - if (!kvm_gpc_check(gpc, PAGE_SIZE))
>> + local_irq_save(flags);
>> + if (!read_trylock(&gpc->lock))
>> goto out;
>
> I am not comfortable applying this patch. As shown by the need for the next patch
> to optimize unrelated invalidations, switching to read_trylock() is more subtle
> than it seems at first glance. Specifically, there are no fairness guarantees.
>
> I am not dead set against this change, but I don't want to put my SoB on what I
> consider to be a hack.
>
> I've zero objections if you can convince Paolo to take this directly, i.e. this
> isn't a NAK. I just don't want to take it through my tree.
Ok. I'll drop this from v14 then. It can go separately, assuming there
is no move to add a raw rwlock variant, which would negate it.
@@ -1678,10 +1678,13 @@ static int set_shinfo_evtchn_pending(struct kvm_vcpu *vcpu, u32 port)
unsigned long flags;
int rc = -EWOULDBLOCK;
- read_lock_irqsave(&gpc->lock, flags);
- if (!kvm_gpc_check(gpc, PAGE_SIZE))
+ local_irq_save(flags);
+ if (!read_trylock(&gpc->lock))
goto out;
+ if (!kvm_gpc_check(gpc, PAGE_SIZE))
+ goto out_unlock;
+
if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
struct shared_info *shinfo = gpc->khva;
@@ -1703,8 +1706,10 @@ static int set_shinfo_evtchn_pending(struct kvm_vcpu *vcpu, u32 port)
rc = 1; /* It is newly raised */
}
+ out_unlock:
+ read_unlock(&gpc->lock);
out:
- read_unlock_irqrestore(&gpc->lock, flags);
+ local_irq_restore(flags);
return rc;
}
@@ -1714,21 +1719,23 @@ static bool set_vcpu_info_evtchn_pending(struct kvm_vcpu *vcpu, u32 port)
struct gfn_to_pfn_cache *gpc = &vcpu->arch.xen.vcpu_info_cache;
unsigned long flags;
bool kick_vcpu = false;
+ bool locked;
- read_lock_irqsave(&gpc->lock, flags);
+ local_irq_save(flags);
+ locked = read_trylock(&gpc->lock);
/*
* Try to deliver the event directly to the vcpu_info. If successful and
* the guest is using upcall_vector delivery, send the MSI.
- * If the pfncache is invalid, set the shadow. In this case, or if the
- * guest is using another form of event delivery, the vCPU must be
- * kicked to complete the delivery.
+ * If the pfncache lock is contended or the cache is invalid, set the
+ * shadow. In this case, or if the guest is using another form of event
+ * delivery, the vCPU must be kicked to complete the delivery.
*/
if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode) {
struct vcpu_info *vcpu_info = gpc->khva;
int port_word_bit = port / 64;
- if (!kvm_gpc_check(gpc, sizeof(*vcpu_info))) {
+ if (!locked || !kvm_gpc_check(gpc, sizeof(*vcpu_info))) {
if (!test_and_set_bit(port_word_bit, &vcpu->arch.xen.evtchn_pending_sel))
kick_vcpu = true;
goto out;
@@ -1742,7 +1749,7 @@ static bool set_vcpu_info_evtchn_pending(struct kvm_vcpu *vcpu, u32 port)
struct compat_vcpu_info *vcpu_info = gpc->khva;
int port_word_bit = port / 32;
- if (!kvm_gpc_check(gpc, sizeof(*vcpu_info))) {
+ if (!locked || !kvm_gpc_check(gpc, sizeof(*vcpu_info))) {
if (!test_and_set_bit(port_word_bit, &vcpu->arch.xen.evtchn_pending_sel))
kick_vcpu = true;
goto out;
@@ -1761,7 +1768,10 @@ static bool set_vcpu_info_evtchn_pending(struct kvm_vcpu *vcpu, u32 port)
}
out:
- read_unlock_irqrestore(&gpc->lock, flags);
+ if (locked)
+ read_unlock(&gpc->lock);
+
+ local_irq_restore(flags);
return kick_vcpu;
}
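
For reference, the conditional-unlock shape used in
set_vcpu_info_evtchn_pending() above boils down to the following (a sketch
under the patch's assumptions; set_shadow_and_kick() and deliver_directly()
are hypothetical stand-ins for the real fallback and delivery logic):

	unsigned long flags;
	bool locked;

	local_irq_save(flags);
	locked = read_trylock(&gpc->lock);

	/* Lock contention is folded into the "cache invalid" case. */
	if (!locked || !kvm_gpc_check(gpc, sizeof(struct vcpu_info)))
		set_shadow_and_kick();		/* hypothetical fallback */
	else
		deliver_directly(gpc->khva);	/* hypothetical delivery */

	/* Only drop the lock if it was actually taken ... */
	if (locked)
		read_unlock(&gpc->lock);

	/* ... but always restore the interrupt state. */
	local_irq_restore(flags);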