Documentation: kvm: clarify SRCU locking order

Message ID 20221228110410.1682852-2-pbonzini@redhat.com
State New
Series Documentation: kvm: clarify SRCU locking order

Commit Message

Paolo Bonzini Dec. 28, 2022, 11:04 a.m. UTC
Currently only the locking order of SRCU vs kvm->slots_arch_lock
and kvm->slots_lock is documented.  Extend this to kvm->lock
since Xen emulation got it terribly wrong.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 Documentation/virt/kvm/locking.rst | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)
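
As an illustration of the bug class this patch documents, the deadlock
arises when the two nesting orders meet.  A minimal sketch follows; both
functions are hypothetical and exist only to show the ordering, they are
not code from the tree:

static void side_a(struct kvm *kvm)
{
	mutex_lock(&kvm->lock);
	/* Blocks until all kvm->srcu readers, including side_b(), finish. */
	synchronize_srcu(&kvm->srcu);
	mutex_unlock(&kvm->lock);
}

static void side_b(struct kvm *kvm)
{
	int idx = srcu_read_lock(&kvm->srcu);

	/* Never succeeds while side_a() is waiting in synchronize_srcu(). */
	mutex_lock(&kvm->lock);
	mutex_unlock(&kvm->lock);
	srcu_read_unlock(&kvm->srcu, idx);
}

side_a() cannot return from synchronize_srcu() until side_b() leaves its
read-side critical section, and side_b() cannot leave it until side_a()
drops kvm->lock.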
  

Comments

Sean Christopherson Jan. 3, 2023, 5:42 p.m. UTC | #1
On Wed, Dec 28, 2022, Paolo Bonzini wrote:
> Currently only the locking order of SRCU vs kvm->slots_arch_lock
> and kvm->slots_lock is documented.  Extend this to kvm->lock
> since Xen emulation got it terribly wrong.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  Documentation/virt/kvm/locking.rst | 19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
> index 845a561629f1..a3ca76f9be75 100644
> --- a/Documentation/virt/kvm/locking.rst
> +++ b/Documentation/virt/kvm/locking.rst
> @@ -16,17 +16,26 @@ The acquisition orders for mutexes are as follows:
>  - kvm->slots_lock is taken outside kvm->irq_lock, though acquiring
>    them together is quite rare.
>  
> -- Unlike kvm->slots_lock, kvm->slots_arch_lock is released before
> -  synchronize_srcu(&kvm->srcu).  Therefore kvm->slots_arch_lock
> -  can be taken inside a kvm->srcu read-side critical section,
> -  while kvm->slots_lock cannot.
> -
>  - kvm->mn_active_invalidate_count ensures that pairs of
>    invalidate_range_start() and invalidate_range_end() callbacks
>    use the same memslots array.  kvm->slots_lock and kvm->slots_arch_lock
>    are taken on the waiting side in install_new_memslots, so MMU notifiers
>    must not take either kvm->slots_lock or kvm->slots_arch_lock.
>  
> +For SRCU:
> +
> +- ``synchronize_srcu(&kvm->srcu)`` is called _inside_
> +  the kvm->slots_lock critical section, therefore kvm->slots_lock
> +  cannot be taken inside a kvm->srcu read-side critical section.
> +  Instead, kvm->slots_arch_lock is released before the call
> +  to ``synchronize_srcu()`` and _can_ be taken inside a
> +  kvm->srcu read-side critical section.
> +
> +- kvm->lock is taken inside kvm->srcu, therefore

Prior to the recent Xen change, is this actually true?  There are many instances
where kvm->srcu is taken inside kvm->lock, but I can't find any existing cases
where the reverse is true.  Logically, it makes sense to take kvm->lock first since
kvm->srcu can be taken deep in helpers, e.g. for accessing guest memory.  It's also
more consistent to take kvm->lock first since kvm->srcu is taken inside vcpu->mutex,
and vcpu->mutex is taken inside kvm->lock.

Disallowing synchronize_srcu(kvm->srcu) inside kvm->lock isn't problematic per se,
but it's going to result in a weird set of rules because synchronize_srcu() can be,
and is, called while holding a variety of other locks.

In other words, IMO taking kvm->srcu outside of kvm->lock in the Xen code is the
real bug.

> +  ``synchronize_srcu(&kvm->srcu)`` cannot be called inside
> +  a kvm->lock critical section.  If you cannot delay the
> +  call until after kvm->lock is released, use ``call_srcu``.
> +
>  On x86:
>  
>  - vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock
> -- 
> 2.31.1
>
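
For reference, the call_srcu() escape hatch mentioned in the proposed text
defers the grace-period wait instead of blocking on it.  A minimal sketch,
assuming a hypothetical object with an embedded rcu_head; struct foo,
foo_free_rcu() and foo_destroy() are illustrations, not functions in the
tree:

struct foo {
	struct rcu_head rcu;
	/* ... payload visible to kvm->srcu readers ... */
};

static void foo_free_rcu(struct rcu_head *head)
{
	kfree(container_of(head, struct foo, rcu));
}

static void foo_destroy(struct kvm *kvm, struct foo *foo)
{
	mutex_lock(&kvm->lock);
	/* ... unpublish foo from whatever structures readers walk ... */

	/*
	 * synchronize_srcu(&kvm->srcu) would deadlock here per the rule
	 * above; instead, free foo once all current readers are done.
	 */
	call_srcu(&kvm->srcu, &foo->rcu, foo_free_rcu);
	mutex_unlock(&kvm->lock);
}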
  
Sean Christopherson Jan. 5, 2023, 10:38 p.m. UTC | #2
On Tue, Jan 03, 2023, Sean Christopherson wrote:
> On Wed, Dec 28, 2022, Paolo Bonzini wrote:
> > Currently only the locking order of SRCU vs kvm->slots_arch_lock
> > and kvm->slots_lock is documented.  Extend this to kvm->lock
> > since Xen emulation got it terribly wrong.
> > 
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  Documentation/virt/kvm/locking.rst | 19 ++++++++++++++-----
> >  1 file changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
> > index 845a561629f1..a3ca76f9be75 100644
> > --- a/Documentation/virt/kvm/locking.rst
> > +++ b/Documentation/virt/kvm/locking.rst
> > @@ -16,17 +16,26 @@ The acquisition orders for mutexes are as follows:
> >  - kvm->slots_lock is taken outside kvm->irq_lock, though acquiring
> >    them together is quite rare.
> >  
> > -- Unlike kvm->slots_lock, kvm->slots_arch_lock is released before
> > -  synchronize_srcu(&kvm->srcu).  Therefore kvm->slots_arch_lock
> > -  can be taken inside a kvm->srcu read-side critical section,
> > -  while kvm->slots_lock cannot.
> > -
> >  - kvm->mn_active_invalidate_count ensures that pairs of
> >    invalidate_range_start() and invalidate_range_end() callbacks
> >    use the same memslots array.  kvm->slots_lock and kvm->slots_arch_lock
> >    are taken on the waiting side in install_new_memslots, so MMU notifiers
> >    must not take either kvm->slots_lock or kvm->slots_arch_lock.
> >  
> > +For SRCU:
> > +
> > +- ``synchronize_srcu(&kvm->srcu)`` is called _inside_
> > +  the kvm->slots_lock critical section, therefore kvm->slots_lock
> > +  cannot be taken inside a kvm->srcu read-side critical section.
> > +  Instead, kvm->slots_arch_lock is released before the call
> > +  to ``synchronize_srcu()`` and _can_ be taken inside a
> > +  kvm->srcu read-side critical section.
> > +
> > +- kvm->lock is taken inside kvm->srcu, therefore
> 
> Prior to the recent Xen change, is this actually true?

I was thinking of a different change, but v5.19 is still kinda recent, so I'll
count it.  Better to be lucky than good :-)

Commit 2fd6df2f2b47 ("KVM: x86/xen: intercept EVTCHNOP_send from guests") introduced
the only case I can find where kvm->lock is taken inside kvm->srcu.

> There are many instances where kvm->srcu is taken inside kvm->lock, but I
> can't find any existing cases where the reverse is true.  Logically, it makes
> sense to take kvm->lock first since kvm->srcu can be taken deep in helpers,
> e.g. for accessing guest memory.  It's also more consistent to take kvm->lock
> first since kvm->srcu is taken inside vcpu->mutex, and vcpu->mutex is taken
> inside kvm->lock.
> 
> Disallowing synchronize_srcu(kvm->srcu) inside kvm->lock isn't problematic per se,
> but it's going to result in a weird set of rules because synchronize_srcu() can be,
> and is, called while holding a variety of other locks.
> 
> In other words, IMO taking kvm->srcu outside of kvm->lock in the Xen code is the
> real bug.

I'm doubling down on this.  Taking kvm->srcu outside of kvm->lock is all kinds of
sketchy, and likely indicates a larger problem.  The aforementioned commit that
introduced the problematic kvm->srcu vs. kvm->lock nesting also blatantly violates
ordering between kvm->lock and vcpu->mutex.  Details in the link[*].

The vast majority of flows that take kvm->srcu will already hold a lock of some
kind, otherwise the task can't safely dereference any VM/vCPU/device data and thus
has no reason to acquire kvm->srcu.  E.g. taking kvm->srcu to read guest memory
is nonsensical without a stable guest physical address to work with.

There are exceptions, e.g. evtchn_set_fn() and maybe some ioctls(), but in most
cases, taking kvm->lock inside kvm->srcu is just asking for problems.

[*] https://lore.kernel.org/all/Y7dN0Negds7XUbvI@google.com
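
The ordering being argued for here is the one most existing KVM flows
already follow.  Roughly, as a sketch of the pattern only (example_flow()
is hypothetical, not a specific function in the tree):

static void example_flow(struct kvm *kvm)
{
	int idx;

	mutex_lock(&kvm->lock);		/* or vcpu->mutex, or a leaf mutex */
	idx = srcu_read_lock(&kvm->srcu);

	/* ... dereference memslots, access guest memory, etc. ... */

	srcu_read_unlock(&kvm->srcu, idx);
	mutex_unlock(&kvm->lock);
}

With that ordering, a kvm->srcu reader never waits on kvm->lock, so
calling synchronize_srcu(&kvm->srcu) while holding kvm->lock stays safe.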
  
David Woodhouse Jan. 11, 2023, 10:26 a.m. UTC | #3
On Thu, 2023-01-05 at 22:38 +0000, Sean Christopherson wrote:
> 
> > > +- kvm->lock is taken inside kvm->srcu, therefore
> > 
> > Prior to the recent Xen change, is this actually true?
> 
> I was thinking of a different change, but v5.19 is still kinda recent, so I'll
> count it.  Better to be lucky than good :-)
> 
> Commit 2fd6df2f2b47 ("KVM: x86/xen: intercept EVTCHNOP_send from guests") introduced
> the only case I can find where kvm->lock is taken inside kvm->srcu.
> 
> > There are many instances where kvm->srcu is taken inside kvm->lock, but I
> > can't find any existing cases where the reverse is true.  Logically, it makes
> > sense to take kvm->lock first since kvm->srcu can be taken deep in helpers,
> > e.g. for accessing guest memory.  It's also more consistent to take kvm->lock
> > first since kvm->srcu is taken inside vcpu->mutex, and vcpu->mutex is taken
> > inside kvm->lock.
> > 
> > Disallowing synchronize_srcu(kvm->srcu) inside kvm->lock isn't problematic per se,
> > but it's going to result in a weird set of rules because synchronize_srcu() can be,
> > and is, called while holding a variety of other locks.
> > 
> > In other words, IMO taking kvm->srcu outside of kvm->lock in the Xen code is the
> > real bug.
> 
> I'm doubling down on this.  Taking kvm->srcu outside of kvm->lock is all kinds of
> sketchy, and likely indicates a larger problem.  The aforementioned commit that
> introduced the problematic kvm->srcu vs. kvm->lock nesting also blatantly violates
> ordering between kvm->lock and vcpu->mutex.  Details in the link[*].
> 
> The vast majority of flows that take kvm->srcu will already hold a lock of some
> kind, otherwise the task can't safely dereference any VM/vCPU/device data and thus
> has no reason to acquire kvm->srcu.  E.g. taking kvm->srcu to read guest memory
> is nonsensical without a stable guest physical address to work with.
> 
> There are exceptions, e.g. evtchn_set_fn() and maybe some ioctls(), but in most
> cases, taking kvm->lock inside kvm->srcu is just asking for problems.
> 
> [*] https://lore.kernel.org/all/Y7dN0Negds7XUbvI@google.com

So...

If we apply the patch below, then the Xen code never takes kvm->lock
inside kvm->srcu. (Never takes kvm->lock at all, in fact; I'm just
rereading the manual search/replace to make sure the change is valid in
all cases.)

It does take kvm->srcu *without* holding kvm->lock, just not "outside"
it.

I think we can probably revert commit a79b53aaaa ("KVM: x86: fix
deadlock for KVM_XEN_EVTCHN_RESET") after doing this too?



From: David Woodhouse <dwmw@amazon.co.uk>
Date: Wed, 11 Jan 2023 10:04:38 +0000
Subject: [PATCH] KVM: x86/xen: Avoid deadlock by adding kvm->arch.xen.xen_lock leaf node lock

In commit 14243b387137a ("KVM: x86/xen: Add KVM_IRQ_ROUTING_XEN_EVTCHN
and event channel delivery") the clever version of me left some helpful
notes for those who would come after him:

       /*
        * For the irqfd workqueue, using the main kvm->lock mutex is
        * fine since this function is invoked from kvm_set_irq() with
        * no other lock held, no srcu. In future if it will be called
        * directly from a vCPU thread (e.g. on hypercall for an IPI)
        * then it may need to switch to using a leaf-node mutex for
        * serializing the shared_info mapping.
        */
       mutex_lock(&kvm->lock);

In commit 2fd6df2f2b47 ("KVM: x86/xen: intercept EVTCHNOP_send from guests")
the other version of me ran straight past that comment without reading it,
and introduced a potential deadlock by taking vcpu->mutex and kvm->lock
in the wrong order.

Solve this as originally suggested, by adding a leaf-node lock in the Xen
state rather than using kvm->lock for it.

Fixes: 2fd6df2f2b47 ("KVM: x86/xen: intercept EVTCHNOP_send from guests")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/xen.c              | 65 +++++++++++++++------------------
 2 files changed, 30 insertions(+), 36 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 2f5bf581d00a..4ef0143fa682 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1115,6 +1115,7 @@ struct msr_bitmap_range {
 
 /* Xen emulation context */
 struct kvm_xen {
+	struct mutex xen_lock;
 	u32 xen_version;
 	bool long_mode;
 	bool runstate_update_flag;
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index c444948ab1ac..713241808a3d 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -604,26 +604,26 @@ int kvm_xen_hvm_set_attr(struct kvm *kvm, struct kvm_xen_hvm_attr *data)
 		if (!IS_ENABLED(CONFIG_64BIT) && data->u.long_mode) {
 			r = -EINVAL;
 		} else {
-			mutex_lock(&kvm->lock);
+			mutex_lock(&kvm->arch.xen.xen_lock);
 			kvm->arch.xen.long_mode = !!data->u.long_mode;
-			mutex_unlock(&kvm->lock);
+			mutex_unlock(&kvm->arch.xen.xen_lock);
 			r = 0;
 		}
 		break;
 
 	case KVM_XEN_ATTR_TYPE_SHARED_INFO:
-		mutex_lock(&kvm->lock);
+		mutex_lock(&kvm->arch.xen.xen_lock);
 		r = kvm_xen_shared_info_init(kvm, data->u.shared_info.gfn);
-		mutex_unlock(&kvm->lock);
+		mutex_unlock(&kvm->arch.xen.xen_lock);
 		break;
 
 	case KVM_XEN_ATTR_TYPE_UPCALL_VECTOR:
 		if (data->u.vector && data->u.vector < 0x10)
 			r = -EINVAL;
 		else {
-			mutex_lock(&kvm->lock);
+			mutex_lock(&kvm->arch.xen.xen_lock);
 			kvm->arch.xen.upcall_vector = data->u.vector;
-			mutex_unlock(&kvm->lock);
+			mutex_unlock(&kvm->arch.xen.xen_lock);
 			r = 0;
 		}
 		break;
@@ -633,9 +633,9 @@ int kvm_xen_hvm_set_attr(struct kvm *kvm, struct kvm_xen_hvm_attr *data)
 		break;
 
 	case KVM_XEN_ATTR_TYPE_XEN_VERSION:
-		mutex_lock(&kvm->lock);
+		mutex_lock(&kvm->arch.xen.xen_lock);
 		kvm->arch.xen.xen_version = data->u.xen_version;
-		mutex_unlock(&kvm->lock);
+		mutex_unlock(&kvm->arch.xen.xen_lock);
 		r = 0;
 		break;
 
@@ -644,9 +644,9 @@ int kvm_xen_hvm_set_attr(struct kvm *kvm, struct kvm_xen_hvm_attr *data)
 			r = -EOPNOTSUPP;
 			break;
 		}
-		mutex_lock(&kvm->lock);
+		mutex_lock(&kvm->arch.xen.xen_lock);
 		kvm->arch.xen.runstate_update_flag = !!data->u.runstate_update_flag;
-		mutex_unlock(&kvm->lock);
+		mutex_unlock(&kvm->arch.xen.xen_lock);
 		r = 0;
 		break;
 
@@ -661,7 +661,7 @@ int kvm_xen_hvm_get_attr(struct kvm *kvm, struct kvm_xen_hvm_attr *data)
 {
 	int r = -ENOENT;
 
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.xen.xen_lock);
 
 	switch (data->type) {
 	case KVM_XEN_ATTR_TYPE_LONG_MODE:
@@ -700,7 +700,7 @@ int kvm_xen_hvm_get_attr(struct kvm *kvm, struct kvm_xen_hvm_attr *data)
 		break;
 	}
 
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.xen.xen_lock);
 	return r;
 }
 
@@ -708,7 +708,7 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 {
 	int idx, r = -ENOENT;
 
-	mutex_lock(&vcpu->kvm->lock);
+	mutex_lock(&vcpu->kvm->arch.xen.xen_lock);
 	idx = srcu_read_lock(&vcpu->kvm->srcu);
 
 	switch (data->type) {
@@ -936,7 +936,7 @@ int kvm_xen_vcpu_set_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 	}
 
 	srcu_read_unlock(&vcpu->kvm->srcu, idx);
-	mutex_unlock(&vcpu->kvm->lock);
+	mutex_unlock(&vcpu->kvm->arch.xen.xen_lock);
 	return r;
 }
 
@@ -944,7 +944,7 @@ int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 {
 	int r = -ENOENT;
 
-	mutex_lock(&vcpu->kvm->lock);
+	mutex_lock(&vcpu->kvm->arch.xen.xen_lock);
 
 	switch (data->type) {
 	case KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO:
@@ -1027,7 +1027,7 @@ int kvm_xen_vcpu_get_attr(struct kvm_vcpu *vcpu, struct kvm_xen_vcpu_attr *data)
 		break;
 	}
 
-	mutex_unlock(&vcpu->kvm->lock);
+	mutex_unlock(&vcpu->kvm->arch.xen.xen_lock);
 	return r;
 }
 
@@ -1120,7 +1120,7 @@ int kvm_xen_hvm_config(struct kvm *kvm, struct kvm_xen_hvm_config *xhc)
 	     xhc->blob_size_32 || xhc->blob_size_64))
 		return -EINVAL;
 
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.xen.xen_lock);
 
 	if (xhc->msr && !kvm->arch.xen_hvm_config.msr)
 		static_branch_inc(&kvm_xen_enabled.key);
@@ -1129,7 +1129,7 @@ int kvm_xen_hvm_config(struct kvm *kvm, struct kvm_xen_hvm_config *xhc)
 
 	memcpy(&kvm->arch.xen_hvm_config, xhc, sizeof(*xhc));
 
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.xen.xen_lock);
 	return 0;
 }
 
@@ -1672,15 +1672,7 @@ static int kvm_xen_set_evtchn(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 		mm_borrowed = true;
 	}
 
-	/*
-	 * For the irqfd workqueue, using the main kvm->lock mutex is
-	 * fine since this function is invoked from kvm_set_irq() with
-	 * no other lock held, no srcu. In future if it will be called
-	 * directly from a vCPU thread (e.g. on hypercall for an IPI)
-	 * then it may need to switch to using a leaf-node mutex for
-	 * serializing the shared_info mapping.
-	 */
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.xen.xen_lock);
 
 	/*
 	 * It is theoretically possible for the page to be unmapped
@@ -1709,7 +1701,7 @@ static int kvm_xen_set_evtchn(struct kvm_xen_evtchn *xe, struct kvm *kvm)
 		srcu_read_unlock(&kvm->srcu, idx);
 	} while(!rc);
 
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.xen.xen_lock);
 
 	if (mm_borrowed)
 		kthread_unuse_mm(kvm->mm);
@@ -1825,7 +1817,7 @@ static int kvm_xen_eventfd_update(struct kvm *kvm,
 	int ret;
 
 	/* Protect writes to evtchnfd as well as the idr lookup.  */
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.xen.xen_lock);
 	evtchnfd = idr_find(&kvm->arch.xen.evtchn_ports, port);
 
 	ret = -ENOENT;
@@ -1856,7 +1848,7 @@ static int kvm_xen_eventfd_update(struct kvm *kvm,
 	}
 	ret = 0;
 out_unlock:
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.xen.xen_lock);
 	return ret;
 }
 
@@ -1919,10 +1911,10 @@ static int kvm_xen_eventfd_assign(struct kvm *kvm,
 		evtchnfd->deliver.port.priority = data->u.evtchn.deliver.port.priority;
 	}
 
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.xen.xen_lock);
 	ret = idr_alloc(&kvm->arch.xen.evtchn_ports, evtchnfd, port, port + 1,
 			GFP_KERNEL);
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.xen.xen_lock);
 	if (ret >= 0)
 		return 0;
 
@@ -1940,9 +1932,9 @@ static int kvm_xen_eventfd_deassign(struct kvm *kvm, u32 port)
 {
 	struct evtchnfd *evtchnfd;
 
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.xen.xen_lock);
 	evtchnfd = idr_remove(&kvm->arch.xen.evtchn_ports, port);
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.xen.xen_lock);
 
 	if (!evtchnfd)
 		return -ENOENT;
@@ -1959,7 +1951,7 @@ static int kvm_xen_eventfd_reset(struct kvm *kvm)
 	struct evtchnfd *evtchnfd;
 	int i;
 
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->arch.xen.xen_lock);
 	idr_for_each_entry(&kvm->arch.xen.evtchn_ports, evtchnfd, i) {
 		idr_remove(&kvm->arch.xen.evtchn_ports, evtchnfd->send_port);
 		synchronize_srcu(&kvm->srcu);
@@ -1967,7 +1959,7 @@ static int kvm_xen_eventfd_reset(struct kvm *kvm)
 			eventfd_ctx_put(evtchnfd->deliver.eventfd.ctx);
 		kfree(evtchnfd);
 	}
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->arch.xen.xen_lock);
 
 	return 0;
 }
@@ -2059,6 +2051,7 @@ void kvm_xen_destroy_vcpu(struct kvm_vcpu *vcpu)
 
 void kvm_xen_init_vm(struct kvm *kvm)
 {
+	mutex_init(&kvm->arch.xen.xen_lock);
 	idr_init(&kvm->arch.xen.evtchn_ports);
 	kvm_gpc_init(&kvm->arch.xen.shinfo_cache, kvm, NULL, KVM_HOST_USES_PFN);
 }
  

Patch

diff --git a/Documentation/virt/kvm/locking.rst b/Documentation/virt/kvm/locking.rst
index 845a561629f1..a3ca76f9be75 100644
--- a/Documentation/virt/kvm/locking.rst
+++ b/Documentation/virt/kvm/locking.rst
@@ -16,17 +16,26 @@ The acquisition orders for mutexes are as follows:
 - kvm->slots_lock is taken outside kvm->irq_lock, though acquiring
   them together is quite rare.
 
-- Unlike kvm->slots_lock, kvm->slots_arch_lock is released before
-  synchronize_srcu(&kvm->srcu).  Therefore kvm->slots_arch_lock
-  can be taken inside a kvm->srcu read-side critical section,
-  while kvm->slots_lock cannot.
-
 - kvm->mn_active_invalidate_count ensures that pairs of
   invalidate_range_start() and invalidate_range_end() callbacks
   use the same memslots array.  kvm->slots_lock and kvm->slots_arch_lock
   are taken on the waiting side in install_new_memslots, so MMU notifiers
   must not take either kvm->slots_lock or kvm->slots_arch_lock.
 
+For SRCU:
+
+- ``synchronize_srcu(&kvm->srcu)`` is called _inside_
+  the kvm->slots_lock critical section, therefore kvm->slots_lock
+  cannot be taken inside a kvm->srcu read-side critical section.
+  Instead, kvm->slots_arch_lock is released before the call
+  to ``synchronize_srcu()`` and _can_ be taken inside a
+  kvm->srcu read-side critical section.
+
+- kvm->lock is taken inside kvm->srcu, therefore
+  ``synchronize_srcu(&kvm->srcu)`` cannot be called inside
+  a kvm->lock critical section.  If you cannot delay the
+  call until after kvm->lock is released, use ``call_srcu``.
+
 On x86:
 
 - vcpu->mutex is taken outside kvm->arch.hyperv.hv_lock