[3/4] KVM: introduce memory transaction semaphore

Message ID 20221022154819.1823133-4-eesposit@redhat.com
State New
Headers
Series KVM: API to block and resume all running vcpus in a vm |

Commit Message

Emanuele Giuseppe Esposito Oct. 22, 2022, 3:48 p.m. UTC
  Right now the semaphore is only used to signal that a vcpu
entered KVM_RUN (not necessarly in guest mode, could be also
blocked/halted).
Later it will be used by specific ioctls (writers) to wait that
all vcpus (readers) exit from KVM_RUN.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 virt/kvm/kvm_main.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
  

Comments

Paolo Bonzini Oct. 23, 2022, 5:50 p.m. UTC | #1
On 10/22/22 17:48, Emanuele Giuseppe Esposito wrote:
> +static DECLARE_RWSEM(memory_transaction);

This cannot be global, it must be per-struct kvm.  Otherwise one VM can 
keep the rwsem indefinitely while a second VM hangs in 
KVM_KICK_ALL_RUNNING_VCPUS.

It can also be changed to an SRCU (with the down_write+up_write sequence 
changed to synchronize_srcu_expedited) which has similar characteristics 
to your use of the rwsem.

Paolo
  
Emanuele Giuseppe Esposito Oct. 24, 2022, 12:57 p.m. UTC | #2
Am 23/10/2022 um 19:50 schrieb Paolo Bonzini:
> On 10/22/22 17:48, Emanuele Giuseppe Esposito wrote:
>> +static DECLARE_RWSEM(memory_transaction);
> 
> This cannot be global, it must be per-struct kvm.  Otherwise one VM can
> keep the rwsem indefinitely while a second VM hangs in
> KVM_KICK_ALL_RUNNING_VCPUS.
> 
> It can also be changed to an SRCU (with the down_write+up_write sequence
> changed to synchronize_srcu_expedited) which has similar characteristics
> to your use of the rwsem.
> 

Makes sense, but why synchronize_srcu_expedited and not synchronize_srcu?

Thank you,
Emanuele
  
Paolo Bonzini Oct. 25, 2022, 10:01 a.m. UTC | #3
On 10/24/22 14:57, Emanuele Giuseppe Esposito wrote:
> 
> 
> Am 23/10/2022 um 19:50 schrieb Paolo Bonzini:
>> On 10/22/22 17:48, Emanuele Giuseppe Esposito wrote:
>>> +static DECLARE_RWSEM(memory_transaction);
>>
>> This cannot be global, it must be per-struct kvm.  Otherwise one VM can
>> keep the rwsem indefinitely while a second VM hangs in
>> KVM_KICK_ALL_RUNNING_VCPUS.
>>
>> It can also be changed to an SRCU (with the down_write+up_write sequence
>> changed to synchronize_srcu_expedited) which has similar characteristics
>> to your use of the rwsem.
>>
> 
> Makes sense, but why synchronize_srcu_expedited and not synchronize_srcu?

Because (thanks to the kick) you expect the grace period to end almost 
immediately, and synchronize_srcu() will slow down sensibly the changes 
to the memory map.

Paolo
  

Patch

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c080b93edc0d..ae0240928a4a 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -119,6 +119,8 @@  static const struct file_operations stat_fops_per_vm;
 
 static struct file_operations kvm_chardev_ops;
 
+static DECLARE_RWSEM(memory_transaction);
+
 static long kvm_vcpu_ioctl(struct file *file, unsigned int ioctl,
 			   unsigned long arg);
 #ifdef CONFIG_KVM_COMPAT
@@ -4074,7 +4076,19 @@  static long kvm_vcpu_ioctl(struct file *filp,
 				synchronize_rcu();
 			put_pid(oldpid);
 		}
+		/*
+		 * Notify that a vcpu wants to run, and thus could be reading
+		 * memslots.
+		 * If KVM_KICK_ALL_RUNNING_VCPUS runs afterwards, it will have
+		 * to wait that KVM_RUN exited and up_read() is called.
+		 * If KVM_KICK_ALL_RUNNING_VCPUS already returned but
+		 * KVM_RESUME_ALL_KICKED_VCPUS didn't start yet, then there
+		 * is a request pending for the vcpu that will cause it to
+		 * exit KVM_RUN.
+		 */
+		down_read(&memory_transaction);
 		r = kvm_arch_vcpu_ioctl_run(vcpu);
+		up_read(&memory_transaction);
 		trace_kvm_userspace_exit(vcpu->run->exit_reason, r);
 		break;
 	}