[v3,09/13] KVM: xen: automatically use the vcpu_info embedded in shared_info

Message ID 20230918144111.641369-10-paul@xen.org
State New
Headers
Series KVM: xen: update shared_info and vcpu_info handling |

Commit Message

Paul Durrant Sept. 18, 2023, 2:41 p.m. UTC
  From: Paul Durrant <pdurrant@amazon.com>

The VMM should only need to set the KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
attribute in response to a VCPUOP_register_vcpu_info hypercall. We can
handle the default case internally since we already know where the
shared_info page is. Modify get_vcpu_info_cache() to pass back a pointer
to the shared info pfn cache (and appropriate offset) for any of the
first 32 vCPUs if the attribute has not been set.

A VMM will be able to determine whether it needs to set up default
vcpu_info using the previously defined KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA
Xen capability flag, which will be advertized in a subsequent patch.

Also update the KVM API documentation to describe the new behaviour.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
---
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: x86@kernel.org

v3:
 - Add a note to the API documentation discussing vcpu_info copying.

v2:
 - Dispense with the KVM_XEN_HVM_CONFIG_DEFAULT_VCPU_INFO capability.
 - Add API documentation.
---
 Documentation/virt/kvm/api.rst | 22 +++++++++++++++-------
 arch/x86/kvm/xen.c             | 15 +++++++++++++++
 2 files changed, 30 insertions(+), 7 deletions(-)
  

Comments

David Woodhouse Sept. 18, 2023, 4:07 p.m. UTC | #1
On Mon, 2023-09-18 at 14:41 +0000, Paul Durrant wrote:
> From: Paul Durrant <pdurrant@amazon.com>
> 
> The VMM should only need to set the KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
> attribute in response to a VCPUOP_register_vcpu_info hypercall. We can
> handle the default case internally since we already know where the
> shared_info page is. Modify get_vcpu_info_cache() to pass back a pointer
> to the shared info pfn cache (and appropriate offset) for any of the
> first 32 vCPUs if the attribute has not been set.
> 
> A VMM will be able to determine whether it needs to set up default
> vcpu_info using the previously defined KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA
> Xen capability flag, which will be advertized in a subsequent patch.
> 
> Also update the KVM API documentation to describe the new behaviour.
> 
> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
> ---
> Cc: Sean Christopherson <seanjc@google.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: "H. Peter Anvin" <hpa@zytor.com>
> Cc: David Woodhouse <dwmw2@infradead.org>
> Cc: x86@kernel.org
> 
> v3:
>  - Add a note to the API documentation discussing vcpu_info copying.
> 
> v2:
>  - Dispense with the KVM_XEN_HVM_CONFIG_DEFAULT_VCPU_INFO capability.
>  - Add API documentation.
> ---
>  Documentation/virt/kvm/api.rst | 22 +++++++++++++++-------
>  arch/x86/kvm/xen.c             | 15 +++++++++++++++
>  2 files changed, 30 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index e9df4df6fe48..47bf3db74674 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -5442,13 +5442,7 @@ KVM_XEN_ATTR_TYPE_LONG_MODE
>  
>  KVM_XEN_ATTR_TYPE_SHARED_INFO
>    Sets the guest physical frame number at which the Xen shared_info
> -  page resides. Note that although Xen places vcpu_info for the first
> -  32 vCPUs in the shared_info page, KVM does not automatically do so
> -  and instead requires that KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO be used
> -  explicitly even when the vcpu_info for a given vCPU resides at the
> -  "default" location in the shared_info page. This is because KVM may
> -  not be aware of the Xen CPU id which is used as the index into the
> -  vcpu_info[] array, so may know the correct default location.
> +  page resides.
>  
>    Note that the shared_info page may be constantly written to by KVM;
>    it contains the event channel bitmap used to deliver interrupts to
> @@ -5564,12 +5558,26 @@ type values:
>  
>  KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
>    Sets the guest physical address of the vcpu_info for a given vCPU.
> +  The vcpu_info for the first 32 vCPUs defaults to the structures
> +  embedded in the shared_info page.

The above is true only if KVM has KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA.
You kind of touch on that next, but perhaps the 'if the KVM_...'
condition should be moved up?

> +  If the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA flag is also set in the
> +  Xen capabilities then the VMM is not required to set this default
> +  location; KVM will handle that internally. Otherwise this attribute
> +  must be set for all vCPUs.
> +
>    As with the shared_info page for the VM, the corresponding page may be
>    dirtied at any time if event channel interrupt delivery is enabled, so
>    userspace should always assume that the page is dirty without relying
>    on dirty logging. Setting the gpa to KVM_XEN_INVALID_GPA will disable
>    the vcpu_info.
>  
> +  Note that, if the guest sets an explicit vcpu_info location in guest
> +  memory then the VMM is expected to copy the content of the structure
> +  embedded in the shared_info page to the new location. It is therefore
> +  important that no event delivery is in progress at this time, otherwise
> +  events may be missed.
> 

That's difficult. It means tearing down all interrupts from passthrough
devices which are mapped via PIRQs, and also all IPIs. 

The IPI code *should* be able to fall back to just letting the VMM
handle the hypercall in userspace. But PIRQs are harder. I'd be happier
if our plan — handwavy though it may be — led to being able to use the
existing slow path for delivering interrupts by just *invalidating* the
cache. Maybe we *should* move the memcpy into the kernel, and let it
lock *both* the shinfo and new vcpu_info caches while it's doing the
copy? Given that that's the only valid transition, that shouldn't be so
hard, should it?

>  KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO
>    Sets the guest physical address of an additional pvclock structure
>    for a given vCPU. This is typically used for guest vsyscall support.
> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
> index 459f3ca4710e..660a808c0b50 100644
> --- a/arch/x86/kvm/xen.c
> +++ b/arch/x86/kvm/xen.c
> @@ -491,6 +491,21 @@ static void kvm_xen_inject_vcpu_vector(struct kvm_vcpu *v)
>  
>  static struct gfn_to_pfn_cache *get_vcpu_info_cache(struct kvm_vcpu *v, unsigned long *offset)
>  {
> +       if (!v->arch.xen.vcpu_info_cache.active && v->arch.xen.vcpu_id < MAX_VIRT_CPUS) {
> +               struct kvm *kvm = v->kvm;
> +
> +               if (offset) {
> +                       if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode)
> +                               *offset = offsetof(struct shared_info,
> +                                                  vcpu_info[v->arch.xen.vcpu_id]);
> +                       else
> +                               *offset = offsetof(struct compat_shared_info,
> +                                                  vcpu_info[v->arch.xen.vcpu_id]);
> +               }
> +
> +               return &kvm->arch.xen.shinfo_cache;
> +       }
> +
>         if (offset)
>                 *offset = 0;
>
  
Paul Durrant Sept. 18, 2023, 4:15 p.m. UTC | #2
On 18/09/2023 17:07, David Woodhouse wrote:
> On Mon, 2023-09-18 at 14:41 +0000, Paul Durrant wrote:
>> From: Paul Durrant <pdurrant@amazon.com>
>>
>> The VMM should only need to set the KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
>> attribute in response to a VCPUOP_register_vcpu_info hypercall. We can
>> handle the default case internally since we already know where the
>> shared_info page is. Modify get_vcpu_info_cache() to pass back a pointer
>> to the shared info pfn cache (and appropriate offset) for any of the
>> first 32 vCPUs if the attribute has not been set.
>>
>> A VMM will be able to determine whether it needs to set up default
>> vcpu_info using the previously defined KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA
>> Xen capability flag, which will be advertized in a subsequent patch.
>>
>> Also update the KVM API documentation to describe the new behaviour.
>>
>> Signed-off-by: Paul Durrant <pdurrant@amazon.com>
>> ---
>> Cc: Sean Christopherson <seanjc@google.com>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Borislav Petkov <bp@alien8.de>
>> Cc: Dave Hansen <dave.hansen@linux.intel.com>
>> Cc: "H. Peter Anvin" <hpa@zytor.com>
>> Cc: David Woodhouse <dwmw2@infradead.org>
>> Cc: x86@kernel.org
>>
>> v3:
>>   - Add a note to the API documentation discussing vcpu_info copying.
>>
>> v2:
>>   - Dispense with the KVM_XEN_HVM_CONFIG_DEFAULT_VCPU_INFO capability.
>>   - Add API documentation.
>> ---
>>   Documentation/virt/kvm/api.rst | 22 +++++++++++++++-------
>>   arch/x86/kvm/xen.c             | 15 +++++++++++++++
>>   2 files changed, 30 insertions(+), 7 deletions(-)
>>
>> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
>> index e9df4df6fe48..47bf3db74674 100644
>> --- a/Documentation/virt/kvm/api.rst
>> +++ b/Documentation/virt/kvm/api.rst
>> @@ -5442,13 +5442,7 @@ KVM_XEN_ATTR_TYPE_LONG_MODE
>>   
>>   KVM_XEN_ATTR_TYPE_SHARED_INFO
>>     Sets the guest physical frame number at which the Xen shared_info
>> -  page resides. Note that although Xen places vcpu_info for the first
>> -  32 vCPUs in the shared_info page, KVM does not automatically do so
>> -  and instead requires that KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO be used
>> -  explicitly even when the vcpu_info for a given vCPU resides at the
>> -  "default" location in the shared_info page. This is because KVM may
>> -  not be aware of the Xen CPU id which is used as the index into the
>> -  vcpu_info[] array, so may know the correct default location.
>> +  page resides.
>>   
>>     Note that the shared_info page may be constantly written to by KVM;
>>     it contains the event channel bitmap used to deliver interrupts to
>> @@ -5564,12 +5558,26 @@ type values:
>>   
>>   KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
>>     Sets the guest physical address of the vcpu_info for a given vCPU.
>> +  The vcpu_info for the first 32 vCPUs defaults to the structures
>> +  embedded in the shared_info page.
> 
> The above is true only if KVM has KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA.
> You kind of touch on that next, but perhaps the 'if the KVM_...'
> condition should be moved up?
> 
>> +  If the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA flag is also set in the
>> +  Xen capabilities then the VMM is not required to set this default
>> +  location; KVM will handle that internally. Otherwise this attribute
>> +  must be set for all vCPUs.
>> +
>>     As with the shared_info page for the VM, the corresponding page may be
>>     dirtied at any time if event channel interrupt delivery is enabled, so
>>     userspace should always assume that the page is dirty without relying
>>     on dirty logging. Setting the gpa to KVM_XEN_INVALID_GPA will disable
>>     the vcpu_info.
>>   
>> +  Note that, if the guest sets an explicit vcpu_info location in guest
>> +  memory then the VMM is expected to copy the content of the structure
>> +  embedded in the shared_info page to the new location. It is therefore
>> +  important that no event delivery is in progress at this time, otherwise
>> +  events may be missed.
>>
> 
> That's difficult. It means tearing down all interrupts from passthrough
> devices which are mapped via PIRQs, and also all IPIs.

So those don't honour event channel masking? That seems like a problem.

> 
> The IPI code *should* be able to fall back to just letting the VMM
> handle the hypercall in userspace. But PIRQs are harder. I'd be happier
> if our plan — handwavy though it may be — led to being able to use the
> existing slow path for delivering interrupts by just *invalidating* the
> cache. Maybe we *should* move the memcpy into the kernel, and let it
> lock *both* the shinfo and new vcpu_info caches while it's doing the
> copy? Given that that's the only valid transition, that shouldn't be so
> hard, should it?
> 

No, it just kind of oversteps the remit of the attribute... but I'll try 
adding it and see how messy it gets.

   Paul

>>   KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO
>>     Sets the guest physical address of an additional pvclock structure
>>     for a given vCPU. This is typically used for guest vsyscall support.
>> diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
>> index 459f3ca4710e..660a808c0b50 100644
>> --- a/arch/x86/kvm/xen.c
>> +++ b/arch/x86/kvm/xen.c
>> @@ -491,6 +491,21 @@ static void kvm_xen_inject_vcpu_vector(struct kvm_vcpu *v)
>>   
>>   static struct gfn_to_pfn_cache *get_vcpu_info_cache(struct kvm_vcpu *v, unsigned long *offset)
>>   {
>> +       if (!v->arch.xen.vcpu_info_cache.active && v->arch.xen.vcpu_id < MAX_VIRT_CPUS) {
>> +               struct kvm *kvm = v->kvm;
>> +
>> +               if (offset) {
>> +                       if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode)
>> +                               *offset = offsetof(struct shared_info,
>> +                                                  vcpu_info[v->arch.xen.vcpu_id]);
>> +                       else
>> +                               *offset = offsetof(struct compat_shared_info,
>> +                                                  vcpu_info[v->arch.xen.vcpu_id]);
>> +               }
>> +
>> +               return &kvm->arch.xen.shinfo_cache;
>> +       }
>> +
>>          if (offset)
>>                  *offset = 0;
>>   
>
  
David Woodhouse Sept. 18, 2023, 4:20 p.m. UTC | #3
On Mon, 2023-09-18 at 17:15 +0100, Paul Durrant wrote:
> > > +  Note that, if the guest sets an explicit vcpu_info location in guest
> > > +  memory then the VMM is expected to copy the content of the structure
> > > +  embedded in the shared_info page to the new location. It is therefore
> > > +  important that no event delivery is in progress at this time, otherwise
> > > +  events may be missed.
> > > 
> > 
> > That's difficult. It means tearing down all interrupts from passthrough
> > devices which are mapped via PIRQs, and also all IPIs.
> 
> So those don't honour event channel masking? That seems like a problem.

Oh, *mask*. Sure, it does honour masking. But... that would mean the
VMM has to keep track of which ports were *really* masked by the guest,
and which ones were just masked for the switchover. Including if the
guest does some mask/unmask activity *while* the switchover is
happening (or locking to prevent such). I still don't think that's a
kind thing to be telling the VMMs they need to do.

> > 
> > The IPI code *should* be able to fall back to just letting the VMM
> > handle the hypercall in userspace. But PIRQs are harder. I'd be happier
> > if our plan — handwavy though it may be — led to being able to use the
> > existing slow path for delivering interrupts by just *invalidating* the
> > cache. Maybe we *should* move the memcpy into the kernel, and let it
> > lock *both* the shinfo and new vcpu_info caches while it's doing the
> > copy? Given that that's the only valid transition, that shouldn't be so
> > hard, should it?
> > 
> 
> No, it just kind of oversteps the remit of the attribute... but I'll try 
> adding it and see how messy it gets.

Well, there's a reason I left all the vcpu_info address magic in
userspace in the first place. It was there in João's original patches
and I ripped it all out.

But I see your logic for wanting to put it back; I suspect moving the
memcpy too is part of the cost of that? Should work out OK, I think.
  

Patch

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index e9df4df6fe48..47bf3db74674 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -5442,13 +5442,7 @@  KVM_XEN_ATTR_TYPE_LONG_MODE
 
 KVM_XEN_ATTR_TYPE_SHARED_INFO
   Sets the guest physical frame number at which the Xen shared_info
-  page resides. Note that although Xen places vcpu_info for the first
-  32 vCPUs in the shared_info page, KVM does not automatically do so
-  and instead requires that KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO be used
-  explicitly even when the vcpu_info for a given vCPU resides at the
-  "default" location in the shared_info page. This is because KVM may
-  not be aware of the Xen CPU id which is used as the index into the
-  vcpu_info[] array, so may know the correct default location.
+  page resides.
 
   Note that the shared_info page may be constantly written to by KVM;
   it contains the event channel bitmap used to deliver interrupts to
@@ -5564,12 +5558,26 @@  type values:
 
 KVM_XEN_VCPU_ATTR_TYPE_VCPU_INFO
   Sets the guest physical address of the vcpu_info for a given vCPU.
+  The vcpu_info for the first 32 vCPUs defaults to the structures
+  embedded in the shared_info page.
+
+  If the KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA flag is also set in the
+  Xen capabilities then the VMM is not required to set this default
+  location; KVM will handle that internally. Otherwise this attribute
+  must be set for all vCPUs.
+
   As with the shared_info page for the VM, the corresponding page may be
   dirtied at any time if event channel interrupt delivery is enabled, so
   userspace should always assume that the page is dirty without relying
   on dirty logging. Setting the gpa to KVM_XEN_INVALID_GPA will disable
   the vcpu_info.
 
+  Note that, if the guest sets an explicit vcpu_info location in guest
+  memory then the VMM is expected to copy the content of the structure
+  embedded in the shared_info page to the new location. It is therefore
+  important that no event delivery is in progress at this time, otherwise
+  events may be missed.
+
 KVM_XEN_VCPU_ATTR_TYPE_VCPU_TIME_INFO
   Sets the guest physical address of an additional pvclock structure
   for a given vCPU. This is typically used for guest vsyscall support.
diff --git a/arch/x86/kvm/xen.c b/arch/x86/kvm/xen.c
index 459f3ca4710e..660a808c0b50 100644
--- a/arch/x86/kvm/xen.c
+++ b/arch/x86/kvm/xen.c
@@ -491,6 +491,21 @@  static void kvm_xen_inject_vcpu_vector(struct kvm_vcpu *v)
 
 static struct gfn_to_pfn_cache *get_vcpu_info_cache(struct kvm_vcpu *v, unsigned long *offset)
 {
+	if (!v->arch.xen.vcpu_info_cache.active && v->arch.xen.vcpu_id < MAX_VIRT_CPUS) {
+		struct kvm *kvm = v->kvm;
+
+		if (offset) {
+			if (IS_ENABLED(CONFIG_64BIT) && kvm->arch.xen.long_mode)
+				*offset = offsetof(struct shared_info,
+						   vcpu_info[v->arch.xen.vcpu_id]);
+			else
+				*offset = offsetof(struct compat_shared_info,
+						   vcpu_info[v->arch.xen.vcpu_id]);
+		}
+
+		return &kvm->arch.xen.shinfo_cache;
+	}
+
 	if (offset)
 		*offset = 0;