[V3,08/10] vfio/pci: Probe and store ability to support dynamic MSI-X

Message ID 0da4830176e9c4a7877aac0611869f341dda831c.1681837892.git.reinette.chatre@intel.com
State New
Headers
Series vfio/pci: Support dynamic allocation of MSI-X interrupts |

Commit Message

Reinette Chatre April 18, 2023, 5:29 p.m. UTC
  Not all MSI-X devices support dynamic MSI-X allocation. Whether
a device supports dynamic MSI-X should be queried using
pci_msix_can_alloc_dyn().

Instead of scattering code with pci_msix_can_alloc_dyn(),
probe this ability once and store it as a property of the
virtual device.

Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V2:
- New patch. (Alex)

 drivers/vfio/pci/vfio_pci_core.c | 5 ++++-
 include/linux/vfio_pci_core.h    | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)
  

Comments

Alex Williamson April 18, 2023, 10:38 p.m. UTC | #1
On Tue, 18 Apr 2023 10:29:19 -0700
Reinette Chatre <reinette.chatre@intel.com> wrote:

> Not all MSI-X devices support dynamic MSI-X allocation. Whether
> a device supports dynamic MSI-X should be queried using
> pci_msix_can_alloc_dyn().
> 
> Instead of scattering code with pci_msix_can_alloc_dyn(),
> probe this ability once and store it as a property of the
> virtual device.
> 
> Suggested-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since V2:
> - New patch. (Alex)
> 
>  drivers/vfio/pci/vfio_pci_core.c | 5 ++++-
>  include/linux/vfio_pci_core.h    | 1 +
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
> index ae0e161c7fc9..a3635a8e54c8 100644
> --- a/drivers/vfio/pci/vfio_pci_core.c
> +++ b/drivers/vfio/pci/vfio_pci_core.c
> @@ -530,8 +530,11 @@ int vfio_pci_core_enable(struct vfio_pci_core_device *vdev)
>  		vdev->msix_bar = table & PCI_MSIX_TABLE_BIR;
>  		vdev->msix_offset = table & PCI_MSIX_TABLE_OFFSET;
>  		vdev->msix_size = ((flags & PCI_MSIX_FLAGS_QSIZE) + 1) * 16;
> -	} else
> +		vdev->has_dyn_msix = pci_msix_can_alloc_dyn(pdev);
> +	} else {
>  		vdev->msix_bar = 0xFF;
> +		vdev->has_dyn_msix = false;
> +	}
>  
>  	if (!vfio_vga_disabled() && vfio_pci_is_vga(pdev))
>  		vdev->has_vga = true;
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 148fd1ae6c1c..4f070f2d6fde 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -67,6 +67,7 @@ struct vfio_pci_core_device {
>  	u8			msix_bar;
>  	u16			msix_size;
>  	u32			msix_offset;
> +	bool			has_dyn_msix;
>  	u32			rbar[7];
>  	bool			pci_2_3;
>  	bool			virq_disabled;

Nit, the whole data structure probably needs to be sorted with pahole,
but creating a hole here for locality to other msix fields should
probably be secondary to keeping the structure well packed, which
suggests including this new field among the bools below.  Thanks,

Alex
  
Reinette Chatre April 19, 2023, 6:11 p.m. UTC | #2
Hi Alex,

On 4/18/2023 3:38 PM, Alex Williamson wrote:
> On Tue, 18 Apr 2023 10:29:19 -0700
> Reinette Chatre <reinette.chatre@intel.com> wrote:
> 

...

>> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
>> index 148fd1ae6c1c..4f070f2d6fde 100644
>> --- a/include/linux/vfio_pci_core.h
>> +++ b/include/linux/vfio_pci_core.h
>> @@ -67,6 +67,7 @@ struct vfio_pci_core_device {
>>  	u8			msix_bar;
>>  	u16			msix_size;
>>  	u32			msix_offset;
>> +	bool			has_dyn_msix;
>>  	u32			rbar[7];
>>  	bool			pci_2_3;
>>  	bool			virq_disabled;
> 
> Nit, the whole data structure probably needs to be sorted with pahole,
> but creating a hole here for locality to other msix fields should
> probably be secondary to keeping the structure well packed, which
> suggests including this new field among the bools below.  Thanks,

Thanks for catching this. Moving it one field lower as shown in the
delta patch below seems to improve the layout:

diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 4f070f2d6fde..d730d78754a2 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -67,8 +67,8 @@ struct vfio_pci_core_device {
 	u8			msix_bar;
 	u16			msix_size;
 	u32			msix_offset;
-	bool			has_dyn_msix;
 	u32			rbar[7];
+	bool			has_dyn_msix;
 	bool			pci_2_3;
 	bool			virq_disabled;
 	bool			reset_works;


Combined with the other changes to this struct (new struct xarray
for the context, and removing int num_ctx) the bools are no longer
together on a single cache line. Placing has_dyn_msix as shown above
keeps it on the same cache line as the other msix_* fields.

After this change the layout of this struct appears to be improved.
Before this patch series (v6.3-rc7):
        /* size: 2496, cachelines: 39, members: 46 */
        /* sum members: 2485, holes: 4, sum holes: 11 */
        /* paddings: 2, sum paddings: 11 */
        /* forced alignments: 1 */

After this patch series (v6.3-rc7 + V3 + delta patch):
        /* size: 2568, cachelines: 41, members: 46 */
        /* sum members: 2562, holes: 2, sum holes: 6 */
        /* paddings: 2, sum paddings: 11 */
        /* forced alignments: 1 */
        /* last cacheline: 8 bytes */

Reinette
  
Jason Gunthorpe April 24, 2023, 5:43 p.m. UTC | #3
On Wed, Apr 19, 2023 at 11:11:48AM -0700, Reinette Chatre wrote:
> Hi Alex,
> 
> On 4/18/2023 3:38 PM, Alex Williamson wrote:
> > On Tue, 18 Apr 2023 10:29:19 -0700
> > Reinette Chatre <reinette.chatre@intel.com> wrote:
> > 
> 
> ...
> 
> >> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> >> index 148fd1ae6c1c..4f070f2d6fde 100644
> >> --- a/include/linux/vfio_pci_core.h
> >> +++ b/include/linux/vfio_pci_core.h
> >> @@ -67,6 +67,7 @@ struct vfio_pci_core_device {
> >>  	u8			msix_bar;
> >>  	u16			msix_size;
> >>  	u32			msix_offset;
> >> +	bool			has_dyn_msix;
> >>  	u32			rbar[7];
> >>  	bool			pci_2_3;
> >>  	bool			virq_disabled;
> > 
> > Nit, the whole data structure probably needs to be sorted with pahole,
> > but creating a hole here for locality to other msix fields should
> > probably be secondary to keeping the structure well packed, which
> > suggests including this new field among the bools below.  Thanks,
> 
> Thanks for catching this. Moving it one field lower as shown in the
> delta patch below seems to improve the layout:
> 
> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> index 4f070f2d6fde..d730d78754a2 100644
> --- a/include/linux/vfio_pci_core.h
> +++ b/include/linux/vfio_pci_core.h
> @@ -67,8 +67,8 @@ struct vfio_pci_core_device {
>  	u8			msix_bar;
>  	u16			msix_size;
>  	u32			msix_offset;
> -	bool			has_dyn_msix;
>  	u32			rbar[7];
> +	bool			has_dyn_msix;
>  	bool			pci_2_3;
>  	bool			virq_disabled;
>  	bool			reset_works;

Also, Linus on record as strongly disliking these lists of bools

If they don't need read_once/etc stuff then use a list of bitfields

bool abc:1;
bool xyz:1;

Jason
  
Reinette Chatre April 24, 2023, 11:52 p.m. UTC | #4
Hi Jason,

On 4/24/2023 10:43 AM, Jason Gunthorpe wrote:
> On Wed, Apr 19, 2023 at 11:11:48AM -0700, Reinette Chatre wrote:
>> On 4/18/2023 3:38 PM, Alex Williamson wrote:
>>> On Tue, 18 Apr 2023 10:29:19 -0700
>>> Reinette Chatre <reinette.chatre@intel.com> wrote:

...

>> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
>> index 4f070f2d6fde..d730d78754a2 100644
>> --- a/include/linux/vfio_pci_core.h
>> +++ b/include/linux/vfio_pci_core.h
>> @@ -67,8 +67,8 @@ struct vfio_pci_core_device {
>>  	u8			msix_bar;
>>  	u16			msix_size;
>>  	u32			msix_offset;
>> -	bool			has_dyn_msix;
>>  	u32			rbar[7];
>> +	bool			has_dyn_msix;
>>  	bool			pci_2_3;
>>  	bool			virq_disabled;
>>  	bool			reset_works;
> 
> Also, Linus on record as strongly disliking these lists of bools

This looks like an example:
https://lkml.org/lkml/2017/11/21/384

> 
> If they don't need read_once/etc stuff then use a list of bitfields

I do not see any direct usage of read_once in the driver, but it is not
clear to me what falls under the "etc" umbrella. Do you consider all
the bools in struct vfio_pci_core_device to be candidates for 
transition?

> 
> bool abc:1;
> bool xyz:1;
> 

I think a base type of unsigned int since it appears to be the custom
and (if I understand correctly) was preferred at the time Linus wrote
the message I found.

Looking ahead there seems be be a bigger task here. A quick search
revealed a few other instances of vfio using "bool" in a struct. It
does not all qualify for your "lists of bools" comment, but they
may need a closer look because of the "please don't use "bool" in
structures at all" comment made by Linus in the email I found.

vfio_device::iommufd_attached
vfio_container::noiommu
vfio_platform_irq::masked
vfio_platform_device::reset_required
vfio_iommu::v2
vfio_iommu::nesting
vfio_iommu::dirty_page_tracking
vfio_dma::iommu_mapped
vfio_dma::lock_cap
vfio_dma::vaddr_invalid
vfio_iommu_group::pinned_page_dirty_scope
tce_container::enabled
tce_container::v2
tce_container::def_window_pending

Reinette
  
Jason Gunthorpe April 25, 2023, 2:51 p.m. UTC | #5
On Mon, Apr 24, 2023 at 04:52:08PM -0700, Reinette Chatre wrote:
> Hi Jason,
> 
> On 4/24/2023 10:43 AM, Jason Gunthorpe wrote:
> > On Wed, Apr 19, 2023 at 11:11:48AM -0700, Reinette Chatre wrote:
> >> On 4/18/2023 3:38 PM, Alex Williamson wrote:
> >>> On Tue, 18 Apr 2023 10:29:19 -0700
> >>> Reinette Chatre <reinette.chatre@intel.com> wrote:
> 
> ...
> 
> >> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
> >> index 4f070f2d6fde..d730d78754a2 100644
> >> --- a/include/linux/vfio_pci_core.h
> >> +++ b/include/linux/vfio_pci_core.h
> >> @@ -67,8 +67,8 @@ struct vfio_pci_core_device {
> >>  	u8			msix_bar;
> >>  	u16			msix_size;
> >>  	u32			msix_offset;
> >> -	bool			has_dyn_msix;
> >>  	u32			rbar[7];
> >> +	bool			has_dyn_msix;
> >>  	bool			pci_2_3;
> >>  	bool			virq_disabled;
> >>  	bool			reset_works;
> > 
> > Also, Linus on record as strongly disliking these lists of bools
> 
> This looks like an example:
> https://lkml.org/lkml/2017/11/21/384
> 
> > 
> > If they don't need read_once/etc stuff then use a list of bitfields
> 
> I do not see any direct usage of read_once in the driver, but it is not
> clear to me what falls under the "etc" umbrella.

Anything that might assume atomicity, smp_store_release, set_bit, and others

>  Do you consider all the bools in struct vfio_pci_core_device to be
> candidates for transition?

Yes, group them ito into a bitfield.

> I think a base type of unsigned int since it appears to be the custom
> and (if I understand correctly) was preferred at the time Linus wrote
> the message I found.

It doesn't matter a lot, using "bool" means the compiler adds extra
code to ensure "foo = 4" stores true, and the underyling size is not
well defined (but we don't care here)
 
> Looking ahead there seems be be a bigger task here. A quick search
> revealed a few other instances of vfio using "bool" in a struct. It
> does not all qualify for your "lists of bools" comment, but they
> may need a closer look because of the "please don't use "bool" in
> structures at all" comment made by Linus in the email I found.

IMHO bool is helpful for clarity, it says it is a flag. In these cases
we won't gain anything by using u8 instead

Lists of bools however start to get a little silly when we use maybe 4
bytes per bool (though x86-64 is using 1 byte in structs)

Jason
  
Reinette Chatre April 25, 2023, 4:52 p.m. UTC | #6
Hi Jason,

On 4/25/2023 7:51 AM, Jason Gunthorpe wrote:
> On Mon, Apr 24, 2023 at 04:52:08PM -0700, Reinette Chatre wrote:
>> On 4/24/2023 10:43 AM, Jason Gunthorpe wrote:
>>> On Wed, Apr 19, 2023 at 11:11:48AM -0700, Reinette Chatre wrote:
>>>> On 4/18/2023 3:38 PM, Alex Williamson wrote:
>>>>> On Tue, 18 Apr 2023 10:29:19 -0700
>>>>> Reinette Chatre <reinette.chatre@intel.com> wrote:
>>
>> ...
>>
>>>> diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
>>>> index 4f070f2d6fde..d730d78754a2 100644
>>>> --- a/include/linux/vfio_pci_core.h
>>>> +++ b/include/linux/vfio_pci_core.h
>>>> @@ -67,8 +67,8 @@ struct vfio_pci_core_device {
>>>>  	u8			msix_bar;
>>>>  	u16			msix_size;
>>>>  	u32			msix_offset;
>>>> -	bool			has_dyn_msix;
>>>>  	u32			rbar[7];
>>>> +	bool			has_dyn_msix;
>>>>  	bool			pci_2_3;
>>>>  	bool			virq_disabled;
>>>>  	bool			reset_works;
>>>
>>> Also, Linus on record as strongly disliking these lists of bools
>>
>> This looks like an example:
>> https://lkml.org/lkml/2017/11/21/384
>>
>>>
>>> If they don't need read_once/etc stuff then use a list of bitfields
>>
>> I do not see any direct usage of read_once in the driver, but it is not
>> clear to me what falls under the "etc" umbrella.
> 
> Anything that might assume atomicity, smp_store_release, set_bit, and others
> 
>>  Do you consider all the bools in struct vfio_pci_core_device to be
>> candidates for transition?
> 
> Yes, group them ito into a bitfield.

Will do.

> 
>> I think a base type of unsigned int since it appears to be the custom
>> and (if I understand correctly) was preferred at the time Linus wrote
>> the message I found.
> 
> It doesn't matter a lot, using "bool" means the compiler adds extra
> code to ensure "foo = 4" stores true, and the underyling size is not
> well defined (but we don't care here)

Looking further outside that email thread I do see using base type of bool
is common. I will use that. Doing so also reduces the churn of this
transition since only the data structure changes, not the code. 
>> Looking ahead there seems be be a bigger task here. A quick search
>> revealed a few other instances of vfio using "bool" in a struct. It
>> does not all qualify for your "lists of bools" comment, but they
>> may need a closer look because of the "please don't use "bool" in
>> structures at all" comment made by Linus in the email I found.
> 
> IMHO bool is helpful for clarity, it says it is a flag. In these cases
> we won't gain anything by using u8 instead
> 
> Lists of bools however start to get a little silly when we use maybe 4
> bytes per bool (though x86-64 is using 1 byte in structs)
> 

Thank you very much for catching this and providing guidance. I plan to
include this change to struct vfio_pci_core_device as a prep
patch within this series.

Reinette
  

Patch

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index ae0e161c7fc9..a3635a8e54c8 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -530,8 +530,11 @@  int vfio_pci_core_enable(struct vfio_pci_core_device *vdev)
 		vdev->msix_bar = table & PCI_MSIX_TABLE_BIR;
 		vdev->msix_offset = table & PCI_MSIX_TABLE_OFFSET;
 		vdev->msix_size = ((flags & PCI_MSIX_FLAGS_QSIZE) + 1) * 16;
-	} else
+		vdev->has_dyn_msix = pci_msix_can_alloc_dyn(pdev);
+	} else {
 		vdev->msix_bar = 0xFF;
+		vdev->has_dyn_msix = false;
+	}
 
 	if (!vfio_vga_disabled() && vfio_pci_is_vga(pdev))
 		vdev->has_vga = true;
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index 148fd1ae6c1c..4f070f2d6fde 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -67,6 +67,7 @@  struct vfio_pci_core_device {
 	u8			msix_bar;
 	u16			msix_size;
 	u32			msix_offset;
+	bool			has_dyn_msix;
 	u32			rbar[7];
 	bool			pci_2_3;
 	bool			virq_disabled;