[v2,01/11] genirq/affinity: Export irq_create_affinity_masks()

Message ID 20221205084127.535-2-xieyongji@bytedance.com
State New
Series VDUSE: Improve performance

Commit Message

Yongji Xie Dec. 5, 2022, 8:41 a.m. UTC
  Export irq_create_affinity_masks() so that some modules
can make use of it to implement an interrupt affinity
spreading mechanism.

Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
---
 kernel/irq/affinity.c | 1 +
 1 file changed, 1 insertion(+)
  

Comments

Christoph Hellwig Dec. 6, 2022, 8:18 a.m. UTC | #1
On Mon, Dec 05, 2022 at 04:41:17PM +0800, Xie Yongji wrote:
> Export irq_create_affinity_masks() so that some modules
> can make use of it to implement an interrupt affinity
> spreading mechanism.

I don't think drivers should be building low-level affinity masks.
  
Yongji Xie Dec. 6, 2022, 8:40 a.m. UTC | #2
On Tue, Dec 6, 2022 at 4:18 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Mon, Dec 05, 2022 at 04:41:17PM +0800, Xie Yongji wrote:
> > Export irq_create_affinity_masks() so that some modules
> > can make use of it to implement an interrupt affinity
> > spreading mechanism.
>
> I don't think drivers should be building low-level affinity masks.

With the vDPA framework, some drivers (vduse, vdpa-sim) can create
software-defined virtio devices and attach them to the virtio bus.
This kind of virtio device is neither a PCI device nor a platform
device, so we would need to export this function if we want to
implement automatic affinity management for the virtio device
driver bound to such a device.

Thanks,
Yongji
  
Christoph Hellwig Dec. 6, 2022, 8:47 a.m. UTC | #3
On Tue, Dec 06, 2022 at 04:40:37PM +0800, Yongji Xie wrote:
> With the vDPA framework, some drivers (vduse, vdpa-sim) can create
> software-defined virtio devices and attach them to the virtio bus.
> This kind of virtio device is neither a PCI device nor a platform
> device, so we would need to export this function if we want to
> implement automatic affinity management for the virtio device
> driver bound to such a device.

Why are these devices even using interrupts?  The whole vdpa thing
is a mess; I also still need to fix up the horrible abuse of the DMA
API for something that isn't even DMA, and this just seems to spread
that same mistake even further.
  
Yongji Xie Dec. 6, 2022, 9:28 a.m. UTC | #4
On Tue, Dec 6, 2022 at 4:47 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Tue, Dec 06, 2022 at 04:40:37PM +0800, Yongji Xie wrote:
> > With the vDPA framework, some drivers (vduse, vdpa-sim) can create
> > software-defined virtio devices and attach them to the virtio bus.
> > This kind of virtio device is neither a PCI device nor a platform
> > device, so we would need to export this function if we want to
> > implement automatic affinity management for the virtio device
> > driver bound to such a device.
>
> Why are these devices even using interrupts?

They don't use interrupts, but they do use a bound workqueue to run
the interrupt callback. So the driver needs an algorithm to choose
which CPU runs the interrupt callback. We found that the existing
interrupt affinity spreading mechanism suits this scenario very
well, so we are trying to export this function to reuse it.
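
Roughly what we have in mind (just a sketch; the vduse_* names are
made up, not actual code):

#include <linux/cpumask.h>
#include <linux/interrupt.h>
#include <linux/slab.h>

static int vduse_spread_vq_callbacks(unsigned int nvqs, int *cb_cpu)
{
	struct irq_affinity affd = { 0 };
	struct irq_affinity_desc *masks;
	unsigned int i;

	/* Build one evenly spread affinity mask per virtqueue */
	masks = irq_create_affinity_masks(nvqs, &affd);
	if (!masks)
		return -ENOMEM;

	/* Pick one CPU out of each mask to run the callback on */
	for (i = 0; i < nvqs; i++)
		cb_cpu[i] = cpumask_first(&masks[i].mask);

	kfree(masks);
	return 0;
}

The callback work is then queued with queue_work_on(cb_cpu[i], ...)
on the bound workqueue instead of raising a real interrupt.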

> The whole vdpa thing
> is a mess; I also still need to fix up the horrible abuse of the DMA
> API for something that isn't even DMA, and this just seems to spread
> that same mistake even further.

We just want to reuse this algorithm, which is completely independent
of the IRQ subsystem, so I don't think it would mess things up.

Thanks,
Yongji
  
Jason Wang Dec. 7, 2022, 5:56 a.m. UTC | #5
On Tue, Dec 6, 2022 at 5:28 PM Yongji Xie <xieyongji@bytedance.com> wrote:
>
> On Tue, Dec 6, 2022 at 4:47 PM Christoph Hellwig <hch@lst.de> wrote:
> >
> > On Tue, Dec 06, 2022 at 04:40:37PM +0800, Yongji Xie wrote:
> > > With the vDPA framework, some drivers (vduse, vdpa-sim) can create
> > > software-defined virtio devices and attach them to the virtio bus.
> > > This kind of virtio device is neither a PCI device nor a platform
> > > device, so we would need to export this function if we want to
> > > implement automatic affinity management for the virtio device
> > > driver bound to such a device.
> >
> > Why are these devices even using interrupts?
>
> They don't use interrupts, but they do use a bound workqueue to run
> the interrupt callback. So the driver needs an algorithm to choose
> which CPU runs the interrupt callback. We found that the existing
> interrupt affinity spreading mechanism suits this scenario very
> well, so we are trying to export this function to reuse it.
>
> > The whole vdpa thing
> > is a mess; I also still need to fix up the horrible abuse of the DMA
> > API for something that isn't even DMA, and this just seems to spread
> > that same mistake even further.

I think it's mostly an issue of some vDPA parents, not vDPA
itself. I have patches to get rid of the DMA API for the vDPA
simulators and will post them.

>
> We just want to reuse this algorithm, which is completely independent
> of the IRQ subsystem, so I don't think it would mess things up.

I think so; it's about which CPU we want to run the callback on,
and the callback is not necessarily triggered by an IRQ.

Thanks

>
> Thanks,
> Yongji
>
  
Michael S. Tsirkin Dec. 19, 2022, 7:33 a.m. UTC | #6
On Mon, Dec 05, 2022 at 04:41:17PM +0800, Xie Yongji wrote:
> Export irq_create_affinity_masks() so that some modules
> can make use of it to implement an interrupt affinity
> spreading mechanism.
> 
> Signed-off-by: Xie Yongji <xieyongji@bytedance.com>

So this got nacked, what's the plan now?

> ---
>  kernel/irq/affinity.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
> index d9a5c1d65a79..f074a7707c6d 100644
> --- a/kernel/irq/affinity.c
> +++ b/kernel/irq/affinity.c
> @@ -487,6 +487,7 @@ irq_create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
>  
>  	return masks;
>  }
> +EXPORT_SYMBOL_GPL(irq_create_affinity_masks);
>  
>  /**
>   * irq_calc_affinity_vectors - Calculate the optimal number of vectors
> -- 
> 2.20.1
  
Yongji Xie Dec. 19, 2022, 9:36 a.m. UTC | #7
On Mon, Dec 19, 2022 at 3:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Dec 05, 2022 at 04:41:17PM +0800, Xie Yongji wrote:
> > Export irq_create_affinity_masks() so that some modules
> > can make use of it to implement an interrupt affinity
> > spreading mechanism.
> >
> > Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
>
> So this got nacked, what's the plan now?
>

I'd like to check with Christoph again first.

Hi Christoph,

Jason will post some patches to get rid of the DMA API for vDPA
simulators. And the irq affinity algorithm is independent of the IRQ
subsystem, IIUC. So could you accept this patch so that we can reuse
the algorithm to select the best CPU (per-CPU affinity if possible,
or at least per-node) to run the virtqueue's irq callback?

Thanks,
Yongji

> > ---
> >  kernel/irq/affinity.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
> > index d9a5c1d65a79..f074a7707c6d 100644
> > --- a/kernel/irq/affinity.c
> > +++ b/kernel/irq/affinity.c
> > @@ -487,6 +487,7 @@ irq_create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
> >
> >       return masks;
> >  }
> > +EXPORT_SYMBOL_GPL(irq_create_affinity_masks);
> >
> >  /**
> >   * irq_calc_affinity_vectors - Calculate the optimal number of vectors
> > --
> > 2.20.1
>
  
Michael S. Tsirkin Jan. 27, 2023, 8:22 a.m. UTC | #8
On Mon, Dec 19, 2022 at 05:36:02PM +0800, Yongji Xie wrote:
> On Mon, Dec 19, 2022 at 3:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Dec 05, 2022 at 04:41:17PM +0800, Xie Yongji wrote:
> > > Export irq_create_affinity_masks() so that some modules
> > > can make use of it to implement an interrupt affinity
> > > spreading mechanism.
> > >
> > > Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
> >
> > So this got nacked, what's the plan now?
> >
> 
> I'd like to check with Christoph again first.
> 
> Hi Christoph,
> 
> Jason will post some patches to get rid of the DMA API for vDPA
> simulators. And the irq affinity algorithm is independent of the IRQ
> subsystem, IIUC. So could you accept this patch so that we can reuse
> the algorithm to select the best CPU (per-CPU affinity if possible,
> or at least per-node) to run the virtqueue's irq callback?
> 
> Thanks,
> Yongji

I think you need to explain why you are building low-level
affinity masks.  What's the plan now?

> > > ---
> > >  kernel/irq/affinity.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
> > > index d9a5c1d65a79..f074a7707c6d 100644
> > > --- a/kernel/irq/affinity.c
> > > +++ b/kernel/irq/affinity.c
> > > @@ -487,6 +487,7 @@ irq_create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
> > >
> > >       return masks;
> > >  }
> > > +EXPORT_SYMBOL_GPL(irq_create_affinity_masks);
> > >
> > >  /**
> > >   * irq_calc_affinity_vectors - Calculate the optimal number of vectors
> > > --
> > > 2.20.1
> >
  
Yongji Xie Jan. 30, 2023, 11:53 a.m. UTC | #9
On Fri, Jan 27, 2023 at 4:22 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Dec 19, 2022 at 05:36:02PM +0800, Yongji Xie wrote:
> > On Mon, Dec 19, 2022 at 3:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Dec 05, 2022 at 04:41:17PM +0800, Xie Yongji wrote:
> > > > Export irq_create_affinity_masks() so that some modules
> > > > can make use of it to implement an interrupt affinity
> > > > spreading mechanism.
> > > >
> > > > Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
> > >
> > > So this got nacked, what's the plan now?
> > >
> >
> > I'd like to check with Christoph again first.
> >
> > Hi Christoph,
> >
> > Jason will post some patches to get rid of the DMA API for vDPA
> > simulators. And the irq affinity algorithm is independent of the IRQ
> > subsystem, IIUC. So could you accept this patch so that we can reuse
> > the algorithm to select the best CPU (per-CPU affinity if possible,
> > or at least per-node) to run the virtqueue's irq callback?
> >
> > Thanks,
> > Yongji
>
> I think you need to explain why you are building low-level
> affinity masks.

In the VDUSE case, we use a workqueue to run the virtqueue's irq
callback. Now I want to queue the irq callback work on one specific
CPU to get per-CPU affinity if possible, or at least per-node
affinity. So I need this function to build the low-level affinity
mask for each virtqueue.
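
Something like this, as a rough sketch (vduse_irq_wq, irq_affinity,
last_cb_cpu and cb_work are illustrative names, not the actual
code):

#include <linux/cpumask.h>
#include <linux/workqueue.h>

struct vduse_virtqueue {
	struct cpumask irq_affinity;	/* one of the masks built above */
	int last_cb_cpu;		/* initialized to -1 */
	struct work_struct cb_work;	/* runs the virtio callback */
};

static struct workqueue_struct *vduse_irq_wq;

static void vduse_vq_inject_irq(struct vduse_virtqueue *vq)
{
	unsigned int cpu;

	/* Rotate through the CPUs in the per-virtqueue spread mask */
	cpu = cpumask_next(vq->last_cb_cpu, &vq->irq_affinity);
	if (cpu >= nr_cpu_ids)
		cpu = cpumask_first(&vq->irq_affinity);
	vq->last_cb_cpu = cpu;

	queue_work_on(cpu, vduse_irq_wq, &vq->cb_work);
}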

> What's the plan now?
>

If there is no objection, I'll post a new version.

Thanks,
Yongji
  
Michael S. Tsirkin Feb. 13, 2023, 11:59 a.m. UTC | #10
On Mon, Jan 30, 2023 at 07:53:55PM +0800, Yongji Xie wrote:
> On Fri, Jan 27, 2023 at 4:22 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Mon, Dec 19, 2022 at 05:36:02PM +0800, Yongji Xie wrote:
> > > On Mon, Dec 19, 2022 at 3:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Mon, Dec 05, 2022 at 04:41:17PM +0800, Xie Yongji wrote:
> > > > > Export irq_create_affinity_masks() so that some modules
> > > > > can make use of it to implement an interrupt affinity
> > > > > spreading mechanism.
> > > > >
> > > > > Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
> > > >
> > > > So this got nacked, what's the plan now?
> > > >
> > >
> > > I'd like to check with Christoph again first.
> > >
> > > Hi Christoph,
> > >
> > > Jason will post some patches to get rid of the DMA API for vDPA
> > > simulators. And the irq affinity algorithm is independent of the IRQ
> > > subsystem, IIUC. So could you accept this patch so that we can reuse
> > > the algorithm to select the best CPU (per-CPU affinity if possible,
> > > or at least per-node) to run the virtqueue's irq callback?
> > >
> > > Thanks,
> > > Yongji
> >
> > I think you need to explain why you are building low-level
> > affinity masks.
> 
> In the VDUSE case, we use a workqueue to run the virtqueue's irq
> callback. Now I want to queue the irq callback work on one specific
> CPU to get per-CPU affinity if possible, or at least per-node
> affinity. So I need this function to build the low-level affinity
> mask for each virtqueue.
> 
> > What's the plan now?
> >
> 
> If there is no objection, I'll post a new version.
> 
> Thanks,
> Yongji

I doubt you made a convincing case here - I think Christoph was
saying that if it is not an irq, it should not use an irq affinity
API.  So a new API, possibly sharing its implementation with the
irq affinity code, is called for then?  Maybe.
  
Yongji Xie Feb. 13, 2023, 2:50 p.m. UTC | #11
On Mon, Feb 13, 2023 at 8:00 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Jan 30, 2023 at 07:53:55PM +0800, Yongji Xie wrote:
> > On Fri, Jan 27, 2023 at 4:22 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Dec 19, 2022 at 05:36:02PM +0800, Yongji Xie wrote:
> > > > On Mon, Dec 19, 2022 at 3:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Mon, Dec 05, 2022 at 04:41:17PM +0800, Xie Yongji wrote:
> > > > > > Export irq_create_affinity_masks() so that some modules
> > > > > > can make use of it to implement an interrupt affinity
> > > > > > spreading mechanism.
> > > > > >
> > > > > > Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
> > > > >
> > > > > So this got nacked, what's the plan now?
> > > > >
> > > >
> > > > I'd like to check with Christoph again first.
> > > >
> > > > Hi Christoph,
> > > >
> > > > Jason will post some patches to get rid of the DMA API for vDPA
> > > > simulators. And the irq affinity algorithm is independent of the IRQ
> > > > subsystem, IIUC. So could you accept this patch so that we can reuse
> > > > the algorithm to select the best CPU (per-CPU affinity if possible,
> > > > or at least per-node) to run the virtqueue's irq callback?
> > > >
> > > > Thanks,
> > > > Yongji
> > >
> > > I think you need to explain why you are building low-level
> > > affinity masks.
> >
> > In the VDUSE case, we use a workqueue to run the virtqueue's irq
> > callback. Now I want to queue the irq callback work on one specific
> > CPU to get per-CPU affinity if possible, or at least per-node
> > affinity. So I need this function to build the low-level affinity
> > mask for each virtqueue.
> >
> > > What's the plan now?
> > >
> >
> > If there is no objection, I'll post a new version.
> >
> > Thanks,
> > Yongji
>
> I doubt you made a convincing case here - I think Christoph was
> saying that if it is not an irq, it should not use an irq affinity
> API.  So a new API, possibly sharing its implementation with the
> irq affinity code, is called for then?  Maybe.
>

I'm not sure I get your point about sharing the implementation.

I can try to split irq_create_affinity_masks() into a common part
and an irq-specific part, move the common part to a common dir such
as lib/, and export it. Then we can build a new API on top of the
common part.
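
From the driver side that could look like this (a rough sketch;
cpu_spread_masks() is a made-up name for the would-be common API,
not an existing function):

	/* one evenly spread cpumask per virtqueue, no irq_affinity */
	struct cpumask *masks = cpu_spread_masks(nvqs);

while irq_create_affinity_masks() would stay in kernel/irq/ as an
irq-specific wrapper around the same code.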

But I'm afraid there would be no difference between the new API and
the irq affinity API except for the name, since the new API would
still be used for irq affinity management. That means we might still
need the irq-specific part in the new API. For example, the
virtio-blk driver doesn't know whether the virtio device is a
software-defined vDPA device or a PCI device, so it passes a struct
irq_affinity to those APIs, and then both the new API and the irq
affinity API would still need to handle it.

Thanks,
Yongji
  
Thomas Gleixner Feb. 13, 2023, 6:53 p.m. UTC | #12
On Mon, Feb 13 2023 at 22:50, Yongji Xie wrote:
> On Mon, Feb 13, 2023 at 8:00 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> I can try to split irq_create_affinity_masks() into a common part
> and an irq-specific part, move the common part to a common dir such
> as lib/, and export it. Then we can build a new API on top of the
> common part.

  https://lore.kernel.org/all/20221227022905.352674-1-ming.lei@redhat.com/

Thanks,

        tglx
  
Yongji Xie Feb. 14, 2023, 2:38 a.m. UTC | #13
On Tue, Feb 14, 2023 at 2:54 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Mon, Feb 13 2023 at 22:50, Yongji Xie wrote:
> > On Mon, Feb 13, 2023 at 8:00 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > I can try to split irq_create_affinity_masks() into a common part
> > and an irq-specific part, move the common part to a common dir such
> > as lib/, and export it. Then we can build a new API on top of the
> > common part.
>
>   https://lore.kernel.org/all/20221227022905.352674-1-ming.lei@redhat.com/
>

Thanks for the kind reminder! I will rebase my patchset on it.
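
With that series, the rebase should be fairly mechanical, roughly
like this (a sketch only, assuming the series lands as posted;
struct vduse_dev and the vqs[] layout here are illustrative):

#include <linux/cpumask.h>
#include <linux/group_cpus.h>
#include <linux/slab.h>

struct vduse_dev {
	struct vduse_virtqueue *vqs;	/* per-vq state as sketched earlier */
};

static int vduse_setup_vq_affinity(struct vduse_dev *dev,
				   unsigned int nvqs)
{
	struct cpumask *masks;
	unsigned int i;

	/* the common part split out of irq_create_affinity_masks() */
	masks = group_cpus_evenly(nvqs);
	if (!masks)
		return -ENOMEM;

	for (i = 0; i < nvqs; i++)
		cpumask_copy(&dev->vqs[i].irq_affinity, &masks[i]);

	kfree(masks);
	return 0;
}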

Thanks,
Yongji
  

Patch

diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index d9a5c1d65a79..f074a7707c6d 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -487,6 +487,7 @@  irq_create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd)
 
 	return masks;
 }
+EXPORT_SYMBOL_GPL(irq_create_affinity_masks);
 
 /**
  * irq_calc_affinity_vectors - Calculate the optimal number of vectors