[v2] genirq: avoid long loops in handle_edge_irq

Message ID 20230925025154.37959-1-gongwei833x@gmail.com
State New
Series [v2] genirq: avoid long loops in handle_edge_irq

Commit Message

Wei Gong Sept. 25, 2023, 2:51 a.m. UTC
When a large number of interrupts arrive on a network card's tx queue
(IRQ smp_affinity=1), changing the CPU affinity of that queue's IRQ
(echo 2 > /proc/irq/xx/smp_affinity) can make handle_edge_irq() spin for
a long time in its do {} while() loop.

After the IRQ's CPU affinity is changed, the new setting is only applied
when the next interrupt arrives, so that interrupt is still delivered to
CPU 0. Once the new affinity has been activated from CPU 0, subsequent
interrupts are handled on CPU 1.

       cpu 0                                cpu 1
  - handle_edge_irq
    - apic_ack_irq
      - irq_do_set_affinity
                                        - handle_edge_irq
    - do {
        - handle_irq_event
          - istate &= ~IRQS_PENDING
          - IRQD_IRQ_INPROGRESS
          - spin_unlock()
                                          - spin_lock()
                                          - istate |= IRQS_PENDING
          - handle_irq_event_percpu       - mask_ack_irq()
                                          - spin_unlock()
          - spin_lock()

      } while (IRQS_PENDING &&
               !irqd_irq_disabled)

Therefore, when deciding whether to continue looping, also check whether
the current CPU still belongs to the interrupt's affinity mask.

Signed-off-by: Wei Gong <gongwei833x@gmail.com>
---
 kernel/irq/chip.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
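
For context, the loop the commit message refers to lives in kernel/irq/chip.c
and looks roughly like the simplified sketch below (error paths trimmed). The
comments mark the window in which CPU 1 from the diagram above sets
IRQS_PENDING again, which is what keeps CPU 0 spinning.

/* Simplified sketch of handle_edge_irq(), trimmed to the relevant part. */
void handle_edge_irq(struct irq_desc *desc)
{
	raw_spin_lock(&desc->lock);

	/* ... ack the interrupt, handle the !irq_may_run() case, etc. ... */

	do {
		/*
		 * If another edge arrived while the handler ran, the line was
		 * masked and IRQS_PENDING was set; unmask it and go again.
		 */
		if (unlikely(desc->istate & IRQS_PENDING)) {
			if (!irqd_irq_disabled(&desc->irq_data) &&
			    irqd_irq_masked(&desc->irq_data))
				unmask_irq(desc);
		}

		/*
		 * Drops desc->lock while the action handlers run. This is the
		 * window in which CPU 1 (see the diagram above) sets
		 * IRQS_PENDING again, so the condition below stays true as
		 * long as interrupts keep arriving.
		 */
		handle_irq_event(desc);

	} while ((desc->istate & IRQS_PENDING) &&
		 !irqd_irq_disabled(&desc->irq_data));

	raw_spin_unlock(&desc->lock);
}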
  

Comments

Thomas Gleixner Sept. 26, 2023, 12:28 p.m. UTC | #1
On Mon, Sep 25 2023 at 10:51, Wei Gong wrote:
> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index dc94e0bf2c94..6da455e1a692 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -831,7 +831,8 @@ void handle_edge_irq(struct irq_desc *desc)
>  		handle_irq_event(desc);
>  
>  	} while ((desc->istate & IRQS_PENDING) &&
> -		 !irqd_irq_disabled(&desc->irq_data));
> +		 !irqd_irq_disabled(&desc->irq_data) &&
> +		 cpumask_test_cpu(smp_processor_id(), irq_data_get_affinity_mask(&desc->irq_data)));

Assume the affinity mask has CPU0 and CPU1 set and the loop is on CPU0, but
the effective affinity is on CPU1 then how is this going to move the
interrupt?

Thanks,

        tglx
  
Wei Gong Sept. 27, 2023, 7:53 a.m. UTC | #2
On Tue, Sep 26, 2023 at 02:28:21PM +0200, Thomas Gleixner wrote:
> On Mon, Sep 25 2023 at 10:51, Wei Gong wrote:
> > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> > index dc94e0bf2c94..6da455e1a692 100644
> > --- a/kernel/irq/chip.c
> > +++ b/kernel/irq/chip.c
> > @@ -831,7 +831,8 @@ void handle_edge_irq(struct irq_desc *desc)
> >  		handle_irq_event(desc);
> >  
> >  	} while ((desc->istate & IRQS_PENDING) &&
> > -		 !irqd_irq_disabled(&desc->irq_data));
> > +		 !irqd_irq_disabled(&desc->irq_data) &&
> > +		 cpumask_test_cpu(smp_processor_id(), irq_data_get_affinity_mask(&desc->irq_data)));
> 
> Assume the affinity mask has CPU0 and CPU1 set and the loop is on CPU0, but
> the effective affinity is on CPU1 then how is this going to move the
> interrupt?

The loop being on CPU0 means that the previous effective affinity was CPU0.
When the previous effective affinity is a subset of the new affinity mask,
the effective affinity will not be updated.
Therefore, as I understand it, the scenario you mention will not occur?

> 
> Thanks,
> 
>         tglx


Thanks,

Wei Gong
  
Thomas Gleixner Sept. 27, 2023, 3:25 p.m. UTC | #3
On Wed, Sep 27 2023 at 15:53, Wei Gong wrote:
> On Tue, Sep 26, 2023 at 02:28:21PM +0200, Thomas Gleixner wrote:
>> On Mon, Sep 25 2023 at 10:51, Wei Gong wrote:
>> > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
>> > index dc94e0bf2c94..6da455e1a692 100644
>> > --- a/kernel/irq/chip.c
>> > +++ b/kernel/irq/chip.c
>> > @@ -831,7 +831,8 @@ void handle_edge_irq(struct irq_desc *desc)
>> >  		handle_irq_event(desc);
>> >  
>> >  	} while ((desc->istate & IRQS_PENDING) &&
>> > -		 !irqd_irq_disabled(&desc->irq_data));
>> > +		 !irqd_irq_disabled(&desc->irq_data) &&
>> > +		 cpumask_test_cpu(smp_processor_id(), irq_data_get_affinity_mask(&desc->irq_data)));
>> 
>> Assume the affinity mask has CPU0 and CPU1 set and the loop is on CPU0, but
>> the effective affinity is on CPU1 then how is this going to move the
>> interrupt?
>
> The loop being on CPU0 means that the previous effective affinity was CPU0.
> When the previous effective affinity is a subset of the new affinity mask,
> the effective affinity will not be updated.

That's an implementation detail of a particular interrupt chip driver,
but not a general guaranteed behaviour.
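
For illustration, the shortcut Wei Gong is relying on looks roughly like the
sketch below in the x86 vector allocator: if the CPU the interrupt currently
targets is still contained in the newly requested mask, the move is skipped
and the effective affinity stays put. The names here are illustrative rather
than a verbatim excerpt, and, as Thomas points out above, other irqchip
drivers are not required to behave this way.

#include <linux/cpumask.h>

/* Hypothetical per-interrupt state, for illustration only. */
struct example_irq_target {
	unsigned int cpu;	/* CPU the interrupt is currently routed to */
};

static int example_set_affinity(struct example_irq_target *t,
				const struct cpumask *new_mask)
{
	/* Current target still acceptable: keep the effective affinity. */
	if (cpu_online(t->cpu) && cpumask_test_cpu(t->cpu, new_mask))
		return 0;

	/* Otherwise pick a new online target from new_mask and reprogram it. */
	t->cpu = cpumask_first_and(new_mask, cpu_online_mask);
	return 0;
}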
  
Wei Gong Sept. 28, 2023, 2:22 a.m. UTC | #4
On Wed, Sep 27, 2023 at 05:25:24PM +0200, Thomas Gleixner wrote:
> On Wed, Sep 27 2023 at 15:53, Wei Gong wrote:
> > On Tue, Sep 26, 2023 at 02:28:21PM +0200, Thomas Gleixner wrote:
> >> On Mon, Sep 25 2023 at 10:51, Wei Gong wrote:
> >> > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> >> > index dc94e0bf2c94..6da455e1a692 100644
> >> > --- a/kernel/irq/chip.c
> >> > +++ b/kernel/irq/chip.c
> >> > @@ -831,7 +831,8 @@ void handle_edge_irq(struct irq_desc *desc)
> >> >  		handle_irq_event(desc);
> >> >  
> >> >  	} while ((desc->istate & IRQS_PENDING) &&
> >> > -		 !irqd_irq_disabled(&desc->irq_data));
> >> > +		 !irqd_irq_disabled(&desc->irq_data) &&
> >> > +		 cpumask_test_cpu(smp_processor_id(), irq_data_get_affinity_mask(&desc->irq_data)));
> >> 
> >> Assume the affinity mask has CPU0 and CPU1 set and the loop is on CPU0, but
> >> the effective affinity is on CPU1 then how is this going to move the
> >> interrupt?

Can replacing irq_data_get_affinity_mask with irq_data_get_effective_affinity_mask
solve this issue?

> >
> > The loop being on CPU0 means that the previous effective affinity was CPU0.
> > When the previous effective affinity is a subset of the new affinity mask,
> > the effective affinity will not be updated.
> 
> That's an implementation detail of a particular interrupt chip driver,
> but not a general guaranteed behaviour.
>
  
Thomas Gleixner Sept. 28, 2023, 9:28 a.m. UTC | #5
On Thu, Sep 28 2023 at 10:22, Wei Gong wrote:
> On Wed, Sep 27, 2023 at 05:25:24PM +0200, Thomas Gleixner wrote:
>> On Wed, Sep 27 2023 at 15:53, Wei Gong wrote:
>> > On Tue, Sep 26, 2023 at 02:28:21PM +0200, Thomas Gleixner wrote:
>> >> On Mon, Sep 25 2023 at 10:51, Wei Gong wrote:
>> >> > diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
>> >> > index dc94e0bf2c94..6da455e1a692 100644
>> >> > --- a/kernel/irq/chip.c
>> >> > +++ b/kernel/irq/chip.c
>> >> > @@ -831,7 +831,8 @@ void handle_edge_irq(struct irq_desc *desc)
>> >> >  		handle_irq_event(desc);
>> >> >  
>> >> >  	} while ((desc->istate & IRQS_PENDING) &&
>> >> > -		 !irqd_irq_disabled(&desc->irq_data));
>> >> > +		 !irqd_irq_disabled(&desc->irq_data) &&
>> >> > +		 cpumask_test_cpu(smp_processor_id(), irq_data_get_affinity_mask(&desc->irq_data)));
>> >> 
>> >> Assume the affinity mask has CPU0 and CPU1 set and the loop is on CPU0, but
>> >> the effective affinity is on CPU1 then how is this going to move the
>> >> interrupt?
>> >
>> > The loop being on CPU0 means that the previous effective affinity was CPU0.
>> > When the previous effective affinity is a subset of the new affinity mask,
>> > the effective affinity will not be updated.
>> 
>> That's an implementation detail of a particular interrupt chip driver,
>> but not a general guaranteed behaviour.
>> 
>
> Can replacing irq_data_get_affinity_mask with irq_data_get_effective_affinity_mask
> solve this issue?

Yes.
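
A revised condition along those lines would then look roughly like the sketch
below, testing the effective affinity mask instead of the requested one (this
reflects the change as discussed in the thread, not a posted v3):

	} while ((desc->istate & IRQS_PENDING) &&
		 !irqd_irq_disabled(&desc->irq_data) &&
		 cpumask_test_cpu(smp_processor_id(),
				  irq_data_get_effective_affinity_mask(&desc->irq_data)));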
  

Patch

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index dc94e0bf2c94..6da455e1a692 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -831,7 +831,8 @@ void handle_edge_irq(struct irq_desc *desc)
 		handle_irq_event(desc);
 
 	} while ((desc->istate & IRQS_PENDING) &&
-		 !irqd_irq_disabled(&desc->irq_data));
+		 !irqd_irq_disabled(&desc->irq_data) &&
+		 cpumask_test_cpu(smp_processor_id(), irq_data_get_affinity_mask(&desc->irq_data)));
 
 out_unlock:
 	raw_spin_unlock(&desc->lock);