genirq: avoid long loops in handle_edge_irq

Message ID 20230921080146.37186-1-gongwei833x@gmail.com
State New
Series genirq: avoid long loops in handle_edge_irq

Commit Message

Wei Gong Sept. 21, 2023, 8:01 a.m. UTC
When a large number of interrupts arrive on the tx queue of a
network card (irq smp_affinity=1), changing the CPU affinity of
that queue (echo 2 > /proc/irq/xx/smp_affinity) can cause
handle_edge_irq to spin for a long time in its do {} while ()
loop.

A new IRQ CPU affinity only takes effect when the next interrupt
arrives, so that interrupt is still delivered to CPU 0. Once the
new affinity has been applied on CPU 0, subsequent interrupts are
delivered to CPU 1. Each of them sets IRQS_PENDING while CPU 0 is
still inside the handler, so the loop on CPU 0 keeps running.

       cpu 0                                cpu 1
  - handle_edge_irq
    - apic_ack_irq
      - irq_do_set_affinity
                                        - handle_edge_irq
    - do {
        - handle_irq_event
          - istate &= ~IRQS_PENDING
          - IRQD_IRQ_INPROGRESS
          - spin_unlock()
                                          - spin_lock()
                                          - istate |= IRQS_PENDING
          - handle_irq_event_percpu       - mask_ack_irq()
                                          - spin_unlock()
          - spin_lock()

      } while(IRQS_PENDING &&
              !irq_disabled)

Therefore, when deciding whether to continue looping, also check
whether the current CPU is in the affinity mask of the interrupt.

Signed-off-by: Wei Gong <gongwei833x@gmail.com>
---
 kernel/irq/chip.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
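
The loop-exit condition described in the commit message can be illustrated
with a small userspace model. This is only a sketch under stated
assumptions: `model_irq`, `cpu_in_mask` and `model_handle_edge` are
hypothetical names that do not exist in the kernel, the affinity mask is
reduced to a 64-bit bitmap, and the interrupts arriving from the other CPU
are reduced to a counter of queued edges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_irq {
	uint64_t affinity;	/* bit n set => CPU n is in the affinity mask */
	bool disabled;		/* stands in for irqd_irq_disabled() */
	unsigned int queued;	/* edges that arrive while the handler runs */
};

static bool cpu_in_mask(unsigned int cpu, uint64_t mask)
{
	return cpu < 64 && ((mask >> cpu) & 1u);
}

/*
 * Model of the do {} while () loop in handle_edge_irq() as run on `cpu`.
 * Returns the number of handler passes executed before the loop exits.
 * With check_affinity == false the exit condition matches the code before
 * the patch; with true it adds the proposed affinity test.
 */
unsigned int model_handle_edge(struct model_irq *irq, unsigned int cpu,
			       bool check_affinity)
{
	unsigned int passes = 0;
	bool pending;

	assert(irq);

	do {
		passes++;	/* stands in for handle_irq_event() */

		/* Another CPU flagged a new edge while the handler ran. */
		pending = irq->queued > 0;
		if (pending)
			irq->queued--;
	} while (pending && !irq->disabled &&
		 (!check_affinity || cpu_in_mask(cpu, irq->affinity)));

	return passes;
}
```

With 1000 queued edges and an affinity mask that contains only CPU 1,
calling `model_handle_edge()` for CPU 0 makes 1001 passes without the
affinity test and a single pass with it. The model deliberately says
nothing about how an edge that is still pending at loop exit gets handled
on the new CPU.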
  

Comments

kernel test robot Sept. 23, 2023, 12:49 a.m. UTC | #1
Hi Wei,

kernel test robot noticed the following build errors:

[auto build test ERROR on tip/irq/core]
[also build test ERROR on linus/master v6.6-rc2 next-20230921]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Wei-Gong/genirq-avoid-long-loops-in-handle_edge_irq/20230922-025437
base:   tip/irq/core
patch link:    https://lore.kernel.org/r/20230921080146.37186-1-gongwei833x%40gmail.com
patch subject: [PATCH] genirq: avoid long loops in handle_edge_irq
config: um-allnoconfig (https://download.01.org/0day-ci/archive/20230923/202309230859.ygo4QTtO-lkp@intel.com/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project.git 4a5ac14ee968ff0ad5d2cc1ffa0299048db4c88a)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230923/202309230859.ygo4QTtO-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202309230859.ygo4QTtO-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from kernel/irq/chip.c:11:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/um/include/asm/io.h:24:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     547 |         val = __raw_readb(PCI_IOBASE + addr);
         |                           ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     560 |         val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
      37 | #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
         |                                                   ^
   In file included from kernel/irq/chip.c:11:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/um/include/asm/io.h:24:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     573 |         val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
      35 | #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
         |                                                   ^
   In file included from kernel/irq/chip.c:11:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/um/include/asm/io.h:24:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     584 |         __raw_writeb(value, PCI_IOBASE + addr);
         |                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     594 |         __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     604 |         __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:692:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     692 |         readsb(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:700:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     700 |         readsw(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:708:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     708 |         readsl(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:717:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     717 |         writesb(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:726:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     726 |         writesw(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:735:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     735 |         writesl(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
>> kernel/irq/chip.c:835:63: error: no member named 'affinity' in 'struct irq_common_data'
     835 |                  cpumask_test_cpu(smp_processor_id(), desc->irq_common_data.affinity));
         |                                                       ~~~~~~~~~~~~~~~~~~~~~ ^
   12 warnings and 1 error generated.
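
A likely reading of the error above is that the `affinity` member of
`struct irq_common_data` is only declared when CONFIG_SMP is enabled, which
um-allnoconfig does not set. The userspace sketch below illustrates the
usual way around that kind of problem: go through an accessor that has a
fallback for the non-SMP case instead of touching the member directly. All
names here (`MODEL_SMP`, `model_common_data`, `model_get_affinity`,
`model_cpu_allowed`) are hypothetical; in the kernel,
irq_data_get_affinity_mask() appears to fill this role, but that should be
checked against the tree the patch is based on.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model. Define MODEL_SMP to mimic an SMP build; it is left
 * undefined here to mimic a configuration without SMP support.
 */
struct model_common_data {
	void *handler_data;
#ifdef MODEL_SMP
	uint64_t affinity;	/* only present in the "SMP" variant */
#endif
};

/* The accessor hides whether the member exists in this configuration. */
static inline uint64_t model_get_affinity(const struct model_common_data *c)
{
#ifdef MODEL_SMP
	return c->affinity;
#else
	(void)c;
	return 1u;		/* uniprocessor: only CPU 0 exists */
#endif
}

/* Compiles in both variants because it never names the member itself. */
static inline int model_cpu_allowed(const struct model_common_data *c,
				    unsigned int cpu)
{
	assert(c);
	return cpu < 64 && ((model_get_affinity(c) >> cpu) & 1u);
}
```

In the non-SMP variant shown, `model_cpu_allowed()` accepts CPU 0 and
rejects every other CPU, so a loop condition built on it would never
terminate early on a uniprocessor build.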


vim +835 kernel/irq/chip.c

   771	
   772	/**
   773	 *	handle_edge_irq - edge type IRQ handler
   774	 *	@desc:	the interrupt description structure for this irq
   775	 *
   776	 *	Interrupt occurs on the falling and/or rising edge of a hardware
   777	 *	signal. The occurrence is latched into the irq controller hardware
   778	 *	and must be acked in order to be reenabled. After the ack another
   779	 *	interrupt can happen on the same source even before the first one
   780	 *	is handled by the associated event handler. If this happens it
   781	 *	might be necessary to disable (mask) the interrupt depending on the
   782	 *	controller hardware. This requires to reenable the interrupt inside
   783	 *	of the loop which handles the interrupts which have arrived while
   784	 *	the handler was running. If all pending interrupts are handled, the
   785	 *	loop is left.
   786	 */
   787	void handle_edge_irq(struct irq_desc *desc)
   788	{
   789		raw_spin_lock(&desc->lock);
   790	
   791		desc->istate &= ~(IRQS_REPLAY | IRQS_WAITING);
   792	
   793		if (!irq_may_run(desc)) {
   794			desc->istate |= IRQS_PENDING;
   795			mask_ack_irq(desc);
   796			goto out_unlock;
   797		}
   798	
   799		/*
   800		 * If its disabled or no action available then mask it and get
   801		 * out of here.
   802		 */
   803		if (irqd_irq_disabled(&desc->irq_data) || !desc->action) {
   804			desc->istate |= IRQS_PENDING;
   805			mask_ack_irq(desc);
   806			goto out_unlock;
   807		}
   808	
   809		kstat_incr_irqs_this_cpu(desc);
   810	
   811		/* Start handling the irq */
   812		desc->irq_data.chip->irq_ack(&desc->irq_data);
   813	
   814		do {
   815			if (unlikely(!desc->action)) {
   816				mask_irq(desc);
   817				goto out_unlock;
   818			}
   819	
   820			/*
   821			 * When another irq arrived while we were handling
   822			 * one, we could have masked the irq.
   823			 * Reenable it, if it was not disabled in meantime.
   824			 */
   825			if (unlikely(desc->istate & IRQS_PENDING)) {
   826				if (!irqd_irq_disabled(&desc->irq_data) &&
   827				    irqd_irq_masked(&desc->irq_data))
   828					unmask_irq(desc);
   829			}
   830	
   831			handle_irq_event(desc);
   832	
   833		} while ((desc->istate & IRQS_PENDING) &&
   834			 !irqd_irq_disabled(&desc->irq_data) &&
 > 835			 cpumask_test_cpu(smp_processor_id(), desc->irq_common_data.affinity));
   836	
   837	out_unlock:
   838		raw_spin_unlock(&desc->lock);
   839	}
   840	EXPORT_SYMBOL(handle_edge_irq);
   841
  

Patch

diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index dc94e0bf2c94..cafd395367c3 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -831,7 +831,8 @@  void handle_edge_irq(struct irq_desc *desc)
 		handle_irq_event(desc);
 
 	} while ((desc->istate & IRQS_PENDING) &&
-		 !irqd_irq_disabled(&desc->irq_data));
+		 !irqd_irq_disabled(&desc->irq_data) &&
+		 cpumask_test_cpu(smp_processor_id(), desc->irq_common_data.affinity));
 
 out_unlock:
 	raw_spin_unlock(&desc->lock);