[PATCHv10,3/4] genirq: Avoid summation loops for /proc/interrupts

Message ID 20240226020939.45264-4-yaoma@linux.alibaba.com
State New
Series *** Detect interrupt storm in softlockup ***

Commit Message

Bitao Hu Feb. 26, 2024, 2:09 a.m. UTC
  We could use the irq_desc::tot_count member to avoid the summation
loop for interrupts which are not marked as 'PER_CPU' interrupts in
'show_interrupts'. This could reduce the time overhead of reading
/proc/interrupts.

Originally-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Bitao Hu <yaoma@linux.alibaba.com>
---
 include/linux/irqdesc.h | 2 ++
 kernel/irq/irqdesc.c    | 2 +-
 kernel/irq/proc.c       | 9 +++++++--
 3 files changed, 10 insertions(+), 3 deletions(-)
  

Comments

Liu Song Feb. 27, 2024, 7:48 a.m. UTC | #1
On 2024/2/26 10:09, Bitao Hu wrote:
> We could use the irq_desc::tot_count member to avoid the summation
> loop for interrupts which are not marked as 'PER_CPU' interrupts in
> 'show_interrupts'. This could reduce the time overhead of reading
> /proc/interrupts.
>
> Originally-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Bitao Hu <yaoma@linux.alibaba.com>
> ---
>   include/linux/irqdesc.h | 2 ++
>   kernel/irq/irqdesc.c    | 2 +-
>   kernel/irq/proc.c       | 9 +++++++--
>   3 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h
> index 2912b1998670..1ee96d7232b4 100644
> --- a/include/linux/irqdesc.h
> +++ b/include/linux/irqdesc.h
> @@ -121,6 +121,8 @@ static inline void irq_unlock_sparse(void) { }
>   extern struct irq_desc irq_desc[NR_IRQS];
>   #endif
>   
> +extern bool irq_is_nmi(struct irq_desc *desc);
> +
>   static inline unsigned int irq_desc_kstat_cpu(struct irq_desc *desc,
>   					      unsigned int cpu)
>   {
> diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
> index 9cd17080b2d8..56a767957a9d 100644
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -955,7 +955,7 @@ unsigned int kstat_irqs_cpu(unsigned int irq, int cpu)
>   	return desc && desc->kstat_irqs ? per_cpu(desc->kstat_irqs->cnt, cpu) : 0;
>   }
>   
> -static bool irq_is_nmi(struct irq_desc *desc)
> +bool irq_is_nmi(struct irq_desc *desc)
>   {
>   	return desc->istate & IRQS_NMI;
>   }
> diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
> index 6954e0a02047..b3b1b93f0410 100644
> --- a/kernel/irq/proc.c
> +++ b/kernel/irq/proc.c
> @@ -489,8 +489,13 @@ int show_interrupts(struct seq_file *p, void *v)
>   		goto outsparse;
>   
>   	if (desc->kstat_irqs) {
> -		for_each_online_cpu(j)
> -			any_count |= data_race(per_cpu(desc->kstat_irqs->cnt, j));
> +		if (!irq_settings_is_per_cpu_devid(desc) &&
> +		    !irq_settings_is_per_cpu(desc) &&
> +		    !irq_is_nmi(desc))
> +			any_count = data_race(desc->tot_count);
> +		else
> +			for_each_online_cpu(j)
> +				any_count |= data_race(per_cpu(desc->kstat_irqs->cnt, j));
>   	}
>   
>   	if ((!desc->action || irq_desc_is_chained(desc)) && !any_count)

The modification borrows from the implementation of kstat_irqs. Looks good.

Reviewed-by: Liu Song <liusong@linux.alibaba.com>
  
Thomas Gleixner Feb. 27, 2024, 9:26 a.m. UTC | #2
On Mon, Feb 26 2024 at 10:09, Bitao Hu wrote:
> We could use the irq_desc::tot_count member to avoid the summation
> loop for interrupts which are not marked as 'PER_CPU' interrupts in
> 'show_interrupts'. This could reduce the time overhead of reading
> /proc/interrupts.

"Could" is not really a technical term. Either we do or we do not. Also
please provide context for your change and avoid the 'We'.

> --- a/include/linux/irqdesc.h
> +++ b/include/linux/irqdesc.h
> @@ -121,6 +121,8 @@ static inline void irq_unlock_sparse(void) { }
>  extern struct irq_desc irq_desc[NR_IRQS];
>  #endif
>
> +extern bool irq_is_nmi(struct irq_desc *desc);
> +

If at all this wants to be in kernel/irq/internal.h. There is zero
reason to expose this globally.

> -static bool irq_is_nmi(struct irq_desc *desc)
> +bool irq_is_nmi(struct irq_desc *desc)
>  {
>  	return desc->istate & IRQS_NMI;
>  }

If at all this really wants to be a static inline in internal.h, but
instead of blindly copying code this can be done smarter:

unsigned int kstat_irq_desc(struct irq_desc *desc)
{
	unsigned int sum = 0;
	int cpu;

	if (!irq_settings_is_per_cpu_devid(desc) &&
	    !irq_settings_is_per_cpu(desc) &&
	    !irq_is_nmi(desc))
		return data_race(desc->tot_count);

	for_each_possible_cpu(cpu)
		sum += data_race(*per_cpu_ptr(desc->kstat_irqs, cpu));
	return sum;
}

and then let kstat_irqs() and show_interrupts() use it. See?

With that a proper changelog would be:

   show_interrupts() unconditionally accumulates the per CPU interrupt
   statistics to determine whether an interrupt was ever raised.

   This can be avoided for all interrupts which are not strictly per CPU
   and not of type NMI because those interrupts provide already an
   accumulated counter. The required logic is already implemented in
   kstat_irqs().

   Split the inner access logic out of kstat_irqs() and use it for
   kstat_irqs() and show_interrupts() to avoid the accumulation loop
   when possible.

Thanks,

        tglx
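
For illustration only, the fast path suggested above can be modelled in
userspace C. This is a minimal sketch, not kernel code: model_irq_desc,
model_account() and model_kstat_irq_desc() are hypothetical names, the CPU
count is fixed at four, and data_race() and locking are ignored. The model
assumes, as the fast-path check implies, that tot_count is only maintained
for interrupts that are neither per-CPU nor NMI.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for struct irq_desc; not the kernel layout. */
struct model_irq_desc {
	unsigned int tot_count;          /* accumulated count (assumed: non-per-CPU, non-NMI only) */
	unsigned int cnt[MODEL_NR_CPUS]; /* per-CPU counts */
	bool per_cpu;                    /* models irq_settings_is_per_cpu{,_devid}() */
	bool nmi;                        /* models irq_is_nmi() */
};

/* Account one interrupt on @cpu, keeping both counters as the model assumes. */
void model_account(struct model_irq_desc *desc, unsigned int cpu)
{
	assert(desc && cpu < MODEL_NR_CPUS);
	desc->cnt[cpu]++;
	if (!desc->per_cpu && !desc->nmi)
		desc->tot_count++;
}

/* Return the total count, skipping the summation loop when tot_count is valid. */
unsigned int model_kstat_irq_desc(const struct model_irq_desc *desc)
{
	unsigned int sum = 0;
	unsigned int cpu;

	assert(desc);
	if (!desc->per_cpu && !desc->nmi)
		return desc->tot_count;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += desc->cnt[cpu];
	return sum;
}
```

In this model the fast path and the loop agree for ordinary interrupts, so
the loop is only needed for per-CPU and NMI descriptors.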
  
Bitao Hu Feb. 27, 2024, 11:20 a.m. UTC | #3
Hi,

On 2024/2/27 17:26, Thomas Gleixner wrote:
> On Mon, Feb 26 2024 at 10:09, Bitao Hu wrote:
>> We could use the irq_desc::tot_count member to avoid the summation
>> loop for interrupts which are not marked as 'PER_CPU' interrupts in
>> 'show_interrupts'. This could reduce the time overhead of reading
>> /proc/interrupts.
> 
> "Could" is not really a technical term. Either we do or we do not. Also
> please provide context for your change and avoid the 'We'.
OK.
> 
>> --- a/include/linux/irqdesc.h
>> +++ b/include/linux/irqdesc.h
>> @@ -121,6 +121,8 @@ static inline void irq_unlock_sparse(void) { }
>>   extern struct irq_desc irq_desc[NR_IRQS];
>>   #endif
>>
>> +extern bool irq_is_nmi(struct irq_desc *desc);
>> +
> 
> If at all this wants to be in kernel/irq/internal.h. There is zero
> reason to expose this globally.
> 
>> -static bool irq_is_nmi(struct irq_desc *desc)
>> +bool irq_is_nmi(struct irq_desc *desc)
>>   {
>>   	return desc->istate & IRQS_NMI;
>>   }
> 
> If at all this really wants to be a static inline in internal.h, but
> instead of blindly copying code this can be done smarter:
> 
> unsigned int kstat_irq_desc(struct irq_desc *desc)
> {
> 	unsigned int sum = 0;
> 	int cpu;
> 
> 	if (!irq_settings_is_per_cpu_devid(desc) &&
> 	    !irq_settings_is_per_cpu(desc) &&
> 	    !irq_is_nmi(desc))
> 		return data_race(desc->tot_count);
> 
> 	for_each_possible_cpu(cpu)
> 		sum += data_race(*per_cpu_ptr(desc->kstat_irqs, cpu));
> 	return sum;
> }
> 
> and then let kstat_irqs() and show_interrupts() use it. See?

I have a concern. kstat_irqs() uses for_each_possible_cpu() for
summation. However, show_interrupts() uses for_each_online_cpu(),
which means it only outputs interrupt statistics for online cpus.
If we use for_each_possible_cpu() in show_interrupts() to calculate
'any_count', there could be a problem with the following scenario:
If an interrupt has a count of zero on online cpus but a non-zero
count on possible cpus, then 'any_count' would not be zero, and the
statistics for that interrupt would be output, which is not the
desired behavior for show_interrupts(). Therefore, I think it's not
good to have kstat_irqs() and show_interrupts() both use the same
logic. What do you think?

> 
> With that a proper changelog would be:
> 
>     show_interrupts() unconditionally accumulates the per CPU interrupt
>     statistics to determine whether an interrupt was ever raised.
> 
>     This can be avoided for all interrupts which are not strictly per CPU
>     and not of type NMI because those interrupts provide already an
>     accumulated counter. The required logic is already implemented in
>     kstat_irqs().
> 
>     Split the inner access logic out of kstat_irqs() and use it for
>     kstat_irqs() and show_interrupts() to avoid the accumulation loop
>     when possible.
> 

Best Regards,
	Bitao Hu
  
Thomas Gleixner Feb. 27, 2024, 3:39 p.m. UTC | #4
On Tue, Feb 27 2024 at 19:20, Bitao Hu wrote:
> On 2024/2/27 17:26, Thomas Gleixner wrote:
>> 
>> and then let kstat_irqs() and show_interrupts() use it. See?
>
> I have a concern. kstat_irqs() uses for_each_possible_cpu() for
> summation. However, show_interrupts() uses for_each_online_cpu(),
> which means it only outputs interrupt statistics for online cpus.
> If we use for_each_possible_cpu() in show_interrupts() to calculate
> 'any_count', there could be a problem with the following scenario:
> If an interrupt has a count of zero on online cpus but a non-zero
> count on possible cpus, then 'any_count' would not be zero, and the
> statistics for that interrupt would be output, which is not the
> desired behavior for show_interrupts(). Therefore, I think it's not
> good to have kstat_irqs() and show_interrupts() both use the same
> logic. What do you think?

Good point. But you simply can have

unsigned int kstat_irq_desc(struct irq_desc *desc, const struct cpumask *mask)

and hand in the appropriate cpumask, which still shares the code, no?

Thanks,

        tglx
  
Bitao Hu Feb. 28, 2024, 6:07 a.m. UTC | #5
On 2024/2/27 23:39, Thomas Gleixner wrote:
> On Tue, Feb 27 2024 at 19:20, Bitao Hu wrote:
>> On 2024/2/27 17:26, Thomas Gleixner wrote:
>>>
>>> and then let kstat_irqs() and show_interrupts() use it. See?
>>
>> I have a concern. kstat_irqs() uses for_each_possible_cpu() for
>> summation. However, show_interrupts() uses for_each_online_cpu(),
>> which means it only outputs interrupt statistics for online cpus.
>> If we use for_each_possible_cpu() in show_interrupts() to calculate
>> 'any_count', there could be a problem with the following scenario:
>> If an interrupt has a count of zero on online cpus but a non-zero
>> count on possible cpus, then 'any_count' would not be zero, and the
>> statistics for that interrupt would be output, which is not the
>> desired behavior for show_interrupts(). Therefore, I think it's not
>> good to have kstat_irqs() and show_interrupts() both use the same
>> logic. What do you think?
> 
> Good point. But you simply can have
> 
> unsigned int kstat_irq_desc(struct irq_desc *desc, const struct cpumask *mask)
> 
> and hand in the appropriate cpumask, which still shares the code, no?
> 
Alright, that is a good approach.
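
The cpumask variant agreed on above can be sketched the same way in
userspace C. This is only a model under stated assumptions: the names
mask_model_desc and mask_model_kstat_irq_desc() are hypothetical, and a
plain unsigned long bitmask stands in for "const struct cpumask *". It
shows the scenario raised in the thread: a per-CPU interrupt that only
fired on an offline CPU sums to zero over the online mask but is non-zero
over the possible mask.

```c
#include <assert.h>
#include <stdbool.h>

#define MASK_MODEL_NR_CPUS 4

/* Hypothetical stand-in for struct irq_desc; not the kernel layout. */
struct mask_model_desc {
	unsigned int tot_count;               /* accumulated count for the fast path */
	unsigned int cnt[MASK_MODEL_NR_CPUS]; /* per-CPU counts */
	bool per_cpu_or_nmi;                  /* models the three checks in the patch */
};

/*
 * Sum the counts of the CPUs selected by @mask (bit N set means CPU N is
 * included). Descriptors that are neither per-CPU nor NMI take the fast
 * path and return tot_count without looking at @mask.
 */
unsigned int mask_model_kstat_irq_desc(const struct mask_model_desc *desc,
				       unsigned long mask)
{
	unsigned int sum = 0;
	unsigned int cpu;

	assert(desc);
	if (!desc->per_cpu_or_nmi)
		return desc->tot_count;

	for (cpu = 0; cpu < MASK_MODEL_NR_CPUS; cpu++)
		if (mask & (1UL << cpu))
			sum += desc->cnt[cpu];
	return sum;
}
```

A caller modelling show_interrupts() would pass the online mask, while one
modelling kstat_irqs() would pass the possible mask, so both share the same
helper.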
  

Patch

diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h
index 2912b1998670..1ee96d7232b4 100644
--- a/include/linux/irqdesc.h
+++ b/include/linux/irqdesc.h
@@ -121,6 +121,8 @@  static inline void irq_unlock_sparse(void) { }
 extern struct irq_desc irq_desc[NR_IRQS];
 #endif
 
+extern bool irq_is_nmi(struct irq_desc *desc);
+
 static inline unsigned int irq_desc_kstat_cpu(struct irq_desc *desc,
 					      unsigned int cpu)
 {
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 9cd17080b2d8..56a767957a9d 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -955,7 +955,7 @@  unsigned int kstat_irqs_cpu(unsigned int irq, int cpu)
 	return desc && desc->kstat_irqs ? per_cpu(desc->kstat_irqs->cnt, cpu) : 0;
 }
 
-static bool irq_is_nmi(struct irq_desc *desc)
+bool irq_is_nmi(struct irq_desc *desc)
 {
 	return desc->istate & IRQS_NMI;
 }
diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
index 6954e0a02047..b3b1b93f0410 100644
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -489,8 +489,13 @@  int show_interrupts(struct seq_file *p, void *v)
 		goto outsparse;
 
 	if (desc->kstat_irqs) {
-		for_each_online_cpu(j)
-			any_count |= data_race(per_cpu(desc->kstat_irqs->cnt, j));
+		if (!irq_settings_is_per_cpu_devid(desc) &&
+		    !irq_settings_is_per_cpu(desc) &&
+		    !irq_is_nmi(desc))
+			any_count = data_race(desc->tot_count);
+		else
+			for_each_online_cpu(j)
+				any_count |= data_race(per_cpu(desc->kstat_irqs->cnt, j));
 	}
 
 	if ((!desc->action || irq_desc_is_chained(desc)) && !any_count)