[PATCHv9,2/3] irq: use a struct for the kstat_irqs in the interrupt descriptor
Commit Message
The current implementation uses an int for the kstat_irqs in the
interrupt descriptor.
However, we need to know the number of interrupts which happened
since softlockup detection took a snapshot in order to analyze
the problem caused by an interrupt storm.
Replacing an int with a struct and providing sensible interfaces
for the watchdog code can keep it self contained to the interrupt
core code.
Signed-off-by: Bitao Hu <yaoma@linux.alibaba.com>
---
arch/mips/dec/setup.c | 2 +-
arch/parisc/kernel/smp.c | 2 +-
arch/powerpc/kvm/book3s_hv_rm_xics.c | 2 +-
include/linux/irqdesc.h | 9 ++++++--
include/linux/kernel_stat.h | 3 +++
kernel/irq/internals.h | 2 +-
kernel/irq/irqdesc.c | 34 ++++++++++++++++++++++------
kernel/irq/proc.c | 9 +++-----
scripts/gdb/linux/interrupts.py | 6 ++---
9 files changed, 47 insertions(+), 22 deletions(-)
Comments
On Thu, Feb 22 2024 at 17:34, Bitao Hu wrote:
First of all the subsystem prefix is 'genirq:'. 'git log kernel/irq/'
gives you a pretty good hint. It's documented....
Secondly the subject line does not match what this patch is about. It's
not about using a struct, it's about providing a snapshot mechanism, no?
> The current implementation uses an int for the kstat_irqs in the
> interrupt descriptor.
>
> However, we need to know the number of interrupts which happened
> since softlockup detection took a snapshot in order to analyze
> the problem caused by an interrupt storm.
>
> Replacing an int with a struct and providing sensible interfaces
> for the watchdog code can keep it self contained to the interrupt
> core code.
So something like this makes a useful change log for this:
Subject: genirq: Provide a snapshot mechanism for interrupt statistics
The soft lockup detector lacks a mechanism to identify interrupt storms
as root cause of a lockup. To enable this the detector needs a
mechanism to snapshot the interrupt count statistics on a CPU when the
detector observes a potential lockup scenario and compare that against
the interrupt count when it warns about the lockup later on. The number
of interrupts in that period give a hint whether the lockup might be
caused by an interrupt storm.
Instead of having extra storage in the lockup detector and accessing
the internals of the interrupt descriptor directly, convert the per CPU
irq_desc::kstat_irq member to a data structure which contains the
counter plus a snapshot member and provide interfaces to take a
snapshot of all interrupts on the current CPU and to retrieve the delta
of a specific interrupt later on.
Hmm?
> Signed-off-by: Bitao Hu <yaoma@linux.alibaba.com>
Interesting. You fully authored the patch?
That's not how it works. You cannot take work from others and claim that
it is yours. The minimal courtesy is to add a 'Originally-by:' tag.
> diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
> index 623b8136e9af..3ad40cf30c66 100644
> --- a/kernel/irq/proc.c
> +++ b/kernel/irq/proc.c
> @@ -488,18 +488,15 @@ int show_interrupts(struct seq_file *p, void *v)
> if (!desc || irq_settings_is_hidden(desc))
> goto outsparse;
>
> - if (desc->kstat_irqs) {
> - for_each_online_cpu(j)
> - any_count |= data_race(*per_cpu_ptr(desc->kstat_irqs, j));
> - }
> + if (desc->kstat_irqs)
> + any_count = data_race(desc->tot_count);
This is an unrelated change and needs to be split out into a separate
patch with a proper changelog which explains why this is equivalent.
Thanks,
tglx
Hi,
On 2024/2/22 21:22, Thomas Gleixner wrote:
> On Thu, Feb 22 2024 at 17:34, Bitao Hu wrote:
>
> First of all the subsystem prefix is 'genirq:'. 'git log kernel/irq/'
> gives you a pretty good hint. It's documented....
>
> Secondly the subject line does not match what this patch is about. It's
> not about using a struct, it's about providing a snapshot mechanism, no?
>
>> The current implementation uses an int for the kstat_irqs in the
>> interrupt descriptor.
>>
>> However, we need to know the number of interrupts which happened
>> since softlockup detection took a snapshot in order to analyze
>> the problem caused by an interrupt storm.
>>
>> Replacing an int with a struct and providing sensible interfaces
>> for the watchdog code can keep it self contained to the interrupt
>> core code.
>
> So something like this makes a useful change log for this:
>
> Subject: genirq: Provide a snapshot mechanism for interrupt statistics
>
> The soft lockup detector lacks a mechanism to identify interrupt storms
> as root cause of a lockup. To enable this the detector needs a
> mechanism to snapshot the interrupt count statistics on a CPU when the
> detector observes a potential lockup scenario and compare that against
> the interrupt count when it warns about the lockup later on. The number
> of interrupts in that period give a hint whether the lockup might be
> caused by an interrupt storm.
>
> Instead of having extra storage in the lockup detector and accessing
> the internals of the interrupt descriptor directly, convert the per CPU
> irq_desc::kstat_irq member to a data structure which contains the
> counter plus a snapshot member and provide interfaces to take a
> snapshot of all interrupts on the current CPU and to retrieve the delta
> of a specific interrupt later on.
>
Thanks, the changelog you wrote very clearly articulates the purpose of
this patch.
> Hmm?
>
>> Signed-off-by: Bitao Hu <yaoma@linux.alibaba.com>
>
> Interesting. You fully authored the patch?
>
> That's not how it works. You cannot take work from others and claim that
> it is yours. The minimal courtesy is to add a 'Originally-by:' tag.
>
I'm very sorry, the majority of this patch is your work, I will add an
'Originally-by:' tag.
>> diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
>> index 623b8136e9af..3ad40cf30c66 100644
>> --- a/kernel/irq/proc.c
>> +++ b/kernel/irq/proc.c
>> @@ -488,18 +488,15 @@ int show_interrupts(struct seq_file *p, void *v)
>> if (!desc || irq_settings_is_hidden(desc))
>> goto outsparse;
>>
>> - if (desc->kstat_irqs) {
>> - for_each_online_cpu(j)
>> - any_count |= data_race(*per_cpu_ptr(desc->kstat_irqs, j));
>> - }
>> + if (desc->kstat_irqs)
>> + any_count = data_race(desc->tot_count);
>
> This is an unrelated change and needs to be split out into a separate
> patch with a proper changelog which explains why this is equivalent.
>
Alright, I will remove this change witch is not related to the purpose
of this patch.
I guess that the purpose of suggesting this change in your V1 response
was to speedup the 'show_interrupts'. However, after reviewing the
usage of 'desc->tot_count' in 'unsigned int kstat_irqs(unsigned int
irq)', I think the change might be as follows:
diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
index 623b8136e9af..53b8d6edd7ac 100644
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -489,8 +489,13 @@ int show_interrupts(struct seq_file *p, void *v)
goto outsparse;
if (desc->kstat_irqs) {
- for_each_online_cpu(j)
- any_count |=
data_race(per_cpu(desc->kstat_irqs->cnt, j));
+ if (!irq_settings_is_per_cpu_devid(desc) &&
+ !irq_settings_is_per_cpu(desc) &&
+ !irq_is_nmi(desc))
+ any_count = data_race(desc->tot_count);
+ else
+ for_each_online_cpu(j)
+ any_count |=
data_race(per_cpu(desc->kstat_irqs->cnt, j));
}
if ((!desc->action || irq_desc_is_chained(desc)) && !any_count)
Is my idea correct?
Best Regards,
Bitao
On Fri, Feb 23 2024 at 15:18, Bitao Hu wrote:
> On 2024/2/22 21:22, Thomas Gleixner wrote:
>>> - if (desc->kstat_irqs) {
>>> - for_each_online_cpu(j)
>>> - any_count |= data_race(*per_cpu_ptr(desc->kstat_irqs, j));
>>> - }
>>> + if (desc->kstat_irqs)
>>> + any_count = data_race(desc->tot_count);
>>
>> This is an unrelated change and needs to be split out into a separate
>> patch with a proper changelog which explains why this is equivalent.
>>
>
> Alright, I will remove this change witch is not related to the purpose
> of this patch.
>
> I guess that the purpose of suggesting this change in your V1 response
> was to speedup the 'show_interrupts'. However, after reviewing the
> usage of 'desc->tot_count' in 'unsigned int kstat_irqs(unsigned int
> irq)', I think the change might be as follows:
>
> diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
> index 623b8136e9af..53b8d6edd7ac 100644
> --- a/kernel/irq/proc.c
> +++ b/kernel/irq/proc.c
> @@ -489,8 +489,13 @@ int show_interrupts(struct seq_file *p, void *v)
> goto outsparse;
>
> if (desc->kstat_irqs) {
> - for_each_online_cpu(j)
> - any_count |=
> data_race(per_cpu(desc->kstat_irqs->cnt, j));
> + if (!irq_settings_is_per_cpu_devid(desc) &&
> + !irq_settings_is_per_cpu(desc) &&
> + !irq_is_nmi(desc))
> + any_count = data_race(desc->tot_count);
> + else
> + for_each_online_cpu(j)
> + any_count |=
> data_race(per_cpu(desc->kstat_irqs->cnt, j));
> }
>
> if ((!desc->action || irq_desc_is_chained(desc)) && !any_count)
>
> Is my idea correct?
Yes.
@@ -756,7 +756,7 @@ void __init arch_init_irq(void)
NULL))
pr_err("Failed to register fpu interrupt\n");
desc_fpu = irq_to_desc(irq_fpu);
- fpu_kstat_irq = this_cpu_ptr(desc_fpu->kstat_irqs);
+ fpu_kstat_irq = this_cpu_ptr(&desc_fpu->kstat_irqs->cnt);
}
if (dec_interrupt[DEC_IRQ_CASCADE] >= 0) {
if (request_irq(dec_interrupt[DEC_IRQ_CASCADE], no_action,
@@ -344,7 +344,7 @@ static int smp_boot_one_cpu(int cpuid, struct task_struct *idle)
struct irq_desc *desc = irq_to_desc(i);
if (desc && desc->kstat_irqs)
- *per_cpu_ptr(desc->kstat_irqs, cpuid) = 0;
+ *per_cpu_ptr(desc->kstat_irqs, cpuid) = (struct irqstat) { };
}
#endif
@@ -837,7 +837,7 @@ static inline void this_cpu_inc_rm(unsigned int __percpu *addr)
*/
static void kvmppc_rm_handle_irq_desc(struct irq_desc *desc)
{
- this_cpu_inc_rm(desc->kstat_irqs);
+ this_cpu_inc_rm(&desc->kstat_irqs->cnt);
__this_cpu_inc(kstat.irqs_sum);
}
@@ -17,6 +17,11 @@ struct irq_desc;
struct irq_domain;
struct pt_regs;
+struct irqstat {
+ unsigned int cnt;
+ unsigned int ref;
+};
+
/**
* struct irq_desc - interrupt descriptor
* @irq_common_data: per irq and chip data passed down to chip functions
@@ -55,7 +60,7 @@ struct pt_regs;
struct irq_desc {
struct irq_common_data irq_common_data;
struct irq_data irq_data;
- unsigned int __percpu *kstat_irqs;
+ struct irqstat __percpu *kstat_irqs;
irq_flow_handler_t handle_irq;
struct irqaction *action; /* IRQ action list */
unsigned int status_use_accessors;
@@ -119,7 +124,7 @@ extern struct irq_desc irq_desc[NR_IRQS];
static inline unsigned int irq_desc_kstat_cpu(struct irq_desc *desc,
unsigned int cpu)
{
- return desc->kstat_irqs ? *per_cpu_ptr(desc->kstat_irqs, cpu) : 0;
+ return desc->kstat_irqs ? per_cpu(desc->kstat_irqs->cnt, cpu) : 0;
}
static inline struct irq_desc *irq_data_to_desc(struct irq_data *data)
@@ -79,6 +79,9 @@ static inline unsigned int kstat_cpu_softirqs_sum(int cpu)
return sum;
}
+extern void kstat_snapshot_irqs(void);
+extern unsigned int kstat_get_irq_since_snapshot(unsigned int irq);
+
/*
* Number of interrupts per specific IRQ source, since bootup
*/
@@ -258,7 +258,7 @@ static inline void irq_state_set_masked(struct irq_desc *desc)
static inline void __kstat_incr_irqs_this_cpu(struct irq_desc *desc)
{
- __this_cpu_inc(*desc->kstat_irqs);
+ __this_cpu_inc(desc->kstat_irqs->cnt);
__this_cpu_inc(kstat.irqs_sum);
}
@@ -122,7 +122,7 @@ static void desc_set_defaults(unsigned int irq, struct irq_desc *desc, int node,
desc->name = NULL;
desc->owner = owner;
for_each_possible_cpu(cpu)
- *per_cpu_ptr(desc->kstat_irqs, cpu) = 0;
+ *per_cpu_ptr(desc->kstat_irqs, cpu) = (struct irqstat) { };
desc_smp_init(desc, node, affinity);
}
@@ -418,8 +418,8 @@ static struct irq_desc *alloc_desc(int irq, int node, unsigned int flags,
desc = kzalloc_node(sizeof(*desc), GFP_KERNEL, node);
if (!desc)
return NULL;
- /* allocate based on nr_cpu_ids */
- desc->kstat_irqs = alloc_percpu(unsigned int);
+
+ desc->kstat_irqs = alloc_percpu(struct irqstat);
if (!desc->kstat_irqs)
goto err_desc;
@@ -593,7 +593,7 @@ int __init early_irq_init(void)
count = ARRAY_SIZE(irq_desc);
for (i = 0; i < count; i++) {
- desc[i].kstat_irqs = alloc_percpu(unsigned int);
+ desc[i].kstat_irqs = alloc_percpu(struct irqstat);
alloc_masks(&desc[i], node);
raw_spin_lock_init(&desc[i].lock);
lockdep_set_class(&desc[i].lock, &irq_desc_lock_class);
@@ -952,8 +952,7 @@ unsigned int kstat_irqs_cpu(unsigned int irq, int cpu)
{
struct irq_desc *desc = irq_to_desc(irq);
- return desc && desc->kstat_irqs ?
- *per_cpu_ptr(desc->kstat_irqs, cpu) : 0;
+ return desc && desc->kstat_irqs ? per_cpu(desc->kstat_irqs->cnt, cpu) : 0;
}
static bool irq_is_nmi(struct irq_desc *desc)
@@ -975,10 +974,31 @@ static unsigned int kstat_irqs(unsigned int irq)
return data_race(desc->tot_count);
for_each_possible_cpu(cpu)
- sum += data_race(*per_cpu_ptr(desc->kstat_irqs, cpu));
+ sum += data_race(per_cpu(desc->kstat_irqs->cnt, cpu));
return sum;
}
+void kstat_snapshot_irqs(void)
+{
+ struct irq_desc *desc;
+ unsigned int irq;
+
+ for_each_irq_desc(irq, desc) {
+ if (!desc->kstat_irqs)
+ continue;
+ this_cpu_write(desc->kstat_irqs->ref, this_cpu_read(desc->kstat_irqs->cnt));
+ }
+}
+
+unsigned int kstat_get_irq_since_snapshot(unsigned int irq)
+{
+ struct irq_desc *desc = irq_to_desc(irq);
+
+ if (!desc || !desc->kstat_irqs)
+ return 0;
+ return this_cpu_read(desc->kstat_irqs->cnt) - this_cpu_read(desc->kstat_irqs->ref);
+}
+
/**
* kstat_irqs_usr - Get the statistics for an interrupt from thread context
* @irq: The interrupt number
@@ -488,18 +488,15 @@ int show_interrupts(struct seq_file *p, void *v)
if (!desc || irq_settings_is_hidden(desc))
goto outsparse;
- if (desc->kstat_irqs) {
- for_each_online_cpu(j)
- any_count |= data_race(*per_cpu_ptr(desc->kstat_irqs, j));
- }
+ if (desc->kstat_irqs)
+ any_count = data_race(desc->tot_count);
if ((!desc->action || irq_desc_is_chained(desc)) && !any_count)
goto outsparse;
seq_printf(p, "%*d: ", prec, i);
for_each_online_cpu(j)
- seq_printf(p, "%10u ", desc->kstat_irqs ?
- *per_cpu_ptr(desc->kstat_irqs, j) : 0);
+ seq_printf(p, "%10u ", desc->kstat_irqs ? per_cpu(desc->kstat_irqs->cnt, j) : 0);
raw_spin_lock_irqsave(&desc->lock, flags);
if (desc->irq_data.chip) {
@@ -37,7 +37,7 @@ def show_irq_desc(prec, irq):
any_count = 0
if desc['kstat_irqs']:
for cpu in cpus.each_online_cpu():
- any_count += cpus.per_cpu(desc['kstat_irqs'], cpu)
+ any_count += cpus.per_cpu(desc['kstat_irqs'], cpu)['cnt']
if (desc['action'] == 0 or irq_desc_is_chained(desc)) and any_count == 0:
return text;
@@ -45,7 +45,7 @@ def show_irq_desc(prec, irq):
text += "%*d: " % (prec, irq)
for cpu in cpus.each_online_cpu():
if desc['kstat_irqs']:
- count = cpus.per_cpu(desc['kstat_irqs'], cpu)
+ count = cpus.per_cpu(desc['kstat_irqs'], cpu)['cnt']
else:
count = 0
text += "%10u" % (count)
@@ -177,7 +177,7 @@ def arm_common_show_interrupts(prec):
if desc == 0:
continue
for cpu in cpus.each_online_cpu():
- text += "%10u" % (cpus.per_cpu(desc['kstat_irqs'], cpu))
+ text += "%10u" % (cpus.per_cpu(desc['kstat_irqs'], cpu)['cnt'])
text += " %s" % (ipi_types[ipi].string())
text += "\n"
return text