[printk,v2,26/26] lockdep: Mark emergency section in lockdep splats

Message ID 20240218185726.1994771-27-john.ogness@linutronix.de
State New
Headers
Series wire up write_atomic() printing |

Commit Message

John Ogness Feb. 18, 2024, 6:57 p.m. UTC
  Mark an emergency section within print_usage_bug(), where
lockdep bugs are printed. In this section, the CPU will not
perform console output for the printk() calls. Instead, a
flushing of the console output is triggered when exiting
the emergency section.

Signed-off-by: John Ogness <john.ogness@linutronix.de>
---
 kernel/locking/lockdep.c | 5 +++++
 1 file changed, 5 insertions(+)
  

Comments

Waiman Long Feb. 19, 2024, 4:14 a.m. UTC | #1
On 2/18/24 13:57, John Ogness wrote:
> Mark an emergency section within print_usage_bug(), where
> lockdep bugs are printed. In this section, the CPU will not
> perform console output for the printk() calls. Instead, a
> flushing of the console output is triggered when exiting
> the emergency section.
>
> Signed-off-by: John Ogness <john.ogness@linutronix.de>
> ---
>   kernel/locking/lockdep.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index e85b5ad3e206..00465373d358 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -56,6 +56,7 @@
>   #include <linux/kprobes.h>
>   #include <linux/lockdep.h>
>   #include <linux/context_tracking.h>
> +#include <linux/console.h>
>   
>   #include <asm/sections.h>
>   
> @@ -3970,6 +3971,8 @@ print_usage_bug(struct task_struct *curr, struct held_lock *this,
>   	if (!debug_locks_off() || debug_locks_silent)
>   		return;
>   
> +	nbcon_cpu_emergency_enter();
> +
>   	pr_warn("\n");
>   	pr_warn("================================\n");
>   	pr_warn("WARNING: inconsistent lock state\n");
> @@ -3998,6 +4001,8 @@ print_usage_bug(struct task_struct *curr, struct held_lock *this,
>   
>   	pr_warn("\nstack backtrace:\n");
>   	dump_stack();
> +
> +	nbcon_cpu_emergency_exit();
>   }
>   
>   /*

lockdep.c has multiple functions that print stuff to the console, like

  - print_circular_bug_header()
  - print_bad_irq_dependency()
  - print_deadlock_bug()
  - print_collision()
  - print_usage_bug()
  - print_irq_inversion_bug()
  - print_lock_invalid_wait_context()
  - print_lock_nested_lock_not_held()
  - print_unlock_imbalance_bug()
  - print_lock_contention_bug()
  - print_freed_lock_bug()
  - print_held_locks_bug()
  - lockdep_rcu_suspicious()

So what is special about print_usage_bug() that it needs this emergency 
treatment but not the other ones?

Cheers,
Longman
  
John Ogness Feb. 19, 2024, 11:11 a.m. UTC | #2
On 2024-02-18, Waiman Long <longman@redhat.com> wrote:
> On 2/18/24 13:57, John Ogness wrote:
> lockdep.c has multiple functions that print stuff to the console, like
>
>   - print_circular_bug_header()
>   - print_bad_irq_dependency()
>   - print_deadlock_bug()
>   - print_collision()
>   - print_usage_bug()
>   - print_irq_inversion_bug()
>   - print_lock_invalid_wait_context()
>   - print_lock_nested_lock_not_held()
>   - print_unlock_imbalance_bug()
>   - print_lock_contention_bug()
>   - print_freed_lock_bug()
>   - print_held_locks_bug()
>   - lockdep_rcu_suspicious()
>
> So what is special about print_usage_bug() that it needs this
> emergency treatment but not the other ones?

I do not expect to be able to identify all "emergency printing" paths in
the kernel from the beginning. This series initially marks some sections
that are IMHO interesting for the feature.

As you are implying, for lockdep probably all printing should be
considered emergency. Is it preferred to place the markers outside the
high-level print functions, for example:

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 00465373d358..7a4e4f4a9156 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2182,10 +2182,12 @@ check_noncircular(struct held_lock *src, struct held_lock *target,
 			*trace = save_trace();
 		}
 
+		nbcon_cpu_emergency_enter();
 		if (src->class_idx == target->class_idx)
 			print_deadlock_bug(current, src, target);
 		else
 			print_circular_bug(&src_entry, target_entry, src, target);
+		nbcon_cpu_emergency_exit();
 	}
 
 	return ret;

Or is it preferred to put them directly around the various pr_warn()
blocks (as the patch in this series is doing)?

John Ogness
  
Waiman Long Feb. 19, 2024, 3:07 p.m. UTC | #3
On 2/19/24 06:11, John Ogness wrote:
> On 2024-02-18, Waiman Long <longman@redhat.com> wrote:
>> On 2/18/24 13:57, John Ogness wrote:
>> lockdep.c has multiple functions that print stuff to the console, like
>>
>>    - print_circular_bug_header()
>>    - print_bad_irq_dependency()
>>    - print_deadlock_bug()
>>    - print_collision()
>>    - print_usage_bug()
>>    - print_irq_inversion_bug()
>>    - print_lock_invalid_wait_context()
>>    - print_lock_nested_lock_not_held()
>>    - print_unlock_imbalance_bug()
>>    - print_lock_contention_bug()
>>    - print_freed_lock_bug()
>>    - print_held_locks_bug()
>>    - lockdep_rcu_suspicious()
>>
>> So what is special about print_usage_bug() that it needs this
>> emergency treatment but not the other ones?
> I do not expect to be able to identify all "emergency printing" paths in
> the kernel from the beginning. This series initially marks some sections
> that are IMHO interesting for the feature.
That is what I like to see in the changelog. I am aware that this patch 
is probably not complete, but you need to set the right expectation that 
similar changes will have to be done elsewhere in lockdep to complete 
the change. We can make the other necessary changes after this patch 
series have been merged. It also helps if you can document what 
undesirable thing may happen if printk() is called without setting the 
emergency mode.
>   
>
> As you are implying, for lockdep probably all printing should be
> considered emergency. Is it preferred to place the markers outside the
> high-level print functions, for example:
>
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 00465373d358..7a4e4f4a9156 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -2182,10 +2182,12 @@ check_noncircular(struct held_lock *src, struct held_lock *target,
>   			*trace = save_trace();
>   		}
>   
> +		nbcon_cpu_emergency_enter();
>   		if (src->class_idx == target->class_idx)
>   			print_deadlock_bug(current, src, target);
>   		else
>   			print_circular_bug(&src_entry, target_entry, src, target);
> +		nbcon_cpu_emergency_exit();
>   	}
>   
>   	return ret;
>
> Or is it preferred to put them directly around the various pr_warn()
> blocks (as the patch in this series is doing)?

There are pros and cons for both. It will depend on how expensive is the 
nbcon_cpu_emergency_{enter|exit}() call as printing won't happen if 
lockdep is turned off somehow. Since lockdep is for debugging and 
efficiency isn't that important, putting the emergency enter/exit 
markers outside the high level print functions will make it a bit easier 
to read.

My 2 cents.

Cheers,
Longman


>
> John Ogness
>
  
Petr Mladek March 1, 2024, 3:18 p.m. UTC | #4
On Sun 2024-02-18 20:03:26, John Ogness wrote:
> Mark an emergency section within print_usage_bug(), where
> lockdep bugs are printed. In this section, the CPU will not
> perform console output for the printk() calls. Instead, a
> flushing of the console output is triggered when exiting
> the emergency section.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>

The patch looks fine from my POV. Well, I expect that you will send
another version addressing Waiman's concerns.

Best Regards,
Petr
  

Patch

diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index e85b5ad3e206..00465373d358 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -56,6 +56,7 @@ 
 #include <linux/kprobes.h>
 #include <linux/lockdep.h>
 #include <linux/context_tracking.h>
+#include <linux/console.h>
 
 #include <asm/sections.h>
 
@@ -3970,6 +3971,8 @@  print_usage_bug(struct task_struct *curr, struct held_lock *this,
 	if (!debug_locks_off() || debug_locks_silent)
 		return;
 
+	nbcon_cpu_emergency_enter();
+
 	pr_warn("\n");
 	pr_warn("================================\n");
 	pr_warn("WARNING: inconsistent lock state\n");
@@ -3998,6 +4001,8 @@  print_usage_bug(struct task_struct *curr, struct held_lock *this,
 
 	pr_warn("\nstack backtrace:\n");
 	dump_stack();
+
+	nbcon_cpu_emergency_exit();
 }
 
 /*