[mm-unstable,hotfix] mm/zswap: fix zswap_pools_lock usages after changing to percpu_ref

Message ID 20240228151832.2431993-1-chengming.zhou@linux.dev
State New
Series [mm-unstable,hotfix] mm/zswap: fix zswap_pools_lock usages after changing to percpu_ref

Commit Message

Chengming Zhou Feb. 28, 2024, 3:18 p.m. UTC
  Now that the release of the zswap pool is controlled by percpu_ref, its
release callback (__zswap_pool_empty()) will be called when the percpu_ref
hits 0. But this release callback may be called from RCU callback context
via percpu_ref_kill(), which may be interrupt context.

So we need to use spin_lock_irqsave() and spin_unlock_irqrestore() in the
release callback __zswap_pool_empty(). In the other, task-context, callers,
spin_lock_irq() and spin_unlock_irq() are enough to avoid a potential
deadlock.

This problem was introduced by commit f3da427e82c4 ("mm/zswap: change
zswap_pool kref to percpu_ref"), which is in the mm-unstable branch now.
It can be reproduced by running a kernel build in tmpfs with zswap and
CONFIG_LOCKDEP enabled, while changing the zswap compressor setting
dynamically.

Signed-off-by: Chengming Zhou <chengming.zhou@linux.dev>
---
 mm/zswap.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)
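
As a quick reference before the full diff, here is a minimal sketch of the
locking pattern the fix applies, condensed from the patch below. The
struct zswap_pool stand-in is trimmed to the fields used here; the complete
function bodies (including the WARN_ON against the current pool) are in the
patch itself.

#include <linux/spinlock.h>
#include <linux/percpu-refcount.h>
#include <linux/workqueue.h>
#include <linux/rculist.h>

/* Trimmed-down stand-in for the real struct zswap_pool (sketch only). */
struct zswap_pool {
	struct percpu_ref ref;
	struct list_head list;
	struct work_struct release_work;
};

static DEFINE_SPINLOCK(zswap_pools_lock);
static void __zswap_pool_release(struct work_struct *work);

/*
 * percpu_ref release callback: percpu_ref_kill() may invoke this from
 * RCU callback context, so take the lock with interrupts disabled and
 * the previous interrupt state saved/restored.
 */
static void __zswap_pool_empty(struct percpu_ref *ref)
{
	struct zswap_pool *pool;
	unsigned long flags;

	pool = container_of(ref, typeof(*pool), ref);

	spin_lock_irqsave(&zswap_pools_lock, flags);
	list_del_rcu(&pool->list);
	INIT_WORK(&pool->release_work, __zswap_pool_release);
	schedule_work(&pool->release_work);
	spin_unlock_irqrestore(&zswap_pools_lock, flags);
}

/*
 * Task-context users such as __zswap_param_set() only need to disable
 * interrupts, not save/restore them:
 *
 *	spin_lock_irq(&zswap_pools_lock);
 *	...
 *	spin_unlock_irq(&zswap_pools_lock);
 */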
  

Comments

Matthew Wilcox Feb. 28, 2024, 3:24 p.m. UTC | #1
On Wed, Feb 28, 2024 at 03:18:32PM +0000, Chengming Zhou wrote:
> Now that the release of the zswap pool is controlled by percpu_ref, its
> release callback (__zswap_pool_empty()) will be called when the percpu_ref
> hits 0. But this release callback may be called from RCU callback context
> via percpu_ref_kill(), which may be interrupt context.
> 
> So we need to use spin_lock_irqsave() and spin_unlock_irqrestore() in the
> release callback __zswap_pool_empty(). In the other, task-context, callers,
> spin_lock_irq() and spin_unlock_irq() are enough to avoid a potential
> deadlock.

RCU callback context is BH, not IRQ, so it's enough to use
spin_lock_bh(), no?
  
Chengming Zhou Feb. 28, 2024, 3:37 p.m. UTC | #2
On 2024/2/28 23:24, Matthew Wilcox wrote:
> On Wed, Feb 28, 2024 at 03:18:32PM +0000, Chengming Zhou wrote:
>> Now that the release of the zswap pool is controlled by percpu_ref, its
>> release callback (__zswap_pool_empty()) will be called when the percpu_ref
>> hits 0. But this release callback may be called from RCU callback context
>> via percpu_ref_kill(), which may be interrupt context.
>>
>> So we need to use spin_lock_irqsave() and spin_unlock_irqrestore() in the
>> release callback __zswap_pool_empty(). In the other, task-context, callers,
>> spin_lock_irq() and spin_unlock_irq() are enough to avoid a potential
>> deadlock.
> 
> RCU callback context is BH, not IRQ, so it's enough to use
> spin_lock_bh(), no?

You're right, it's the softirq context, so spin_lock_bh() is enough.

Thanks!
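
For illustration, a minimal sketch of the BH-only variant discussed here.
This is not the patch as posted above; it simply swaps the irq-disabling
lock calls for the _bh ones, with names following the existing zswap code.

/*
 * Sketch of the spin_lock_bh() alternative: the release callback runs
 * from softirq (RCU callback) context at most, so disabling bottom
 * halves around zswap_pools_lock is sufficient.
 */
static void __zswap_pool_empty(struct percpu_ref *ref)
{
	struct zswap_pool *pool;

	pool = container_of(ref, typeof(*pool), ref);

	spin_lock_bh(&zswap_pools_lock);

	WARN_ON(pool == zswap_pool_current());

	list_del_rcu(&pool->list);

	INIT_WORK(&pool->release_work, __zswap_pool_release);
	schedule_work(&pool->release_work);

	spin_unlock_bh(&zswap_pools_lock);
}

/*
 * The task-context callers in __zswap_param_set() would then pair with
 * spin_lock_bh(&zswap_pools_lock) / spin_unlock_bh(&zswap_pools_lock)
 * instead of the _irq variants.
 */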
  

Patch

diff --git a/mm/zswap.c b/mm/zswap.c
index 011e068eb355..894bd184f78e 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -456,10 +456,11 @@  static struct zswap_pool *zswap_pool_current(void);
 static void __zswap_pool_empty(struct percpu_ref *ref)
 {
 	struct zswap_pool *pool;
+	unsigned long flags;
 
 	pool = container_of(ref, typeof(*pool), ref);
 
-	spin_lock(&zswap_pools_lock);
+	spin_lock_irqsave(&zswap_pools_lock, flags);
 
 	WARN_ON(pool == zswap_pool_current());
 
@@ -468,7 +469,7 @@  static void __zswap_pool_empty(struct percpu_ref *ref)
 	INIT_WORK(&pool->release_work, __zswap_pool_release);
 	schedule_work(&pool->release_work);
 
-	spin_unlock(&zswap_pools_lock);
+	spin_unlock_irqrestore(&zswap_pools_lock, flags);
 }
 
 static int __must_check zswap_pool_get(struct zswap_pool *pool)
@@ -598,7 +599,7 @@  static int __zswap_param_set(const char *val, const struct kernel_param *kp,
 		return -EINVAL;
 	}
 
-	spin_lock(&zswap_pools_lock);
+	spin_lock_irq(&zswap_pools_lock);
 
 	pool = zswap_pool_find_get(type, compressor);
 	if (pool) {
@@ -607,7 +608,7 @@  static int __zswap_param_set(const char *val, const struct kernel_param *kp,
 		list_del_rcu(&pool->list);
 	}
 
-	spin_unlock(&zswap_pools_lock);
+	spin_unlock_irq(&zswap_pools_lock);
 
 	if (!pool)
 		pool = zswap_pool_create(type, compressor);
@@ -628,7 +629,7 @@  static int __zswap_param_set(const char *val, const struct kernel_param *kp,
 	else
 		ret = -EINVAL;
 
-	spin_lock(&zswap_pools_lock);
+	spin_lock_irq(&zswap_pools_lock);
 
 	if (!ret) {
 		put_pool = zswap_pool_current();
@@ -643,7 +644,7 @@  static int __zswap_param_set(const char *val, const struct kernel_param *kp,
 		put_pool = pool;
 	}
 
-	spin_unlock(&zswap_pools_lock);
+	spin_unlock_irq(&zswap_pools_lock);
 
 	if (!zswap_has_pool && !pool) {
 		/* if initial pool creation failed, and this pool creation also