rcu/kvfree: Make page cache growing happen on the correct krcp

Message ID 20230408142530.800612-1-qiang1.zhang@intel.com
State New
Series rcu/kvfree: Make page cache growing happen on the correct krcp

Commit Message

Zqiang April 8, 2023, 2:25 p.m. UTC
When add_ptr_to_bulk_krc_lock() is invoked to queue a pointer, it calls
krc_this_cpu_lock() to obtain and lock the current CPU's krcp structure,
then fetches a bnode object from that structure's ->bulk_head. If no
bnode is returned, or if the returned bnode's nr_records equals
KVFREE_BULK_MAX_ENTR, and can_alloc is set, the function drops the
current CPU's krcp->lock and allocates a new bnode. It then invokes
krc_this_cpu_lock() again to obtain the current CPU's krcp structure.
If the task migrated to another CPU in the meantime, the krcp obtained
this time differs from the previous one, so the bnode is added to the
wrong krcp structure's ->bulk_head, or the page-fill work is triggered
on the wrong krcp.

This commit therefore re-acquires krcp->lock after allocating the page,
instead of calling krc_this_cpu_lock() again, to keep krcp consistent.

Signed-off-by: Zqiang <qiang1.zhang@intel.com>
---
 kernel/rcu/tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Paul E. McKenney April 11, 2023, 12:06 a.m. UTC | #1
On Sat, Apr 08, 2023 at 10:25:30PM +0800, Zqiang wrote:
> When add_ptr_to_bulk_krc_lock() is invoked to queue a pointer, it calls
> krc_this_cpu_lock() to obtain and lock the current CPU's krcp structure,
> then fetches a bnode object from that structure's ->bulk_head. If no
> bnode is returned, or if the returned bnode's nr_records equals
> KVFREE_BULK_MAX_ENTR, and can_alloc is set, the function drops the
> current CPU's krcp->lock and allocates a new bnode. It then invokes
> krc_this_cpu_lock() again to obtain the current CPU's krcp structure.
> If the task migrated to another CPU in the meantime, the krcp obtained
> this time differs from the previous one, so the bnode is added to the
> wrong krcp structure's ->bulk_head, or the page-fill work is triggered
> on the wrong krcp.
> 
> This commit therefore re-acquires krcp->lock after allocating the page,
> instead of calling krc_this_cpu_lock() again, to keep krcp consistent.
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>

Very good, thank you!  Queued for testing and further review, but
please check my wordsmithing.

							Thanx, Paul

------------------------------------------------------------------------

commit a0bbb5785539ed846f4769368f24a296d54bc801
Author: Zqiang <qiang1.zhang@intel.com>
Date:   Sat Apr 8 22:25:30 2023 +0800

    rcu/kvfree: Use consistent krcp when growing kfree_rcu() page cache
    
    The add_ptr_to_bulk_krc_lock() function is invoked to allocate a new
    kfree_rcu() page, also known as a kvfree_rcu_bulk_data structure.
    The kfree_rcu_cpu structure's lock is used to protect this operation,
    except that this lock must be momentarily dropped when allocating memory.
    It is clearly important that the lock that is reacquired be the same
    lock that was acquired initially via krc_this_cpu_lock().
    
    Unfortunately, this same krc_this_cpu_lock() function is used to
    re-acquire this lock, and if the task migrated to some other CPU during
    the memory allocation, this will result in the kvfree_rcu_bulk_data
    structure being added to the wrong CPU's kfree_rcu_cpu structure.
    
    This commit therefore replaces that second call to krc_this_cpu_lock()
    with raw_spin_lock_irqsave() in order to explicitly acquire the lock on
    the correct kfree_rcu_cpu structure, thus keeping things straight even
    when the task migrates.
    
    Signed-off-by: Zqiang <qiang1.zhang@intel.com>
    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2699b7acf0e3..41daae3239b5 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3301,7 +3301,7 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
 			// scenarios.
 			bnode = (struct kvfree_rcu_bulk_data *)
 				__get_free_page(GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
-			*krcp = krc_this_cpu_lock(flags);
+			raw_spin_lock_irqsave(&(*krcp)->lock, *flags);
 		}
 
 		if (!bnode)
  
Zqiang April 11, 2023, 4:08 a.m. UTC | #2
> When add_ptr_to_bulk_krc_lock() is invoked to queue a pointer, it calls
> krc_this_cpu_lock() to obtain and lock the current CPU's krcp structure,
> then fetches a bnode object from that structure's ->bulk_head. If no
> bnode is returned, or if the returned bnode's nr_records equals
> KVFREE_BULK_MAX_ENTR, and can_alloc is set, the function drops the
> current CPU's krcp->lock and allocates a new bnode. It then invokes
> krc_this_cpu_lock() again to obtain the current CPU's krcp structure.
> If the task migrated to another CPU in the meantime, the krcp obtained
> this time differs from the previous one, so the bnode is added to the
> wrong krcp structure's ->bulk_head, or the page-fill work is triggered
> on the wrong krcp.
> 
> This commit therefore re-acquires krcp->lock after allocating the page,
> instead of calling krc_this_cpu_lock() again, to keep krcp consistent.
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>
>
>Very good, thank you!  Queued for testing and further review, but
>please check my wordsmithing.


A clearer and more detailed description. Thanks, Paul 😊.


>
>							Thanx, Paul
>
>------------------------------------------------------------------------
>
>commit a0bbb5785539ed846f4769368f24a296d54bc801
>Author: Zqiang <qiang1.zhang@intel.com>
>Date:   Sat Apr 8 22:25:30 2023 +0800
>
>    rcu/kvfree: Use consistent krcp when growing kfree_rcu() page cache
>    
>    The add_ptr_to_bulk_krc_lock() function is invoked to allocate a new
>    kfree_rcu() page, also known as a kvfree_rcu_bulk_data structure.
>    The kfree_rcu_cpu structure's lock is used to protect this operation,
>    except that this lock must be momentarily dropped when allocating memory.
>    It is clearly important that the lock that is reacquired be the same
>    lock that was acquired initially via krc_this_cpu_lock().
>    
>    Unfortunately, this same krc_this_cpu_lock() function is used to
>    re-acquire this lock, and if the task migrated to some other CPU during
>    the memory allocation, this will result in the kvfree_rcu_bulk_data
>    structure being added to the wrong CPU's kfree_rcu_cpu structure.
>    
>    This commit therefore replaces that second call to krc_this_cpu_lock()
>    with raw_spin_lock_irqsave() in order to explicitly acquire the lock on
>    the correct kfree_rcu_cpu structure, thus keeping things straight even
>    when the task migrates.
>    
>    Signed-off-by: Zqiang <qiang1.zhang@intel.com>
>    Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
>
>diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
>index 2699b7acf0e3..41daae3239b5 100644
>--- a/kernel/rcu/tree.c
>+++ b/kernel/rcu/tree.c
>@@ -3301,7 +3301,7 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
> 			// scenarios.
> 			bnode = (struct kvfree_rcu_bulk_data *)
> 				__get_free_page(GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
>-			*krcp = krc_this_cpu_lock(flags);
>+			raw_spin_lock_irqsave(&(*krcp)->lock, *flags);
> 		}
> 
> 		if (!bnode)
  

Patch

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 9d9d3772cc45..c9076fa0a954 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3303,7 +3303,7 @@  add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
 			// scenarios.
 			bnode = (struct kvfree_rcu_bulk_data *)
 				__get_free_page(GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
-			*krcp = krc_this_cpu_lock(flags);
+			raw_spin_lock_irqsave(&(*krcp)->lock, *flags);
 		}
 
 		if (!bnode)
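
The race that this patch closes can be modelled in user space. What follows is only a minimal sketch under stated assumptions, not the kernel code: struct krc_model, current_cpu, model_this_cpu_lock(), model_alloc_page(), and grow_cache() are hypothetical names invented for illustration, the "lock" is a plain flag rather than a raw spinlock, and migration is simulated by changing a variable while the lock is dropped.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 2

/* Hypothetical stand-in for struct kfree_rcu_cpu: a lock flag and a page count. */
struct krc_model {
	bool locked;
	int nr_pages;
};

static struct krc_model krcs[NR_CPUS];
static int current_cpu;	/* models the CPU the task currently runs on */

static void model_lock(struct krc_model *k)
{
	assert(!k->locked);
	k->locked = true;
}

static void model_unlock(struct krc_model *k)
{
	assert(k->locked);
	k->locked = false;
}

/* Models krc_this_cpu_lock(): pick the current CPU's structure and lock it. */
static struct krc_model *model_this_cpu_lock(void)
{
	struct krc_model *k = &krcs[current_cpu];

	model_lock(k);
	return k;
}

/* Models the page allocation done with the lock dropped; the task may migrate. */
static void model_alloc_page(bool migrate)
{
	if (migrate)
		current_cpu = (current_cpu + 1) % NR_CPUS;
}

/* Clears all state so each scenario starts on CPU 0 with empty caches. */
static void reset_model(void)
{
	for (int i = 0; i < NR_CPUS; i++) {
		krcs[i].locked = false;
		krcs[i].nr_pages = 0;
	}
	current_cpu = 0;
}

/* Returns the index of the structure that ended up receiving the new page. */
static int grow_cache(bool patched, bool migrate)
{
	struct krc_model *k = model_this_cpu_lock();

	model_unlock(k);		/* lock is dropped around the allocation */
	model_alloc_page(migrate);

	if (patched)
		model_lock(k);		/* patched: relock the structure chosen earlier */
	else
		k = model_this_cpu_lock();	/* old: may select another CPU's structure */

	k->nr_pages++;
	model_unlock(k);
	return (int)(k - krcs);
}
```

Starting from reset_model(), grow_cache(false, true) hands the page to structure 1 even though the operation began on structure 0, while grow_cache(true, true) keeps it on structure 0. Without a simulated migration, both variants behave identically.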