[RFC,v1,5/5] maple_tree: replace preallocation with slub percpu array prefill

Message ID 20230808095342.12637-12-vbabka@suse.cz
State New
Series SLUB percpu array caches and maple tree nodes

Commit Message

Vlastimil Babka Aug. 8, 2023, 9:53 a.m. UTC
With the percpu array we can try dropping the preallocations in the maple
tree and instead make sure the percpu array is prefilled, using
GFP_ATOMIC in places that relied on the preallocation (in case we miss
or fail the trylock on the array), i.e. mas_store_prealloc(). For now simply
add __GFP_NOFAIL there as well.
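
To make the intended flow easier to follow, here is a minimal userspace
sketch of the idea, not the SLUB implementation from this series. The
names pcpu_array, array_prefill() and array_alloc() are hypothetical, a
plain bool stands in for the trylock, and malloc() stands in for the
GFP_ATOMIC | __GFP_NOFAIL fallback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define ARRAY_CAP 16

/* Hypothetical stand-in for one CPU's array of cached objects. */
struct pcpu_array {
	bool locked;            /* models a trylock that would fail */
	size_t count;           /* objects currently cached */
	void *objs[ARRAY_CAP];
};

/* Top the array up to at least `want` objects; returns the cached count. */
static size_t array_prefill(struct pcpu_array *a, size_t want, size_t obj_size)
{
	assert(a != NULL);
	if (want > ARRAY_CAP)
		want = ARRAY_CAP;
	while (a->count < want) {
		void *obj = malloc(obj_size);

		if (!obj)
			break;
		a->objs[a->count++] = obj;
	}
	return a->count;
}

/*
 * Take an object from the array when possible. On a miss (empty array)
 * or a failed trylock, fall back to a direct allocation; malloc() is
 * only a placeholder for the atomic fallback path.
 */
static void *array_alloc(struct pcpu_array *a, size_t obj_size, bool *from_array)
{
	assert(a != NULL && from_array != NULL);
	if (!a->locked && a->count > 0) {
		*from_array = true;
		return a->objs[--a->count];
	}
	*from_array = false;
	return malloc(obj_size);
}
```

In this model a caller prefills once in a context that may sleep and
then allocates from a context that may not; only the miss and
failed-trylock cases reach the fallback.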

First I tried to change mas_node_count_gfp() to not preallocate anything
anywhere, but that led to warnings and panics, even though the other
caller, mas_node_count(), uses GFP_NOWAIT | __GFP_NOWARN and so has no
guarantees... So I changed just mas_preallocate(). I left it truly
preallocating a single node, but maybe that's not necessary?
---
 lib/maple_tree.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
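
As a worked example of what changes in mas_preallocate(): the worst-case
request is 1 + height * 3 nodes, so a tree of height 3 asks for 10. The
sketch below only restates that arithmetic and where the nodes are
requested before and after the patch; prealloc_plan, plan_before() and
plan_after() are hypothetical names used for illustration.

```c
#include <assert.h>

/* Where the worst-case node request ends up. */
struct prealloc_plan {
	int in_ma_state;      /* nodes requested via mas_node_count_gfp() */
	int in_percpu_array;  /* objects requested via the percpu array prefill */
};

/* Worst-case node count used by mas_preallocate(): 1 + height * 3. */
static int prealloc_node_count(int height)
{
	return 1 + height * 3;
}

/* Before the patch: every node is preallocated into the ma_state. */
static struct prealloc_plan plan_before(int height)
{
	struct prealloc_plan p = { prealloc_node_count(height), 0 };

	return p;
}

/* After the patch: one node in the ma_state, the full count prefilled. */
static struct prealloc_plan plan_after(int height)
{
	struct prealloc_plan p = { 1, prealloc_node_count(height) };

	return p;
}
```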
  

Comments

Liam R. Howlett Aug. 8, 2023, 2:37 p.m. UTC | #1
* Vlastimil Babka <vbabka@suse.cz> [230808 05:53]:
> With the percpu array we can try dropping the preallocations in the maple
> tree and instead make sure the percpu array is prefilled, using
> GFP_ATOMIC in places that relied on the preallocation (in case we miss
> or fail the trylock on the array), i.e. mas_store_prealloc(). For now simply
> add __GFP_NOFAIL there as well.
> 
> First I tried to change mas_node_count_gfp() to not preallocate anything
> anywhere, but that led to warnings and panics, even though the other
> caller, mas_node_count(), uses GFP_NOWAIT | __GFP_NOWARN and so has no
> guarantees... So I changed just mas_preallocate(). I left it truly
> preallocating a single node, but maybe that's not necessary?

Ah, yes.  I added a check to make sure we didn't allocate more nodes
when using preallocations.  This check is what you are hitting when you
don't allocate anything.  This is tracked in mas_flags by
setting/clearing MA_STATE_PREALLOC.  Good news, that check works!

> ---
>  lib/maple_tree.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> index 7a8e7c467d7c..5a209d88c318 100644
> --- a/lib/maple_tree.c
> +++ b/lib/maple_tree.c
> @@ -5534,7 +5534,12 @@ void mas_store_prealloc(struct ma_state *mas, void *entry)
>  
>  	mas_wr_store_setup(&wr_mas);
>  	trace_ma_write(__func__, mas, 0, entry);
> +
> +retry:
>  	mas_wr_store_entry(&wr_mas);
> +	if (unlikely(mas_nomem(mas, GFP_ATOMIC | __GFP_NOFAIL)))
> +		goto retry;
> +
>  	MAS_WR_BUG_ON(&wr_mas, mas_is_err(mas));
>  	mas_destroy(mas);
>  }
> @@ -5550,9 +5555,10 @@ EXPORT_SYMBOL_GPL(mas_store_prealloc);
>  int mas_preallocate(struct ma_state *mas, gfp_t gfp)
>  {
>  	int ret;
> +	int count = 1 + mas_mt_height(mas) * 3;
>  
> -	mas_node_count_gfp(mas, 1 + mas_mt_height(mas) * 3, gfp);
> -	mas->mas_flags |= MA_STATE_PREALLOC;
> +	mas_node_count_gfp(mas, 1, gfp);
> +	kmem_cache_prefill_percpu_array(maple_node_cache, count, gfp);
>  	if (likely(!mas_is_err(mas)))
>  		return 0;
>  
> -- 
> 2.41.0
>
  
Liam R. Howlett Aug. 8, 2023, 7:01 p.m. UTC | #2
* Liam R. Howlett <Liam.Howlett@Oracle.com> [230808 10:37]:
> * Vlastimil Babka <vbabka@suse.cz> [230808 05:53]:
> > With the percpu array we can try dropping the preallocations in the maple
> > tree and instead make sure the percpu array is prefilled, using
> > GFP_ATOMIC in places that relied on the preallocation (in case we miss
> > or fail the trylock on the array), i.e. mas_store_prealloc(). For now simply
> > add __GFP_NOFAIL there as well.
> > 
> > First I tried to change mas_node_count_gfp() to not preallocate anything
> > anywhere, but that led to warnings and panics, even though the other
> > caller, mas_node_count(), uses GFP_NOWAIT | __GFP_NOWARN and so has no
> > guarantees... So I changed just mas_preallocate(). I left it truly
> > preallocating a single node, but maybe that's not necessary?
> 
> Ah, yes.  I added a check to make sure we didn't allocate more nodes
> when using preallocations.  This check is what you are hitting when you
> don't allocate anything.  This is tracked in mas_flags by
> setting/clearing MA_STATE_PREALLOC.  Good news, that check works!

Adding the attached patch to your series prior to the below allows for
the removal of the extra preallocation.

> 
> > ---
> >  lib/maple_tree.c | 10 ++++++++--
> >  1 file changed, 8 insertions(+), 2 deletions(-)
> > 
> > diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> > index 7a8e7c467d7c..5a209d88c318 100644
> > --- a/lib/maple_tree.c
> > +++ b/lib/maple_tree.c
> > @@ -5534,7 +5534,12 @@ void mas_store_prealloc(struct ma_state *mas, void *entry)
> >  
> >  	mas_wr_store_setup(&wr_mas);
> >  	trace_ma_write(__func__, mas, 0, entry);
> > +
> > +retry:
> >  	mas_wr_store_entry(&wr_mas);
> > +	if (unlikely(mas_nomem(mas, GFP_ATOMIC | __GFP_NOFAIL)))
> > +		goto retry;
> > +
> >  	MAS_WR_BUG_ON(&wr_mas, mas_is_err(mas));
> >  	mas_destroy(mas);
> >  }
> > @@ -5550,9 +5555,10 @@ EXPORT_SYMBOL_GPL(mas_store_prealloc);
> >  int mas_preallocate(struct ma_state *mas, gfp_t gfp)
> >  {
> >  	int ret;
> > +	int count = 1 + mas_mt_height(mas) * 3;
> >  
> > -	mas_node_count_gfp(mas, 1 + mas_mt_height(mas) * 3, gfp);
> > -	mas->mas_flags |= MA_STATE_PREALLOC;
> > +	mas_node_count_gfp(mas, 1, gfp);
> > +	kmem_cache_prefill_percpu_array(maple_node_cache, count, gfp);
> >  	if (likely(!mas_is_err(mas)))
> >  		return 0;
> >  
> > -- 
> > 2.41.0
> >
  
Liam R. Howlett Aug. 8, 2023, 7:03 p.m. UTC | #3
* Vlastimil Babka <vbabka@suse.cz> [230808 05:53]:
> With the percpu array we can try dropping the preallocations in the maple
> tree and instead make sure the percpu array is prefilled, using
> GFP_ATOMIC in places that relied on the preallocation (in case we miss
> or fail the trylock on the array), i.e. mas_store_prealloc(). For now simply
> add __GFP_NOFAIL there as well.
> 
> First I tried to change mas_node_count_gfp() to not preallocate anything
> anywhere, but that led to warnings and panics, even though the other
> caller, mas_node_count(), uses GFP_NOWAIT | __GFP_NOWARN and so has no
> guarantees... So I changed just mas_preallocate(). I left it truly
> preallocating a single node, but maybe that's not necessary?

Here's a patch to add the percpu array interface to the testing code.

Note that the maple tree preallocation testing isn't updated.

  

Patch

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 7a8e7c467d7c..5a209d88c318 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5534,7 +5534,12 @@  void mas_store_prealloc(struct ma_state *mas, void *entry)
 
 	mas_wr_store_setup(&wr_mas);
 	trace_ma_write(__func__, mas, 0, entry);
+
+retry:
 	mas_wr_store_entry(&wr_mas);
+	if (unlikely(mas_nomem(mas, GFP_ATOMIC | __GFP_NOFAIL)))
+		goto retry;
+
 	MAS_WR_BUG_ON(&wr_mas, mas_is_err(mas));
 	mas_destroy(mas);
 }
@@ -5550,9 +5555,10 @@  EXPORT_SYMBOL_GPL(mas_store_prealloc);
 int mas_preallocate(struct ma_state *mas, gfp_t gfp)
 {
 	int ret;
+	int count = 1 + mas_mt_height(mas) * 3;
 
-	mas_node_count_gfp(mas, 1 + mas_mt_height(mas) * 3, gfp);
-	mas->mas_flags |= MA_STATE_PREALLOC;
+	mas_node_count_gfp(mas, 1, gfp);
+	kmem_cache_prefill_percpu_array(maple_node_cache, count, gfp);
 	if (likely(!mas_is_err(mas)))
 		return 0;