[1/2] of: Introduce __of_phandle_update_cache

Message ID 20240130105236.3097126-2-dawei.li@shingroup.cn
State New
Headers
Series Update phandle cache on device node attach |

Commit Message

Dawei Li Jan. 30, 2024, 10:52 a.m. UTC
  For system with CONFIG_OF_DYNAMIC=y, device nodes can be inserted/removed
dynamically from device tree. Meanwhile phandle_cache is created for fast
lookup from phandle to device node.

For node detach, phandle cache of removed node is invalidated to maintain
the mapping up to date, but the counterpart operation on node attach is
not implemented yet.

Thus, implement the cache updating operation on node attach.

Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
---
 drivers/of/base.c       | 16 ++++++++++++++++
 drivers/of/of_private.h |  1 +
 2 files changed, 17 insertions(+)
  

Comments

Rob Herring Jan. 31, 2024, 9:29 p.m. UTC | #1
On Tue, Jan 30, 2024 at 06:52:35PM +0800, Dawei Li wrote:
> For system with CONFIG_OF_DYNAMIC=y, device nodes can be inserted/removed
> dynamically from device tree. Meanwhile phandle_cache is created for fast
> lookup from phandle to device node.

Why do we need it to be fast? What's the usecase (upstream dynamic DT 
usecases are limited) and what's the performance difference? We'll 
already cache the new phandle on the first lookup. Plus with only 128 
entries you are likely evicting an entry. 

> For node detach, phandle cache of removed node is invalidated to maintain
> the mapping up to date, but the counterpart operation on node attach is
> not implemented yet.
> 
> Thus, implement the cache updating operation on node attach.

Except this patch does not do that. The next patch does.

> 
> Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
> ---
>  drivers/of/base.c       | 16 ++++++++++++++++
>  drivers/of/of_private.h |  1 +
>  2 files changed, 17 insertions(+)
> 
> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index b0ad8fc06e80..8b7da27835eb 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -163,6 +163,22 @@ void __of_phandle_cache_inv_entry(phandle handle)
>  		phandle_cache[handle_hash] = NULL;
>  }
>  
> +void __of_phandle_update_cache(struct device_node *np, bool lock)
> +{
> +	u32 hash;
> +
> +	if (lock)
> +		lockdep_assert_held(&devtree_lock);

I don't think this is a good use of a function parameter.

> +
> +	if (unlikely(!np || !np->phandle))
> +		return;
> +
> +	hash = of_phandle_cache_hash(np->phandle);
> +
> +	if (!phandle_cache[hash])
> +		phandle_cache[hash] = np;

Okay, so you don't evict existing entries. I'm not sure what makes more 
sense. I would imagine old entries are less likely to be accessed than 
new phandles for just added nodes given DT is kind of parse it all once 
(e.g. at boot time). Again, need to understand your usecase and 
performance differences.

Rob
  
Dawei Li Feb. 1, 2024, 10 a.m. UTC | #2
Hi Rob,

Thanks for reviewing,

On Wed, Jan 31, 2024 at 03:29:38PM -0600, Rob Herring wrote:
> On Tue, Jan 30, 2024 at 06:52:35PM +0800, Dawei Li wrote:
> > For system with CONFIG_OF_DYNAMIC=y, device nodes can be inserted/removed
> > dynamically from device tree. Meanwhile phandle_cache is created for fast
> > lookup from phandle to device node.
> 
> Why do we need it to be fast? What's the usecase (upstream dynamic DT 
> usecases are limited) and what's the performance difference? We'll 
> already cache the new phandle on the first lookup. Plus with only 128 
> entries you are likely evicting an entry. 

I read the history changelog and get that a _lot_ of lookup has been
taken before of_core_init(), so the update of cache in lookup operation
mean a lot to performance improvement.

> 
> > For node detach, phandle cache of removed node is invalidated to maintain
> > the mapping up to date, but the counterpart operation on node attach is
> > not implemented yet.
> > 
> > Thus, implement the cache updating operation on node attach.
> 
> Except this patch does not do that. The next patch does.

Agreed.

> 
> > 
> > Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
> > ---
> >  drivers/of/base.c       | 16 ++++++++++++++++
> >  drivers/of/of_private.h |  1 +
> >  2 files changed, 17 insertions(+)
> > 
> > diff --git a/drivers/of/base.c b/drivers/of/base.c
> > index b0ad8fc06e80..8b7da27835eb 100644
> > --- a/drivers/of/base.c
> > +++ b/drivers/of/base.c
> > @@ -163,6 +163,22 @@ void __of_phandle_cache_inv_entry(phandle handle)
> >  		phandle_cache[handle_hash] = NULL;
> >  }
> >  
> > +void __of_phandle_update_cache(struct device_node *np, bool lock)
> > +{
> > +	u32 hash;
> > +
> > +	if (lock)
> > +		lockdep_assert_held(&devtree_lock);
> 
> I don't think this is a good use of a function parameter.

Yep, assertion under condition is odd.

> 
> > +
> > +	if (unlikely(!np || !np->phandle))
> > +		return;
> > +
> > +	hash = of_phandle_cache_hash(np->phandle);
> > +
> > +	if (!phandle_cache[hash])
> > +		phandle_cache[hash] = np;
> 
> Okay, so you don't evict existing entries. I'm not sure what makes more 

Yes, the updating policy of dynamic nodes is exactly same with static nodes
(the ones in of_core_init()), no eviction/invalidation on _existing_ cache
involved.

> sense. I would imagine old entries are less likely to be accessed than 

Well, I don't think we are gonna implement a full-fledged cache replacing
algorithm such as LRU.

> new phandles for just added nodes given DT is kind of parse it all once 
> (e.g. at boot time). Again, need to understand your usecase and 
> performance differences.

It's kinda awkward that no such usecases/stats are available for now.

My motivation is simple as that:
As long as detached nodes are supposed to be removed from cache entries,
the newly inserted nodes should be added to cache entries, it is more
balanced and symmetric.

And I am sorry if it breaks/undermines current design.

Thanks,

    Dawei

> 
> Rob
>
  
Rob Herring Feb. 1, 2024, 9:09 p.m. UTC | #3
On Thu, Feb 1, 2024 at 4:01 AM Dawei Li <dawei.li@shingroup.cn> wrote:
>
> Hi Rob,
>
> Thanks for reviewing,
>
> On Wed, Jan 31, 2024 at 03:29:38PM -0600, Rob Herring wrote:
> > On Tue, Jan 30, 2024 at 06:52:35PM +0800, Dawei Li wrote:
> > > For system with CONFIG_OF_DYNAMIC=y, device nodes can be inserted/removed
> > > dynamically from device tree. Meanwhile phandle_cache is created for fast
> > > lookup from phandle to device node.
> >
> > Why do we need it to be fast? What's the usecase (upstream dynamic DT
> > usecases are limited) and what's the performance difference? We'll
> > already cache the new phandle on the first lookup. Plus with only 128
> > entries you are likely evicting an entry.
>
> I read the history changelog and get that a _lot_ of lookup has been
> taken before of_core_init(), so the update of cache in lookup operation
> mean a lot to performance improvement.

Yes, and there was compelling data on the performance difference to
justify the added complexity.

> > > For node detach, phandle cache of removed node is invalidated to maintain
> > > the mapping up to date, but the counterpart operation on node attach is
> > > not implemented yet.
> > >
> > > Thus, implement the cache updating operation on node attach.
> >
> > Except this patch does not do that. The next patch does.
>
> Agreed.
>
> >
> > >
> > > Signed-off-by: Dawei Li <dawei.li@shingroup.cn>
> > > ---
> > >  drivers/of/base.c       | 16 ++++++++++++++++
> > >  drivers/of/of_private.h |  1 +
> > >  2 files changed, 17 insertions(+)
> > >
> > > diff --git a/drivers/of/base.c b/drivers/of/base.c
> > > index b0ad8fc06e80..8b7da27835eb 100644
> > > --- a/drivers/of/base.c
> > > +++ b/drivers/of/base.c
> > > @@ -163,6 +163,22 @@ void __of_phandle_cache_inv_entry(phandle handle)
> > >             phandle_cache[handle_hash] = NULL;
> > >  }
> > >
> > > +void __of_phandle_update_cache(struct device_node *np, bool lock)
> > > +{
> > > +   u32 hash;
> > > +
> > > +   if (lock)
> > > +           lockdep_assert_held(&devtree_lock);
> >
> > I don't think this is a good use of a function parameter.
>
> Yep, assertion under condition is odd.
>
> >
> > > +
> > > +   if (unlikely(!np || !np->phandle))
> > > +           return;
> > > +
> > > +   hash = of_phandle_cache_hash(np->phandle);
> > > +
> > > +   if (!phandle_cache[hash])
> > > +           phandle_cache[hash] = np;
> >
> > Okay, so you don't evict existing entries. I'm not sure what makes more
>
> Yes, the updating policy of dynamic nodes is exactly same with static nodes
> (the ones in of_core_init()), no eviction/invalidation on _existing_ cache
> involved.
>
> > sense. I would imagine old entries are less likely to be accessed than
>
> Well, I don't think we are gonna implement a full-fledged cache replacing
> algorithm such as LRU.
>
> > new phandles for just added nodes given DT is kind of parse it all once
> > (e.g. at boot time). Again, need to understand your usecase and
> > performance differences.
>
> It's kinda awkward that no such usecases/stats are available for now.
>
> My motivation is simple as that:
> As long as detached nodes are supposed to be removed from cache entries,
> the newly inserted nodes should be added to cache entries, it is more
> balanced and symmetric.

The difference is that no entry for attach works fine while accessing
a detached node that may have been freed would be a problem.

Rob
  

Patch

diff --git a/drivers/of/base.c b/drivers/of/base.c
index b0ad8fc06e80..8b7da27835eb 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -163,6 +163,22 @@  void __of_phandle_cache_inv_entry(phandle handle)
 		phandle_cache[handle_hash] = NULL;
 }
 
+void __of_phandle_update_cache(struct device_node *np, bool lock)
+{
+	u32 hash;
+
+	if (lock)
+		lockdep_assert_held(&devtree_lock);
+
+	if (unlikely(!np || !np->phandle))
+		return;
+
+	hash = of_phandle_cache_hash(np->phandle);
+
+	if (!phandle_cache[hash])
+		phandle_cache[hash] = np;
+}
+
 void __init of_core_init(void)
 {
 	struct device_node *np;
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index f38397c7b582..89559aad8ccb 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -94,6 +94,7 @@  int of_resolve_phandles(struct device_node *tree);
 #endif
 
 void __of_phandle_cache_inv_entry(phandle handle);
+void __of_phandle_update_cache(struct device_node *np, bool lock);
 
 #if defined(CONFIG_OF_OVERLAY)
 void of_overlay_mutex_lock(void);