[RFC,6/7] libfs: Convert simple directory offsets to use a Maple Tree

Message ID 170786028128.11135.4581426129369576567.stgit@91.116.238.104.host.secureserver.net
State New
Series Use Maple Trees for simple_offset utilities

Commit Message

Chuck Lever Feb. 13, 2024, 9:38 p.m. UTC
  From: Chuck Lever <chuck.lever@oracle.com>

Test robot reports:
> kernel test robot noticed a -19.0% regression of aim9.disk_src.ops_per_sec on:
>
> commit: a2e459555c5f9da3e619b7e47a63f98574dc75f1 ("shmem: stable directory offsets")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

Feng Tang further clarifies that:
> ... the new simple_offset_add()
> called by shmem_mknod() brings extra cost related with slab,
> specifically the 'radix_tree_node', which cause the regression.

Willy's analysis is that, over time, the test workload causes
xa_alloc_cyclic() to fragment the underlying SLAB cache.

This patch replaces the offset_ctx's xarray with a Maple Tree in the
hope that Maple Tree's dense node mode will handle this scenario
more scalably.

In addition, we can widen the directory offset to an unsigned long
everywhere.

Suggested-by: Matthew Wilcox <willy@infradead.org>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202309081306.3ecb3734-oliver.sang@intel.com
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/libfs.c         |   53 ++++++++++++++++++++++++++--------------------------
 include/linux/fs.h |    5 +++--
 2 files changed, 29 insertions(+), 29 deletions(-)
  

Comments

Jan Kara Feb. 15, 2024, 1:06 p.m. UTC | #1
On Tue 13-02-24 16:38:01, Chuck Lever wrote:
> From: Chuck Lever <chuck.lever@oracle.com>
> 
> Test robot reports:
> > kernel test robot noticed a -19.0% regression of aim9.disk_src.ops_per_sec on:
> >
> > commit: a2e459555c5f9da3e619b7e47a63f98574dc75f1 ("shmem: stable directory offsets")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> 
> Feng Tang further clarifies that:
> > ... the new simple_offset_add()
> > called by shmem_mknod() brings extra cost related with slab,
> > specifically the 'radix_tree_node', which cause the regression.
> 
> Willy's analysis is that, over time, the test workload causes
> xa_alloc_cyclic() to fragment the underlying SLAB cache.
> 
> This patch replaces the offset_ctx's xarray with a Maple Tree in the
> hope that Maple Tree's dense node mode will handle this scenario
> more scalably.
> 
> In addition, we can widen the directory offset to an unsigned long
> everywhere.
> 
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202309081306.3ecb3734-oliver.sang@intel.com
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

OK, but this will need the performance numbers. Otherwise we have no idea
whether this is worth it or not. Maybe you can ask Oliver Sang? Usually
0-day guys are quite helpful.

> @@ -330,9 +329,9 @@ int simple_offset_empty(struct dentry *dentry)
>  	if (!inode || !S_ISDIR(inode->i_mode))
>  		return ret;
>  
> -	index = 2;
> +	index = DIR_OFFSET_MIN;

This bit should go into the simple_offset_empty() patch...

> @@ -434,15 +433,15 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
>  
>  	/* In this case, ->private_data is protected by f_pos_lock */
>  	file->private_data = NULL;
> -	return vfs_setpos(file, offset, U32_MAX);
> +	return vfs_setpos(file, offset, MAX_LFS_FILESIZE);
					^^^
Why this? It is ULONG_MAX << PAGE_SHIFT on 32-bit so that doesn't seem
quite right? Why not use ULONG_MAX here directly?

Otherwise the patch looks good to me.

								Honza
  
Chuck Lever Feb. 15, 2024, 1:45 p.m. UTC | #2
On Thu, Feb 15, 2024 at 02:06:01PM +0100, Jan Kara wrote:
> On Tue 13-02-24 16:38:01, Chuck Lever wrote:
> > From: Chuck Lever <chuck.lever@oracle.com>
> > 
> > Test robot reports:
> > > kernel test robot noticed a -19.0% regression of aim9.disk_src.ops_per_sec on:
> > >
> > > commit: a2e459555c5f9da3e619b7e47a63f98574dc75f1 ("shmem: stable directory offsets")
> > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > 
> > Feng Tang further clarifies that:
> > > ... the new simple_offset_add()
> > > called by shmem_mknod() brings extra cost related with slab,
> > > specifically the 'radix_tree_node', which cause the regression.
> > 
> > Willy's analysis is that, over time, the test workload causes
> > xa_alloc_cyclic() to fragment the underlying SLAB cache.
> > 
> > This patch replaces the offset_ctx's xarray with a Maple Tree in the
> > hope that Maple Tree's dense node mode will handle this scenario
> > more scalably.
> > 
> > In addition, we can widen the directory offset to an unsigned long
> > everywhere.
> > 
> > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > Reported-by: kernel test robot <oliver.sang@intel.com>
> > Closes: https://lore.kernel.org/oe-lkp/202309081306.3ecb3734-oliver.sang@intel.com
> > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> 
> OK, but this will need the performance numbers.

Yes, I totally concur. The point of this posting was to get some
early review and start the ball rolling.

Actually we expect roughly the same performance numbers now. "Dense
node" support in Maple Tree is supposed to be the real win, but
I'm not sure it's ready yet.


> Otherwise we have no idea
> whether this is worth it or not. Maybe you can ask Oliver Sang? Usually
> 0-day guys are quite helpful.

Oliver and Feng were copied on this series.


> > @@ -330,9 +329,9 @@ int simple_offset_empty(struct dentry *dentry)
> >  	if (!inode || !S_ISDIR(inode->i_mode))
> >  		return ret;
> >  
> > -	index = 2;
> > +	index = DIR_OFFSET_MIN;
> 
> This bit should go into the simple_offset_empty() patch...
> 
> > @@ -434,15 +433,15 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
> >  
> >  	/* In this case, ->private_data is protected by f_pos_lock */
> >  	file->private_data = NULL;
> > -	return vfs_setpos(file, offset, U32_MAX);
> > +	return vfs_setpos(file, offset, MAX_LFS_FILESIZE);
> 					^^^
> Why this? It is ULONG_MAX << PAGE_SHIFT on 32-bit so that doesn't seem
> quite right? Why not use ULONG_MAX here directly?

I initially changed U32_MAX to ULONG_MAX, but for some reason, the
length checking in vfs_setpos() fails. There is probably a sign
extension thing happening here that I don't understand.


> Otherwise the patch looks good to me.

As always, thank you for your review.
  
Jan Kara Feb. 15, 2024, 2:02 p.m. UTC | #3
On Thu 15-02-24 08:45:33, Chuck Lever wrote:
> On Thu, Feb 15, 2024 at 02:06:01PM +0100, Jan Kara wrote:
> > On Tue 13-02-24 16:38:01, Chuck Lever wrote:
> > > From: Chuck Lever <chuck.lever@oracle.com>
> > > 
> > > Test robot reports:
> > > > kernel test robot noticed a -19.0% regression of aim9.disk_src.ops_per_sec on:
> > > >
> > > > commit: a2e459555c5f9da3e619b7e47a63f98574dc75f1 ("shmem: stable directory offsets")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > 
> > > Feng Tang further clarifies that:
> > > > ... the new simple_offset_add()
> > > > called by shmem_mknod() brings extra cost related with slab,
> > > > specifically the 'radix_tree_node', which cause the regression.
> > > 
> > > Willy's analysis is that, over time, the test workload causes
> > > xa_alloc_cyclic() to fragment the underlying SLAB cache.
> > > 
> > > This patch replaces the offset_ctx's xarray with a Maple Tree in the
> > > hope that Maple Tree's dense node mode will handle this scenario
> > > more scalably.
> > > 
> > > In addition, we can widen the directory offset to an unsigned long
> > > everywhere.
> > > 
> > > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > > Reported-by: kernel test robot <oliver.sang@intel.com>
> > > Closes: https://lore.kernel.org/oe-lkp/202309081306.3ecb3734-oliver.sang@intel.com
> > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > 
> > OK, but this will need the performance numbers.
> 
> Yes, I totally concur. The point of this posting was to get some
> early review and start the ball rolling.
> 
> Actually we expect roughly the same performance numbers now. "Dense
> node" support in Maple Tree is supposed to be the real win, but
> I'm not sure it's ready yet.
> 
> 
> > Otherwise we have no idea
> > whether this is worth it or not. Maybe you can ask Oliver Sang? Usually
> > 0-day guys are quite helpful.
> 
> Oliver and Feng were copied on this series.
> 
> 
> > > @@ -330,9 +329,9 @@ int simple_offset_empty(struct dentry *dentry)
> > >  	if (!inode || !S_ISDIR(inode->i_mode))
> > >  		return ret;
> > >  
> > > -	index = 2;
> > > +	index = DIR_OFFSET_MIN;
> > 
> > This bit should go into the simple_offset_empty() patch...
> > 
> > > @@ -434,15 +433,15 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
> > >  
> > >  	/* In this case, ->private_data is protected by f_pos_lock */
> > >  	file->private_data = NULL;
> > > -	return vfs_setpos(file, offset, U32_MAX);
> > > +	return vfs_setpos(file, offset, MAX_LFS_FILESIZE);
> > 					^^^
> > Why this? It is ULONG_MAX << PAGE_SHIFT on 32-bit so that doesn't seem
> > quite right? Why not use ULONG_MAX here directly?
> 
> I initially changed U32_MAX to ULONG_MAX, but for some reason, the
> length checking in vfs_setpos() fails. There is probably a sign
> extension thing happening here that I don't understand.

Right. loff_t is signed (long long). So I think you should make the
'offset' be long instead of unsigned long and allow values 0..LONG_MAX?
Then you can pass LONG_MAX here. You potentially lose half of the usable
offsets on 32-bit userspace with 64-bit file offsets but who cares I guess?

								Honza
  
Christian Brauner Feb. 16, 2024, 3:15 p.m. UTC | #4
On Thu, Feb 15, 2024 at 08:45:33AM -0500, Chuck Lever wrote:
> On Thu, Feb 15, 2024 at 02:06:01PM +0100, Jan Kara wrote:
> > On Tue 13-02-24 16:38:01, Chuck Lever wrote:
> > > From: Chuck Lever <chuck.lever@oracle.com>
> > > 
> > > Test robot reports:
> > > > kernel test robot noticed a -19.0% regression of aim9.disk_src.ops_per_sec on:
> > > >
> > > > commit: a2e459555c5f9da3e619b7e47a63f98574dc75f1 ("shmem: stable directory offsets")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > 
> > > Feng Tang further clarifies that:
> > > > ... the new simple_offset_add()
> > > > called by shmem_mknod() brings extra cost related with slab,
> > > > specifically the 'radix_tree_node', which cause the regression.
> > > 
> > > Willy's analysis is that, over time, the test workload causes
> > > xa_alloc_cyclic() to fragment the underlying SLAB cache.
> > > 
> > > This patch replaces the offset_ctx's xarray with a Maple Tree in the
> > > hope that Maple Tree's dense node mode will handle this scenario
> > > more scalably.
> > > 
> > > In addition, we can widen the directory offset to an unsigned long
> > > everywhere.
> > > 
> > > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > > Reported-by: kernel test robot <oliver.sang@intel.com>
> > > Closes: https://lore.kernel.org/oe-lkp/202309081306.3ecb3734-oliver.sang@intel.com
> > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > 
> > OK, but this will need the performance numbers.
> 
> Yes, I totally concur. The point of this posting was to get some
> early review and start the ball rolling.
> 
> Actually we expect roughly the same performance numbers now. "Dense
> node" support in Maple Tree is supposed to be the real win, but
> I'm not sure it's ready yet.

I keep repeating this, but we need a better way to request performance
tests for specific series/branches. Maybe I can add a vfs.perf branch
where we can put patches that we suspect have a positive/negative perf
impact and that the perf bot can pull in. I know that's a fuzzy
boundary, but for stuff like this, where we already know there's a
perf impact that's important for us, it would really help.

Because it is royally annoying to get a perf regression report after a
patch has been in -next for a long time already and the merge window is
coming up or we already merged that stuff.
  
kernel test robot Feb. 18, 2024, 2:02 a.m. UTC | #5
hi, Chuck Lever,

On Thu, Feb 15, 2024 at 08:45:33AM -0500, Chuck Lever wrote:
> On Thu, Feb 15, 2024 at 02:06:01PM +0100, Jan Kara wrote:
> > On Tue 13-02-24 16:38:01, Chuck Lever wrote:
> > > From: Chuck Lever <chuck.lever@oracle.com>
> > > 
> > > Test robot reports:
> > > > kernel test robot noticed a -19.0% regression of aim9.disk_src.ops_per_sec on:
> > > >
> > > > commit: a2e459555c5f9da3e619b7e47a63f98574dc75f1 ("shmem: stable directory offsets")
> > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > 
> > > Feng Tang further clarifies that:
> > > > ... the new simple_offset_add()
> > > > called by shmem_mknod() brings extra cost related with slab,
> > > > specifically the 'radix_tree_node', which cause the regression.
> > > 
> > > Willy's analysis is that, over time, the test workload causes
> > > xa_alloc_cyclic() to fragment the underlying SLAB cache.
> > > 
> > > This patch replaces the offset_ctx's xarray with a Maple Tree in the
> > > hope that Maple Tree's dense node mode will handle this scenario
> > > more scalably.
> > > 
> > > In addition, we can widen the directory offset to an unsigned long
> > > everywhere.
> > > 
> > > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > > Reported-by: kernel test robot <oliver.sang@intel.com>
> > > Closes: https://lore.kernel.org/oe-lkp/202309081306.3ecb3734-oliver.sang@intel.com
> > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > 
> > OK, but this will need the performance numbers.
> 
> Yes, I totally concur. The point of this posting was to get some
> early review and start the ball rolling.
> 
> Actually we expect roughly the same performance numbers now. "Dense
> node" support in Maple Tree is supposed to be the real win, but
> I'm not sure it's ready yet.
> 
> 
> > Otherwise we have no idea
> > whether this is worth it or not. Maybe you can ask Oliver Sang? Usually
> > 0-day guys are quite helpful.
> 
> Oliver and Feng were copied on this series.

we were on holiday last week; now we are back.

I noticed there is v2 for this patch set
https://lore.kernel.org/all/170820145616.6328.12620992971699079156.stgit@91.116.238.104.host.secureserver.net/

and you also put it in a branch:
https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git
"simple-offset-maple" branch.

we will test aim9 performance based on this branch. Thanks

> 
> 
> > > @@ -330,9 +329,9 @@ int simple_offset_empty(struct dentry *dentry)
> > >  	if (!inode || !S_ISDIR(inode->i_mode))
> > >  		return ret;
> > >  
> > > -	index = 2;
> > > +	index = DIR_OFFSET_MIN;
> > 
> > This bit should go into the simple_offset_empty() patch...
> > 
> > > @@ -434,15 +433,15 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
> > >  
> > >  	/* In this case, ->private_data is protected by f_pos_lock */
> > >  	file->private_data = NULL;
> > > -	return vfs_setpos(file, offset, U32_MAX);
> > > +	return vfs_setpos(file, offset, MAX_LFS_FILESIZE);
> > 					^^^
> > Why this? It is ULONG_MAX << PAGE_SHIFT on 32-bit so that doesn't seem
> > quite right? Why not use ULONG_MAX here directly?
> 
> I initially changed U32_MAX to ULONG_MAX, but for some reason, the
> length checking in vfs_setpos() fails. There is probably a sign
> extension thing happening here that I don't understand.
> 
> 
> > Otherwise the patch looks good to me.
> 
> As always, thank you for your review.
> 
> 
> -- 
> Chuck Lever
  
Chuck Lever Feb. 18, 2024, 3:57 p.m. UTC | #6
On Sun, Feb 18, 2024 at 10:02:37AM +0800, Oliver Sang wrote:
> hi, Chuck Lever,
> 
> On Thu, Feb 15, 2024 at 08:45:33AM -0500, Chuck Lever wrote:
> > On Thu, Feb 15, 2024 at 02:06:01PM +0100, Jan Kara wrote:
> > > On Tue 13-02-24 16:38:01, Chuck Lever wrote:
> > > > From: Chuck Lever <chuck.lever@oracle.com>
> > > > 
> > > > Test robot reports:
> > > > > kernel test robot noticed a -19.0% regression of aim9.disk_src.ops_per_sec on:
> > > > >
> > > > > commit: a2e459555c5f9da3e619b7e47a63f98574dc75f1 ("shmem: stable directory offsets")
> > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > > 
> > > > Feng Tang further clarifies that:
> > > > > ... the new simple_offset_add()
> > > > > called by shmem_mknod() brings extra cost related with slab,
> > > > > specifically the 'radix_tree_node', which cause the regression.
> > > > 
> > > > Willy's analysis is that, over time, the test workload causes
> > > > xa_alloc_cyclic() to fragment the underlying SLAB cache.
> > > > 
> > > > This patch replaces the offset_ctx's xarray with a Maple Tree in the
> > > > hope that Maple Tree's dense node mode will handle this scenario
> > > > more scalably.
> > > > 
> > > > In addition, we can widen the directory offset to an unsigned long
> > > > everywhere.
> > > > 
> > > > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > > > Reported-by: kernel test robot <oliver.sang@intel.com>
> > > > Closes: https://lore.kernel.org/oe-lkp/202309081306.3ecb3734-oliver.sang@intel.com
> > > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > 
> > > OK, but this will need the performance numbers.
> > 
> > Yes, I totally concur. The point of this posting was to get some
> > early review and start the ball rolling.
> > 
> > Actually we expect roughly the same performance numbers now. "Dense
> > node" support in Maple Tree is supposed to be the real win, but
> > I'm not sure it's ready yet.
> > 
> > 
> > > Otherwise we have no idea
> > > whether this is worth it or not. Maybe you can ask Oliver Sang? Usually
> > > 0-day guys are quite helpful.
> > 
> > Oliver and Feng were copied on this series.
> 
> we are in holidays last week, now we are back.
> 
> I noticed there is v2 for this patch set
> https://lore.kernel.org/all/170820145616.6328.12620992971699079156.stgit@91.116.238.104.host.secureserver.net/
> 
> and you also put it in a branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git
> "simple-offset-maple" branch.
> 
> we will test aim9 performance based on this branch. Thanks

Very much appreciated!


> > > > @@ -330,9 +329,9 @@ int simple_offset_empty(struct dentry *dentry)
> > > >  	if (!inode || !S_ISDIR(inode->i_mode))
> > > >  		return ret;
> > > >  
> > > > -	index = 2;
> > > > +	index = DIR_OFFSET_MIN;
> > > 
> > > This bit should go into the simple_offset_empty() patch...
> > > 
> > > > @@ -434,15 +433,15 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
> > > >  
> > > >  	/* In this case, ->private_data is protected by f_pos_lock */
> > > >  	file->private_data = NULL;
> > > > -	return vfs_setpos(file, offset, U32_MAX);
> > > > +	return vfs_setpos(file, offset, MAX_LFS_FILESIZE);
> > > 					^^^
> > > Why this? It is ULONG_MAX << PAGE_SHIFT on 32-bit so that doesn't seem
> > > quite right? Why not use ULONG_MAX here directly?
> > 
> > I initially changed U32_MAX to ULONG_MAX, but for some reason, the
> > length checking in vfs_setpos() fails. There is probably a sign
> > extension thing happening here that I don't understand.
> > 
> > 
> > > Otherwise the patch looks good to me.
> > 
> > As always, thank you for your review.
> > 
> > 
> > -- 
> > Chuck Lever
  
kernel test robot Feb. 19, 2024, 6 a.m. UTC | #7
hi, Chuck Lever,

On Sun, Feb 18, 2024 at 10:57:07AM -0500, Chuck Lever wrote:
> On Sun, Feb 18, 2024 at 10:02:37AM +0800, Oliver Sang wrote:
> > hi, Chuck Lever,
> > 
> > On Thu, Feb 15, 2024 at 08:45:33AM -0500, Chuck Lever wrote:
> > > On Thu, Feb 15, 2024 at 02:06:01PM +0100, Jan Kara wrote:
> > > > On Tue 13-02-24 16:38:01, Chuck Lever wrote:
> > > > > From: Chuck Lever <chuck.lever@oracle.com>
> > > > > 
> > > > > Test robot reports:
> > > > > > kernel test robot noticed a -19.0% regression of aim9.disk_src.ops_per_sec on:
> > > > > >
> > > > > > commit: a2e459555c5f9da3e619b7e47a63f98574dc75f1 ("shmem: stable directory offsets")
> > > > > > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > > > > 
> > > > > Feng Tang further clarifies that:
> > > > > > ... the new simple_offset_add()
> > > > > > called by shmem_mknod() brings extra cost related with slab,
> > > > > > specifically the 'radix_tree_node', which cause the regression.
> > > > > 
> > > > > Willy's analysis is that, over time, the test workload causes
> > > > > xa_alloc_cyclic() to fragment the underlying SLAB cache.
> > > > > 
> > > > > This patch replaces the offset_ctx's xarray with a Maple Tree in the
> > > > > hope that Maple Tree's dense node mode will handle this scenario
> > > > > more scalably.
> > > > > 
> > > > > In addition, we can widen the directory offset to an unsigned long
> > > > > everywhere.
> > > > > 
> > > > > Suggested-by: Matthew Wilcox <willy@infradead.org>
> > > > > Reported-by: kernel test robot <oliver.sang@intel.com>
> > > > > Closes: https://lore.kernel.org/oe-lkp/202309081306.3ecb3734-oliver.sang@intel.com
> > > > > Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> > > > 
> > > > OK, but this will need the performance numbers.
> > > 
> > > Yes, I totally concur. The point of this posting was to get some
> > > early review and start the ball rolling.
> > > 
> > > Actually we expect roughly the same performance numbers now. "Dense
> > > node" support in Maple Tree is supposed to be the real win, but
> > > I'm not sure it's ready yet.
> > > 
> > > 
> > > > Otherwise we have no idea
> > > > whether this is worth it or not. Maybe you can ask Oliver Sang? Usually
> > > > 0-day guys are quite helpful.
> > > 
> > > Oliver and Feng were copied on this series.
> > 
> > we are in holidays last week, now we are back.
> > 
> > I noticed there is v2 for this patch set
> > https://lore.kernel.org/all/170820145616.6328.12620992971699079156.stgit@91.116.238.104.host.secureserver.net/
> > 
> > and you also put it in a branch:
> > https://git.kernel.org/pub/scm/linux/kernel/git/cel/linux.git
> > "simple-offset-maple" branch.
> > 
> > we will test aim9 performance based on this branch. Thanks
> 
> Very much appreciated!

always our pleasure!

we've already sent out a report [1] to you for commit a616bc6667 in the
new branch.
we saw an 11.8% improvement of aim9.disk_src.ops_per_sec on it compared
to its parent.

so the regression we saw on a2e459555c is 'half' recovered.

since the performance for a2e459555c, v6.8-rc4 and f3f24869a1 (the parent
of a616bc6667) is very similar, I omitted the results from v6.8-rc4 and
f3f24869a1 in the tables below for brevity. if you want a full table,
please let me know. Thanks!


summary for aim9.disk_src.ops_per_sec:

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/disk_src/aim9/300s

commit:
  23a31d8764 ("shmem: Refactor shmem_symlink()")
  a2e459555c ("shmem: stable directory offsets")
  a616bc6667 ("libfs: Convert simple directory offsets to use a Maple Tree")


23a31d87645c6527 a2e459555c5f9da3e619b7e47a6 a616bc666748063733c62e15ea4
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
    202424           -19.0%     163868            -9.3%     183678        aim9.disk_src.ops_per_sec


full data:

=========================================================================================
compiler/cpufreq_governor/kconfig/rootfs/tbox_group/test/testcase/testtime:
  gcc-12/performance/x86_64-rhel-8.3/debian-11.1-x86_64-20220510.cgz/lkp-ivb-2ep1/disk_src/aim9/300s

commit:
  23a31d8764 ("shmem: Refactor shmem_symlink()")
  a2e459555c ("shmem: stable directory offsets")
  a616bc6667 ("libfs: Convert simple directory offsets to use a Maple Tree")


23a31d87645c6527 a2e459555c5f9da3e619b7e47a6 a616bc666748063733c62e15ea4
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
      1404            +0.6%       1412            +3.3%       1450        boot-time.idle
      0.26 ±  9%      +0.1        0.36 ±  2%      -0.1        0.20 ±  4%  mpstat.cpu.all.soft%
      0.61            -0.1        0.52            -0.0        0.58        mpstat.cpu.all.usr%
      1.00            +0.0%       1.00           +19.5%       1.19 ±  2%  vmstat.procs.r
      1525 ±  6%      -5.6%       1440            +9.4%       1668 ±  4%  vmstat.system.cs
     52670            +5.0%      55323 ±  3%     +12.7%      59381 ±  3%  turbostat.C1
      0.02            +0.0        0.02            +0.0        0.03 ± 10%  turbostat.C1%
     12949 ±  9%      -1.2%      12795 ±  6%     -19.6%      10412 ±  6%  turbostat.POLL
    115468 ± 24%     +11.9%     129258 ±  7%     +16.5%     134545 ±  5%  numa-meminfo.node0.AnonPages.max
      2015 ± 12%      -7.5%       1864 ±  5%     +30.3%       2624 ± 10%  numa-meminfo.node0.PageTables
      4795 ± 30%      +1.1%       4846 ± 37%    +117.0%      10405 ± 21%  numa-meminfo.node0.Shmem
      6442 ±  5%      -0.6%       6401 ±  4%     -19.6%       5180 ±  7%  numa-meminfo.node1.KernelStack
     13731 ± 92%     -88.7%       1546 ±  6%    +161.6%      35915 ± 23%  time.involuntary_context_switches
     94.83            -4.2%      90.83            +1.2%      96.00        time.percent_of_cpu_this_job_got
    211.64            +0.5%     212.70            +4.0%     220.06        time.system_time
     73.62           -17.6%      60.69            -6.4%      68.94        time.user_time
    202424           -19.0%     163868            -9.3%     183678        aim9.disk_src.ops_per_sec
     13731 ± 92%     -88.7%       1546 ±  6%    +161.6%      35915 ± 23%  aim9.time.involuntary_context_switches
     94.83            -4.2%      90.83            +1.2%      96.00        aim9.time.percent_of_cpu_this_job_got
    211.64            +0.5%     212.70            +4.0%     220.06        aim9.time.system_time
     73.62           -17.6%      60.69            -6.4%      68.94        aim9.time.user_time
    174558 ±  4%      -1.0%     172852 ±  7%     -19.7%     140084 ±  7%  meminfo.DirectMap4k
     94166            +6.6%     100388           -14.4%      80579        meminfo.KReclaimable
     12941            +0.4%      12989           -14.4%      11078        meminfo.KernelStack
      3769            +0.2%       3775           +31.5%       4955        meminfo.PageTables
     79298            +0.0%      79298           -70.6%      23298        meminfo.Percpu
     94166            +6.6%     100388           -14.4%      80579        meminfo.SReclaimable
    204209            +2.9%     210111            -9.6%     184661        meminfo.Slab
    503.33 ± 12%      -7.5%     465.67 ±  5%     +30.4%     656.39 ± 10%  numa-vmstat.node0.nr_page_table_pages
      1198 ± 30%      +1.1%       1211 ± 37%    +117.0%       2601 ± 21%  numa-vmstat.node0.nr_shmem
    220.33            +0.2%     220.83 ±  2%     -97.3%       6.00 ±100%  numa-vmstat.node1.nr_active_file
    167.00            +0.2%     167.33 ±  2%     -87.7%      20.48 ±100%  numa-vmstat.node1.nr_inactive_file
      6443 ±  5%      -0.6%       6405 ±  4%     -19.5%       5184 ±  7%  numa-vmstat.node1.nr_kernel_stack
    220.33            +0.2%     220.83 ±  2%     -97.3%       6.00 ±100%  numa-vmstat.node1.nr_zone_active_file
    167.00            +0.2%     167.33 ±  2%     -87.7%      20.48 ±100%  numa-vmstat.node1.nr_zone_inactive_file
      0.04 ± 25%      +4.1%       0.05 ± 33%     -40.1%       0.03 ± 34%  perf-sched.sch_delay.avg.ms.syslog_print.do_syslog.kmsg_read.vfs_read
      0.01 ±  4%      +1.6%       0.01 ±  8%    +136.9%       0.02 ± 29%  perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.16 ± 10%     -18.9%       0.13 ± 12%     -16.4%       0.13 ± 15%  perf-sched.sch_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64
      0.14 ± 15%      -6.0%       0.14 ± 20%     -30.4%       0.10 ± 26%  perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
      0.04 ±  7%   +1802.4%       0.78 ±115%   +7258.6%       3.00 ± 56%  perf-sched.sch_delay.max.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      0.05 ± 40%     -14.5%       0.04 ± 31%   +6155.4%       3.09        perf-sched.sch_delay.max.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.01 ±  8%     +25.5%       0.01 ± 21%    +117.0%       0.02 ± 30%  perf-sched.total_sch_delay.average.ms
      0.16 ± 11%   +1204.9%       2.12 ± 90%   +2225.8%       3.77 ± 10%  perf-sched.total_sch_delay.max.ms
     58.50 ± 28%      -4.8%      55.67 ± 14%     -23.9%      44.50 ±  4%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.usleep_range_state.ipmi_thread.kthread
    277.83 ±  3%     +10.3%     306.50 ±  5%     +17.9%     327.62 ±  4%  perf-sched.wait_and_delay.count.worker_thread.kthread.ret_from_fork.ret_from_fork_asm
      0.10 ± 75%     -47.1%       0.05 ± 86%    -100.0%       0.00        perf-sched.wait_time.avg.ms.__cond_resched.dput.path_put.vfs_statx.vfs_fstatat
      0.10 ± 72%     -10.8%       0.09 ± 67%    -100.0%       0.00        perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      2.00 ± 27%      -4.8%       1.90 ± 14%     -23.7%       1.53 ±  4%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.do_select.core_sys_select.kern_select
      0.16 ± 81%     +20.1%       0.19 ±104%    -100.0%       0.00        perf-sched.wait_time.max.ms.__cond_resched.dput.path_put.vfs_statx.vfs_fstatat
      0.22 ± 77%     +14.2%       0.25 ± 54%    -100.0%       0.00        perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      7.32 ± 43%     -10.3%       6.57 ± 20%     -33.2%       4.89 ±  6%  perf-sched.wait_time.max.ms.schedule_hrtimeout_range_clock.do_select.core_sys_select.kern_select
    220.33            +0.2%     220.83 ±  2%     -94.6%      12.00        proc-vmstat.nr_active_file
    676766            -0.0%     676706            +4.1%     704377        proc-vmstat.nr_file_pages
    167.00            +0.2%     167.33 ±  2%     -75.5%      40.98        proc-vmstat.nr_inactive_file
     12941            +0.4%      12988           -14.4%      11083        proc-vmstat.nr_kernel_stack
    942.50            +0.1%     943.50           +31.4%       1238        proc-vmstat.nr_page_table_pages
      7915 ±  3%      -0.8%       7855            +8.6%       8599        proc-vmstat.nr_shmem
     23541            +6.5%      25074           -14.4%      20144        proc-vmstat.nr_slab_reclaimable
     27499            -0.3%      27422            -5.4%      26016        proc-vmstat.nr_slab_unreclaimable
    668461            +0.0%     668461            +4.2%     696493        proc-vmstat.nr_unevictable
    220.33            +0.2%     220.83 ±  2%     -94.6%      12.00        proc-vmstat.nr_zone_active_file
    167.00            +0.2%     167.33 ±  2%     -75.5%      40.98        proc-vmstat.nr_zone_inactive_file
    668461            +0.0%     668461            +4.2%     696493        proc-vmstat.nr_zone_unevictable
    722.17 ± 71%     -34.7%     471.67 ±136%     -98.3%      12.62 ±129%  proc-vmstat.numa_hint_faults_local
   1437319 ± 24%    +377.6%    6864201           -47.8%     750673 ±  7%  proc-vmstat.numa_hit
   1387016 ± 25%    +391.4%    6815486           -49.5%     700947 ±  7%  proc-vmstat.numa_local
     50329            -0.0%      50324            -1.2%      49704        proc-vmstat.numa_other
   4864362 ± 34%    +453.6%   26931180           -66.1%    1648373 ± 17%  proc-vmstat.pgalloc_normal
   4835960 ± 34%    +455.4%   26856610           -66.3%    1628178 ± 18%  proc-vmstat.pgfree
     11.21           +23.7%      13.87           -81.8%       2.04        perf-stat.i.MPKI
 7.223e+08            -4.4%  6.907e+08            -3.9%   6.94e+08        perf-stat.i.branch-instructions
      2.67            +0.2        2.88            +0.0        2.70        perf-stat.i.branch-miss-rate%
  19988363            +2.8%   20539702            -3.1%   19363031        perf-stat.i.branch-misses
     17.36            -2.8       14.59            +0.4       17.77        perf-stat.i.cache-miss-rate%
  40733859           +19.5%   48659982            -1.9%   39962840        perf-stat.i.cache-references
      1482 ±  7%      -5.7%       1398            +9.6%       1623 ±  5%  perf-stat.i.context-switches
      1.76            +3.5%       1.82            +5.1%       1.85        perf-stat.i.cpi
     55.21            +5.4%      58.21 ±  2%      +0.4%      55.45        perf-stat.i.cpu-migrations
  16524721 ±  8%      -4.8%   15726404 ±  4%     -13.1%   14367627 ±  4%  perf-stat.i.dTLB-load-misses
  1.01e+09            -3.8%  9.719e+08            -4.4%  9.659e+08        perf-stat.i.dTLB-loads
      0.26 ±  4%      -0.0        0.23 ±  3%      -0.0        0.25 ±  3%  perf-stat.i.dTLB-store-miss-rate%
   2166022 ±  4%      -6.9%    2015917 ±  3%      -7.0%    2014037 ±  3%  perf-stat.i.dTLB-store-misses
 8.503e+08            +5.5%  8.968e+08            -3.5%  8.205e+08        perf-stat.i.dTLB-stores
     69.22 ±  4%      +6.4       75.60           +14.4       83.60 ±  3%  perf-stat.i.iTLB-load-miss-rate%
    709457 ±  5%      -5.4%     670950 ±  2%    +133.6%    1657233        perf-stat.i.iTLB-load-misses
    316455 ± 12%     -31.6%     216531 ±  3%      +3.5%     327592 ± 19%  perf-stat.i.iTLB-loads
 3.722e+09            -3.1%  3.608e+09            -4.6%  3.553e+09        perf-stat.i.instructions
      5243 ±  5%      +2.2%       5357 ±  2%     -59.1%       2142        perf-stat.i.instructions-per-iTLB-miss
      0.57            -3.3%       0.55            -4.8%       0.54        perf-stat.i.ipc
    865.04           -10.4%     775.02 ±  3%      -2.5%     843.12        perf-stat.i.metric.K/sec
     53.84            -0.4%      53.61            -4.0%      51.71        perf-stat.i.metric.M/sec
     47.51            -2.1       45.37            +0.7       48.17        perf-stat.i.node-load-miss-rate%
     88195 ±  3%      +5.2%      92745 ±  4%     +16.4%     102647 ±  6%  perf-stat.i.node-load-misses
    106705 ±  3%     +14.8%     122490 ±  5%     +13.2%     120774 ±  5%  perf-stat.i.node-loads
    107169 ±  4%     +29.0%     138208 ±  7%      +7.5%     115217 ±  5%  perf-stat.i.node-stores
     10.94           +23.3%      13.49           -81.8%       1.99        perf-stat.overall.MPKI
      2.77            +0.2        2.97            +0.0        2.79        perf-stat.overall.branch-miss-rate%
     17.28            -2.7       14.56            +0.4       17.67        perf-stat.overall.cache-miss-rate%
      1.73            +3.4%       1.79            +5.0%       1.82        perf-stat.overall.cpi
      0.25 ±  4%      -0.0        0.22 ±  3%      -0.0        0.24 ±  3%  perf-stat.overall.dTLB-store-miss-rate%
     69.20 ±  4%      +6.4       75.60           +14.4       83.58 ±  3%  perf-stat.overall.iTLB-load-miss-rate%
      5260 ±  5%      +2.3%       5380 ±  2%     -59.2%       2144        perf-stat.overall.instructions-per-iTLB-miss
      0.58            -3.2%       0.56            -4.7%       0.55        perf-stat.overall.ipc
     45.25            -2.2       43.10            +0.7       45.93        perf-stat.overall.node-load-miss-rate%
 7.199e+08            -4.4%  6.883e+08            -3.9%  6.917e+08        perf-stat.ps.branch-instructions
  19919808            +2.8%   20469001            -3.1%   19299968        perf-stat.ps.branch-misses
  40597326           +19.5%   48497201            -1.9%   39829580        perf-stat.ps.cache-references
      1477 ±  7%      -5.7%       1393            +9.6%       1618 ±  5%  perf-stat.ps.context-switches
     55.06            +5.4%      58.03 ±  2%      +0.5%      55.32        perf-stat.ps.cpu-migrations
  16469488 ±  8%      -4.8%   15673772 ±  4%     -13.1%   14319828 ±  4%  perf-stat.ps.dTLB-load-misses
 1.007e+09            -3.8%  9.686e+08            -4.4%  9.627e+08        perf-stat.ps.dTLB-loads
   2158768 ±  4%      -6.9%    2009174 ±  3%      -7.0%    2007326 ±  3%  perf-stat.ps.dTLB-store-misses
 8.475e+08            +5.5%  8.937e+08            -3.5%  8.178e+08        perf-stat.ps.dTLB-stores
    707081 ±  5%      -5.4%     668703 ±  2%    +133.6%    1651678        perf-stat.ps.iTLB-load-misses
    315394 ± 12%     -31.6%     215816 ±  3%      +3.5%     326463 ± 19%  perf-stat.ps.iTLB-loads
  3.71e+09            -3.1%  3.595e+09            -4.5%  3.541e+09        perf-stat.ps.instructions
     87895 ±  3%      +5.2%      92424 ±  4%     +16.4%     102341 ±  6%  perf-stat.ps.node-load-misses
    106351 ±  3%     +14.8%     122083 ±  5%     +13.2%     120439 ±  5%  perf-stat.ps.node-loads
    106728 ±  4%     +29.1%     137740 ±  7%      +7.6%     114824 ±  5%  perf-stat.ps.node-stores
 1.117e+12            -3.0%  1.084e+12            -4.5%  1.067e+12        perf-stat.total.instructions
     10.55 ± 95%    -100.0%       0.00          -100.0%       0.00        sched_debug.cfs_rq:/.MIN_vruntime.avg
    506.30 ± 95%    -100.0%       0.00          -100.0%       0.00        sched_debug.cfs_rq:/.MIN_vruntime.max
      0.00            +0.0%       0.00          -100.0%       0.00        sched_debug.cfs_rq:/.MIN_vruntime.min
      1.08 ±  7%      -7.7%       1.00           -46.2%       0.58 ± 27%  sched_debug.cfs_rq:/.h_nr_running.max
    538959 ± 24%     -23.2%     414090           -58.3%     224671 ± 31%  sched_debug.cfs_rq:/.load.max
    130191 ± 14%     -13.3%     112846 ±  6%     -56.6%      56505 ± 39%  sched_debug.cfs_rq:/.load.stddev
     10.56 ± 95%    -100.0%       0.00          -100.0%       0.00        sched_debug.cfs_rq:/.max_vruntime.avg
    506.64 ± 95%    -100.0%       0.00          -100.0%       0.00        sched_debug.cfs_rq:/.max_vruntime.max
      0.00            +0.0%       0.00          -100.0%       0.00        sched_debug.cfs_rq:/.max_vruntime.min
    116849 ± 27%     -51.2%      56995 ± 20%     -66.7%      38966 ± 39%  sched_debug.cfs_rq:/.min_vruntime.max
      1248 ± 14%      +9.8%       1370 ± 15%     -29.8%     876.86 ± 16%  sched_debug.cfs_rq:/.min_vruntime.min
     20484 ± 22%     -37.0%      12910 ± 15%     -60.7%       8059 ± 32%  sched_debug.cfs_rq:/.min_vruntime.stddev
      1.08 ±  7%      -7.7%       1.00           -46.2%       0.58 ± 27%  sched_debug.cfs_rq:/.nr_running.max
      1223 ±191%    -897.4%      -9754          -100.0%       0.00        sched_debug.cfs_rq:/.spread0.avg
    107969 ± 29%     -65.3%      37448 ± 39%    -100.0%       0.00        sched_debug.cfs_rq:/.spread0.max
     -7628          +138.2%     -18173          -100.0%       0.00        sched_debug.cfs_rq:/.spread0.min
     20484 ± 22%     -37.0%      12910 ± 15%    -100.0%       0.00        sched_debug.cfs_rq:/.spread0.stddev
     29.84 ± 19%      -1.1%      29.52 ± 14%    -100.0%       0.00        sched_debug.cfs_rq:/.util_est_enqueued.avg
    569.91 ± 11%      +4.5%     595.58          -100.0%       0.00        sched_debug.cfs_rq:/.util_est_enqueued.max
    109.06 ± 13%      +3.0%     112.32 ±  5%    -100.0%       0.00        sched_debug.cfs_rq:/.util_est_enqueued.stddev
    910320            +0.0%     910388           -44.2%     507736 ± 28%  sched_debug.cpu.avg_idle.avg
   1052715 ±  5%      -3.8%    1012910           -45.8%     570219 ± 28%  sched_debug.cpu.avg_idle.max
    175426 ±  8%      +2.8%     180422 ±  6%     -42.5%     100839 ± 20%  sched_debug.cpu.clock.avg
    175428 ±  8%      +2.8%     180424 ±  6%     -42.5%     100840 ± 20%  sched_debug.cpu.clock.max
    175424 ±  8%      +2.8%     180421 ±  6%     -42.5%     100838 ± 20%  sched_debug.cpu.clock.min
    170979 ±  8%      +2.8%     175682 ±  6%     -42.5%      98373 ± 20%  sched_debug.cpu.clock_task.avg
    173398 ±  8%      +2.6%     177937 ±  6%     -42.6%      99568 ± 20%  sched_debug.cpu.clock_task.max
    165789 ±  8%      +3.2%     171014 ±  6%     -42.4%      95513 ± 20%  sched_debug.cpu.clock_task.min
      2178 ±  8%     -13.1%       1893 ± 18%     -49.7%       1094 ± 33%  sched_debug.cpu.clock_task.stddev
      7443 ±  4%      +1.7%       7566 ±  3%     -43.0%       4246 ± 24%  sched_debug.cpu.curr->pid.max
      1409 ±  2%      +1.1%       1426 ±  2%     -42.0%     818.46 ± 25%  sched_debug.cpu.curr->pid.stddev
    502082            +0.0%     502206           -43.9%     281834 ± 29%  sched_debug.cpu.max_idle_balance_cost.avg
    567767 ±  5%      -3.3%     548996 ±  2%     -46.6%     303439 ± 27%  sched_debug.cpu.max_idle_balance_cost.max
      4294            +0.0%       4294           -43.7%       2415 ± 29%  sched_debug.cpu.next_balance.avg
      4294            +0.0%       4294           -43.7%       2415 ± 29%  sched_debug.cpu.next_balance.max
      4294            +0.0%       4294           -43.7%       2415 ± 29%  sched_debug.cpu.next_balance.min
      0.29 ±  3%      -1.2%       0.28           -43.0%       0.16 ± 29%  sched_debug.cpu.nr_running.stddev
      8212 ±  7%      -2.2%       8032 ±  4%     -40.7%       4867 ± 20%  sched_debug.cpu.nr_switches.avg
     55209 ± 14%     -21.8%      43154 ± 14%     -57.0%      23755 ± 20%  sched_debug.cpu.nr_switches.max
      1272 ± 23%     +10.2%       1402 ±  8%     -39.1%     775.51 ± 29%  sched_debug.cpu.nr_switches.min
      9805 ± 13%     -13.7%       8459 ±  8%     -50.8%       4825 ± 20%  sched_debug.cpu.nr_switches.stddev
    -15.73           -14.1%     -13.52           -64.8%      -5.53        sched_debug.cpu.nr_uninterruptible.min
      6.08 ± 27%      -1.0%       6.02 ± 13%     -53.7%       2.82 ± 24%  sched_debug.cpu.nr_uninterruptible.stddev
    175425 ±  8%      +2.8%     180421 ±  6%     -42.5%     100838 ± 20%  sched_debug.cpu_clk
 4.295e+09            +0.0%  4.295e+09           -43.7%  2.416e+09 ± 29%  sched_debug.jiffies
    174815 ±  8%      +2.9%     179811 ±  6%     -42.5%     100505 ± 20%  sched_debug.ktime
    175972 ±  8%      +2.8%     180955 ±  6%     -42.5%     101150 ± 20%  sched_debug.sched_clk
  58611259            +0.0%   58611259           -94.0%    3508734 ± 29%  sched_debug.sysctl_sched.sysctl_sched_features
      0.75            +0.0%       0.75          -100.0%       0.00        sched_debug.sysctl_sched.sysctl_sched_idle_min_granularity
     24.00            +0.0%      24.00          -100.0%       0.00        sched_debug.sysctl_sched.sysctl_sched_latency
      3.00            +0.0%       3.00          -100.0%       0.00        sched_debug.sysctl_sched.sysctl_sched_min_granularity
      4.00            +0.0%       4.00          -100.0%       0.00        sched_debug.sysctl_sched.sysctl_sched_wakeup_granularity
      0.26 ±100%      -0.3        0.00            +1.2        1.44 ±  9%  perf-profile.calltrace.cycles-pp.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      2.08 ± 26%      -0.2        1.87 ± 12%      -1.4        0.63 ± 11%  perf-profile.calltrace.cycles-pp.syscall_exit_to_user_mode.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      0.00            +0.0        0.00            +0.7        0.67 ± 11%  perf-profile.calltrace.cycles-pp.rcu_sched_clock_irq.update_process_times.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues
      0.00            +0.0        0.00            +0.7        0.71 ± 18%  perf-profile.calltrace.cycles-pp.rebalance_domains.__do_softirq.irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt
      0.00            +0.0        0.00            +0.8        0.80 ± 14%  perf-profile.calltrace.cycles-pp.getname_flags.__do_sys_newstat.do_syscall_64.entry_SYSCALL_64_after_hwframe.__xstat64
      0.00            +0.0        0.00            +0.8        0.84 ± 12%  perf-profile.calltrace.cycles-pp.__fput.__x64_sys_close.do_syscall_64.entry_SYSCALL_64_after_hwframe.__close
      0.00            +0.0        0.00            +1.0        0.98 ± 13%  perf-profile.calltrace.cycles-pp.mas_alloc_cyclic.mtree_alloc_cyclic.simple_offset_add.shmem_mknod.lookup_open
      0.00            +0.0        0.00            +1.1        1.10 ± 14%  perf-profile.calltrace.cycles-pp.mtree_alloc_cyclic.simple_offset_add.shmem_mknod.lookup_open.open_last_lookups
      0.00            +0.0        0.00            +1.2        1.20 ± 13%  perf-profile.calltrace.cycles-pp.mas_erase.mtree_erase.simple_offset_remove.shmem_unlink.vfs_unlink
      0.00            +0.0        0.00            +1.3        1.34 ± 15%  perf-profile.calltrace.cycles-pp.link_path_walk.path_lookupat.filename_lookup.vfs_statx.__do_sys_newstat
      0.00            +0.0        0.00            +1.4        1.35 ± 12%  perf-profile.calltrace.cycles-pp.mtree_erase.simple_offset_remove.shmem_unlink.vfs_unlink.do_unlinkat
      0.00            +0.0        0.00            +1.6        1.56 ± 18%  perf-profile.calltrace.cycles-pp.__do_softirq.irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state
      0.00            +0.0        0.00            +1.7        1.73 ±  6%  perf-profile.calltrace.cycles-pp.scheduler_tick.update_process_times.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues
      0.00            +0.0        0.00            +1.8        1.80 ± 16%  perf-profile.calltrace.cycles-pp.irq_exit_rcu.sysvec_apic_timer_interrupt.asm_sysvec_apic_timer_interrupt.cpuidle_enter_state.cpuidle_enter
      0.00            +0.0        0.00            +2.0        2.03 ± 15%  perf-profile.calltrace.cycles-pp.path_lookupat.filename_lookup.vfs_statx.__do_sys_newstat.do_syscall_64
      0.00            +0.0        0.00            +2.2        2.16 ± 15%  perf-profile.calltrace.cycles-pp.filename_lookup.vfs_statx.__do_sys_newstat.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +0.0        0.00            +2.9        2.94 ± 14%  perf-profile.calltrace.cycles-pp.vfs_statx.__do_sys_newstat.do_syscall_64.entry_SYSCALL_64_after_hwframe.__xstat64
      0.00            +0.0        0.00            +3.2        3.19 ±  8%  perf-profile.calltrace.cycles-pp.update_process_times.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt
      0.00            +0.0        0.00            +3.4        3.35 ±  8%  perf-profile.calltrace.cycles-pp.tick_sched_handle.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt
      0.00            +0.0        0.00            +3.9        3.88 ±  8%  perf-profile.calltrace.cycles-pp.tick_nohz_highres_handler.__hrtimer_run_queues.hrtimer_interrupt.__sysvec_apic_timer_interrupt.sysvec_apic_timer_interrupt
      0.00            +0.8        0.75 ± 12%      +0.0        0.00        perf-profile.calltrace.cycles-pp.__call_rcu_common.xas_store.__xa_erase.xa_erase.simple_offset_remove
      0.00            +0.8        0.78 ± 34%      +0.0        0.00        perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_lru.xas_alloc.xas_create.xas_store
      0.00            +0.8        0.83 ± 29%      +0.0        0.00        perf-profile.calltrace.cycles-pp.allocate_slab.___slab_alloc.kmem_cache_alloc_lru.xas_alloc.xas_expand
      0.00            +0.9        0.92 ± 26%      +0.0        0.00        perf-profile.calltrace.cycles-pp.___slab_alloc.kmem_cache_alloc_lru.xas_alloc.xas_expand.xas_create
      0.00            +1.0        0.99 ± 27%      +0.0        0.00        perf-profile.calltrace.cycles-pp.shuffle_freelist.allocate_slab.___slab_alloc.kmem_cache_alloc_lru.xas_alloc
      0.00            +1.0        1.04 ± 28%      +0.0        0.00        perf-profile.calltrace.cycles-pp.kmem_cache_alloc_lru.xas_alloc.xas_create.xas_store.__xa_alloc
      0.00            +1.1        1.11 ± 26%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xas_alloc.xas_create.xas_store.__xa_alloc.__xa_alloc_cyclic
      1.51 ± 24%      +1.2        2.73 ± 10%      +1.2        2.75 ± 10%  perf-profile.calltrace.cycles-pp.vfs_unlink.do_unlinkat.__x64_sys_unlink.do_syscall_64.entry_SYSCALL_64_after_hwframe
      0.00            +1.2        1.24 ± 20%      +0.0        0.00        perf-profile.calltrace.cycles-pp.kmem_cache_alloc_lru.xas_alloc.xas_expand.xas_create.xas_store
      0.00            +1.3        1.27 ± 10%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xas_store.__xa_erase.xa_erase.simple_offset_remove.shmem_unlink
      0.00            +1.3        1.30 ± 10%      +0.0        0.00        perf-profile.calltrace.cycles-pp.__xa_erase.xa_erase.simple_offset_remove.shmem_unlink.vfs_unlink
      0.00            +1.3        1.33 ± 19%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xas_alloc.xas_expand.xas_create.xas_store.__xa_alloc
      0.00            +1.4        1.36 ± 10%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xa_erase.simple_offset_remove.shmem_unlink.vfs_unlink.do_unlinkat
      0.00            +1.4        1.37 ± 10%      +1.4        1.37 ± 12%  perf-profile.calltrace.cycles-pp.simple_offset_remove.shmem_unlink.vfs_unlink.do_unlinkat.__x64_sys_unlink
      0.00            +1.5        1.51 ± 17%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xas_expand.xas_create.xas_store.__xa_alloc.__xa_alloc_cyclic
      0.00            +1.6        1.62 ± 12%      +1.6        1.65 ± 12%  perf-profile.calltrace.cycles-pp.shmem_unlink.vfs_unlink.do_unlinkat.__x64_sys_unlink.do_syscall_64
      0.00            +2.8        2.80 ± 13%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xas_create.xas_store.__xa_alloc.__xa_alloc_cyclic.simple_offset_add
      0.00            +2.9        2.94 ± 13%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xas_store.__xa_alloc.__xa_alloc_cyclic.simple_offset_add.shmem_mknod
      5.38 ± 24%      +3.1        8.51 ± 11%      +0.6        5.95 ± 11%  perf-profile.calltrace.cycles-pp.lookup_open.open_last_lookups.path_openat.do_filp_open.do_sys_openat2
      6.08 ± 24%      +3.2        9.24 ± 12%      +0.5        6.59 ± 10%  perf-profile.calltrace.cycles-pp.open_last_lookups.path_openat.do_filp_open.do_sys_openat2.__x64_sys_creat
      0.00            +3.2        3.20 ± 13%      +0.0        0.00        perf-profile.calltrace.cycles-pp.__xa_alloc.__xa_alloc_cyclic.simple_offset_add.shmem_mknod.lookup_open
      0.00            +3.2        3.24 ± 13%      +0.0        0.00        perf-profile.calltrace.cycles-pp.__xa_alloc_cyclic.simple_offset_add.shmem_mknod.lookup_open.open_last_lookups
      0.00            +3.4        3.36 ± 14%      +1.2        1.16 ± 13%  perf-profile.calltrace.cycles-pp.simple_offset_add.shmem_mknod.lookup_open.open_last_lookups.path_openat
      2.78 ± 25%      +3.4        6.17 ± 12%      +0.9        3.69 ± 12%  perf-profile.calltrace.cycles-pp.shmem_mknod.lookup_open.open_last_lookups.path_openat.do_filp_open
      0.16 ± 30%      -0.1        0.08 ± 20%      -0.0        0.13 ± 42%  perf-profile.children.cycles-pp.map_id_up
      0.22 ± 18%      -0.0        0.16 ± 17%      -0.1        0.12 ±  8%  perf-profile.children.cycles-pp._raw_spin_lock_irq
      0.47 ± 17%      -0.0        0.43 ± 16%      +1.0        1.49 ± 10%  perf-profile.children.cycles-pp.__x64_sys_close
      0.02 ±142%      -0.0        0.00            +0.1        0.08 ± 22%  perf-profile.children.cycles-pp._find_next_zero_bit
      0.01 ±223%      -0.0        0.00            +2.0        1.97 ± 15%  perf-profile.children.cycles-pp.irq_exit_rcu
      0.00            +0.0        0.00            +0.1        0.08 ± 30%  perf-profile.children.cycles-pp.__wake_up
      0.00            +0.0        0.00            +0.1        0.08 ± 21%  perf-profile.children.cycles-pp.should_we_balance
      0.00            +0.0        0.00            +0.1        0.09 ± 34%  perf-profile.children.cycles-pp.amd_clear_divider
      0.00            +0.0        0.00            +0.1        0.10 ± 36%  perf-profile.children.cycles-pp.apparmor_current_getsecid_subj
      0.00            +0.0        0.00            +0.1        0.10 ± 32%  perf-profile.children.cycles-pp.filp_flush
      0.00            +0.0        0.00            +0.1        0.10 ± 27%  perf-profile.children.cycles-pp.mas_wr_end_piv
      0.00            +0.0        0.00            +0.1        0.12 ± 27%  perf-profile.children.cycles-pp.mnt_get_write_access
      0.00            +0.0        0.00            +0.1        0.12 ± 23%  perf-profile.children.cycles-pp.file_close_fd
      0.00            +0.0        0.00            +0.1        0.13 ± 30%  perf-profile.children.cycles-pp.security_current_getsecid_subj
      0.00            +0.0        0.00            +0.1        0.13 ± 18%  perf-profile.children.cycles-pp.native_apic_mem_eoi
      0.00            +0.0        0.00            +0.2        0.17 ± 22%  perf-profile.children.cycles-pp.mas_leaf_max_gap
      0.00            +0.0        0.00            +0.2        0.18 ± 24%  perf-profile.children.cycles-pp.mtree_range_walk
      0.00            +0.0        0.00            +0.2        0.20 ± 19%  perf-profile.children.cycles-pp.inode_set_ctime_current
      0.00            +0.0        0.00            +0.2        0.24 ± 14%  perf-profile.children.cycles-pp.ima_file_check
      0.00            +0.0        0.00            +0.2        0.24 ± 22%  perf-profile.children.cycles-pp.mas_anode_descend
      0.00            +0.0        0.00            +0.3        0.26 ± 18%  perf-profile.children.cycles-pp.lockref_put_return
      0.00            +0.0        0.00            +0.3        0.29 ± 16%  perf-profile.children.cycles-pp.mas_wr_walk
      0.00            +0.0        0.00            +0.3        0.31 ± 23%  perf-profile.children.cycles-pp.mas_update_gap
      0.00            +0.0        0.00            +0.3        0.32 ± 17%  perf-profile.children.cycles-pp.mas_wr_append
      0.00            +0.0        0.00            +0.4        0.37 ± 15%  perf-profile.children.cycles-pp.mas_empty_area
      0.00            +0.0        0.00            +0.5        0.47 ± 18%  perf-profile.children.cycles-pp.mas_wr_node_store
      0.00            +0.0        0.00            +0.5        0.53 ± 18%  perf-profile.children.cycles-pp.__memcg_slab_post_alloc_hook
      0.00            +0.0        0.00            +0.7        0.65 ± 12%  perf-profile.children.cycles-pp.__memcg_slab_pre_alloc_hook
      0.00            +0.0        0.00            +0.7        0.65 ± 28%  perf-profile.children.cycles-pp.__memcg_slab_free_hook
      0.00            +0.0        0.00            +1.0        0.99 ± 13%  perf-profile.children.cycles-pp.mas_alloc_cyclic
      0.00            +0.0        0.00            +1.1        1.11 ± 14%  perf-profile.children.cycles-pp.mtree_alloc_cyclic
      0.00            +0.0        0.00            +1.2        1.21 ± 14%  perf-profile.children.cycles-pp.mas_erase
      0.00            +0.0        0.00            +1.4        1.35 ± 12%  perf-profile.children.cycles-pp.mtree_erase
      0.00            +0.0        0.00            +1.8        1.78 ±  9%  perf-profile.children.cycles-pp.entry_SYSCALL_64
      0.00            +0.0        0.00            +4.1        4.06 ±  8%  perf-profile.children.cycles-pp.tick_nohz_highres_handler
      0.02 ±146%      +0.1        0.08 ± 13%      +0.0        0.06 ± 81%  perf-profile.children.cycles-pp.shmem_is_huge
      0.02 ±141%      +0.1        0.09 ± 16%      -0.0        0.00        perf-profile.children.cycles-pp.__list_del_entry_valid
      0.00            +0.1        0.08 ± 11%      +0.0        0.00        perf-profile.children.cycles-pp.free_unref_page
      0.00            +0.1        0.08 ± 13%      +0.1        0.08 ± 45%  perf-profile.children.cycles-pp.shmem_destroy_inode
      0.04 ±101%      +0.1        0.14 ± 25%      +0.0        0.05 ± 65%  perf-profile.children.cycles-pp.rcu_nocb_try_bypass
      0.00            +0.1        0.12 ± 27%      +0.0        0.00        perf-profile.children.cycles-pp.xas_find_marked
      0.02 ±144%      +0.1        0.16 ± 14%      -0.0        0.00        perf-profile.children.cycles-pp.__unfreeze_partials
      0.03 ±106%      +0.2        0.19 ± 26%      -0.0        0.03 ±136%  perf-profile.children.cycles-pp.xas_descend
      0.01 ±223%      +0.2        0.17 ± 15%      -0.0        0.00        perf-profile.children.cycles-pp.get_page_from_freelist
      0.11 ± 22%      +0.2        0.29 ± 16%      -0.0        0.08 ± 30%  perf-profile.children.cycles-pp.rcu_segcblist_enqueue
      0.02 ±146%      +0.2        0.24 ± 13%      -0.0        0.01 ±174%  perf-profile.children.cycles-pp.__alloc_pages
      0.36 ± 79%      +0.6        0.98 ± 15%      -0.0        0.31 ± 43%  perf-profile.children.cycles-pp.__slab_free
      0.50 ± 26%      +0.7        1.23 ± 14%      -0.2        0.31 ± 19%  perf-profile.children.cycles-pp.__call_rcu_common
      0.00            +0.8        0.82 ± 13%      +0.0        0.00        perf-profile.children.cycles-pp.radix_tree_node_rcu_free
      0.00            +1.1        1.14 ± 17%      +0.0        0.00        perf-profile.children.cycles-pp.radix_tree_node_ctor
      0.16 ± 86%      +1.2        1.38 ± 16%      -0.1        0.02 ±174%  perf-profile.children.cycles-pp.setup_object
      1.52 ± 25%      +1.2        2.75 ± 10%      +1.2        2.76 ± 11%  perf-profile.children.cycles-pp.vfs_unlink
      0.36 ± 22%      +1.3        1.63 ± 12%      +1.3        1.65 ± 12%  perf-profile.children.cycles-pp.shmem_unlink
      0.00            +1.3        1.30 ± 10%      +0.0        0.00        perf-profile.children.cycles-pp.__xa_erase
      0.20 ± 79%      +1.3        1.53 ± 15%      -0.2        0.02 ±173%  perf-profile.children.cycles-pp.shuffle_freelist
      0.00            +1.4        1.36 ± 10%      +0.0        0.00        perf-profile.children.cycles-pp.xa_erase
      0.00            +1.4        1.38 ± 10%      +1.4        1.37 ± 12%  perf-profile.children.cycles-pp.simple_offset_remove
      0.00            +1.5        1.51 ± 17%      +0.0        0.00        perf-profile.children.cycles-pp.xas_expand
      0.26 ± 78%      +1.6        1.87 ± 13%      -0.2        0.05 ± 68%  perf-profile.children.cycles-pp.allocate_slab
      0.40 ± 49%      +1.7        2.10 ± 13%      -0.3        0.14 ± 28%  perf-profile.children.cycles-pp.___slab_alloc
      1.30 ± 85%      +2.1        3.42 ± 12%      -0.1        1.15 ± 41%  perf-profile.children.cycles-pp.rcu_do_batch
      1.56 ± 27%      +2.4        3.93 ± 11%      -0.2        1.41 ± 12%  perf-profile.children.cycles-pp.kmem_cache_alloc_lru
      0.00            +2.4        2.44 ± 12%      +0.0        0.00        perf-profile.children.cycles-pp.xas_alloc
      2.66 ± 13%      +2.5        5.14 ±  5%      -2.7        0.00        perf-profile.children.cycles-pp.__irq_exit_rcu
     11.16 ± 10%      +2.7       13.88 ±  8%      +0.6       11.72 ±  8%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
     11.77 ± 10%      +2.7       14.49 ±  8%      +0.6       12.40 ±  8%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.00            +2.8        2.82 ± 13%      +0.0        0.00        perf-profile.children.cycles-pp.xas_create
      5.40 ± 24%      +3.1        8.52 ± 11%      +0.6        5.97 ± 11%  perf-profile.children.cycles-pp.lookup_open
      6.12 ± 24%      +3.1        9.27 ± 12%      +0.5        6.62 ± 10%  perf-profile.children.cycles-pp.open_last_lookups
      0.00            +3.2        3.22 ± 13%      +0.0        0.00        perf-profile.children.cycles-pp.__xa_alloc
      0.00            +3.2        3.24 ± 13%      +0.0        0.00        perf-profile.children.cycles-pp.__xa_alloc_cyclic
      0.00            +3.4        3.36 ± 14%      +1.2        1.16 ± 13%  perf-profile.children.cycles-pp.simple_offset_add
      2.78 ± 25%      +3.4        6.18 ± 12%      +0.9        3.70 ± 12%  perf-profile.children.cycles-pp.shmem_mknod
      0.00            +4.2        4.24 ± 12%      +0.0        0.00        perf-profile.children.cycles-pp.xas_store
      0.14 ± 27%      -0.1        0.08 ± 21%      -0.0        0.12 ± 42%  perf-profile.self.cycles-pp.map_id_up
      0.09 ± 18%      -0.0        0.06 ± 52%      +0.1        0.14 ± 12%  perf-profile.self.cycles-pp.obj_cgroup_charge
      0.18 ± 22%      -0.0        0.15 ± 17%      -0.1        0.10 ±  9%  perf-profile.self.cycles-pp._raw_spin_lock_irq
      0.02 ±141%      -0.0        0.00            +0.1        0.08 ± 22%  perf-profile.self.cycles-pp._find_next_zero_bit
      0.16 ± 24%      -0.0        0.16 ± 24%      -0.1        0.07 ± 64%  perf-profile.self.cycles-pp.__sysvec_apic_timer_interrupt
      0.02 ±146%      +0.0        0.02 ±146%      +0.1        0.08 ± 18%  perf-profile.self.cycles-pp.shmem_mknod
      0.00            +0.0        0.00            +0.1        0.09 ± 36%  perf-profile.self.cycles-pp.irq_exit_rcu
      0.00            +0.0        0.00            +0.1        0.09 ± 28%  perf-profile.self.cycles-pp.tick_nohz_highres_handler
      0.00            +0.0        0.00            +0.1        0.09 ± 36%  perf-profile.self.cycles-pp.apparmor_current_getsecid_subj
      0.00            +0.0        0.00            +0.1        0.09 ± 30%  perf-profile.self.cycles-pp.mtree_erase
      0.00            +0.0        0.00            +0.1        0.10 ± 26%  perf-profile.self.cycles-pp.mtree_alloc_cyclic
      0.00            +0.0        0.00            +0.1        0.10 ± 27%  perf-profile.self.cycles-pp.mas_wr_end_piv
      0.00            +0.0        0.00            +0.1        0.12 ± 28%  perf-profile.self.cycles-pp.mnt_get_write_access
      0.00            +0.0        0.00            +0.1        0.12 ± 29%  perf-profile.self.cycles-pp.inode_set_ctime_current
      0.00            +0.0        0.00            +0.1        0.12 ± 38%  perf-profile.self.cycles-pp.mas_empty_area
      0.00            +0.0        0.00            +0.1        0.13 ± 18%  perf-profile.self.cycles-pp.native_apic_mem_eoi
      0.00            +0.0        0.00            +0.1        0.14 ± 38%  perf-profile.self.cycles-pp.mas_update_gap
      0.00            +0.0        0.00            +0.1        0.14 ± 20%  perf-profile.self.cycles-pp.mas_wr_append
      0.00            +0.0        0.00            +0.2        0.16 ± 23%  perf-profile.self.cycles-pp.mas_leaf_max_gap
      0.00            +0.0        0.00            +0.2        0.18 ± 24%  perf-profile.self.cycles-pp.mtree_range_walk
      0.00            +0.0        0.00            +0.2        0.18 ± 29%  perf-profile.self.cycles-pp.mas_alloc_cyclic
      0.00            +0.0        0.00            +0.2        0.20 ± 14%  perf-profile.self.cycles-pp.__memcg_slab_pre_alloc_hook
      0.00            +0.0        0.00            +0.2        0.22 ± 32%  perf-profile.self.cycles-pp.mas_erase
      0.00            +0.0        0.00            +0.2        0.24 ± 35%  perf-profile.self.cycles-pp.__memcg_slab_free_hook
      0.00            +0.0        0.00            +0.2        0.24 ± 22%  perf-profile.self.cycles-pp.mas_anode_descend
      0.00            +0.0        0.00            +0.3        0.26 ± 17%  perf-profile.self.cycles-pp.lockref_put_return
      0.00            +0.0        0.00            +0.3        0.27 ± 16%  perf-profile.self.cycles-pp.mas_wr_walk
      0.00            +0.0        0.00            +0.3        0.34 ± 20%  perf-profile.self.cycles-pp.mas_wr_node_store
      0.00            +0.0        0.00            +0.4        0.35 ± 20%  perf-profile.self.cycles-pp.__memcg_slab_post_alloc_hook
      0.00            +0.0        0.00            +1.6        1.59 ±  8%  perf-profile.self.cycles-pp.entry_SYSCALL_64
      0.05 ± 85%      +0.0        0.06 ± 47%      +0.1        0.15 ± 54%  perf-profile.self.cycles-pp.check_heap_object
      0.05 ± 72%      +0.0        0.06 ± 81%      +0.0        0.10 ± 15%  perf-profile.self.cycles-pp.call_cpuidle
      0.04 ±105%      +0.0        0.04 ± 75%      +0.1        0.10 ± 33%  perf-profile.self.cycles-pp.putname
      0.00            +0.1        0.06 ± 24%      +0.0        0.05 ± 66%  perf-profile.self.cycles-pp.shmem_destroy_inode
      0.00            +0.1        0.07 ±  8%      +0.0        0.00        perf-profile.self.cycles-pp.__xa_alloc
      0.02 ±146%      +0.1        0.11 ± 28%      +0.0        0.02 ±133%  perf-profile.self.cycles-pp.rcu_nocb_try_bypass
      0.01 ±223%      +0.1        0.10 ± 28%      -0.0        0.00        perf-profile.self.cycles-pp.shuffle_freelist
      0.00            +0.1        0.11 ± 40%      +0.0        0.00        perf-profile.self.cycles-pp.xas_create
      0.00            +0.1        0.12 ± 27%      +0.0        0.00        perf-profile.self.cycles-pp.xas_find_marked
      0.00            +0.1        0.14 ± 18%      +0.0        0.00        perf-profile.self.cycles-pp.xas_alloc
      0.03 ±103%      +0.1        0.17 ± 29%      -0.0        0.03 ±136%  perf-profile.self.cycles-pp.xas_descend
      0.00            +0.2        0.16 ± 23%      +0.0        0.00        perf-profile.self.cycles-pp.xas_expand
      0.10 ± 22%      +0.2        0.27 ± 16%      -0.0        0.06 ± 65%  perf-profile.self.cycles-pp.rcu_segcblist_enqueue
      0.92 ± 35%      +0.3        1.22 ± 11%      -0.3        0.59 ± 15%  perf-profile.self.cycles-pp.kmem_cache_free
      0.00            +0.4        0.36 ± 16%      +0.0        0.00        perf-profile.self.cycles-pp.xas_store
      0.32 ± 30%      +0.4        0.71 ± 12%      -0.1        0.18 ± 23%  perf-profile.self.cycles-pp.__call_rcu_common
      0.18 ± 27%      +0.5        0.65 ±  8%      +0.1        0.26 ± 21%  perf-profile.self.cycles-pp.kmem_cache_alloc_lru
      0.36 ± 79%      +0.6        0.96 ± 15%      -0.0        0.31 ± 42%  perf-profile.self.cycles-pp.__slab_free
      0.00            +0.8        0.80 ± 14%      +0.0        0.00        perf-profile.self.cycles-pp.radix_tree_node_rcu_free
      0.00            +1.0        1.01 ± 16%      +0.0        0.00        perf-profile.self.cycles-pp.radix_tree_node_ctor
[1] https://lore.kernel.org/all/202402191308.8e7ee8c7-oliver.sang@intel.com/

> 
> 
> > > > > @@ -330,9 +329,9 @@ int simple_offset_empty(struct dentry *dentry)
> > > > >  	if (!inode || !S_ISDIR(inode->i_mode))
> > > > >  		return ret;
> > > > >  
> > > > > -	index = 2;
> > > > > +	index = DIR_OFFSET_MIN;
> > > > 
> > > > This bit should go into the simple_offset_empty() patch...
> > > > 
> > > > > @@ -434,15 +433,15 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
> > > > >  
> > > > >  	/* In this case, ->private_data is protected by f_pos_lock */
> > > > >  	file->private_data = NULL;
> > > > > -	return vfs_setpos(file, offset, U32_MAX);
> > > > > +	return vfs_setpos(file, offset, MAX_LFS_FILESIZE);
> > > > 					^^^
> > > > Why this? It is ULONG_MAX << PAGE_SHIFT on 32-bit so that doesn't seem
> > > > quite right? Why not use ULONG_MAX here directly?
> > > 
> > > I initially changed U32_MAX to ULONG_MAX, but for some reason, the
> > > length checking in vfs_setpos() fails. There is probably a sign
> > > extension thing happening here that I don't understand.
> > > 
> > > 
> > > > Otherwise the patch looks good to me.
> > > 
> > > As always, thank you for your review.
> > > 
> > > 
> > > -- 
> > > Chuck Lever
> 
> -- 
> Chuck Lever
  

Patch

diff --git a/fs/libfs.c b/fs/libfs.c
index 3cf773950f93..f073e9aeb2bf 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -245,17 +245,17 @@  enum {
 	DIR_OFFSET_MIN	= 2,
 };
 
-static void offset_set(struct dentry *dentry, u32 offset)
+static void offset_set(struct dentry *dentry, unsigned long offset)
 {
-	dentry->d_fsdata = (void *)((uintptr_t)(offset));
+	dentry->d_fsdata = (void *)offset;
 }
 
-static u32 dentry2offset(struct dentry *dentry)
+static unsigned long dentry2offset(struct dentry *dentry)
 {
-	return (u32)((uintptr_t)(dentry->d_fsdata));
+	return (unsigned long)dentry->d_fsdata;
 }
 
-static struct lock_class_key simple_offset_xa_lock;
+static struct lock_class_key simple_offset_lock_class;
 
 /**
  * simple_offset_init - initialize an offset_ctx
@@ -264,8 +264,8 @@  static struct lock_class_key simple_offset_xa_lock;
  */
 void simple_offset_init(struct offset_ctx *octx)
 {
-	xa_init_flags(&octx->xa, XA_FLAGS_ALLOC1);
-	lockdep_set_class(&octx->xa.xa_lock, &simple_offset_xa_lock);
+	mt_init_flags(&octx->mt, MT_FLAGS_ALLOC_RANGE);
+	lockdep_set_class(&octx->mt.ma_lock, &simple_offset_lock_class);
 	octx->next_offset = DIR_OFFSET_MIN;
 }
 
@@ -279,15 +279,14 @@  void simple_offset_init(struct offset_ctx *octx)
  */
 int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
 {
-	static const struct xa_limit limit = XA_LIMIT(DIR_OFFSET_MIN, U32_MAX);
-	u32 offset;
+	unsigned long offset;
 	int ret;
 
 	if (dentry2offset(dentry) != 0)
 		return -EBUSY;
 
-	ret = xa_alloc_cyclic(&octx->xa, &offset, dentry, limit,
-			      &octx->next_offset, GFP_KERNEL);
+	ret = mtree_alloc_cyclic(&octx->mt, &offset, dentry, DIR_OFFSET_MIN,
+				 ULONG_MAX, &octx->next_offset, GFP_KERNEL);
 	if (ret < 0)
 		return ret;
 
@@ -303,13 +302,13 @@  int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
  */
 void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry)
 {
-	u32 offset;
+	unsigned long offset;
 
 	offset = dentry2offset(dentry);
 	if (offset == 0)
 		return;
 
-	xa_erase(&octx->xa, offset);
+	mtree_erase(&octx->mt, offset);
 	offset_set(dentry, 0);
 }
 
@@ -330,9 +329,9 @@  int simple_offset_empty(struct dentry *dentry)
 	if (!inode || !S_ISDIR(inode->i_mode))
 		return ret;
 
-	index = 2;
+	index = DIR_OFFSET_MIN;
 	octx = inode->i_op->get_offset_ctx(inode);
-	xa_for_each(&octx->xa, index, child) {
+	mt_for_each(&octx->mt, child, index, ULONG_MAX) {
 		spin_lock(&child->d_lock);
 		if (simple_positive(child)) {
 			spin_unlock(&child->d_lock);
@@ -362,8 +361,8 @@  int simple_offset_rename_exchange(struct inode *old_dir,
 {
 	struct offset_ctx *old_ctx = old_dir->i_op->get_offset_ctx(old_dir);
 	struct offset_ctx *new_ctx = new_dir->i_op->get_offset_ctx(new_dir);
-	u32 old_index = dentry2offset(old_dentry);
-	u32 new_index = dentry2offset(new_dentry);
+	unsigned long old_index = dentry2offset(old_dentry);
+	unsigned long new_index = dentry2offset(new_dentry);
 	int ret;
 
 	simple_offset_remove(old_ctx, old_dentry);
@@ -389,9 +388,9 @@  int simple_offset_rename_exchange(struct inode *old_dir,
 
 out_restore:
 	offset_set(old_dentry, old_index);
-	xa_store(&old_ctx->xa, old_index, old_dentry, GFP_KERNEL);
+	mtree_store(&old_ctx->mt, old_index, old_dentry, GFP_KERNEL);
 	offset_set(new_dentry, new_index);
-	xa_store(&new_ctx->xa, new_index, new_dentry, GFP_KERNEL);
+	mtree_store(&new_ctx->mt, new_index, new_dentry, GFP_KERNEL);
 	return ret;
 }
 
@@ -404,7 +403,7 @@  int simple_offset_rename_exchange(struct inode *old_dir,
  */
 void simple_offset_destroy(struct offset_ctx *octx)
 {
-	xa_destroy(&octx->xa);
+	mtree_destroy(&octx->mt);
 }
 
 /**
@@ -434,15 +433,15 @@  static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
 
 	/* In this case, ->private_data is protected by f_pos_lock */
 	file->private_data = NULL;
-	return vfs_setpos(file, offset, U32_MAX);
+	return vfs_setpos(file, offset, MAX_LFS_FILESIZE);
 }
 
-static struct dentry *offset_find_next(struct xa_state *xas)
+static struct dentry *offset_find_next(struct ma_state *mas)
 {
 	struct dentry *child, *found = NULL;
 
 	rcu_read_lock();
-	child = xas_next_entry(xas, U32_MAX);
+	child = mas_find(mas, ULONG_MAX);
 	if (!child)
 		goto out;
 	spin_lock(&child->d_lock);
@@ -456,7 +455,7 @@  static struct dentry *offset_find_next(struct xa_state *xas)
 
 static bool offset_dir_emit(struct dir_context *ctx, struct dentry *dentry)
 {
-	u32 offset = dentry2offset(dentry);
+	unsigned long offset = dentry2offset(dentry);
 	struct inode *inode = d_inode(dentry);
 
 	return ctx->actor(ctx, dentry->d_name.name, dentry->d_name.len, offset,
@@ -466,11 +465,11 @@  static bool offset_dir_emit(struct dir_context *ctx, struct dentry *dentry)
 static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
 {
 	struct offset_ctx *octx = inode->i_op->get_offset_ctx(inode);
-	XA_STATE(xas, &octx->xa, ctx->pos);
+	MA_STATE(mas, &octx->mt, ctx->pos, ctx->pos);
 	struct dentry *dentry;
 
 	while (true) {
-		dentry = offset_find_next(&xas);
+		dentry = offset_find_next(&mas);
 		if (!dentry)
 			return ERR_PTR(-ENOENT);
 
@@ -480,7 +479,7 @@  static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
 		}
 
 		dput(dentry);
-		ctx->pos = xas.xa_index + 1;
+		ctx->pos = mas.index + 1;
 	}
 	return NULL;
 }
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 03d141809a2c..55144c12ee0f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -43,6 +43,7 @@ 
 #include <linux/cred.h>
 #include <linux/mnt_idmapping.h>
 #include <linux/slab.h>
+#include <linux/maple_tree.h>
 
 #include <asm/byteorder.h>
 #include <uapi/linux/fs.h>
@@ -3260,8 +3261,8 @@  extern ssize_t simple_write_to_buffer(void *to, size_t available, loff_t *ppos,
 		const void __user *from, size_t count);
 
 struct offset_ctx {
-	struct xarray		xa;
-	u32			next_offset;
+	struct maple_tree	mt;
+	unsigned long		next_offset;
 };
 
 void simple_offset_init(struct offset_ctx *octx);