[2/2] kernfs: don't take i_lock on revalidate

Message ID 166606036967.13363.9336408133975631967.stgit@donald.themaw.net
State New
Series kernfs: remove i_lock usage that isn't needed

Commit Message

Ian Kent Oct. 18, 2022, 2:32 a.m. UTC
In kernfs_dop_revalidate(), when the passed-in dentry is negative, the
dentry's parent directory is checked to see if it has changed and, if
so, the negative dentry is discarded so it can be refreshed. During
this check the dentry inode i_lock is taken to mitigate against a
possible concurrent rename.

But if it's racing with a rename then, because the dentry is negative,
it can't be the source; it must be the target, and the rename must be
going to do a d_move(), otherwise it will return an error.

In this case the parent dentry of the target will not change; it will
be the same over the d_move(). Only the source dentry's parent may
change, so the inode i_lock isn't needed.

Signed-off-by: Ian Kent <raven@themaw.net>
---
 fs/kernfs/dir.c |   24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)
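
Note: the control flow that the negative-dentry path ends up with can be
sketched in userspace roughly as below. This is only a toy model under
stated assumptions, not the kernel code: toy_rwsem, toy_node, toy_dentry,
toy_dir_changed() and toy_revalidate_negative() are all hypothetical
names, the "rwsem" merely counts readers, and the revision comparison is
an assumed stand-in for what kernfs_dir_changed() does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for root->kernfs_rwsem: it only counts readers. */
struct toy_rwsem {
	int readers;
};

/* Hypothetical stand-in for a kernfs directory node. */
struct toy_node {
	unsigned long rev;	/* bumped when the directory's children change */
};

/* Hypothetical stand-in for a dentry. */
struct toy_dentry {
	struct toy_dentry *parent;	/* assumed stable while negative */
	struct toy_node *node;		/* NULL when the dentry is negative */
	unsigned long cached_rev;	/* parent revision seen at lookup time */
};

static void toy_down_read(struct toy_rwsem *sem) { sem->readers++; }
static void toy_up_read(struct toy_rwsem *sem) { sem->readers--; }

/* Assumed model of kernfs_dir_changed(): compare the two revisions. */
static bool toy_dir_changed(const struct toy_node *parent,
			    const struct toy_dentry *dentry)
{
	return parent->rev != dentry->cached_rev;
}

/* Returns 0 to discard the negative dentry, 1 to keep it. */
static int toy_revalidate_negative(struct toy_rwsem *sem,
				   const struct toy_dentry *dentry)
{
	const struct toy_node *parent;

	assert(dentry != NULL && dentry->node == NULL);

	/* The read lock is taken first; no per-dentry lock is involved. */
	toy_down_read(sem);
	parent = dentry->parent ? dentry->parent->node : NULL;
	if (parent) {
		if (toy_dir_changed(parent, dentry)) {
			toy_up_read(sem);
			return 0;
		}
	}
	toy_up_read(sem);
	return 1;
}
```

In this model the parent lookup happens under the read lock, and both
exit paths release it, so the reader count returns to zero whether the
dentry is kept or discarded.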
  

Comments

Miklos Szeredi Oct. 24, 2022, 8:38 a.m. UTC | #1
On Tue, 18 Oct 2022 at 04:32, Ian Kent <raven@themaw.net> wrote:
>
> In kernfs_dop_revalidate(), when the passed-in dentry is negative, the
> dentry's parent directory is checked to see if it has changed and, if
> so, the negative dentry is discarded so it can be refreshed. During
> this check the dentry inode i_lock is taken to mitigate against a
> possible concurrent rename.
>
> But if it's racing with a rename then, because the dentry is negative,
> it can't be the source; it must be the target, and the rename must be
> going to do a d_move(), otherwise it will return an error.
>
> In this case the parent dentry of the target will not change; it will
> be the same over the d_move(). Only the source dentry's parent may
> change, so the inode i_lock isn't needed.
>
> Signed-off-by: Ian Kent <raven@themaw.net>

Reviewed-by: Miklos Szeredi <mszeredi@redhat.com>
  
Tejun Heo Oct. 31, 2022, 10:31 p.m. UTC | #2
On Tue, Oct 18, 2022 at 10:32:49AM +0800, Ian Kent wrote:
> In kernfs_dop_revalidate(), when the passed-in dentry is negative, the
> dentry's parent directory is checked to see if it has changed and, if
> so, the negative dentry is discarded so it can be refreshed. During
> this check the dentry inode i_lock is taken to mitigate against a
> possible concurrent rename.
>
> But if it's racing with a rename then, because the dentry is negative,
> it can't be the source; it must be the target, and the rename must be
> going to do a d_move(), otherwise it will return an error.
>
> In this case the parent dentry of the target will not change; it will
> be the same over the d_move(). Only the source dentry's parent may
> change, so the inode i_lock isn't needed.
> 
> Signed-off-by: Ian Kent <raven@themaw.net>

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.
  
Amir Goldstein Nov. 1, 2022, 7:46 a.m. UTC | #3
On Tue, Oct 18, 2022 at 5:58 AM Ian Kent <raven@themaw.net> wrote:
>
> In kernfs_dop_revalidate(), when the passed-in dentry is negative, the
> dentry's parent directory is checked to see if it has changed and, if
> so, the negative dentry is discarded so it can be refreshed. During
> this check the dentry inode i_lock is taken to mitigate against a
> possible concurrent rename.
>
> But if it's racing with a rename then, because the dentry is negative,
> it can't be the source; it must be the target, and the rename must be
> going to do a d_move(), otherwise it will return an error.
>
> In this case the parent dentry of the target will not change; it will
> be the same over the d_move(). Only the source dentry's parent may
> change, so the inode i_lock isn't needed.

You meant d_lock.
Same for the commit title.

Thanks,
Amir.
  
Ian Kent Nov. 1, 2022, 8:09 a.m. UTC | #4
On 1/11/22 15:46, Amir Goldstein wrote:
> On Tue, Oct 18, 2022 at 5:58 AM Ian Kent <raven@themaw.net> wrote:
>> In kernfs_dop_revalidate(), when the passed-in dentry is negative, the
>> dentry's parent directory is checked to see if it has changed and, if
>> so, the negative dentry is discarded so it can be refreshed. During
>> this check the dentry inode i_lock is taken to mitigate against a
>> possible concurrent rename.
>>
>> But if it's racing with a rename then, because the dentry is negative,
>> it can't be the source; it must be the target, and the rename must be
>> going to do a d_move(), otherwise it will return an error.
>>
>> In this case the parent dentry of the target will not change; it will
>> be the same over the d_move(). Only the source dentry's parent may
>> change, so the inode i_lock isn't needed.
> You meant d_lock.
> Same for the commit title.

Ha, well how do you like that, such an obvious mistake, how did I
not see it?

Not sure what to do about it now though ...

Any suggestions anyone?

Ian
  

Patch

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 3990f3e270cb..6acd9c3d4cff 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1073,20 +1073,30 @@  static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
 
 		/* If the kernfs parent node has changed discard and
 		 * proceed to ->lookup.
+		 *
+		 * There's nothing special needed here when getting the
+		 * dentry parent, even if a concurrent rename is in
+		 * progress. That's because the dentry is negative so
+		 * it can only be the target of the rename and it will
+		 * be doing a d_move() not a replace. Consequently the
+		 * dentry d_parent won't change over the d_move().
+		 *
+		 * Also kernfs negative dentries transitioning from
+		 * negative to positive during revalidate won't happen
+		 * because they are invalidated on containing directory
+		 * changes and the lookup re-done so that a new positive
+		 * dentry can be properly created.
 		 */
-		spin_lock(&dentry->d_lock);
+		root = kernfs_root_from_sb(dentry->d_sb);
+		down_read(&root->kernfs_rwsem);
 		parent = kernfs_dentry_node(dentry->d_parent);
 		if (parent) {
-			spin_unlock(&dentry->d_lock);
-			root = kernfs_root(parent);
-			down_read(&root->kernfs_rwsem);
 			if (kernfs_dir_changed(parent, dentry)) {
 				up_read(&root->kernfs_rwsem);
 				return 0;
 			}
-			up_read(&root->kernfs_rwsem);
-		} else
-			spin_unlock(&dentry->d_lock);
+		}
+		up_read(&root->kernfs_rwsem);
 
 		/* The kernfs parent node hasn't changed, leave the
 		 * dentry negative and return success.