[for-next,2/3] eventfs: Stop using dcache_readdir() for getdents()

Message ID 20240104164738.483305222@goodmis.org
State New
Headers
Series tracefs/eventfs: Updates for 6.8 |

Commit Message

Steven Rostedt Jan. 4, 2024, 4:47 p.m. UTC
  From: "Steven Rostedt (Google)" <rostedt@goodmis.org>

The eventfs creates dynamically allocated dentries and inodes. Using the
dcache_readdir() logic for its own directory lookups requires hiding the
cursor of the dcache logic and playing games to allow the dcache_readdir()
to still have access to the cursor while the eventfs saved what it created
and what it needs to release.

Instead, just have eventfs have its own iterate_shared callback function
that will fill in the dent entries. This simplifies the code quite a bit.

Link: https://lore.kernel.org/linux-trace-kernel/20240104015435.682218477@goodmis.org

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ajay Kaher <akaher@vmware.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
---
 fs/tracefs/event_inode.c | 194 +++++++++++++--------------------------
 1 file changed, 64 insertions(+), 130 deletions(-)
  

Comments

Linus Torvalds Jan. 4, 2024, 6:46 p.m. UTC | #1
On Thu, 4 Jan 2024 at 08:46, Steven Rostedt <rostedt@goodmis.org> wrote:
>
>         list_for_each_entry_srcu(ei_child, &ei->children, list,
>                                  srcu_read_lock_held(&eventfs_srcu)) {
> +
> +               if (c > 0) {
> +                       c--;
> +                       continue;
>                 }

Thanks for putting that at the top, I really do think it's not just
more efficient, but "more correct" too - ie if some entry that *used*
to exist and was previously counted by 'pos' went away, it's actually
*better* to count it again if we still see it, in order to not skip
subsequent entries that haven't been seen..

And that very fact actually makes me wonder:

>         for (i = 0; i < ei->nr_entries; i++) {
> +               void *cdata = ei->data;
> +
> +               if (c > 0) {
> +                       c--;
> +                       continue;
> +               }

The 'ei->nr_entries' things are in a stable array, so the indexing for
them cannot change (ie even if "is_freed" were to be set the array is
still stable).

So I wonder if - just from a 'pos' iterator stability standpoint - you
should change the tracefs directory iterator to always start with the
non-directory entries in ei->entries[]?

That way, even if concurrent dynamic add/remove events might change
the 'ei->children' list, it could never cause an 'ei->entry[]' to
disappear (or be returned twice).

This is very nitpicky and I doubt it matters, because I doubt the
whole "ls on a tracefs directory while changing it" case matters, but
I thought I'd mention it.

              Linus
  
Steven Rostedt Jan. 4, 2024, 7:02 p.m. UTC | #2
On Thu, 4 Jan 2024 10:46:04 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> And that very fact actually makes me wonder:
> 
> >         for (i = 0; i < ei->nr_entries; i++) {
> > +               void *cdata = ei->data;
> > +
> > +               if (c > 0) {
> > +                       c--;
> > +                       continue;
> > +               }  
> 
> The 'ei->nr_entries' things are in a stable array, so the indexing for
> them cannot change (ie even if "is_freed" were to be set the array is
> still stable).

Yeah, the entries is fixed. If the ei itself gets freed, so does all the
entries at the same time. The individual entries should too.

Hmm, it probably doesn't even make sense to continue the loop if is_freed
is set as the same variable will be checked every time. :-/  Should just be
a "break;" not a "continue;"

> 
> So I wonder if - just from a 'pos' iterator stability standpoint - you
> should change the tracefs directory iterator to always start with the
> non-directory entries in ei->entries[]?

Sure I can do that. I only did it this way because I followed the way the
other functions did the child directories first and then the files (the
entries array represents the files, as the eventfs_inode represents
directories).

> 
> That way, even if concurrent dynamic add/remove events might change
> the 'ei->children' list, it could never cause an 'ei->entry[]' to
> disappear (or be returned twice).
> 
> This is very nitpicky and I doubt it matters, because I doubt the
> whole "ls on a tracefs directory while changing it" case matters, but
> I thought I'd mention it.

I may add this update, but as a separate patch.

-- Steve
  
Steven Rostedt Jan. 4, 2024, 8:05 p.m. UTC | #3
On Thu, 4 Jan 2024 14:02:46 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> > And that very fact actually makes me wonder:
> >   
> > >         for (i = 0; i < ei->nr_entries; i++) {
> > > +               void *cdata = ei->data;
> > > +
> > > +               if (c > 0) {
> > > +                       c--;
> > > +                       continue;
> > > +               }    
> > 
> > The 'ei->nr_entries' things are in a stable array, so the indexing for
> > them cannot change (ie even if "is_freed" were to be set the array is
> > still stable).  
> 
> Yeah, the entries is fixed. If the ei itself gets freed, so does all the
> entries at the same time. The individual entries should too.
> 
> Hmm, it probably doesn't even make sense to continue the loop if is_freed
> is set as the same variable will be checked every time. :-/  Should just be
> a "break;" not a "continue;"

Also, I just realized it breaks if we update the 'c--' before the callback. :-/

I have to put this check *after* the callback check.

Reason being, the callback can say "this event doesn't get this file" and
return 0, which tells eventfs to skip this file.

For example, we have

 # ls /sys/kernel/tracing/events/ftrace/function
format  hist  hist_debug  id  inject

and

 # ls /sys/kernel/tracing/events/sched/sched_switch/
enable  filter  format  hist  hist_debug  id  inject  trigger

The "ftrace" event files are for information only. They describe the
internal events. For example, the "function" event is what is written into
the ring buffer by the function tracer. That event is not enabled by the
events directory. It is only enabled when "function" is written into
current_tracer.

Same for filtering. The filter logic for function events is done by the
"set_ftrace_filter" file in tracefs.

But the "sched_switch" event is enabled and filtered by the eventfs
directory. Here we see "enable" and "filter" files that the user could
write into to enabled and/or filter the sched_switch event.

Now because the "function" and "sched_switch" registered the same way, the
callback that handles the two has:

static int event_callback(const char *name, umode_t *mode, void **data,
			  const struct file_operations **fops)
{
	struct trace_event_file *file = *data;
	struct trace_event_call *call = file->event_call;

[..]
	if (!(call->flags & TRACE_EVENT_FL_IGNORE_ENABLE)) {
		if (call->class->reg && strcmp(name, "enable") == 0) {
			*mode = TRACE_MODE_WRITE;
			*fops = &ftrace_enable_fops;
			return 1;
		}

		if (strcmp(name, "filter") == 0) {
			*mode = TRACE_MODE_WRITE;
			*fops = &ftrace_event_filter_fops;
			return 1;
		}
	}
[..]
	return 0;
}

Where the function event has that "TRACE_EVENT_FL_IGNORE_ENABLE" flag set,
so when the entry for "enable" or "filter" is passed to it, it returns 0.

Before, where we had c-- after the callback, we had:

 # ls /sys/kernel/tracing/events/ftrace/function
format  hist  hist_debug  id  inject

After changing the c-- to before the callback I now get:

 # ls /sys/kernel/tracing/events/ftrace/function/
format  hist  hist  hist_debug  hist_debug  id  inject  inject

Where the missing "enable" and "filter" files caused the indexing to be
off, and the ls repeated "hist" and "hist_debug" because of it. Oh, and the
function event doesn't have "trigger" either which is why we get two
"inject" files.

I have to move the c-- update down. I can still do it before creating the
dentry though, as if that fails, we should just bail.

-- Steve
  
Linus Torvalds Jan. 4, 2024, 8:18 p.m. UTC | #4
On Thu, 4 Jan 2024 at 12:04, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> Also, I just realized it breaks if we update the 'c--' before the callback. :-/
>
> I have to put this check *after* the callback check.

What? No.

> Reason being, the callback can say "this event doesn't get this file" and
> return 0, which tells eventfs to skip this file.

So yes, there seems to be a bug there, in that ctx->pos is only
updated for successful callbacks (and not for "ignored entry").

But that just means that you should always update 'ctx->pos' as you
'continue' the loop.

The logical place to do that would be in the for-loop itself, which
actually is very natural for the simple case, ie you should just do

        for (i = 0; i < ei->nr_entries; i++, ctx->pos++) {

but in the list_for_each_entry_srcu() case the 'update' part of the
for-loop isn't actually accessible, so it would have to be at the
'continue' point(s).

Which is admittedly a bit annoying.

Looking at that I'm actually surprised that I don't recall that we'd
have hit that issue with our 'for_each_xyz()' loops before.

The update for our "for_each_xyz()" helpers are all hardcoded to just
do the "next iterator" thing, and there's no nice way to take
advantage of the normal for-loop semantics of "do this at the end of
the loop"

            Linus
  
Steven Rostedt Jan. 4, 2024, 8:33 p.m. UTC | #5
On Thu, 4 Jan 2024 12:18:06 -0800
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Thu, 4 Jan 2024 at 12:04, Steven Rostedt <rostedt@goodmis.org> wrote:
> >
> > Also, I just realized it breaks if we update the 'c--' before the callback. :-/
> >
> > I have to put this check *after* the callback check.  
> 
> What? No.
> 
> > Reason being, the callback can say "this event doesn't get this file" and
> > return 0, which tells eventfs to skip this file.  
> 
> So yes, there seems to be a bug there, in that ctx->pos is only
> updated for successful callbacks (and not for "ignored entry").

OK, I wasn't sure if it was OK to update the ctx->pos for something we
didn't add, so I avoided doing so.

> 
> But that just means that you should always update 'ctx->pos' as you
> 'continue' the loop.
> 
> The logical place to do that would be in the for-loop itself, which
> actually is very natural for the simple case, ie you should just do
> 
>         for (i = 0; i < ei->nr_entries; i++, ctx->pos++) {

Well, we don't want to do that and c-- at the same time. But of course, if
we do the shortcut, we can have:

	for (i = c; i < ei->nr_entries; i++, ctx->pos++) {

which would be OK. And better if we move it before the ei->children list walk.

> 
> but in the list_for_each_entry_srcu() case the 'update' part of the
> for-loop isn't actually accessible, so it would have to be at the
> 'continue' point(s).
> 
> Which is admittedly a bit annoying.

But not really an issue as we just have:

	list_for_each_entry_srcu(ei_child, &ei->children, list,
				 srcu_read_lock_held(&eventfs_srcu)) {

		if (c > 0) {
			c--;
			continue;
		}

		ctx->pos++;

> 
> Looking at that I'm actually surprised that I don't recall that we'd
> have hit that issue with our 'for_each_xyz()' loops before.
> 
> The update for our "for_each_xyz()" helpers are all hardcoded to just
> do the "next iterator" thing, and there's no nice way to take
> advantage of the normal for-loop semantics of "do this at the end of
> the loop"

Anyway, if I do count ctx->pos++ for every iteration, whether it added
something or not, it appears to work. I'll write up a couple of patches to
handle this.

Thanks,

-- Steve
  
Steven Rostedt Jan. 15, 2024, 8:58 p.m. UTC | #6
On Thu, 04 Jan 2024 11:47:05 -0500
Steven Rostedt <rostedt@goodmis.org> wrote:

> From: "Steven Rostedt (Google)" <rostedt@goodmis.org>
> 
> The eventfs creates dynamically allocated dentries and inodes. Using the
> dcache_readdir() logic for its own directory lookups requires hiding the
> cursor of the dcache logic and playing games to allow the dcache_readdir()
> to still have access to the cursor while the eventfs saved what it created
> and what it needs to release.
> 
> Instead, just have eventfs have its own iterate_shared callback function
> that will fill in the dent entries. This simplifies the code quite a bit.
> 
> Link: https://lore.kernel.org/linux-trace-kernel/20240104015435.682218477@goodmis.org

So now I know why Ajay did what he did with the
dcach_dir_open_wrapper():

  https://lore.kernel.org/all/202401152142.bfc28861-oliver.sang@intel.com/

Basically it crashed with:

> [   41.602625][ T4377] kernel BUG at fs/dcache.c:2031!
> [   41.602625][ T4374] RSP: 0018:ffa000000fcdfcd0 EFLAGS: 00010286
> [   41.602629][ T4374] RAX: 0000000000000002 RBX: ff11000109392980 RCX: 0000000000000000
> [   41.602630][ T4374] RDX: 0000000000000000 RSI: ff1100405e46c6f0 RDI: ff1100405f05afc0
> [   41.602631][ T4374] RBP: ff1100405f05afc0 R08: ffffffff830ad0e0 R09: 0000000000000000
> [   41.602632][ T4374] R10: 0000000000000280 R11: ffffffff8162036a R12: 0000000000000000
> [   41.602633][ T4374] R13: ff1100405e46c6f0 R14: ff1100405f05aff8 R15: 0000000000000000
> [   41.602634][ T4374] FS:  00007f2582ff9740(0000) GS:ff1100407fa80000(0000) knlGS:0000000000000000
> [   41.602635][ T4374] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   41.602635][ T4374] CR2: 00005624511f3328 CR3: 000000208a342006 CR4: 0000000000771ef0
> [   41.602636][ T4374] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [   41.602637][ T4374] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [   41.602638][ T4374] PKRU: 55555554
> [   41.602638][ T4374] Call Trace:
> [   41.602640][ T4374]  <TASK>
> [ 41.602642][ T4374] ? die (arch/x86/kernel/dumpstack.c:421 arch/x86/kernel/dumpstack.c:434 arch/x86/kernel/dumpstack.c:447) 
> [ 41.602644][ T4374] ? do_trap (arch/x86/kernel/traps.c:112 arch/x86/kernel/traps.c:153) 
> [ 41.602645][ T4374] ? d_instantiate (fs/dcache.c:2031 (discriminator 1)) 
> [ 41.602647][ T4374] ? do_error_trap (arch/x86/include/asm/traps.h:59 arch/x86/kernel/traps.c:174) 
> [ 41.602648][ T4374] ? d_instantiate (fs/dcache.c:2031 (discriminator 1)) 
> [ 41.602649][ T4374] ? exc_invalid_op (arch/x86/kernel/traps.c:265) 
> [ 41.602652][ T4374] ? d_instantiate (fs/dcache.c:2031 (discriminator 1)) 
> [ 41.602653][ T4374] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:568) 
> [ 41.602655][ T4374] ? tracefs_alloc_inode (fs/tracefs/inode.c:38) 
> [ 41.602657][ T4374] ? d_instantiate (fs/dcache.c:2031 (discriminator 1)) 
> [ 41.602659][ T4374] create_dir_dentry (fs/tracefs/event_inode.c:329 fs/tracefs/event_inode.c:516) 
> [ 41.602661][ T4374] eventfs_root_lookup (fs/tracefs/event_inode.c:611) 
> [ 41.602662][ T4374] ? terminate_walk (fs/namei.c:691) 
> [ 41.602665][ T4374] __lookup_slow (fs/namei.c:1694) 
> [ 41.602667][ T4374] lookup_one_len (fs/namei.c:2746 (discriminator 1)) 
> [ 41.602669][ T4374] eventfs_start_creating (fs/tracefs/inode.c:536) 
> [ 41.602671][ T4374] create_dir_dentry (fs/tracefs/event_inode.c:309 fs/tracefs/event_inode.c:516) 
> [ 41.602673][ T4374] eventfs_iterate (fs/tracefs/event_inode.c:701) 
> [ 41.602674][ T4374] ? atime_needs_update (fs/inode.c:1842 fs/inode.c:1994) 
> [ 41.602677][ T4374] iterate_dir (fs/readdir.c:106) 
> [ 41.602680][ T4374] __x64_sys_getdents (fs/readdir.c:323 fs/readdir.c:307 fs/readdir.c:307) 
> [ 41.602682][ T4374] ? __pfx_filldir (fs/readdir.c:260) 
> [ 41.602684][ T4374] do_syscall_64 (arch/x86/entry/common.c:51 arch/x86/entry/common.c:82) 
> [ 41.602686][ T4374] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129) 

I guess that's because I'm creating the inodes and dentries in the
iterare_shared() and not at open. The inodes and dentries are created
with a ref count of zero, as they should be freed when no longer used.
But I don't think the getdents() can handle that in the iterator part.
I do need to create the inodes/dentries in the open, up their ref count
and dec the ref counts at the release.

But I still do not need the dcache_readdir() code to do it. I just have
to create an array of inodes to use at the open (like the code did
before this update), and have the iterate_shared() use that array, and
then the release just decrement the inodes/dentries.

I'm off today, so I will likely fix that tomorrow.

-- Steve
  

Patch

diff --git a/fs/tracefs/event_inode.c b/fs/tracefs/event_inode.c
index c360300fb866..41af56f44f0a 100644
--- a/fs/tracefs/event_inode.c
+++ b/fs/tracefs/event_inode.c
@@ -52,9 +52,7 @@  enum {
 static struct dentry *eventfs_root_lookup(struct inode *dir,
 					  struct dentry *dentry,
 					  unsigned int flags);
-static int dcache_dir_open_wrapper(struct inode *inode, struct file *file);
-static int dcache_readdir_wrapper(struct file *file, struct dir_context *ctx);
-static int eventfs_release(struct inode *inode, struct file *file);
+static int eventfs_iterate(struct file *file, struct dir_context *ctx);
 
 static void update_attr(struct eventfs_attr *attr, struct iattr *iattr)
 {
@@ -148,11 +146,9 @@  static const struct inode_operations eventfs_file_inode_operations = {
 };
 
 static const struct file_operations eventfs_file_operations = {
-	.open		= dcache_dir_open_wrapper,
 	.read		= generic_read_dir,
-	.iterate_shared	= dcache_readdir_wrapper,
+	.iterate_shared	= eventfs_iterate,
 	.llseek		= generic_file_llseek,
-	.release	= eventfs_release,
 };
 
 /* Return the evenfs_inode of the "events" directory */
@@ -643,128 +639,87 @@  static struct dentry *eventfs_root_lookup(struct inode *dir,
 	return ret;
 }
 
-struct dentry_list {
-	void			*cursor;
-	struct dentry		**dentries;
-};
-
-/**
- * eventfs_release - called to release eventfs file/dir
- * @inode: inode to be released
- * @file: file to be released (not used)
- */
-static int eventfs_release(struct inode *inode, struct file *file)
-{
-	struct tracefs_inode *ti;
-	struct dentry_list *dlist = file->private_data;
-	void *cursor;
-	int i;
-
-	ti = get_tracefs(inode);
-	if (!(ti->flags & TRACEFS_EVENT_INODE))
-		return -EINVAL;
-
-	if (WARN_ON_ONCE(!dlist))
-		return -EINVAL;
-
-	for (i = 0; dlist->dentries && dlist->dentries[i]; i++) {
-		dput(dlist->dentries[i]);
-	}
-
-	cursor = dlist->cursor;
-	kfree(dlist->dentries);
-	kfree(dlist);
-	file->private_data = cursor;
-	return dcache_dir_close(inode, file);
-}
-
-static int add_dentries(struct dentry ***dentries, struct dentry *d, int cnt)
-{
-	struct dentry **tmp;
-
-	tmp = krealloc(*dentries, sizeof(d) * (cnt + 2), GFP_NOFS);
-	if (!tmp)
-		return -1;
-	tmp[cnt] = d;
-	tmp[cnt + 1] = NULL;
-	*dentries = tmp;
-	return 0;
-}
-
-/**
- * dcache_dir_open_wrapper - eventfs open wrapper
- * @inode: not used
- * @file: dir to be opened (to create it's children)
- *
- * Used to dynamic create file/dir with-in @file, all the
- * file/dir will be created. If already created then references
- * will be increased
+/*
+ * Walk the children of a eventfs_inode to fill in getdents().
  */
-static int dcache_dir_open_wrapper(struct inode *inode, struct file *file)
+static int eventfs_iterate(struct file *file, struct dir_context *ctx)
 {
 	const struct file_operations *fops;
+	struct inode *f_inode = file_inode(file);
 	const struct eventfs_entry *entry;
 	struct eventfs_inode *ei_child;
 	struct tracefs_inode *ti;
 	struct eventfs_inode *ei;
-	struct dentry_list *dlist;
-	struct dentry **dentries = NULL;
-	struct dentry *parent = file_dentry(file);
-	struct dentry *d;
-	struct inode *f_inode = file_inode(file);
-	const char *name = parent->d_name.name;
+	struct dentry *ei_dentry = NULL;
+	struct dentry *dentry;
+	const char *name;
 	umode_t mode;
-	void *data;
-	int cnt = 0;
 	int idx;
-	int ret;
-	int i;
-	int r;
+	int ret = -EINVAL;
+	int ino;
+	int i, r, c;
+
+	if (!dir_emit_dots(file, ctx))
+		return 0;
 
 	ti = get_tracefs(f_inode);
 	if (!(ti->flags & TRACEFS_EVENT_INODE))
 		return -EINVAL;
 
-	if (WARN_ON_ONCE(file->private_data))
-		return -EINVAL;
+	c = ctx->pos - 2;
 
 	idx = srcu_read_lock(&eventfs_srcu);
 
 	mutex_lock(&eventfs_mutex);
 	ei = READ_ONCE(ti->private);
+	if (ei && !ei->is_freed)
+		ei_dentry = READ_ONCE(ei->dentry);
 	mutex_unlock(&eventfs_mutex);
 
-	if (!ei) {
-		srcu_read_unlock(&eventfs_srcu, idx);
-		return -EINVAL;
-	}
-
-
-	data = ei->data;
+	if (!ei || !ei_dentry)
+		goto out;
 
-	dlist = kmalloc(sizeof(*dlist), GFP_KERNEL);
-	if (!dlist) {
-		srcu_read_unlock(&eventfs_srcu, idx);
-		return -ENOMEM;
-	}
+	ret = 0;
 
-	inode_lock(parent->d_inode);
+	/*
+	 * Need to create the dentries and inodes to have a consistent
+	 * inode number.
+	 */
 	list_for_each_entry_srcu(ei_child, &ei->children, list,
 				 srcu_read_lock_held(&eventfs_srcu)) {
-		d = create_dir_dentry(ei, ei_child, parent);
-		if (d) {
-			ret = add_dentries(&dentries, d, cnt);
-			dput(d);
-			if (ret < 0)
-				break;
-			cnt++;
+
+		if (c > 0) {
+			c--;
+			continue;
 		}
+
+		if (ei_child->is_freed)
+			continue;
+
+		name = ei_child->name;
+
+		dentry = create_dir_dentry(ei, ei_child, ei_dentry);
+		if (!dentry)
+			goto out;
+		ino = dentry->d_inode->i_ino;
+		dput(dentry);
+
+		if (!dir_emit(ctx, name, strlen(name), ino, DT_DIR))
+			goto out;
+		ctx->pos++;
 	}
 
 	for (i = 0; i < ei->nr_entries; i++) {
-		void *cdata = data;
+		void *cdata = ei->data;
+
+		if (c > 0) {
+			c--;
+			continue;
+		}
+
 		entry = &ei->entries[i];
 		name = entry->name;
+
 		mutex_lock(&eventfs_mutex);
 		/* If ei->is_freed, then the event itself may be too */
 		if (!ei->is_freed)
@@ -774,42 +729,21 @@  static int dcache_dir_open_wrapper(struct inode *inode, struct file *file)
 		mutex_unlock(&eventfs_mutex);
 		if (r <= 0)
 			continue;
-		d = create_file_dentry(ei, i, parent, name, mode, cdata, fops);
-		if (d) {
-			ret = add_dentries(&dentries, d, cnt);
-			dput(d);
-			if (ret < 0)
-				break;
-			cnt++;
-		}
-	}
-	inode_unlock(parent->d_inode);
-	srcu_read_unlock(&eventfs_srcu, idx);
-	ret = dcache_dir_open(inode, file);
 
-	/*
-	 * dcache_dir_open() sets file->private_data to a dentry cursor.
-	 * Need to save that but also save all the dentries that were
-	 * opened by this function.
-	 */
-	dlist->cursor = file->private_data;
-	dlist->dentries = dentries;
-	file->private_data = dlist;
-	return ret;
-}
+		dentry = create_file_dentry(ei, i, ei_dentry, name, mode, cdata, fops);
+		if (!dentry)
+			goto out;
+		ino = dentry->d_inode->i_ino;
+		dput(dentry);
 
-/*
- * This just sets the file->private_data back to the cursor and back.
- */
-static int dcache_readdir_wrapper(struct file *file, struct dir_context *ctx)
-{
-	struct dentry_list *dlist = file->private_data;
-	int ret;
+		if (!dir_emit(ctx, name, strlen(name), ino, DT_REG))
+			goto out;
+		ctx->pos++;
+	}
+	ret = 1;
+ out:
+	srcu_read_unlock(&eventfs_srcu, idx);
 
-	file->private_data = dlist->cursor;
-	ret = dcache_readdir(file, ctx);
-	dlist->cursor = file->private_data;
-	file->private_data = dlist;
 	return ret;
 }