[GIT,PULL] vfs fixes

Message ID 20231019-kampfsport-metapher-e5211d7be247@brauner
State New
Headers
Series [GIT,PULL] vfs fixes |

Pull-request

git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/v6.6-rc7.vfs.fixes

Message

Christian Brauner Oct. 19, 2023, 10:07 a.m. UTC
  Hey Linus,

/* Summary */
An openat() call from io_uring triggering an audit call can apparently
cause the refcount of struct filename to be incremented from multiple
threads concurrently during async execution, triggering a refcount
underflow and hitting a BUG_ON(). That bug has been lurking around since
at least v5.16 apparently.

Switch to an atomic counter to fix that. The underflow check is
downgraded from a BUG_ON() to a WARN_ON_ONCE() but we could easily
remove that check altogether tbh and not waste an additional atomic. So
if you feel that extra check isn't needed you could just remove in case
you're pulling.

/* Testing */
clang: Ubuntu clang version 15.0.7
gcc: (Ubuntu 12.2.0-3ubuntu1) 12.2.0

All patches are based on v6.6-rc6 and have been sitting in linux-next.
No build failures or warnings were observed.

/* Conflicts */
At the time of creating this PR no merge conflicts were reported from
linux-next and no merge conflicts showed up doing a test-merge with
current mainline.

The following changes since commit 94f6f0550c625fab1f373bb86a6669b45e9748b3:

  Linux 6.6-rc5 (2023-10-08 13:49:43 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/v6.6-rc7.vfs.fixes

for you to fetch changes up to 03adc61edad49e1bbecfb53f7ea5d78f398fe368:

  audit,io_uring: io_uring openat triggers audit reference count underflow (2023-10-13 18:34:46 +0200)

Please consider pulling these changes from the signed v6.6-rc7.vfs.fixes tag.

Thanks!
Christian

----------------------------------------------------------------
v6.6-rc7.vfs.fixes

----------------------------------------------------------------
Dan Clash (1):
      audit,io_uring: io_uring openat triggers audit reference count underflow

 fs/namei.c         | 9 +++++----
 include/linux/fs.h | 2 +-
 kernel/auditsc.c   | 8 ++++----
 3 files changed, 10 insertions(+), 9 deletions(-)
  

Comments

Linus Torvalds Oct. 19, 2023, 4:37 p.m. UTC | #1
On Thu, 19 Oct 2023 at 03:09, Christian Brauner <brauner@kernel.org> wrote:
>
> An openat() call from io_uring triggering an audit call can apparently
> cause the refcount of struct filename to be incremented from multiple
> threads concurrently during async execution, triggering a refcount
> underflow and hitting a BUG_ON(). That bug has been lurking around since
> at least v5.16 apparently.

Ouch. That filename ref by audit was always supposed to be
thread-local in a "for this system call" kind of sense.

But yes, looks like the io_uring stuff ended up making it no longer
thread-local.

That said, using atomics for reference counting is our default
behavior and should be normal, so the patch isn't wrong, it's just
annoying since getname/putname is very much in the critical path of
filename handling.

That said, the extra atomics are hopefully not really noticeable.

Some people might want to use the non-refcounted version (ie we have
getname/putname used by ksmbd too, for example), if they really care.

It already exists, as __getname/__putname.

But the normal open/stat/etc system call paths are obviously now going
to hit those extra atomics. Not lovely, but I guess it's the best we
can do.

> Switch to an atomic counter to fix that. The underflow check is
> downgraded from a BUG_ON() to a WARN_ON_ONCE() but we could easily
> remove that check altogether tbh and not waste an additional atomic. So
> if you feel that extra check isn't needed you could just remove in case
> you're pulling.

Well, the atomic *read* is cheap - the expensive part is the
atomic_dec_and_test() (and the atomic_inc in the audit code.

I'm not sure why you made it check just for zero in the WARN_ON_ONCE,
rather than <= 0 as it used to, but that check is racy regardless, so
it doesn't matter. It would miss two concurrent decrements coming in
with a count of 1.

We don't have the ternary test of atomic decrement results (positive,
zero or negative), so it is what it is.

                 Linus
  
pr-tracker-bot@kernel.org Oct. 19, 2023, 6:36 p.m. UTC | #2
The pull request you sent on Thu, 19 Oct 2023 12:07:08 +0200:

> git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/v6.6-rc7.vfs.fixes

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/ea1cc20cd4ce55dd920a87a317c43da03ccea192

Thank you!
  
Christian Brauner Oct. 20, 2023, 11:14 a.m. UTC | #3
> Ouch. That filename ref by audit was always supposed to be
> thread-local in a "for this system call" kind of sense.

Yeah, I wasn't happy when that bug showed up.

> That said, using atomics for reference counting is our default
> behavior and should be normal, so the patch isn't wrong, it's just
> annoying since getname/putname is very much in the critical path of
> filename handling.

Yeah.

> That said, the extra atomics are hopefully not really noticeable.
> 
> Some people might want to use the non-refcounted version (ie we have
> getname/putname used by ksmbd too, for example), if they really care.
> 
> It already exists, as __getname/__putname.
> 
> But the normal open/stat/etc system call paths are obviously now going
> to hit those extra atomics. Not lovely, but I guess it's the best we
> can do.

I didn't spend too much time on this issue because it's -rc7 and the
straightforward seemed ok, if annoying.

But if we really really really really cared we could probably do a
deranged thing and massage both audit and io_uring to allows us to
separate the regular system call getname from the io_uring getname. But
I think that would be ugly and likely error prone.