[v3,2/5] fs: Add fchmodat4()

Message ID d11b93ad8e3b669afaff942e25c3fca65c6a983c.1689074739.git.legion@kernel.org
State New
Headers
Series Add a new fchmodat4() syscall |

Commit Message

Alexey Gladkov July 11, 2023, 11:25 a.m. UTC
  From: Palmer Dabbelt <palmer@sifive.com>

On the userspace side fchmodat(3) is implemented as a wrapper
function which implements the POSIX-specified interface. This
interface differs from the underlying kernel system call, which does not
have a flags argument. Most implementations require procfs [1][2].

There doesn't appear to be a good userspace workaround for this issue
but the implementation in the kernel is pretty straight-forward.

The new fchmodat4() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
unlike existing fchmodat.

[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
[2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Alexey Gladkov <legion@kernel.org>
---
 fs/open.c                | 18 ++++++++++++++----
 include/linux/syscalls.h |  2 ++
 2 files changed, 16 insertions(+), 4 deletions(-)
  

Comments

Arnd Bergmann July 11, 2023, 11:42 a.m. UTC | #1
On Tue, Jul 11, 2023, at 13:25, Alexey Gladkov wrote:
> From: Palmer Dabbelt <palmer@sifive.com>
>
> On the userspace side fchmodat(3) is implemented as a wrapper
> function which implements the POSIX-specified interface. This
> interface differs from the underlying kernel system call, which does not
> have a flags argument. Most implementations require procfs [1][2].
>
> There doesn't appear to be a good userspace workaround for this issue
> but the implementation in the kernel is pretty straight-forward.
>
> The new fchmodat4() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> unlike existing fchmodat.
>
> [1] 
> https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> [2] 
> https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
>
> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Alexey Gladkov <legion@kernel.org>

I don't know the history of why we ended up with the different
interface, or whether this was done intentionally in the kernel
or if we want this syscall.

Assuming this is in fact needed, I double-checked that the
implementation looks correct to me and is portable to all the
architectures, without the need for a compat wrapper.

Acked-by: Arnd Bergmann <arnd@arndb.de>
  
Christian Brauner July 11, 2023, 11:52 a.m. UTC | #2
On Tue, Jul 11, 2023 at 01:42:19PM +0200, Arnd Bergmann wrote:
> On Tue, Jul 11, 2023, at 13:25, Alexey Gladkov wrote:
> > From: Palmer Dabbelt <palmer@sifive.com>
> >
> > On the userspace side fchmodat(3) is implemented as a wrapper
> > function which implements the POSIX-specified interface. This
> > interface differs from the underlying kernel system call, which does not
> > have a flags argument. Most implementations require procfs [1][2].
> >
> > There doesn't appear to be a good userspace workaround for this issue
> > but the implementation in the kernel is pretty straight-forward.
> >
> > The new fchmodat4() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> > unlike existing fchmodat.
> >
> > [1] 
> > https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> > [2] 
> > https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> >
> > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> 
> I don't know the history of why we ended up with the different
> interface, or whether this was done intentionally in the kernel
> or if we want this syscall.
> 
> Assuming this is in fact needed, I double-checked that the
> implementation looks correct to me and is portable to all the
> architectures, without the need for a compat wrapper.
> 
> Acked-by: Arnd Bergmann <arnd@arndb.de>

The system call itself is useful afaict. But please,

s/fchmodat4/fchmodat2/

With very few exceptions we don't version by argument number but by
revision and we should stick to one scheme:

openat()->openat2()
eventfd()->eventfd2()
clone()/clone2()->clone3()
dup()->dup2()->dup3() // coincides with nr of arguments
pipe()->pipe2() // coincides with nr of arguments
renameat()->renameat2()
  
Matthew Wilcox July 11, 2023, 12:28 p.m. UTC | #3
On Tue, Jul 11, 2023 at 01:25:43PM +0200, Alexey Gladkov wrote:
> -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> +static int do_fchmodat4(int dfd, const char __user *filename, umode_t mode, int lookup_flags)

This function can still be called do_fchmodat(); we don't need to
version internal functions.
  
Alexey Gladkov July 11, 2023, 12:49 p.m. UTC | #4
On Tue, Jul 11, 2023 at 01:28:04PM +0100, Matthew Wilcox wrote:
> On Tue, Jul 11, 2023 at 01:25:43PM +0200, Alexey Gladkov wrote:
> > -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> > +static int do_fchmodat4(int dfd, const char __user *filename, umode_t mode, int lookup_flags)
> 
> This function can still be called do_fchmodat(); we don't need to
> version internal functions.

Yes. I tried not to change too much when adopting a patch. In the new
version, I will return the old name. Thanks.
  
Alexey Gladkov July 11, 2023, 12:51 p.m. UTC | #5
On Tue, Jul 11, 2023 at 01:52:01PM +0200, Christian Brauner wrote:
> On Tue, Jul 11, 2023 at 01:42:19PM +0200, Arnd Bergmann wrote:
> > On Tue, Jul 11, 2023, at 13:25, Alexey Gladkov wrote:
> > > From: Palmer Dabbelt <palmer@sifive.com>
> > >
> > > On the userspace side fchmodat(3) is implemented as a wrapper
> > > function which implements the POSIX-specified interface. This
> > > interface differs from the underlying kernel system call, which does not
> > > have a flags argument. Most implementations require procfs [1][2].
> > >
> > > There doesn't appear to be a good userspace workaround for this issue
> > > but the implementation in the kernel is pretty straight-forward.
> > >
> > > The new fchmodat4() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> > > unlike existing fchmodat.
> > >
> > > [1] 
> > > https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> > > [2] 
> > > https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> > >
> > > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > 
> > I don't know the history of why we ended up with the different
> > interface, or whether this was done intentionally in the kernel
> > or if we want this syscall.
> > 
> > Assuming this is in fact needed, I double-checked that the
> > implementation looks correct to me and is portable to all the
> > architectures, without the need for a compat wrapper.
> > 
> > Acked-by: Arnd Bergmann <arnd@arndb.de>
> 
> The system call itself is useful afaict. But please,
> 
> s/fchmodat4/fchmodat2/

Sure. I will.

> With very few exceptions we don't version by argument number but by
> revision and we should stick to one scheme:
> 
> openat()->openat2()
> eventfd()->eventfd2()
> clone()/clone2()->clone3()
> dup()->dup2()->dup3() // coincides with nr of arguments
> pipe()->pipe2() // coincides with nr of arguments
> renameat()->renameat2()
>
  
Christian Brauner July 11, 2023, 2:01 p.m. UTC | #6
On Tue, Jul 11, 2023 at 02:51:01PM +0200, Alexey Gladkov wrote:
> On Tue, Jul 11, 2023 at 01:52:01PM +0200, Christian Brauner wrote:
> > On Tue, Jul 11, 2023 at 01:42:19PM +0200, Arnd Bergmann wrote:
> > > On Tue, Jul 11, 2023, at 13:25, Alexey Gladkov wrote:
> > > > From: Palmer Dabbelt <palmer@sifive.com>
> > > >
> > > > On the userspace side fchmodat(3) is implemented as a wrapper
> > > > function which implements the POSIX-specified interface. This
> > > > interface differs from the underlying kernel system call, which does not
> > > > have a flags argument. Most implementations require procfs [1][2].
> > > >
> > > > There doesn't appear to be a good userspace workaround for this issue
> > > > but the implementation in the kernel is pretty straight-forward.
> > > >
> > > > The new fchmodat4() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> > > > unlike existing fchmodat.
> > > >
> > > > [1] 
> > > > https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> > > > [2] 
> > > > https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> > > >
> > > > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > > > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > > 
> > > I don't know the history of why we ended up with the different
> > > interface, or whether this was done intentionally in the kernel
> > > or if we want this syscall.
> > > 
> > > Assuming this is in fact needed, I double-checked that the
> > > implementation looks correct to me and is portable to all the
> > > architectures, without the need for a compat wrapper.
> > > 
> > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > 
> > The system call itself is useful afaict. But please,
> > 
> > s/fchmodat4/fchmodat2/
> 
> Sure. I will.

Thanks. Can you also wire this up for every architecture, please?
I don't see that this has been done in this series.
  
Alexey Gladkov July 11, 2023, 3:23 p.m. UTC | #7
On Tue, Jul 11, 2023 at 04:01:03PM +0200, Christian Brauner wrote:
> On Tue, Jul 11, 2023 at 02:51:01PM +0200, Alexey Gladkov wrote:
> > On Tue, Jul 11, 2023 at 01:52:01PM +0200, Christian Brauner wrote:
> > > On Tue, Jul 11, 2023 at 01:42:19PM +0200, Arnd Bergmann wrote:
> > > > On Tue, Jul 11, 2023, at 13:25, Alexey Gladkov wrote:
> > > > > From: Palmer Dabbelt <palmer@sifive.com>
> > > > >
> > > > > On the userspace side fchmodat(3) is implemented as a wrapper
> > > > > function which implements the POSIX-specified interface. This
> > > > > interface differs from the underlying kernel system call, which does not
> > > > > have a flags argument. Most implementations require procfs [1][2].
> > > > >
> > > > > There doesn't appear to be a good userspace workaround for this issue
> > > > > but the implementation in the kernel is pretty straight-forward.
> > > > >
> > > > > The new fchmodat4() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> > > > > unlike existing fchmodat.
> > > > >
> > > > > [1] 
> > > > > https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> > > > > [2] 
> > > > > https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> > > > >
> > > > > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > > > > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > > > 
> > > > I don't know the history of why we ended up with the different
> > > > interface, or whether this was done intentionally in the kernel
> > > > or if we want this syscall.
> > > > 
> > > > Assuming this is in fact needed, I double-checked that the
> > > > implementation looks correct to me and is portable to all the
> > > > architectures, without the need for a compat wrapper.
> > > > 
> > > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > > 
> > > The system call itself is useful afaict. But please,
> > > 
> > > s/fchmodat4/fchmodat2/
> > 
> > Sure. I will.
> 
> Thanks. Can you also wire this up for every architecture, please?
> I don't see that this has been done in this series.

Sure. I have already added in all architectures as far as I can tell:

$ diff -s <(find arch/ -name '*.tbl' |sort -u) <(git grep -lw fchmodat2 arch/ |sort -u)
Files /dev/fd/63 and /dev/fd/62 are identical
  

Patch

diff --git a/fs/open.c b/fs/open.c
index 4478adcc4f3a..58bb88c6afb6 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -671,11 +671,11 @@  SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
 	return err;
 }
 
-static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
+static int do_fchmodat4(int dfd, const char __user *filename, umode_t mode, int lookup_flags)
 {
 	struct path path;
 	int error;
-	unsigned int lookup_flags = LOOKUP_FOLLOW;
+
 retry:
 	error = user_path_at(dfd, filename, lookup_flags, &path);
 	if (!error) {
@@ -689,15 +689,25 @@  static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
 	return error;
 }
 
+SYSCALL_DEFINE4(fchmodat4, int, dfd, const char __user *, filename,
+		umode_t, mode, int, flags)
+{
+	if (unlikely(flags & ~AT_SYMLINK_NOFOLLOW))
+		return -EINVAL;
+
+	return do_fchmodat4(dfd, filename, mode,
+			flags & AT_SYMLINK_NOFOLLOW ? 0 : LOOKUP_FOLLOW);
+}
+
 SYSCALL_DEFINE3(fchmodat, int, dfd, const char __user *, filename,
 		umode_t, mode)
 {
-	return do_fchmodat(dfd, filename, mode);
+	return do_fchmodat4(dfd, filename, mode, LOOKUP_FOLLOW);
 }
 
 SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
 {
-	return do_fchmodat(AT_FDCWD, filename, mode);
+	return do_fchmodat4(AT_FDCWD, filename, mode, LOOKUP_FOLLOW);
 }
 
 /**
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 497bdd968c32..b17d37d2bad6 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -466,6 +466,8 @@  asmlinkage long sys_chroot(const char __user *filename);
 asmlinkage long sys_fchmod(unsigned int fd, umode_t mode);
 asmlinkage long sys_fchmodat(int dfd, const char __user *filename,
 			     umode_t mode);
+asmlinkage long sys_fchmodat4(int dfd, const char __user *filename,
+			     umode_t mode, int flags);
 asmlinkage long sys_fchownat(int dfd, const char __user *filename, uid_t user,
 			     gid_t group, int flag);
 asmlinkage long sys_fchown(unsigned int fd, uid_t user, gid_t group);