Message ID | f2a846ef495943c5d101011eebcf01179d0c7b61.1689092120.git.legion@kernel.org |
---|---|
State | New |
Headers | From: Alexey Gladkov <legion@kernel.org> To: LKML <linux-kernel@vger.kernel.org>, Arnd Bergmann <arnd@arndb.de>, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk Cc: James.Bottomley@HansenPartnership.com, acme@kernel.org, alexander.shishkin@linux.intel.com, axboe@kernel.dk, benh@kernel.crashing.org, borntraeger@de.ibm.com, bp@alien8.de, catalin.marinas@arm.com, christian@brauner.io, dalias@libc.org, davem@davemloft.net, deepa.kernel@gmail.com, deller@gmx.de, dhowells@redhat.com, fenghua.yu@intel.com, fweimer@redhat.com, geert@linux-m68k.org, glebfm@altlinux.org, gor@linux.ibm.com, hare@suse.com, hpa@zytor.com, ink@jurassic.park.msu.ru, jhogan@kernel.org, kim.phillips@arm.com, ldv@altlinux.org, linux-alpha@vger.kernel.org, linux-arch@vger.kernel.org, linux-ia64@vger.kernel.org, linux-m68k@lists.linux-m68k.org, linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, linux@armlinux.org.uk, linuxppc-dev@lists.ozlabs.org, luto@kernel.org, mattst88@gmail.com, mingo@redhat.com, monstr@monstr.eu, mpe@ellerman.id.au, namhyung@kernel.org, paulus@samba.org, peterz@infradead.org, ralf@linux-mips.org, sparclinux@vger.kernel.org, stefan@agner.ch, tglx@linutronix.de, tony.luck@intel.com, tycho@tycho.ws, will@kernel.org, x86@kernel.org, ysato@users.sourceforge.jp, Palmer Dabbelt <palmer@sifive.com> Subject: [PATCH v4 2/5] fs: Add fchmodat2() Date: Tue, 11 Jul 2023 18:16:04 +0200 Message-Id: <f2a846ef495943c5d101011eebcf01179d0c7b61.1689092120.git.legion@kernel.org> In-Reply-To: <cover.1689092120.git.legion@kernel.org> References: <cover.1689074739.git.legion@kernel.org> <cover.1689092120.git.legion@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org |
Series | Add a new fchmodat2() syscall | |
Commit Message
Alexey Gladkov
July 11, 2023, 4:16 p.m. UTC
On the userspace side fchmodat(3) is implemented as a wrapper
function which implements the POSIX-specified interface. This
interface differs from the underlying kernel system call, which does not
have a flags argument. Most implementations require procfs [1][2].

There doesn't appear to be a good userspace workaround for this issue
but the implementation in the kernel is pretty straight-forward.

The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
unlike existing fchmodat.

[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
[2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28

Co-developed-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Alexey Gladkov <legion@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
---
 fs/open.c                | 18 ++++++++++++++----
 include/linux/syscalls.h |  2 ++
 2 files changed, 16 insertions(+), 4 deletions(-)
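For readers who want to try the new call from userspace, the following is a minimal sketch, not part of the patch. It assumes a libc that may not yet provide a wrapper, so it goes through syscall(2). The helper name `chmod_nofollow` is hypothetical, and the fallback number 452 is an assumption based on the number fchmodat2 was assigned on x86-64 and in the generic syscall table; other architectures may differ.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>        /* AT_FDCWD, AT_SYMLINK_NOFOLLOW */
#include <sys/stat.h>     /* mode_t */
#include <sys/syscall.h>  /* SYS_* numbers */
#include <unistd.h>       /* syscall() */

/*
 * Assumption: older headers do not define SYS_fchmodat2. 452 is the
 * number used on x86-64 and in the generic table; check your arch.
 */
#ifndef SYS_fchmodat2
#define SYS_fchmodat2 452
#endif

/*
 * Hypothetical helper: change the mode of `path` (relative to `dirfd`)
 * without following a trailing symlink. Returns 0 on success, or -1
 * with errno set (ENOSYS on kernels that lack the syscall).
 */
static int chmod_nofollow(int dirfd, const char *path, mode_t mode)
{
	return (int)syscall(SYS_fchmodat2, dirfd, path, mode,
			    AT_SYMLINK_NOFOLLOW);
}
```

A caller would typically treat ENOSYS as the signal to fall back to whatever emulation it used before, for example `chmod_nofollow(AT_FDCWD, "some/path", 0644)` followed by an errno check.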
Comments
On Tue, Jul 11, 2023 at 06:16:04PM +0200, Alexey Gladkov wrote:
> On the userspace side fchmodat(3) is implemented as a wrapper
> function which implements the POSIX-specified interface. This
> interface differs from the underlying kernel system call, which does not
> have a flags argument. Most implementations require procfs [1][2].
> 
> There doesn't appear to be a good userspace workaround for this issue
> but the implementation in the kernel is pretty straight-forward.
> 
> The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> unlike existing fchmodat.
> 
> [1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> [2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> 
> Co-developed-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Alexey Gladkov <legion@kernel.org>
> Acked-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  fs/open.c                | 18 ++++++++++++++----
>  include/linux/syscalls.h |  2 ++
>  2 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/open.c b/fs/open.c
> index 0c55c8e7f837..39a7939f0d00 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -671,11 +671,11 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
>  	return err;
>  }
>  
> -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> +static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int lookup_flags)

Should all be unsigned instead of int here for flags. We also had a
documentation update to that effect but smh never sent it.
user_path_at() itself takes an unsigned as well.

I'll fix that up though.
On 2023-07-11, Alexey Gladkov <legion@kernel.org> wrote:
> On the userspace side fchmodat(3) is implemented as a wrapper
> function which implements the POSIX-specified interface. This
> interface differs from the underlying kernel system call, which does not
> have a flags argument. Most implementations require procfs [1][2].
> 
> There doesn't appear to be a good userspace workaround for this issue
> but the implementation in the kernel is pretty straight-forward.
> 
> The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> unlike existing fchmodat.
> 
> [1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> [2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> 
> Co-developed-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Alexey Gladkov <legion@kernel.org>
> Acked-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  fs/open.c                | 18 ++++++++++++++----
>  include/linux/syscalls.h |  2 ++
>  2 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/open.c b/fs/open.c
> index 0c55c8e7f837..39a7939f0d00 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -671,11 +671,11 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
>  	return err;
>  }
>  
> -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> +static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int lookup_flags)

I think it'd be much neater to do the conversion of AT_ flags here and
pass 0 as a flags argument for all of the wrappers (this is how most of
the other xyz(), fxyz(), fxyzat() syscall wrappers are done IIRC).

>  {
>  	struct path path;
>  	int error;
> -	unsigned int lookup_flags = LOOKUP_FOLLOW;
> +
>  retry:
>  	error = user_path_at(dfd, filename, lookup_flags, &path);
>  	if (!error) {
> @@ -689,15 +689,25 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
>  	return error;
>  }
>  
> +SYSCALL_DEFINE4(fchmodat2, int, dfd, const char __user *, filename,
> +		umode_t, mode, int, flags)
> +{
> +	if (unlikely(flags & ~AT_SYMLINK_NOFOLLOW))
> +		return -EINVAL;

We almost certainly want to support AT_EMPTY_PATH at the same time.
Otherwise userspace will still need to go through /proc when trying to
chmod a file handle they have.

> +
> +	return do_fchmodat(dfd, filename, mode,
> +			flags & AT_SYMLINK_NOFOLLOW ? 0 : LOOKUP_FOLLOW);
> +}
> +
>  SYSCALL_DEFINE3(fchmodat, int, dfd, const char __user *, filename,
>  		umode_t, mode)
>  {
> -	return do_fchmodat(dfd, filename, mode);
> +	return do_fchmodat(dfd, filename, mode, LOOKUP_FOLLOW);
>  }
>  
>  SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
>  {
> -	return do_fchmodat(AT_FDCWD, filename, mode);
> +	return do_fchmodat(AT_FDCWD, filename, mode, LOOKUP_FOLLOW);
>  }
>  
>  /*
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 584f404bf868..6e852279fbc3 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -440,6 +440,8 @@ asmlinkage long sys_chroot(const char __user *filename);
>  asmlinkage long sys_fchmod(unsigned int fd, umode_t mode);
>  asmlinkage long sys_fchmodat(int dfd, const char __user *filename,
>  			     umode_t mode);
> +asmlinkage long sys_fchmodat2(int dfd, const char __user *filename,
> +			     umode_t mode, int flags);
>  asmlinkage long sys_fchownat(int dfd, const char __user *filename, uid_t user,
>  			     gid_t group, int flag);
>  asmlinkage long sys_fchown(unsigned int fd, uid_t user, gid_t group);
> -- 
> 2.33.8
> 
On Wed, Jul 26, 2023 at 02:36:25AM +1000, Aleksa Sarai wrote:
> On 2023-07-11, Alexey Gladkov <legion@kernel.org> wrote:
> > On the userspace side fchmodat(3) is implemented as a wrapper
> > function which implements the POSIX-specified interface. This
> > interface differs from the underlying kernel system call, which does not
> > have a flags argument. Most implementations require procfs [1][2].
> > 
> > There doesn't appear to be a good userspace workaround for this issue
> > but the implementation in the kernel is pretty straight-forward.
> > 
> > The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> > unlike existing fchmodat.
> > 
> > [1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> > [2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> > 
> > Co-developed-by: Palmer Dabbelt <palmer@sifive.com>
> > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > ---
> >  fs/open.c                | 18 ++++++++++++++----
> >  include/linux/syscalls.h |  2 ++
> >  2 files changed, 16 insertions(+), 4 deletions(-)
> > 
> > diff --git a/fs/open.c b/fs/open.c
> > index 0c55c8e7f837..39a7939f0d00 100644
> > --- a/fs/open.c
> > +++ b/fs/open.c
> > @@ -671,11 +671,11 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
> >  	return err;
> >  }
> >  
> > -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> > +static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int lookup_flags)
> 
> I think it'd be much neater to do the conversion of AT_ flags here and
> pass 0 as a flags argument for all of the wrappers (this is how most of
> the other xyz(), fxyz(), fxyzat() syscall wrappers are done IIRC).

I just addressed the Al Viro's suggestion.

https://lore.kernel.org/lkml/20190717014802.GS17978@ZenIV.linux.org.uk/

> >  {
> >  	struct path path;
> >  	int error;
> > -	unsigned int lookup_flags = LOOKUP_FOLLOW;
> > +
> >  retry:
> >  	error = user_path_at(dfd, filename, lookup_flags, &path);
> >  	if (!error) {
> > @@ -689,15 +689,25 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> >  	return error;
> >  }
> >  
> > +SYSCALL_DEFINE4(fchmodat2, int, dfd, const char __user *, filename,
> > +		umode_t, mode, int, flags)
> > +{
> > +	if (unlikely(flags & ~AT_SYMLINK_NOFOLLOW))
> > +		return -EINVAL;
> 
> We almost certainly want to support AT_EMPTY_PATH at the same time.
> Otherwise userspace will still need to go through /proc when trying to
> chmod a file handle they have.

I'm not sure I understand. Can you explain what you mean?
From: Aleksa Sarai
> Sent: 25 July 2023 17:36
...
> We almost certainly want to support AT_EMPTY_PATH at the same time.
> Otherwise userspace will still need to go through /proc when trying to
> chmod a file handle they have.

That can't be allowed.

Just because a process has a file open and write access to
the directory that contains it doesn't mean they are allowed
to change the file permissions.

They also need directory search access from a directory
they have open through to the containing directory.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
> > I think it'd be much neater to do the conversion of AT_ flags here and
> > pass 0 as a flags argument for all of the wrappers (this is how most of
> > the other xyz(), fxyz(), fxyzat() syscall wrappers are done IIRC).

I've fixed that up in-tree.
On Jul 27 2023, David Laight wrote:

> From: Aleksa Sarai
>> Sent: 25 July 2023 17:36
> ...
>> We almost certainly want to support AT_EMPTY_PATH at the same time.
>> Otherwise userspace will still need to go through /proc when trying to
>> chmod a file handle they have.
>
> That can't be allowed.

IIUC, fchmodat2(fd, "", m, AT_EMPTY_PATH) is equivalent to fchmod(fd,
m). With that, new architectures only need to implement the fchmodat2
syscall to cover all chmod variants.
On Thu, Jul 27, 2023 at 09:01:06AM +0000, David Laight wrote:
> From: Aleksa Sarai
> > Sent: 25 July 2023 17:36
> ....
> > We almost certainly want to support AT_EMPTY_PATH at the same time.
> > Otherwise userspace will still need to go through /proc when trying to
> > chmod a file handle they have.
> 
> That can't be allowed.
> 
> Just because a process has a file open and write access to
> the directory that contains it doesn't mean they are allowed
> to change the file permissions.
> 
> They also need directory search access from a directory
> they have open through to the containing directory.

Am I missing something? How is this different from fchmod?

Rich
On Thu, Jul 27, 2023 at 06:28:53PM +0200, Andreas Schwab wrote:
> On Jul 27 2023, David Laight wrote:
> 
> > From: Aleksa Sarai
> >> Sent: 25 July 2023 17:36
> > ...
> >> We almost certainly want to support AT_EMPTY_PATH at the same time.
> >> Otherwise userspace will still need to go through /proc when trying to
> >> chmod a file handle they have.
> >
> > That can't be allowed.
> 
> IIUC, fchmodat2(fd, "", m, AT_EMPTY_PATH) is equivalent to fchmod(fd,
> m). With that, new architectures only need to implement the fchmodat2
> syscall to cover all chmod variants.

There's a difference though as fchmod() doesn't work with O_PATH file
descriptors while AT_EMPTY_PATH does. Similar to how fchown() doesn't
work with O_PATH file descriptors.

However, we do allow AT_EMPTY_PATH with fchownat() so there's no reason
to not allow it for fchmodat2().

But it's a bit of a shame that O_PATH looks less and less like O_PATH.
It came from can-do-barely-anything to can-do-quite-a-lot-of-things over
the years.

In any case, AT_EMPTY_PATH for fchmodat2() can be an additional patch on
top.
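The fchmod() behaviour described in this reply can be observed from userspace. The sketch below is an illustration only, with a hypothetical helper name (`raw_fchmod_errno_on_opath`). It deliberately issues the raw system call rather than the libc wrapper, on the assumption that a libc may add its own fallback on top of the kernel behaviour. To stay harmless it re-applies the mode the file already has.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef O_PATH
#define O_PATH 010000000	/* fallback: value on x86 and most architectures */
#endif

/*
 * Hypothetical helper: open `path` with O_PATH and issue the raw
 * fchmod system call on the descriptor, re-applying the current mode.
 * Returns the errno reported by the kernel, 0 if the call succeeded,
 * or -1 if the descriptor could not be set up.
 */
static int raw_fchmod_errno_on_opath(const char *path)
{
	struct stat st;
	int err = 0;
	int fd = open(path, O_PATH | O_CLOEXEC);

	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0) {
		close(fd);
		return -1;
	}
	/* The kernel does not accept O_PATH descriptors for fchmod. */
	if (syscall(SYS_fchmod, fd, st.st_mode & 07777) < 0)
		err = errno;
	close(fd);
	return err;
}
```

On a Linux kernel with the behaviour described above, the helper is expected to report EBADF, which is the error open(2) documents for operations that O_PATH descriptors do not support.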
On 2023-07-26, Alexey Gladkov <legion@kernel.org> wrote:
> On Wed, Jul 26, 2023 at 02:36:25AM +1000, Aleksa Sarai wrote:
> > On 2023-07-11, Alexey Gladkov <legion@kernel.org> wrote:
> > > On the userspace side fchmodat(3) is implemented as a wrapper
> > > function which implements the POSIX-specified interface. This
> > > interface differs from the underlying kernel system call, which does not
> > > have a flags argument. Most implementations require procfs [1][2].
> > > 
> > > There doesn't appear to be a good userspace workaround for this issue
> > > but the implementation in the kernel is pretty straight-forward.
> > > 
> > > The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> > > unlike existing fchmodat.
> > > 
> > > [1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> > > [2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> > > 
> > > Co-developed-by: Palmer Dabbelt <palmer@sifive.com>
> > > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > > ---
> > >  fs/open.c                | 18 ++++++++++++++----
> > >  include/linux/syscalls.h |  2 ++
> > >  2 files changed, 16 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/fs/open.c b/fs/open.c
> > > index 0c55c8e7f837..39a7939f0d00 100644
> > > --- a/fs/open.c
> > > +++ b/fs/open.c
> > > @@ -671,11 +671,11 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
> > >  	return err;
> > >  }
> > >  
> > > -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> > > +static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int lookup_flags)
> > 
> > I think it'd be much neater to do the conversion of AT_ flags here and
> > pass 0 as a flags argument for all of the wrappers (this is how most of
> > the other xyz(), fxyz(), fxyzat() syscall wrappers are done IIRC).
> 
> I just addressed the Al Viro's suggestion.
> 
> https://lore.kernel.org/lkml/20190717014802.GS17978@ZenIV.linux.org.uk/

I think Al misspoke, because he also said "pass it 0 as an extra
argument", but you actually have to pass LOOKUP_FOLLOW from the
wrappers. If you look at how faccessat2 and faccessat are implemented,
it follows the behaviour I described.

> > >  {
> > >  	struct path path;
> > >  	int error;
> > > -	unsigned int lookup_flags = LOOKUP_FOLLOW;
> > > +
> > >  retry:
> > >  	error = user_path_at(dfd, filename, lookup_flags, &path);
> > >  	if (!error) {
> > > @@ -689,15 +689,25 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> > >  	return error;
> > >  }
> > >  
> > > +SYSCALL_DEFINE4(fchmodat2, int, dfd, const char __user *, filename,
> > > +		umode_t, mode, int, flags)
> > > +{
> > > +	if (unlikely(flags & ~AT_SYMLINK_NOFOLLOW))
> > > +		return -EINVAL;
> > 
> > We almost certainly want to support AT_EMPTY_PATH at the same time.
> > Otherwise userspace will still need to go through /proc when trying to
> > chmod a file handle they have.
> 
> I'm not sure I understand. Can you explain what you mean?

You should add support for AT_EMPTY_PATH (LOOKUP_EMPTY) as well as
AT_SYMLINK_NOFOLLOW. It would only require something like:

	unsigned int lookup_flags = LOOKUP_FOLLOW;

	if (flags & ~(AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW))
		return -EINVAL;

	if (flags & AT_EMPTY_PATH)
		lookup_flags |= LOOKUP_EMPTY;
	if (flags & AT_SYMLINK_NOFOLLOW)
		lookup_flags &= ~LOOKUP_FOLLOW;

	/* ... */

This would be effectively equivalent to fchmod(fd, mode). (I was wrong
when I said this wasn't already possible -- I forgot about fchmod(2).)
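The flag conversion suggested in this reply can be exercised outside the kernel. The following is a standalone userspace sketch, not kernel code: the `SK_`-prefixed constants and the function name `at_flags_to_lookup_flags` are hypothetical. The `SK_AT_*` values are assumed to mirror the UAPI `AT_*` flags, while the `SK_LOOKUP_*` values merely stand in for kernel-internal flags and are illustrative.

```c
#include <errno.h>

/* Stand-ins for the userspace-visible AT_* flags (assumed UAPI values). */
#define SK_AT_SYMLINK_NOFOLLOW	0x100
#define SK_AT_EMPTY_PATH	0x1000

/* Stand-ins for kernel-internal LOOKUP_* flags (illustrative values). */
#define SK_LOOKUP_FOLLOW	0x0001
#define SK_LOOKUP_EMPTY		0x4000

/*
 * Translate AT_* flags into lookup flags, following the logic quoted
 * above: start from "follow symlinks", reject unknown bits, then
 * adjust for each accepted flag. Returns 0 on success or -EINVAL.
 */
static int at_flags_to_lookup_flags(unsigned int flags,
				    unsigned int *lookup_flags)
{
	unsigned int lookup = SK_LOOKUP_FOLLOW;

	if (flags & ~(SK_AT_EMPTY_PATH | SK_AT_SYMLINK_NOFOLLOW))
		return -EINVAL;

	if (flags & SK_AT_EMPTY_PATH)
		lookup |= SK_LOOKUP_EMPTY;
	if (flags & SK_AT_SYMLINK_NOFOLLOW)
		lookup &= ~SK_LOOKUP_FOLLOW;

	*lookup_flags = lookup;
	return 0;
}
```

Doing the translation in one helper means that callers without a flags argument simply pass 0 and get the follow-symlinks default, which is the arrangement argued for in the reply.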
On Thu, Jul 27, 2023 at 07:02:53PM +0200, Christian Brauner wrote:
> On Thu, Jul 27, 2023 at 06:28:53PM +0200, Andreas Schwab wrote:
> > On Jul 27 2023, David Laight wrote:
> > 
> > > From: Aleksa Sarai
> > >> Sent: 25 July 2023 17:36
> > > ...
> > >> We almost certainly want to support AT_EMPTY_PATH at the same time.
> > >> Otherwise userspace will still need to go through /proc when trying to
> > >> chmod a file handle they have.
> > >
> > > That can't be allowed.
> > 
> > IIUC, fchmodat2(fd, "", m, AT_EMPTY_PATH) is equivalent to fchmod(fd,
> > m). With that, new architectures only need to implement the fchmodat2
> > syscall to cover all chmod variants.
> 
> There's a difference though as fchmod() doesn't work with O_PATH file
> descriptors while AT_EMPTY_PATH does. Similar to how fchown() doesn't
> work with O_PATH file descriptors.
> 
> However, we do allow AT_EMPTY_PATH with fchownat() so there's no reason
> to not allow it for fchmodat2().
> 
> But it's a bit of a shame that O_PATH looks less and less like O_PATH.
> It came from can-do-barely-anything to can-do-quite-a-lot-of-things over
> the years.
> 
> In any case, AT_EMPTY_PATH for fchmodat2() can be an additional patch on
> top.

From a standpoint of implementing O_SEARCH/O_EXEC using it, I don't
see any reason fchown/fchmod should not work on O_PATH file
descriptors. And indeed when you have procfs available to emulate them
via procfs, it already does. So I don't see this as unwanted
functionality or an access control regression. I see it as things
behaving as expected.

Semantically, O_PATH is a reference to the inode, not to the dirent.
So there is no reason you should not be able to do things that need
permission to the inode (changing permissions on it) rather than to
the dirent (renaming/moving).

Rich
On Thu, Jul 27, 2023 at 01:13:37PM -0400, dalias@libc.org wrote:
> On Thu, Jul 27, 2023 at 07:02:53PM +0200, Christian Brauner wrote:
> > On Thu, Jul 27, 2023 at 06:28:53PM +0200, Andreas Schwab wrote:
> > > On Jul 27 2023, David Laight wrote:
> > > 
> > > > From: Aleksa Sarai
> > > >> Sent: 25 July 2023 17:36
> > > > ...
> > > >> We almost certainly want to support AT_EMPTY_PATH at the same time.
> > > >> Otherwise userspace will still need to go through /proc when trying to
> > > >> chmod a file handle they have.
> > > >
> > > > That can't be allowed.
> > > 
> > > IIUC, fchmodat2(fd, "", m, AT_EMPTY_PATH) is equivalent to fchmod(fd,
> > > m). With that, new architectures only need to implement the fchmodat2
> > > syscall to cover all chmod variants.
> > 
> > There's a difference though as fchmod() doesn't work with O_PATH file
> > descriptors while AT_EMPTY_PATH does. Similar to how fchown() doesn't
> > work with O_PATH file descriptors.
> > 
> > However, we do allow AT_EMPTY_PATH with fchownat() so there's no reason
> > to not allow it for fchmodat2().
> > 
> > But it's a bit of a shame that O_PATH looks less and less like O_PATH.
> > It came from can-do-barely-anything to can-do-quite-a-lot-of-things over
> > the years.
> > 
> > In any case, AT_EMPTY_PATH for fchmodat2() can be an additional patch on
> > top.
> 
> From a standpoint of implementing O_SEARCH/O_EXEC using it, I don't
> see any reason fchown/fchmod should not work on O_PATH file
> descriptors. And indeed when you have procfs available to emulate them
> via procfs, it already does. So I don't see this as unwanted

I'm really not talking about the fact that proc is a giant loophole for
basically everything related to O_PATH and reopening fds.

I'm saying that both fchmod() and fchown() don't work on O_PATH fds.
They explicitly reject them.

AT_EMPTY_PATH and therefore O_PATH for fchmodat2() is fine given that we
do it for fchownat() already.
On 2023-07-28, Aleksa Sarai <cyphar@cyphar.com> wrote:
> On 2023-07-26, Alexey Gladkov <legion@kernel.org> wrote:
> > On Wed, Jul 26, 2023 at 02:36:25AM +1000, Aleksa Sarai wrote:
> > > On 2023-07-11, Alexey Gladkov <legion@kernel.org> wrote:
> > > > On the userspace side fchmodat(3) is implemented as a wrapper
> > > > function which implements the POSIX-specified interface. This
> > > > interface differs from the underlying kernel system call, which does not
> > > > have a flags argument. Most implementations require procfs [1][2].
> > > > 
> > > > There doesn't appear to be a good userspace workaround for this issue
> > > > but the implementation in the kernel is pretty straight-forward.
> > > > 
> > > > The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> > > > unlike existing fchmodat.
> > > > 
> > > > [1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> > > > [2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> > > > 
> > > > Co-developed-by: Palmer Dabbelt <palmer@sifive.com>
> > > > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > > > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > > > ---
> > > >  fs/open.c                | 18 ++++++++++++++----
> > > >  include/linux/syscalls.h |  2 ++
> > > >  2 files changed, 16 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/fs/open.c b/fs/open.c
> > > > index 0c55c8e7f837..39a7939f0d00 100644
> > > > --- a/fs/open.c
> > > > +++ b/fs/open.c
> > > > @@ -671,11 +671,11 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
> > > >  	return err;
> > > >  }
> > > >  
> > > > -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> > > > +static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int lookup_flags)
> > > 
> > > I think it'd be much neater to do the conversion of AT_ flags here and
> > > pass 0 as a flags argument for all of the wrappers (this is how most of
> > > the other xyz(), fxyz(), fxyzat() syscall wrappers are done IIRC).
> > 
> > I just addressed the Al Viro's suggestion.
> > 
> > https://lore.kernel.org/lkml/20190717014802.GS17978@ZenIV.linux.org.uk/
> 
> I think Al misspoke, because he also said "pass it 0 as an extra
> argument", but you actually have to pass LOOKUP_FOLLOW from the
> wrappers. If you look at how faccessat2 and faccessat are implemented,
> it follows the behaviour I described.
> 
> > > >  {
> > > >  	struct path path;
> > > >  	int error;
> > > > -	unsigned int lookup_flags = LOOKUP_FOLLOW;
> > > > +
> > > >  retry:
> > > >  	error = user_path_at(dfd, filename, lookup_flags, &path);
> > > >  	if (!error) {
> > > > @@ -689,15 +689,25 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> > > >  	return error;
> > > >  }
> > > >  
> > > > +SYSCALL_DEFINE4(fchmodat2, int, dfd, const char __user *, filename,
> > > > +		umode_t, mode, int, flags)
> > > > +{
> > > > +	if (unlikely(flags & ~AT_SYMLINK_NOFOLLOW))
> > > > +		return -EINVAL;
> > > 
> > > We almost certainly want to support AT_EMPTY_PATH at the same time.
> > > Otherwise userspace will still need to go through /proc when trying to
> > > chmod a file handle they have.
> > 
> > I'm not sure I understand. Can you explain what you mean?
> 
> You should add support for AT_EMPTY_PATH (LOOKUP_EMPTY) as well as
> AT_SYMLINK_NOFOLLOW. It would only require something like:
> 
> 	unsigned int lookup_flags = LOOKUP_FOLLOW;
> 
> 	if (flags & ~(AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW))
> 		return -EINVAL;
> 
> 	if (flags & AT_EMPTY_PATH)
> 		lookup_flags |= LOOKUP_EMPTY;
> 	if (flags & AT_SYMLINK_NOFOLLOW)
> 		lookup_flags &= ~LOOKUP_FOLLOW;
> 
> 	/* ... */
> 
> This would be effectively equivalent to fchmod(fd, mode). (I was wrong
> when I said this wasn't already possible -- I forgot about fchmod(2).)

... with the exception (as Christian mentioned) of O_PATH descriptors.
However, there are two counter-points to this:

 * fchownat(AT_EMPTY_PATH) exists but fchown() doesn't work on O_PATH
   descriptors *by design* (according to open(2)).

 * chmod(/proc/self/fd/$n) works on O_PATH descriptors, meaning this
   behaviour is already allowed and all that AT_EMPTY_PATH would do is
   allow programs to avoid depending on procfs for this.

FWIW, I agree with Christian that these behaviours are not ideal (and
I'm working on a series that might allow for these things to be properly
blocked in the future) but there's also the consistency argument -- I
don't think fchownat() is much safer to allow in this way than
fchmodat() and (again) this behaviour is already possible through
procfs.

Ultimately, we can always add AT_EMPTY_PATH later. It just seemed like
an obvious omission to me that would be easy to resolve.
...
> FWIW, I agree with Christian that these behaviours are not ideal (and
> I'm working on a series that might allow for these things to be properly
> blocked in the future) but there's also the consistency argument -- I
> don't think fchownat() is much safer to allow in this way than
> fchmodat() and (again) this behaviour is already possible through
> procfs.

If the 'through procfs' involves readlink("/proc/self/fd/n") and
accessing through the returned path then the permission checks
are different.
Using the returned path requires search permissions on all the
directories.

This is all fine for xxxat() functions where a real open
directory fd is supplied.
But other cases definitely need a lot of thought to ensure they
don't let programs acquire permissions they aren't supposed to have.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On Fri, Jul 28, 2023 at 08:43:58AM +0000, David Laight wrote:
> ....
> > FWIW, I agree with Christian that these behaviours are not ideal (and
> > I'm working on a series that might allow for these things to be properly
> > blocked in the future) but there's also the consistency argument -- I
> > don't think fchownat() is much safer to allow in this way than
> > fchmodat() and (again) this behaviour is already possible through
> > procfs.
> 
> If the 'through procfs' involves readlink("/proc/self/fd/n") and
> accessing through the returned path then the permission checks
> are different.
> Using the returned path requires search permissions on all the
> directories.

That's *not* how "through procfs" works. The "magic symlinks" in
/proc/*/fd are not actual symlinks that get dereferenced to the
contents they readlink() to, but special-type objects that dereference
directly to the underlying file associated with the open file
description.

Rich
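The "through procfs" route discussed in the last few replies can be sketched as follows. This is an illustration with a hypothetical helper name (`chmod_via_procfs`), and it assumes procfs is mounted at /proc. Note that it never calls readlink(): the /proc/self/fd/N name is handed directly to chmod(), so the original path is not walked again.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

#ifndef O_PATH
#define O_PATH 010000000	/* fallback: value on x86 and most architectures */
#endif

/*
 * Hypothetical helper: change the mode of the file behind `fd`
 * (which may be an O_PATH descriptor) through its entry under
 * /proc/self/fd. Assumes procfs is mounted at /proc.
 * Returns 0 on success, or -1 with errno set.
 */
static int chmod_via_procfs(int fd, mode_t mode)
{
	char link[64];

	snprintf(link, sizeof(link), "/proc/self/fd/%d", fd);
	/* chmod() resolves the magic link to the open file itself. */
	return chmod(link, mode);
}
```

This is the dependency on procfs that an AT_EMPTY_PATH flag for fchmodat2() would let programs avoid, per the discussion above.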
diff --git a/fs/open.c b/fs/open.c
index 0c55c8e7f837..39a7939f0d00 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -671,11 +671,11 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
 	return err;
 }
 
-static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
+static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int lookup_flags)
 {
 	struct path path;
 	int error;
-	unsigned int lookup_flags = LOOKUP_FOLLOW;
+
 retry:
 	error = user_path_at(dfd, filename, lookup_flags, &path);
 	if (!error) {
@@ -689,15 +689,25 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
 	return error;
 }
 
+SYSCALL_DEFINE4(fchmodat2, int, dfd, const char __user *, filename,
+		umode_t, mode, int, flags)
+{
+	if (unlikely(flags & ~AT_SYMLINK_NOFOLLOW))
+		return -EINVAL;
+
+	return do_fchmodat(dfd, filename, mode,
+			flags & AT_SYMLINK_NOFOLLOW ? 0 : LOOKUP_FOLLOW);
+}
+
 SYSCALL_DEFINE3(fchmodat, int, dfd, const char __user *, filename,
 		umode_t, mode)
 {
-	return do_fchmodat(dfd, filename, mode);
+	return do_fchmodat(dfd, filename, mode, LOOKUP_FOLLOW);
 }
 
 SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
 {
-	return do_fchmodat(AT_FDCWD, filename, mode);
+	return do_fchmodat(AT_FDCWD, filename, mode, LOOKUP_FOLLOW);
 }
 
 /*
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 584f404bf868..6e852279fbc3 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -440,6 +440,8 @@ asmlinkage long sys_chroot(const char __user *filename);
 asmlinkage long sys_fchmod(unsigned int fd, umode_t mode);
 asmlinkage long sys_fchmodat(int dfd, const char __user *filename,
 			     umode_t mode);
+asmlinkage long sys_fchmodat2(int dfd, const char __user *filename,
+			     umode_t mode, int flags);
 asmlinkage long sys_fchownat(int dfd, const char __user *filename, uid_t user,
 			     gid_t group, int flag);
 asmlinkage long sys_fchown(unsigned int fd, uid_t user, gid_t group);