[RESEND,RFC,v2,11/14] vfs: Wire up security hooks for lsm-based device guard in userns

Message ID 20231025094224.72858-12-michael.weiss@aisec.fraunhofer.de
State New
Headers
Series device_cgroup: guard mknod for non-initial user namespace |

Commit Message

Michael Weiß Oct. 25, 2023, 9:42 a.m. UTC
  Wire up security_inode_mknod_capns() in fs/namei.c. If implemented
and access is granted by an lsm, check ns_capable() instead of the
global CAP_MKNOD.

Wire up security_sb_alloc_userns() in fs/super.c. If implemented
and access is granted by an lsm, the created super block will allow
access to device nodes also if it was created in a non-inital userns.

Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
---
 fs/namei.c | 16 +++++++++++++++-
 fs/super.c |  6 +++++-
 2 files changed, 20 insertions(+), 2 deletions(-)
  

Comments

Christian Brauner Nov. 24, 2023, 5:06 p.m. UTC | #1
On Wed, Oct 25, 2023 at 11:42:21AM +0200, Michael Weiß wrote:
> Wire up security_inode_mknod_capns() in fs/namei.c. If implemented
> and access is granted by an lsm, check ns_capable() instead of the
> global CAP_MKNOD.
> 
> Wire up security_sb_alloc_userns() in fs/super.c. If implemented
> and access is granted by an lsm, the created super block will allow
> access to device nodes also if it was created in a non-inital userns.
> 
> Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
> ---
>  fs/namei.c | 16 +++++++++++++++-
>  fs/super.c |  6 +++++-
>  2 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/namei.c b/fs/namei.c
> index f601fcbdc4d2..1f68d160e2c0 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -3949,6 +3949,20 @@ inline struct dentry *user_path_create(int dfd, const char __user *pathname,
>  }
>  EXPORT_SYMBOL(user_path_create);
>  
> +static bool mknod_capable(struct inode *dir, struct dentry *dentry,
> +			  umode_t mode, dev_t dev)
> +{
> +	/*
> +	 * In case of a security hook implementation check mknod in user
> +	 * namespace. Otherwise just check global capability.
> +	 */
> +	int error = security_inode_mknod_nscap(dir, dentry, mode, dev);
> +	if (!error)
> +		return ns_capable(current_user_ns(), CAP_MKNOD);
> +	else
> +		return capable(CAP_MKNOD);
> +}
> +
>  /**
>   * vfs_mknod - create device node or file
>   * @idmap:	idmap of the mount the inode was found from
> @@ -3975,7 +3989,7 @@ int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
>  		return error;
>  
>  	if ((S_ISCHR(mode) || S_ISBLK(mode)) && !is_whiteout &&
> -	    !capable(CAP_MKNOD))
> +	    !mknod_capable(dir, dentry, mode, dev))
>  		return -EPERM;
>  
>  	if (!dir->i_op->mknod)
> diff --git a/fs/super.c b/fs/super.c
> index 2d762ce67f6e..bb01db6d9986 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -362,7 +362,11 @@ static struct super_block *alloc_super(struct file_system_type *type, int flags,
>  	}
>  	s->s_bdi = &noop_backing_dev_info;
>  	s->s_flags = flags;
> -	if (s->s_user_ns != &init_user_ns)
> +	/*
> +	 * We still have to think about this here. Several concerns exist
> +	 * about the security model, especially about malicious fuse.
> +	 */
> +	if (s->s_user_ns != &init_user_ns && security_sb_alloc_userns(s))
>  		s->s_iflags |= SB_I_NODEV;

Hm, no.

We dont want to have security hooks called in alloc_super(). That's just
the wrong layer for this. This is deeply internal stuff where we should
avoid interfacing with other subsystems.

Removing SB_I_NODEV here is also problematic or at least overly broad
because you allow to circumvent this for _every_ filesystems including
stuff like proc and so on where that doesn't make any sense.
  

Patch

diff --git a/fs/namei.c b/fs/namei.c
index f601fcbdc4d2..1f68d160e2c0 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3949,6 +3949,20 @@  inline struct dentry *user_path_create(int dfd, const char __user *pathname,
 }
 EXPORT_SYMBOL(user_path_create);
 
+static bool mknod_capable(struct inode *dir, struct dentry *dentry,
+			  umode_t mode, dev_t dev)
+{
+	/*
+	 * In case of a security hook implementation check mknod in user
+	 * namespace. Otherwise just check global capability.
+	 */
+	int error = security_inode_mknod_nscap(dir, dentry, mode, dev);
+	if (!error)
+		return ns_capable(current_user_ns(), CAP_MKNOD);
+	else
+		return capable(CAP_MKNOD);
+}
+
 /**
  * vfs_mknod - create device node or file
  * @idmap:	idmap of the mount the inode was found from
@@ -3975,7 +3989,7 @@  int vfs_mknod(struct mnt_idmap *idmap, struct inode *dir,
 		return error;
 
 	if ((S_ISCHR(mode) || S_ISBLK(mode)) && !is_whiteout &&
-	    !capable(CAP_MKNOD))
+	    !mknod_capable(dir, dentry, mode, dev))
 		return -EPERM;
 
 	if (!dir->i_op->mknod)
diff --git a/fs/super.c b/fs/super.c
index 2d762ce67f6e..bb01db6d9986 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -362,7 +362,11 @@  static struct super_block *alloc_super(struct file_system_type *type, int flags,
 	}
 	s->s_bdi = &noop_backing_dev_info;
 	s->s_flags = flags;
-	if (s->s_user_ns != &init_user_ns)
+	/*
+	 * We still have to think about this here. Several concerns exist
+	 * about the security model, especially about malicious fuse.
+	 */
+	if (s->s_user_ns != &init_user_ns && security_sb_alloc_userns(s))
 		s->s_iflags |= SB_I_NODEV;
 	INIT_HLIST_NODE(&s->s_instances);
 	INIT_HLIST_BL_HEAD(&s->s_roots);