From patchwork Wed Oct 18 10:50:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Michael Weiß
X-Patchwork-Id: 154811
From: Michael Weiß
To: Alexander Mikhalitsyn, Christian Brauner, Alexei Starovoitov,
 Paul Moore
Cc: Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
 Yonghong Song, John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo,
 Jiri Olsa, Quentin Monnet, Alexander Viro, Miklos Szeredi,
 Amir Goldstein, "Serge E. Hallyn", bpf@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 gyroidos@aisec.fraunhofer.de, Michael Weiß
Subject: [RFC PATCH v2 14/14] device_cgroup: Allow mknod in non-initial userns if guarded
Date: Wed, 18 Oct 2023 12:50:33 +0200
Message-Id: <20231018105033.13669-15-michael.weiss@aisec.fraunhofer.de>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20231018105033.13669-1-michael.weiss@aisec.fraunhofer.de>
References: <20231018105033.13669-1-michael.weiss@aisec.fraunhofer.de>
X-Mailing-List: linux-kernel@vger.kernel.org

If a container manager restricts its unprivileged (user namespaced)
children by a device cgroup, it is no longer necessary to deny mknod().
User space applications can then map devices to different locations in
the file system by calling mknod() inside the container.

One use case, which we also rely on in GyroidOS, is running virsh for
VMs inside an unprivileged container. virsh creates device nodes, e.g.,
"/var/run/libvirt/qemu/11-fgfg.dev/null", which currently fails in a
non-initial userns, even if a cgroup device white list with the
corresponding major and minor numbers of /dev/null exists. In this case
the usual bind mounts or pre-populated device nodes under /dev are not
sufficient.

To lift this limitation, allow mknod() with CAP_MKNOD checked in the
userns by implementing the security_inode_mknod_nscap() hook. The hook
implementation checks whether the corresponding permission flag
BPF_DEVCG_ACC_MKNOD_UNS is set for the device in the bpf program. To
avoid creating unusable inodes in user space, the hook also checks
SB_I_NODEV on the corresponding super block.

Further, the security_sb_alloc_userns() hook is implemented using
cgroup_bpf_current_enabled() to allow usage of device nodes on super
blocks mounted by a guarded task.

Signed-off-by: Michael Weiß
---
 security/device_cgroup/lsm.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/security/device_cgroup/lsm.c b/security/device_cgroup/lsm.c
index a963536d0a15..6bc984d9c9d1 100644
--- a/security/device_cgroup/lsm.c
+++ b/security/device_cgroup/lsm.c
@@ -66,10 +66,37 @@ static int devcg_inode_mknod(struct inode *dir, struct dentry *dentry,
 	return __devcg_inode_mknod(mode, dev, DEVCG_ACC_MKNOD);
 }
 
+#ifdef CONFIG_CGROUP_BPF
+static int devcg_sb_alloc_userns(struct super_block *sb)
+{
+	if (cgroup_bpf_current_enabled(CGROUP_DEVICE))
+		return 0;
+
+	return -EPERM;
+}
+
+static int devcg_inode_mknod_nscap(struct inode *dir, struct dentry *dentry,
+				   umode_t mode, dev_t dev)
+{
+	if (!cgroup_bpf_current_enabled(CGROUP_DEVICE))
+		return -EPERM;
+
+	// avoid to create unusable inodes in user space
+	if (dentry->d_sb->s_iflags & SB_I_NODEV)
+		return -EPERM;
+
+	return __devcg_inode_mknod(mode, dev, BPF_DEVCG_ACC_MKNOD_UNS);
+}
+#endif /* CONFIG_CGROUP_BPF */
+
 static struct security_hook_list devcg_hooks[] __ro_after_init = {
 	LSM_HOOK_INIT(inode_permission, devcg_inode_permission),
 	LSM_HOOK_INIT(inode_mknod, devcg_inode_mknod),
 	LSM_HOOK_INIT(dev_permission, devcg_dev_permission),
+#ifdef CONFIG_CGROUP_BPF
+	LSM_HOOK_INIT(sb_alloc_userns, devcg_sb_alloc_userns),
+	LSM_HOOK_INIT(inode_mknod_nscap, devcg_inode_mknod_nscap),
+#endif
 };
 
 static int __init devcgroup_init(void)
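
Note (illustration only, not part of the patch): the order of checks in
the two new hooks can be modelled in user space roughly as below. This is
a sketch under stated assumptions: struct mknod_request and the model_*()
functions are hypothetical names, the verdict of the attached BPF device
program is reduced to a boolean, and a denial by that program is assumed
to surface as -EPERM.

```c
/* User-space model of the hook decisions; this is NOT kernel code. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the state the kernel hooks consult. */
struct mknod_request {
	/* models cgroup_bpf_current_enabled(CGROUP_DEVICE) */
	bool bpf_devcg_attached;
	/* models dentry->d_sb->s_iflags & SB_I_NODEV */
	bool sb_nodev;
	/* models the BPF program granting BPF_DEVCG_ACC_MKNOD_UNS */
	bool prog_allows_uns;
};

/* Mirrors the check order of devcg_inode_mknod_nscap(). */
int model_mknod_nscap(const struct mknod_request *req)
{
	/* Unguarded tasks keep the old behaviour: mknod is denied. */
	if (!req->bpf_devcg_attached)
		return -EPERM;

	/* A node on a nodev super block could never be opened. */
	if (req->sb_nodev)
		return -EPERM;

	/* Assumption: a denial by the BPF program maps to -EPERM. */
	return req->prog_allows_uns ? 0 : -EPERM;
}

/* Mirrors devcg_sb_alloc_userns(): only guarded tasks pass. */
int model_sb_alloc_userns(bool bpf_devcg_attached)
{
	return bpf_devcg_attached ? 0 : -EPERM;
}

/* Small self-check covering each branch of the model. */
void model_selftest(void)
{
	struct mknod_request guarded  = { true,  false, true  };
	struct mknod_request no_prog  = { false, false, true  };
	struct mknod_request nodev_sb = { true,  true,  true  };
	struct mknod_request denied   = { true,  false, false };

	assert(model_mknod_nscap(&guarded) == 0);
	assert(model_mknod_nscap(&no_prog) == -EPERM);
	assert(model_mknod_nscap(&nodev_sb) == -EPERM);
	assert(model_mknod_nscap(&denied) == -EPERM);
	assert(model_sb_alloc_userns(true) == 0);
	assert(model_sb_alloc_userns(false) == -EPERM);
}
```

The model makes the ordering explicit: the guard check runs first, so a
task without an attached device program never reaches the SB_I_NODEV or
per-device checks.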