From patchwork Mon Jan 15 15:46:41 2024
X-Patchwork-Submitter: Alexey Gladkov
X-Patchwork-Id: 188231
From: Alexey Gladkov
To: LKML, Linux Containers
Cc: Andrew Morton, Christian Brauner, "Eric W. Biederman", Joel Granados,
 Kees Cook, Luis Chamberlain, Manfred Spraul
Subject: [RESEND PATCH v3 1/3] sysctl: Allow change system v ipc sysctls inside ipc namespace
Date: Mon, 15 Jan 2024 15:46:41 +0000
X-Mailing-List: linux-kernel@vger.kernel.org

Rootless containers are not allowed to modify kernel IPC parameters.

All default limits are set to such high values that in fact there are no
limits at all. Limits are not inherited; they are initialized to default
values when a new ipc_namespace is created.

For new ipc_namespace:

size_t       ipc_ns.shm_ctlmax = SHMMAX; // (ULONG_MAX - (1UL << 24))
size_t       ipc_ns.shm_ctlall = SHMALL; // (ULONG_MAX - (1UL << 24))
int          ipc_ns.shm_ctlmni = IPCMNI; // (1 << 15)
int          ipc_ns.shm_rmid_forced = 0;
unsigned int ipc_ns.msg_ctlmax = MSGMAX; // 8192
unsigned int ipc_ns.msg_ctlmni = MSGMNI; // 32000
unsigned int ipc_ns.msg_ctlmnb = MSGMNB; // 16384

The shm_tot counter (total number of shared pages) is also no longer
global: it lives in ipc_namespace and is not inherited from anywhere.

Under these conditions, the limits do not really limit anything. The real
limiter is cgroups.

If we allow rootless containers to change these parameters, then they can
only be reduced.

Signed-off-by: Alexey Gladkov
Link: https://lkml.kernel.org/r/e2d84d3ec0172cfff759e6065da84ce0cc2736f8.1663756794.git.legion@kernel.org
Signed-off-by: Eric W. Biederman
---
 ipc/ipc_sysctl.c | 37 +++++++++++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c
index 8c62e443f78b..01c4a50d22b2 100644
--- a/ipc/ipc_sysctl.c
+++ b/ipc/ipc_sysctl.c
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 #include "util.h"
 
 static int proc_ipc_dointvec_minmax_orphans(struct ctl_table *table, int write,
@@ -190,25 +191,57 @@ static int set_is_seen(struct ctl_table_set *set)
 	return &current->nsproxy->ipc_ns->ipc_set == set;
 }
 
+static void ipc_set_ownership(struct ctl_table_header *head,
+			      struct ctl_table *table,
+			      kuid_t *uid, kgid_t *gid)
+{
+	struct ipc_namespace *ns =
+		container_of(head->set, struct ipc_namespace, ipc_set);
+
+	kuid_t ns_root_uid = make_kuid(ns->user_ns, 0);
+	kgid_t ns_root_gid = make_kgid(ns->user_ns, 0);
+
+	*uid = uid_valid(ns_root_uid) ? ns_root_uid : GLOBAL_ROOT_UID;
+	*gid = gid_valid(ns_root_gid) ? ns_root_gid : GLOBAL_ROOT_GID;
+}
+
 static int ipc_permissions(struct ctl_table_header *head, struct ctl_table *table)
 {
 	int mode = table->mode;
 
 #ifdef CONFIG_CHECKPOINT_RESTORE
-	struct ipc_namespace *ns = current->nsproxy->ipc_ns;
+	struct ipc_namespace *ns =
+		container_of(head->set, struct ipc_namespace, ipc_set);
 
 	if (((table->data == &ns->ids[IPC_SEM_IDS].next_id) ||
 	     (table->data == &ns->ids[IPC_MSG_IDS].next_id) ||
 	     (table->data == &ns->ids[IPC_SHM_IDS].next_id)) &&
 	    checkpoint_restore_ns_capable(ns->user_ns))
 		mode = 0666;
+	else
 #endif
-	return mode;
+	{
+		kuid_t ns_root_uid;
+		kgid_t ns_root_gid;
+
+		ipc_set_ownership(head, table, &ns_root_uid, &ns_root_gid);
+
+		if (uid_eq(current_euid(), ns_root_uid))
+			mode >>= 6;
+
+		else if (in_egroup_p(ns_root_gid))
+			mode >>= 3;
+	}
+
+	mode &= 7;
+
+	return (mode << 6) | (mode << 3) | mode;
 }
 
 static struct ctl_table_root set_root = {
 	.lookup = set_lookup,
 	.permissions = ipc_permissions,
+	.set_ownership = ipc_set_ownership,
 };
 
 bool setup_ipc_sysctls(struct ipc_namespace *ns)

From patchwork Mon Jan 15 15:46:42 2024
X-Patchwork-Submitter: Alexey Gladkov
X-Patchwork-Id: 188230
From: Alexey Gladkov
To: LKML, Linux Containers, linux-doc@vger.kernel.org
Cc: Andrew Morton, Christian Brauner, "Eric W. Biederman", Joel Granados,
 Kees Cook, Luis Chamberlain, Manfred Spraul
Subject: [RESEND PATCH v3 2/3] docs: Add information about ipc sysctls limitations
Date: Mon, 15 Jan 2024 15:46:42 +0000
Message-ID: <09e99911071766958af488beb4e8a728a4f12135.1705333426.git.legion@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

After 25b21cb2f6d6 ("[PATCH] IPC namespace core") and 4e9823111bdc
("[PATCH] IPC namespace - shm") the shared memory page count stopped being
global and is now tracked per ipc namespace. The documentation and
shmget(2) still say that shmall is a global option.

shmget(2):

SHMALL System-wide limit on the total amount of shared memory,
       measured in units of the system page size. On Linux, this limit
       can be read and modified via /proc/sys/kernel/shmall.

I think the changes made in 2006 should be documented.

Signed-off-by: Alexey Gladkov
Acked-by: "Eric W. Biederman"
Link: https://lkml.kernel.org/r/ede20ddf7be48b93e8084c3be2e920841ee1a641.1663756794.git.legion@kernel.org
Signed-off-by: Eric W. Biederman
---
 Documentation/admin-guide/sysctl/kernel.rst | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 6584a1f9bfe3..bc578663619d 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -594,6 +594,9 @@ default (``MSGMNB``).
 ``msgmni`` is the maximum number of IPC queues. 32000 by default
 (``MSGMNI``).
 
+All of these parameters are set per ipc namespace. The maximum number of bytes
+in POSIX message queues is limited by ``RLIMIT_MSGQUEUE``. This limit is
+respected hierarchically in each user namespace.
 
 msg_next_id, sem_next_id, and shm_next_id (System V IPC)
 ========================================================
@@ -1274,15 +1277,20 @@ are doing anyway :)
 shmall
 ======
 
-This parameter sets the total amount of shared memory pages that
-can be used system wide. Hence, ``shmall`` should always be at least
-``ceil(shmmax/PAGE_SIZE)``.
+This parameter sets the total amount of shared memory pages that can be used
+inside ipc namespace. The shared memory pages counting occurs for each ipc
+namespace separately and is not inherited. Hence, ``shmall`` should always be at
+least ``ceil(shmmax/PAGE_SIZE)``.
 
 If you are not sure what the default ``PAGE_SIZE`` is on your Linux
 system, you can run the following command::
 
 	# getconf PAGE_SIZE
 
+To reduce or disable the ability to allocate shared memory, you must create a
+new ipc namespace, set this parameter to the required value and prohibit the
+creation of a new ipc namespace in the current user namespace or cgroups can
+be used.
 
 shmmax
 ======

From patchwork Mon Jan 15 15:46:43 2024
X-Patchwork-Submitter: Alexey Gladkov
X-Patchwork-Id: 188232
From: Alexey Gladkov
To: LKML, Linux Containers
Cc: Andrew Morton, Christian Brauner, "Eric W. Biederman", Joel Granados,
 Kees Cook, Luis Chamberlain, Manfred Spraul
Subject: [RESEND PATCH v3 3/3] sysctl: Allow to change limits for posix messages queues
Date: Mon, 15 Jan 2024 15:46:43 +0000
Message-ID: <6ad67f23d1459a4f4339f74aa73bac0ecf3995e1.1705333426.git.legion@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

All parameters of POSIX message queues (queues_max/msg_max/msgsize_max)
end up being limited by RLIMIT_MSGQUEUE. The limiting happens in
mqueue_get_inode().

RLIMIT_MSGQUEUE is bound to the user namespace and is counted
hierarchically.

We can therefore allow root in the user namespace to modify the POSIX
message queue parameters.

Signed-off-by: Alexey Gladkov
Link: https://lkml.kernel.org/r/7eb21211c8622e91d226e63416b1b93c079f60ee.1663756794.git.legion@kernel.org
Signed-off-by: Eric W. Biederman
---
 ipc/mq_sysctl.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/ipc/mq_sysctl.c b/ipc/mq_sysctl.c
index ebb5ed81c151..21fba3a6edaf 100644
--- a/ipc/mq_sysctl.c
+++ b/ipc/mq_sysctl.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 
 static int msg_max_limit_min = MIN_MSGMAX;
 static int msg_max_limit_max = HARD_MSGMAX;
@@ -76,8 +77,43 @@ static int set_is_seen(struct ctl_table_set *set)
 	return &current->nsproxy->ipc_ns->mq_set == set;
 }
 
+static void mq_set_ownership(struct ctl_table_header *head,
+			     struct ctl_table *table,
+			     kuid_t *uid, kgid_t *gid)
+{
+	struct ipc_namespace *ns =
+		container_of(head->set, struct ipc_namespace, mq_set);
+
+	kuid_t ns_root_uid = make_kuid(ns->user_ns, 0);
+	kgid_t ns_root_gid = make_kgid(ns->user_ns, 0);
+
+	*uid = uid_valid(ns_root_uid) ? ns_root_uid : GLOBAL_ROOT_UID;
+	*gid = gid_valid(ns_root_gid) ? ns_root_gid : GLOBAL_ROOT_GID;
+}
+
+static int mq_permissions(struct ctl_table_header *head, struct ctl_table *table)
+{
+	int mode = table->mode;
+	kuid_t ns_root_uid;
+	kgid_t ns_root_gid;
+
+	mq_set_ownership(head, table, &ns_root_uid, &ns_root_gid);
+
+	if (uid_eq(current_euid(), ns_root_uid))
+		mode >>= 6;
+
+	else if (in_egroup_p(ns_root_gid))
+		mode >>= 3;
+
+	mode &= 7;
+
+	return (mode << 6) | (mode << 3) | mode;
+}
+
 static struct ctl_table_root set_root = {
 	.lookup = set_lookup,
+	.permissions = mq_permissions,
+	.set_ownership = mq_set_ownership,
 };
 
 bool setup_mq_sysctls(struct ipc_namespace *ns)
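
Note on the permission arithmetic shared by ipc_permissions() and
mq_permissions(): both pick the rwx triplet that applies to the caller and
then replicate it into the owner, group and other positions. The following
is only a user-space sketch of that arithmetic, not kernel code. The names
fold_mode(), show_modes() and enum caller are hypothetical and exist only
for illustration; the enum stands in for the uid_eq() and in_egroup_p()
checks, which cannot be reproduced outside the kernel.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the caller's relation to the namespace root. */
enum caller { NS_ROOT_USER, NS_ROOT_GROUP, OTHER };

/*
 * Select the permission triplet that applies to the caller and replicate
 * it into the owner, group and other positions. This mirrors the shift,
 * mask and replicate steps in the patch; it is an illustration only.
 */
static int fold_mode(int mode, enum caller who)
{
	assert((mode & ~0777) == 0);	/* only rwx bits are expected */

	if (who == NS_ROOT_USER)
		mode >>= 6;		/* use the owner bits */
	else if (who == NS_ROOT_GROUP)
		mode >>= 3;		/* use the group bits */

	mode &= 7;			/* keep a single rwx triplet */

	return (mode << 6) | (mode << 3) | mode;
}

/* Print the effective mode of one table entry for each kind of caller. */
static void show_modes(int mode)
{
	printf("owner %04o group %04o other %04o\n",
	       fold_mode(mode, NS_ROOT_USER),
	       fold_mode(mode, NS_ROOT_GROUP),
	       fold_mode(mode, OTHER));
}
```

Calling show_modes(0644) from a main() prints
"owner 0666 group 0444 other 0444". In this sketch, a 0644 table entry
yields write bits only for the caller matched as the namespace root user;
group members and everyone else get a read-only result.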