From patchwork Mon Jul 3 17:27:39 2023
X-Patchwork-Id: 115439
From: Michal Koutný
To: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-kselftest@vger.kernel.org
Cc: Waiman Long, Zefan Li, Tejun Heo, Johannes Weiner, Shuah Khan
Subject: [PATCH v3 1/3] cpuset: Allow setscheduler regardless of manipulated task
Date: Mon, 3 Jul 2023 19:27:39 +0200
Message-ID: <20230703172741.25392-2-mkoutny@suse.com>
In-Reply-To: <20230703172741.25392-1-mkoutny@suse.com>
References: <20230703172741.25392-1-mkoutny@suse.com>

When we migrate a task between two cgroups, one of the checks is a
verification whether we can modify task's scheduler settings
(cap_task_setscheduler()).

An implicit migration also occurs when enabling a controller on the
unified hierarchy (think of parent to child migration). The
aforementioned check may be problematic if the caller of the migration
(enabling a controller) has no permissions over the migrated tasks.
For instance, a user's cgroup may end up running a process of a
different user. Although cgroup permissions are configured favorably,
the enablement fails due to the foreign process [1].

Change the behavior by relaxing the permissions check on the unified
hierarchy when no effective change would happen.
This is in accordance with unified hierarchy attachment behavior, where
permissions of the source and target cgroups are decisive whereas the
migrated task is opaque (as opposed to the more restrictive check in
__cgroup1_procs_write()).

Notice that a foreign task's affinity may still be modified if the user
can modify the destination cgroup's cpuset attributes
(update_tasks_cpumask() does no permissions check). The permissions
check could thus be skipped on v2 even when affinity changes. Stay
conservative in this patch though.
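
The relaxed rule can be modeled outside the kernel as a small predicate.
The sketch below is a userspace illustration only:
needs_setscheduler_check() and its boolean parameters are hypothetical
stand-ins for cgroup_subsys_on_dfl(cpuset_cgrp_subsys) and the effective
cpus/mems comparisons, not kernel API.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of the condition guarding
 * security_task_setscheduler() in cpuset_can_attach().
 * The name and parameters are illustrative assumptions.
 */
static bool needs_setscheduler_check(bool on_v2, bool cpus_updated,
				     bool mems_updated)
{
	/* v1 always checks; v2 checks only if effective cpus or mems change. */
	return !on_v2 || cpus_updated || mems_updated;
}
```

On v2 with unchanged effective masks the predicate is false, so the
per-task check is skipped; in every other case the check still runs.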

[1] https://github.com/systemd/systemd/issues/18293#issuecomment-831205649

Signed-off-by: Michal Koutný
Reviewed-by: Waiman Long
---
 kernel/cgroup/cpuset.c | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 58e6f18f01c1..0a9b860844ca 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2487,6 +2487,7 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
 	struct cgroup_subsys_state *css;
 	struct cpuset *cs, *oldcs;
 	struct task_struct *task;
+	bool cpus_updated, mems_updated;
 	int ret;
 
 	/* used later by cpuset_attach() */
@@ -2501,13 +2502,25 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
 	if (ret)
 		goto out_unlock;
 
+	cpus_updated = !cpumask_equal(cs->effective_cpus, oldcs->effective_cpus);
+	mems_updated = !nodes_equal(cs->effective_mems, oldcs->effective_mems);
+
 	cgroup_taskset_for_each(task, css, tset) {
 		ret = task_can_attach(task);
 		if (ret)
 			goto out_unlock;
-		ret = security_task_setscheduler(task);
-		if (ret)
-			goto out_unlock;
+
+		/*
+		 * Skip rights over task check in v2 when nothing changes,
+		 * migration permission derives from hierarchy ownership in
+		 * cgroup_procs_write_permission().
+		 */
+		if (!cgroup_subsys_on_dfl(cpuset_cgrp_subsys) ||
+		    (cpus_updated || mems_updated)) {
+			ret = security_task_setscheduler(task);
+			if (ret)
+				goto out_unlock;
+		}
 
 		if (dl_task(task)) {
 			cs->nr_migrate_dl_tasks++;

From patchwork Mon Jul 3 17:27:40 2023
X-Patchwork-Id: 115442
From: Michal Koutný
To: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-kselftest@vger.kernel.org
Cc: Waiman Long, Zefan Li, Tejun Heo, Johannes Weiner, Shuah Khan
Subject: [PATCH v3 2/3] selftests: cgroup: Minor code reorganizations
Date: Mon, 3 Jul 2023 19:27:40 +0200
Message-ID: <20230703172741.25392-3-mkoutny@suse.com>
In-Reply-To: <20230703172741.25392-1-mkoutny@suse.com>
References: <20230703172741.25392-1-mkoutny@suse.com>

No functional change intended, these small changes are merged into one
commit and they serve as a preparation for an upcoming new testcase.

Signed-off-by: Michal Koutný
Reviewed-by: Waiman Long
---
 MAINTAINERS                                       | 1 +
 tools/testing/selftests/cgroup/cgroup_util.c      | 2 ++
 tools/testing/selftests/cgroup/cgroup_util.h      | 2 ++
 tools/testing/selftests/cgroup/test_core.c        | 2 +-
 tools/testing/selftests/cgroup/test_cpuset_prs.sh | 2 +-
 5 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index e0976ae2a523..03bec83944c4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5260,6 +5260,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git
 F: Documentation/admin-guide/cgroup-v1/cpusets.rst
 F: include/linux/cpuset.h
 F: kernel/cgroup/cpuset.c
+F: tools/testing/selftests/cgroup/test_cpuset_prs.sh
 
 CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)
 M: Johannes Weiner
diff --git a/tools/testing/selftests/cgroup/cgroup_util.c b/tools/testing/selftests/cgroup/cgroup_util.c
index e8bbbdb77e0d..0340d4ca8f51 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.c
+++ b/tools/testing/selftests/cgroup/cgroup_util.c
@@ -286,6 +286,8 @@ int cg_destroy(const char *cgroup)
 {
 	int ret;
 
+	if (!cgroup)
+		return 0;
 retry:
 	ret = rmdir(cgroup);
 	if (ret && errno == EBUSY) {
diff --git a/tools/testing/selftests/cgroup/cgroup_util.h b/tools/testing/selftests/cgroup/cgroup_util.h
index c92df4e5d395..1df7f202214a 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.h
+++ b/tools/testing/selftests/cgroup/cgroup_util.h
@@ -11,6 +11,8 @@
 #define USEC_PER_SEC 1000000L
 #define NSEC_PER_SEC 1000000000L
 
+#define TEST_UID 65534 /* usually nobody, any !root is fine */
+
 /*
  * Checks if two given values differ by less than err% of their sum.
  */
diff --git a/tools/testing/selftests/cgroup/test_core.c b/tools/testing/selftests/cgroup/test_core.c
index 600123503063..80aa6b2373b9 100644
--- a/tools/testing/selftests/cgroup/test_core.c
+++ b/tools/testing/selftests/cgroup/test_core.c
@@ -683,7 +683,7 @@ static int test_cgcore_thread_migration(const char *root)
  */
 static int test_cgcore_lesser_euid_open(const char *root)
 {
-	const uid_t test_euid = 65534; /* usually nobody, any !root is fine */
+	const uid_t test_euid = TEST_UID;
 	int ret = KSFT_FAIL;
 	char *cg_test_a = NULL, *cg_test_b = NULL;
 	char *cg_test_a_procs = NULL, *cg_test_b_procs = NULL;
diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
index 2b5215cc599f..4afb132e4e4f 100755
--- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
+++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh
@@ -10,7 +10,7 @@
 skip_test() {
 	echo "$1"
 	echo "Test SKIPPED"
-	exit 0
+	exit 4 # ksft_skip
 }
 
 [[ $(id -u) -eq 0 ]] || skip_test "Test must be run as root!"

From patchwork Mon Jul 3 17:27:41 2023
X-Patchwork-Id: 115440
From: Michal Koutný
To: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-kselftest@vger.kernel.org
Cc: Waiman Long, Zefan Li, Tejun Heo, Johannes Weiner, Shuah Khan
Subject: [PATCH v3 3/3] selftests: cgroup: Add cpuset migrations testcase
Date: Mon, 3 Jul 2023 19:27:41 +0200
Message-ID: <20230703172741.25392-4-mkoutny@suse.com>
In-Reply-To: <20230703172741.25392-1-mkoutny@suse.com>
References: <20230703172741.25392-1-mkoutny@suse.com>

Add a separate testfile to verify how permissions are treated when tasks
are migrated between cpuset cgroups on the cgroup v2 hierarchy.

In accordance with v2 design, migration should be allowed based on
delegation boundaries (i.e. cgroup.procs permissions) and does not
depend on the migrated object (i.e. an unprivileged process can migrate
another process (even a privileged one) as long as it remains in the
original dedicated scope).

Signed-off-by: Michal Koutný
---
 MAINTAINERS                                  |   1 +
 tools/testing/selftests/cgroup/.gitignore    |   1 +
 tools/testing/selftests/cgroup/Makefile      |   2 +
 tools/testing/selftests/cgroup/test_cpuset.c | 275 +++++++++++++++++++
 4 files changed, 279 insertions(+)
 create mode 100644 tools/testing/selftests/cgroup/test_cpuset.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 03bec83944c4..5c55de000ee3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5260,6 +5260,7 @@ T: git git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git
 F: Documentation/admin-guide/cgroup-v1/cpusets.rst
 F: include/linux/cpuset.h
 F: kernel/cgroup/cpuset.c
+F: tools/testing/selftests/cgroup/test_cpuset.c
 F: tools/testing/selftests/cgroup/test_cpuset_prs.sh
 
 CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)
diff --git a/tools/testing/selftests/cgroup/.gitignore b/tools/testing/selftests/cgroup/.gitignore
index c4a57e69f749..8443a8d46a1c 100644
--- a/tools/testing/selftests/cgroup/.gitignore
+++ b/tools/testing/selftests/cgroup/.gitignore
@@ -5,4 +5,5 @@ test_freezer
 test_kmem
 test_kill
 test_cpu
+test_cpuset
 wait_inotify
diff --git a/tools/testing/selftests/cgroup/Makefile b/tools/testing/selftests/cgroup/Makefile
index 3d263747d2ad..dee0f013c7f4 100644
--- a/tools/testing/selftests/cgroup/Makefile
+++ b/tools/testing/selftests/cgroup/Makefile
@@ -12,6 +12,7 @@ TEST_GEN_PROGS += test_core
 TEST_GEN_PROGS += test_freezer
 TEST_GEN_PROGS += test_kill
 TEST_GEN_PROGS += test_cpu
+TEST_GEN_PROGS += test_cpuset
 
 LOCAL_HDRS += $(selfdir)/clone3/clone3_selftests.h $(selfdir)/pidfd/pidfd.h
 
@@ -23,3 +24,4 @@ $(OUTPUT)/test_core: cgroup_util.c
 $(OUTPUT)/test_freezer: cgroup_util.c
 $(OUTPUT)/test_kill: cgroup_util.c
 $(OUTPUT)/test_cpu: cgroup_util.c
+$(OUTPUT)/test_cpuset: cgroup_util.c
diff --git a/tools/testing/selftests/cgroup/test_cpuset.c b/tools/testing/selftests/cgroup/test_cpuset.c
new file mode 100644
index 000000000000..b061ed1e05b4
--- /dev/null
+++ b/tools/testing/selftests/cgroup/test_cpuset.c
@@ -0,0 +1,275 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include
+#include
+
+#include "../kselftest.h"
+#include "cgroup_util.h"
+
+static int idle_process_fn(const char *cgroup, void *arg)
+{
+	(void)pause();
+	return 0;
+}
+
+static int do_migration_fn(const char *cgroup, void *arg)
+{
+	int object_pid = (int)(size_t)arg;
+
+	if (setuid(TEST_UID))
+		return EXIT_FAILURE;
+
+	// XXX checking /proc/$pid/cgroup would be quicker than wait
+	if (cg_enter(cgroup, object_pid) ||
+	    cg_wait_for_proc_count(cgroup, 1))
+		return EXIT_FAILURE;
+
+	return EXIT_SUCCESS;
+}
+
+static int do_controller_fn(const char *cgroup, void *arg)
+{
+	const char *child = cgroup;
+	const char *parent = arg;
+
+	if (setuid(TEST_UID))
+		return EXIT_FAILURE;
+
+	if (!cg_read_strstr(child, "cgroup.controllers", "cpuset"))
+		return EXIT_FAILURE;
+
+	if (cg_write(parent, "cgroup.subtree_control", "+cpuset"))
+		return EXIT_FAILURE;
+
+	if (cg_read_strstr(child, "cgroup.controllers", "cpuset"))
+		return EXIT_FAILURE;
+
+	if (cg_write(parent, "cgroup.subtree_control", "-cpuset"))
+		return EXIT_FAILURE;
+
+	if (!cg_read_strstr(child, "cgroup.controllers", "cpuset"))
+		return EXIT_FAILURE;
+
+	return EXIT_SUCCESS;
+}
+
+/*
+ * Migrate a process between two sibling cgroups.
+ * The success should only depend on the parent cgroup permissions and not the
+ * migrated process itself (cpuset controller is in place because it uses
+ * security_task_setscheduler() in cgroup v1).
+ *
+ * Deliberately don't set cpuset.cpus in children to avoid defining migration
+ * permissions between two different cpusets.
+ */
+static int test_cpuset_perms_object(const char *root, bool allow)
+{
+	char *parent = NULL, *child_src = NULL, *child_dst = NULL;
+	char *parent_procs = NULL, *child_src_procs = NULL, *child_dst_procs = NULL;
+	const uid_t test_euid = TEST_UID;
+	int object_pid = 0;
+	int ret = KSFT_FAIL;
+
+	parent = cg_name(root, "cpuset_test_0");
+	if (!parent)
+		goto cleanup;
+	parent_procs = cg_name(parent, "cgroup.procs");
+	if (!parent_procs)
+		goto cleanup;
+	if (cg_create(parent))
+		goto cleanup;
+
+	child_src = cg_name(parent, "cpuset_test_1");
+	if (!child_src)
+		goto cleanup;
+	child_src_procs = cg_name(child_src, "cgroup.procs");
+	if (!child_src_procs)
+		goto cleanup;
+	if (cg_create(child_src))
+		goto cleanup;
+
+	child_dst = cg_name(parent, "cpuset_test_2");
+	if (!child_dst)
+		goto cleanup;
+	child_dst_procs = cg_name(child_dst, "cgroup.procs");
+	if (!child_dst_procs)
+		goto cleanup;
+	if (cg_create(child_dst))
+		goto cleanup;
+
+	if (cg_write(parent, "cgroup.subtree_control", "+cpuset"))
+		goto cleanup;
+
+	if (cg_read_strstr(child_src, "cgroup.controllers", "cpuset") ||
+	    cg_read_strstr(child_dst, "cgroup.controllers", "cpuset"))
+		goto cleanup;
+
+	/* Enable permissions along src->dst tree path */
+	if (chown(child_src_procs, test_euid, -1) ||
+	    chown(child_dst_procs, test_euid, -1))
+		goto cleanup;
+
+	if (allow && chown(parent_procs, test_euid, -1))
+		goto cleanup;
+
+	/* Fork a privileged child as a test object */
+	object_pid = cg_run_nowait(child_src, idle_process_fn, NULL);
+	if (object_pid < 0)
+		goto cleanup;
+
+	/* Carry out migration in a child process that can drop all privileges
+	 * (including capabilities), the main process must remain privileged for
+	 * cleanup.
+	 * Child process's cgroup is irrelevant but we place it into child_dst
+	 * as a hacky way to pass information about migration target to the child.
+	 */
+	if (allow ^ (cg_run(child_dst, do_migration_fn, (void *)(size_t)object_pid) == EXIT_SUCCESS))
+		goto cleanup;
+
+	ret = KSFT_PASS;
+
+cleanup:
+	if (object_pid > 0) {
+		(void)kill(object_pid, SIGTERM);
+		(void)clone_reap(object_pid, WEXITED);
+	}
+
+	cg_destroy(child_dst);
+	free(child_dst_procs);
+	free(child_dst);
+
+	cg_destroy(child_src);
+	free(child_src_procs);
+	free(child_src);
+
+	cg_destroy(parent);
+	free(parent_procs);
+	free(parent);
+
+	return ret;
+}
+
+static int test_cpuset_perms_object_allow(const char *root)
+{
+	return test_cpuset_perms_object(root, true);
+}
+
+static int test_cpuset_perms_object_deny(const char *root)
+{
+	return test_cpuset_perms_object(root, false);
+}
+
+/*
+ * Migrate a process between parent and child implicitly.
+ * Implicit migration happens when a controller is enabled/disabled.
+ *
+ */
+static int test_cpuset_perms_subtree(const char *root)
+{
+	char *parent = NULL, *child = NULL;
+	char *parent_procs = NULL, *parent_subctl = NULL, *child_procs = NULL;
+	const uid_t test_euid = TEST_UID;
+	int object_pid = 0;
+	int ret = KSFT_FAIL;
+
+	parent = cg_name(root, "cpuset_test_0");
+	if (!parent)
+		goto cleanup;
+	parent_procs = cg_name(parent, "cgroup.procs");
+	if (!parent_procs)
+		goto cleanup;
+	parent_subctl = cg_name(parent, "cgroup.subtree_control");
+	if (!parent_subctl)
+		goto cleanup;
+	if (cg_create(parent))
+		goto cleanup;
+
+	child = cg_name(parent, "cpuset_test_1");
+	if (!child)
+		goto cleanup;
+	child_procs = cg_name(child, "cgroup.procs");
+	if (!child_procs)
+		goto cleanup;
+	if (cg_create(child))
+		goto cleanup;
+
+	/* Enable permissions as in a delegated subtree */
+	if (chown(parent_procs, test_euid, -1) ||
+	    chown(parent_subctl, test_euid, -1) ||
+	    chown(child_procs, test_euid, -1))
+		goto cleanup;
+
+	/* Put a privileged child in the subtree and modify controller state
+	 * from an unprivileged process, the main process remains privileged
+	 * for cleanup.
+	 * The unprivileged child runs in subtree too to avoid parent and
+	 * internal-node constraint violation.
+	 */
+	object_pid = cg_run_nowait(child, idle_process_fn, NULL);
+	if (object_pid < 0)
+		goto cleanup;
+
+	if (cg_run(child, do_controller_fn, parent) != EXIT_SUCCESS)
+		goto cleanup;
+
+	ret = KSFT_PASS;
+
+cleanup:
+	if (object_pid > 0) {
+		(void)kill(object_pid, SIGTERM);
+		(void)clone_reap(object_pid, WEXITED);
+	}
+
+	cg_destroy(child);
+	free(child_procs);
+	free(child);
+
+	cg_destroy(parent);
+	free(parent_subctl);
+	free(parent_procs);
+	free(parent);
+
+	return ret;
+}
+
+
+#define T(x) { x, #x }
+struct cpuset_test {
+	int (*fn)(const char *root);
+	const char *name;
+} tests[] = {
+	T(test_cpuset_perms_object_allow),
+	T(test_cpuset_perms_object_deny),
+	T(test_cpuset_perms_subtree),
+};
+#undef T
+
+int main(int argc, char *argv[])
+{
+	char root[PATH_MAX];
+	int i, ret = EXIT_SUCCESS;
+
+	if (cg_find_unified_root(root, sizeof(root)))
+		ksft_exit_skip("cgroup v2 isn't mounted\n");
+
+	if (cg_read_strstr(root, "cgroup.subtree_control", "cpuset"))
+		if (cg_write(root, "cgroup.subtree_control", "+cpuset"))
+			ksft_exit_skip("Failed to set cpuset controller\n");
+
+	for (i = 0; i < ARRAY_SIZE(tests); i++) {
+		switch (tests[i].fn(root)) {
+		case KSFT_PASS:
+			ksft_test_result_pass("%s\n", tests[i].name);
+			break;
+		case KSFT_SKIP:
+			ksft_test_result_skip("%s\n", tests[i].name);
+			break;
+		default:
+			ret = EXIT_FAILURE;
+			ksft_test_result_fail("%s\n", tests[i].name);
+			break;
+		}
+	}
+
+	return ret;
+}
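
The pass condition in test_cpuset_perms_object() compares the expected
outcome with the observed one through XOR: the test bails out to cleanup
whenever `allow` and the migration result disagree. A minimal standalone
sketch of that comparison follows; outcome_matches() is a hypothetical
helper introduced here for illustration and is not part of the selftest.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not in test_cpuset.c): returns true when the
 * migration result agrees with the expectation, i.e. the case in which
 * the selftest would go on to set KSFT_PASS.
 */
static bool outcome_matches(bool allow, int child_status)
{
	bool migrated = (child_status == EXIT_SUCCESS);

	/* allow ^ migrated is nonzero exactly when the two disagree. */
	return !(allow ^ migrated);
}
```

With this shape, one function body can serve both the allow and the deny
variants: the deny case passes only when the unprivileged migration
attempt fails.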