From patchwork Wed Aug 9 19:34:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: tip-bot2 for Thomas Gleixner
X-Patchwork-Id: 133512
Date: Wed, 09 Aug 2023 19:34:57 -0000
From: "tip-bot2 for Phil Auld"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: sched/core] sched, cgroup: Restore meaning to hierarchical_quota
Cc: Phil Auld, "Peter Zijlstra (Intel)", Ben Segall, Tejun Heo, x86@kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20230714125746.812891-1-pauld@redhat.com>
References: <20230714125746.812891-1-pauld@redhat.com>
Message-ID: <169160969702.27769.6254794761914899507.tip-bot2@tip-bot2>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

The following commit has been merged into the sched/core
branch of tip:

Commit-ID:     c98c18270be115678f4295b10a5af5dcc9c4efa0
Gitweb:        https://git.kernel.org/tip/c98c18270be115678f4295b10a5af5dcc9c4efa0
Author:        Phil Auld
AuthorDate:    Fri, 14 Jul 2023 08:57:46 -04:00
Committer:     Peter Zijlstra
CommitterDate: Wed, 02 Aug 2023 16:19:26 +02:00

sched, cgroup: Restore meaning to hierarchical_quota

In cgroupv2 cfs_b->hierarchical_quota is set to -1 for all task
groups due to the previous fix simply taking the min.  It should
reflect a limit imposed at that level or by an ancestor. Even
though cgroupv2 does not require child quota to be less than or
equal to that of its ancestors the task group will still be
constrained by such a quota so this should be shown here. Cgroupv1
continues to set this correctly.

In both cases, add initialization when a new task group is created
based on the current parent's value (or RUNTIME_INF in the case of
root_task_group). Otherwise, the field is wrong until a quota is
changed after creation and __cfs_schedulable() is called.

Fixes: c53593e5cb69 ("sched, cgroup: Don't reject lower cpu.max on ancestors")
Signed-off-by: Phil Auld
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Ben Segall
Acked-by: Tejun Heo
Link: https://lore.kernel.org/r/20230714125746.812891-1-pauld@redhat.com
---
 kernel/sched/core.c  | 13 +++++++++----
 kernel/sched/fair.c  |  7 ++++---
 kernel/sched/sched.h |  2 +-
 3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 83e3654..3af25ca 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -9953,7 +9953,7 @@ void __init sched_init(void)
 		ptr += nr_cpu_ids * sizeof(void **);
 
 		root_task_group.shares = ROOT_TASK_GROUP_LOAD;
-		init_cfs_bandwidth(&root_task_group.cfs_bandwidth);
+		init_cfs_bandwidth(&root_task_group.cfs_bandwidth, NULL);
 #endif /* CONFIG_FAIR_GROUP_SCHED */
 #ifdef CONFIG_RT_GROUP_SCHED
 		root_task_group.rt_se = (struct sched_rt_entity **)ptr;
@@ -11087,11 +11087,16 @@ static int tg_cfs_schedulable_down(struct task_group *tg, void *data)
 
 		/*
 		 * Ensure max(child_quota) <= parent_quota. On cgroup2,
-		 * always take the min. On cgroup1, only inherit when no
-		 * limit is set:
+		 * always take the non-RUNTIME_INF min. On cgroup1, only
+		 * inherit when no limit is set. In both cases this is used
+		 * by the scheduler to determine if a given CFS task has a
+		 * bandwidth constraint at some higher level.
 		 */
 		if (cgroup_subsys_on_dfl(cpu_cgrp_subsys)) {
-			quota = min(quota, parent_quota);
+			if (quota == RUNTIME_INF)
+				quota = parent_quota;
+			else if (parent_quota != RUNTIME_INF)
+				quota = min(quota, parent_quota);
 		} else {
 			if (quota == RUNTIME_INF)
 				quota = parent_quota;
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f55b0a7..26bfbb6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6045,13 +6045,14 @@ static enum hrtimer_restart sched_cfs_period_timer(struct hrtimer *timer)
 	return idle ? HRTIMER_NORESTART : HRTIMER_RESTART;
 }
 
-void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b)
+void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b, struct cfs_bandwidth *parent)
 {
 	raw_spin_lock_init(&cfs_b->lock);
 	cfs_b->runtime = 0;
 	cfs_b->quota = RUNTIME_INF;
 	cfs_b->period = ns_to_ktime(default_cfs_period());
 	cfs_b->burst = 0;
+	cfs_b->hierarchical_quota = parent ? parent->hierarchical_quota : RUNTIME_INF;
 
 	INIT_LIST_HEAD(&cfs_b->throttled_cfs_rq);
 	hrtimer_init(&cfs_b->period_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED);
@@ -6217,7 +6218,7 @@ static inline int throttled_lb_pair(struct task_group *tg,
 	return 0;
 }
 
-void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b) {}
+void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b, struct cfs_bandwidth *parent) {}
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
 static void init_cfs_rq_runtime(struct cfs_rq *cfs_rq) {}
@@ -12599,7 +12600,7 @@ int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent)
 
 	tg->shares = NICE_0_LOAD;
 
-	init_cfs_bandwidth(tg_cfs_bandwidth(tg));
+	init_cfs_bandwidth(tg_cfs_bandwidth(tg), tg_cfs_bandwidth(parent));
 
 	for_each_possible_cpu(i) {
 		cfs_rq = kzalloc_node(sizeof(struct cfs_rq),
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 9baeb1a..602de71 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -454,7 +454,7 @@ extern void unregister_fair_sched_group(struct task_group *tg);
 extern void init_tg_cfs_entry(struct task_group *tg, struct cfs_rq *cfs_rq,
 			struct sched_entity *se, int cpu,
 			struct sched_entity *parent);
-extern void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b);
+extern void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b, struct cfs_bandwidth *parent);
 
 extern void __refill_cfs_bandwidth_runtime(struct cfs_bandwidth *cfs_b);
 extern void start_cfs_bandwidth(struct cfs_bandwidth *cfs_b);
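
For readers following the cgroup2 hunk in tg_cfs_schedulable_down(), here is a
minimal userspace sketch of why a plain min() collapses hierarchical_quota to
-1 and what the patched rule yields instead. It is an illustration, not kernel
code. It assumes the quotas are compared as signed 64-bit values with "no
limit" represented as -1, as the "-1" in the commit message suggests. All
sketch_* names and SKETCH_RUNTIME_INF are hypothetical stand-ins, and the
struct keeps only the two fields relevant here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for RUNTIME_INF when viewed as a signed 64-bit value. */
#define SKETCH_RUNTIME_INF ((int64_t)-1)

/* Hypothetical, reduced model of the two quota fields of a task group. */
struct sketch_bw {
	int64_t quota;              /* limit set directly on this group */
	int64_t hierarchical_quota; /* limit at this level or from an ancestor */
};

int64_t sketch_min(int64_t a, int64_t b)
{
	return a < b ? a : b;
}

/* Modeled on the patched init_cfs_bandwidth(): inherit from the parent,
 * or start unlimited when there is no parent (the root group). */
void sketch_init(struct sketch_bw *b, const struct sketch_bw *parent)
{
	assert(b != NULL);
	b->quota = SKETCH_RUNTIME_INF;
	b->hierarchical_quota = parent ? parent->hierarchical_quota
				       : SKETCH_RUNTIME_INF;
}

/* Pre-patch cgroup2 rule: a signed min, so -1 ("no limit") always wins. */
int64_t sketch_old_v2(int64_t quota, int64_t parent_quota)
{
	return sketch_min(quota, parent_quota);
}

/* Patched cgroup2 rule: take the min over the values that are real limits. */
int64_t sketch_new_v2(int64_t quota, int64_t parent_quota)
{
	if (quota == SKETCH_RUNTIME_INF)
		return parent_quota;
	if (parent_quota != SKETCH_RUNTIME_INF)
		return sketch_min(quota, parent_quota);
	return quota;
}
```

With an unlimited root and a group whose quota is 50000, sketch_old_v2(50000,
SKETCH_RUNTIME_INF) returns -1, while sketch_new_v2() returns 50000. A child
with quota 80000, or with no quota of its own, then also resolves to 50000
under the patched rule, which matches the stated intent that the field reflect
a limit imposed at that level or by an ancestor.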