From patchwork Wed May 10 13:49:09 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: tip-bot2 for Thomas Gleixner
X-Patchwork-Id: 92144
Date: Wed, 10 May 2023 13:49:09 -0000
From: "tip-bot2 for Suren Baghdasaryan"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: sched/core] psi: remove 500ms min window size limitation for triggers
Cc: Sudarshan Rajagopalan, Suren Baghdasaryan, "Peter Zijlstra (Intel)", Michal Hocko, Johannes Weiner, x86@kernel.org, linux-kernel@vger.kernel.org
Message-ID: <168372654968.404.2819009900502474821.tip-bot2@tip-bot2>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     519fabc7aaba3f0847cf37d5f9a5740c370eb777
Gitweb:        https://git.kernel.org/tip/519fabc7aaba3f0847cf37d5f9a5740c370eb777
Author:        Suren Baghdasaryan
AuthorDate:    Thu, 02 Mar 2023 17:13:46 -08:00
Committer:     Peter Zijlstra
CommitterDate: Mon, 08 May 2023 10:58:38 +02:00

psi: remove 500ms min window size limitation for triggers

Current 500ms min window size for psi triggers limits polling interval
to 50ms to prevent polling threads from using too much cpu bandwidth by
polling too frequently. However the number of cgroups with triggers is
unlimited, so this protection can be defeated by creating multiple
cgroups with psi triggers (triggers in each cgroup are served by a single
"psimon" kernel thread).

Instead of limiting min polling period, which also limits the latency of
psi events, it's better to limit psi trigger creation to authorized users
only, like we do for system-wide psi triggers (/proc/pressure/* files can
be written only by processes with CAP_SYS_RESOURCE capability). This also
makes access rules for cgroup psi files consistent with system-wide ones.

Add a CAP_SYS_RESOURCE capability check for cgroup psi file writers and
remove the psi window min size limitation.
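The relation between the window size and the polling interval described
above can be sketched in a few lines of standalone C. This is an
illustration only, not kernel code: the constants are copied from the
patch, while window_valid_old(), window_valid_new() and poll_period_us()
are hypothetical helper names, and the period formula assumes the window
is simply divided by UPDATES_PER_WINDOW, which matches the 500ms to 50ms
figure quoted in the changelog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values taken from kernel/sched/psi.c as shown in the patch. */
#define WINDOW_MIN_US      500000u    /* removed by this commit */
#define WINDOW_MAX_US      10000000u  /* max window size is 10s */
#define UPDATES_PER_WINDOW 10u        /* 10 updates per window */

/* Hypothetical helper: the window check before this commit. */
static bool window_valid_old(uint32_t window_us)
{
    return window_us >= WINDOW_MIN_US && window_us <= WINDOW_MAX_US;
}

/* Hypothetical helper: the window check after this commit. */
static bool window_valid_new(uint32_t window_us)
{
    return window_us != 0 && window_us <= WINDOW_MAX_US;
}

/* Assumed relation: polling period = window / UPDATES_PER_WINDOW. */
static uint32_t poll_period_us(uint32_t window_us)
{
    assert(window_us != 0);
    return window_us / UPDATES_PER_WINDOW;
}
```

Under these assumptions a 500ms window gives a 50ms polling period, and a
100ms window, rejected by the old check, is accepted by the new one and
gives a 10ms period.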

Suggested-by: Sudarshan Rajagopalan
Signed-off-by: Suren Baghdasaryan
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Michal Hocko
Acked-by: Johannes Weiner
Link: https://lore.kernel.org/all/cover.1676067791.git.quic_sudaraja@quicinc.com/
---
 kernel/cgroup/cgroup.c | 12 ++++++++++++
 kernel/sched/psi.c     |  4 +---
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 625d748..b26ae20 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -3877,6 +3877,14 @@ static __poll_t cgroup_pressure_poll(struct kernfs_open_file *of,
 	return psi_trigger_poll(&ctx->psi.trigger, of->file, pt);
 }
 
+static int cgroup_pressure_open(struct kernfs_open_file *of)
+{
+	if (of->file->f_mode & FMODE_WRITE && !capable(CAP_SYS_RESOURCE))
+		return -EPERM;
+
+	return 0;
+}
+
 static void cgroup_pressure_release(struct kernfs_open_file *of)
 {
 	struct cgroup_file_ctx *ctx = of->priv;
@@ -5276,6 +5284,7 @@ static struct cftype cgroup_psi_files[] = {
 	{
 		.name = "io.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_IO]),
+		.open = cgroup_pressure_open,
 		.seq_show = cgroup_io_pressure_show,
 		.write = cgroup_io_pressure_write,
 		.poll = cgroup_pressure_poll,
@@ -5284,6 +5293,7 @@ static struct cftype cgroup_psi_files[] = {
 	{
 		.name = "memory.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_MEM]),
+		.open = cgroup_pressure_open,
 		.seq_show = cgroup_memory_pressure_show,
 		.write = cgroup_memory_pressure_write,
 		.poll = cgroup_pressure_poll,
@@ -5292,6 +5302,7 @@ static struct cftype cgroup_psi_files[] = {
 	{
 		.name = "cpu.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_CPU]),
+		.open = cgroup_pressure_open,
 		.seq_show = cgroup_cpu_pressure_show,
 		.write = cgroup_cpu_pressure_write,
 		.poll = cgroup_pressure_poll,
@@ -5301,6 +5312,7 @@ static struct cftype cgroup_psi_files[] = {
 	{
 		.name = "irq.pressure",
 		.file_offset = offsetof(struct cgroup, psi_files[PSI_IRQ]),
+		.open = cgroup_pressure_open,
 		.seq_show = cgroup_irq_pressure_show,
 		.write = cgroup_irq_pressure_write,
 		.poll = cgroup_pressure_poll,
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index e072f6b..b49af59 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -160,7 +160,6 @@ __setup("psi=", setup_psi);
 #define EXP_300s	2034	/* 1/exp(2s/300s) */
 
 /* PSI trigger definitions */
-#define WINDOW_MIN_US 500000	/* Min window size is 500ms */
 #define WINDOW_MAX_US 10000000	/* Max window size is 10s */
 #define UPDATES_PER_WINDOW 10	/* 10 updates per window */
 
@@ -1305,8 +1304,7 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
 	if (state >= PSI_NONIDLE)
 		return ERR_PTR(-EINVAL);
 
-	if (window_us < WINDOW_MIN_US ||
-		window_us > WINDOW_MAX_US)
+	if (window_us == 0 || window_us > WINDOW_MAX_US)
 		return ERR_PTR(-EINVAL);
 
 	/*
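
From userspace, the effect of the two hunks above could be exercised
roughly as follows. This is a hedged sketch, not part of the patch: the
helper names psi_format_trigger() and psi_register_trigger() are
hypothetical, and the "<some|full> <stall_us> <window_us>" trigger string
and the write of the terminating NUL follow the usage shown in the
kernel's PSI documentation. The error codes noted in the comments are
what the diff suggests; other failures (for example file permissions)
may be reported first.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Build a trigger string "<some|full> <stall_us> <window_us>".
 * Returns the string length, or -1 on a bad kind or short buffer.
 */
static int psi_format_trigger(char *buf, size_t len, const char *kind,
                              unsigned int stall_us, unsigned int window_us)
{
    if (strcmp(kind, "some") != 0 && strcmp(kind, "full") != 0)
        return -1;

    int n = snprintf(buf, len, "%s %u %u", kind, stall_us, window_us);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}

/*
 * Register a trigger on a pressure file (for example a cgroup's
 * memory.pressure). Returns an fd that can be poll()ed for POLLPRI,
 * or a negative errno value.
 */
static int psi_register_trigger(const char *path, const char *kind,
                                unsigned int stall_us, unsigned int window_us)
{
    char trig[64];
    int n = psi_format_trigger(trig, sizeof(trig), kind, stall_us, window_us);
    if (n < 0)
        return -EINVAL;

    /* With this patch, a write-mode open without CAP_SYS_RESOURCE
     * is expected to fail with EPERM. */
    int fd = open(path, O_RDWR | O_NONBLOCK);
    if (fd < 0)
        return -errno;

    /* The kernel rejects a window of 0 or above 10s with EINVAL. */
    if (write(fd, trig, (size_t)n + 1) < 0) {
        int err = errno;
        close(fd);
        return -err;
    }
    return fd;
}
```

A caller with CAP_SYS_RESOURCE could then request, say, a 100ms window
with psi_register_trigger(path, "some", 15000, 100000), which the old
500ms minimum would have rejected.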