Message ID | 167778866506.1053859.2329229096484796501.stgit@bmoger-ubuntu |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp54999wrd; Thu, 2 Mar 2023 12:29:37 -0800 (PST) X-Google-Smtp-Source: AK7set88mQGkMZmrbN5OwdmcppRIn/5hodpsYfL5TNuS9SwADs/0IcWq+YkZGCNHCew3WLiAZ785 X-Received: by 2002:a17:906:bfc9:b0:88d:ba89:1837 with SMTP id us9-20020a170906bfc900b0088dba891837mr3322014ejb.8.1677788977412; Thu, 02 Mar 2023 12:29:37 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1677788977; cv=pass; d=google.com; s=arc-20160816; b=GQdwyu96FP1frWPbFtRWbfXPpUA0bdGDCCKmzN3d9e3x0UG7ce/nA3F4fZJUKY8Qdt cDzIx2wred5qCPVvQO58NIMqXfwtARNEX3cwBCR/Mvnh9WJvQWNzl3xLclPWrYPWX1eD loG6aCOex4O21slp1paTjbJ6HOXbgFhGgqzrJ5YpkTYDY9QR26umzb6+hNhYN6I7fYUR mPm59pzIGbTm83Q0//9MlDRIVmTL1PATc3yGkqDWJ8NI3z5LVzpQdYjcAlyeWPvZDo/1 FfttYmuYP2GjUeof0nBF/K2ysp47BIDhws5uGh2eP9aTP1ApqmADGBB/NEn3p04BcGU4 +MgQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :user-agent:references:in-reply-to:message-id:date:cc:to:from :subject:dkim-signature; bh=XyUD59BITbd8fWMDNt88HkpEY9h0zk2R4AB+rXjAUjE=; b=jRM9usAFhqpfNHzYrkc3a1lCEQPSYvp1qGhMMDr/11+Bh8mgLdcFzKMmY8C5fYQfw+ j5DxYl5Xwmfl6qFVB69Oh8wnIudCHUa/qZhSemf0JGoH8MSVU7yM+2UFEE28pWvkLE53 o2gLWABqQab2/q1bHkjt8OXeqiOGLm3z2zTihIiHIzBcTMHXiHzbXN48VtyqwRTZyuz4 +71anMSI+kAfy0to0vhgIUWpylZ49+kvaKdM37C+QahNlQ78GypNtldSynT5LU2IKYo1 VAwMECElmqi3hqIdCESb87YKlJ90jlWqV48LGyUky0NmOlKreOH9UwO6g+v8YdsdqC6M B30w== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@amd.com header.s=selector1 header.b=i2o51191; arc=pass (i=1 spf=pass spfdomain=amd.com dmarc=pass fromdomain=amd.com); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amd.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id f11-20020a17090631cb00b008d7cf1f87fasi150831ejf.955.2023.03.02.12.29.15; Thu, 02 Mar 2023 12:29:37 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@amd.com header.s=selector1 header.b=i2o51191; arc=pass (i=1 spf=pass spfdomain=amd.com dmarc=pass fromdomain=amd.com); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amd.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229523AbjCBUYf (ORCPT <rfc822;davidbtadokoro@gmail.com> + 99 others); Thu, 2 Mar 2023 15:24:35 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51802 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229720AbjCBUYc (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 2 Mar 2023 15:24:32 -0500 Received: from NAM11-CO1-obe.outbound.protection.outlook.com (mail-co1nam11on2063.outbound.protection.outlook.com [40.107.220.63]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 921AB57D02; Thu, 2 Mar 2023 12:24:31 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=SAUI2dNXeTJHU0OjKJOLgrziHF5lXs66q1KrzKGkbokZmOWF07SJxdkW3+lXYToPqkS2N5RP+lSNwXsLYtZjPnM4+K6OpK77PwMr2HwCkI2TG8qV6ysftKOop+osTDKMwANwCz4UnFI+tSUMzBRgLKlWyRc43+SjdTJBApUp6zyMZZGWyU5uT7Vyw+HMunl/lj7rkda1nr7na+450YUZBnhd5aEuQRLkGrhCWxJgI79fnFdZth7GXs4w6MkHv3OvLD3CKnL/KegPEyQNGI7FIUCgfpC//TZks4LB2Oo4z9xFri7V3UoMtEgB8/xW9/sU2ODvb34TA/hS0n5V2s59/Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=XyUD59BITbd8fWMDNt88HkpEY9h0zk2R4AB+rXjAUjE=; b=WuK3rxx6ceUQSOt9CuTXoOEOHhD4SunMVSCQEluq2CQCg032XwIItoReuZnlFMfAZO84Jk2mOhqcvNMZpY+nD0rIzxHZpHRcFMloCwfynIJ7rRxHWWWBbFXpmfMsWLxstdr+NbaMq7Uj6v3q4+AA2uVTOdUycf1Xanu3FLHWTCw/YWEdYC77jA4iRqq29y0ACxOqV/eXH3WIdalRjuijq7QPOztUuwL8OlOgSYyh9pNmseqi5HdmprYrKg57GR+49TVR1tlXtsOXH78V7MQK+7i+PPEtXsiVp4iq0nClvWsXK6F2wMTi37lMQ6AABcCzAHDtSNljCfHVXQHJIjNEPw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=quicinc.com smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=XyUD59BITbd8fWMDNt88HkpEY9h0zk2R4AB+rXjAUjE=; b=i2o51191ny+h21Jww/qy3l0R8I5Z39u45ekCf8+1+PNQmkmmCP49F8yBqsPdSqKCg3o2NjhK9UuS6KKHwPo16lo7TvdFJIdi/V4MgT1U5BqXd1pjxNUAXsgDPTiczHNYc0TIoE0o/MSbptL1XScv/3tXPbAhS5oE4uUm6cAFWQY= Received: from DM6PR13CA0011.namprd13.prod.outlook.com (2603:10b6:5:bc::24) by DM4PR12MB6111.namprd12.prod.outlook.com (2603:10b6:8:ac::21) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6134.30; Thu, 2 Mar 2023 20:24:29 +0000 Received: from DS1PEPF0000E630.namprd02.prod.outlook.com (2603:10b6:5:bc:cafe::ee) by DM6PR13CA0011.outlook.office365.com (2603:10b6:5:bc::24) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6178.6 via Frontend Transport; Thu, 2 Mar 2023 20:24:28 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by DS1PEPF0000E630.mail.protection.outlook.com (10.167.17.134) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6156.12 via Frontend Transport; Thu, 2 Mar 2023 20:24:28 +0000 Received: from [127.0.1.1] (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.34; Thu, 2 Mar 2023 14:24:25 -0600 Subject: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group at once From: Babu Moger <babu.moger@amd.com> To: <corbet@lwn.net>, <reinette.chatre@intel.com>, <tglx@linutronix.de>, <mingo@redhat.com>, <bp@alien8.de> CC: <fenghua.yu@intel.com>, <dave.hansen@linux.intel.com>, <x86@kernel.org>, <hpa@zytor.com>, <paulmck@kernel.org>, <akpm@linux-foundation.org>, <quic_neeraju@quicinc.com>, <rdunlap@infradead.org>, <damien.lemoal@opensource.wdc.com>, <songmuchun@bytedance.com>, <peterz@infradead.org>, <jpoimboe@kernel.org>, <pbonzini@redhat.com>, <babu.moger@amd.com>, <chang.seok.bae@intel.com>, <pawan.kumar.gupta@linux.intel.com>, <jmattson@google.com>, <daniel.sneddon@linux.intel.com>, <sandipan.das@amd.com>, <tony.luck@intel.com>, <james.morse@arm.com>, <linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>, <bagasdotme@gmail.com>, <eranian@google.com>, <christophe.leroy@csgroup.eu>, <pawan.kumar.gupta@linux.intel.com>, <jarkko@kernel.org>, <adrian.hunter@intel.com>, <quic_jiles@quicinc.com>, <peternewman@google.com>, <babu.moger@amd.com> Date: Thu, 2 Mar 2023 14:24:25 -0600 Message-ID: <167778866506.1053859.2329229096484796501.stgit@bmoger-ubuntu> In-Reply-To: <167778850105.1053859.14596357862185564029.stgit@bmoger-ubuntu> References: <167778850105.1053859.14596357862185564029.stgit@bmoger-ubuntu> User-Agent: StGit/1.1.dev103+g5369f4c MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB04.amd.com (10.181.40.145) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS1PEPF0000E630:EE_|DM4PR12MB6111:EE_ X-MS-Office365-Filtering-Correlation-Id: b499b001-eaf1-4c80-63c9-08db1b5c1fda X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: Z2rqcRKCj2s1GXFex7gOBcHvkM+FlSgqQUO/rZOUTdWr5mKHslbqciOHWyoQnH8+KZKXGJPqUZHY0u/W1CSr0mS7Wx9drv24GXwAPhwk8ciGGshMiUutkpoPEHRMp7WZucq3qpNXdWCoASkF71ZwMtFw6aU1XKssw177Y4W6/0JWWKseeQKwI8SwP2lNjkxvZwX6lNlBL/3gYF2HNmt/i7uCow+9IQ2N8C9HGLv0fO9ycJTJZGjh5VSDaVWjx4DmZVfZlH+ROEJ3w0rKI1DJG8VnXBcYDJwxv1Stix50dTaG+J8FM0iTDua06sdsw4SmlIYqbCpLHxae5TITihXpqpPJ2FQNfN66orH4ButhJuczQgPzHrj8JL9n6U4uEhaes8Teqs6RU7Qx+S5hUvFZwTwSYUZd3u1Gw9EwoPjXeREbrgMfB5B2DAUsFzhCn1AtQTQu3mCMhjxIIFhH/D/NHLi1Yu4cNhMHODhzvNB2lILznVyJiwfW6VkDv4zq5+zuvYL5YY82voD7MLY0K9FMoIEOdvZ7ROuajli0qljjERMfcWfpxwwJUuprayqigawPNJwBmOZVdk5IlHKNXCxuWBsGTs0BNNS5DP5W7+j3TsgtvvTEeSjk6AmLUOtylB+f4r0turFOD+u4eVHiclaQJc4uguMxETUlBJ3Wodalw+A8b5RZMleTEUSmQkhXJ5bGcOAbFlO7zgxD87cOetfI6s62HXeJ34PYbFK3lcDOH2BDVWWKPE/o5+EH5/Q9Eyat X-Forefront-Antispam-Report: CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230025)(7916004)(4636009)(396003)(376002)(39860400002)(346002)(136003)(451199018)(40470700004)(46966006)(36840700001)(70586007)(4326008)(40480700001)(70206006)(8676002)(2906002)(54906003)(16576012)(82310400005)(81166007)(110136005)(316002)(41300700001)(33716001)(36860700001)(44832011)(83380400001)(426003)(47076005)(336012)(82740400003)(26005)(478600001)(86362001)(103116003)(356005)(16526019)(9686003)(186003)(40460700003)(8936002)(7416002)(66899018)(7406005)(5660300002)(71626013)(36900700001);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 02 Mar 2023 20:24:28.7466 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: b499b001-eaf1-4c80-63c9-08db1b5c1fda X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: DS1PEPF0000E630.namprd02.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM4PR12MB6111 X-Spam-Status: No, score=-1.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FORGED_SPF_HELO, RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_NONE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759289255022833600?= X-GMAIL-MSGID: =?utf-8?q?1759289255022833600?= |
Series |
x86/resctrl: Miscellaneous resctrl features
|
|
Commit Message
Moger, Babu
March 2, 2023, 8:24 p.m. UTC
The resctrl task assignment for MONITOR or CONTROL group needs to be
done one at a time. For example:
$mount -t resctrl resctrl /sys/fs/resctrl/
$mkdir /sys/fs/resctrl/clos1
$echo 123 > /sys/fs/resctrl/clos1/tasks
$echo 456 > /sys/fs/resctrl/clos1/tasks
$echo 789 > /sys/fs/resctrl/clos1/tasks
This is not user-friendly when dealing with hundreds of tasks. Also,
there is a syscall overhead for each command executed from user space.
It can be improved by supporting the multiple task assignment in one
command with the tasks separated by commas. For example:
$echo 123,456,789 > /sys/fs/resctrl/clos1/tasks
Signed-off-by: Babu Moger <babu.moger@amd.com>
---
Documentation/x86/resctrl.rst | 11 +++++++++--
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 24 +++++++++++++++++++++++-
2 files changed, 32 insertions(+), 3 deletions(-)
Comments
Hi Babu, On 3/2/2023 12:24 PM, Babu Moger wrote: > The resctrl task assignment for MONITOR or CONTROL group needs to be > done one at a time. For example: > > $mount -t resctrl resctrl /sys/fs/resctrl/ > $mkdir /sys/fs/resctrl/clos1 > $echo 123 > /sys/fs/resctrl/clos1/tasks > $echo 456 > /sys/fs/resctrl/clos1/tasks > $echo 789 > /sys/fs/resctrl/clos1/tasks > > This is not user-friendly when dealing with hundreds of tasks. Also, > there is a syscall overhead for each command executed from user space. To support this change it may also be helpful to add that moving tasks take the mutex so attempting to move tasks in parallel will not achieve a significant performance gain. > > It can be improved by supporting the multiple task assignment in one > command with the tasks separated by commas. For example: > > $echo 123,456,789 > /sys/fs/resctrl/clos1/tasks > > Signed-off-by: Babu Moger <babu.moger@amd.com> > --- > Documentation/x86/resctrl.rst | 11 +++++++++-- > arch/x86/kernel/cpu/resctrl/rdtgroup.c | 24 +++++++++++++++++++++++- > 2 files changed, 32 insertions(+), 3 deletions(-) > > diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst > index 058257dc56c8..25203f20002d 100644 > --- a/Documentation/x86/resctrl.rst > +++ b/Documentation/x86/resctrl.rst > @@ -292,13 +292,20 @@ All groups contain the following files: > "tasks": > Reading this file shows the list of all tasks that belong to > this group. Writing a task id to the file will add a task to the > - group. If the group is a CTRL_MON group the task is removed from > + group. Multiple tasks can be assigned together in one command by > + inputting the tasks separated by commas. Tasks will be assigned How about "tasks separated" -> "task ids separated" or "by inputting the tasks separated by commas" -> "by separating the task ids with commas" > + sequentially in the order it is provided. Failure while assigning > + the tasks will be aborted immediately. The tasks before the failure > + will be assigned and the tasks next in the sequence will not be > + assigned. Users may need to retry them again. The failure details > + will be logged in resctrl/info/last_cmd_status file. Please use full path as is done in rest of doc. > + > + If the group is a CTRL_MON group the task is removed from > whichever previous CTRL_MON group owned the task and also from > any MON group that owned the task. If the group is a MON group, > then the task must already belong to the CTRL_MON parent of this > group. The task is removed from any previous MON group. > > - Why is this line removal needed? > "cpus": > Reading this file shows a bitmask of the logical CPUs owned by > this group. Writing a mask to this file will add and remove > diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c > index e2c1599d1b37..15ea5b550fe9 100644 > --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c > +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c > @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of, > char *buf, size_t nbytes, loff_t off) > { > struct rdtgroup *rdtgrp; > + char *pid_str; > int ret = 0; > pid_t pid; > > - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0) > + /* Valid input requires a trailing newline */ > + if (nbytes == 0 || buf[nbytes - 1] != '\n') > return -EINVAL; The resctrl files should be seen as user space API. With the above change you take an interface that did not require a newline and dictate that it should have a trailing newline. How convinced are you that this does not break any current user space scripts or applications? Why does this feature require a trailing newline? > + > + buf[nbytes - 1] = '\0'; > + > rdtgrp = rdtgroup_kn_lock_live(of->kn); > if (!rdtgrp) { > rdtgroup_kn_unlock(of->kn); > return -ENOENT; > } > + > +next: > + if (!buf || buf[0] == '\0') > + goto unlock; > + > + pid_str = strim(strsep(&buf, ",")); > + Could lib/cmdline.c:get_option() be useful? > + if (kstrtoint(pid_str, 0, &pid) || pid < 0) { > + rdt_last_cmd_printf("Invalid pid %d value\n", pid); This is risky. What will pid be if kstrtoint() failed? > + ret = -EINVAL; > + goto unlock; > + } > + > rdt_last_cmd_clear(); > > if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || > @@ -703,6 +721,10 @@ static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of, > } > > ret = rdtgroup_move_task(pid, rdtgrp, of); > + if (ret) > + goto unlock; > + else > + goto next; > The documentation states "The failure details will be logged in resctrl/info/last_cmd_status file." but I do not see how this is happening here. From what I can tell this implementation does not do anything beyond what last_cmd_status already does so any special mention in the docs is not clear to me. The cover letter stated "Added pid in last_cmd_status when applicable." - it sounded as though last_cmd_status would contain the error with the pid that encountered the error but I do not see this happening here. > unlock: > rdtgroup_kn_unlock(of->kn); > > Reinette
[AMD Official Use Only - General] Hi Reinette, > -----Original Message----- > From: Reinette Chatre <reinette.chatre@intel.com> > Sent: Wednesday, March 15, 2023 1:33 PM > To: Moger, Babu <Babu.Moger@amd.com>; corbet@lwn.net; > tglx@linutronix.de; mingo@redhat.com; bp@alien8.de > Cc: fenghua.yu@intel.com; dave.hansen@linux.intel.com; x86@kernel.org; > hpa@zytor.com; paulmck@kernel.org; akpm@linux-foundation.org; > quic_neeraju@quicinc.com; rdunlap@infradead.org; > damien.lemoal@opensource.wdc.com; songmuchun@bytedance.com; > peterz@infradead.org; jpoimboe@kernel.org; pbonzini@redhat.com; > chang.seok.bae@intel.com; pawan.kumar.gupta@linux.intel.com; > jmattson@google.com; daniel.sneddon@linux.intel.com; Das1, Sandipan > <Sandipan.Das@amd.com>; tony.luck@intel.com; james.morse@arm.com; > linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; > bagasdotme@gmail.com; eranian@google.com; christophe.leroy@csgroup.eu; > jarkko@kernel.org; adrian.hunter@intel.com; quic_jiles@quicinc.com; > peternewman@google.com > Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group > at once > > Hi Babu, > > On 3/2/2023 12:24 PM, Babu Moger wrote: > > The resctrl task assignment for MONITOR or CONTROL group needs to be > > done one at a time. For example: > > > > $mount -t resctrl resctrl /sys/fs/resctrl/ > > $mkdir /sys/fs/resctrl/clos1 > > $echo 123 > /sys/fs/resctrl/clos1/tasks > > $echo 456 > /sys/fs/resctrl/clos1/tasks > > $echo 789 > /sys/fs/resctrl/clos1/tasks > > > > This is not user-friendly when dealing with hundreds of tasks. Also, > > there is a syscall overhead for each command executed from user space. > > To support this change it may also be helpful to add that moving tasks take the > mutex so attempting to move tasks in parallel will not achieve a significant > performance gain. Agree. It may not be significant performance gain. Will remove this line. > > > > > It can be improved by supporting the multiple task assignment in one > > command with the tasks separated by commas. For example: > > > > $echo 123,456,789 > /sys/fs/resctrl/clos1/tasks > > > > Signed-off-by: Babu Moger <babu.moger@amd.com> > > --- > > Documentation/x86/resctrl.rst | 11 +++++++++-- > > arch/x86/kernel/cpu/resctrl/rdtgroup.c | 24 +++++++++++++++++++++++- > > 2 files changed, 32 insertions(+), 3 deletions(-) > > > > diff --git a/Documentation/x86/resctrl.rst > > b/Documentation/x86/resctrl.rst index 058257dc56c8..25203f20002d > > 100644 > > --- a/Documentation/x86/resctrl.rst > > +++ b/Documentation/x86/resctrl.rst > > @@ -292,13 +292,20 @@ All groups contain the following files: > > "tasks": > > Reading this file shows the list of all tasks that belong to > > this group. Writing a task id to the file will add a task to the > > - group. If the group is a CTRL_MON group the task is removed from > > + group. Multiple tasks can be assigned together in one command by > > + inputting the tasks separated by commas. Tasks will be assigned > > How about "tasks separated" -> "task ids separated" or "by inputting the tasks > separated by commas" -> "by separating the task ids with commas" Will change it to " Multiple tasks can be assigned together in one command by separating the task ids with commas." > > + sequentially in the order it is provided. Failure while assigning > > + the tasks will be aborted immediately. The tasks before the failure > > + will be assigned and the tasks next in the sequence will not be > > + assigned. Users may need to retry them again. The failure details > > + will be logged in resctrl/info/last_cmd_status file. > > Please use full path as is done in rest of doc. Ok. Sure > > > + > > + If the group is a CTRL_MON group the task is removed from > > whichever previous CTRL_MON group owned the task and also from > > any MON group that owned the task. If the group is a MON group, > > then the task must already belong to the CTRL_MON parent of this > > group. The task is removed from any previous MON group. > > > > - > > Why is this line removal needed? Not needed. > > > "cpus": > > Reading this file shows a bitmask of the logical CPUs owned by > > this group. Writing a mask to this file will add and remove diff > > --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c > > b/arch/x86/kernel/cpu/resctrl/rdtgroup.c > > index e2c1599d1b37..15ea5b550fe9 100644 > > --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c > > +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c > > @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct > kernfs_open_file *of, > > char *buf, size_t nbytes, loff_t off) { > > struct rdtgroup *rdtgrp; > > + char *pid_str; > > int ret = 0; > > pid_t pid; > > > > - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0) > > + /* Valid input requires a trailing newline */ > > + if (nbytes == 0 || buf[nbytes - 1] != '\n') > > return -EINVAL; > > The resctrl files should be seen as user space API. With the above change you > take an interface that did not require a newline and dictate that it should have > a trailing newline. How convinced are you that this does not break any current > user space scripts or applications? Why does this feature require a trailing > newline? I have tested these changes with intel_cmt_cat tool. It didn’t have any problems. We are already doing newline check for few other inputs. > > > + > > + buf[nbytes - 1] = '\0'; > > + > > rdtgrp = rdtgroup_kn_lock_live(of->kn); > > if (!rdtgrp) { > > rdtgroup_kn_unlock(of->kn); > > return -ENOENT; > > } > > + > > +next: > > + if (!buf || buf[0] == '\0') > > + goto unlock; > > + > > + pid_str = strim(strsep(&buf, ",")); > > + > > Could lib/cmdline.c:get_option() be useful? Yes. We could that also. May not be required for the simple case like this. > > > + if (kstrtoint(pid_str, 0, &pid) || pid < 0) { > > + rdt_last_cmd_printf("Invalid pid %d value\n", pid); > > This is risky. What will pid be if kstrtoint() failed? Yea. I need to separate these failure cases. One is parsing error, and another is invalid pid. > > > + ret = -EINVAL; > > + goto unlock; > > + } > > + > > rdt_last_cmd_clear(); > > > > if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6 > +721,10 @@ > > static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of, > > } > > > > ret = rdtgroup_move_task(pid, rdtgrp, of); > > + if (ret) > > + goto unlock; > > + else > > + goto next; > > > > The documentation states "The failure details will be logged in > resctrl/info/last_cmd_status file." but I do not see how this is happening here. > From what I can tell this implementation does not do anything beyond what > last_cmd_status already does so any special mention in the docs is not clear to > me. The cover letter stated "Added pid in last_cmd_status when applicable." - it > sounded as though last_cmd_status would contain the error with the pid that > encountered the error but I do not see this happening here. You are right we are not doing anything special here. pid failures error was already there. I will have to change the text here. Thanks Babu > > > unlock: > > rdtgroup_kn_unlock(of->kn); > > > > > > Reinette
Hi Babu, On 3/16/2023 9:27 AM, Moger, Babu wrote: >> -----Original Message----- >> From: Reinette Chatre <reinette.chatre@intel.com> >> Sent: Wednesday, March 15, 2023 1:33 PM >> To: Moger, Babu <Babu.Moger@amd.com>; corbet@lwn.net; >> tglx@linutronix.de; mingo@redhat.com; bp@alien8.de >> Cc: fenghua.yu@intel.com; dave.hansen@linux.intel.com; x86@kernel.org; >> hpa@zytor.com; paulmck@kernel.org; akpm@linux-foundation.org; >> quic_neeraju@quicinc.com; rdunlap@infradead.org; >> damien.lemoal@opensource.wdc.com; songmuchun@bytedance.com; >> peterz@infradead.org; jpoimboe@kernel.org; pbonzini@redhat.com; >> chang.seok.bae@intel.com; pawan.kumar.gupta@linux.intel.com; >> jmattson@google.com; daniel.sneddon@linux.intel.com; Das1, Sandipan >> <Sandipan.Das@amd.com>; tony.luck@intel.com; james.morse@arm.com; >> linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; >> bagasdotme@gmail.com; eranian@google.com; christophe.leroy@csgroup.eu; >> jarkko@kernel.org; adrian.hunter@intel.com; quic_jiles@quicinc.com; >> peternewman@google.com >> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group >> at once >> >> Hi Babu, >> >> On 3/2/2023 12:24 PM, Babu Moger wrote: >>> The resctrl task assignment for MONITOR or CONTROL group needs to be >>> done one at a time. For example: >>> >>> $mount -t resctrl resctrl /sys/fs/resctrl/ >>> $mkdir /sys/fs/resctrl/clos1 >>> $echo 123 > /sys/fs/resctrl/clos1/tasks >>> $echo 456 > /sys/fs/resctrl/clos1/tasks >>> $echo 789 > /sys/fs/resctrl/clos1/tasks >>> >>> This is not user-friendly when dealing with hundreds of tasks. Also, >>> there is a syscall overhead for each command executed from user space. >> >> To support this change it may also be helpful to add that moving tasks take the >> mutex so attempting to move tasks in parallel will not achieve a significant >> performance gain. > > Agree. It may not be significant performance gain. Will remove this line. It does not sound as though you are actually responding to my comment. >>> --- a/Documentation/x86/resctrl.rst >>> +++ b/Documentation/x86/resctrl.rst >>> @@ -292,13 +292,20 @@ All groups contain the following files: >>> "tasks": >>> Reading this file shows the list of all tasks that belong to >>> this group. Writing a task id to the file will add a task to the >>> - group. If the group is a CTRL_MON group the task is removed from >>> + group. Multiple tasks can be assigned together in one command by >>> + inputting the tasks separated by commas. Tasks will be assigned >> >> How about "tasks separated" -> "task ids separated" or "by inputting the tasks >> separated by commas" -> "by separating the task ids with commas" > > > Will change it to " Multiple tasks can be assigned together in one command by separating the task ids with commas." I would drop the "together" since it implies that this is somehow atomic. It will also improve reading by using consistent terminology - note how the text switches from "add" to "assign". Something like "Multiple tasks can be added by separating the task ids with commas." I think using "command" is confusing since what is actually done is writing text to a file. ... >>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c >>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c >>> @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct >> kernfs_open_file *of, >>> char *buf, size_t nbytes, loff_t off) { >>> struct rdtgroup *rdtgrp; >>> + char *pid_str; >>> int ret = 0; >>> pid_t pid; >>> >>> - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0) >>> + /* Valid input requires a trailing newline */ >>> + if (nbytes == 0 || buf[nbytes - 1] != '\n') >>> return -EINVAL; >> >> The resctrl files should be seen as user space API. With the above change you >> take an interface that did not require a newline and dictate that it should have >> a trailing newline. How convinced are you that this does not break any current >> user space scripts or applications? Why does this feature require a trailing >> newline? > > I have tested these changes with intel_cmt_cat tool. It didn’t have any problems. > We are already doing newline check for few other inputs. You tested this with the _one_ user space tool that you use. This is not sufficient to be convincing that this change has no impact. I do not believe that it is a valid argument that other inputs do a newline check. This input never required a newline check and it is not clear why this change now requires it. It seems that this is an unnecessary new requirement that runs the risk of breaking an existing application. I would like to ask again: How convinced are you that this does not break _any_ current user space scripts or applications? Why does this feature require a trailing newline? > >> >>> + >>> + buf[nbytes - 1] = '\0'; >>> + >>> rdtgrp = rdtgroup_kn_lock_live(of->kn); >>> if (!rdtgrp) { >>> rdtgroup_kn_unlock(of->kn); >>> return -ENOENT; >>> } >>> + >>> +next: >>> + if (!buf || buf[0] == '\0') >>> + goto unlock; >>> + >>> + pid_str = strim(strsep(&buf, ",")); >>> + >> >> Could lib/cmdline.c:get_option() be useful? > > Yes. We could that also. May not be required for the simple case like this. Please keep an eye out for how much of it you end up duplicating .... >>> + ret = -EINVAL; >>> + goto unlock; >>> + } >>> + >>> rdt_last_cmd_clear(); >>> >>> if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6 >> +721,10 @@ >>> static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of, >>> } >>> >>> ret = rdtgroup_move_task(pid, rdtgrp, of); >>> + if (ret) >>> + goto unlock; >>> + else >>> + goto next; >>> >> >> The documentation states "The failure details will be logged in >> resctrl/info/last_cmd_status file." but I do not see how this is happening here. >> From what I can tell this implementation does not do anything beyond what >> last_cmd_status already does so any special mention in the docs is not clear to >> me. The cover letter stated "Added pid in last_cmd_status when applicable." - it >> sounded as though last_cmd_status would contain the error with the pid that >> encountered the error but I do not see this happening here. > > You are right we are not doing anything special here. pid failures error was already there. > I will have to change the text here. What do you mean with "pid failures error was already there"? From what I understand your goal is to communicate to the user which pid encountered the error and I do not see that done. How will user know which pid encountered a failure? Reinette
Hi Reinette, On 3/16/23 12:12, Reinette Chatre wrote: > Hi Babu, > > On 3/16/2023 9:27 AM, Moger, Babu wrote: >>> -----Original Message----- >>> From: Reinette Chatre <reinette.chatre@intel.com> >>> Sent: Wednesday, March 15, 2023 1:33 PM >>> To: Moger, Babu <Babu.Moger@amd.com>; corbet@lwn.net; >>> tglx@linutronix.de; mingo@redhat.com; bp@alien8.de >>> Cc: fenghua.yu@intel.com; dave.hansen@linux.intel.com; x86@kernel.org; >>> hpa@zytor.com; paulmck@kernel.org; akpm@linux-foundation.org; >>> quic_neeraju@quicinc.com; rdunlap@infradead.org; >>> damien.lemoal@opensource.wdc.com; songmuchun@bytedance.com; >>> peterz@infradead.org; jpoimboe@kernel.org; pbonzini@redhat.com; >>> chang.seok.bae@intel.com; pawan.kumar.gupta@linux.intel.com; >>> jmattson@google.com; daniel.sneddon@linux.intel.com; Das1, Sandipan >>> <Sandipan.Das@amd.com>; tony.luck@intel.com; james.morse@arm.com; >>> linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; >>> bagasdotme@gmail.com; eranian@google.com; christophe.leroy@csgroup.eu; >>> jarkko@kernel.org; adrian.hunter@intel.com; quic_jiles@quicinc.com; >>> peternewman@google.com >>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group >>> at once >>> >>> Hi Babu, >>> >>> On 3/2/2023 12:24 PM, Babu Moger wrote: >>>> The resctrl task assignment for MONITOR or CONTROL group needs to be >>>> done one at a time. For example: >>>> >>>> $mount -t resctrl resctrl /sys/fs/resctrl/ >>>> $mkdir /sys/fs/resctrl/clos1 >>>> $echo 123 > /sys/fs/resctrl/clos1/tasks >>>> $echo 456 > /sys/fs/resctrl/clos1/tasks >>>> $echo 789 > /sys/fs/resctrl/clos1/tasks >>>> >>>> This is not user-friendly when dealing with hundreds of tasks. Also, >>>> there is a syscall overhead for each command executed from user space. >>> >>> To support this change it may also be helpful to add that moving tasks take the >>> mutex so attempting to move tasks in parallel will not achieve a significant >>> performance gain. >> >> Agree. It may not be significant performance gain. Will remove this line. > > It does not sound as though you are actually responding to my comment. I am confused. I am already saying there is syscall overhead for each command if we move the tasks one by one. Now do you want me to add "moving tasks take the mutex so attempting to move tasks in parallel will not achieve a significant performance gain". It is contradictory, So, I wanted to remove the line about performance. Did I still miss something? > >>>> --- a/Documentation/x86/resctrl.rst >>>> +++ b/Documentation/x86/resctrl.rst >>>> @@ -292,13 +292,20 @@ All groups contain the following files: >>>> "tasks": >>>> Reading this file shows the list of all tasks that belong to >>>> this group. Writing a task id to the file will add a task to the >>>> - group. If the group is a CTRL_MON group the task is removed from >>>> + group. Multiple tasks can be assigned together in one command by >>>> + inputting the tasks separated by commas. Tasks will be assigned >>> >>> How about "tasks separated" -> "task ids separated" or "by inputting the tasks >>> separated by commas" -> "by separating the task ids with commas" >> >> >> Will change it to " Multiple tasks can be assigned together in one command by separating the task ids with commas." > > I would drop the "together" since it implies that this is > somehow atomic. It will also improve reading by using consistent terminology - > note how the text switches from "add" to "assign". Something like > "Multiple tasks can be added by separating the task ids with commas." I think > using "command" is confusing since what is actually done is writing > text to a file. Ok. Sure. Will change it. > > ... > >>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c >>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c >>>> @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct >>> kernfs_open_file *of, >>>> char *buf, size_t nbytes, loff_t off) { >>>> struct rdtgroup *rdtgrp; >>>> + char *pid_str; >>>> int ret = 0; >>>> pid_t pid; >>>> >>>> - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0) >>>> + /* Valid input requires a trailing newline */ >>>> + if (nbytes == 0 || buf[nbytes - 1] != '\n') >>>> return -EINVAL; >>> >>> The resctrl files should be seen as user space API. With the above change you >>> take an interface that did not require a newline and dictate that it should have >>> a trailing newline. How convinced are you that this does not break any current >>> user space scripts or applications? Why does this feature require a trailing >>> newline? >> >> I have tested these changes with intel_cmt_cat tool. It didn’t have any problems. >> We are already doing newline check for few other inputs. > > You tested this with the _one_ user space tool that you use. This is not sufficient > to be convincing that this change has no impact. I do not believe that it is a valid > argument that other inputs do a newline check. This input never required a newline > check and it is not clear why this change now requires it. It seems that this is an > unnecessary new requirement that runs the risk of breaking an existing application. > > I would like to ask again: How convinced are you that this does not break _any_ current > user space scripts or applications? Why does this feature require a trailing > newline? I do not know of any other tool using resctrl fs. So, you want me to drop the newline requirement for this. I can try that. Will let you know how it goes. > >> >>> >>>> + >>>> + buf[nbytes - 1] = '\0'; >>>> + >>>> rdtgrp = rdtgroup_kn_lock_live(of->kn); >>>> if (!rdtgrp) { >>>> rdtgroup_kn_unlock(of->kn); >>>> return -ENOENT; >>>> } >>>> + >>>> +next: >>>> + if (!buf || buf[0] == '\0') >>>> + goto unlock; >>>> + >>>> + pid_str = strim(strsep(&buf, ",")); >>>> + >>> >>> Could lib/cmdline.c:get_option() be useful? >> >> Yes. We could that also. May not be required for the simple case like this. > > Please keep an eye out for how much of it you end up duplicating .... Using the get_options will require at least two calls(one to get the length and then read the integers). Also need to allocate the integers array dynamically. That is lot code if we are going that route. > >>>> + ret = -EINVAL; >>>> + goto unlock; >>>> + } >>>> + >>>> rdt_last_cmd_clear(); >>>> >>>> if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6 >>> +721,10 @@ >>>> static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of, >>>> } >>>> >>>> ret = rdtgroup_move_task(pid, rdtgrp, of); >>>> + if (ret) >>>> + goto unlock; >>>> + else >>>> + goto next; >>>> >>> >>> The documentation states "The failure details will be logged in >>> resctrl/info/last_cmd_status file." but I do not see how this is happening here. >>> From what I can tell this implementation does not do anything beyond what >>> last_cmd_status already does so any special mention in the docs is not clear to >>> me. The cover letter stated "Added pid in last_cmd_status when applicable." - it >>> sounded as though last_cmd_status would contain the error with the pid that >>> encountered the error but I do not see this happening here. >> >> You are right we are not doing anything special here. pid failures error was already there. >> I will have to change the text here. > > What do you mean with "pid failures error was already there"? From what > I understand your goal is to communicate to the user which pid > encountered the error and I do not see that done. How will user know > which pid encountered a failure? We only have couple of failures to take here. Those failures are already handled by rdtgroup_move_task. It logs the pid for failure(using rdt_last_cmd_printf). I can say "The failure pid will be logged in /sys/fs/resctrl/info/last_cmd_status file."
Hi Babu, On 3/16/2023 12:51 PM, Moger, Babu wrote: > On 3/16/23 12:12, Reinette Chatre wrote: >> On 3/16/2023 9:27 AM, Moger, Babu wrote: >>>> -----Original Message----- >>>> From: Reinette Chatre <reinette.chatre@intel.com> >>>> Sent: Wednesday, March 15, 2023 1:33 PM >>>> To: Moger, Babu <Babu.Moger@amd.com>; corbet@lwn.net; >>>> tglx@linutronix.de; mingo@redhat.com; bp@alien8.de >>>> Cc: fenghua.yu@intel.com; dave.hansen@linux.intel.com; x86@kernel.org; >>>> hpa@zytor.com; paulmck@kernel.org; akpm@linux-foundation.org; >>>> quic_neeraju@quicinc.com; rdunlap@infradead.org; >>>> damien.lemoal@opensource.wdc.com; songmuchun@bytedance.com; >>>> peterz@infradead.org; jpoimboe@kernel.org; pbonzini@redhat.com; >>>> chang.seok.bae@intel.com; pawan.kumar.gupta@linux.intel.com; >>>> jmattson@google.com; daniel.sneddon@linux.intel.com; Das1, Sandipan >>>> <Sandipan.Das@amd.com>; tony.luck@intel.com; james.morse@arm.com; >>>> linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; >>>> bagasdotme@gmail.com; eranian@google.com; christophe.leroy@csgroup.eu; >>>> jarkko@kernel.org; adrian.hunter@intel.com; quic_jiles@quicinc.com; >>>> peternewman@google.com >>>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group >>>> at once >>>> >>>> Hi Babu, >>>> >>>> On 3/2/2023 12:24 PM, Babu Moger wrote: >>>>> The resctrl task assignment for MONITOR or CONTROL group needs to be >>>>> done one at a time. For example: >>>>> >>>>> $mount -t resctrl resctrl /sys/fs/resctrl/ >>>>> $mkdir /sys/fs/resctrl/clos1 >>>>> $echo 123 > /sys/fs/resctrl/clos1/tasks >>>>> $echo 456 > /sys/fs/resctrl/clos1/tasks >>>>> $echo 789 > /sys/fs/resctrl/clos1/tasks >>>>> >>>>> This is not user-friendly when dealing with hundreds of tasks. Also, >>>>> there is a syscall overhead for each command executed from user space. >>>> >>>> To support this change it may also be helpful to add that moving tasks take the >>>> mutex so attempting to move tasks in parallel will not achieve a significant >>>> performance gain. >>> >>> Agree. It may not be significant performance gain. Will remove this line. >> >> It does not sound as though you are actually responding to my comment. > > I am confused. I am already saying there is syscall overhead for each > command if we move the tasks one by one. Now do you want me to add "moving > tasks take the mutex so attempting to move tasks in parallel will not > achieve a significant performance gain". > > It is contradictory, So, I wanted to remove the line about performance. > Did I still miss something? Where is the contradiction? Consider your example: $echo 123 > /sys/fs/resctrl/clos1/tasks $echo 456 > /sys/fs/resctrl/clos1/tasks $echo 789 > /sys/fs/resctrl/clos1/tasks Yes, there is syscall overhead for each of the above lines. My statement was in support of this work by stating that a user aiming to improve performance by attempting the above in parallel would not be able to see achieve significant performance gain since the calls would end up being serialized. You are providing two motivations (a) "user-friendly when dealing with hundreds of tasks", and (b) syscall overhead. Have you measured the improvement this solution provides? >>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c >>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c >>>>> @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct >>>> kernfs_open_file *of, >>>>> char *buf, size_t nbytes, loff_t off) { >>>>> struct rdtgroup *rdtgrp; >>>>> + char *pid_str; >>>>> int ret = 0; >>>>> pid_t pid; >>>>> >>>>> - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0) >>>>> + /* Valid input requires a trailing newline */ >>>>> + if (nbytes == 0 || buf[nbytes - 1] != '\n') >>>>> return -EINVAL; >>>> >>>> The resctrl files should be seen as user space API. With the above change you >>>> take an interface that did not require a newline and dictate that it should have >>>> a trailing newline. How convinced are you that this does not break any current >>>> user space scripts or applications? Why does this feature require a trailing >>>> newline? >>> >>> I have tested these changes with intel_cmt_cat tool. It didn’t have any problems. >>> We are already doing newline check for few other inputs. >> >> You tested this with the _one_ user space tool that you use. This is not sufficient >> to be convincing that this change has no impact. I do not believe that it is a valid >> argument that other inputs do a newline check. This input never required a newline >> check and it is not clear why this change now requires it. It seems that this is an >> unnecessary new requirement that runs the risk of breaking an existing application. >> >> I would like to ask again: How convinced are you that this does not break _any_ current >> user space scripts or applications? Why does this feature require a trailing >> newline? > > I do not know of any other tool using resctrl fs. > So, you want me to drop the newline requirement for this. I can try that. > Will let you know how it goes. You continue to avoid my question about why this requires a newline. Until I learn why this is required, yes, from what I can tell based on this patch this requirement can and should be dropped. >>>>> + >>>>> + buf[nbytes - 1] = '\0'; >>>>> + >>>>> rdtgrp = rdtgroup_kn_lock_live(of->kn); >>>>> if (!rdtgrp) { >>>>> rdtgroup_kn_unlock(of->kn); >>>>> return -ENOENT; >>>>> } >>>>> + >>>>> +next: >>>>> + if (!buf || buf[0] == '\0') >>>>> + goto unlock; >>>>> + >>>>> + pid_str = strim(strsep(&buf, ",")); >>>>> + >>>> >>>> Could lib/cmdline.c:get_option() be useful? >>> >>> Yes. We could that also. May not be required for the simple case like this. >> >> Please keep an eye out for how much of it you end up duplicating .... > > Using the get_options will require at least two calls(one to get the > length and then read the integers). Also need to allocate the integers > array dynamically. That is lot code if we are going that route. > I did not ask about get_options(), I asked about get_option(). >> >>>>> + ret = -EINVAL; >>>>> + goto unlock; >>>>> + } >>>>> + >>>>> rdt_last_cmd_clear(); >>>>> >>>>> if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6 >>>> +721,10 @@ >>>>> static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of, >>>>> } >>>>> >>>>> ret = rdtgroup_move_task(pid, rdtgrp, of); >>>>> + if (ret) >>>>> + goto unlock; >>>>> + else >>>>> + goto next; >>>>> >>>> >>>> The documentation states "The failure details will be logged in >>>> resctrl/info/last_cmd_status file." but I do not see how this is happening here. >>>> From what I can tell this implementation does not do anything beyond what >>>> last_cmd_status already does so any special mention in the docs is not clear to >>>> me. The cover letter stated "Added pid in last_cmd_status when applicable." - it >>>> sounded as though last_cmd_status would contain the error with the pid that >>>> encountered the error but I do not see this happening here. >>> >>> You are right we are not doing anything special here. pid failures error was already there. >>> I will have to change the text here. >> >> What do you mean with "pid failures error was already there"? From what >> I understand your goal is to communicate to the user which pid >> encountered the error and I do not see that done. How will user know >> which pid encountered a failure? > > We only have couple of failures to take here. Those failures are already > handled by rdtgroup_move_task. It logs the pid for failure(using > rdt_last_cmd_printf). The pid is only logged if there is no task with that pid. How about the error in __rdtgroup_move_task() - how will the user know which pid triggered that error? > > I can say "The failure pid will be logged in > /sys/fs/resctrl/info/last_cmd_status file." That will not be accurate. Not all errors include the pid. Reinette
Hi Reinette, On 3/16/23 15:33, Reinette Chatre wrote: > Hi Babu, > > On 3/16/2023 12:51 PM, Moger, Babu wrote: >> On 3/16/23 12:12, Reinette Chatre wrote: >>> On 3/16/2023 9:27 AM, Moger, Babu wrote: >>>>> -----Original Message----- >>>>> From: Reinette Chatre <reinette.chatre@intel.com> >>>>> Sent: Wednesday, March 15, 2023 1:33 PM >>>>> To: Moger, Babu <Babu.Moger@amd.com>; corbet@lwn.net; >>>>> tglx@linutronix.de; mingo@redhat.com; bp@alien8.de >>>>> Cc: fenghua.yu@intel.com; dave.hansen@linux.intel.com; x86@kernel.org; >>>>> hpa@zytor.com; paulmck@kernel.org; akpm@linux-foundation.org; >>>>> quic_neeraju@quicinc.com; rdunlap@infradead.org; >>>>> damien.lemoal@opensource.wdc.com; songmuchun@bytedance.com; >>>>> peterz@infradead.org; jpoimboe@kernel.org; pbonzini@redhat.com; >>>>> chang.seok.bae@intel.com; pawan.kumar.gupta@linux.intel.com; >>>>> jmattson@google.com; daniel.sneddon@linux.intel.com; Das1, Sandipan >>>>> <Sandipan.Das@amd.com>; tony.luck@intel.com; james.morse@arm.com; >>>>> linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; >>>>> bagasdotme@gmail.com; eranian@google.com; christophe.leroy@csgroup.eu; >>>>> jarkko@kernel.org; adrian.hunter@intel.com; quic_jiles@quicinc.com; >>>>> peternewman@google.com >>>>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group >>>>> at once >>>>> >>>>> Hi Babu, >>>>> >>>>> On 3/2/2023 12:24 PM, Babu Moger wrote: >>>>>> The resctrl task assignment for MONITOR or CONTROL group needs to be >>>>>> done one at a time. For example: >>>>>> >>>>>> $mount -t resctrl resctrl /sys/fs/resctrl/ >>>>>> $mkdir /sys/fs/resctrl/clos1 >>>>>> $echo 123 > /sys/fs/resctrl/clos1/tasks >>>>>> $echo 456 > /sys/fs/resctrl/clos1/tasks >>>>>> $echo 789 > /sys/fs/resctrl/clos1/tasks >>>>>> >>>>>> This is not user-friendly when dealing with hundreds of tasks. Also, >>>>>> there is a syscall overhead for each command executed from user space. >>>>> >>>>> To support this change it may also be helpful to add that moving tasks take the >>>>> mutex so attempting to move tasks in parallel will not achieve a significant >>>>> performance gain. >>>> >>>> Agree. It may not be significant performance gain. Will remove this line. >>> >>> It does not sound as though you are actually responding to my comment. >> >> I am confused. I am already saying there is syscall overhead for each >> command if we move the tasks one by one. Now do you want me to add "moving >> tasks take the mutex so attempting to move tasks in parallel will not >> achieve a significant performance gain". >> >> It is contradictory, So, I wanted to remove the line about performance. >> Did I still miss something? > > Where is the contradiction? > > Consider your example: > $echo 123 > /sys/fs/resctrl/clos1/tasks > $echo 456 > /sys/fs/resctrl/clos1/tasks > $echo 789 > /sys/fs/resctrl/clos1/tasks > > Yes, there is syscall overhead for each of the above lines. My statement was in > support of this work by stating that a user aiming to improve performance by > attempting the above in parallel would not be able to see achieve significant > performance gain since the calls would end up being serialized. ok. Sure. Will add the text. I may modify little bit. > > You are providing two motivations (a) "user-friendly when dealing with > hundreds of tasks", and (b) syscall overhead. Have you measured the > improvement this solution provides? No. I have not measured the performance improvement. > >>>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c >>>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c >>>>>> @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct >>>>> kernfs_open_file *of, >>>>>> char *buf, size_t nbytes, loff_t off) { >>>>>> struct rdtgroup *rdtgrp; >>>>>> + char *pid_str; >>>>>> int ret = 0; >>>>>> pid_t pid; >>>>>> >>>>>> - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0) >>>>>> + /* Valid input requires a trailing newline */ >>>>>> + if (nbytes == 0 || buf[nbytes - 1] != '\n') >>>>>> return -EINVAL; >>>>> >>>>> The resctrl files should be seen as user space API. With the above change you >>>>> take an interface that did not require a newline and dictate that it should have >>>>> a trailing newline. How convinced are you that this does not break any current >>>>> user space scripts or applications? Why does this feature require a trailing >>>>> newline? >>>> >>>> I have tested these changes with intel_cmt_cat tool. It didn’t have any problems. >>>> We are already doing newline check for few other inputs. >>> >>> You tested this with the _one_ user space tool that you use. This is not sufficient >>> to be convincing that this change has no impact. I do not believe that it is a valid >>> argument that other inputs do a newline check. This input never required a newline >>> check and it is not clear why this change now requires it. It seems that this is an >>> unnecessary new requirement that runs the risk of breaking an existing application. >>> >>> I would like to ask again: How convinced are you that this does not break _any_ current >>> user space scripts or applications? Why does this feature require a trailing >>> newline? I dont think this feature required trailing newline. I may have carried away from similar code in the area. I will remove that requirement. >> >> I do not know of any other tool using resctrl fs. >> So, you want me to drop the newline requirement for this. I can try that. >> Will let you know how it goes. > > You continue to avoid my question about why this requires a newline. Until > I learn why this is required, yes, from what I can tell based on this patch > this requirement can and should be dropped. > >>>>>> + >>>>>> + buf[nbytes - 1] = '\0'; >>>>>> + >>>>>> rdtgrp = rdtgroup_kn_lock_live(of->kn); >>>>>> if (!rdtgrp) { >>>>>> rdtgroup_kn_unlock(of->kn); >>>>>> return -ENOENT; >>>>>> } >>>>>> + >>>>>> +next: >>>>>> + if (!buf || buf[0] == '\0') >>>>>> + goto unlock; >>>>>> + >>>>>> + pid_str = strim(strsep(&buf, ",")); >>>>>> + >>>>> >>>>> Could lib/cmdline.c:get_option() be useful? >>>> >>>> Yes. We could that also. May not be required for the simple case like this. >>> >>> Please keep an eye out for how much of it you end up duplicating .... >> >> Using the get_options will require at least two calls(one to get the >> length and then read the integers). Also need to allocate the integers >> array dynamically. That is lot code if we are going that route. >> > > I did not ask about get_options(), I asked about get_option(). If you insist, will use get_option. But we still have to loop thru all the string till get_option returns 0. I can try that. > >>> >>>>>> + ret = -EINVAL; >>>>>> + goto unlock; >>>>>> + } >>>>>> + >>>>>> rdt_last_cmd_clear(); >>>>>> >>>>>> if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6 >>>>> +721,10 @@ >>>>>> static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of, >>>>>> } >>>>>> >>>>>> ret = rdtgroup_move_task(pid, rdtgrp, of); >>>>>> + if (ret) >>>>>> + goto unlock; >>>>>> + else >>>>>> + goto next; >>>>>> >>>>> >>>>> The documentation states "The failure details will be logged in >>>>> resctrl/info/last_cmd_status file." but I do not see how this is happening here. >>>>> From what I can tell this implementation does not do anything beyond what >>>>> last_cmd_status already does so any special mention in the docs is not clear to >>>>> me. The cover letter stated "Added pid in last_cmd_status when applicable." - it >>>>> sounded as though last_cmd_status would contain the error with the pid that >>>>> encountered the error but I do not see this happening here. >>>> >>>> You are right we are not doing anything special here. pid failures error was already there. >>>> I will have to change the text here. >>> >>> What do you mean with "pid failures error was already there"? From what >>> I understand your goal is to communicate to the user which pid >>> encountered the error and I do not see that done. How will user know >>> which pid encountered a failure? >> >> We only have couple of failures to take here. Those failures are already >> handled by rdtgroup_move_task. It logs the pid for failure(using >> rdt_last_cmd_printf). > > The pid is only logged if there is no task with that pid. How about the > error in __rdtgroup_move_task() - how will the user know which pid triggered > that error? Yes. These cases we may be able to report the pid. > >> >> I can say "The failure pid will be logged in >> /sys/fs/resctrl/info/last_cmd_status file." > > That will not be accurate. Not all errors include the pid. Can you please suggest? Thanks Babu Moger
On 3/20/23 10:07, Moger, Babu wrote: > Hi Reinette, > > On 3/16/23 15:33, Reinette Chatre wrote: >> Hi Babu, >> >> On 3/16/2023 12:51 PM, Moger, Babu wrote: >>> On 3/16/23 12:12, Reinette Chatre wrote: >>>> On 3/16/2023 9:27 AM, Moger, Babu wrote: >>>>>> -----Original Message----- >>>>>> From: Reinette Chatre <reinette.chatre@intel.com> >>>>>> Sent: Wednesday, March 15, 2023 1:33 PM >>>>>> To: Moger, Babu <Babu.Moger@amd.com>; corbet@lwn.net; >>>>>> tglx@linutronix.de; mingo@redhat.com; bp@alien8.de >>>>>> Cc: fenghua.yu@intel.com; dave.hansen@linux.intel.com; x86@kernel.org; >>>>>> hpa@zytor.com; paulmck@kernel.org; akpm@linux-foundation.org; >>>>>> quic_neeraju@quicinc.com; rdunlap@infradead.org; >>>>>> damien.lemoal@opensource.wdc.com; songmuchun@bytedance.com; >>>>>> peterz@infradead.org; jpoimboe@kernel.org; pbonzini@redhat.com; >>>>>> chang.seok.bae@intel.com; pawan.kumar.gupta@linux.intel.com; >>>>>> jmattson@google.com; daniel.sneddon@linux.intel.com; Das1, Sandipan >>>>>> <Sandipan.Das@amd.com>; tony.luck@intel.com; james.morse@arm.com; >>>>>> linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; >>>>>> bagasdotme@gmail.com; eranian@google.com; christophe.leroy@csgroup.eu; >>>>>> jarkko@kernel.org; adrian.hunter@intel.com; quic_jiles@quicinc.com; >>>>>> peternewman@google.com >>>>>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group >>>>>> at once >>>>>> >>>>>> Hi Babu, >>>>>> >>>>>> On 3/2/2023 12:24 PM, Babu Moger wrote: >>>>>>> The resctrl task assignment for MONITOR or CONTROL group needs to be >>>>>>> done one at a time. For example: >>>>>>> >>>>>>> $mount -t resctrl resctrl /sys/fs/resctrl/ >>>>>>> $mkdir /sys/fs/resctrl/clos1 >>>>>>> $echo 123 > /sys/fs/resctrl/clos1/tasks >>>>>>> $echo 456 > /sys/fs/resctrl/clos1/tasks >>>>>>> $echo 789 > /sys/fs/resctrl/clos1/tasks >>>>>>> >>>>>>> This is not user-friendly when dealing with hundreds of tasks. Also, >>>>>>> there is a syscall overhead for each command executed from user space. >>>>>> >>>>>> To support this change it may also be helpful to add that moving tasks take the >>>>>> mutex so attempting to move tasks in parallel will not achieve a significant >>>>>> performance gain. >>>>> >>>>> Agree. It may not be significant performance gain. Will remove this line. >>>> >>>> It does not sound as though you are actually responding to my comment. >>> >>> I am confused. I am already saying there is syscall overhead for each >>> command if we move the tasks one by one. Now do you want me to add "moving >>> tasks take the mutex so attempting to move tasks in parallel will not >>> achieve a significant performance gain". >>> >>> It is contradictory, So, I wanted to remove the line about performance. >>> Did I still miss something? >> >> Where is the contradiction? >> >> Consider your example: >> $echo 123 > /sys/fs/resctrl/clos1/tasks >> $echo 456 > /sys/fs/resctrl/clos1/tasks >> $echo 789 > /sys/fs/resctrl/clos1/tasks >> >> Yes, there is syscall overhead for each of the above lines. My statement was in >> support of this work by stating that a user aiming to improve performance by >> attempting the above in parallel would not be able to see achieve significant >> performance gain since the calls would end up being serialized. > > ok. Sure. Will add the text. I may modify little bit. >> >> You are providing two motivations (a) "user-friendly when dealing with >> hundreds of tasks", and (b) syscall overhead. Have you measured the >> improvement this solution provides? > > No. I have not measured the performance improvement. > >> >>>>>>> --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c >>>>>>> +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c >>>>>>> @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct >>>>>> kernfs_open_file *of, >>>>>>> char *buf, size_t nbytes, loff_t off) { >>>>>>> struct rdtgroup *rdtgrp; >>>>>>> + char *pid_str; >>>>>>> int ret = 0; >>>>>>> pid_t pid; >>>>>>> >>>>>>> - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0) >>>>>>> + /* Valid input requires a trailing newline */ >>>>>>> + if (nbytes == 0 || buf[nbytes - 1] != '\n') >>>>>>> return -EINVAL; >>>>>> >>>>>> The resctrl files should be seen as user space API. With the above change you >>>>>> take an interface that did not require a newline and dictate that it should have >>>>>> a trailing newline. How convinced are you that this does not break any current >>>>>> user space scripts or applications? Why does this feature require a trailing >>>>>> newline? >>>>> >>>>> I have tested these changes with intel_cmt_cat tool. It didn’t have any problems. >>>>> We are already doing newline check for few other inputs. >>>> >>>> You tested this with the _one_ user space tool that you use. This is not sufficient >>>> to be convincing that this change has no impact. I do not believe that it is a valid >>>> argument that other inputs do a newline check. This input never required a newline >>>> check and it is not clear why this change now requires it. It seems that this is an >>>> unnecessary new requirement that runs the risk of breaking an existing application. >>>> >>>> I would like to ask again: How convinced are you that this does not break _any_ current >>>> user space scripts or applications? Why does this feature require a trailing >>>> newline? > > I dont think this feature required trailing newline. I may have carried > away from similar code in the area. I will remove that requirement. > >>> >>> I do not know of any other tool using resctrl fs. >>> So, you want me to drop the newline requirement for this. I can try that. >>> Will let you know how it goes. >> >> You continue to avoid my question about why this requires a newline. Until >> I learn why this is required, yes, from what I can tell based on this patch >> this requirement can and should be dropped. >> >>>>>>> + >>>>>>> + buf[nbytes - 1] = '\0'; >>>>>>> + >>>>>>> rdtgrp = rdtgroup_kn_lock_live(of->kn); >>>>>>> if (!rdtgrp) { >>>>>>> rdtgroup_kn_unlock(of->kn); >>>>>>> return -ENOENT; >>>>>>> } >>>>>>> + >>>>>>> +next: >>>>>>> + if (!buf || buf[0] == '\0') >>>>>>> + goto unlock; >>>>>>> + >>>>>>> + pid_str = strim(strsep(&buf, ",")); >>>>>>> + >>>>>> >>>>>> Could lib/cmdline.c:get_option() be useful? >>>>> >>>>> Yes. We could that also. May not be required for the simple case like this. >>>> >>>> Please keep an eye out for how much of it you end up duplicating .... >>> >>> Using the get_options will require at least two calls(one to get the >>> length and then read the integers). Also need to allocate the integers >>> array dynamically. That is lot code if we are going that route. >>> >> >> I did not ask about get_options(), I asked about get_option(). > > If you insist, will use get_option. But we still have to loop thru all the > string till get_option returns 0. I can try that. > >> >>>> >>>>>>> + ret = -EINVAL; >>>>>>> + goto unlock; >>>>>>> + } >>>>>>> + >>>>>>> rdt_last_cmd_clear(); >>>>>>> >>>>>>> if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6 >>>>>> +721,10 @@ >>>>>>> static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of, >>>>>>> } >>>>>>> >>>>>>> ret = rdtgroup_move_task(pid, rdtgrp, of); >>>>>>> + if (ret) >>>>>>> + goto unlock; >>>>>>> + else >>>>>>> + goto next; >>>>>>> >>>>>> >>>>>> The documentation states "The failure details will be logged in >>>>>> resctrl/info/last_cmd_status file." but I do not see how this is happening here. >>>>>> From what I can tell this implementation does not do anything beyond what >>>>>> last_cmd_status already does so any special mention in the docs is not clear to >>>>>> me. The cover letter stated "Added pid in last_cmd_status when applicable." - it >>>>>> sounded as though last_cmd_status would contain the error with the pid that >>>>>> encountered the error but I do not see this happening here. >>>>> >>>>> You are right we are not doing anything special here. pid failures error was already there. >>>>> I will have to change the text here. >>>> >>>> What do you mean with "pid failures error was already there"? From what >>>> I understand your goal is to communicate to the user which pid >>>> encountered the error and I do not see that done. How will user know >>>> which pid encountered a failure? >>> >>> We only have couple of failures to take here. Those failures are already >>> handled by rdtgroup_move_task. It logs the pid for failure(using >>> rdt_last_cmd_printf). >> >> The pid is only logged if there is no task with that pid. How about the >> error in __rdtgroup_move_task() - how will the user know which pid triggered >> that error? > > Yes. These cases we may be able to report the pid. I meant "we may not" > >> >>> >>> I can say "The failure pid will be logged in >>> /sys/fs/resctrl/info/last_cmd_status file." >> >> That will not be accurate. Not all errors include the pid. > > Can you please suggest? > Thanks > Babu Moger
Hi Babu, On 3/20/2023 8:07 AM, Moger, Babu wrote: > On 3/16/23 15:33, Reinette Chatre wrote: >> On 3/16/2023 12:51 PM, Moger, Babu wrote: >>> On 3/16/23 12:12, Reinette Chatre wrote: >>>> On 3/16/2023 9:27 AM, Moger, Babu wrote: >>>>>> -----Original Message----- >>>>>> From: Reinette Chatre <reinette.chatre@intel.com> >>>>>> Sent: Wednesday, March 15, 2023 1:33 PM >>>>>> To: Moger, Babu <Babu.Moger@amd.com>; corbet@lwn.net; >>>>>> tglx@linutronix.de; mingo@redhat.com; bp@alien8.de >>>>>> Cc: fenghua.yu@intel.com; dave.hansen@linux.intel.com; x86@kernel.org; >>>>>> hpa@zytor.com; paulmck@kernel.org; akpm@linux-foundation.org; >>>>>> quic_neeraju@quicinc.com; rdunlap@infradead.org; >>>>>> damien.lemoal@opensource.wdc.com; songmuchun@bytedance.com; >>>>>> peterz@infradead.org; jpoimboe@kernel.org; pbonzini@redhat.com; >>>>>> chang.seok.bae@intel.com; pawan.kumar.gupta@linux.intel.com; >>>>>> jmattson@google.com; daniel.sneddon@linux.intel.com; Das1, Sandipan >>>>>> <Sandipan.Das@amd.com>; tony.luck@intel.com; james.morse@arm.com; >>>>>> linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; >>>>>> bagasdotme@gmail.com; eranian@google.com; christophe.leroy@csgroup.eu; >>>>>> jarkko@kernel.org; adrian.hunter@intel.com; quic_jiles@quicinc.com; >>>>>> peternewman@google.com >>>>>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group >>>>>> at once >>>>>> >>>>>> Hi Babu, >>>>>> >>>>>> On 3/2/2023 12:24 PM, Babu Moger wrote: >>>>>>> The resctrl task assignment for MONITOR or CONTROL group needs to be >>>>>>> done one at a time. For example: >>>>>>> >>>>>>> $mount -t resctrl resctrl /sys/fs/resctrl/ >>>>>>> $mkdir /sys/fs/resctrl/clos1 >>>>>>> $echo 123 > /sys/fs/resctrl/clos1/tasks >>>>>>> $echo 456 > /sys/fs/resctrl/clos1/tasks >>>>>>> $echo 789 > /sys/fs/resctrl/clos1/tasks >>>>>>> >>>>>>> This is not user-friendly when dealing with hundreds of tasks. Also, >>>>>>> there is a syscall overhead for each command executed from user space. >>>>>> >>>>>> To support this change it may also be helpful to add that moving tasks take the >>>>>> mutex so attempting to move tasks in parallel will not achieve a significant >>>>>> performance gain. >>>>> >>>>> Agree. It may not be significant performance gain. Will remove this line. >>>> >>>> It does not sound as though you are actually responding to my comment. >>> >>> I am confused. I am already saying there is syscall overhead for each >>> command if we move the tasks one by one. Now do you want me to add "moving >>> tasks take the mutex so attempting to move tasks in parallel will not >>> achieve a significant performance gain". >>> >>> It is contradictory, So, I wanted to remove the line about performance. >>> Did I still miss something? >> >> Where is the contradiction? >> >> Consider your example: >> $echo 123 > /sys/fs/resctrl/clos1/tasks >> $echo 456 > /sys/fs/resctrl/clos1/tasks >> $echo 789 > /sys/fs/resctrl/clos1/tasks >> >> Yes, there is syscall overhead for each of the above lines. My statement was in >> support of this work by stating that a user aiming to improve performance by >> attempting the above in parallel would not be able to see achieve significant >> performance gain since the calls would end up being serialized. > > ok. Sure. Will add the text. I may modify little bit. >> >> You are providing two motivations (a) "user-friendly when dealing with >> hundreds of tasks", and (b) syscall overhead. Have you measured the >> improvement this solution provides? > > No. I have not measured the performance improvement. The changelog makes a claim that the current implementation has overhead that is removed with this change. There is no data to support this claim. ... >>>>>>> + >>>>>>> + buf[nbytes - 1] = '\0'; >>>>>>> + >>>>>>> rdtgrp = rdtgroup_kn_lock_live(of->kn); >>>>>>> if (!rdtgrp) { >>>>>>> rdtgroup_kn_unlock(of->kn); >>>>>>> return -ENOENT; >>>>>>> } >>>>>>> + >>>>>>> +next: >>>>>>> + if (!buf || buf[0] == '\0') >>>>>>> + goto unlock; >>>>>>> + >>>>>>> + pid_str = strim(strsep(&buf, ",")); >>>>>>> + >>>>>> >>>>>> Could lib/cmdline.c:get_option() be useful? >>>>> >>>>> Yes. We could that also. May not be required for the simple case like this. >>>> >>>> Please keep an eye out for how much of it you end up duplicating .... >>> >>> Using the get_options will require at least two calls(one to get the >>> length and then read the integers). Also need to allocate the integers >>> array dynamically. That is lot code if we are going that route. >>> >> >> I did not ask about get_options(), I asked about get_option(). > > If you insist, will use get_option. But we still have to loop thru all the > string till get_option returns 0. I can try that. I just asked whether get_option() could be useful. Could you please point out what I said that made you think that I insist on this change being made? If it matches your usage, then know it is available, if it does not, then don't use it. ... >>> I can say "The failure pid will be logged in >>> /sys/fs/resctrl/info/last_cmd_status file." >> >> That will not be accurate. Not all errors include the pid. > > Can you please suggest? last_cmd_status provides a 512 char buffer to communicate details to the user. The buffer is cleared before the loop that moves all the tasks start. If an error is encountered, a detailed message is written to the buffer. One option may be to append a string to the buffer that includes the pid? Perhaps something like: rdt_last_cmd_printf("Error encountered while moving task %d\n", pid); Please feel free to improve. Reinette
Hi Reinette, On 3/20/23 11:52, Reinette Chatre wrote: > Hi Babu, > > On 3/20/2023 8:07 AM, Moger, Babu wrote: >> On 3/16/23 15:33, Reinette Chatre wrote: >>> On 3/16/2023 12:51 PM, Moger, Babu wrote: >>>> On 3/16/23 12:12, Reinette Chatre wrote: >>>>> On 3/16/2023 9:27 AM, Moger, Babu wrote: >>>>>>> -----Original Message----- >>>>>>> From: Reinette Chatre <reinette.chatre@intel.com> >>>>>>> Sent: Wednesday, March 15, 2023 1:33 PM >>>>>>> To: Moger, Babu <Babu.Moger@amd.com>; corbet@lwn.net; >>>>>>> tglx@linutronix.de; mingo@redhat.com; bp@alien8.de >>>>>>> Cc: fenghua.yu@intel.com; dave.hansen@linux.intel.com; x86@kernel.org; >>>>>>> hpa@zytor.com; paulmck@kernel.org; akpm@linux-foundation.org; >>>>>>> quic_neeraju@quicinc.com; rdunlap@infradead.org; >>>>>>> damien.lemoal@opensource.wdc.com; songmuchun@bytedance.com; >>>>>>> peterz@infradead.org; jpoimboe@kernel.org; pbonzini@redhat.com; >>>>>>> chang.seok.bae@intel.com; pawan.kumar.gupta@linux.intel.com; >>>>>>> jmattson@google.com; daniel.sneddon@linux.intel.com; Das1, Sandipan >>>>>>> <Sandipan.Das@amd.com>; tony.luck@intel.com; james.morse@arm.com; >>>>>>> linux-doc@vger.kernel.org; linux-kernel@vger.kernel.org; >>>>>>> bagasdotme@gmail.com; eranian@google.com; christophe.leroy@csgroup.eu; >>>>>>> jarkko@kernel.org; adrian.hunter@intel.com; quic_jiles@quicinc.com; >>>>>>> peternewman@google.com >>>>>>> Subject: Re: [PATCH v3 1/7] x86/resctrl: Add multiple tasks to the resctrl group >>>>>>> at once >>>>>>> >>>>>>> Hi Babu, >>>>>>> >>>>>>> On 3/2/2023 12:24 PM, Babu Moger wrote: >>>>>>>> The resctrl task assignment for MONITOR or CONTROL group needs to be >>>>>>>> done one at a time. For example: >>>>>>>> >>>>>>>> $mount -t resctrl resctrl /sys/fs/resctrl/ >>>>>>>> $mkdir /sys/fs/resctrl/clos1 >>>>>>>> $echo 123 > /sys/fs/resctrl/clos1/tasks >>>>>>>> $echo 456 > /sys/fs/resctrl/clos1/tasks >>>>>>>> $echo 789 > /sys/fs/resctrl/clos1/tasks >>>>>>>> >>>>>>>> This is not user-friendly when dealing with hundreds of tasks. Also, >>>>>>>> there is a syscall overhead for each command executed from user space. >>>>>>> >>>>>>> To support this change it may also be helpful to add that moving tasks take the >>>>>>> mutex so attempting to move tasks in parallel will not achieve a significant >>>>>>> performance gain. >>>>>> >>>>>> Agree. It may not be significant performance gain. Will remove this line. >>>>> >>>>> It does not sound as though you are actually responding to my comment. >>>> >>>> I am confused. I am already saying there is syscall overhead for each >>>> command if we move the tasks one by one. Now do you want me to add "moving >>>> tasks take the mutex so attempting to move tasks in parallel will not >>>> achieve a significant performance gain". >>>> >>>> It is contradictory, So, I wanted to remove the line about performance. >>>> Did I still miss something? >>> >>> Where is the contradiction? >>> >>> Consider your example: >>> $echo 123 > /sys/fs/resctrl/clos1/tasks >>> $echo 456 > /sys/fs/resctrl/clos1/tasks >>> $echo 789 > /sys/fs/resctrl/clos1/tasks >>> >>> Yes, there is syscall overhead for each of the above lines. My statement was in >>> support of this work by stating that a user aiming to improve performance by >>> attempting the above in parallel would not be able to see achieve significant >>> performance gain since the calls would end up being serialized. >> >> ok. Sure. Will add the text. I may modify little bit. >>> >>> You are providing two motivations (a) "user-friendly when dealing with >>> hundreds of tasks", and (b) syscall overhead. Have you measured the >>> improvement this solution provides? >> >> No. I have not measured the performance improvement. > > The changelog makes a claim that the current implementation has overhead > that is removed with this change. There is no data to support this claim. My main motivation for this change is to make it user-friendly. So that users can search the pid's and assign multiple tasks at a time. Originally I did not have the line for performance. Actually, I don't want to claim performance benefits. I will remove the performance claims. > > ... > >>>>>>>> + >>>>>>>> + buf[nbytes - 1] = '\0'; >>>>>>>> + >>>>>>>> rdtgrp = rdtgroup_kn_lock_live(of->kn); >>>>>>>> if (!rdtgrp) { >>>>>>>> rdtgroup_kn_unlock(of->kn); >>>>>>>> return -ENOENT; >>>>>>>> } >>>>>>>> + >>>>>>>> +next: >>>>>>>> + if (!buf || buf[0] == '\0') >>>>>>>> + goto unlock; >>>>>>>> + >>>>>>>> + pid_str = strim(strsep(&buf, ",")); >>>>>>>> + >>>>>>> >>>>>>> Could lib/cmdline.c:get_option() be useful? >>>>>> >>>>>> Yes. We could that also. May not be required for the simple case like this. >>>>> >>>>> Please keep an eye out for how much of it you end up duplicating .... >>>> >>>> Using the get_options will require at least two calls(one to get the >>>> length and then read the integers). Also need to allocate the integers >>>> array dynamically. That is lot code if we are going that route. >>>> >>> >>> I did not ask about get_options(), I asked about get_option(). >> >> If you insist, will use get_option. But we still have to loop thru all the >> string till get_option returns 0. I can try that. > > > I just asked whether get_option() could be useful. Could you please point out what > I said that made you think that I insist on this change being made? If it matches > your usage, then know it is available, if it does not, then don't use it. Ok. I dont see a major benefit using get_option here. So, not planning to to use it. > > ... > >>>> I can say "The failure pid will be logged in >>>> /sys/fs/resctrl/info/last_cmd_status file." >>> >>> That will not be accurate. Not all errors include the pid. >> >> Can you please suggest? > > last_cmd_status provides a 512 char buffer to communicate details > to the user. The buffer is cleared before the loop that moves all the > tasks start. If an error is encountered, a detailed message is written > to the buffer. One option may be to append a string to the buffer that > includes the pid? Perhaps something like: > rdt_last_cmd_printf("Error encountered while moving task %d\n", pid); ok. Will try to add and test it. > > Please feel free to improve. > > Reinette > >
diff --git a/Documentation/x86/resctrl.rst b/Documentation/x86/resctrl.rst index 058257dc56c8..25203f20002d 100644 --- a/Documentation/x86/resctrl.rst +++ b/Documentation/x86/resctrl.rst @@ -292,13 +292,20 @@ All groups contain the following files: "tasks": Reading this file shows the list of all tasks that belong to this group. Writing a task id to the file will add a task to the - group. If the group is a CTRL_MON group the task is removed from + group. Multiple tasks can be assigned together in one command by + inputting the tasks separated by commas. Tasks will be assigned + sequentially in the order it is provided. Failure while assigning + the tasks will be aborted immediately. The tasks before the failure + will be assigned and the tasks next in the sequence will not be + assigned. Users may need to retry them again. The failure details + will be logged in resctrl/info/last_cmd_status file. + + If the group is a CTRL_MON group the task is removed from whichever previous CTRL_MON group owned the task and also from any MON group that owned the task. If the group is a MON group, then the task must already belong to the CTRL_MON parent of this group. The task is removed from any previous MON group. - "cpus": Reading this file shows a bitmask of the logical CPUs owned by this group. Writing a mask to this file will add and remove diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index e2c1599d1b37..15ea5b550fe9 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -683,16 +683,34 @@ static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of, char *buf, size_t nbytes, loff_t off) { struct rdtgroup *rdtgrp; + char *pid_str; int ret = 0; pid_t pid; - if (kstrtoint(strstrip(buf), 0, &pid) || pid < 0) + /* Valid input requires a trailing newline */ + if (nbytes == 0 || buf[nbytes - 1] != '\n') return -EINVAL; + + buf[nbytes - 1] = '\0'; + rdtgrp = rdtgroup_kn_lock_live(of->kn); if (!rdtgrp) { rdtgroup_kn_unlock(of->kn); return -ENOENT; } + +next: + if (!buf || buf[0] == '\0') + goto unlock; + + pid_str = strim(strsep(&buf, ",")); + + if (kstrtoint(pid_str, 0, &pid) || pid < 0) { + rdt_last_cmd_printf("Invalid pid %d value\n", pid); + ret = -EINVAL; + goto unlock; + } + rdt_last_cmd_clear(); if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKED || @@ -703,6 +721,10 @@ static ssize_t rdtgroup_tasks_write(struct kernfs_open_file *of, } ret = rdtgroup_move_task(pid, rdtgrp, of); + if (ret) + goto unlock; + else + goto next; unlock: rdtgroup_kn_unlock(of->kn);