From patchwork Thu May 25 18:01:46 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99170
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
    shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
    carl@os.amperecomputing.com, lcherian@marvell.com,
    bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
    xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com,
    Jamie Iles, Xin Hao, peternewman@google.com, dfustini@baylibre.com
Subject: [PATCH v4 01/24] x86/resctrl: Track the closid with the rmid
Date: Thu, 25 May 2023 18:01:46 +0000
Message-Id: <20230525180209.19497-2-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

x86's RMID are independent of the CLOSID. An RMID can be allocated, used
and freed without considering the CLOSID.

MPAM's equivalent feature is PMG, which is not an independent number; it
extends the CLOSID/PARTID space. For MPAM, only PMG-bits worth of 'RMID'
can be allocated for a single CLOSID, i.e. if there is 1 bit of PMG
space, then each CLOSID can have two monitor groups.

To allow resctrl to disambiguate RMID values for different CLOSID,
everything in resctrl that keeps an RMID value needs to know the CLOSID
too. This will always be ignored on x86.

Tested-by: Shaopeng Tan
Reviewed-by: Xin Hao
Signed-off-by: James Morse
---
Is there a better term for 'the unique identifier for a monitor group'?
Using RMID for that here may be confusing...
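As an illustrative sketch of the idea (the type and helper below are
hypothetical, not part of this patch): once an RMID is only meaningful
relative to a CLOSID, anything that stores or compares a bare RMID has
to carry the pair instead.

	/* Hypothetical: a monitor group is named by the (closid, rmid) pair. */
	struct mon_group_id {
		u32	closid;	/* always ignored on x86 */
		u32	rmid;
	};

	static bool same_mon_group(struct mon_group_id a, struct mon_group_id b)
	{
		/*
		 * On MPAM, PMG '1' under CLOSID 0 and PMG '1' under CLOSID 1
		 * are different monitor groups, so both fields must match.
		 */
		return a.closid == b.closid && a.rmid == b.rmid;
	}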
Changes since v1:
 * Added comment in struct rmid_entry

Changes since v2:
 * Moved X86_RESCTRL_BAD_CLOSID from a subsequent patch
---
 arch/x86/include/asm/resctrl.h            |  7 +++
 arch/x86/kernel/cpu/resctrl/internal.h    |  2 +-
 arch/x86/kernel/cpu/resctrl/monitor.c     | 65 ++++++++++++++---------
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  4 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 12 ++---
 include/linux/resctrl.h                   | 11 +++-
 6 files changed, 64 insertions(+), 37 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 255a78d9d906..e906070285fb 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -7,6 +7,13 @@
 #include
 #include
 
+/*
+ * This value can never be a valid CLOSID, and is used when mapping a
+ * (closid, rmid) pair to an index and back. On x86 only the RMID is
+ * needed.
+ */
+#define X86_RESCTRL_BAD_CLOSID	((u32)~0)
+
 /**
  * struct resctrl_pqr_state - State cache for the PQR MSR
  * @cur_rmid:		The cached Resource Monitoring ID
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 85ceaf9a31ac..f2da908bb079 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -535,7 +535,7 @@ struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r);
 int closids_supported(void);
 void closid_free(int closid);
 int alloc_rmid(void);
-void free_rmid(u32 rmid);
+void free_rmid(u32 closid, u32 rmid);
 int rdt_get_mon_l3_config(struct rdt_resource *r);
 bool __init rdt_cpu_has(int flag);
 void mon_event_count(void *info);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index ded1fc7cb7cb..86574adedd64 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -25,6 +25,12 @@
 #include "internal.h"
 
 struct rmid_entry {
+	/*
+	 * Some architectures' resctrl_arch_rmid_read() needs the CLOSID value
+	 * in order to access the correct monitor. This field provides the
+	 * value to list walkers like __check_limbo(). On x86 this is ignored.
+	 */
+	u32			closid;
 	u32			rmid;
 	int			busy;
 	struct list_head	list;
@@ -136,7 +142,7 @@ static inline u64 get_corrected_mbm_count(u32 rmid, unsigned long val)
 	return val;
 }
 
-static inline struct rmid_entry *__rmid_entry(u32 rmid)
+static inline struct rmid_entry *__rmid_entry(u32 closid, u32 rmid)
 {
 	struct rmid_entry *entry;
 
@@ -190,7 +196,8 @@ static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_domain *hw_dom,
 }
 
 void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
-			     u32 rmid, enum resctrl_event_id eventid)
+			     u32 closid, u32 rmid,
+			     enum resctrl_event_id eventid)
 {
 	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
 	struct arch_mbm_state *am;
@@ -230,7 +237,8 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
 }
 
 int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
-			   u32 rmid, enum resctrl_event_id eventid, u64 *val)
+			   u32 closid, u32 rmid, enum resctrl_event_id eventid,
+			   u64 *val)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
@@ -285,9 +293,9 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
 		if (nrmid >= r->num_rmid)
 			break;
 
-		entry = __rmid_entry(nrmid);
+		entry = __rmid_entry(X86_RESCTRL_BAD_CLOSID, nrmid);// temporary
 
-		if (resctrl_arch_rmid_read(r, d, entry->rmid,
+		if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid,
 					   QOS_L3_OCCUP_EVENT_ID, &val)) {
 			rmid_dirty = true;
 		} else {
@@ -342,7 +350,8 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 	cpu = get_cpu();
 	list_for_each_entry(d, &r->domains, list) {
 		if (cpumask_test_cpu(cpu, &d->cpu_mask)) {
-			err = resctrl_arch_rmid_read(r, d, entry->rmid,
+			err = resctrl_arch_rmid_read(r, d, entry->closid,
+						     entry->rmid,
 						     QOS_L3_OCCUP_EVENT_ID,
 						     &val);
 			if (err || val <= resctrl_rmid_realloc_threshold)
@@ -366,7 +375,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 		list_add_tail(&entry->list, &rmid_free_lru);
 }
 
-void free_rmid(u32 rmid)
+void free_rmid(u32 closid, u32 rmid)
 {
 	struct rmid_entry *entry;
 
@@ -375,7 +384,7 @@ void free_rmid(u32 rmid)
 	lockdep_assert_held(&rdtgroup_mutex);
 
-	entry = __rmid_entry(rmid);
+	entry = __rmid_entry(closid, rmid);
 
 	if (is_llc_occupancy_enabled())
 		add_rmid_to_limbo(entry);
@@ -383,8 +392,8 @@ void free_rmid(u32 rmid)
 		list_add_tail(&entry->list, &rmid_free_lru);
 }
 
-static struct mbm_state *get_mbm_state(struct rdt_domain *d, u32 rmid,
-				       enum resctrl_event_id evtid)
+static struct mbm_state *get_mbm_state(struct rdt_domain *d, u32 closid,
+				       u32 rmid, enum resctrl_event_id evtid)
 {
 	switch (evtid) {
 	case QOS_L3_MBM_TOTAL_EVENT_ID:
@@ -396,20 +405,21 @@ static struct mbm_state *get_mbm_state(struct rdt_domain *d, u32 rmid,
 	}
 }
 
-static int __mon_event_count(u32 rmid, struct rmid_read *rr)
+static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
 {
 	struct mbm_state *m;
 	u64 tval = 0;
 
 	if (rr->first) {
-		resctrl_arch_reset_rmid(rr->r, rr->d, rmid, rr->evtid);
-		m = get_mbm_state(rr->d, rmid, rr->evtid);
+		resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid);
+		m = get_mbm_state(rr->d, closid, rmid, rr->evtid);
 		if (m)
 			memset(m, 0, sizeof(struct mbm_state));
 		return 0;
 	}
 
-	rr->err = resctrl_arch_rmid_read(rr->r, rr->d, rmid, rr->evtid, &tval);
+	rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid, rr->evtid,
+					 &tval);
 	if (rr->err)
 		return rr->err;
 
@@ -429,7 +439,7 @@ static int __mon_event_count(u32 rmid, struct rmid_read *rr)
  * __mon_event_count() is compared with the chunks value from the previous
  * invocation. This must be called once per second to maintain values in MBps.
  */
-static void mbm_bw_count(u32 rmid, struct rmid_read *rr)
+static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr)
 {
 	struct mbm_state *m = &rr->d->mbm_local[rmid];
 	u64 cur_bw, bytes, cur_bytes;
@@ -459,7 +469,7 @@ void mon_event_count(void *info)
 
 	rdtgrp = rr->rgrp;
 
-	ret = __mon_event_count(rdtgrp->mon.rmid, rr);
+	ret = __mon_event_count(rdtgrp->closid, rdtgrp->mon.rmid, rr);
 
 	/*
	 * For Ctrl groups read data from child monitor groups and
@@ -470,7 +480,8 @@ void mon_event_count(void *info)
 	 */
 	if (rdtgrp->type == RDTCTRL_GROUP) {
 		list_for_each_entry(entry, head, mon.crdtgrp_list) {
-			if (__mon_event_count(entry->mon.rmid, rr) == 0)
+			if (__mon_event_count(rdtgrp->closid, entry->mon.rmid,
+					      rr) == 0)
 				ret = 0;
 		}
 	}
@@ -600,7 +611,8 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
 	}
 }
 
-static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, int rmid)
+static void mbm_update(struct rdt_resource *r, struct rdt_domain *d,
+		       u32 closid, u32 rmid)
 {
 	struct rmid_read rr;
 
@@ -615,12 +627,12 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, int rmid)
 	if (is_mbm_total_enabled()) {
 		rr.evtid = QOS_L3_MBM_TOTAL_EVENT_ID;
 		rr.val = 0;
-		__mon_event_count(rmid, &rr);
+		__mon_event_count(closid, rmid, &rr);
 	}
 	if (is_mbm_local_enabled()) {
 		rr.evtid = QOS_L3_MBM_LOCAL_EVENT_ID;
 		rr.val = 0;
-		__mon_event_count(rmid, &rr);
+		__mon_event_count(closid, rmid, &rr);
 
 		/*
 		 * Call the MBA software controller only for the
@@ -628,7 +640,7 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, int rmid)
 		 * the software controller explicitly.
 		 */
 		if (is_mba_sc(NULL))
-			mbm_bw_count(rmid, &rr);
+			mbm_bw_count(closid, rmid, &rr);
 	}
 }
 
@@ -685,11 +697,11 @@ void mbm_handle_overflow(struct work_struct *work)
 	d = container_of(work, struct rdt_domain, mbm_over.work);
 
 	list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
-		mbm_update(r, d, prgrp->mon.rmid);
+		mbm_update(r, d, prgrp->closid, prgrp->mon.rmid);
 
 		head = &prgrp->mon.crdtgrp_list;
 		list_for_each_entry(crgrp, head, mon.crdtgrp_list)
-			mbm_update(r, d, crgrp->mon.rmid);
+			mbm_update(r, d, crgrp->closid, crgrp->mon.rmid);
 
 		if (is_mba_sc(NULL))
 			update_mba_bw(prgrp, d);
@@ -732,10 +744,11 @@ static int dom_data_init(struct rdt_resource *r)
 	}
 
 	/*
-	 * RMID 0 is special and is always allocated. It's used for all
-	 * tasks that are not monitored.
+	 * RMID 0 is special and is always allocated. It's used for the
+	 * default_rdtgroup control group, which will be setup later. See
+	 * rdtgroup_setup_root().
 	 */
-	entry = __rmid_entry(0);
+	entry = __rmid_entry(0, 0);
 	list_del(&entry->list);
 
 	return 0;
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 458cb7419502..aeadaeb5df9a 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -738,7 +738,7 @@ int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
 	 * anymore when this group would be used for pseudo-locking. This
 	 * is safe to call on platforms not capable of monitoring.
 	 */
-	free_rmid(rdtgrp->mon.rmid);
+	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 
 	ret = 0;
 	goto out;
@@ -773,7 +773,7 @@ int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp)
 
 	ret = rdtgroup_locksetup_user_restore(rdtgrp);
 	if (ret) {
-		free_rmid(rdtgrp->mon.rmid);
+		free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 		return ret;
 	}
 
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 6ad33f355861..ff9ccfcd18bd 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2701,7 +2701,7 @@ static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp)
 
 	head = &rdtgrp->mon.crdtgrp_list;
 	list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
-		free_rmid(sentry->mon.rmid);
+		free_rmid(sentry->closid, sentry->mon.rmid);
 		list_del(&sentry->mon.crdtgrp_list);
 
 		if (atomic_read(&sentry->waitcount) != 0)
@@ -2741,7 +2741,7 @@ static void rmdir_all_sub(void)
 		cpumask_or(&rdtgroup_default.cpu_mask,
 			   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);
 
-		free_rmid(rdtgrp->mon.rmid);
+		free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 
 		kernfs_remove(rdtgrp->kn);
 		list_del(&rdtgrp->rdtgroup_list);
@@ -3239,7 +3239,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 	return 0;
 
 out_idfree:
-	free_rmid(rdtgrp->mon.rmid);
+	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 out_destroy:
 	kernfs_put(rdtgrp->kn);
 	kernfs_remove(rdtgrp->kn);
@@ -3253,7 +3253,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 static void mkdir_rdt_prepare_clean(struct rdtgroup *rgrp)
 {
 	kernfs_remove(rgrp->kn);
-	free_rmid(rgrp->mon.rmid);
+	free_rmid(rgrp->closid, rgrp->mon.rmid);
 	rdtgroup_remove(rgrp);
 }
 
@@ -3402,7 +3402,7 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 	update_closid_rmid(tmpmask, NULL);
 
 	rdtgrp->flags = RDT_DELETED;
-	free_rmid(rdtgrp->mon.rmid);
+	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 
 	/*
 	 * Remove the rdtgrp from the parent ctrl_mon group's list
@@ -3448,8 +3448,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 	cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
 	update_closid_rmid(tmpmask, NULL);
 
+	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 	closid_free(rdtgrp->closid);
-	free_rmid(rdtgrp->mon.rmid);
 
 	rdtgroup_ctrl_remove(rdtgrp);
 
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 8334eeacfec5..7d80bae05f59 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -225,6 +225,8 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d);
  *			for this resource and domain.
  * @r:			resource that the counter should be read from.
  * @d:			domain that the counter should be read from.
+ * @closid:		closid that matches the rmid. The counter may
+ *			match traffic of both closid and rmid, or rmid only.
  * @rmid:		rmid of the counter to read.
  * @eventid:		eventid to read, e.g. L3 occupancy.
  * @val:		result of the counter read in bytes.
@@ -235,20 +237,25 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d);
  *	      0 on success, or -EIO, -EINVAL etc on error.
  */
 int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
-			   u32 rmid, enum resctrl_event_id eventid, u64 *val);
+			   u32 closid, u32 rmid, enum resctrl_event_id eventid,
+			   u64 *val);
+
 /**
  * resctrl_arch_reset_rmid() - Reset any private state associated with rmid
  *			       and eventid.
  * @r:		The domain's resource.
 * @d:		The rmid's domain.
+ * @closid:	The closid that matches the rmid. Counters may match both
+ *		closid and rmid, or rmid only.
 * @rmid:	The rmid whose counter values should be reset.
 * @eventid:	The eventid whose counter values should be reset.
 *
 * This can be called from any CPU.
 */
 void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
-			     u32 rmid, enum resctrl_event_id eventid);
+			     u32 closid, u32 rmid,
+			     enum resctrl_event_id eventid);
 
 /**
  * resctrl_arch_reset_rmid_all() - Reset all private state associated with

From patchwork Thu May 25 18:01:47 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99174
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 02/24] x86/resctrl: Access per-rmid structures by index
Date: Thu, 25 May 2023 18:01:47 +0000
Message-Id: <20230525180209.19497-3-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

Because of the differences between Intel RDT/AMD QoS and Arm's MPAM
monitors, RMID values on arm64 are not unique unless the CLOSID is also
included. Bitmaps like rmid_busy_llc need to be sized by the number of
unique entries for this resource.

Add helpers to encode/decode the CLOSID and RMID to an index. The
domain's rmid_busy_llc and rmid_ptrs[] are then sized by index, as are
the domain's mbm_local and mbm_total arrays. On x86, the index is always
just the RMID, so all these structures remain the same size.

The index gives resctrl a unique value it can use to store monitor
values, and allows MPAM to decode the CLOSID when reading the hardware
counters.
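As a sketch of what the arm64 side of these helpers could look like
(hypothetical only; it assumes the index packs the CLOSID above PMG-bits
of RMID, and MPAM_PMG_BITS is an invented constant for the example):

	#define MPAM_PMG_BITS	1	/* assumed width for the example */

	static inline u32 example_rmid_idx_encode(u32 closid, u32 rmid)
	{
		/* idx is unique: the CLOSID sits above the PMG/RMID bits */
		return (closid << MPAM_PMG_BITS) | rmid;
	}

	static inline void example_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid)
	{
		*closid = idx >> MPAM_PMG_BITS;
		*rmid = idx & ((1U << MPAM_PMG_BITS) - 1);
	}

With 1 bit of PMG space, CLOSID 3's two monitor groups get indexes 6 and
7, so rmid_busy_llc and the mbm arrays are sized by num_closid * 2
entries rather than num_rmid. On x86 the encode step is just
'return rmid', which is why the structures keep their existing size.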
Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
Changes since v1:
 * Added X86_BAD_CLOSID macro to make it clear what this value means
 * Added second WARN_ON() for closid checking, and made both _ONCE()

Changes since v2:
 * Added RESCTRL_RESERVED_CLOSID
 * Removed a newline
 * Rephrased some comments
 * Renamed a variable 'ignored'
 * Moved X86_RESCTRL_BAD_CLOSID to a previous patch

Changes since v3:
 * Changed a variable name
 * Fixed various typos
---
 arch/x86/include/asm/resctrl.h         | 17 ++++++
 arch/x86/kernel/cpu/resctrl/core.c     |  2 +-
 arch/x86/kernel/cpu/resctrl/internal.h |  1 +
 arch/x86/kernel/cpu/resctrl/monitor.c  | 84 +++++++++++++++++---------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  7 ++-
 include/linux/resctrl.h                |  3 +
 6 files changed, 83 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index e906070285fb..dd9b638d43c8 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -101,6 +101,23 @@ static inline void resctrl_sched_in(struct task_struct *tsk)
 		__resctrl_sched_in(tsk);
 }
 
+static inline u32 resctrl_arch_system_num_rmid_idx(void)
+{
+	/* RMID are independent numbers for x86. num_rmid_idx == num_rmid */
+	return boot_cpu_data.x86_cache_max_rmid + 1;
+}
+
+static inline void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid)
+{
+	*rmid = idx;
+	*closid = X86_RESCTRL_BAD_CLOSID;
+}
+
+static inline u32 resctrl_arch_rmid_idx_encode(u32 ignored, u32 rmid)
+{
+	return rmid;
+}
+
 void resctrl_cpu_detect(struct cpuinfo_x86 *c);
 
 #else
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 030d3b409768..4bea032d072e 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -600,7 +600,7 @@ static void clear_closid_rmid(int cpu)
 	state->default_rmid = 0;
 	state->cur_closid = 0;
 	state->cur_rmid = 0;
-	wrmsr(MSR_IA32_PQR_ASSOC, 0, 0);
+	wrmsr(MSR_IA32_PQR_ASSOC, 0, RESCTRL_RESERVED_CLOSID);
 }
 
 static int resctrl_online_cpu(unsigned int cpu)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index f2da908bb079..d571da4848a4 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include
 
 #define L3_QOS_CDP_ENABLE	0x01ULL
 
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 86574adedd64..bcc25f5339c0 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -142,12 +142,29 @@ static inline u64 get_corrected_mbm_count(u32 rmid, unsigned long val)
 	return val;
 }
 
-static inline struct rmid_entry *__rmid_entry(u32 closid, u32 rmid)
+/*
+ * x86 and arm64 differ in their handling of monitoring.
+ * x86's RMID are an independent number, there is only one source of traffic
+ * with an RMID value of '1'.
+ * arm64's PMG extend the PARTID/CLOSID space, there are multiple sources of
+ * traffic with a PMG value of '1', one for each CLOSID, meaning the RMID
+ * value is no longer unique.
+ * To account for this, resctrl uses an index. On x86 this is just the RMID,
+ * on arm64 it encodes the CLOSID and RMID. This gives a unique number.
+ *
+ * The domain's rmid_busy_llc and rmid_ptrs are sized by index. The arch code
+ * must accept an attempt to read every index.
+ */
+static inline struct rmid_entry *__rmid_entry(u32 idx)
 {
 	struct rmid_entry *entry;
+	u32 closid, rmid;
 
-	entry = &rmid_ptrs[rmid];
-	WARN_ON(entry->rmid != rmid);
+	entry = &rmid_ptrs[idx];
+	resctrl_arch_rmid_idx_decode(idx, &closid, &rmid);
+
+	WARN_ON_ONCE(entry->closid != closid);
+	WARN_ON_ONCE(entry->rmid != rmid);
 
 	return entry;
 }
@@ -277,8 +294,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
 void __check_limbo(struct rdt_domain *d, bool force_free)
 {
 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
 	struct rmid_entry *entry;
-	u32 crmid = 1, nrmid;
+	u32 idx, cur_idx = 1;
 	bool rmid_dirty;
 	u64 val = 0;
 
@@ -289,12 +307,11 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
 	 * RMID and move it to the free list when the counter reaches 0.
 	 */
 	for (;;) {
-		nrmid = find_next_bit(d->rmid_busy_llc, r->num_rmid, crmid);
-		if (nrmid >= r->num_rmid)
+		idx = find_next_bit(d->rmid_busy_llc, idx_limit, cur_idx);
+		if (idx >= idx_limit)
 			break;
 
-		entry = __rmid_entry(X86_RESCTRL_BAD_CLOSID, nrmid);// temporary
-
+		entry = __rmid_entry(idx);
 		if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid,
 					   QOS_L3_OCCUP_EVENT_ID, &val)) {
 			rmid_dirty = true;
@@ -303,19 +320,21 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
 		}
 
 		if (force_free || !rmid_dirty) {
-			clear_bit(entry->rmid, d->rmid_busy_llc);
+			clear_bit(idx, d->rmid_busy_llc);
 			if (!--entry->busy) {
 				rmid_limbo_count--;
 				list_add_tail(&entry->list, &rmid_free_lru);
 			}
 		}
 
-		crmid = nrmid + 1;
+		cur_idx = idx + 1;
 	}
 }
 
 bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d)
 {
-	return find_first_bit(d->rmid_busy_llc, r->num_rmid) != r->num_rmid;
+	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
+
+	return find_first_bit(d->rmid_busy_llc, idx_limit) != idx_limit;
 }
 
 /*
@@ -345,6 +364,9 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 	struct rdt_domain *d;
 	int cpu, err;
 	u64 val = 0;
+	u32 idx;
+
+	idx = resctrl_arch_rmid_idx_encode(entry->closid, entry->rmid);
 
 	entry->busy = 0;
 	cpu = get_cpu();
@@ -364,7 +386,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 		 */
 		if (!has_busy_rmid(r, d))
 			cqm_setup_limbo_handler(d, CQM_LIMBOCHECK_INTERVAL);
-		set_bit(entry->rmid, d->rmid_busy_llc);
+		set_bit(idx, d->rmid_busy_llc);
 		entry->busy++;
 	}
 	put_cpu();
@@ -377,14 +399,16 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 
 void free_rmid(u32 closid, u32 rmid)
 {
+	u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid);
 	struct rmid_entry *entry;
 
-	if (!rmid)
-		return;
-
 	lockdep_assert_held(&rdtgroup_mutex);
 
-	entry = __rmid_entry(closid, rmid);
+	/* do not allow the default rmid to be free'd */
+	if (!idx)
+		return;
+
+	entry = __rmid_entry(idx);
 
 	if (is_llc_occupancy_enabled())
 		add_rmid_to_limbo(entry);
@@ -395,11 +419,13 @@ void free_rmid(u32 closid, u32 rmid)
 static struct mbm_state *get_mbm_state(struct rdt_domain *d, u32 closid,
 				       u32 rmid, enum resctrl_event_id evtid)
 {
+	u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid);
+
 	switch (evtid) {
 	case QOS_L3_MBM_TOTAL_EVENT_ID:
-		return &d->mbm_total[rmid];
+		return &d->mbm_total[idx];
 	case QOS_L3_MBM_LOCAL_EVENT_ID:
-		return &d->mbm_local[rmid];
+		return &d->mbm_local[idx];
 	default:
 		return NULL;
 	}
@@ -441,7 +467,8 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
  */
 static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr)
 {
-	struct mbm_state *m = &rr->d->mbm_local[rmid];
+	u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid);
+	struct mbm_state *m = &rr->d->mbm_local[idx];
 	u64 cur_bw, bytes, cur_bytes;
 
 	cur_bytes = rr->val;
@@ -531,7 +558,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
 {
 	u32 closid, rmid, cur_msr_val, new_msr_val;
 	struct mbm_state *pmbm_data, *cmbm_data;
-	u32 cur_bw, delta_bw, user_bw;
+	u32 cur_bw, delta_bw, user_bw, idx;
 	struct rdt_resource *r_mba;
 	struct rdt_domain *dom_mba;
 	struct list_head *head;
@@ -544,7 +571,8 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
 
 	closid = rgrp->closid;
 	rmid = rgrp->mon.rmid;
-	pmbm_data = &dom_mbm->mbm_local[rmid];
+	idx = resctrl_arch_rmid_idx_encode(closid, rmid);
+	pmbm_data = &dom_mbm->mbm_local[idx];
 
 	dom_mba = get_domain_from_cpu(smp_processor_id(), r_mba);
 	if (!dom_mba) {
@@ -727,19 +755,20 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)
 
 static int dom_data_init(struct rdt_resource *r)
 {
+	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
 	struct rmid_entry *entry = NULL;
-	int i, nr_rmids;
+	u32 idx;
+	int i;
 
-	nr_rmids = r->num_rmid;
-	rmid_ptrs = kcalloc(nr_rmids, sizeof(struct rmid_entry), GFP_KERNEL);
+	rmid_ptrs = kcalloc(idx_limit, sizeof(struct rmid_entry), GFP_KERNEL);
 	if (!rmid_ptrs)
 		return -ENOMEM;
 
-	for (i = 0; i < nr_rmids; i++) {
+	for (i = 0; i < idx_limit; i++) {
 		entry = &rmid_ptrs[i];
 		INIT_LIST_HEAD(&entry->list);
 
-		entry->rmid = i;
+		resctrl_arch_rmid_idx_decode(i, &entry->closid, &entry->rmid);
 		list_add_tail(&entry->list, &rmid_free_lru);
 	}
 
@@ -748,7 +777,8 @@ static int dom_data_init(struct rdt_resource *r)
 	 * default_rdtgroup control group, which will be setup later. See
 	 * rdtgroup_setup_root().
 	 */
-	entry = __rmid_entry(0, 0);
+	idx = resctrl_arch_rmid_idx_encode(RESCTRL_RESERVED_CLOSID, 0);
+	entry = __rmid_entry(idx);
 	list_del(&entry->list);
 
 	return 0;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index ff9ccfcd18bd..023eae69f29e 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3604,16 +3604,17 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d)
 
 static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d)
 {
+	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
 	size_t tsize;
 
 	if (is_llc_occupancy_enabled()) {
-		d->rmid_busy_llc = bitmap_zalloc(r->num_rmid, GFP_KERNEL);
+		d->rmid_busy_llc = bitmap_zalloc(idx_limit, GFP_KERNEL);
 		if (!d->rmid_busy_llc)
 			return -ENOMEM;
 	}
 	if (is_mbm_total_enabled()) {
 		tsize = sizeof(*d->mbm_total);
-		d->mbm_total = kcalloc(r->num_rmid, tsize, GFP_KERNEL);
+		d->mbm_total = kcalloc(idx_limit, tsize, GFP_KERNEL);
 		if (!d->mbm_total) {
 			bitmap_free(d->rmid_busy_llc);
 			return -ENOMEM;
@@ -3621,7 +3622,7 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d)
 	}
 	if (is_mbm_local_enabled()) {
 		tsize = sizeof(*d->mbm_local);
-		d->mbm_local = kcalloc(r->num_rmid, tsize, GFP_KERNEL);
+		d->mbm_local = kcalloc(idx_limit, tsize, GFP_KERNEL);
 		if (!d->mbm_local) {
 			bitmap_free(d->rmid_busy_llc);
 			kfree(d->mbm_total);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 7d80bae05f59..ff7452f644e4 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -6,6 +6,9 @@
 #include
 #include
 
+/* CLOSID value used by the default control group */
+#define RESCTRL_RESERVED_CLOSID	0
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 int proc_resctrl_show(struct seq_file *m,

From patchwork Thu May 25 18:01:48 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99153
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 03/24] x86/resctrl: Create helper for RMID allocation and
 mondata dir creation
Date: Thu, 25 May 2023 18:01:48 +0000
Message-Id: <20230525180209.19497-4-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

When monitoring is supported, each monitor and control group is
allocated an RMID. For control groups, rdtgroup_mkdir_ctrl_mon() later
goes on to allocate the CLOSID.

MPAM's equivalent of RMID is not an independent number, so it can't be
allocated until the CLOSID is known. An RMID allocation for one CLOSID
may fail, whereas another may succeed depending on how many monitor
groups a control group has.

The RMID allocation needs to move to be after the CLOSID has been
allocated.

To make a subsequent change that does this easier to read, move the RMID
allocation and mondata dir creation to a helper.
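The eventual ordering this prepares for can be sketched as follows (an
illustrative pseudo-flow, not verbatim kernel code; the error handling
is simplified):

	/*
	 * Sketch: once the helper exists, a later patch can call it
	 * after the CLOSID is known, so alloc_rmid() can be scoped.
	 */
	closid = closid_alloc();
	if (closid < 0)
		return closid;
	rdtgrp->closid = closid;

	ret = mkdir_rdt_prepare_rmid_alloc(rdtgrp);	/* alloc_rmid() + mondata dir */
	if (ret)
		closid_free(closid);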
Tested-by: Shaopeng Tan
Reviewed-by: Ilpo Järvinen
Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 42 +++++++++++++++++---------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 023eae69f29e..05774b185eec 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3152,6 +3152,30 @@ static int rdtgroup_init_alloc(struct rdtgroup *rdtgrp)
 	return ret;
 }
 
+static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
+{
+	int ret;
+
+	if (!rdt_mon_capable)
+		return 0;
+
+	ret = alloc_rmid();
+	if (ret < 0) {
+		rdt_last_cmd_puts("Out of RMIDs\n");
+		return ret;
+	}
+	rdtgrp->mon.rmid = ret;
+
+	ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
+	if (ret) {
+		rdt_last_cmd_puts("kernfs subdir error\n");
+		free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
+		return ret;
+	}
+
+	return 0;
+}
+
 static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 			     const char *name, umode_t mode,
 			     enum rdt_group_type rtype, struct rdtgroup **r)
@@ -3217,20 +3241,10 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 		goto out_destroy;
 	}
 
-	if (rdt_mon_capable) {
-		ret = alloc_rmid();
-		if (ret < 0) {
-			rdt_last_cmd_puts("Out of RMIDs\n");
-			goto out_destroy;
-		}
-		rdtgrp->mon.rmid = ret;
+	ret = mkdir_rdt_prepare_rmid_alloc(rdtgrp);
+	if (ret)
+		goto out_destroy;
 
-		ret = mkdir_mondata_all(kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
-		if (ret) {
-			rdt_last_cmd_puts("kernfs subdir error\n");
-			goto out_idfree;
-		}
-	}
 	kernfs_activate(kn);
 
 	/*
@@ -3238,8 +3252,6 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 	 */
 	return 0;
 
-out_idfree:
-	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 out_destroy:
 	kernfs_put(rdtgrp->kn);
 	kernfs_remove(rdtgrp->kn);

From patchwork Thu May 25 18:01:49 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99152
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 04/24] x86/resctrl: Move rmid allocation out of
 mkdir_rdt_prepare()
Date: Thu, 25 May 2023 18:01:49 +0000
Message-Id: <20230525180209.19497-5-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

RMIDs are allocated for each monitor or control group directory, because
each of these needs its own RMID. For control groups,
rdtgroup_mkdir_ctrl_mon() later goes on to allocate the CLOSID.

MPAM's equivalent of RMID is not an independent number, so it can't be
allocated until the CLOSID is known. An RMID allocation for one CLOSID
may fail, whereas another may succeed depending on how many monitor
groups a control group has.

The RMID allocation needs to move to be after the CLOSID has been
allocated.

Move the RMID allocation out of mkdir_rdt_prepare() to occur in its
caller, after the mkdir_rdt_prepare() call. This allows the RMID
allocator to know the CLOSID.

Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
Changes since v2:
 * Moved kernfs_activate() later to preserve atomicity of files being
   visible
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 35 +++++++++++++++++++-------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 05774b185eec..8346a8f2ff9f 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3176,6 +3176,12 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
 	return 0;
 }
 
+static void mkdir_rdt_prepare_rmid_free(struct rdtgroup *rgrp)
+{
+	if (rdt_mon_capable)
+		free_rmid(rgrp->closid, rgrp->mon.rmid);
+}
+
 static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 			     const char *name, umode_t mode,
 			     enum rdt_group_type rtype, struct rdtgroup **r)
@@ -3241,12 +3247,6 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 		goto out_destroy;
 	}
 
-	ret = mkdir_rdt_prepare_rmid_alloc(rdtgrp);
-	if (ret)
-		goto out_destroy;
-
-	kernfs_activate(kn);
-
 	/*
 	 * The caller unlocks the parent_kn upon success.
 	 */
@@ -3265,7 +3265,6 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 static void mkdir_rdt_prepare_clean(struct rdtgroup *rgrp)
 {
 	kernfs_remove(rgrp->kn);
-	free_rmid(rgrp->closid, rgrp->mon.rmid);
 	rdtgroup_remove(rgrp);
 }
 
@@ -3287,12 +3286,21 @@ static int rdtgroup_mkdir_mon(struct kernfs_node *parent_kn,
 	prgrp = rdtgrp->mon.parent;
 	rdtgrp->closid = prgrp->closid;
 
+	ret = mkdir_rdt_prepare_rmid_alloc(rdtgrp);
+	if (ret) {
+		mkdir_rdt_prepare_clean(rdtgrp);
+		goto out_unlock;
+	}
+
+	kernfs_activate(rdtgrp->kn);
+
 	/*
 	 * Add the rdtgrp to the list of rdtgrps the parent
 	 * ctrl_mon group has to track.
 	 */
 	list_add_tail(&rdtgrp->mon.crdtgrp_list, &prgrp->mon.crdtgrp_list);
 
+out_unlock:
 	rdtgroup_kn_unlock(parent_kn);
 	return ret;
 }
@@ -3323,10 +3331,17 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
 	ret = 0;
 
 	rdtgrp->closid = closid;
-	ret = rdtgroup_init_alloc(rdtgrp);
-	if (ret < 0)
+
+	ret = mkdir_rdt_prepare_rmid_alloc(rdtgrp);
+	if (ret)
 		goto out_id_free;
 
+	kernfs_activate(rdtgrp->kn);
+
+	ret = rdtgroup_init_alloc(rdtgrp);
+	if (ret < 0)
+		goto out_rmid_free;
+
 	list_add(&rdtgrp->rdtgroup_list, &rdt_all_groups);
 
 	if (rdt_mon_capable) {
@@ -3345,6 +3360,8 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
 
 out_del_list:
 	list_del(&rdtgrp->rdtgroup_list);
+out_rmid_free:
+	mkdir_rdt_prepare_rmid_free(rdtgrp);
 out_id_free:
 	closid_free(closid);
 out_common_fail:

From patchwork Thu May 25 18:01:50 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99175
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 05/24] x86/resctrl: Allow RMID allocation to be scoped by
 CLOSID
Date: Thu, 25 May 2023 18:01:50 +0000
Message-Id: <20230525180209.19497-6-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

MPAM's RMID values are not unique unless the CLOSID is considered as
well. alloc_rmid() expects the RMID to be an independent number.

Pass the CLOSID in to alloc_rmid(). Use this to compare indexes when
allocating. If the CLOSID is not relevant to the index, this ends up
comparing the free RMID with itself, and the first free entry will be
used. With MPAM the CLOSID is included in the index, so this becomes a
walk of the free RMID entries, until one that matches the supplied
CLOSID is found.
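A worked example of the comparison, reusing the hypothetical 1-bit-PMG
encoding sketched under patch 02/24 (illustrative only, not part of the
patch):

	/* Searching for a free RMID to use with closid == 2: */
	/* free entry (closid=1, rmid=0): itr_idx = 1<<1|0 = 2   */
	/*  needed as (closid=2, rmid=0): cmp_idx = 2<<1|0 = 4 -> skip  */
	/* free entry (closid=2, rmid=1): itr_idx = 2<<1|1 = 5   */
	/*  needed as (closid=2, rmid=1): cmp_idx = 2<<1|1 = 5 -> match */

On x86 both encodings collapse to the bare RMID, so itr_idx == cmp_idx
always holds and the first free entry is taken, preserving the old
behaviour.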
Tested-by: Shaopeng Tan Signed-off-by: James Morse --- Changes since v2; * Rephrased comment in resctrl_find_free_rmid() to describe this in terms of list_entry_first() * Rephrased comment above alloc_rmid() Changes since v3: * Flipped conditions in alloc_rmid() --- arch/x86/kernel/cpu/resctrl/internal.h | 2 +- arch/x86/kernel/cpu/resctrl/monitor.c | 51 +++++++++++++++++------ arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 2 +- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +- 4 files changed, 41 insertions(+), 16 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index d571da4848a4..23e20f89d2b3 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -535,7 +535,7 @@ void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp); struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r); int closids_supported(void); void closid_free(int closid); -int alloc_rmid(void); +int alloc_rmid(u32 closid); void free_rmid(u32 closid, u32 rmid); int rdt_get_mon_l3_config(struct rdt_resource *r); bool __init rdt_cpu_has(int flag); diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index bcc25f5339c0..27e731c7de72 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -337,24 +337,49 @@ bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d) return find_first_bit(d->rmid_busy_llc, idx_limit) != idx_limit; } -/* - * As of now the RMIDs allocation is global. - * However we keep track of which packages the RMIDs - * are used to optimize the limbo list management. - */ -int alloc_rmid(void) +static struct rmid_entry *resctrl_find_free_rmid(u32 closid) { - struct rmid_entry *entry; - - lockdep_assert_held(&rdtgroup_mutex); + struct rmid_entry *itr; + u32 itr_idx, cmp_idx; if (list_empty(&rmid_free_lru)) - return rmid_limbo_count ? -EBUSY : -ENOSPC; + return rmid_limbo_count ? ERR_PTR(-EBUSY) : ERR_PTR(-ENOSPC); + + list_for_each_entry(itr, &rmid_free_lru, list) { + /* + * get the index of this free RMID, and the index it would need + * to be if it were used with this CLOSID. + * If the CLOSID is irrelevant on this architecture, these will + * always be the same meaning the compiler can reduce this loop + * to a single list_entry_first() call. + */ + itr_idx = resctrl_arch_rmid_idx_encode(itr->closid, itr->rmid); + cmp_idx = resctrl_arch_rmid_idx_encode(closid, itr->rmid); + + if (itr_idx == cmp_idx) + return itr; + } + + return ERR_PTR(-ENOSPC); +} + +/* + * For MPAM the RMID value is not unique, and has to be considered with + * the CLOSID. The (CLOSID, RMID) pair is allocated on all domains, which + * allows all domains to be managed by a single limbo list. + * Each domain also has a rmid_busy_llc to reduce the work of the limbo handler. 
+ */
+int alloc_rmid(u32 closid)
+{
+	struct rmid_entry *entry;
+
+	lockdep_assert_held(&rdtgroup_mutex);
+
+	entry = resctrl_find_free_rmid(closid);
+	if (IS_ERR(entry))
+		return PTR_ERR(entry);
 
-	entry = list_first_entry(&rmid_free_lru,
-				 struct rmid_entry, list);
 	list_del(&entry->list);
-
 	return entry->rmid;
 }
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index aeadaeb5df9a..5ebd6e54c7f2 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -763,7 +763,7 @@ int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp)
 	int ret;
 
 	if (rdt_mon_capable) {
-		ret = alloc_rmid();
+		ret = alloc_rmid(rdtgrp->closid);
 		if (ret < 0) {
 			rdt_last_cmd_puts("Out of RMIDs\n");
 			return ret;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 8346a8f2ff9f..ba0595508b2f 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3159,7 +3159,7 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
 	if (!rdt_mon_capable)
 		return 0;
 
-	ret = alloc_rmid();
+	ret = alloc_rmid(rdtgrp->closid);
 	if (ret < 0) {
 		rdt_last_cmd_puts("Out of RMIDs\n");
 		return ret;
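The "comparing the free RMID with itself" behaviour above is easiest to
see in a standalone sketch of the two index encodings. idx_encode_x86(),
idx_encode_mpam() and NUM_PMG_BITS are illustrative stand-ins for
resctrl_arch_rmid_idx_encode(), not kernel code:

#include <stdint.h>
#include <stdio.h>

#define NUM_PMG_BITS 1	/* assumed PMG width, for illustration only */

/* x86-like: the CLOSID plays no part in the index */
static uint32_t idx_encode_x86(uint32_t closid, uint32_t rmid)
{
	(void)closid;
	return rmid;
}

/* MPAM-like: the PMG bits extend the PARTID/CLOSID space */
static uint32_t idx_encode_mpam(uint32_t closid, uint32_t rmid)
{
	return (closid << NUM_PMG_BITS) | rmid;
}

int main(void)
{
	/* x86: a free RMID matches whichever CLOSID asks for it */
	printf("%d\n", idx_encode_x86(3, 1) == idx_encode_x86(7, 1));   /* 1 */
	/* MPAM: the same RMID under another CLOSID is a different index */
	printf("%d\n", idx_encode_mpam(3, 1) == idx_encode_mpam(7, 1)); /* 0 */
	return 0;
}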
From patchwork Thu May 25 18:01:51 2023
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 06/24] x86/resctrl: Track the number of dirty RMID a CLOSID has
Date: Thu, 25 May 2023 18:01:51 +0000
Message-Id: <20230525180209.19497-7-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

MPAM's PMG bits extend its PARTID space, meaning the same PMG value can
be used for different control groups.

This means once a CLOSID is allocated, all its monitoring ids may still
be dirty, and held in limbo.

Keep track of the number of RMIDs each CLOSID has in limbo. This will
allow a future helper to find the 'cleanest' CLOSID when allocating.

The array is only needed when CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID is
defined. This will never be the case on x86.
Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/monitor.c | 43 +++++++++++++++++++++++----
 1 file changed, 38 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 27e731c7de72..1e7fa40ee471 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -43,6 +43,13 @@ struct rmid_entry {
  */
 static LIST_HEAD(rmid_free_lru);
 
+/**
+ * @closid_num_dirty_rmid	The number of dirty RMID each CLOSID has.
+ *	Only allocated when CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID is defined.
+ *	Indexed by CLOSID. Protected by rdtgroup_mutex.
+ */
+static int *closid_num_dirty_rmid;
+
 /**
  * @rmid_limbo_count	count of currently unused but (potentially)
  *	dirty RMIDs.
@@ -285,6 +292,17 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
 	return 0;
 }
 
+static void limbo_release_entry(struct rmid_entry *entry)
+{
+	lockdep_assert_held(&rdtgroup_mutex);
+
+	rmid_limbo_count--;
+	list_add_tail(&entry->list, &rmid_free_lru);
+
+	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID))
+		closid_num_dirty_rmid[entry->closid]--;
+}
+
 /*
  * Check the RMIDs that are marked as busy for this domain. If the
  * reported LLC occupancy is below the threshold clear the busy bit and
@@ -321,10 +339,8 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
 		if (force_free || !rmid_dirty) {
 			clear_bit(idx, d->rmid_busy_llc);
-			if (!--entry->busy) {
-				rmid_limbo_count--;
-				list_add_tail(&entry->list, &rmid_free_lru);
-			}
+			if (!--entry->busy)
+				limbo_release_entry(entry);
 		}
 		cur_idx = idx + 1;
 	}
@@ -391,6 +407,8 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 	u64 val = 0;
 	u32 idx;
 
+	lockdep_assert_held(&rdtgroup_mutex);
+
 	idx = resctrl_arch_rmid_idx_encode(entry->closid, entry->rmid);
 
 	entry->busy = 0;
@@ -420,6 +438,9 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 		rmid_limbo_count++;
 	else
 		list_add_tail(&entry->list, &rmid_free_lru);
+
+	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID))
+		closid_num_dirty_rmid[entry->closid]++;
 }
 
 void free_rmid(u32 closid, u32 rmid)
@@ -781,13 +802,25 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)
 static int dom_data_init(struct rdt_resource *r)
 {
 	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
+	u32 num_closid = resctrl_arch_get_num_closid(r);
 	struct rmid_entry *entry = NULL;
 	u32 idx;
 	int i;
 
+	lockdep_assert_held(&rdtgroup_mutex);
+
+	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) {
+		closid_num_dirty_rmid = kcalloc(num_closid, sizeof(int),
+						GFP_KERNEL);
+		if (!closid_num_dirty_rmid)
+			return -ENOMEM;
+	}
+
 	rmid_ptrs = kcalloc(idx_limit, sizeof(struct rmid_entry), GFP_KERNEL);
-	if (!rmid_ptrs)
+	if (!rmid_ptrs) {
+		kfree(closid_num_dirty_rmid);
 		return -ENOMEM;
+	}
 
 	for (i = 0; i < idx_limit; i++) {
 		entry = &rmid_ptrs[i];
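The invariant behind this patch is that closid_num_dirty_rmid[] must
change in step with rmid_limbo_count, under the same lock. A minimal
userspace model of that bookkeeping (fixed sizes, no locking, and all
names hypothetical):

#include <assert.h>

#define NUM_CLOSID 4

static int closid_num_dirty_rmid[NUM_CLOSID];
static int rmid_limbo_count;

static void limbo_add(int closid)
{
	rmid_limbo_count++;
	closid_num_dirty_rmid[closid]++;	/* mirrors add_rmid_to_limbo() */
}

static void limbo_release(int closid)
{
	rmid_limbo_count--;
	closid_num_dirty_rmid[closid]--;	/* mirrors limbo_release_entry() */
}

int main(void)
{
	limbo_add(2);
	limbo_add(2);
	limbo_release(2);
	assert(rmid_limbo_count == 1);
	assert(closid_num_dirty_rmid[2] == 1);
	return 0;
}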
From patchwork Thu May 25 18:01:52 2023
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 07/24] x86/resctrl: Use set_bit()/clear_bit() instead of open coding
Date: Thu, 25 May 2023 18:01:52 +0000
Message-Id: <20230525180209.19497-8-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

The resctrl CLOSID allocator uses a single 32-bit word to track which
CLOSIDs are free. The setting and clearing of bits is open coded.

A subsequent patch adds resctrl_closid_is_free(), which adds more open
coded bitmap operations. These will eventually need changing to use the
bitops helpers so that a CLOSID bitmap of the correct size can be
allocated dynamically.

Convert the existing open coded bit manipulations of closid_free_map
to use set_bit() and friends.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index ba0595508b2f..6bf5623f82b4 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -106,7 +106,7 @@ void rdt_staged_configs_clear(void)
  * - Our choices on how to configure each resource become progressively more
  *   limited as the number of resources grows.
  */
-static int closid_free_map;
+static unsigned long closid_free_map;
 static int closid_free_map_len;
 
 int closids_supported(void)
@@ -126,7 +126,7 @@ static void closid_init(void)
 	closid_free_map = BIT_MASK(rdt_min_closid) - 1;
 
 	/* CLOSID 0 is always reserved for the default group */
-	closid_free_map &= ~1;
+	clear_bit(0, &closid_free_map);
 
 	closid_free_map_len = rdt_min_closid;
 }
@@ -137,14 +137,14 @@ static int closid_alloc(void)
 	if (closid == 0)
 		return -ENOSPC;
 	closid--;
-	closid_free_map &= ~(1 << closid);
+	clear_bit(closid, &closid_free_map);
 
 	return closid;
 }
 
 void closid_free(int closid)
 {
-	closid_free_map |= 1 << closid;
+	set_bit(closid, &closid_free_map);
 }
 
 /**
@@ -156,7 +156,7 @@ void closid_free(int closid)
  */
 static bool closid_allocated(unsigned int closid)
 {
-	return (closid_free_map & (1 << closid)) == 0;
+	return !test_bit(closid, &closid_free_map);
 }
 
 /**
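The conversion relies on set_bit() and friends operating on an unsigned
long, which is why closid_free_map's type changes in the same patch. A
userspace approximation of the equivalence being relied on (the kernel's
helpers are atomic; these single-word stand-ins are not):

#include <assert.h>

static void set_bit(int nr, unsigned long *map)       { *map |= 1UL << nr; }
static void clear_bit(int nr, unsigned long *map)     { *map &= ~(1UL << nr); }
static int test_bit(int nr, const unsigned long *map) { return (*map >> nr) & 1; }

int main(void)
{
	unsigned long closid_free_map = 0xff;

	clear_bit(0, &closid_free_map);	/* was: closid_free_map &= ~1 */
	assert(closid_free_map == 0xfe);

	set_bit(0, &closid_free_map);	/* was: closid_free_map |= 1 << 0 */
	assert(test_bit(0, &closid_free_map));
	return 0;
}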
From patchwork Thu May 25 18:01:53 2023
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 08/24] x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid
Date: Thu, 25 May 2023 18:01:53 +0000
Message-Id: <20230525180209.19497-9-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>
MPAM's PMG bits extend its PARTID space, meaning the same PMG value can
be used for different control groups.

This means once a CLOSID is allocated, all its monitoring ids may still
be dirty, and held in limbo.

Instead of allocating the first free CLOSID, on architectures where
CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID is enabled, search
closid_num_dirty_rmid[] to find the cleanest CLOSID.

The CLOSID found is returned to closid_alloc() for the free list to be
updated.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/internal.h |  2 ++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 45 ++++++++++++++++++++++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 19 ++++++++---
 3 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 23e20f89d2b3..96fb7658ff74 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -557,5 +557,7 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
 void __init thread_throttle_mode_init(void);
 void __init mbm_config_rftype_init(const char *config);
 void rdt_staged_configs_clear(void);
+bool closid_allocated(unsigned int closid);
+int resctrl_find_cleanest_closid(void);
 
 #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 1e7fa40ee471..128d4c7206e4 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -379,6 +379,51 @@ static struct rmid_entry *resctrl_find_free_rmid(u32 closid)
 	return ERR_PTR(-ENOSPC);
 }
 
+/**
+ * resctrl_find_cleanest_closid() - Find a CLOSID where all the associated
+ *                                  RMID are clean, or the CLOSID that has
+ *                                  the most clean RMID.
+ *
+ * MPAM's equivalent of RMID are per-CLOSID, meaning a freshly allocated CLOSID
+ * may not be able to allocate clean RMID. To avoid this the allocator will
+ * choose the CLOSID with the most clean RMID.
+ *
+ * When the CLOSID and RMID are independent numbers, the first free CLOSID will
+ * be returned.
+ *
+ * Call when CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID is defined. If not, call
+ * resctrl_closid_alloc_first_free() instead.
+ */
+int resctrl_find_cleanest_closid(void)
+{
+	u32 cleanest_closid = ~0, iter_num_dirty;
+	int i = 0;
+
+	lockdep_assert_held(&rdtgroup_mutex);
+
+	if (!IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID))
+		return -EIO;
+
+	for (i = 0; i < closids_supported(); i++) {
+		if (closid_allocated(i))
+			continue;
+
+		iter_num_dirty = closid_num_dirty_rmid[i];
+		if (iter_num_dirty == 0)
+			return i;
+
+		if (cleanest_closid == ~0)
+			cleanest_closid = i;
+
+		if (iter_num_dirty < closid_num_dirty_rmid[cleanest_closid])
+			cleanest_closid = i;
+	}
+
+	if (cleanest_closid == ~0)
+		return -ENOSPC;
+	return cleanest_closid;
+}
+
 /*
  * For MPAM the RMID value is not unique, and has to be considered with
  * the CLOSID. The (CLOSID, RMID) pair is allocated on all domains, which
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 6bf5623f82b4..1120c7700126 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -132,11 +132,20 @@ static void closid_init(void)
 
 static int closid_alloc(void)
 {
-	u32 closid = ffs(closid_free_map);
+	u32 closid;
+	int err;
 
-	if (closid == 0)
-		return -ENOSPC;
-	closid--;
+	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) {
+		err = resctrl_find_cleanest_closid();
+		if (err < 0)
+			return err;
+		closid = err;
+	} else {
+		closid = ffs(closid_free_map);
+		if (closid == 0)
+			return -ENOSPC;
+		closid--;
+	}
 
 	clear_bit(closid, &closid_free_map);
 
 	return closid;
@@ -154,7 +163,7 @@ void closid_free(int closid)
  * Return: true if @closid is currently associated with a resource group,
  * false if @closid is free
  */
-static bool closid_allocated(unsigned int closid)
+bool closid_allocated(unsigned int closid)
 {
 	return !test_bit(closid, &closid_free_map);
 }
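The selection policy is easier to follow outside the kernel: skip
allocated CLOSIDs, return immediately on a fully clean one, otherwise
remember the least dirty candidate. The arrays and sizes below are
made-up inputs, not resctrl state:

#include <stdio.h>

#define NUM_CLOSID 4

static int allocated[NUM_CLOSID] = { 1, 0, 0, 0 };	/* CLOSID 0 reserved */
static int num_dirty[NUM_CLOSID] = { 0, 3, 1, 2 };

static int find_cleanest_closid(void)
{
	int cleanest = -1;
	int i;

	for (i = 0; i < NUM_CLOSID; i++) {
		if (allocated[i])
			continue;
		if (num_dirty[i] == 0)
			return i;	/* fully clean: stop searching */
		if (cleanest < 0 || num_dirty[i] < num_dirty[cleanest])
			cleanest = i;
	}

	return cleanest;	/* -1 plays the role of -ENOSPC */
}

int main(void)
{
	printf("cleanest CLOSID: %d\n", find_cleanest_closid());	/* prints 2 */
	return 0;
}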
From patchwork Thu May 25 18:01:54 2023
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 09/24] x86/resctrl: Move CLOSID/RMID matching and setting to use helpers
Date: Thu, 25 May 2023 18:01:54 +0000
Message-Id: <20230525180209.19497-10-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

When switching tasks, the CLOSID and RMID that the new task should use
are stored in struct task_struct. For x86 the CLOSID known by resctrl,
the value in task_struct, and the value written to the CPU register are
all the same thing.

MPAM's CPU interface has two different PARTIDs: one for data accesses,
the other for instruction fetch. Storing resctrl's CLOSID value in
struct task_struct implies the arch code knows whether resctrl is using
CDP.

Move the matching and setting of the struct task_struct properties to
use helpers. This allows arm64 to store the hardware format of the
register, instead of having to convert it each time.
__rdtgroup_move_task()'s use of READ_ONCE()/WRITE_ONCE() ensures torn
values aren't seen, as another CPU may schedule the task being moved
while the value is being changed.

MPAM has an additional corner-case here as the PMG bits extend the
PARTID space. If the scheduler sees a new CLOSID but an old RMID, the
task will dirty an RMID that the limbo code is not watching, causing an
inaccurate count. x86's RMID are independent values, so the limbo code
will still be watching the old RMID in this circumstance.

To avoid this, arm64 needs the CLOSID and RMID to be WRITE_ONCE()d
together, so both values must be provided together. Because MPAM's RMID
values are not unique, the CLOSID must be provided when matching the
RMID.

Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
Changes since v2:
 * __rdtgroup_move_task() changed to set the CLOSID from a different
   place depending on the group type
---
 arch/x86/include/asm/resctrl.h         | 18 ++++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 62 ++++++++++++++++----------
 2 files changed, 56 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index dd9b638d43c8..78376c19ee6f 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -95,6 +95,24 @@ static inline unsigned int resctrl_arch_round_mon_val(unsigned int val)
 	return val * scale;
 }
 
+static inline void resctrl_arch_set_closid_rmid(struct task_struct *tsk,
+						u32 closid, u32 rmid)
+{
+	WRITE_ONCE(tsk->closid, closid);
+	WRITE_ONCE(tsk->rmid, rmid);
+}
+
+static inline bool resctrl_arch_match_closid(struct task_struct *tsk, u32 closid)
+{
+	return READ_ONCE(tsk->closid) == closid;
+}
+
+static inline bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 ignored,
+					   u32 rmid)
+{
+	return READ_ONCE(tsk->rmid) == rmid;
+}
+
 static inline void resctrl_sched_in(struct task_struct *tsk)
 {
 	if (static_branch_likely(&rdt_enable_key))
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 1120c7700126..f2bb3e09ed13 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -97,7 +97,7 @@ void rdt_staged_configs_clear(void)
  *
  * Using a global CLOSID across all resources has some advantages and
  * some drawbacks:
- * + We can simply set "current->closid" to assign a task to a resource
+ * + We can simply set current's closid to assign a task to a resource
  *   group.
  * + Context switch code can avoid extra memory references deciding which
  *   CLOSID to load into the PQR_ASSOC MSR
@@ -563,14 +563,26 @@ static void update_task_closid_rmid(struct task_struct *t)
 		_update_task_closid_rmid(t);
 }
 
+static bool task_in_rdtgroup(struct task_struct *tsk, struct rdtgroup *rdtgrp)
+{
+	u32 closid, rmid = rdtgrp->mon.rmid;
+
+	if (rdtgrp->type == RDTCTRL_GROUP)
+		closid = rdtgrp->closid;
+	else if (rdtgrp->type == RDTMON_GROUP)
+		closid = rdtgrp->mon.parent->closid;
+	else
+		return false;
+
+	return resctrl_arch_match_closid(tsk, closid) &&
+	       resctrl_arch_match_rmid(tsk, closid, rmid);
+}
+
 static int __rdtgroup_move_task(struct task_struct *tsk,
 				struct rdtgroup *rdtgrp)
 {
 	/* If the task is already in rdtgrp, no need to move the task. */
-	if ((rdtgrp->type == RDTCTRL_GROUP && tsk->closid == rdtgrp->closid &&
-	     tsk->rmid == rdtgrp->mon.rmid) ||
-	    (rdtgrp->type == RDTMON_GROUP && tsk->rmid == rdtgrp->mon.rmid &&
-	     tsk->closid == rdtgrp->mon.parent->closid))
+	if (task_in_rdtgroup(tsk, rdtgrp))
 		return 0;
 
 	/*
@@ -581,19 +593,19 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
 	 * For monitor groups, can move the tasks only from
 	 * their parent CTRL group.
 	 */
-
-	if (rdtgrp->type == RDTCTRL_GROUP) {
-		WRITE_ONCE(tsk->closid, rdtgrp->closid);
-		WRITE_ONCE(tsk->rmid, rdtgrp->mon.rmid);
-	} else if (rdtgrp->type == RDTMON_GROUP) {
-		if (rdtgrp->mon.parent->closid == tsk->closid) {
-			WRITE_ONCE(tsk->rmid, rdtgrp->mon.rmid);
-		} else {
-			rdt_last_cmd_puts("Can't move task to different control group\n");
-			return -EINVAL;
-		}
+	if (rdtgrp->type == RDTMON_GROUP &&
+	    !resctrl_arch_match_closid(tsk, rdtgrp->mon.parent->closid)) {
+		rdt_last_cmd_puts("Can't move task to different control group\n");
+		return -EINVAL;
 	}
 
+	if (rdtgrp->type == RDTMON_GROUP)
+		resctrl_arch_set_closid_rmid(tsk, rdtgrp->mon.parent->closid,
+					     rdtgrp->mon.rmid);
+	else
+		resctrl_arch_set_closid_rmid(tsk, rdtgrp->closid,
+					     rdtgrp->mon.rmid);
+
 	/*
 	 * Ensure the task's closid and rmid are written before determining if
 	 * the task is current that will decide if it will be interrupted.
@@ -615,14 +627,15 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
 
 static bool is_closid_match(struct task_struct *t, struct rdtgroup *r)
 {
-	return (rdt_alloc_capable &&
-		(r->type == RDTCTRL_GROUP) && (t->closid == r->closid));
+	return (rdt_alloc_capable && (r->type == RDTCTRL_GROUP) &&
+		resctrl_arch_match_closid(t, r->closid));
 }
 
 static bool is_rmid_match(struct task_struct *t, struct rdtgroup *r)
 {
-	return (rdt_mon_capable &&
-		(r->type == RDTMON_GROUP) && (t->rmid == r->mon.rmid));
+	return (rdt_mon_capable && (r->type == RDTMON_GROUP) &&
+		resctrl_arch_match_rmid(t, r->mon.parent->closid,
+					r->mon.rmid));
 }
 
 /**
@@ -818,7 +831,7 @@ int proc_resctrl_show(struct seq_file *s, struct pid_namespace *ns,
 		    rdtg->mode != RDT_MODE_EXCLUSIVE)
 			continue;
 
-		if (rdtg->closid != tsk->closid)
+		if (!resctrl_arch_match_closid(tsk, rdtg->closid))
 			continue;
 
 		seq_printf(s, "res:%s%s\n", (rdtg == &rdtgroup_default) ?
 			   "/" : "",
"/" : "", @@ -826,7 +839,8 @@ int proc_resctrl_show(struct seq_file *s, struct pid_namespace *ns, seq_puts(s, "mon:"); list_for_each_entry(crg, &rdtg->mon.crdtgrp_list, mon.crdtgrp_list) { - if (tsk->rmid != crg->mon.rmid) + if (!resctrl_arch_match_rmid(tsk, crg->mon.parent->closid, + crg->mon.rmid)) continue; seq_printf(s, "%s", crg->kn->name); break; @@ -2678,8 +2692,8 @@ static void rdt_move_group_tasks(struct rdtgroup *from, struct rdtgroup *to, for_each_process_thread(p, t) { if (!from || is_closid_match(t, from) || is_rmid_match(t, from)) { - WRITE_ONCE(t->closid, to->closid); - WRITE_ONCE(t->rmid, to->mon.rmid); + resctrl_arch_set_closid_rmid(t, to->closid, + to->mon.rmid); /* * Order the closid/rmid stores above before the loads From patchwork Thu May 25 18:01:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Morse X-Patchwork-Id: 99172 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp565994vqr; Thu, 25 May 2023 11:17:37 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4zimv5CU0GSD7cxGld+aRm2/y/YpWxiDOZpPmK89Lefeowt3jLGntoyolZZ/k0wWbKOOYI X-Received: by 2002:a05:6a20:4321:b0:10c:4ff5:38b7 with SMTP id h33-20020a056a20432100b0010c4ff538b7mr12976880pzk.6.1685038657314; Thu, 25 May 2023 11:17:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685038657; cv=none; d=google.com; s=arc-20160816; b=xE+VVshwrb6TpIIU+4SFEVMFynL2ZPMfskFzR9g0up9btTw9Z7k1HRmr1fUanTsWfl L47dI1KRNDpnDtjreiQ6H0Zdh/kvFYrnp+DRTBqtPjtWO+iT3arRsgTtSZd3tdh7alJ6 pBiHNj1JhV7Cn20frS6VwdRCpc5Of4SnQt3Yx6SuEzw8cxo1fy7DD16fUQGm1KVUjPgN DJBTawmtdMGwnHaxyS3jyaEoVjXyJqXpfLulZ1tgLpHG5C/6c2UkrhGOXgcu3u5soTQl 5DwFhTrDYIBo12ToeyevgHGOAnLi/QX6yvkqx7/2ouJYgX9pW/v7rnRADRB3mCiwqQP9 PL7Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=AnaEspv7yPHxWFYxao23k8O8tDzjRIv0xbGT3Sv6WLQ=; b=afy3gbYAbIWyOTe3AgaEof9oRxG3A5KrgX59Y4XnRXMpDgxN1ybV/cOiu/XmTzxgh1 qwmIZneoJICpNE1nQOF4YwT6ZWeadhS6wQhtxwEN1gXkUpOqP6jOYu2pssieDUuQtmjl 1xsJ5pYysuGYGM/5xxGjrwpZJM/k0BPSTt6aMAK2hneUjpL2j6E3XYl84J+SVMicZJwn xzJou96gdYM6T36j0wDdpzreDYLFTqdVn+PkWVlsOlqPY7MTk/wZnEfFIL2sLAnzb6Ut TLMdMVxPD3zRUF/IZ3yPd+SKMZ9nK6A8YrK0aWVWJutTIr1DHlCu4mpLp0RerxBvMyI8 LguA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. 
From patchwork Thu May 25 18:01:55 2023
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 10/24] tick/nohz: Move tick_nohz_full_mask declaration outside the #ifdef
Date: Thu, 25 May 2023 18:01:55 +0000
Message-Id: <20230525180209.19497-11-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

tick_nohz_full_mask lists the CPUs that are nohz_full. This is only
needed when CONFIG_NO_HZ_FULL is defined. tick_nohz_full_cpu() allows a
specific CPU to be tested against the mask, and evaluates to false when
CONFIG_NO_HZ_FULL is not defined.

The resctrl code needs to pick a CPU to run some work on; a new helper
prefers housekeeping CPUs by examining the tick_nohz_full_mask. Hiding
the declaration behind #ifdef CONFIG_NO_HZ_FULL forces all the users to
be behind an ifdef too.

Move the tick_nohz_full_mask declaration. This lets callers drop the
ifdef, and guard access to tick_nohz_full_mask with IS_ENABLED() or
something like tick_nohz_full_cpu().
The definition does not need to be moved as any callers should be
removed at compile time unless CONFIG_NO_HZ_FULL is defined.

Signed-off-by: James Morse
---
 include/linux/tick.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 9459fef5b857..65af90ca409a 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -174,9 +174,16 @@ static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
 static inline void tick_nohz_idle_stop_tick_protected(void) { }
 #endif /* !CONFIG_NO_HZ_COMMON */
 
+/*
+ * Mask of CPUs that are nohz_full.
+ *
+ * Users should be guarded by CONFIG_NO_HZ_FULL or a tick_nohz_full_cpu()
+ * check.
+ */
+extern cpumask_var_t tick_nohz_full_mask;
+
 #ifdef CONFIG_NO_HZ_FULL
 extern bool tick_nohz_full_running;
-extern cpumask_var_t tick_nohz_full_mask;
 
 static inline bool tick_nohz_full_enabled(void)
 {
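The "removed at compile time" claim rests on the compiler folding a
constant-false condition and dropping the reference to the extern, so no
definition is needed at link time. A reduced example of the pattern,
where EXAMPLE_CONFIG_ENABLED stands in for IS_ENABLED(CONFIG_NO_HZ_FULL)
(the kernel always builds with optimisation enabled, which guarantees
the fold):

extern unsigned long example_mask;	/* no definition in this build */

#define EXAMPLE_CONFIG_ENABLED 0	/* the config is off */

int uses_mask(void)
{
	if (EXAMPLE_CONFIG_ENABLED)
		return example_mask != 0;	/* folded away, no link error */

	return 0;
}

int main(void)
{
	return uses_mask();
}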
From patchwork Thu May 25 18:01:56 2023
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 11/24] x86/resctrl: Add cpumask_any_housekeeping() for limbo/overflow
Date: Thu, 25 May 2023 18:01:56 +0000
Message-Id: <20230525180209.19497-12-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

The limbo and overflow code picks a CPU to use from the domain's list
of online CPUs. Work is then scheduled on these CPUs to maintain the
limbo list and any counters that may overflow.

cpumask_any() may pick a CPU that is marked nohz_full, which will either
penalise the work that CPU was dedicated to, or delay the processing of
the limbo list or of the counters that may overflow, perhaps
indefinitely. Delaying the overflow handling will skew the bandwidth
values calculated by mba_sc, which expects to be called once a second.

Add cpumask_any_housekeeping() as a replacement for cpumask_any() that
prefers housekeeping CPUs. This helper will still return a nohz_full
CPU if that is the only option. The CPU to use is re-evaluated each
time the limbo/overflow work runs.
This ensures the work will move off a nohz_full CPU once a housekeeping
CPU is available.

Signed-off-by: James Morse
---
Changes since v3:
 * Typos fixed
---
 arch/x86/kernel/cpu/resctrl/internal.h | 23 +++++++++++++++++++++++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 17 ++++++++++++-----
 2 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 96fb7658ff74..6f18cf26988c 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include <linux/tick.h>
 #include
 
 #define L3_QOS_CDP_ENABLE		0x01ULL
@@ -55,6 +56,28 @@
 /* Max event bits supported */
 #define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)
 
+/**
+ * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
+ *			        aren't marked nohz_full
+ * @mask:	The mask to pick a CPU from.
+ *
+ * Returns a CPU in @mask. If there are housekeeping CPUs that don't use
+ * nohz_full, these are preferred.
+ */
+static inline unsigned int cpumask_any_housekeeping(const struct cpumask *mask)
+{
+	int cpu, hk_cpu;
+
+	cpu = cpumask_any(mask);
+	if (tick_nohz_full_cpu(cpu)) {
+		hk_cpu = cpumask_nth_andnot(0, mask, tick_nohz_full_mask);
+		if (hk_cpu < nr_cpu_ids)
+			cpu = hk_cpu;
+	}
+
+	return cpu;
+}
+
 struct rdt_fs_context {
 	struct kernfs_fs_context	kfc;
 	bool				enable_cdpl2;
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 128d4c7206e4..e267869d60d5 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -770,9 +770,9 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d,
 void cqm_handle_limbo(struct work_struct *work)
 {
 	unsigned long delay = msecs_to_jiffies(CQM_LIMBOCHECK_INTERVAL);
-	int cpu = smp_processor_id();
 	struct rdt_resource *r;
 	struct rdt_domain *d;
+	int cpu;
 
 	mutex_lock(&rdtgroup_mutex);
 
@@ -781,8 +781,10 @@ void cqm_handle_limbo(struct work_struct *work)
 
 	__check_limbo(d, false);
 
-	if (has_busy_rmid(r, d))
+	if (has_busy_rmid(r, d)) {
+		cpu = cpumask_any_housekeeping(&d->cpu_mask);
 		schedule_delayed_work_on(cpu, &d->cqm_limbo, delay);
+	}
 
 	mutex_unlock(&rdtgroup_mutex);
 }
@@ -792,7 +794,7 @@ void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms)
 	unsigned long delay = msecs_to_jiffies(delay_ms);
 	int cpu;
 
-	cpu = cpumask_any(&dom->cpu_mask);
+	cpu = cpumask_any_housekeeping(&dom->cpu_mask);
 	dom->cqm_work_cpu = cpu;
 
 	schedule_delayed_work_on(cpu, &dom->cqm_limbo, delay);
@@ -802,10 +804,10 @@ void mbm_handle_overflow(struct work_struct *work)
 {
 	unsigned long delay = msecs_to_jiffies(MBM_OVERFLOW_INTERVAL);
 	struct rdtgroup *prgrp, *crgrp;
-	int cpu = smp_processor_id();
 	struct list_head *head;
 	struct rdt_resource *r;
 	struct rdt_domain *d;
+	int cpu;
 
 	mutex_lock(&rdtgroup_mutex);
 
@@ -826,6 +828,11 @@ void mbm_handle_overflow(struct work_struct *work)
 		update_mba_bw(prgrp, d);
 	}
 
+	/*
+	 * Re-check for housekeeping CPUs. This allows the overflow handler to
+	 * move off a nohz_full CPU quickly.
+	 */
+	cpu = cpumask_any_housekeeping(&d->cpu_mask);
 	schedule_delayed_work_on(cpu, &d->mbm_over, delay);
 
 out_unlock:
@@ -839,7 +846,7 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)
 	if (!static_branch_likely(&rdt_mon_enable_key))
 		return;
 
-	cpu = cpumask_any(&dom->cpu_mask);
+	cpu = cpumask_any_housekeeping(&dom->cpu_mask);
 	dom->mbm_work_cpu = cpu;
 
 	schedule_delayed_work_on(cpu, &dom->mbm_over, delay);
 }
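The preference logic reduces to one mask operation. A model with plain
bitmasks instead of struct cpumask; any_housekeeping() and the masks are
illustrative, and __builtin_ctz() assumes a non-empty mask, much as
cpumask_any() assumes an online domain:

#include <stdio.h>

static int any_housekeeping(unsigned int mask, unsigned int nohz_full_mask)
{
	unsigned int hk = mask & ~nohz_full_mask;

	if (hk)
		return __builtin_ctz(hk);	/* first housekeeping CPU */

	return __builtin_ctz(mask);		/* only nohz_full CPUs remain */
}

int main(void)
{
	/* CPUs 0-3 in the domain, CPUs 0-1 nohz_full: picks CPU 2 */
	printf("%d\n", any_housekeeping(0xf, 0x3));

	/* every CPU in the domain is nohz_full: falls back to CPU 0 */
	printf("%d\n", any_housekeeping(0x3, 0x3));
	return 0;
}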
From patchwork Thu May 25 18:01:57 2023
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 12/24] x86/resctrl: Make resctrl_arch_rmid_read() retry when it is interrupted
Date: Thu, 25 May 2023 18:01:57 +0000
Message-Id: <20230525180209.19497-13-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

resctrl_arch_rmid_read() could be called by resctrl in process context,
and then called by the PMU driver from irq context on the same CPU.
This could cause struct arch_mbm_state's prev_msr value to go backwards,
leading to the chunks value being incremented multiple times.

The struct arch_mbm_state holds both the previous MSR value and a count
of the number of chunks. These two fields need to be updated atomically.

Read the prev_msr before accessing the hardware, and cmpxchg() the value
back. If the value has changed, the whole thing is re-attempted.
Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/internal.h | 5 +++--
 arch/x86/kernel/cpu/resctrl/monitor.c | 28 +++++++++++++++++++-------
 2 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 6f18cf26988c..7960366b9434 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -2,6 +2,7 @@
 #ifndef _ASM_X86_RESCTRL_INTERNAL_H
 #define _ASM_X86_RESCTRL_INTERNAL_H

+#include <linux/atomic.h>
 #include <linux/sched.h>
 #include <linux/kernfs.h>
 #include <linux/fs_context.h>
@@ -338,8 +339,8 @@ struct mbm_state {
  * find this struct.
  */
 struct arch_mbm_state {
-	u64	chunks;
-	u64	prev_msr;
+	atomic64_t	chunks;
+	atomic64_t	prev_msr;
 };

 /**
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index e267869d60d5..1f470e55d555 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -225,13 +225,15 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
 {
 	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
 	struct arch_mbm_state *am;
+	u64 msr_val;

 	am = get_arch_mbm_state(hw_dom, rmid, eventid);
 	if (am) {
 		memset(am, 0, sizeof(*am));

 		/* Record any initial, non-zero count value. */
-		__rmid_read(rmid, eventid, &am->prev_msr);
+		__rmid_read(rmid, eventid, &msr_val);
+		atomic64_set(&am->prev_msr, msr_val);
 	}
 }

@@ -266,23 +268,35 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
+	u64 start_msr_val, old_msr_val, msr_val, chunks;
 	struct arch_mbm_state *am;
-	u64 msr_val, chunks;
-	int ret;
+	int ret = 0;

 	if (!cpumask_test_cpu(smp_processor_id(), &d->cpu_mask))
 		return -EINVAL;

+interrupted:
+	am = get_arch_mbm_state(hw_dom, rmid, eventid);
+	if (am)
+		start_msr_val = atomic64_read(&am->prev_msr);
+
 	ret = __rmid_read(rmid, eventid, &msr_val);
 	if (ret)
 		return ret;

 	am = get_arch_mbm_state(hw_dom, rmid, eventid);
 	if (am) {
-		am->chunks += mbm_overflow_count(am->prev_msr, msr_val,
-						 hw_res->mbm_width);
-		chunks = get_corrected_mbm_count(rmid, am->chunks);
-		am->prev_msr = msr_val;
+		old_msr_val = atomic64_cmpxchg(&am->prev_msr, start_msr_val,
+					       msr_val);
+		if (old_msr_val != start_msr_val)
+			goto interrupted;
+
+		chunks = mbm_overflow_count(start_msr_val, msr_val,
+					    hw_res->mbm_width);
+		atomic64_add(chunks, &am->chunks);
+
+		chunks = get_corrected_mbm_count(rmid,
+						 atomic64_read(&am->chunks));
 	} else {
 		chunks = msr_val;
 	}
From patchwork Thu May 25 18:01:58 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99160
From: James Morse
Subject: [PATCH v4 13/24] x86/resctrl: Queue mon_event_read() instead of sending an IPI
Date: Thu, 25 May 2023 18:01:58 +0000
Message-Id: <20230525180209.19497-14-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>
Intel is blessed with an abundance of monitors, one per RMID, that can be read from any CPU in the domain. MPAM's monitors reside in the MMIO MSC, and the number implemented is up to the manufacturer. This means that when there are fewer monitors than needed, they need to be allocated and freed.

MPAM's CSU monitors are used to back the 'llc_occupancy' monitor file. The CSU counter is allowed to return 'not ready' for a small number of microseconds after programming. To allow one CSU hardware monitor to be used for multiple control or monitor groups, the CPU accessing the monitor needs to be able to block when configuring and reading the counter.

Worse, the domain may be broken up into slices, and the MMIO accesses for each slice may need performing from different CPUs.

These two details mean MPAM's monitor code needs to be able to sleep, and to IPI another CPU in the domain to read from a resource that has been sliced. mon_event_read() already invokes mon_event_count() via IPI, which means this isn't possible.

On systems using nohz_full, some CPUs need to be interrupted to run kernel work as they otherwise stay in user-space running realtime workloads. Interrupting these CPUs should be avoided, and scheduling work on them may never complete.

Change mon_event_read() to pick a housekeeping CPU (one that is not using nohz_full), schedule mon_event_count() there, and wait. If all the CPUs in a domain are using nohz_full, then an IPI is used as the fallback. This function is only used in response to a user-space filesystem request (not the timing-sensitive overflow code).

This allows MPAM to hide the slice behaviour from resctrl, and to keep the monitor allocation in monitor.c. When the IPI fallback is used on machines where MPAM needs to make an access on multiple CPUs, the counter read will always fail.

Tested-by: Shaopeng Tan
Reviewed-by: Peter Newman
Tested-by: Peter Newman
Signed-off-by: James Morse
---
Changes since v2:
 * Use cpumask_any_housekeeping() and fall back to an IPI if needed.

Changes since v3:
 * Actually include the IPI fallback code.
---
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 28 +++++++++++++++++++++--
 arch/x86/kernel/cpu/resctrl/monitor.c | 2 +-
 2 files changed, 27 insertions(+), 3 deletions(-)
*/ rr->rgrp = rdtgrp; rr->evtid = evtid; @@ -534,7 +547,18 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, rr->val = 0; rr->first = first; - smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1); + cpu = cpumask_any_housekeeping(&d->cpu_mask); + + /* + * cpumask_any_housekeeping() prefers housekeeping CPUs, but + * are all the CPUs nohz_full? If yes, pick a CPU to IPI. + * MPAM's resctrl_arch_rmid_read() is unable to read the + * counters on some platforms if its called in irq context. + */ + if (tick_nohz_full_cpu(cpu)) + smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1); + else + smp_call_on_cpu(cpu, smp_mon_event_count, rr, false); } int rdtgroup_mondata_show(struct seq_file *m, void *arg) diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 1f470e55d555..6ba40495589a 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -589,7 +589,7 @@ static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr) } /* - * This is called via IPI to read the CQM/MBM counters + * This is scheduled by mon_event_read() to read the CQM/MBM counters * on a domain. */ void mon_event_count(void *info) From patchwork Thu May 25 18:01:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Morse X-Patchwork-Id: 99162 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp563460vqr; Thu, 25 May 2023 11:13:27 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4ggekU++nN6OqMhEh7y8d2nGJrYQkAxkUEjDpYffji9IPWEUylpn9HbwX71Kma+0CX1KEv X-Received: by 2002:a05:6a20:430e:b0:10e:f1e3:8217 with SMTP id h14-20020a056a20430e00b0010ef1e38217mr4950102pzk.17.1685038407047; Thu, 25 May 2023 11:13:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685038407; cv=none; d=google.com; s=arc-20160816; b=ynWIiiw5a6MYbh9BSzNzQ1okAdc5Jf+oawMwYCeOwRPMWQbmnW8/wFzcywAZN0ckXi MKm3D1uJDXcSiqprLaWoKpLWX7Rt1nzNDh3m2Uh1pH7YAXaPh2ntGcooEBPx4/MmZR5N VjM+FovkR+OOP88R6Up9vgRR27DPqW2lHMcLL/1BF2SL0F1M+dua3W/Z6+3u1P6XgMD/ cE8khvfyimkg9O3gt6L9HzuhRdTVkDPZ65Y6BlxGhIUfm0NpC/dyW9J4n909eRDyDrYR PBCaXRLrQU1X1E8FLaWL3m9BkY+t2CzCQ+Awah7EfoVDWK+7QdhYOlagI0FsgWw3Ip6H i3yg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=bE1RrGJbjs5L6SpMr6Lk9hqblF/pkjBvwASZkD3GwMg=; b=ufJKkpLY093/6H/Mfzb3+HUBoIiT8AgAs7zcCvVZuDovSwM4iZD7qCijlcUUldlvKY 5cuIEOqeG8LK1Ylkpt5rW4eQGgAdHijhJ+axwUqRDgElxOFZdztqWd2g+37vp+XICkvx ntospJzL5ceuLpo1X9iqFpu9Erc+Z/1sp4G89gpTLB0stCv5kMS/TECbNZdwA3A0Hw+O guX9STE0xsfsCbp9a6BFxRHSFZH8fnSAVFkaVgtHlhxrJ9n9W6zglIWd7eTIUQXdhTqz 4MOvKhUbWNukXvkp2fJLjbZG0nMqKfP7jpvg8x8SBZBhvXNxGd/nkVW5jRUkevyLzgp7 2lVg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. 
From patchwork Thu May 25 18:01:59 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99162
From: James Morse
Subject: [PATCH v4 14/24] x86/resctrl: Allow resctrl_arch_rmid_read() to sleep
Date: Thu, 25 May 2023 18:01:59 +0000
Message-Id: <20230525180209.19497-15-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

MPAM's cache occupancy counters can take a little while to settle once the monitor has been configured. The maximum settling time is described to the driver via a firmware table. The value could be large enough that it makes sense to sleep. To avoid exposing this to resctrl, it should be hidden behind MPAM's resctrl_arch_rmid_read().

resctrl_arch_rmid_read() may be called via IPI, meaning it is unable to sleep. In this case resctrl_arch_rmid_read() should return an error if it needs to sleep. This will only affect MPAM platforms where the cache occupancy counter isn't available immediately, nohz_full is in use, and there are no housekeeping CPUs in the necessary domain.
There are three callers of resctrl_arch_rmid_read(). __mon_event_count() and __check_limbo() are both called from a non-migratable context: mon_event_read() invokes __mon_event_count() using smp_call_on_cpu(), which adds work to the target CPU's workqueue (rdtgroup_mutex is held, meaning this cannot race with the resctrl cpuhp callback), and __check_limbo() is invoked via schedule_delayed_work_on(), which also adds work to a per-CPU workqueue.

The remaining caller is add_rmid_to_limbo(), which is called in response to a user-space syscall that frees an RMID. This opportunistically reads the llc_occupancy counter on the current domain to see if the RMID is over the dirty threshold. This has to disable preemption to avoid reading the wrong domain's value, and disabling preemption here prevents resctrl_arch_rmid_read() from sleeping.

add_rmid_to_limbo() walks each domain, but only reads the counter on one domain. If the system has more than one domain, the RMID will always be added to the limbo list. If the RMID's usage was not over the threshold, it will be removed from the list when __check_limbo() runs. Make this the default behaviour: free RMIDs are always added to the limbo list for each domain.

The user-visible effect of this is that a clean RMID is not available for re-allocation immediately after 'rmdir()' completes. This behaviour was never portable, as it never happened on a machine with multiple domains.

Removing this path allows resctrl_arch_rmid_read() to sleep if it's called with interrupts unmasked. Document that this is the expected behaviour, and add a might_sleep() annotation to catch changes that won't work on arm64.

Signed-off-by: James Morse
---
The previous version allowed resctrl_arch_rmid_read() to be called on the wrong CPUs, but now that this needs to take nohz_full and housekeeping into account, it's too complex.

Changes since v3:
 * Removed error handling for smp_call_function_any(); this can't race with the cpuhp callbacks as both hold rdtgroup_mutex.
 * Switched to the alternative of removing the counter read; this simplifies things dramatically.
---
 arch/x86/kernel/cpu/resctrl/monitor.c | 15 ++-------------
 include/linux/resctrl.h | 18 +++++++++++++++++-
 2 files changed, 19 insertions(+), 14 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 6ba40495589a..fb33100e172b 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -272,6 +272,8 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
 	struct arch_mbm_state *am;
 	int ret = 0;

+	resctrl_arch_rmid_read_context_check();
+
 	if (!cpumask_test_cpu(smp_processor_id(), &d->cpu_mask))
 		return -EINVAL;

@@ -462,8 +464,6 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 {
 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
 	struct rdt_domain *d;
-	int cpu, err;
-	u64 val = 0;
 	u32 idx;

 	lockdep_assert_held(&rdtgroup_mutex);
@@ -471,17 +471,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 	idx = resctrl_arch_rmid_idx_encode(entry->closid, entry->rmid);

 	entry->busy = 0;
-	cpu = get_cpu();
 	list_for_each_entry(d, &r->domains, list) {
-		if (cpumask_test_cpu(cpu, &d->cpu_mask)) {
-			err = resctrl_arch_rmid_read(r, d, entry->closid,
-						     entry->rmid,
-						     QOS_L3_OCCUP_EVENT_ID,
-						     &val);
-			if (err || val <= resctrl_rmid_realloc_threshold)
-				continue;
-		}
-
 		/*
 		 * For the first limbo RMID in the domain,
 		 * setup up the limbo worker.
@@ -491,7 +481,6 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 		set_bit(idx, d->rmid_busy_llc);
 		entry->busy++;
 	}
-	put_cpu();

 	if (entry->busy)
 		rmid_limbo_count++;
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index ff7452f644e4..b961936decfa 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -234,7 +234,12 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d);
 * @eventid:	eventid to read, e.g. L3 occupancy.
 * @val:	result of the counter read in bytes.
 *
- * Call from process context on a CPU that belongs to domain @d.
+ * Some architectures need to sleep when first programming some of the counters.
+ * (specifically: arm64's MPAM cache occupancy counters can return 'not ready'
+ * for a short period of time). Call from a non-migratable process context on
+ * a CPU that belongs to domain @d. e.g. use smp_call_on_cpu() or
+ * schedule_work_on(). This function can be called with interrupts masked,
+ * e.g. using smp_call_function_any(), but may consistently return an error.
 *
 * Return:
 * 0 on success, or -EIO, -EINVAL etc on error.
@@ -243,6 +248,17 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
			   u32 closid, u32 rmid, enum resctrl_event_id eventid,
			   u64 *val);

+/**
+ * resctrl_arch_rmid_read_context_check() - warn about invalid contexts
+ *
+ * When built with CONFIG_DEBUG_ATOMIC_SLEEP, this function will generate a
+ * warning when resctrl_arch_rmid_read() is called from an invalid context.
+ */
+static inline void resctrl_arch_rmid_read_context_check(void)
+{
+	if (!irqs_disabled())
+		might_sleep();
+}

 /**
  * resctrl_arch_reset_rmid() - Reset any private state associated with rmid
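As an illustration of the contract documented above, a hypothetical MPAM-flavoured implementation might poll a 'not ready' counter as sketched here. The mpam_* helpers are invented for the sketch; only resctrl_arch_rmid_read_context_check() comes from the patch:

#include <linux/delay.h>
#include <linux/errno.h>
#include <linux/irqflags.h>
#include <linux/types.h>

/* Invented helpers standing in for a real MPAM driver's MMIO accessors. */
bool mpam_counter_ready(void);
u64 mpam_counter_value(void);

static int sketch_arch_rmid_read(u64 *val)
{
	resctrl_arch_rmid_read_context_check();

	while (!mpam_counter_ready()) {
		/*
		 * Called via IPI (interrupts masked): sleeping is not an
		 * option, so consistently return an error, as the
		 * kernel-doc above allows.
		 */
		if (irqs_disabled())
			return -EIO;

		/* Process context: wait for the counter to settle. */
		usleep_range(10, 50);
	}

	*val = mpam_counter_value();
	return 0;
}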
From patchwork Thu May 25 18:02:00 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99161
From: James Morse
Subject: [PATCH v4 15/24] x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read()
Date: Thu, 25 May 2023 18:02:00 +0000
Message-Id: <20230525180209.19497-16-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

Depending on the number of monitors available, Arm's MPAM may need to allocate a monitor prior to reading the counter value. Allocating a contended resource may involve sleeping.

add_rmid_to_limbo() calls resctrl_arch_rmid_read() for multiple domains, so the allocation should be valid for all domains. __check_limbo() and mon_event_count() each make multiple calls to resctrl_arch_rmid_read(); to avoid extra work on contended systems, the allocation should be valid for multiple invocations of resctrl_arch_rmid_read().
Add arch hooks for this allocation, which need calling before resctrl_arch_rmid_read(). The allocated monitor is passed to resctrl_arch_rmid_read(), then freed again afterwards. The helper can be called on any CPU, and can sleep.

Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
Changes since v3:
 * Expanded comment.
 * Removed stray header include.
 * Reworded commit message.
 * Made ctx a void * instead of an int.
---
 arch/x86/include/asm/resctrl.h | 11 ++++++++++
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 5 +++++
 arch/x86/kernel/cpu/resctrl/internal.h | 1 +
 arch/x86/kernel/cpu/resctrl/monitor.c | 26 +++++++++++++++++++----
 include/linux/resctrl.h | 5 ++++-
 5 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 78376c19ee6f..20729364982b 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -136,6 +136,17 @@ static inline u32 resctrl_arch_rmid_idx_encode(u32 ignored, u32 rmid)
 	return rmid;
 }

+/* x86 can always read an rmid, nothing needs allocating */
+struct rdt_resource;
+static inline void *resctrl_arch_mon_ctx_alloc(struct rdt_resource *r, int evtid)
+{
+	might_sleep();
+	return NULL;
+};
+
+static inline void resctrl_arch_mon_ctx_free(struct rdt_resource *r, int evtid,
+					     void *ctx) { };
+
 void resctrl_cpu_detect(struct cpuinfo_x86 *c);

 #else
diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 6eeccad192ee..280d66fae21c 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -546,6 +546,9 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
 	rr->d = d;
 	rr->val = 0;
 	rr->first = first;
+	rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, evtid);
+	if (IS_ERR(rr->arch_mon_ctx))
+		return;

 	cpu = cpumask_any_housekeeping(&d->cpu_mask);

@@ -559,6 +562,8 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
 		smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1);
 	else
 		smp_call_on_cpu(cpu, smp_mon_event_count, rr, false);
+
+	resctrl_arch_mon_ctx_free(r, evtid, rr->arch_mon_ctx);
 }

 int rdtgroup_mondata_show(struct seq_file *m, void *arg)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 7960366b9434..a7e025cffdbc 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -136,6 +136,7 @@ struct rmid_read {
 	bool first;
 	int err;
 	u64 val;
+	void *arch_mon_ctx;
 };

 extern bool rdt_alloc_capable;
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index fb33100e172b..6d140018358a 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -264,7 +264,7 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)

 int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
			   u32 closid, u32 rmid, enum resctrl_event_id eventid,
-			   u64 *val)
+			   u64 *val, void *ignored)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
@@ -331,9 +331,14 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
 	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
 	struct rmid_entry *entry;
 	u32 idx, cur_idx = 1;
+	void *arch_mon_ctx;
 	bool rmid_dirty;
 	u64 val = 0;

+	arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, QOS_L3_OCCUP_EVENT_ID);
+	if (IS_ERR(arch_mon_ctx))
+		return;
+
 	/*
 	 * Skip RMID 0 and start from RMID 1 and check all the RMIDs that
 	 * are marked as busy for occupancy < threshold. If the occupancy
@@ -347,7 +352,8 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
 		entry = __rmid_entry(idx);

 		if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid,
-					   QOS_L3_OCCUP_EVENT_ID, &val)) {
+					   QOS_L3_OCCUP_EVENT_ID, &val,
+					   arch_mon_ctx)) {
 			rmid_dirty = true;
 		} else {
 			rmid_dirty = (val >= resctrl_rmid_realloc_threshold);
@@ -360,6 +366,8 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
 		}
 		cur_idx = idx + 1;
 	}
+
+	resctrl_arch_mon_ctx_free(r, QOS_L3_OCCUP_EVENT_ID, arch_mon_ctx);
 }

 bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d)
@@ -539,7 +547,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
 	}

 	rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid, rr->evtid,
-					 &tval);
+					 &tval, rr->arch_mon_ctx);
 	if (rr->err)
 		return rr->err;

@@ -589,7 +597,6 @@ void mon_event_count(void *info)
 	int ret;

 	rdtgrp = rr->rgrp;
-
 	ret = __mon_event_count(rdtgrp->closid, rdtgrp->mon.rmid, rr);

 	/*
@@ -749,11 +756,21 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d,
 	if (is_mbm_total_enabled()) {
 		rr.evtid = QOS_L3_MBM_TOTAL_EVENT_ID;
 		rr.val = 0;
+		rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
+		if (IS_ERR(rr.arch_mon_ctx))
+			return;
+
 		__mon_event_count(closid, rmid, &rr);
+
+		resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
 	}
 	if (is_mbm_local_enabled()) {
 		rr.evtid = QOS_L3_MBM_LOCAL_EVENT_ID;
 		rr.val = 0;
+		rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid);
+		if (IS_ERR(rr.arch_mon_ctx))
+			return;
+
 		__mon_event_count(closid, rmid, &rr);

 		/*
@@ -763,6 +780,7 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d,
 		 */
 		if (is_mba_sc(NULL))
 			mbm_bw_count(closid, rmid, &rr);
+		resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx);
 	}
 }

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index b961936decfa..0dcb5cfde609 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -233,6 +233,9 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d);
 * @rmid:	rmid of the counter to read.
 * @eventid:	eventid to read, e.g. L3 occupancy.
 * @val:	result of the counter read in bytes.
+ * @arch_mon_ctx: An architecture specific value from
+ *		  resctrl_arch_mon_ctx_alloc(), for MPAM this identifies
+ *		  the hardware monitor allocated for this read request.
 *
 * Some architectures need to sleep when first programming some of the counters.
 * (specifically: arm64's MPAM cache occupancy counters can return 'not ready'
@@ -246,7 +249,7 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d);
 */
 int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
			   u32 closid, u32 rmid, enum resctrl_event_id eventid,
-			   u64 *val);
+			   u64 *val, void *arch_mon_ctx);

 /**
  * resctrl_arch_rmid_read_context_check() - warn about invalid contexts
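For contrast with the x86 stubs above, a sketch of how an MPAM-like architecture might implement the hooks. The mpam_monitor pool and its alloc/free helpers are invented for illustration; a real driver would size the pool from firmware tables:

#include <linux/err.h>
#include <linux/kernel.h>
#include <linux/mutex.h>

/* Invented monitor pool. */
struct mpam_monitor {
	bool busy;
};

static struct mpam_monitor sketch_pool[4];
static DEFINE_MUTEX(sketch_pool_lock);

/* May sleep waiting for the lock, matching the documented contract. */
static void *sketch_arch_mon_ctx_alloc(void)
{
	int i;

	mutex_lock(&sketch_pool_lock);
	for (i = 0; i < ARRAY_SIZE(sketch_pool); i++) {
		if (!sketch_pool[i].busy) {
			sketch_pool[i].busy = true;
			mutex_unlock(&sketch_pool_lock);
			return &sketch_pool[i];
		}
	}
	mutex_unlock(&sketch_pool_lock);

	/* Callers check with IS_ERR(), as mon_event_read() does above. */
	return ERR_PTR(-EBUSY);
}

static void sketch_arch_mon_ctx_free(void *ctx)
{
	struct mpam_monitor *mon = ctx;

	mutex_lock(&sketch_pool_lock);
	mon->busy = false;
	mutex_unlock(&sketch_pool_lock);
}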
From patchwork Thu May 25 18:02:01 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99155
From: James Morse
Subject: [PATCH v4 16/24] x86/resctrl: Make resctrl_mounted checks explicit
Date: Thu, 25 May 2023 18:02:01 +0000
Message-Id: <20230525180209.19497-17-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

The rdt_enable_key is switched when resctrl is mounted, and used to prevent a second mount of the filesystem. It also enables the architecture's context switch code. This requires another architecture to have the same set of static keys, as resctrl depends on them too.

The existing users of these static keys are implicitly also checking whether the filesystem is mounted. Make the resctrl_mounted checks explicit: resctrl can keep track of whether it has been mounted once. This doesn't need to be combined with whether the arch code is context switching the CLOSID.

rdt_mon_enable_key is never used just to test that resctrl is mounted, but it does also have this implication. Add a resctrl_mounted check to all uses of rdt_mon_enable_key.
This will allow rdt_mon_enable_key to be swapped with a helper in a subsequent patch, and will allow the static-key changing to be moved behind resctrl_arch_ calls.

Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
Changes since v3:
 * Removed a newline.
 * Rephrased commit message.
---
 arch/x86/kernel/cpu/resctrl/internal.h | 1 +
 arch/x86/kernel/cpu/resctrl/monitor.c | 12 ++++++++++--
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 21 +++++++++++++++------
 3 files changed, 26 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index a7e025cffdbc..67cabda6fd4d 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -143,6 +143,7 @@ extern bool rdt_alloc_capable;
 extern bool rdt_mon_capable;
 extern unsigned int rdt_mon_features;
 extern struct list_head resctrl_schema_all;
+extern bool resctrl_mounted;

 enum rdt_group_type {
 	RDTCTRL_GROUP = 0,
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 6d140018358a..da5a86c95142 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -832,7 +832,11 @@ void mbm_handle_overflow(struct work_struct *work)

 	mutex_lock(&rdtgroup_mutex);

-	if (!static_branch_likely(&rdt_mon_enable_key))
+	/*
+	 * If the filesystem has been unmounted this work no longer needs to
+	 * run.
+	 */
+	if (!resctrl_mounted || !static_branch_likely(&rdt_mon_enable_key))
 		goto out_unlock;

 	r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
@@ -865,7 +869,11 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)
 	unsigned long delay = msecs_to_jiffies(delay_ms);
 	int cpu;

-	if (!static_branch_likely(&rdt_mon_enable_key))
+	/*
+	 * When a domain comes online there is no guarantee the filesystem is
+	 * mounted. If not, there is no need to catch counter overflow.
+	 */
+	if (!resctrl_mounted || !static_branch_likely(&rdt_mon_enable_key))
 		return;

 	cpu = cpumask_any_housekeeping(&dom->cpu_mask);
 	dom->mbm_work_cpu = cpu;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index f2bb3e09ed13..47bb3ab775fc 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -42,6 +42,9 @@ LIST_HEAD(rdt_all_groups);
 /* list of entries for the schemata file */
 LIST_HEAD(resctrl_schema_all);

+/* The filesystem can only be mounted once. */
+bool resctrl_mounted;
+
 /* Kernel fs node for "info" directory under root */
 static struct kernfs_node *kn_info;

@@ -815,7 +818,7 @@ int proc_resctrl_show(struct seq_file *s, struct pid_namespace *ns,
 	mutex_lock(&rdtgroup_mutex);

 	/* Return empty if resctrl has not been mounted. */
-	if (!static_branch_unlikely(&rdt_enable_key)) {
+	if (!resctrl_mounted) {
 		seq_puts(s, "res:\nmon:\n");
 		goto unlock;
 	}
@@ -2482,7 +2485,7 @@ static int rdt_get_tree(struct fs_context *fc)
 	/*
 	 * resctrl file system can only be mounted once.
 	 */
-	if (static_branch_unlikely(&rdt_enable_key)) {
+	if (resctrl_mounted) {
 		ret = -EBUSY;
 		goto out;
 	}
@@ -2530,8 +2533,10 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (rdt_mon_capable)
 		static_branch_enable_cpuslocked(&rdt_mon_enable_key);

-	if (rdt_alloc_capable || rdt_mon_capable)
+	if (rdt_alloc_capable || rdt_mon_capable) {
 		static_branch_enable_cpuslocked(&rdt_enable_key);
+		resctrl_mounted = true;
+	}

 	if (is_mbm_enabled()) {
 		r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
@@ -2802,6 +2807,7 @@ static void rdt_kill_sb(struct super_block *sb)
 	static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
 	static_branch_disable_cpuslocked(&rdt_mon_enable_key);
 	static_branch_disable_cpuslocked(&rdt_enable_key);
+	resctrl_mounted = false;
 	kernfs_kill_sb(sb);
 	mutex_unlock(&rdtgroup_mutex);
 	cpus_read_unlock();
@@ -3633,7 +3639,7 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d)
 	 * If resctrl is mounted, remove all the
 	 * per domain monitor data directories.
 	 */
-	if (static_branch_unlikely(&rdt_mon_enable_key))
+	if (resctrl_mounted && static_branch_unlikely(&rdt_mon_enable_key))
 		rmdir_mondata_subdir_allrdtgrp(r, d->id);

 	if (is_mbm_enabled())
@@ -3710,8 +3716,11 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
 	if (is_llc_occupancy_enabled())
 		INIT_DELAYED_WORK(&d->cqm_limbo, cqm_handle_limbo);

-	/* If resctrl is mounted, add per domain monitor data directories. */
-	if (static_branch_unlikely(&rdt_mon_enable_key))
+	/*
+	 * If the filesystem is not mounted, creating directories is deferred
+	 * until mount time by rdt_get_tree() calling mkdir_mondata_all().
+	 */
+	if (resctrl_mounted && static_branch_unlikely(&rdt_mon_enable_key))
 		mkdir_mondata_subdir_allrdtgrp(r, d);

 	return 0;
From patchwork Thu May 25 18:02:02 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99156
From: James Morse
Subject: [PATCH v4 17/24] x86/resctrl: Move alloc/mon static keys into helpers
Date: Thu, 25 May 2023 18:02:02 +0000
Message-Id: <20230525180209.19497-18-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>

resctrl enables three static keys depending on the features it has enabled. Another architecture's context switch code may look different; any static keys that control it should be buried behind helpers.

Move the alloc/mon logic into arch-specific helpers as a preparatory step for making the rdt_enable_key's status something the arch code decides. This means other architectures don't have to mirror the static keys.
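To see why the indirection helps: a hypothetical non-x86 implementation of the same hooks could skip static keys entirely, because its context-switch path may not need them. Plain flags here, purely for illustration; the sketch_ names are invented:

/*
 * Sketch of how an architecture without context-switch static keys
 * might provide the same resctrl_arch_ helpers.
 */
static bool sketch_alloc_enabled;
static bool sketch_mon_enabled;

static inline void sketch_resctrl_arch_enable_alloc(void)
{
	sketch_alloc_enabled = true;
}

static inline void sketch_resctrl_arch_disable_alloc(void)
{
	sketch_alloc_enabled = false;
}

static inline void sketch_resctrl_arch_enable_mon(void)
{
	sketch_mon_enabled = true;
}

static inline void sketch_resctrl_arch_disable_mon(void)
{
	sketch_mon_enabled = false;
}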
Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
 arch/x86/include/asm/resctrl.h | 20 ++++++++++++++++++++
 arch/x86/kernel/cpu/resctrl/internal.h | 5 -----
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 8 ++++----
 3 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 20729364982b..83ec2b5791f0 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -42,6 +42,26 @@ DECLARE_STATIC_KEY_FALSE(rdt_enable_key);
 DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
 DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);

+static inline void resctrl_arch_enable_alloc(void)
+{
+	static_branch_enable_cpuslocked(&rdt_alloc_enable_key);
+}
+
+static inline void resctrl_arch_disable_alloc(void)
+{
+	static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
+}
+
+static inline void resctrl_arch_enable_mon(void)
+{
+	static_branch_enable_cpuslocked(&rdt_mon_enable_key);
+}
+
+static inline void resctrl_arch_disable_mon(void)
+{
+	static_branch_disable_cpuslocked(&rdt_mon_enable_key);
+}
+
 /*
  * __resctrl_sched_in() - Writes the task's CLOSid/RMID to IA32_PQR_MSR
  *
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 67cabda6fd4d..8660210ae958 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -93,9 +93,6 @@ static inline struct rdt_fs_context *rdt_fc2context(struct fs_context *fc)
 	return container_of(kfc, struct rdt_fs_context, kfc);
 }

-DECLARE_STATIC_KEY_FALSE(rdt_enable_key);
-DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
-
 /**
  * struct mon_evt - Entry in the event list of a resource
  * @evtid:	event id
@@ -453,8 +450,6 @@ extern struct mutex rdtgroup_mutex;
 extern struct rdt_hw_resource rdt_resources_all[];
 extern struct rdtgroup rdtgroup_default;

-DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
-
 extern struct dentry *debugfs_resctrl;

 enum resctrl_res_level {
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 47bb3ab775fc..9e65248908d6 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2529,9 +2529,9 @@ static int rdt_get_tree(struct fs_context *fc)
 		goto out_psl;

 	if (rdt_alloc_capable)
-		static_branch_enable_cpuslocked(&rdt_alloc_enable_key);
+		resctrl_arch_enable_alloc();
 	if (rdt_mon_capable)
-		static_branch_enable_cpuslocked(&rdt_mon_enable_key);
+		resctrl_arch_enable_mon();

 	if (rdt_alloc_capable || rdt_mon_capable) {
 		static_branch_enable_cpuslocked(&rdt_enable_key);
@@ -2804,8 +2804,8 @@ static void rdt_kill_sb(struct super_block *sb)
 	rdt_pseudo_lock_release();
 	rdtgroup_default.mode = RDT_MODE_SHAREABLE;
 	schemata_list_destroy();
-	static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
-	static_branch_disable_cpuslocked(&rdt_mon_enable_key);
+	resctrl_arch_disable_alloc();
+	resctrl_arch_disable_mon();
 	static_branch_disable_cpuslocked(&rdt_enable_key);
 	resctrl_mounted = false;
 	kernfs_kill_sb(sb);
From patchwork Thu May 25 18:02:03 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99173
From: James Morse
Subject: [PATCH v4 18/24] x86/resctrl: Make rdt_enable_key the arch's decision to switch
Date: Thu, 25 May 2023 18:02:03 +0000
Message-Id: <20230525180209.19497-19-james.morse@arm.com>
In-Reply-To: <20230525180209.19497-1-james.morse@arm.com>
References: <20230525180209.19497-1-james.morse@arm.com>
rdt_enable_key is switched when resctrl is mounted. It was also previously used to prevent a second mount of the filesystem. Any other architecture that wants to support resctrl has to provide identical static keys.

Now that there are helpers for enabling and disabling the alloc/mon keys, resctrl doesn't need to switch this extra key; it can be done by the arch code. Use the static-key increment and decrement helpers, and change resctrl to ensure the calls are balanced.

Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
 arch/x86/include/asm/resctrl.h | 4 ++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 11 +++++------
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 83ec2b5791f0..839b701cf7bd 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -45,21 +45,25 @@ DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
 static inline void resctrl_arch_enable_alloc(void)
 {
 	static_branch_enable_cpuslocked(&rdt_alloc_enable_key);
+	static_branch_inc_cpuslocked(&rdt_enable_key);
 }

 static inline void resctrl_arch_disable_alloc(void)
 {
 	static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
+	static_branch_dec_cpuslocked(&rdt_enable_key);
 }

 static inline void resctrl_arch_enable_mon(void)
 {
 	static_branch_enable_cpuslocked(&rdt_mon_enable_key);
+	static_branch_inc_cpuslocked(&rdt_enable_key);
 }

 static inline void resctrl_arch_disable_mon(void)
 {
 	static_branch_disable_cpuslocked(&rdt_mon_enable_key);
+	static_branch_dec_cpuslocked(&rdt_enable_key);
 }

 /*
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 9e65248908d6..501b68b95aef 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2533,10 +2533,8 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (rdt_mon_capable)
 		resctrl_arch_enable_mon();

-	if (rdt_alloc_capable || rdt_mon_capable) {
-		static_branch_enable_cpuslocked(&rdt_enable_key);
+	if (rdt_alloc_capable || rdt_mon_capable)
 		resctrl_mounted = true;
-	}

 	if (is_mbm_enabled()) {
 		r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
@@ -2804,9 +2802,10 @@ static void rdt_kill_sb(struct super_block *sb)
 	rdt_pseudo_lock_release();
 	rdtgroup_default.mode = RDT_MODE_SHAREABLE;
 	schemata_list_destroy();
-	resctrl_arch_disable_alloc();
-	resctrl_arch_disable_mon();
-	static_branch_disable_cpuslocked(&rdt_enable_key);
+	if (rdt_alloc_capable)
+		resctrl_arch_disable_alloc();
+	if (rdt_mon_capable)
+		resctrl_arch_disable_mon();
 	resctrl_mounted = false;
 	kernfs_kill_sb(sb);
 	mutex_unlock(&rdtgroup_mutex);
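Why inc/dec rather than enable/disable: the incremented form turns the key into a reference count, so rdt_enable_key stays enabled until both the alloc and mon users have gone away. A schematic of the resulting behaviour, with invented example_ names:

#include <linux/jump_label.h>

DEFINE_STATIC_KEY_FALSE(example_key);

/*
 * Schematic only: each enable bumps the key's reference count, each
 * disable drops it; the branch stays enabled while the count is non-zero.
 */
static void example_mount(bool alloc_capable, bool mon_capable)
{
	if (alloc_capable)
		static_branch_inc(&example_key);	/* count: 0 -> 1, enabled */
	if (mon_capable)
		static_branch_inc(&example_key);	/* count: 1 -> 2 */
}

static void example_unmount(bool alloc_capable, bool mon_capable)
{
	if (alloc_capable)
		static_branch_dec(&example_key);	/* count: 2 -> 1, still enabled */
	if (mon_capable)
		static_branch_dec(&example_key);	/* count: 1 -> 0, disabled */
}

This is also why the unmount path above only calls the disable helpers for capabilities that were actually enabled: the increments and decrements must balance.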
From patchwork Thu May 25 18:02:04 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99158
From: James Morse <james.morse@arm.com>
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 19/24] x86/resctrl: Add helpers for system wide mon/alloc capable
Date: Thu, 25 May 2023 18:02:04 +0000
Message-Id: <20230525180209.19497-20-james.morse@arm.com>

resctrl reads rdt_alloc_capable or rdt_mon_capable to determine whether any
of the resources support the corresponding features. resctrl also uses the
static keys that affect the architecture's context-switch code to determine
the same thing. This forces another architecture to have the same static
keys.

As the static key is enabled based on the capable flag, and none of the
filesystem's uses of these are in the scheduler path, move the capable
flags behind helpers and use those in the filesystem code instead of the
static keys.

After this change, only the architecture code manages and uses the static
keys, ensuring __resctrl_sched_in() does not need runtime checks. This
avoids multiple architectures having to define the same static keys.

Cases where the static key implicitly tested whether the resctrl
filesystem was mounted all had an explicit check added by a previous patch.

Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v1:
 * Added missing conversion in mkdir_rdt_prepare_rmid_free()

Changes since v3:
 * Expanded the commit message.
---
 arch/x86/include/asm/resctrl.h            | 13 +++++++++
 arch/x86/kernel/cpu/resctrl/internal.h    |  2 --
 arch/x86/kernel/cpu/resctrl/monitor.c     |  4 +--
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  6 ++--
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 34 +++++++++++------------
 5 files changed, 35 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 839b701cf7bd..8acab1112255 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -38,10 +38,18 @@ struct resctrl_pqr_state {
 
 DECLARE_PER_CPU(struct resctrl_pqr_state, pqr_state);
 
+extern bool rdt_alloc_capable;
+extern bool rdt_mon_capable;
+
 DECLARE_STATIC_KEY_FALSE(rdt_enable_key);
 DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
 DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
 
+static inline bool resctrl_arch_alloc_capable(void)
+{
+	return rdt_alloc_capable;
+}
+
 static inline void resctrl_arch_enable_alloc(void)
 {
 	static_branch_enable_cpuslocked(&rdt_alloc_enable_key);
@@ -54,6 +62,11 @@ static inline void resctrl_arch_disable_alloc(void)
 	static_branch_dec_cpuslocked(&rdt_enable_key);
 }
 
+static inline bool resctrl_arch_mon_capable(void)
+{
+	return rdt_mon_capable;
+}
+
 static inline void resctrl_arch_enable_mon(void)
 {
 	static_branch_enable_cpuslocked(&rdt_mon_enable_key);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 8660210ae958..021a8956518c 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -136,8 +136,6 @@ struct rmid_read {
 	void *arch_mon_ctx;
 };
 
-extern bool rdt_alloc_capable;
-extern bool rdt_mon_capable;
 extern unsigned int rdt_mon_features;
 extern struct list_head resctrl_schema_all;
 extern bool resctrl_mounted;
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index da5a86c95142..ced933694f60 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -836,7 +836,7 @@ void mbm_handle_overflow(struct work_struct *work)
 	 * If the filesystem has been unmounted this work no longer needs to
 	 * run.
 	 */
-	if (!resctrl_mounted || !static_branch_likely(&rdt_mon_enable_key))
+	if (!resctrl_mounted || !resctrl_arch_mon_capable())
 		goto out_unlock;
 
 	r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
@@ -873,7 +873,7 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)
 	 * When a domain comes online there is no guarantee the filesystem is
 	 * mounted. If not, there is no need to catch counter overflow.
 	 */
-	if (!resctrl_mounted || !static_branch_likely(&rdt_mon_enable_key))
+	if (!resctrl_mounted || !resctrl_arch_mon_capable())
 		return;
 	cpu = cpumask_any_housekeeping(&dom->cpu_mask);
 	dom->mbm_work_cpu = cpu;
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 5ebd6e54c7f2..460421051abf 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -567,7 +567,7 @@ static int rdtgroup_locksetup_user_restrict(struct rdtgroup *rdtgrp)
 	if (ret)
 		goto err_cpus;
 
-	if (rdt_mon_capable) {
+	if (resctrl_arch_mon_capable()) {
 		ret = rdtgroup_kn_mode_restrict(rdtgrp, "mon_groups");
 		if (ret)
 			goto err_cpus_list;
@@ -614,7 +614,7 @@ static int rdtgroup_locksetup_user_restore(struct rdtgroup *rdtgrp)
 	if (ret)
 		goto err_cpus;
 
-	if (rdt_mon_capable) {
+	if (resctrl_arch_mon_capable()) {
 		ret = rdtgroup_kn_mode_restore(rdtgrp, "mon_groups", 0777);
 		if (ret)
 			goto err_cpus_list;
@@ -762,7 +762,7 @@ int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp)
 {
 	int ret;
 
-	if (rdt_mon_capable) {
+	if (resctrl_arch_mon_capable()) {
 		ret = alloc_rmid(rdtgrp->closid);
 		if (ret < 0) {
 			rdt_last_cmd_puts("Out of RMIDs\n");
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 501b68b95aef..5330c0bdeffc 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -630,13 +630,13 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
 
 static bool is_closid_match(struct task_struct *t, struct rdtgroup *r)
 {
-	return (rdt_alloc_capable && (r->type == RDTCTRL_GROUP) &&
+	return (resctrl_arch_alloc_capable() && (r->type == RDTCTRL_GROUP) &&
 		resctrl_arch_match_closid(t, r->closid));
 }
 
 static bool is_rmid_match(struct task_struct *t, struct rdtgroup *r)
 {
-	return (rdt_mon_capable && (r->type == RDTMON_GROUP) &&
+	return (resctrl_arch_mon_capable() && (r->type == RDTMON_GROUP) &&
 		resctrl_arch_match_rmid(t, r->mon.parent->closid,
 					r->mon.rmid));
 }
@@ -2506,7 +2506,7 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (ret < 0)
 		goto out_schemata_free;
 
-	if (rdt_mon_capable) {
+	if (resctrl_arch_mon_capable()) {
 		ret = mongroup_create_dir(rdtgroup_default.kn,
 					  &rdtgroup_default, "mon_groups",
 					  &kn_mongrp);
@@ -2528,12 +2528,12 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (ret < 0)
 		goto out_psl;
 
-	if (rdt_alloc_capable)
+	if (resctrl_arch_alloc_capable())
 		resctrl_arch_enable_alloc();
-	if (rdt_mon_capable)
+	if (resctrl_arch_mon_capable())
 		resctrl_arch_enable_mon();
 
-	if (rdt_alloc_capable || rdt_mon_capable)
+	if (resctrl_arch_alloc_capable() || resctrl_arch_mon_capable())
 		resctrl_mounted = true;
 
 	if (is_mbm_enabled()) {
@@ -2547,10 +2547,10 @@ static int rdt_get_tree(struct fs_context *fc)
 out_psl:
 	rdt_pseudo_lock_release();
 out_mondata:
-	if (rdt_mon_capable)
+	if (resctrl_arch_mon_capable())
 		kernfs_remove(kn_mondata);
 out_mongrp:
-	if (rdt_mon_capable)
+	if (resctrl_arch_mon_capable())
 		kernfs_remove(kn_mongrp);
 out_info:
 	kernfs_remove(kn_info);
@@ -2802,9 +2802,9 @@ static void rdt_kill_sb(struct super_block *sb)
 	rdt_pseudo_lock_release();
 	rdtgroup_default.mode = RDT_MODE_SHAREABLE;
 	schemata_list_destroy();
-	if (rdt_alloc_capable)
+	if (resctrl_arch_alloc_capable())
 		resctrl_arch_disable_alloc();
-	if (rdt_mon_capable)
+	if (resctrl_arch_mon_capable())
 		resctrl_arch_disable_mon();
 	resctrl_mounted = false;
 	kernfs_kill_sb(sb);
@@ -3184,7 +3184,7 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
 {
 	int ret;
 
-	if (!rdt_mon_capable)
+	if (!resctrl_arch_mon_capable())
 		return 0;
 
 	ret = alloc_rmid(rdtgrp->closid);
@@ -3206,7 +3206,7 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
 
 static void mkdir_rdt_prepare_rmid_free(struct rdtgroup *rgrp)
 {
-	if (rdt_mon_capable)
+	if (resctrl_arch_mon_capable())
 		free_rmid(rgrp->closid, rgrp->mon.rmid);
 }
 
@@ -3372,7 +3372,7 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
 
 	list_add(&rdtgrp->rdtgroup_list, &rdt_all_groups);
 
-	if (rdt_mon_capable) {
+	if (resctrl_arch_mon_capable()) {
 		/*
 		 * Create an empty mon_groups directory to hold the subset
 		 * of tasks and cpus to monitor.
@@ -3427,14 +3427,14 @@ static int rdtgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
 	 * allocation is supported, add a control and monitoring
 	 * subdirectory
 	 */
-	if (rdt_alloc_capable && parent_kn == rdtgroup_default.kn)
+	if (resctrl_arch_alloc_capable() && parent_kn == rdtgroup_default.kn)
 		return rdtgroup_mkdir_ctrl_mon(parent_kn, name, mode);
 
 	/*
 	 * If RDT monitoring is supported and the parent directory is a valid
 	 * "mon_groups" directory, add a monitoring subdirectory.
 	 */
-	if (rdt_mon_capable && is_mon_groups(parent_kn, name))
+	if (resctrl_arch_mon_capable() && is_mon_groups(parent_kn, name))
 		return rdtgroup_mkdir_mon(parent_kn, name, mode);
 
 	return -EPERM;
@@ -3638,7 +3638,7 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d)
 	 * If resctrl is mounted, remove all the
 	 * per domain monitor data directories.
 	 */
-	if (resctrl_mounted && static_branch_unlikely(&rdt_mon_enable_key))
+	if (resctrl_mounted && resctrl_arch_mon_capable())
 		rmdir_mondata_subdir_allrdtgrp(r, d->id);
 
 	if (is_mbm_enabled())
@@ -3719,7 +3719,7 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
 	 * If the filesystem is not mounted, creating directories is deferred
 	 * until mount time by rdt_get_tree() calling mkdir_mondata_all().
 	 */
-	if (resctrl_mounted && static_branch_unlikely(&rdt_mon_enable_key))
+	if (resctrl_mounted && resctrl_arch_mon_capable())
 		mkdir_mondata_subdir_allrdtgrp(r, d);
 
 	return 0;
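A minimal stand-alone C sketch of the layering this patch creates
(hypothetical: the names mirror the patch, the bodies are simplified, and
the static keys are omitted):

#include <stdbool.h>
#include <stdio.h>

/* Arch side: owns the flags; in the kernel it also owns the static keys. */
static bool rdt_alloc_capable = true;
static bool rdt_mon_capable;

static bool resctrl_arch_alloc_capable(void) { return rdt_alloc_capable; }
static bool resctrl_arch_mon_capable(void)   { return rdt_mon_capable; }

/* Filesystem side: only ever calls the accessors. */
static void fs_mount(void)
{
	if (resctrl_arch_alloc_capable())
		printf("mount: enable allocation interface\n");
	if (resctrl_arch_mon_capable())
		printf("mount: enable monitoring interface\n");
}

int main(void)
{
	fs_mount();
	return 0;
}

Another architecture can then supply its own flags behind the same two
accessors without also having to replicate x86's static keys.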
From patchwork Thu May 25 18:02:05 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99168
From: James Morse <james.morse@arm.com>
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 20/24] x86/resctrl: Add cpu online callback for resctrl work
Date: Thu, 25 May 2023 18:02:05 +0000
Message-Id: <20230525180209.19497-21-james.morse@arm.com>

The resctrl architecture-specific code may need to create a domain when a
CPU comes online; it also needs to reset the CPU's PQR_ASSOC register. The
resctrl filesystem code needs to update the rdtgroup_default CPU mask when
CPUs are brought online.

Currently this is all done in one function, resctrl_online_cpu(). This will
need to be split into architecture and filesystem parts before resctrl can
be moved to /fs/.

Pull the rdtgroup_default update work out as a filesystem-specific
cpu_online helper. resctrl_online_cpu() is the obvious name for this, which
means the version in core.c needs renaming. resctrl_online_cpu() is called
by the arch code once it has done the work to add the new CPU to any
domains.
In future patches, resctrl_online_cpu() will take the rdtgroup_mutex
itself.

Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v3:
 * Renamed err to ret
---
 arch/x86/kernel/cpu/resctrl/core.c     | 11 ++++++-----
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 ++++++++++
 include/linux/resctrl.h               |  1 +
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 4bea032d072e..e00f3542e60e 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -603,19 +603,20 @@ static void clear_closid_rmid(int cpu)
 	wrmsr(MSR_IA32_PQR_ASSOC, 0, RESCTRL_RESERVED_CLOSID);
 }
 
-static int resctrl_online_cpu(unsigned int cpu)
+static int resctrl_arch_online_cpu(unsigned int cpu)
 {
 	struct rdt_resource *r;
+	int ret;
 
 	mutex_lock(&rdtgroup_mutex);
 	for_each_capable_rdt_resource(r)
 		domain_add_cpu(cpu, r);
-	/* The cpu is set in default rdtgroup after online. */
-	cpumask_set_cpu(cpu, &rdtgroup_default.cpu_mask);
 	clear_closid_rmid(cpu);
+
+	ret = resctrl_online_cpu(cpu);
 	mutex_unlock(&rdtgroup_mutex);
 
-	return 0;
+	return ret;
 }
 
 static void clear_childcpus(struct rdtgroup *r, unsigned int cpu)
@@ -965,7 +966,7 @@ static int __init resctrl_late_init(void)
 
 	state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
 				  "x86/resctrl/cat:online:",
-				  resctrl_online_cpu, resctrl_offline_cpu);
+				  resctrl_arch_online_cpu, resctrl_offline_cpu);
 	if (state < 0)
 		return state;
 
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 5330c0bdeffc..7c3de5ea0482 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3725,6 +3725,16 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
 	return 0;
 }
 
+int resctrl_online_cpu(unsigned int cpu)
+{
+	lockdep_assert_held(&rdtgroup_mutex);
+
+	/* The cpu is set in default rdtgroup after online. */
+	cpumask_set_cpu(cpu, &rdtgroup_default.cpu_mask);
+
+	return 0;
+}
+
 /*
  * rdtgroup_init - rdtgroup initialization
  *
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 0dcb5cfde609..ecd41762d61a 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -222,6 +222,7 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
 			    u32 closid, enum resctrl_conf_type type);
 int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d);
 void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d);
+int resctrl_online_cpu(unsigned int cpu);
 
 /**
  * resctrl_arch_rmid_read() - Read the eventid counter corresponding to rmid
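The resulting call ordering can be modelled in a few lines of stand-alone C
(hypothetical sketch; printf() stands in for the real domain and MSR work,
and locking is elided):

#include <stdio.h>

static unsigned long rdtgroup_default_mask;	/* models the default group's cpu_mask */

/* Filesystem part: only the rdtgroup_default update. */
static int resctrl_online_cpu(unsigned int cpu)
{
	rdtgroup_default_mask |= 1UL << cpu;
	return 0;
}

/* Arch part: hardware-facing work first, then hand over to the fs. */
static int resctrl_arch_online_cpu(unsigned int cpu)
{
	printf("cpu %u: add to domains, reset PQR_ASSOC\n", cpu);
	return resctrl_online_cpu(cpu);
}

int main(void)
{
	resctrl_arch_online_cpu(3);
	printf("default group cpu_mask: %#lx\n", rdtgroup_default_mask);
	return 0;
}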
From patchwork Thu May 25 18:02:06 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99157
From: James Morse <james.morse@arm.com>
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 21/24] x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but cpu
Date: Thu, 25 May 2023 18:02:06 +0000
Message-Id: <20230525180209.19497-22-james.morse@arm.com>

When a CPU is taken offline, resctrl may need to move the overflow or limbo
handlers to run on a different CPU.

Once the offline callbacks have been split, cqm_setup_limbo_handler() will
be called while the CPU that is going offline is still present in the
cpu_mask.

Pass the CPU to exclude to cqm_setup_limbo_handler() and
mbm_setup_overflow_handler(). These functions can use a variant of
cpumask_any_but() when selecting the CPU. -1 is used to indicate no CPU
needs excluding.

A subsequent patch moves these calls to be before CPUs have been removed,
so this exclude_cpu behaviour is temporary.

Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v2:
 * Rephrased a comment to avoid a two-letter bad-word. (we)
 * Avoid assigning mbm_work_cpu if the domain is going to be free()d
 * Added cpumask_any_housekeeping_but(), I dislike the name

Changes since v3:
 * Marked an explanatory comment as temporary as the subsequent patch is
   no longer adjacent.
---
 arch/x86/kernel/cpu/resctrl/core.c     |  8 +++--
 arch/x86/kernel/cpu/resctrl/internal.h | 37 +++++++++++++++++++++--
 arch/x86/kernel/cpu/resctrl/monitor.c  | 42 +++++++++++++++++++++-----
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  6 ++--
 include/linux/resctrl.h               |  3 ++
 5 files changed, 82 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index e00f3542e60e..187ed127a446 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -582,12 +582,16 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 	if (r == &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl) {
 		if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
 			cancel_delayed_work(&d->mbm_over);
-			mbm_setup_overflow_handler(d, 0);
+			/*
+			 * temporary: exclude_cpu=-1 as this CPU has already
+			 * been removed by cpumask_clear_cpu().
+			 */
+			mbm_setup_overflow_handler(d, 0, RESCTRL_PICK_ANY_CPU);
 		}
 		if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
 		    has_busy_rmid(r, d)) {
 			cancel_delayed_work(&d->cqm_limbo);
-			cqm_setup_limbo_handler(d, 0);
+			cqm_setup_limbo_handler(d, 0, RESCTRL_PICK_ANY_CPU);
 		}
 	}
 }
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 021a8956518c..9cba8fc405b9 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -79,6 +79,37 @@ static inline unsigned int cpumask_any_housekeeping(const struct cpumask *mask)
 	return cpu;
 }
 
+/**
+ * cpumask_any_housekeeping_but() - Choose any CPU in @mask, preferring those
+ *				    that aren't marked nohz_full, excluding
+ *				    the provided CPU
+ * @mask:	The mask to pick a CPU from.
+ * @exclude_cpu: The CPU to avoid picking.
+ *
+ * Returns a CPU from @mask, but not @exclude_cpu. If there are housekeeping
+ * CPUs that don't use nohz_full, these are preferred.
+ * Returns >= nr_cpu_ids if no CPUs are available.
+ */
+static inline unsigned int
+cpumask_any_housekeeping_but(const struct cpumask *mask, int exclude_cpu)
+{
+	int cpu, hk_cpu;
+
+	cpu = cpumask_any_but(mask, exclude_cpu);
+	if (tick_nohz_full_cpu(cpu)) {
+		hk_cpu = cpumask_nth_andnot(0, mask, tick_nohz_full_mask);
+		if (hk_cpu == exclude_cpu) {
+			hk_cpu = cpumask_nth_andnot(1, mask,
+						    tick_nohz_full_mask);
+		}
+
+		if (hk_cpu < nr_cpu_ids)
+			cpu = hk_cpu;
+	}
+
+	return cpu;
+}
+
 struct rdt_fs_context {
 	struct kernfs_fs_context kfc;
 	bool enable_cdpl2;
@@ -564,11 +595,13 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
 		    struct rdt_domain *d, struct rdtgroup *rdtgrp,
 		    int evtid, int first);
 void mbm_setup_overflow_handler(struct rdt_domain *dom,
-				unsigned long delay_ms);
+				unsigned long delay_ms,
+				int exclude_cpu);
 void mbm_handle_overflow(struct work_struct *work);
 void __init intel_rdt_mbm_apply_quirk(void);
 bool is_mba_sc(struct rdt_resource *r);
-void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms);
+void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms,
+			     int exclude_cpu);
 void cqm_handle_limbo(struct work_struct *work);
 bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d);
 void __check_limbo(struct rdt_domain *d, bool force_free);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index ced933694f60..ae02185f3354 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -485,7 +485,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 	 * setup up the limbo worker.
 	 */
 	if (!has_busy_rmid(r, d))
-		cqm_setup_limbo_handler(d, CQM_LIMBOCHECK_INTERVAL);
+		cqm_setup_limbo_handler(d, CQM_LIMBOCHECK_INTERVAL, -1);
 	set_bit(idx, d->rmid_busy_llc);
 	entry->busy++;
 }
@@ -810,15 +810,28 @@ void cqm_handle_limbo(struct work_struct *work)
 	mutex_unlock(&rdtgroup_mutex);
 }
 
-void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms)
+/**
+ * cqm_setup_limbo_handler() - Schedule the limbo handler to run for this
+ *                             domain.
+ * @delay_ms:	 How far in the future the handler should run.
+ * @exclude_cpu: Which CPU the handler should not run on, -1 to pick any CPU.
+ */
+void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms,
+			     int exclude_cpu)
 {
 	unsigned long delay = msecs_to_jiffies(delay_ms);
 	int cpu;
 
-	cpu = cpumask_any_housekeeping(&dom->cpu_mask);
-	dom->cqm_work_cpu = cpu;
+	if (exclude_cpu == RESCTRL_PICK_ANY_CPU)
+		cpu = cpumask_any_housekeeping(&dom->cpu_mask);
+	else
+		cpu = cpumask_any_housekeeping_but(&dom->cpu_mask,
+						   exclude_cpu);
 
-	schedule_delayed_work_on(cpu, &dom->cqm_limbo, delay);
+	if (cpu < nr_cpu_ids) {
+		dom->cqm_work_cpu = cpu;
+		schedule_delayed_work_on(cpu, &dom->cqm_limbo, delay);
+	}
 }
 
 void mbm_handle_overflow(struct work_struct *work)
@@ -864,7 +877,14 @@ void mbm_handle_overflow(struct work_struct *work)
 	mutex_unlock(&rdtgroup_mutex);
 }
 
-void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)
+/**
+ * mbm_setup_overflow_handler() - Schedule the overflow handler to run for this
+ *                                domain.
+ * @delay_ms:	 How far in the future the handler should run.
+ * @exclude_cpu: Which CPU the handler should not run on, -1 to pick any CPU.
+ */
+void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms,
+				int exclude_cpu)
 {
 	unsigned long delay = msecs_to_jiffies(delay_ms);
 	int cpu;
 
@@ -875,9 +895,15 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)
 	 */
 	if (!resctrl_mounted || !resctrl_arch_mon_capable())
 		return;
-	cpu = cpumask_any_housekeeping(&dom->cpu_mask);
+	if (exclude_cpu == -1)
+		cpu = cpumask_any_housekeeping(&dom->cpu_mask);
+	else
+		cpu = cpumask_any_housekeeping_but(&dom->cpu_mask,
+						   exclude_cpu);
 	dom->mbm_work_cpu = cpu;
-	schedule_delayed_work_on(cpu, &dom->mbm_over, delay);
+
+	if (cpu < nr_cpu_ids)
+		schedule_delayed_work_on(cpu, &dom->mbm_over, delay);
 }
 
 static int dom_data_init(struct rdt_resource *r)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 7c3de5ea0482..3373b11afe01 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2539,7 +2539,8 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (is_mbm_enabled()) {
 		r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
 		list_for_each_entry(dom, &r->domains, list)
-			mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL);
+			mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL,
+						   RESCTRL_PICK_ANY_CPU);
 	}
 
 	goto out;
@@ -3709,7 +3710,8 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
 
 	if (is_mbm_enabled()) {
 		INIT_DELAYED_WORK(&d->mbm_over, mbm_handle_overflow);
-		mbm_setup_overflow_handler(d, MBM_OVERFLOW_INTERVAL);
+		mbm_setup_overflow_handler(d, MBM_OVERFLOW_INTERVAL,
+					   RESCTRL_PICK_ANY_CPU);
 	}
 
 	if (is_llc_occupancy_enabled())
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index ecd41762d61a..089b91133e5e 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -9,6 +9,9 @@
 /* CLOSID value used by the default control group */
 #define RESCTRL_RESERVED_CLOSID		0
 
+/* Indicates no CPU needs to be excluded */
+#define RESCTRL_PICK_ANY_CPU		-1
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 int proc_resctrl_show(struct seq_file *m,
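The selection rule of cpumask_any_housekeeping_but() can be modelled with
plain bitmaps in stand-alone C (hypothetical sketch; the kernel works on
struct cpumask with cpumask_any_but() and tick_nohz_full_mask instead):

#include <stdio.h>

#define NR_CPUS 8

/* Pick a CPU from @mask, never @exclude_cpu, preferring housekeeping CPUs. */
static int pick_any_housekeeping_but(unsigned long mask,
				     unsigned long nohz_full, int exclude_cpu)
{
	int cpu, fallback = NR_CPUS;	/* NR_CPUS means "nothing usable" */

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!(mask & (1UL << cpu)) || cpu == exclude_cpu)
			continue;
		if (!(nohz_full & (1UL << cpu)))
			return cpu;		/* housekeeping CPU: preferred */
		if (fallback == NR_CPUS)
			fallback = cpu;		/* first nohz_full candidate */
	}
	return fallback;
}

int main(void)
{
	/* Domain holds CPUs 1-3, CPU 1 is nohz_full, CPU 2 is going away. */
	printf("chose cpu %d\n", pick_any_housekeeping_but(0x0e, 0x02, 2));
	return 0;	/* prints "chose cpu 3" */
}

As in the patch, callers must tolerate an out-of-range result when the
exclusion leaves no usable CPU, which is why the schedule calls above are
now guarded by cpu < nr_cpu_ids.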
From patchwork Thu May 25 18:02:07 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99166
From: James Morse <james.morse@arm.com>
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 22/24] x86/resctrl: Add cpu offline callback for resctrl work
Date: Thu, 25 May 2023 18:02:07 +0000
Message-Id: <20230525180209.19497-23-james.morse@arm.com>

The resctrl architecture-specific code may need to free a domain when a
CPU goes offline; it also needs to reset the CPU's PQR_ASSOC register.
Amongst other things, the resctrl filesystem code needs to clear this CPU
from the cpu_mask of any control and monitor groups.

Currently this is all done in core.c and called from resctrl_offline_cpu(),
making the split between architecture and filesystem code unclear.

Move the filesystem work to remove the CPU from the control and monitor
groups into a filesystem helper called resctrl_offline_cpu(), and rename
the one in core.c resctrl_arch_offline_cpu(). The rdtgroup_mutex is
unlocked and locked again in the call, in preparation for changing the
locking rules for the architecture code.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/x86/kernel/cpu/resctrl/core.c     | 25 +++++--------------------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 24 ++++++++++++++++++++++++
 include/linux/resctrl.h               |  1 +
 3 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 187ed127a446..9128a9710537 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -623,31 +623,15 @@ static int resctrl_arch_online_cpu(unsigned int cpu)
 	return ret;
 }
 
-static void clear_childcpus(struct rdtgroup *r, unsigned int cpu)
+static int resctrl_arch_offline_cpu(unsigned int cpu)
 {
-	struct rdtgroup *cr;
-
-	list_for_each_entry(cr, &r->mon.crdtgrp_list, mon.crdtgrp_list) {
-		if (cpumask_test_and_clear_cpu(cpu, &cr->cpu_mask)) {
-			break;
-		}
-	}
-}
-
-static int resctrl_offline_cpu(unsigned int cpu)
-{
-	struct rdtgroup *rdtgrp;
 	struct rdt_resource *r;
 
 	mutex_lock(&rdtgroup_mutex);
+	resctrl_offline_cpu(cpu);
+
 	for_each_capable_rdt_resource(r)
 		domain_remove_cpu(cpu, r);
-	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
-		if (cpumask_test_and_clear_cpu(cpu, &rdtgrp->cpu_mask)) {
-			clear_childcpus(rdtgrp, cpu);
-			break;
-		}
-	}
 	clear_closid_rmid(cpu);
 	mutex_unlock(&rdtgroup_mutex);
 
@@ -970,7 +954,8 @@ static int __init resctrl_late_init(void)
 
 	state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
 				  "x86/resctrl/cat:online:",
-				  resctrl_arch_online_cpu, resctrl_offline_cpu);
+				  resctrl_arch_online_cpu,
+				  resctrl_arch_offline_cpu);
 	if (state < 0)
 		return state;
 
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 3373b11afe01..08b426f52f6d 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3737,6 +3737,30 @@ int resctrl_online_cpu(unsigned int cpu)
 	return 0;
 }
 
+static void clear_childcpus(struct rdtgroup *r, unsigned int cpu)
+{
+	struct rdtgroup *cr;
+
+	list_for_each_entry(cr, &r->mon.crdtgrp_list, mon.crdtgrp_list) {
+		if (cpumask_test_and_clear_cpu(cpu, &cr->cpu_mask))
+			break;
+	}
+}
+
+void resctrl_offline_cpu(unsigned int cpu)
+{
+	struct rdtgroup *rdtgrp;
+
+	lockdep_assert_held(&rdtgroup_mutex);
+
+	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
+		if (cpumask_test_and_clear_cpu(cpu, &rdtgrp->cpu_mask)) {
+			clear_childcpus(rdtgrp, cpu);
+			break;
+		}
+	}
+}
+
 /*
  * rdtgroup_init - rdtgroup initialization
  *
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 089b91133e5e..c4be3453b3ff 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -226,6 +226,7 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
 int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d);
 void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d);
 int resctrl_online_cpu(unsigned int cpu);
+void resctrl_offline_cpu(unsigned int cpu);
 
 /**
 * resctrl_arch_rmid_read() - Read the eventid counter corresponding to rmid
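As a stand-alone C model of the new offline split (hypothetical; group CPU
masks are plain bitmaps and the domain work is a printf()):

#include <stdio.h>

#define NR_GROUPS 3

static unsigned long group_cpu_mask[NR_GROUPS] = { 0x0f, 0x30, 0xc0 };

/* Filesystem part: drop the CPU from whichever group owns it. */
static void resctrl_offline_cpu(unsigned int cpu)
{
	int i;

	for (i = 0; i < NR_GROUPS; i++) {
		if (group_cpu_mask[i] & (1UL << cpu)) {
			group_cpu_mask[i] &= ~(1UL << cpu);
			break;		/* a CPU belongs to one group */
		}
	}
}

/* Arch part: filesystem work first, then domains and PQR_ASSOC. */
static void resctrl_arch_offline_cpu(unsigned int cpu)
{
	resctrl_offline_cpu(cpu);
	printf("cpu %u: remove from domains, reset PQR_ASSOC\n", cpu);
}

int main(void)
{
	resctrl_arch_offline_cpu(5);
	printf("group 1 cpu mask: %#lx\n", group_cpu_mask[1]);	/* 0x10 */
	return 0;
}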
From patchwork Thu May 25 18:02:08 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99164
From: James Morse <james.morse@arm.com>
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 23/24] x86/resctrl: Move domain helper migration into resctrl_offline_cpu()
Date: Thu, 25 May 2023 18:02:08 +0000
Message-Id: <20230525180209.19497-24-james.morse@arm.com>

When a CPU is taken offline, the resctrl filesystem code needs to check
whether it was the CPU nominated to perform the periodic overflow and limbo
work. If so, another CPU needs to be chosen to do this work.

This is currently done in core.c, mixed in with the code that removes the
CPU from the domain's mask and potentially free()s the domain.

Move the migration of the overflow and limbo helpers into the filesystem
code, into resctrl_offline_cpu(). As resctrl_offline_cpu() runs before the
architecture code has removed the CPU from the domain mask, the callers
need to be told which CPU is being removed, to avoid picking it as the new
CPU. This uses the exclude_cpu feature previously added.
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/x86/kernel/cpu/resctrl/core.c     | 16 ----------------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 15 +++++++++++++++
 2 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 9128a9710537..edc0dd123317 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -578,22 +578,6 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 
 		return;
 	}
-
-	if (r == &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl) {
-		if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
-			cancel_delayed_work(&d->mbm_over);
-			/*
-			 * temporary: exclude_cpu=-1 as this CPU has already
-			 * been removed by cpumask_clear_cpu().
-			 */
-			mbm_setup_overflow_handler(d, 0, RESCTRL_PICK_ANY_CPU);
-		}
-		if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
-		    has_busy_rmid(r, d)) {
-			cancel_delayed_work(&d->cqm_limbo);
-			cqm_setup_limbo_handler(d, 0, RESCTRL_PICK_ANY_CPU);
-		}
-	}
 }
 
 static void clear_closid_rmid(int cpu)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 08b426f52f6d..3a8e2c98b611 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3749,7 +3749,9 @@ static void clear_childcpus(struct rdtgroup *r, unsigned int cpu)
 
 void resctrl_offline_cpu(unsigned int cpu)
 {
+	struct rdt_domain *d;
 	struct rdtgroup *rdtgrp;
+	struct rdt_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
 
 	lockdep_assert_held(&rdtgroup_mutex);
 
@@ -3759,6 +3761,19 @@ void resctrl_offline_cpu(unsigned int cpu)
 			break;
 		}
 	}
+
+	d = get_domain_from_cpu(cpu, l3);
+	if (d) {
+		if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
+			cancel_delayed_work(&d->mbm_over);
+			mbm_setup_overflow_handler(d, 0, cpu);
+		}
+		if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
+		    has_busy_rmid(l3, d)) {
+			cancel_delayed_work(&d->cqm_limbo);
+			cqm_setup_limbo_handler(d, 0, cpu);
+		}
+	}
 }
 
 /*
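The ordering constraint (the filesystem hook still sees the dying CPU in
the domain mask, so it must be excluded explicitly) can be shown in
stand-alone C (hypothetical sketch with bitmap masks):

#include <stdio.h>

#define NR_CPUS 8

static unsigned long domain_cpu_mask = 0x0c;	/* CPUs 2 and 3 */
static int mbm_work_cpu = 2;			/* CPU 2 runs the overflow work */

static int any_cpu_but(unsigned long mask, int exclude_cpu)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if ((mask & (1UL << cpu)) && cpu != exclude_cpu)
			return cpu;
	return -1;
}

static void fs_offline_cpu(unsigned int cpu)
{
	/* Runs before the arch code clears @cpu from domain_cpu_mask. */
	if ((int)cpu == mbm_work_cpu)
		mbm_work_cpu = any_cpu_but(domain_cpu_mask, cpu);
}

int main(void)
{
	fs_offline_cpu(2);
	domain_cpu_mask &= ~(1UL << 2);		/* arch code removes it later */
	printf("overflow work moved to cpu %d\n", mbm_work_cpu);	/* 3 */
	return 0;
}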
From patchwork Thu May 25 18:02:09 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 99177
From: James Morse <james.morse@arm.com>
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v4 24/24] x86/resctrl: Separate arch and fs resctrl locks
Date: Thu, 25 May 2023 18:02:09 +0000
Message-Id: <20230525180209.19497-25-james.morse@arm.com>

resctrl has one mutex that is taken by both the architecture-specific code
and the filesystem parts. The two interact via cpuhp, where the
architecture code updates the domain list. Filesystem handlers that walk
the domains list should not run concurrently with the cpuhp callback that
modifies the list.
Exposing a lock from the filesystem code means the interface is not cleanly
defined, and creates the possibility of cross-architecture lock-ordering
headaches. The interaction only exists so that certain filesystem paths are
serialised against CPU hotplug. The CPU hotplug code already has a
mechanism to do this using cpus_read_lock().

MPAM's monitors have an overflow interrupt, so it needs to be possible to
walk the domains list in irq context. RCU is ideal for this, but some paths
need to be able to sleep to allocate memory.

Because resctrl_{on,off}line_cpu() take the rdtgroup_mutex as part of a
cpuhp callback, cpus_read_lock() must always be taken first.
rdtgroup_schemata_write() already does this.

Most of the filesystem code's domain list walkers are currently protected
by the rdtgroup_mutex taken in rdtgroup_kn_lock_live(). The exceptions are
rdt_bit_usage_show() and the mon_config helpers, which take the lock
directly.

Make the domain list protected by RCU. An architecture-specific lock
prevents concurrent writers. rdt_bit_usage_show() can walk the domain list
under rcu_read_lock(). The mon_config helpers send multiple IPIs; take
cpus_read_lock() in these cases.

The other filesystem list walkers need to be able to sleep. Add
cpus_read_lock() to rdtgroup_kn_lock_live() so that the cpuhp callbacks
can't be invoked when filesystem operations are occurring.

Add lockdep_assert_cpus_held() in the cases where the
rdtgroup_kn_lock_live() call isn't obvious.

Resctrl's domain online/offline calls now need to take the rdtgroup_mutex
themselves.

Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v2:
 * Reworded a comment,
 * Added a lockdep assertion
 * Moved clear_closid_rmid() outside the locked region of cpu
   online/offline

Changes since v3:
 * Added a header include
---
 arch/x86/kernel/cpu/resctrl/core.c        | 38 +++++++++-----
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 16 ++++--
 arch/x86/kernel/cpu/resctrl/monitor.c     |  4 ++
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  3 ++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 63 ++++++++++++++++++++---
 include/linux/resctrl.h                  |  2 +-
 6 files changed, 100 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index edc0dd123317..f106c68a9be8 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -25,8 +25,15 @@
 #include <asm/resctrl.h>
 #include "internal.h"
 
-/* Mutex to protect rdtgroup access. */
-DEFINE_MUTEX(rdtgroup_mutex);
+/*
+ * rdt_domain structures are kfree()d when their last CPU goes offline,
+ * and allocated when the first CPU in a new domain comes online.
+ * The rdt_resource's domain list is updated when this happens. Readers of
+ * the domain list must either take cpus_read_lock(), or rely on an RCU
+ * read-side critical section, to avoid observing concurrent modification.
+ * All writers take this mutex: + */ +static DEFINE_MUTEX(domain_list_lock); /* * The cached resctrl_pqr_state is strictly per CPU and can never be @@ -508,6 +515,8 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r) struct rdt_domain *d; int err; + lockdep_assert_held(&domain_list_lock); + d = rdt_find_domain(r, id, &add_pos); if (IS_ERR(d)) { pr_warn("Couldn't find cache id for CPU %d\n", cpu); @@ -541,11 +550,12 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r) return; } - list_add_tail(&d->list, add_pos); + list_add_tail_rcu(&d->list, add_pos); err = resctrl_online_domain(r, d); if (err) { - list_del(&d->list); + list_del_rcu(&d->list); + synchronize_rcu(); domain_free(hw_dom); } } @@ -556,6 +566,8 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r) struct rdt_hw_domain *hw_dom; struct rdt_domain *d; + lockdep_assert_held(&domain_list_lock); + d = rdt_find_domain(r, id, NULL); if (IS_ERR_OR_NULL(d)) { pr_warn("Couldn't find cache id for CPU %d\n", cpu); @@ -566,7 +578,8 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r) cpumask_clear_cpu(cpu, &d->cpu_mask); if (cpumask_empty(&d->cpu_mask)) { resctrl_offline_domain(r, d); - list_del(&d->list); + list_del_rcu(&d->list); + synchronize_rcu(); /* * rdt_domain "d" is going to be freed below, so clear @@ -594,30 +607,29 @@ static void clear_closid_rmid(int cpu) static int resctrl_arch_online_cpu(unsigned int cpu) { struct rdt_resource *r; - int ret; - mutex_lock(&rdtgroup_mutex); + mutex_lock(&domain_list_lock); for_each_capable_rdt_resource(r) domain_add_cpu(cpu, r); + mutex_unlock(&domain_list_lock); + clear_closid_rmid(cpu); - ret = resctrl_online_cpu(cpu); - mutex_unlock(&rdtgroup_mutex); - - return ret; + return resctrl_online_cpu(cpu); } static int resctrl_arch_offline_cpu(unsigned int cpu) { struct rdt_resource *r; - mutex_lock(&rdtgroup_mutex); resctrl_offline_cpu(cpu); + mutex_lock(&domain_list_lock); for_each_capable_rdt_resource(r) domain_remove_cpu(cpu, r); + mutex_unlock(&domain_list_lock); + clear_closid_rmid(cpu); - mutex_unlock(&rdtgroup_mutex); return 0; } diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c index 280d66fae21c..d8d7c127403b 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -209,6 +209,9 @@ static int parse_line(char *line, struct resctrl_schema *s, struct rdt_domain *d; unsigned long dom_id; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP && (r->rid == RDT_RESOURCE_MBA || r->rid == RDT_RESOURCE_SMBA)) { rdt_last_cmd_puts("Cannot pseudo-lock MBA resource\n"); @@ -313,6 +316,9 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid) struct rdt_domain *d; u32 idx; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL)) return -ENOMEM; @@ -378,11 +384,9 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of, return -EINVAL; buf[nbytes - 1] = '\0'; - cpus_read_lock(); rdtgrp = rdtgroup_kn_lock_live(of->kn); if (!rdtgrp) { rdtgroup_kn_unlock(of->kn); - cpus_read_unlock(); return -ENOENT; } rdt_last_cmd_clear(); @@ -444,7 +448,6 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of, out: rdt_staged_configs_clear(); rdtgroup_kn_unlock(of->kn); - cpus_read_unlock(); return ret ?: nbytes; } @@ -464,6 +467,9 @@ static void show_doms(struct seq_file *s, struct 
resctrl_schema *schema, int clo bool sep = false; u32 ctrl_val; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + seq_printf(s, "%*s:", max_name_width, schema->name); list_for_each_entry(dom, &r->domains, list) { if (sep) @@ -534,8 +540,8 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, { int cpu; - /* When picking a CPU from cpu_mask, ensure it can't race with cpuhp */ - lockdep_assert_held(&rdtgroup_mutex); + /* When picking a cpu from cpu_mask, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); /* * setup the parameters to pass to mon_event_count() to read the data. diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index ae02185f3354..41b4cd2c7d64 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -15,6 +15,7 @@ * Software Developer Manual June 2016, volume 3, section 17.17. */ +#include <linux/cpu.h> #include <linux/module.h> #include <linux/sizes.h> #include <linux/slab.h> @@ -476,6 +477,9 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) lockdep_assert_held(&rdtgroup_mutex); + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + idx = resctrl_arch_rmid_idx_encode(entry->closid, entry->rmid); entry->busy = 0; diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c index 460421051abf..fc3ed917d173 100644 --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -830,6 +830,9 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d) struct rdt_domain *d_i; bool ret = false; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + if (!zalloc_cpumask_var(&cpu_with_psl, GFP_KERNEL)) return true; diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 3a8e2c98b611..9002ac728001 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -35,6 +35,10 @@ DEFINE_STATIC_KEY_FALSE(rdt_enable_key); DEFINE_STATIC_KEY_FALSE(rdt_mon_enable_key); DEFINE_STATIC_KEY_FALSE(rdt_alloc_enable_key); + +/* Mutex to protect rdtgroup access.
*/ +DEFINE_MUTEX(rdtgroup_mutex); + static struct kernfs_root *rdt_root; struct rdtgroup rdtgroup_default; LIST_HEAD(rdt_all_groups); @@ -950,7 +954,8 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of, mutex_lock(&rdtgroup_mutex); hw_shareable = r->cache.shareable_bits; - list_for_each_entry(dom, &r->domains, list) { + rcu_read_lock(); + list_for_each_entry_rcu(dom, &r->domains, list) { if (sep) seq_putc(seq, ';'); sw_shareable = 0; @@ -1006,8 +1011,10 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of, } sep = true; } + rcu_read_unlock(); seq_putc(seq, '\n'); mutex_unlock(&rdtgroup_mutex); + return 0; } @@ -1250,6 +1257,9 @@ static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp) struct rdt_domain *d; u32 ctrl; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + list_for_each_entry(s, &resctrl_schema_all, list) { r = s->res; if (r->rid == RDT_RESOURCE_MBA || r->rid == RDT_RESOURCE_SMBA) @@ -1516,6 +1526,7 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid struct rdt_domain *dom; bool sep = false; + cpus_read_lock(); mutex_lock(&rdtgroup_mutex); list_for_each_entry(dom, &r->domains, list) { @@ -1532,6 +1543,7 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid seq_puts(s, "\n"); mutex_unlock(&rdtgroup_mutex); + cpus_read_unlock(); return 0; } @@ -1623,6 +1635,9 @@ static int mon_config_write(struct rdt_resource *r, char *tok, u32 evtid) struct rdt_domain *d; int ret = 0; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + next: if (!tok || tok[0] == '\0') return 0; @@ -1664,6 +1679,7 @@ static ssize_t mbm_total_bytes_config_write(struct kernfs_open_file *of, if (nbytes == 0 || buf[nbytes - 1] != '\n') return -EINVAL; + cpus_read_lock(); mutex_lock(&rdtgroup_mutex); rdt_last_cmd_clear(); @@ -1673,6 +1689,7 @@ static ssize_t mbm_total_bytes_config_write(struct kernfs_open_file *of, ret = mon_config_write(r, buf, QOS_L3_MBM_TOTAL_EVENT_ID); mutex_unlock(&rdtgroup_mutex); + cpus_read_unlock(); return ret ?: nbytes; } @@ -1688,6 +1705,7 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of, if (nbytes == 0 || buf[nbytes - 1] != '\n') return -EINVAL; + cpus_read_lock(); mutex_lock(&rdtgroup_mutex); rdt_last_cmd_clear(); @@ -1697,6 +1715,7 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of, ret = mon_config_write(r, buf, QOS_L3_MBM_LOCAL_EVENT_ID); mutex_unlock(&rdtgroup_mutex); + cpus_read_unlock(); return ret ?: nbytes; } @@ -2149,6 +2168,9 @@ static int set_cache_qos_cfg(int level, bool enable) struct rdt_domain *d; int cpu; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + if (level == RDT_RESOURCE_L3) update = l3_qos_cfg_update; else if (level == RDT_RESOURCE_L2) @@ -2337,6 +2359,7 @@ struct rdtgroup *rdtgroup_kn_lock_live(struct kernfs_node *kn) atomic_inc(&rdtgrp->waitcount); kernfs_break_active_protection(kn); + cpus_read_lock(); mutex_lock(&rdtgroup_mutex); /* Was this group deleted while we waited? 
*/ @@ -2354,6 +2377,7 @@ void rdtgroup_kn_unlock(struct kernfs_node *kn) return; mutex_unlock(&rdtgroup_mutex); + cpus_read_unlock(); if (atomic_dec_and_test(&rdtgrp->waitcount) && (rdtgrp->flags & RDT_DELETED)) { @@ -2651,6 +2675,9 @@ static int reset_all_ctrls(struct rdt_resource *r) struct rdt_domain *d; int i; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL)) return -ENOMEM; @@ -2935,6 +2962,9 @@ static int mkdir_mondata_subdir_alldom(struct kernfs_node *parent_kn, struct rdt_domain *dom; int ret; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + list_for_each_entry(dom, &r->domains, list) { ret = mkdir_mondata_subdir(parent_kn, dom, r, prgrp); if (ret) @@ -3625,7 +3655,8 @@ static void domain_destroy_mon_state(struct rdt_domain *d) kfree(d->mbm_local); } -void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d) +static void _resctrl_offline_domain(struct rdt_resource *r, + struct rdt_domain *d) { lockdep_assert_held(&rdtgroup_mutex); @@ -3660,6 +3691,13 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d) domain_destroy_mon_state(d); } +void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d) +{ + mutex_lock(&rdtgroup_mutex); + _resctrl_offline_domain(r, d); + mutex_unlock(&rdtgroup_mutex); +} + static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d) { u32 idx_limit = resctrl_arch_system_num_rmid_idx(); @@ -3691,7 +3729,7 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d) return 0; } -int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) +static int _resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) { int err; @@ -3727,12 +3765,23 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) return 0; } +int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) +{ + int err; + + mutex_lock(&rdtgroup_mutex); + err = _resctrl_online_domain(r, d); + mutex_unlock(&rdtgroup_mutex); + + return err; +} + int resctrl_online_cpu(unsigned int cpu) { - lockdep_assert_held(&rdtgroup_mutex); - + mutex_lock(&rdtgroup_mutex); /* The cpu is set in default rdtgroup after online. */ cpumask_set_cpu(cpu, &rdtgroup_default.cpu_mask); + mutex_unlock(&rdtgroup_mutex); return 0; } @@ -3753,8 +3802,7 @@ void resctrl_offline_cpu(unsigned int cpu) struct rdtgroup *rdtgrp; struct rdt_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; - lockdep_assert_held(&rdtgroup_mutex); - + mutex_lock(&rdtgroup_mutex); list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) { if (cpumask_test_and_clear_cpu(cpu, &rdtgrp->cpu_mask)) { clear_childcpus(rdtgrp, cpu); @@ -3774,6 +3822,7 @@ void resctrl_offline_cpu(unsigned int cpu) cqm_setup_limbo_handler(d, 0, cpu); } } + mutex_unlock(&rdtgroup_mutex); } /* diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index c4be3453b3ff..fe94ef3369fa 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -159,7 +159,7 @@ struct resctrl_schema; * @cache_level: Which cache level defines scope of this resource * @cache: Cache allocation related data * @membw: If the component has bandwidth controls, their properties. - * @domains: All domains for this resource + * @domains: RCU list of all domains for this resource * @name: Name to use in "schemata" file. 
* @data_width: Character width of data when displaying * @default_ctrl: Specifies default cache cbm or memory B/W percent.