From patchwork Fri Jul 28 16:42:31 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127765
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
    shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
    carl@os.amperecomputing.com, lcherian@marvell.com,
    bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
    xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com,
    Jamie Iles, Xin Hao, peternewman@google.com, dfustini@baylibre.com
Subject: [PATCH v5 01/24] x86/resctrl: Track the closid with the rmid
Date: Fri, 28 Jul 2023 16:42:31 +0000
Message-Id: <20230728164254.27562-2-james.morse@arm.com>
In-Reply-To: <20230728164254.27562-1-james.morse@arm.com>
References: <20230728164254.27562-1-james.morse@arm.com>

x86's RMID are independent of the CLOSID: an RMID can be allocated, used
and freed without considering the CLOSID.

MPAM's equivalent feature is PMG, which is not an independent number; it
extends the CLOSID/PARTID space. For MPAM, only PMG-bits worth of 'RMID'
can be allocated for a single CLOSID, i.e. if there is 1 bit of PMG
space, then each CLOSID can have two monitor groups.

To allow resctrl to disambiguate RMID values for different CLOSIDs,
everything in resctrl that keeps an RMID value needs to know the CLOSID
too. This will always be ignored on x86.

Tested-by: Shaopeng Tan
Reviewed-by: Xin Hao
Signed-off-by: James Morse
---
Is there a better term for 'the unique identifier for a monitor group'?
Using RMID for that here may be confusing...
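As an illustration of why the CLOSID must travel with the RMID, consider
how an MPAM-like platform would select a hardware monitor. The sketch
below is not part of this series; mpam_pmg_bits is a hypothetical name
for the platform's PMG width. It only shows that a 'RMID' (PMG) is
meaningless without the CLOSID (PARTID) it extends:

    #include <stdint.h>

    /* Hypothetical: 1 bit of PMG space => two monitor groups per CLOSID */
    static const uint32_t mpam_pmg_bits = 1;

    /*
     * A monitor is selected by PARTID *and* PMG, so the same PMG value
     * ('RMID') exists once per CLOSID and is not globally unique.
     */
    static uint32_t mpam_monitor_sel(uint32_t closid, uint32_t pmg)
    {
            return (closid << mpam_pmg_bits) | pmg;
    }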
Changes since v1:
 * Added comment in struct rmid_entry

Changes since v2:
 * Moved X86_RESCTRL_BAD_CLOSID from a subsequent patch

Changes since v3:
 * Renamed X86_RESCTRL_BAD_CLOSID to EMPTY
 * Clarified a few comments and kernel-doc
---
 arch/x86/include/asm/resctrl.h            |  7 +++
 arch/x86/kernel/cpu/resctrl/internal.h    |  2 +-
 arch/x86/kernel/cpu/resctrl/monitor.c     | 65 ++++++++++++++---------
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  4 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 12 ++---
 include/linux/resctrl.h                   | 12 ++++-
 6 files changed, 65 insertions(+), 37 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 255a78d9d906..29999f52b461 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -7,6 +7,13 @@
 #include
 #include

+/*
+ * This value can never be a valid CLOSID, and is used when mapping a
+ * (closid, rmid) pair to an index and back. On x86 only the RMID is
+ * needed.
+ */
+#define X86_RESCTRL_EMPTY_CLOSID ((u32)~0)
+
 /**
  * struct resctrl_pqr_state - State cache for the PQR MSR
  * @cur_rmid:		The cached Resource Monitoring ID
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 85ceaf9a31ac..f2da908bb079 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -535,7 +535,7 @@ struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r);
 int closids_supported(void);
 void closid_free(int closid);
 int alloc_rmid(void);
-void free_rmid(u32 rmid);
+void free_rmid(u32 closid, u32 rmid);
 int rdt_get_mon_l3_config(struct rdt_resource *r);
 bool __init rdt_cpu_has(int flag);
 void mon_event_count(void *info);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index ded1fc7cb7cb..fa66029de41c 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -25,6 +25,12 @@
 #include "internal.h"

 struct rmid_entry {
+	/*
+	 * Some architectures' resctrl_arch_rmid_read() needs the CLOSID value
+	 * in order to access the correct monitor. This field provides the
+	 * value to list walkers like __check_limbo(). On x86 this is ignored.
+	 */
+	u32			closid;
 	u32			rmid;
 	int			busy;
 	struct list_head	list;
@@ -136,7 +142,7 @@ static inline u64 get_corrected_mbm_count(u32 rmid, unsigned long val)
 	return val;
 }

-static inline struct rmid_entry *__rmid_entry(u32 rmid)
+static inline struct rmid_entry *__rmid_entry(u32 closid, u32 rmid)
 {
 	struct rmid_entry *entry;

@@ -190,7 +196,8 @@ static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_domain *hw_dom,
 }

 void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
-			     u32 rmid, enum resctrl_event_id eventid)
+			     u32 closid, u32 rmid,
+			     enum resctrl_event_id eventid)
 {
 	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
 	struct arch_mbm_state *am;
@@ -230,7 +237,8 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
 }

 int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
-			   u32 rmid, enum resctrl_event_id eventid, u64 *val)
+			   u32 closid, u32 rmid, enum resctrl_event_id eventid,
+			   u64 *val)
 {
 	struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
 	struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d);
@@ -285,9 +293,9 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
 		if (nrmid >= r->num_rmid)
 			break;

-		entry = __rmid_entry(nrmid);
+		entry = __rmid_entry(X86_RESCTRL_EMPTY_CLOSID, nrmid);// temporary

-		if (resctrl_arch_rmid_read(r, d, entry->rmid,
+		if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid,
 					   QOS_L3_OCCUP_EVENT_ID, &val)) {
 			rmid_dirty = true;
 		} else {
@@ -342,7 +350,8 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 	cpu = get_cpu();
 	list_for_each_entry(d, &r->domains, list) {
 		if (cpumask_test_cpu(cpu, &d->cpu_mask)) {
-			err = resctrl_arch_rmid_read(r, d, entry->rmid,
+			err = resctrl_arch_rmid_read(r, d, entry->closid,
+						     entry->rmid,
 						     QOS_L3_OCCUP_EVENT_ID,
 						     &val);
 			if (err || val <= resctrl_rmid_realloc_threshold)
@@ -366,7 +375,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 		list_add_tail(&entry->list, &rmid_free_lru);
 }

-void free_rmid(u32 rmid)
+void free_rmid(u32 closid, u32 rmid)
 {
 	struct rmid_entry *entry;

@@ -375,7 +384,7 @@ void free_rmid(u32 rmid)

 	lockdep_assert_held(&rdtgroup_mutex);

-	entry = __rmid_entry(rmid);
+	entry = __rmid_entry(closid, rmid);

 	if (is_llc_occupancy_enabled())
 		add_rmid_to_limbo(entry);
@@ -383,8 +392,8 @@ void free_rmid(u32 rmid)
 		list_add_tail(&entry->list, &rmid_free_lru);
 }

-static struct mbm_state *get_mbm_state(struct rdt_domain *d, u32 rmid,
-				       enum resctrl_event_id evtid)
+static struct mbm_state *get_mbm_state(struct rdt_domain *d, u32 closid,
+				       u32 rmid, enum resctrl_event_id evtid)
 {
 	switch (evtid) {
 	case QOS_L3_MBM_TOTAL_EVENT_ID:
@@ -396,20 +405,21 @@ static struct mbm_state *get_mbm_state(struct rdt_domain *d, u32 rmid,
 	}
 }

-static int __mon_event_count(u32 rmid, struct rmid_read *rr)
+static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
 {
 	struct mbm_state *m;
 	u64 tval = 0;

 	if (rr->first) {
-		resctrl_arch_reset_rmid(rr->r, rr->d, rmid, rr->evtid);
-		m = get_mbm_state(rr->d, rmid, rr->evtid);
+		resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid);
+		m = get_mbm_state(rr->d, closid, rmid, rr->evtid);
 		if (m)
 			memset(m, 0, sizeof(struct mbm_state));
 		return 0;
 	}

-	rr->err = resctrl_arch_rmid_read(rr->r, rr->d, rmid, rr->evtid, &tval);
+	rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid, rr->evtid,
+					 &tval);
 	if (rr->err)
 		return rr->err;

@@ -429,7 +439,7 @@ static int __mon_event_count(u32 rmid, struct rmid_read *rr)
  * __mon_event_count() is compared with the chunks value from the previous
  * invocation.
 * This must be called once per second to maintain values in MBps.
 */
-static void mbm_bw_count(u32 rmid, struct rmid_read *rr)
+static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr)
 {
 	struct mbm_state *m = &rr->d->mbm_local[rmid];
 	u64 cur_bw, bytes, cur_bytes;

@@ -459,7 +469,7 @@ void mon_event_count(void *info)

 	rdtgrp = rr->rgrp;

-	ret = __mon_event_count(rdtgrp->mon.rmid, rr);
+	ret = __mon_event_count(rdtgrp->closid, rdtgrp->mon.rmid, rr);

 	/*
 	 * For Ctrl groups read data from child monitor groups and
@@ -470,7 +480,8 @@ void mon_event_count(void *info)

 	if (rdtgrp->type == RDTCTRL_GROUP) {
 		list_for_each_entry(entry, head, mon.crdtgrp_list) {
-			if (__mon_event_count(entry->mon.rmid, rr) == 0)
+			if (__mon_event_count(rdtgrp->closid, entry->mon.rmid,
+					      rr) == 0)
 				ret = 0;
 		}
 	}
@@ -600,7 +611,8 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
 	}
 }

-static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, int rmid)
+static void mbm_update(struct rdt_resource *r, struct rdt_domain *d,
+		       u32 closid, u32 rmid)
 {
 	struct rmid_read rr;

@@ -615,12 +627,12 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, int rmid)
 	if (is_mbm_total_enabled()) {
 		rr.evtid = QOS_L3_MBM_TOTAL_EVENT_ID;
 		rr.val = 0;
-		__mon_event_count(rmid, &rr);
+		__mon_event_count(closid, rmid, &rr);
 	}
 	if (is_mbm_local_enabled()) {
 		rr.evtid = QOS_L3_MBM_LOCAL_EVENT_ID;
 		rr.val = 0;
-		__mon_event_count(rmid, &rr);
+		__mon_event_count(closid, rmid, &rr);

 		/*
 		 * Call the MBA software controller only for the
 		 * control groups and when user has enabled
 		 * the software controller explicitly.
 		 */
 		if (is_mba_sc(NULL))
-			mbm_bw_count(rmid, &rr);
+			mbm_bw_count(closid, rmid, &rr);
 	}
 }

@@ -685,11 +697,11 @@ void mbm_handle_overflow(struct work_struct *work)
 	d = container_of(work, struct rdt_domain, mbm_over.work);

 	list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
-		mbm_update(r, d, prgrp->mon.rmid);
+		mbm_update(r, d, prgrp->closid, prgrp->mon.rmid);

 		head = &prgrp->mon.crdtgrp_list;
 		list_for_each_entry(crgrp, head, mon.crdtgrp_list)
-			mbm_update(r, d, crgrp->mon.rmid);
+			mbm_update(r, d, crgrp->closid, crgrp->mon.rmid);

 		if (is_mba_sc(NULL))
 			update_mba_bw(prgrp, d);
@@ -732,10 +744,11 @@ static int dom_data_init(struct rdt_resource *r)
 	}

 	/*
-	 * RMID 0 is special and is always allocated. It's used for all
-	 * tasks that are not monitored.
+	 * CLOSID 0 and RMID 0 are special and are always allocated. These are
+	 * used for rdtgroup_default control group, which will be setup later.
+	 * See rdtgroup_setup_root().
 	 */
-	entry = __rmid_entry(0);
+	entry = __rmid_entry(0, 0);
 	list_del(&entry->list);

 	return 0;
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 458cb7419502..aeadaeb5df9a 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -738,7 +738,7 @@ int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp)
 	 * anymore when this group would be used for pseudo-locking. This
 	 * is safe to call on platforms not capable of monitoring.
 	 */
-	free_rmid(rdtgrp->mon.rmid);
+	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);

 	ret = 0;
 	goto out;
@@ -773,7 +773,7 @@ int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp)

 	ret = rdtgroup_locksetup_user_restore(rdtgrp);
 	if (ret) {
-		free_rmid(rdtgrp->mon.rmid);
+		free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 		return ret;
 	}

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 725344048f85..f7fda4fc2c9e 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2714,7 +2714,7 @@ static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp)

 	head = &rdtgrp->mon.crdtgrp_list;
 	list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) {
-		free_rmid(sentry->mon.rmid);
+		free_rmid(sentry->closid, sentry->mon.rmid);
 		list_del(&sentry->mon.crdtgrp_list);

 		if (atomic_read(&sentry->waitcount) != 0)
@@ -2754,7 +2754,7 @@ static void rmdir_all_sub(void)
 		cpumask_or(&rdtgroup_default.cpu_mask,
 			   &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask);

-		free_rmid(rdtgrp->mon.rmid);
+		free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);

 		kernfs_remove(rdtgrp->kn);
 		list_del(&rdtgrp->rdtgroup_list);
@@ -3252,7 +3252,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 	return 0;

 out_idfree:
-	free_rmid(rdtgrp->mon.rmid);
+	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 out_destroy:
 	kernfs_put(rdtgrp->kn);
 	kernfs_remove(rdtgrp->kn);
@@ -3266,7 +3266,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 static void mkdir_rdt_prepare_clean(struct rdtgroup *rgrp)
 {
 	kernfs_remove(rgrp->kn);
-	free_rmid(rgrp->mon.rmid);
+	free_rmid(rgrp->closid, rgrp->mon.rmid);
 	rdtgroup_remove(rgrp);
 }

@@ -3415,7 +3415,7 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 	update_closid_rmid(tmpmask, NULL);

 	rdtgrp->flags = RDT_DELETED;
-	free_rmid(rdtgrp->mon.rmid);
+	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);

 	/*
 	 * Remove the rdtgrp from the parent ctrl_mon group's list
@@ -3461,8 +3461,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask)
 	cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask);
 	update_closid_rmid(tmpmask, NULL);

+	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 	closid_free(rdtgrp->closid);
-	free_rmid(rdtgrp->mon.rmid);

 	rdtgroup_ctrl_remove(rdtgrp);

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 8334eeacfec5..c413bb11d336 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -225,6 +225,9 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d);
  *			for this resource and domain.
  * @r:			resource that the counter should be read from.
  * @d:			domain that the counter should be read from.
+ * @closid:		closid that matches the rmid. Depending on the architecture, the
+ *			counter may match traffic of both @closid and @rmid, or @rmid
+ *			only.
  * @rmid:		rmid of the counter to read.
  * @eventid:		eventid to read, e.g. L3 occupancy.
  * @val:		result of the counter read in bytes.
@@ -235,20 +238,25 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d);
  * 0 on success, or -EIO, -EINVAL etc on error.
  */
 int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
-			   u32 rmid, enum resctrl_event_id eventid, u64 *val);
+			   u32 closid, u32 rmid, enum resctrl_event_id eventid,
+			   u64 *val);
+
 /**
  * resctrl_arch_reset_rmid() - Reset any private state associated with rmid
  *			       and eventid.
  * @r:			The domain's resource.
  * @d:			The rmid's domain.
+ * @closid:		closid that matches the rmid.
+ *			Depending on the architecture, the counter may match
+ *			traffic of both @closid and @rmid, or @rmid only.
  * @rmid:		The rmid whose counter values should be reset.
  * @eventid:		The eventid whose counter values should be reset.
  *
  * This can be called from any CPU.
  */
 void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
-			     u32 rmid, enum resctrl_event_id eventid);
+			     u32 closid, u32 rmid,
+			     enum resctrl_event_id eventid);

 /**
  * resctrl_arch_reset_rmid_all() - Reset all private state associated with
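The kernel-doc above deliberately leaves the meaning of @closid to the
architecture. A hedged sketch of the two permitted behaviours follows
(standalone C; all names invented for illustration, not kernel code):

    #include <stdbool.h>
    #include <stdint.h>

    /* Whether a hardware counter matches the (closid, rmid) being read. */
    static bool counter_matches(uint32_t hw_closid, uint32_t hw_rmid,
                                uint32_t closid, uint32_t rmid,
                                bool arch_needs_closid)
    {
            if (arch_needs_closid)  /* MPAM-like: PMG only subdivides a PARTID */
                    return hw_closid == closid && hw_rmid == rmid;
            return hw_rmid == rmid; /* x86: the RMID alone selects the counter */
    }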
From patchwork Fri Jul 28 16:42:32 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127751
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
    shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
    carl@os.amperecomputing.com, lcherian@marvell.com,
    bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
    xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com,
    Jamie Iles, Xin Hao, peternewman@google.com, dfustini@baylibre.com
Subject: [PATCH v5 02/24] x86/resctrl: Access per-rmid structures by index
Date: Fri, 28 Jul 2023 16:42:32 +0000
Message-Id: <20230728164254.27562-3-james.morse@arm.com>
In-Reply-To: <20230728164254.27562-1-james.morse@arm.com>
References: <20230728164254.27562-1-james.morse@arm.com>

x86 systems identify traffic using the CLOSID and RMID. The CLOSID is
used to lookup the control policy; the RMID is used for monitoring. For
x86 these are independent numbers.

Arm's MPAM has equivalent features PARTID and PMG, where the PARTID is
used to lookup the control policy. The PMG in contrast is a small number
of bits that are used to subdivide PARTID when monitoring. The
cache-occupancy monitors require the PARTID to be specified when
monitoring.

This means MPAM's PMG field is not unique. There are multiple PMG-0, one
per allocated CLOSID/PARTID. If PMG is treated as equivalent to RMID, it
cannot be allocated as an independent number. Bitmaps like rmid_busy_llc
need to be sized by the number of unique entries for this resource.
Treat the combined CLOSID and RMID as an index, and provide architecture
helpers to pack and unpack an index. This makes the MPAM values unique.
The domain's rmid_busy_llc and rmid_ptrs[] are then sized by index, as
are the domain's mbm_local[] and mbm_total[].

x86 can ignore the CLOSID field when packing and unpacking an index, and
report as many indexes as RMID.

Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
Changes since v1:
 * Added X86_BAD_CLOSID macro to make it clear what this value means
 * Added second WARN_ON() for closid checking, and made both _ONCE()

Changes since v2:
 * Added RESCTRL_RESERVED_CLOSID
 * Removed a newline
 * Rephrased some comments
 * Renamed a variable 'ignored'
 * Moved X86_RESCTRL_BAD_CLOSID to a previous patch

Changes since v3:
 * Changed a variable name
 * Fixed various typos

Changes since v4:
 * Removed resource parameter from has_busy_rmid()
 * Rewrote commit message
---
 arch/x86/include/asm/resctrl.h         | 17 +++++
 arch/x86/kernel/cpu/resctrl/core.c     |  4 +-
 arch/x86/kernel/cpu/resctrl/internal.h |  3 +-
 arch/x86/kernel/cpu/resctrl/monitor.c  | 92 +++++++++++++++++---------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  9 +--
 include/linux/resctrl.h                |  4 ++
 6 files changed, 92 insertions(+), 37 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 29999f52b461..9510c23db62d 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -101,6 +101,23 @@ static inline void resctrl_sched_in(struct task_struct *tsk)
 		__resctrl_sched_in(tsk);
 }

+static inline u32 resctrl_arch_system_num_rmid_idx(void)
+{
+	/* RMID are independent numbers for x86. num_rmid_idx == num_rmid */
+	return boot_cpu_data.x86_cache_max_rmid + 1;
+}
+
+static inline void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid)
+{
+	*rmid = idx;
+	*closid = X86_RESCTRL_EMPTY_CLOSID;
+}
+
+static inline u32 resctrl_arch_rmid_idx_encode(u32 ignored, u32 rmid)
+{
+	return rmid;
+}
+
 void resctrl_cpu_detect(struct cpuinfo_x86 *c);

 #else
diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 030d3b409768..8dfede01b0c9 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -585,7 +585,7 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 			mbm_setup_overflow_handler(d, 0);
 		}
 		if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
-		    has_busy_rmid(r, d)) {
+		    has_busy_rmid(d)) {
 			cancel_delayed_work(&d->cqm_limbo);
 			cqm_setup_limbo_handler(d, 0);
 		}
@@ -600,7 +600,7 @@ static void clear_closid_rmid(int cpu)
 	state->default_rmid = 0;
 	state->cur_closid = 0;
 	state->cur_rmid = 0;
-	wrmsr(MSR_IA32_PQR_ASSOC, 0, 0);
+	wrmsr(MSR_IA32_PQR_ASSOC, 0, RESCTRL_RESERVED_CLOSID);
 }

 static int resctrl_online_cpu(unsigned int cpu)
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index f2da908bb079..b48715bb8762 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include

 #define L3_QOS_CDP_ENABLE		0x01ULL

@@ -550,7 +551,7 @@ void __init intel_rdt_mbm_apply_quirk(void);
 bool is_mba_sc(struct rdt_resource *r);
 void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms);
 void cqm_handle_limbo(struct work_struct *work);
-bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d);
+bool has_busy_rmid(struct rdt_domain *d);
 void __check_limbo(struct rdt_domain *d, bool force_free);
 void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
 void __init thread_throttle_mode_init(void);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index fa66029de41c..bd234b66dddf 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -142,12 +142,29 @@ static inline u64 get_corrected_mbm_count(u32 rmid, unsigned long val)
 	return val;
 }

-static inline struct rmid_entry *__rmid_entry(u32 closid, u32 rmid)
+/*
+ * x86 and arm64 differ in their handling of monitoring.
+ * x86's RMID are an independent number, there is only one source of traffic
+ * with an RMID value of '1'.
+ * arm64's PMG extend the PARTID/CLOSID space, there are multiple sources of
+ * traffic with a PMG value of '1', one for each CLOSID, meaning the RMID
+ * value is no longer unique.
+ * To account for this, resctrl uses an index. On x86 this is just the RMID,
+ * on arm64 it encodes the CLOSID and RMID. This gives a unique number.
+ *
+ * The domain's rmid_busy_llc and rmid_ptrs[] are sized by index. The arch code
+ * must accept an attempt to read every index.
+ */
+static inline struct rmid_entry *__rmid_entry(u32 idx)
 {
 	struct rmid_entry *entry;
+	u32 closid, rmid;

-	entry = &rmid_ptrs[rmid];
-	WARN_ON(entry->rmid != rmid);
+	entry = &rmid_ptrs[idx];
+	resctrl_arch_rmid_idx_decode(idx, &closid, &rmid);
+
+	WARN_ON_ONCE(entry->closid != closid);
+	WARN_ON_ONCE(entry->rmid != rmid);

 	return entry;
 }
@@ -277,8 +294,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
 void __check_limbo(struct rdt_domain *d, bool force_free)
 {
 	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
+	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
 	struct rmid_entry *entry;
-	u32 crmid = 1, nrmid;
+	u32 idx, cur_idx = 1;
 	bool rmid_dirty;
 	u64 val = 0;

@@ -289,12 +307,11 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
 	 * RMID and move it to the free list when the counter reaches 0.
 	 */
 	for (;;) {
-		nrmid = find_next_bit(d->rmid_busy_llc, r->num_rmid, crmid);
-		if (nrmid >= r->num_rmid)
+		idx = find_next_bit(d->rmid_busy_llc, idx_limit, cur_idx);
+		if (idx >= idx_limit)
 			break;

-		entry = __rmid_entry(X86_RESCTRL_EMPTY_CLOSID, nrmid);// temporary
-
+		entry = __rmid_entry(idx);
 		if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid,
 					   QOS_L3_OCCUP_EVENT_ID, &val)) {
 			rmid_dirty = true;
@@ -303,19 +320,21 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
 		}

 		if (force_free || !rmid_dirty) {
-			clear_bit(entry->rmid, d->rmid_busy_llc);
+			clear_bit(idx, d->rmid_busy_llc);
 			if (!--entry->busy) {
 				rmid_limbo_count--;
 				list_add_tail(&entry->list, &rmid_free_lru);
 			}
 		}

-		crmid = nrmid + 1;
+		cur_idx = idx + 1;
 	}
 }

-bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d)
+bool has_busy_rmid(struct rdt_domain *d)
 {
-	return find_first_bit(d->rmid_busy_llc, r->num_rmid) != r->num_rmid;
+	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
+
+	return find_first_bit(d->rmid_busy_llc, idx_limit) != idx_limit;
 }

 /*
@@ -345,6 +364,9 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 	struct rdt_domain *d;
 	int cpu, err;
 	u64 val = 0;
+	u32 idx;
+
+	idx = resctrl_arch_rmid_idx_encode(entry->closid, entry->rmid);

 	entry->busy = 0;
 	cpu = get_cpu();
@@ -362,9 +384,9 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 			 * For the first limbo RMID in the domain,
 			 * setup up the limbo worker.
 			 */
-			if (!has_busy_rmid(r, d))
+			if (!has_busy_rmid(d))
 				cqm_setup_limbo_handler(d, CQM_LIMBOCHECK_INTERVAL);
-			set_bit(entry->rmid, d->rmid_busy_llc);
+			set_bit(idx, d->rmid_busy_llc);
 			entry->busy++;
 		}
 	put_cpu();
@@ -377,14 +399,17 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)

 void free_rmid(u32 closid, u32 rmid)
 {
+	u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid);
 	struct rmid_entry *entry;

-	if (!rmid)
-		return;
-
 	lockdep_assert_held(&rdtgroup_mutex);

-	entry = __rmid_entry(closid, rmid);
+	/* do not allow the default rmid to be free'd */
+	if (idx == resctrl_arch_rmid_idx_encode(RESCTRL_RESERVED_CLOSID,
+						RESCTRL_RESERVED_RMID))
+		return;
+
+	entry = __rmid_entry(idx);

 	if (is_llc_occupancy_enabled())
 		add_rmid_to_limbo(entry);
@@ -395,11 +420,13 @@ void free_rmid(u32 closid, u32 rmid)
 static struct mbm_state *get_mbm_state(struct rdt_domain *d, u32 closid,
 				       u32 rmid, enum resctrl_event_id evtid)
 {
+	u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid);
+
 	switch (evtid) {
 	case QOS_L3_MBM_TOTAL_EVENT_ID:
-		return &d->mbm_total[rmid];
+		return &d->mbm_total[idx];
 	case QOS_L3_MBM_LOCAL_EVENT_ID:
-		return &d->mbm_local[rmid];
+		return &d->mbm_local[idx];
 	default:
 		return NULL;
 	}
@@ -441,7 +468,8 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr)
  */
 static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr)
 {
-	struct mbm_state *m = &rr->d->mbm_local[rmid];
+	u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid);
+	struct mbm_state *m = &rr->d->mbm_local[idx];
 	u64 cur_bw, bytes, cur_bytes;

 	cur_bytes = rr->val;
@@ -531,7 +559,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)
 {
 	u32 closid, rmid, cur_msr_val, new_msr_val;
 	struct mbm_state *pmbm_data, *cmbm_data;
-	u32 cur_bw, delta_bw, user_bw;
+	u32 cur_bw, delta_bw, user_bw, idx;
 	struct rdt_resource *r_mba;
 	struct rdt_domain *dom_mba;
 	struct list_head *head;
@@ -544,7 +572,8 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm)

 	closid = rgrp->closid;
 	rmid = rgrp->mon.rmid;
-	pmbm_data = &dom_mbm->mbm_local[rmid];
+	idx = resctrl_arch_rmid_idx_encode(closid, rmid);
+	pmbm_data = &dom_mbm->mbm_local[idx];

 	dom_mba = get_domain_from_cpu(smp_processor_id(), r_mba);
 	if (!dom_mba) {
@@ -662,7 +691,7 @@ void cqm_handle_limbo(struct work_struct *work)

 	__check_limbo(d, false);

-	if (has_busy_rmid(r, d))
+	if (has_busy_rmid(d))
 		schedule_delayed_work_on(cpu, &d->cqm_limbo, delay);

 	mutex_unlock(&rdtgroup_mutex);
@@ -727,19 +756,20 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)

 static int dom_data_init(struct rdt_resource *r)
 {
+	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
 	struct rmid_entry *entry = NULL;
-	int i, nr_rmids;
+	u32 idx;
+	int i;

-	nr_rmids = r->num_rmid;
-	rmid_ptrs = kcalloc(nr_rmids, sizeof(struct rmid_entry), GFP_KERNEL);
+	rmid_ptrs = kcalloc(idx_limit, sizeof(struct rmid_entry), GFP_KERNEL);
 	if (!rmid_ptrs)
 		return -ENOMEM;

-	for (i = 0; i < nr_rmids; i++) {
+	for (i = 0; i < idx_limit; i++) {
 		entry = &rmid_ptrs[i];
 		INIT_LIST_HEAD(&entry->list);

-		entry->rmid = i;
+		resctrl_arch_rmid_idx_decode(i, &entry->closid, &entry->rmid);
 		list_add_tail(&entry->list, &rmid_free_lru);
 	}

@@ -748,7 +778,9 @@ static int dom_data_init(struct rdt_resource *r)
 	 * used for rdtgroup_default control group, which will be setup later.
 	 * See rdtgroup_setup_root().
 	 */
-	entry = __rmid_entry(0, 0);
+	idx = resctrl_arch_rmid_idx_encode(RESCTRL_RESERVED_CLOSID,
+					   RESCTRL_RESERVED_RMID);
+	entry = __rmid_entry(idx);
 	list_del(&entry->list);

 	return 0;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index f7fda4fc2c9e..6b7190f9cff6 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3727,7 +3727,7 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d)
 	if (is_mbm_enabled())
 		cancel_delayed_work(&d->mbm_over);
-	if (is_llc_occupancy_enabled() && has_busy_rmid(r, d)) {
+	if (is_llc_occupancy_enabled() && has_busy_rmid(d)) {
 		/*
 		 * When a package is going down, forcefully
 		 * decrement rmid->ebusy. There is no way to know
@@ -3745,16 +3745,17 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d)

 static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d)
 {
+	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
 	size_t tsize;

 	if (is_llc_occupancy_enabled()) {
-		d->rmid_busy_llc = bitmap_zalloc(r->num_rmid, GFP_KERNEL);
+		d->rmid_busy_llc = bitmap_zalloc(idx_limit, GFP_KERNEL);
 		if (!d->rmid_busy_llc)
 			return -ENOMEM;
 	}
 	if (is_mbm_total_enabled()) {
 		tsize = sizeof(*d->mbm_total);
-		d->mbm_total = kcalloc(r->num_rmid, tsize, GFP_KERNEL);
+		d->mbm_total = kcalloc(idx_limit, tsize, GFP_KERNEL);
 		if (!d->mbm_total) {
 			bitmap_free(d->rmid_busy_llc);
 			return -ENOMEM;
@@ -3762,7 +3763,7 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d)
 	}
 	if (is_mbm_local_enabled()) {
 		tsize = sizeof(*d->mbm_local);
-		d->mbm_local = kcalloc(r->num_rmid, tsize, GFP_KERNEL);
+		d->mbm_local = kcalloc(idx_limit, tsize, GFP_KERNEL);
 		if (!d->mbm_local) {
 			bitmap_free(d->rmid_busy_llc);
 			kfree(d->mbm_total);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index c413bb11d336..660752406174 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -6,6 +6,10 @@
 #include
 #include

+/* CLOSID, RMID value used by the default control group */
+#define RESCTRL_RESERVED_CLOSID		0
+#define RESCTRL_RESERVED_RMID		0
+
 #ifdef CONFIG_PROC_CPU_RESCTRL

 int proc_resctrl_show(struct seq_file *m,
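For contrast with the x86 helpers above, which ignore the CLOSID, here
is a hedged sketch of what MPAM-style index helpers could look like
(standalone C; the 8-bit PMG width and the names are assumptions, not
taken from the actual arm64 code):

    #include <assert.h>
    #include <stdint.h>

    #define PMG_SHIFT 8     /* hypothetical PMG width */

    static uint32_t idx_encode(uint32_t closid, uint32_t rmid)
    {
            /* closid:rmid packed into one globally unique index */
            return (closid << PMG_SHIFT) | rmid;
    }

    static void idx_decode(uint32_t idx, uint32_t *closid, uint32_t *rmid)
    {
            *closid = idx >> PMG_SHIFT;
            *rmid = idx & ((1u << PMG_SHIFT) - 1);
    }

    int main(void)
    {
            uint32_t closid, rmid;

            idx_decode(idx_encode(3, 1), &closid, &rmid);
            assert(closid == 3 && rmid == 1);       /* round-trips */
            return 0;
    }

Under this model, arrays such as rmid_ptrs[] would be sized by roughly
num_closid << PMG_SHIFT rather than by the number of RMIDs.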
From patchwork Fri Jul 28 16:42:33 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127753
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
    shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
    carl@os.amperecomputing.com, lcherian@marvell.com,
    bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
    xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com,
    Jamie Iles, Xin Hao, peternewman@google.com, dfustini@baylibre.com
Subject: [PATCH v5 03/24] x86/resctrl: Create helper for RMID allocation and mondata dir creation
Date: Fri, 28 Jul 2023 16:42:33 +0000
Message-Id: <20230728164254.27562-4-james.morse@arm.com>
In-Reply-To: <20230728164254.27562-1-james.morse@arm.com>
References: <20230728164254.27562-1-james.morse@arm.com>

When monitoring is supported, each monitor and control group is
allocated an RMID. For control groups, rdtgroup_mkdir_ctrl_mon() later
goes on to allocate the CLOSID.

MPAM's equivalent of RMID is not an independent number, so can't be
allocated until the CLOSID is known. An RMID allocation for one CLOSID
may fail, whereas another may succeed depending on how many monitor
groups a control group has. The RMID allocation needs to move to be
after the CLOSID has been allocated.

Move the RMID allocation and mondata dir creation to a helper; this
makes a subsequent change easier to read.

Tested-by: Shaopeng Tan
Reviewed-by: Ilpo Järvinen
Signed-off-by: James Morse
---
Changes since v4:
 * Fixed typo in commit message, moved some words around.
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 42 +++++++++++++++++---------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 6b7190f9cff6..e7178bbbd30f 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3165,6 +3165,30 @@ static int rdtgroup_init_alloc(struct rdtgroup *rdtgrp)
 	return ret;
 }

+static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
+{
+	int ret;
+
+	if (!rdt_mon_capable)
+		return 0;
+
+	ret = alloc_rmid();
+	if (ret < 0) {
+		rdt_last_cmd_puts("Out of RMIDs\n");
+		return ret;
+	}
+	rdtgrp->mon.rmid = ret;
+
+	ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
+	if (ret) {
+		rdt_last_cmd_puts("kernfs subdir error\n");
+		free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
+		return ret;
+	}
+
+	return 0;
+}
+
 static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 			     const char *name, umode_t mode,
 			     enum rdt_group_type rtype, struct rdtgroup **r)
@@ -3230,20 +3254,10 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 		goto out_destroy;
 	}

-	if (rdt_mon_capable) {
-		ret = alloc_rmid();
-		if (ret < 0) {
-			rdt_last_cmd_puts("Out of RMIDs\n");
-			goto out_destroy;
-		}
-		rdtgrp->mon.rmid = ret;
+	ret = mkdir_rdt_prepare_rmid_alloc(rdtgrp);
+	if (ret)
+		goto out_destroy;

-		ret = mkdir_mondata_all(kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
-		if (ret) {
-			rdt_last_cmd_puts("kernfs subdir error\n");
-			goto out_idfree;
-		}
-	}
 	kernfs_activate(kn);

 	/*
@@ -3251,8 +3265,6 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 	 */
 	return 0;

-out_idfree:
-	free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 out_destroy:
 	kernfs_put(rdtgrp->kn);
 	kernfs_remove(rdtgrp->kn);
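The commit message's claim that an RMID allocation can fail for one
CLOSID while succeeding for another is easy to model. Below is a toy
standalone sketch (numbers and names invented) with one PMG bit, i.e.
two monitor groups per CLOSID:

    #include <stdbool.h>
    #include <stdio.h>

    #define PMGS_PER_CLOSID 2       /* hypothetical: 1 bit of PMG */
    #define NUM_CLOSID 4

    static bool pmg_used[NUM_CLOSID][PMGS_PER_CLOSID];

    /* Returns a free 'RMID' for @closid, or -1 if that CLOSID is exhausted. */
    static int alloc_pmg(unsigned int closid)
    {
            for (int pmg = 0; pmg < PMGS_PER_CLOSID; pmg++) {
                    if (!pmg_used[closid][pmg]) {
                            pmg_used[closid][pmg] = true;
                            return pmg;
                    }
            }
            return -1;
    }

    int main(void)
    {
            pmg_used[0][0] = pmg_used[0][1] = true;   /* CLOSID 0: both in use */
            printf("closid 0 -> %d\n", alloc_pmg(0)); /* -1: fails */
            printf("closid 1 -> %d\n", alloc_pmg(1)); /*  0: succeeds */
            return 0;
    }

This is why the RMID cannot be handed out before the CLOSID is known.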
From patchwork Fri Jul 28 16:42:34 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127790
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
    shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
    carl@os.amperecomputing.com, lcherian@marvell.com,
    bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
    xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com,
    Jamie Iles, Xin Hao, peternewman@google.com, dfustini@baylibre.com
Subject: [PATCH v5 04/24] x86/resctrl: Move rmid allocation out of mkdir_rdt_prepare()
Date: Fri, 28 Jul 2023 16:42:34 +0000
Message-Id: <20230728164254.27562-5-james.morse@arm.com>
In-Reply-To: <20230728164254.27562-1-james.morse@arm.com>
References: <20230728164254.27562-1-james.morse@arm.com>
RMID are allocated for each monitor or control group directory, because
each of these needs its own RMID. For control groups,
rdtgroup_mkdir_ctrl_mon() later goes on to allocate the CLOSID.

MPAM's equivalent of RMID is not an independent number, so can't be
allocated until the CLOSID is known. An RMID allocation for one CLOSID
may fail, whereas another may succeed depending on how many monitor
groups a control group has. The RMID allocation needs to move to be
after the CLOSID has been allocated.

Move the RMID allocation out of mkdir_rdt_prepare() to occur in its
caller, after the mkdir_rdt_prepare() call. This allows the RMID
allocator to know the CLOSID.

Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
Changes since v2:
 * Moved kernfs_activate() later to preserve atomicity of files being visible
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 35 +++++++++++++++++++-------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index e7178bbbd30f..7c5cfb373d03 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3189,6 +3189,12 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
 	return 0;
 }

+static void mkdir_rdt_prepare_rmid_free(struct rdtgroup *rgrp)
+{
+	if (rdt_mon_capable)
+		free_rmid(rgrp->closid, rgrp->mon.rmid);
+}
+
 static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 			     const char *name, umode_t mode,
 			     enum rdt_group_type rtype, struct rdtgroup **r)
@@ -3254,12 +3260,6 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 		goto out_destroy;
 	}

-	ret = mkdir_rdt_prepare_rmid_alloc(rdtgrp);
-	if (ret)
-		goto out_destroy;
-
-	kernfs_activate(kn);
-
 	/*
 	 * The caller unlocks the parent_kn upon success.
 	 */
@@ -3278,7 +3278,6 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
 static void mkdir_rdt_prepare_clean(struct rdtgroup *rgrp)
 {
 	kernfs_remove(rgrp->kn);
-	free_rmid(rgrp->closid, rgrp->mon.rmid);
 	rdtgroup_remove(rgrp);
 }

@@ -3300,12 +3299,21 @@ static int rdtgroup_mkdir_mon(struct kernfs_node *parent_kn,
 	prgrp = rdtgrp->mon.parent;
 	rdtgrp->closid = prgrp->closid;

+	ret = mkdir_rdt_prepare_rmid_alloc(rdtgrp);
+	if (ret) {
+		mkdir_rdt_prepare_clean(rdtgrp);
+		goto out_unlock;
+	}
+
+	kernfs_activate(rdtgrp->kn);
+
 	/*
 	 * Add the rdtgrp to the list of rdtgrps the parent
 	 * ctrl_mon group has to track.
 	 */
 	list_add_tail(&rdtgrp->mon.crdtgrp_list, &prgrp->mon.crdtgrp_list);

+out_unlock:
 	rdtgroup_kn_unlock(parent_kn);
 	return ret;
 }
@@ -3336,10 +3344,17 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
 	ret = 0;

 	rdtgrp->closid = closid;
-	ret = rdtgroup_init_alloc(rdtgrp);
-	if (ret < 0)
+
+	ret = mkdir_rdt_prepare_rmid_alloc(rdtgrp);
+	if (ret)
 		goto out_id_free;

+	kernfs_activate(rdtgrp->kn);
+
+	ret = rdtgroup_init_alloc(rdtgrp);
+	if (ret < 0)
+		goto out_rmid_free;
+
 	list_add(&rdtgrp->rdtgroup_list, &rdt_all_groups);

 	if (rdt_mon_capable) {
@@ -3358,6 +3373,8 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,

 out_del_list:
 	list_del(&rdtgrp->rdtgroup_list);
+out_rmid_free:
+	mkdir_rdt_prepare_rmid_free(rdtgrp);
 out_id_free:
 	closid_free(closid);
 out_common_fail:
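A condensed model of the ordering this patch establishes in
rdtgroup_mkdir_ctrl_mon(): allocate the CLOSID, then the RMID that
depends on it, and unwind in reverse on failure. The stand-in functions
below are toys for illustration, not kernel code:

    #include <stdio.h>

    static int toy_closid_alloc(void) { return 1; }
    static void toy_closid_free(int closid) { printf("freed closid %d\n", closid); }
    static int toy_rmid_alloc(int closid) { (void)closid; return -1; /* simulate failure */ }

    int main(void)
    {
            int closid = toy_closid_alloc();    /* 1: control id first */
            int rmid = toy_rmid_alloc(closid);  /* 2: monitor id, scoped by it */

            if (rmid < 0) {
                    toy_closid_free(closid);    /* unwind in reverse order */
                    return 1;
            }
            return 0;
    }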
From patchwork Fri Jul 28 16:42:35 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127768
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, H Peter Anvin, Babu Moger, James Morse,
    shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
    carl@os.amperecomputing.com, lcherian@marvell.com,
    bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
    xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com,
    Jamie Iles, Xin Hao, peternewman@google.com, dfustini@baylibre.com
Subject: [PATCH v5 05/24] x86/resctrl: Allow RMID allocation to be scoped by CLOSID
Date: Fri, 28 Jul 2023 16:42:35 +0000
Message-Id: <20230728164254.27562-6-james.morse@arm.com>
In-Reply-To: <20230728164254.27562-1-james.morse@arm.com>
References: <20230728164254.27562-1-james.morse@arm.com>

MPAM's RMID values are not unique unless the CLOSID is considered as
well. alloc_rmid() expects the RMID to be an independent number.

Pass the CLOSID in to alloc_rmid(). Use this to compare indexes when
allocating. If the CLOSID is not relevant to the index, this ends up
comparing the free RMID with itself, and the first free entry will be
used. With MPAM the CLOSID is included in the index, so this becomes a
walk of the free RMID entries, until one that matches the supplied
CLOSID is found.
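A standalone model of the comparison the new resctrl_find_free_rmid()
performs (sketch only; PMG_SHIFT and the entries are invented). With an
MPAM-style encoding, only free entries whose CLOSID matches are usable;
with the x86 encoding, where the CLOSID is ignored, the first free entry
always compares equal:

    #include <stdio.h>
    #include <stdint.h>

    #define PMG_SHIFT 1     /* hypothetical: 1 bit of PMG */

    static uint32_t idx_encode(uint32_t closid, uint32_t rmid)
    {
            return (closid << PMG_SHIFT) | rmid;    /* MPAM-style */
    }

    struct entry { uint32_t closid, rmid; };

    int main(void)
    {
            struct entry free_list[] = { {0, 1}, {2, 0}, {2, 1} };
            uint32_t want_closid = 2;

            for (unsigned int i = 0; i < 3; i++) {
                    struct entry *e = &free_list[i];

                    /* the same comparison as the kernel's loop */
                    if (idx_encode(e->closid, e->rmid) ==
                        idx_encode(want_closid, e->rmid)) {
                            printf("use rmid %u (closid %u)\n", e->rmid, e->closid);
                            break;
                    }
            }
            return 0;
    }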
Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
Changes since v2:
 * Rephrased comment in resctrl_find_free_rmid() to describe this in
   terms of list_entry_first()
 * Rephrased comment above alloc_rmid()

Changes since v3:
 * Flipped conditions in alloc_rmid()

Changes since v4:
 * Typo in comment
---
 arch/x86/kernel/cpu/resctrl/internal.h    |  2 +-
 arch/x86/kernel/cpu/resctrl/monitor.c     | 51 +++++++++++++++++------
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  2 +-
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    |  2 +-
 4 files changed, 41 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index b48715bb8762..94749ee950dd 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -535,7 +535,7 @@ void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp);
 struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r);
 int closids_supported(void);
 void closid_free(int closid);
-int alloc_rmid(void);
+int alloc_rmid(u32 closid);
 void free_rmid(u32 closid, u32 rmid);
 int rdt_get_mon_l3_config(struct rdt_resource *r);
 bool __init rdt_cpu_has(int flag);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index bd234b66dddf..de91ca781d9f 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -337,24 +337,49 @@ bool has_busy_rmid(struct rdt_domain *d)
 	return find_first_bit(d->rmid_busy_llc, idx_limit) != idx_limit;
 }

-/*
- * As of now the RMIDs allocation is global.
- * However we keep track of which packages the RMIDs
- * are used to optimize the limbo list management.
- */
-int alloc_rmid(void)
+static struct rmid_entry *resctrl_find_free_rmid(u32 closid)
 {
-	struct rmid_entry *entry;
-
-	lockdep_assert_held(&rdtgroup_mutex);
+	struct rmid_entry *itr;
+	u32 itr_idx, cmp_idx;

 	if (list_empty(&rmid_free_lru))
-		return rmid_limbo_count ? -EBUSY : -ENOSPC;
+		return rmid_limbo_count ? ERR_PTR(-EBUSY) : ERR_PTR(-ENOSPC);
+
+	list_for_each_entry(itr, &rmid_free_lru, list) {
+		/*
+		 * Get the index of this free RMID, and the index it would need
+		 * to be if it were used with this CLOSID.
+		 * If the CLOSID is irrelevant on this architecture, these will
+		 * always be the same meaning the compiler can reduce this loop
+		 * to a single list_entry_first() call.
+		 */
+		itr_idx = resctrl_arch_rmid_idx_encode(itr->closid, itr->rmid);
+		cmp_idx = resctrl_arch_rmid_idx_encode(closid, itr->rmid);
+
+		if (itr_idx == cmp_idx)
+			return itr;
+	}
+
+	return ERR_PTR(-ENOSPC);
+}
+
+/*
+ * For MPAM the RMID value is not unique, and has to be considered with
+ * the CLOSID. The (CLOSID, RMID) pair is allocated on all domains, which
+ * allows all domains to be managed by a single limbo list.
+ * Each domain also has a rmid_busy_llc to reduce the work of the limbo handler.
+ */
+int alloc_rmid(u32 closid)
+{
+	struct rmid_entry *entry;
+
+	lockdep_assert_held(&rdtgroup_mutex);
+
+	entry = resctrl_find_free_rmid(closid);
+	if (IS_ERR(entry))
+		return PTR_ERR(entry);

-	entry = list_first_entry(&rmid_free_lru,
-				 struct rmid_entry, list);
 	list_del(&entry->list);
-
 	return entry->rmid;
 }
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index aeadaeb5df9a..5ebd6e54c7f2 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -763,7 +763,7 @@ int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp)
 	int ret;

 	if (rdt_mon_capable) {
-		ret = alloc_rmid();
+		ret = alloc_rmid(rdtgrp->closid);
 		if (ret < 0) {
 			rdt_last_cmd_puts("Out of RMIDs\n");
 			return ret;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 7c5cfb373d03..b97e119dbe46 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3172,7 +3172,7 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
 	if (!rdt_mon_capable)
 		return 0;

-	ret = alloc_rmid();
+	ret = alloc_rmid(rdtgrp->closid);
 	if (ret < 0) {
 		rdt_last_cmd_puts("Out of RMIDs\n");
 		return ret;

From patchwork Fri Jul 28 16:42:36 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127769
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 06/24] x86/resctrl: Track the number of dirty RMID a CLOSID has
Date: Fri, 28 Jul 2023 16:42:36 +0000
Message-Id: <20230728164254.27562-7-james.morse@arm.com>

MPAM's PMG bits extend its PARTID space, meaning the same PMG value can be
used for different control groups. This means once a CLOSID is allocated,
all its monitoring IDs may still be dirty, and held in limbo.

Keep track of the number of RMIDs each CLOSID has held in limbo. This will
allow a future helper to find the 'cleanest' CLOSID when allocating.

The array is only needed when CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID is
defined. This will never be the case on x86.

Signed-off-by: James Morse
---
Changes since v4:
 * Moved closid_num_dirty_rmid[] update under entry->busy check
 * Take the mutex in dom_data_init() as the caller doesn't.
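[Editorial note: a small worked sketch of the bookkeeping. The 1-bit PMG
space and the helper name are illustrative assumptions, not part of the
patch.]

/* Illustrative only: one PMG bit means each CLOSID owns two RMIDs. */
#define ILLUSTRATIVE_PMG_BITS	1

static inline u32 illustrative_idx_encode(u32 closid, u32 rmid)
{
	return (closid << ILLUSTRATIVE_PMG_BITS) | rmid;
}

/*
 * If both of CLOSID 3's RMIDs (indexes 6 and 7 with this encoding) are
 * freed while still showing LLC occupancy, both sit in limbo and
 * closid_num_dirty_rmid[3] == 2. An allocator that consults this array
 * can prefer a CLOSID whose count is zero, as its monitor groups can be
 * created without waiting for the limbo handler.
 */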
---
 arch/x86/kernel/cpu/resctrl/monitor.c | 49 +++++++++++++++++++++++----
 1 file changed, 42 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index de91ca781d9f..44addc0126fc 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -43,6 +43,13 @@ struct rmid_entry {
  */
 static LIST_HEAD(rmid_free_lru);

+/**
+ * @closid_num_dirty_rmid	The number of dirty RMID each CLOSID has.
+ *	Only allocated when CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID is defined.
+ *	Indexed by CLOSID. Protected by rdtgroup_mutex.
+ */
+static int *closid_num_dirty_rmid;
+
 /**
  * @rmid_limbo_count	count of currently unused but (potentially)
  *	dirty RMIDs.
@@ -285,6 +292,17 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d,
 	return 0;
 }

+static void limbo_release_entry(struct rmid_entry *entry)
+{
+	lockdep_assert_held(&rdtgroup_mutex);
+
+	rmid_limbo_count--;
+	list_add_tail(&entry->list, &rmid_free_lru);
+
+	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID))
+		closid_num_dirty_rmid[entry->closid]--;
+}
+
 /*
  * Check the RMIDs that are marked as busy for this domain. If the
  * reported LLC occupancy is below the threshold clear the busy bit and
@@ -321,10 +339,8 @@ void __check_limbo(struct rdt_domain *d, bool force_free)
 		if (force_free || !rmid_dirty) {
 			clear_bit(idx, d->rmid_busy_llc);
-			if (!--entry->busy) {
-				rmid_limbo_count--;
-				list_add_tail(&entry->list, &rmid_free_lru);
-			}
+			if (!--entry->busy)
+				limbo_release_entry(entry);
 		}
 		cur_idx = idx + 1;
 	}
@@ -391,6 +407,8 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 	u64 val = 0;
 	u32 idx;

+	lockdep_assert_held(&rdtgroup_mutex);
+
 	idx = resctrl_arch_rmid_idx_encode(entry->closid, entry->rmid);

 	entry->busy = 0;
@@ -416,9 +434,11 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 	}
 	put_cpu();

-	if (entry->busy)
+	if (entry->busy) {
 		rmid_limbo_count++;
-	else
+		if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID))
+			closid_num_dirty_rmid[entry->closid]++;
+	} else
 		list_add_tail(&entry->list, &rmid_free_lru);
 }

@@ -782,13 +802,28 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)
 static int dom_data_init(struct rdt_resource *r)
 {
 	u32 idx_limit = resctrl_arch_system_num_rmid_idx();
+	u32 num_closid = resctrl_arch_get_num_closid(r);
 	struct rmid_entry *entry = NULL;
 	u32 idx;
 	int i;

+	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) {
+		int *tmp;
+
+		tmp = kcalloc(num_closid, sizeof(int), GFP_KERNEL);
+		if (!tmp)
+			return -ENOMEM;
+
+		mutex_lock(&rdtgroup_mutex);
+		closid_num_dirty_rmid = tmp;
+		mutex_unlock(&rdtgroup_mutex);
+	}
+
 	rmid_ptrs = kcalloc(idx_limit, sizeof(struct rmid_entry), GFP_KERNEL);
-	if (!rmid_ptrs)
+	if (!rmid_ptrs) {
+		kfree(closid_num_dirty_rmid);
 		return -ENOMEM;
+	}

 	for (i = 0; i < idx_limit; i++) {
 		entry = &rmid_ptrs[i];

From patchwork Fri Jul 28 16:42:37 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127786
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 07/24] x86/resctrl: Use set_bit()/clear_bit() instead of open coding
Date: Fri, 28 Jul 2023 16:42:37 +0000
Message-Id: <20230728164254.27562-8-james.morse@arm.com>
The resctrl CLOSID allocator uses a single 32-bit word to track which
CLOSIDs are free. The setting and clearing of bits is open coded.

A subsequent patch adds resctrl_closid_is_free(), which adds more
open-coded bitmap operations. These will eventually need changing to use
the bitops helpers so that a CLOSID bitmap of the correct size can be
allocated dynamically.

Convert the existing open-coded bit manipulations of closid_free_map
to use set_bit() and friends.
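[Editorial note: a standalone sketch of the conversion, not code from the
patch. The bitops helpers take a pointer to unsigned long, which is why
closid_free_map's type changes from int in the hunk below.]

static unsigned long closid_free_map;

static void closid_bitmap_example(unsigned int closid)
{
	/* Open coded:  closid_free_map &= ~(1 << closid); */
	clear_bit(closid, &closid_free_map);

	/* Open coded:  closid_free_map |= 1 << closid; */
	set_bit(closid, &closid_free_map);

	/* Open coded:  (closid_free_map & (1 << closid)) == 0 */
	if (!test_bit(closid, &closid_free_map))
		pr_debug("closid %u is allocated\n", closid);
}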
Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index b97e119dbe46..4ab9bb018c17 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -106,7 +106,7 @@ void rdt_staged_configs_clear(void)
  * - Our choices on how to configure each resource become progressively more
  *   limited as the number of resources grows.
  */
-static int closid_free_map;
+static unsigned long closid_free_map;
 static int closid_free_map_len;

 int closids_supported(void)
@@ -126,7 +126,7 @@ static void closid_init(void)
 	closid_free_map = BIT_MASK(rdt_min_closid) - 1;

 	/* CLOSID 0 is always reserved for the default group */
-	closid_free_map &= ~1;
+	clear_bit(0, &closid_free_map);

 	closid_free_map_len = rdt_min_closid;
 }
@@ -137,14 +137,14 @@ static int closid_alloc(void)
 	if (closid == 0)
 		return -ENOSPC;
 	closid--;
-	closid_free_map &= ~(1 << closid);
+	clear_bit(closid, &closid_free_map);

 	return closid;
 }

 void closid_free(int closid)
 {
-	closid_free_map |= 1 << closid;
+	set_bit(closid, &closid_free_map);
 }

 /**
@@ -156,7 +156,7 @@ void closid_free(int closid)
  */
 static bool closid_allocated(unsigned int closid)
 {
-	return (closid_free_map & (1 << closid)) == 0;
+	return !test_bit(closid, &closid_free_map);
 }

 /**

From patchwork Fri Jul 28 16:42:38 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127758
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 08/24] x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid
Date: Fri, 28 Jul 2023 16:42:38 +0000
Message-Id: <20230728164254.27562-9-james.morse@arm.com>
MPAM's PMG bits extend its PARTID space, meaning the same PMG value can be
used for different control groups. This means once a CLOSID is allocated,
all its monitoring IDs may still be dirty, and held in limbo.

Instead of allocating the first free CLOSID, on architectures where
CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID is enabled, search
closid_num_dirty_rmid[] to find the cleanest CLOSID.

The CLOSID found is returned to closid_alloc() for the free list
to be updated.
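[Editorial note: a short worked example of the search; the counts are
made up for illustration.]

/*
 * Suppose closids_supported() == 4 and, under rdtgroup_mutex:
 *
 *   CLOSID:                   0     1     2     3
 *   closid_allocated():       yes   no    no    no
 *   closid_num_dirty_rmid[]:  -     2     0     1
 *
 * CLOSID 0 is skipped as it is in use. CLOSID 1 becomes the initial
 * candidate, but CLOSID 2 has no dirty RMIDs, so the search returns 2
 * immediately. Had every free CLOSID been dirty, the one with the
 * fewest dirty RMIDs would be returned instead.
 */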
Signed-off-by: James Morse
---
Changes since v4:
 * Dropped stale section from comment
---
 arch/x86/kernel/cpu/resctrl/internal.h |  2 ++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 42 ++++++++++++++++++++++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 19 +++++++++---
 3 files changed, 58 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 94749ee950dd..7c2a1c235480 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -557,5 +557,7 @@ void rdt_domain_reconfigure_cdp(struct rdt_resource *r);
 void __init thread_throttle_mode_init(void);
 void __init mbm_config_rftype_init(const char *config);
 void rdt_staged_configs_clear(void);
+bool closid_allocated(unsigned int closid);
+int resctrl_find_cleanest_closid(void);

 #endif /* _ASM_X86_RESCTRL_INTERNAL_H */
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 44addc0126fc..c268aa5925c7 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -379,6 +379,48 @@ static struct rmid_entry *resctrl_find_free_rmid(u32 closid)
 	return ERR_PTR(-ENOSPC);
 }

+/**
+ * resctrl_find_cleanest_closid() - Find a CLOSID where all the associated
+ *                                  RMID are clean, or the CLOSID that has
+ *                                  the most clean RMID.
+ *
+ * MPAM's equivalent of RMID are per-CLOSID, meaning a freshly allocated CLOSID
+ * may not be able to allocate clean RMID. To avoid this the allocator will
+ * choose the CLOSID with the most clean RMID.
+ *
+ * When the CLOSID and RMID are independent numbers, the first free CLOSID will
+ * be returned.
+ */
+int resctrl_find_cleanest_closid(void)
+{
+	u32 cleanest_closid = ~0, iter_num_dirty;
+	int i = 0;
+
+	lockdep_assert_held(&rdtgroup_mutex);
+
+	if (!IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID))
+		return -EIO;
+
+	for (i = 0; i < closids_supported(); i++) {
+		if (closid_allocated(i))
+			continue;
+
+		iter_num_dirty = closid_num_dirty_rmid[i];
+		if (iter_num_dirty == 0)
+			return i;
+
+		if (cleanest_closid == ~0)
+			cleanest_closid = i;
+
+		if (iter_num_dirty < closid_num_dirty_rmid[cleanest_closid])
+			cleanest_closid = i;
+	}
+
+	if (cleanest_closid == ~0)
+		return -ENOSPC;
+
+	return cleanest_closid;
+}
+
 /*
  * For MPAM the RMID value is not unique, and has to be considered with
  * the CLOSID. The (CLOSID, RMID) pair is allocated on all domains, which
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 4ab9bb018c17..df28b81d2c9c 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -132,11 +132,20 @@ static void closid_init(void)

 static int closid_alloc(void)
 {
-	u32 closid = ffs(closid_free_map);
+	u32 closid;
+	int err;

-	if (closid == 0)
-		return -ENOSPC;
-	closid--;
+	if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) {
+		err = resctrl_find_cleanest_closid();
+		if (err < 0)
+			return err;
+		closid = err;
+	} else {
+		closid = ffs(closid_free_map);
+		if (closid == 0)
+			return -ENOSPC;
+		closid--;
+	}
 	clear_bit(closid, &closid_free_map);

 	return closid;
@@ -154,7 +163,7 @@ void closid_free(int closid)
 * Return: true if @closid is currently associated with a resource group,
 * false if @closid is free
 */
-static bool closid_allocated(unsigned int closid)
+bool closid_allocated(unsigned int closid)
 {
 	return !test_bit(closid, &closid_free_map);
 }

From patchwork Fri Jul 28 16:42:39 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127777
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 09/24] x86/resctrl: Move CLOSID/RMID matching and setting to use helpers
Date: Fri, 28 Jul 2023 16:42:39 +0000
Message-Id: <20230728164254.27562-10-james.morse@arm.com>

When switching tasks, the CLOSID and RMID that the new task should use
are stored in struct task_struct. For x86 the CLOSID known by resctrl,
the value in task_struct, and the value written to the CPU register are
all the same thing. MPAM's CPU interface has two different PARTIDs: one
for data accesses, the other for instruction fetch. Storing resctrl's
CLOSID value in struct task_struct implies the arch code knows whether
resctrl is using CDP.

Move the matching and setting of the struct task_struct properties to
use helpers. This allows arm64 to store the hardware format of the
register, instead of having to convert it each time.

__rdtgroup_move_task()'s use of READ_ONCE()/WRITE_ONCE() ensures torn
values aren't seen, as another CPU may schedule the task being moved
while the value is being changed. A sketch of how an architecture might
keep the two stores atomic follows below.
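[Editorial note: a minimal sketch, assuming a hypothetical arm64-style
layout that packs both IDs into one 64-bit word so a single WRITE_ONCE()
publishes them together. The field name and layout are invented for
illustration; this is not the series' actual arm64 code.]

static inline void resctrl_arch_set_closid_rmid(struct task_struct *tsk,
						u32 closid, u32 rmid)
{
	/* Hypothetical packed format: the monitor ID in the top half. */
	u64 regval = ((u64)rmid << 32) | closid;

	/* One store: a concurrent scheduler read never sees a mixed pair. */
	WRITE_ONCE(tsk->mpam_partid_pmg, regval);
}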
MPAM has an additional corner-case here, as the PMG bits extend the
PARTID space: if the scheduler sees a new CLOSID but an old RMID, the
task will dirty an RMID that the limbo code is not watching, causing an
inaccurate count. x86's RMIDs are independent values, so the limbo code
will still be watching the old RMID in this circumstance. To avoid this,
arm64 needs the CLOSID and RMID to be WRITE_ONCE()d together, so both
values must be provided together.

Because MPAM's RMID values are not unique, the CLOSID must be provided
when matching the RMID.

Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
Changes since v2:
 * __rdtgroup_move_task() changed to set CLOSID from different CLOSID
   place depending on group type
---
 arch/x86/include/asm/resctrl.h         | 18 ++++++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 62 ++++++++++++++++----------
 2 files changed, 56 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 9510c23db62d..66d9e18cdc61 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -95,6 +95,24 @@ static inline unsigned int resctrl_arch_round_mon_val(unsigned int val)
 	return val * scale;
 }

+static inline void resctrl_arch_set_closid_rmid(struct task_struct *tsk,
+						u32 closid, u32 rmid)
+{
+	WRITE_ONCE(tsk->closid, closid);
+	WRITE_ONCE(tsk->rmid, rmid);
+}
+
+static inline bool resctrl_arch_match_closid(struct task_struct *tsk, u32 closid)
+{
+	return READ_ONCE(tsk->closid) == closid;
+}
+
+static inline bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 ignored,
+					   u32 rmid)
+{
+	return READ_ONCE(tsk->rmid) == rmid;
+}
+
 static inline void resctrl_sched_in(struct task_struct *tsk)
 {
 	if (static_branch_likely(&rdt_enable_key))
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index df28b81d2c9c..775f6bede6f8 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -97,7 +97,7 @@ void rdt_staged_configs_clear(void)
 *
 * Using a global CLOSID across all resources has some advantages and
 * some drawbacks:
- * + We can simply set "current->closid" to assign a task to a resource
+ * + We can simply set current's closid to assign a task to a resource
 *   group.
 * + Context switch code can avoid extra memory references deciding which
 *   CLOSID to load into the PQR_ASSOC MSR
@@ -563,14 +563,26 @@ static void update_task_closid_rmid(struct task_struct *t)
 		_update_task_closid_rmid(t);
 }

+static bool task_in_rdtgroup(struct task_struct *tsk, struct rdtgroup *rdtgrp)
+{
+	u32 closid, rmid = rdtgrp->mon.rmid;
+
+	if (rdtgrp->type == RDTCTRL_GROUP)
+		closid = rdtgrp->closid;
+	else if (rdtgrp->type == RDTMON_GROUP)
+		closid = rdtgrp->mon.parent->closid;
+	else
+		return false;
+
+	return resctrl_arch_match_closid(tsk, closid) &&
+	       resctrl_arch_match_rmid(tsk, closid, rmid);
+}
+
 static int __rdtgroup_move_task(struct task_struct *tsk,
 				struct rdtgroup *rdtgrp)
 {
 	/* If the task is already in rdtgrp, no need to move the task. */
-	if ((rdtgrp->type == RDTCTRL_GROUP && tsk->closid == rdtgrp->closid &&
-	     tsk->rmid == rdtgrp->mon.rmid) ||
-	    (rdtgrp->type == RDTMON_GROUP && tsk->rmid == rdtgrp->mon.rmid &&
-	     tsk->closid == rdtgrp->mon.parent->closid))
+	if (task_in_rdtgroup(tsk, rdtgrp))
 		return 0;

 	/*
@@ -581,19 +593,19 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
 	 * For monitor groups, can move the tasks only from
 	 * their parent CTRL group.
 	 */
-
-	if (rdtgrp->type == RDTCTRL_GROUP) {
-		WRITE_ONCE(tsk->closid, rdtgrp->closid);
-		WRITE_ONCE(tsk->rmid, rdtgrp->mon.rmid);
-	} else if (rdtgrp->type == RDTMON_GROUP) {
-		if (rdtgrp->mon.parent->closid == tsk->closid) {
-			WRITE_ONCE(tsk->rmid, rdtgrp->mon.rmid);
-		} else {
-			rdt_last_cmd_puts("Can't move task to different control group\n");
-			return -EINVAL;
-		}
+	if (rdtgrp->type == RDTMON_GROUP &&
+	    !resctrl_arch_match_closid(tsk, rdtgrp->mon.parent->closid)) {
+		rdt_last_cmd_puts("Can't move task to different control group\n");
+		return -EINVAL;
 	}

+	if (rdtgrp->type == RDTMON_GROUP)
+		resctrl_arch_set_closid_rmid(tsk, rdtgrp->mon.parent->closid,
+					     rdtgrp->mon.rmid);
+	else
+		resctrl_arch_set_closid_rmid(tsk, rdtgrp->closid,
+					     rdtgrp->mon.rmid);
+
 	/*
 	 * Ensure the task's closid and rmid are written before determining if
 	 * the task is current that will decide if it will be interrupted.
@@ -615,14 +627,15 @@ static int __rdtgroup_move_task(struct task_struct *tsk,

 static bool is_closid_match(struct task_struct *t, struct rdtgroup *r)
 {
-	return (rdt_alloc_capable &&
-	       (r->type == RDTCTRL_GROUP) && (t->closid == r->closid));
+	return (rdt_alloc_capable && (r->type == RDTCTRL_GROUP) &&
+		resctrl_arch_match_closid(t, r->closid));
 }

 static bool is_rmid_match(struct task_struct *t, struct rdtgroup *r)
 {
-	return (rdt_mon_capable &&
-	       (r->type == RDTMON_GROUP) && (t->rmid == r->mon.rmid));
+	return (rdt_mon_capable && (r->type == RDTMON_GROUP) &&
+		resctrl_arch_match_rmid(t, r->mon.parent->closid,
+					r->mon.rmid));
 }

 /**
@@ -822,7 +835,7 @@ int proc_resctrl_show(struct seq_file *s, struct pid_namespace *ns,
 		    rdtg->mode != RDT_MODE_EXCLUSIVE)
 			continue;

-		if (rdtg->closid != tsk->closid)
+		if (!resctrl_arch_match_closid(tsk, rdtg->closid))
 			continue;

 		seq_printf(s, "res:%s%s\n", (rdtg == &rdtgroup_default) ? "/" : "",
"/" : "", @@ -830,7 +843,8 @@ int proc_resctrl_show(struct seq_file *s, struct pid_namespace *ns, seq_puts(s, "mon:"); list_for_each_entry(crg, &rdtg->mon.crdtgrp_list, mon.crdtgrp_list) { - if (tsk->rmid != crg->mon.rmid) + if (!resctrl_arch_match_rmid(tsk, crg->mon.parent->closid, + crg->mon.rmid)) continue; seq_printf(s, "%s", crg->kn->name); break; @@ -2691,8 +2705,8 @@ static void rdt_move_group_tasks(struct rdtgroup *from, struct rdtgroup *to, for_each_process_thread(p, t) { if (!from || is_closid_match(t, from) || is_rmid_match(t, from)) { - WRITE_ONCE(t->closid, to->closid); - WRITE_ONCE(t->rmid, to->mon.rmid); + resctrl_arch_set_closid_rmid(t, to->closid, + to->mon.rmid); /* * Order the closid/rmid stores above before the loads From patchwork Fri Jul 28 16:42:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Morse X-Patchwork-Id: 127749 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:918b:0:b0:3e4:2afc:c1 with SMTP id s11csp567319vqg; Fri, 28 Jul 2023 09:57:44 -0700 (PDT) X-Google-Smtp-Source: APBJJlE8xWqjKiPbPMunZZYueeWC9epkoRu+FVA/7UqeQbK0NWD29s6Au2McGp+DcM54oFB3rDv+ X-Received: by 2002:a05:6a21:7897:b0:13c:dee4:ceaa with SMTP id bf23-20020a056a21789700b0013cdee4ceaamr1629643pzc.16.1690563464545; Fri, 28 Jul 2023 09:57:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1690563464; cv=none; d=google.com; s=arc-20160816; b=nZVZX7jMYo3CSYomkNH9LmE2eIYKRsPZtN5hDaxcvEAiNNRBHzq19KD2RFTz4XRTyz AL6y2/VPcZIaN4h/U8Ob1/lfHWlFePbvO295aBKig2Nx3di0BNrcIFtJ/YSQCtwCicKz E2iQJUTvk12PO0AyunuqfJ3TPwgscnWm+7NZakxw9GdeU7KMS95qh+Cz4ibFpnvl7fnF vH3pyYi0fJMeETkH5rxAQQd8aHtcQJnFlLMv07xtZgaEt9SVDVXWi9s2+H1blpa2k2QL oKh5ASb/7JzIOD9WW61lFqym8MG2/bTcaZABS1R3pgVAAklKY4WgeQSZB59J1XFnzktc BC/A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=AnaEspv7yPHxWFYxao23k8O8tDzjRIv0xbGT3Sv6WLQ=; fh=n9L+e/Mo/wOalMjVbltm4zKnQoPh0Qbxoe+x3eznw8A=; b=Tpr2rOjYLmy0BsU7jImzdGYGtBocFpM0e3fHctGbjCu9kQwaSvzQG9eke8qMdTZk1j bfboW7JmN5DGsWzhPWSsE8B0mzSfxef2poIPngSnX+08WWHAiCx1lh6rLi2kZmiIr9wd SDwF7SfKqOHHON18zW+5MaQFaYF/IEVp78ulhXzxQQMbK0kqsSgCOwcr7fp471hVSj3D Mmg5ce1l9ne3kWdkvJ0xfedf+IlnBSNn3aVJgIuLXLL5TonXIYU0xx1AdDTDJWbuVnUq avXAkcyQjQdBuZGq4eANico42s75XrL4+mFLok2XszSteGobg3548J5LekgChnzczhR0 M8og== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. 
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 10/24] tick/nohz: Move tick_nohz_full_mask declaration outside the #ifdef
Date: Fri, 28 Jul 2023 16:42:40 +0000
Message-Id: <20230728164254.27562-11-james.morse@arm.com>

tick_nohz_full_mask lists the CPUs that are nohz_full. This is only needed
when CONFIG_NO_HZ_FULL is defined. tick_nohz_full_cpu() allows a specific
CPU to be tested against the mask, and evaluates to false when
CONFIG_NO_HZ_FULL is not defined.

The resctrl code needs to pick a CPU to run some work on; a new helper
will prefer housekeeping CPUs by examining the tick_nohz_full_mask.
Hiding the declaration behind #ifdef CONFIG_NO_HZ_FULL forces all the
users to be behind an ifdef too.

Move the tick_nohz_full_mask declaration. This lets callers drop the
ifdef and instead guard access to tick_nohz_full_mask with IS_ENABLED()
or something like tick_nohz_full_cpu().

The definition does not need to be moved, as any callers should be
removed at compile time unless CONFIG_NO_HZ_FULL is defined.
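[Editorial note: a minimal sketch of the guard pattern this enables; the
caller below is hypothetical, not code from this series.]

#include <linux/tick.h>

static void report_nohz_full_cpus(void)
{
	/*
	 * tick_nohz_full_enabled() is a compile-time false when
	 * CONFIG_NO_HZ_FULL=n, so this reference to tick_nohz_full_mask
	 * is eliminated as dead code and needs no #ifdef of its own,
	 * now that the declaration is always visible.
	 */
	if (tick_nohz_full_enabled())
		pr_info("nohz_full CPUs: %*pbl\n",
			cpumask_pr_args(tick_nohz_full_mask));
}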
Signed-off-by: James Morse
---
 include/linux/tick.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 9459fef5b857..65af90ca409a 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -174,9 +174,16 @@ static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; }
 static inline void tick_nohz_idle_stop_tick_protected(void) { }
 #endif /* !CONFIG_NO_HZ_COMMON */

+/*
+ * Mask of CPUs that are nohz_full.
+ *
+ * Users should be guarded by CONFIG_NO_HZ_FULL or a tick_nohz_full_cpu()
+ * check.
+ */
+extern cpumask_var_t tick_nohz_full_mask;
+
 #ifdef CONFIG_NO_HZ_FULL
 extern bool tick_nohz_full_running;
-extern cpumask_var_t tick_nohz_full_mask;

 static inline bool tick_nohz_full_enabled(void)
 {

From patchwork Fri Jul 28 16:42:41 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127748
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 11/24] x86/resctrl: Add cpumask_any_housekeeping() for limbo/overflow
Date: Fri, 28 Jul 2023 16:42:41 +0000
Message-Id: <20230728164254.27562-12-james.morse@arm.com>

The limbo and overflow code picks a CPU to use from the domain's list of
online CPUs. Work is then scheduled on these CPUs to maintain the limbo
list and any counters that may overflow.

cpumask_any() may pick a CPU that is marked nohz_full, which will either
penalise the work that CPU was dedicated to, or delay the processing of
the limbo list, or of the counters that may overflow, perhaps
indefinitely. Delaying the overflow handling will skew the bandwidth
values calculated by mba_sc, which expects to be called once a second.

Add cpumask_any_housekeeping() as a replacement for cpumask_any() that
prefers housekeeping CPUs. This helper will still return a nohz_full CPU
if that is the only option. The CPU to use is re-evaluated each time the
limbo/overflow work runs. This ensures the work will move off a
nohz_full CPU once a housekeeping CPU is available.
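[Editorial note: a short worked example of the helper's behaviour; the
CPU numbers are invented for illustration.]

/*
 * Suppose a domain spans CPUs { 3, 5, 9 } and tick_nohz_full_mask
 * contains { 3, 9 }. cpumask_any() may return 3, which is nohz_full,
 * so the helper asks cpumask_nth_andnot() for the 0th CPU in
 * (mask & ~tick_nohz_full_mask) == { 5 }, and returns 5.
 *
 * If every CPU in the domain were nohz_full, cpumask_nth_andnot()
 * would return >= nr_cpu_ids and the original pick (3) would stand.
 */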
Signed-off-by: James Morse
---
Changes since v3:
 * Typos fixed

Changes since v4:
 * Made temporary variables unsigned
---
 arch/x86/kernel/cpu/resctrl/internal.h | 23 +++++++++++++++++++++++
 arch/x86/kernel/cpu/resctrl/monitor.c  | 17 ++++++++++++-----
 2 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index 7c2a1c235480..a32d307292a1 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include
 #include

 #define L3_QOS_CDP_ENABLE		0x01ULL
@@ -55,6 +56,28 @@
 /* Max event bits supported */
 #define MAX_EVT_CONFIG_BITS		GENMASK(6, 0)

+/**
+ * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that
+ *			        aren't marked nohz_full
+ * @mask:	The mask to pick a CPU from.
+ *
+ * Returns a CPU in @mask. If there are housekeeping CPUs that don't use
+ * nohz_full, these are preferred.
+ */
+static inline unsigned int cpumask_any_housekeeping(const struct cpumask *mask)
+{
+	unsigned int cpu, hk_cpu;
+
+	cpu = cpumask_any(mask);
+	if (tick_nohz_full_cpu(cpu)) {
+		hk_cpu = cpumask_nth_andnot(0, mask, tick_nohz_full_mask);
+		if (hk_cpu < nr_cpu_ids)
+			cpu = hk_cpu;
+	}
+
+	return cpu;
+}
+
 struct rdt_fs_context {
 	struct kernfs_fs_context	kfc;
 	bool				enable_cdpl2;
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index c268aa5925c7..f0670795b446 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -767,9 +767,9 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d,
 void cqm_handle_limbo(struct work_struct *work)
 {
 	unsigned long delay = msecs_to_jiffies(CQM_LIMBOCHECK_INTERVAL);
-	int cpu = smp_processor_id();
 	struct rdt_resource *r;
 	struct rdt_domain *d;
+	int cpu;

 	mutex_lock(&rdtgroup_mutex);

@@ -778,8 +778,10 @@ void cqm_handle_limbo(struct work_struct *work)

 	__check_limbo(d, false);

-	if (has_busy_rmid(d))
+	if (has_busy_rmid(d)) {
+		cpu = cpumask_any_housekeeping(&d->cpu_mask);
 		schedule_delayed_work_on(cpu, &d->cqm_limbo, delay);
+	}

 	mutex_unlock(&rdtgroup_mutex);
 }
@@ -789,7 +791,7 @@ void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms)
 	unsigned long delay = msecs_to_jiffies(delay_ms);
 	int cpu;

-	cpu = cpumask_any(&dom->cpu_mask);
+	cpu = cpumask_any_housekeeping(&dom->cpu_mask);
 	dom->cqm_work_cpu = cpu;

 	schedule_delayed_work_on(cpu, &dom->cqm_limbo, delay);
@@ -799,10 +801,10 @@ void mbm_handle_overflow(struct work_struct *work)
 {
 	unsigned long delay = msecs_to_jiffies(MBM_OVERFLOW_INTERVAL);
 	struct rdtgroup *prgrp, *crgrp;
-	int cpu = smp_processor_id();
 	struct list_head *head;
 	struct rdt_resource *r;
 	struct rdt_domain *d;
+	int cpu;

 	mutex_lock(&rdtgroup_mutex);

@@ -823,6 +825,11 @@ void mbm_handle_overflow(struct work_struct *work)
 		update_mba_bw(prgrp, d);
 	}

+	/*
+	 * Re-check for housekeeping CPUs. This allows the overflow handler to
+	 * move off a nohz_full CPU quickly.
+	 */
+	cpu = cpumask_any_housekeeping(&d->cpu_mask);
 	schedule_delayed_work_on(cpu, &d->mbm_over, delay);

 out_unlock:
@@ -836,7 +843,7 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)
 	if (!static_branch_likely(&rdt_mon_enable_key))
 		return;

-	cpu = cpumask_any(&dom->cpu_mask);
+	cpu = cpumask_any_housekeeping(&dom->cpu_mask);
 	dom->mbm_work_cpu = cpu;

 	schedule_delayed_work_on(cpu, &dom->mbm_over, delay);
 }

From patchwork Fri Jul 28 16:42:42 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127756
From patchwork Fri Jul 28 16:42:42 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127756
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 12/24] x86/resctrl: Make resctrl_arch_rmid_read() retry when it is interrupted
Date: Fri, 28 Jul 2023 16:42:42 +0000
Message-Id: <20230728164254.27562-13-james.morse@arm.com>
In-Reply-To: <20230728164254.27562-1-james.morse@arm.com>

resctrl_arch_rmid_read() could be called by resctrl in process context, and then called by the PMU driver from irq context on the same CPU. This could cause struct arch_mbm_state's prev_msr value to go backwards, leading to the chunks value being incremented multiple times. The struct arch_mbm_state holds both the previous MSR value and a count of the number of chunks. These two fields need to be updated atomically. Similarly, __rmid_read() must write to one MSR and read from another; this must be protected from re-entrance. Read the prev_msr before accessing the hardware, and cmpxchg() the value back. If the value has changed, the whole sequence is re-attempted. To protect the MSRs, __rmid_read() will retry the QM_CTR read if QM_EVTSEL has changed from the selected value.
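In isolation, the QM_EVTSEL protection looks like this — a condensed sketch of the mechanism the patch adds, not a drop-in replacement:

    static DEFINE_PER_CPU(u64, qm_evtsel_seq);

    static u64 read_qm_ctr(u32 rmid, enum resctrl_event_id eventid)
    {
            u64 seq, msr_val;

            do {
                    /* Bump the counter so an interrupting caller is visible. */
                    seq = this_cpu_inc_return(qm_evtsel_seq);
                    wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid);
                    rdmsrl(MSR_IA32_QM_CTR, msr_val);
                    /* If the count moved, QM_EVTSEL may have been rewritten. */
            } while (seq != this_cpu_read(qm_evtsel_seq));

            return msr_val;
    }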
Signed-off-by: James Morse --- Changes since v4: * Added retry loop in __rmid_read() to protect the CPU MSRs. --- arch/x86/kernel/cpu/resctrl/internal.h | 5 +-- arch/x86/kernel/cpu/resctrl/monitor.c | 45 ++++++++++++++++++++------ 2 files changed, 38 insertions(+), 12 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index a32d307292a1..7012f42a82ee 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -2,6 +2,7 @@ #ifndef _ASM_X86_RESCTRL_INTERNAL_H #define _ASM_X86_RESCTRL_INTERNAL_H +#include #include #include #include @@ -338,8 +339,8 @@ struct mbm_state { * find this struct. */ struct arch_mbm_state { - u64 chunks; - u64 prev_msr; + atomic64_t chunks; + atomic64_t prev_msr; }; /** diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index f0670795b446..62350bbd23e0 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -16,6 +16,7 @@ */ #include +#include #include #include @@ -24,6 +25,9 @@ #include "internal.h" +/* Sequence number for writes to IA32 QM_EVTSEL */ +static DEFINE_PER_CPU(u64, qm_evtsel_seq); + struct rmid_entry { /* * Some architectures's resctrl_arch_rmid_read() needs the CLOSID value @@ -178,7 +182,7 @@ static inline struct rmid_entry *__rmid_entry(u32 idx) static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val) { - u64 msr_val; + u64 msr_val, seq; /* * As per the SDM, when IA32_QM_EVTSEL.EvtID (bits 7:0) is configured @@ -187,9 +191,16 @@ static int __rmid_read(u32 rmid, enum resctrl_event_id eventid, u64 *val) * IA32_QM_CTR.data (bits 61:0) reports the monitored data. * IA32_QM_CTR.Error (bit 63) and IA32_QM_CTR.Unavailable (bit 62) * are error bits. + * A per-cpu sequence counter is incremented each time QM_EVTSEL is + * written. This is used to detect if this function was interrupted by + * another call without re-reading the MSRs. Retry the MSR read when + * this happens as the QM_CTR value may belong to a different event. */ - wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid); - rdmsrl(MSR_IA32_QM_CTR, msr_val); + do { + seq = this_cpu_inc_return(qm_evtsel_seq); + wrmsr(MSR_IA32_QM_EVTSEL, eventid, rmid); + rdmsrl(MSR_IA32_QM_CTR, msr_val); + } while (seq != this_cpu_read(qm_evtsel_seq)); if (msr_val & RMID_VAL_ERROR) return -EIO; @@ -225,13 +236,15 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d, { struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d); struct arch_mbm_state *am; + u64 msr_val; am = get_arch_mbm_state(hw_dom, rmid, eventid); if (am) { memset(am, 0, sizeof(*am)); /* Record any initial, non-zero count value. 
 */ - __rmid_read(rmid, eventid, &am->prev_msr); + __rmid_read(rmid, eventid, &msr_val); + atomic64_set(&am->prev_msr, msr_val); } } @@ -266,23 +279,35 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, { struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d); + u64 start_msr_val, old_msr_val, msr_val, chunks; struct arch_mbm_state *am; - u64 msr_val, chunks; - int ret; + int ret = 0; if (!cpumask_test_cpu(smp_processor_id(), &d->cpu_mask)) return -EINVAL; +interrupted: + am = get_arch_mbm_state(hw_dom, rmid, eventid); + if (am) + start_msr_val = atomic64_read(&am->prev_msr); + ret = __rmid_read(rmid, eventid, &msr_val); if (ret) return ret; am = get_arch_mbm_state(hw_dom, rmid, eventid); if (am) { - am->chunks += mbm_overflow_count(am->prev_msr, msr_val, - hw_res->mbm_width); - chunks = get_corrected_mbm_count(rmid, am->chunks); - am->prev_msr = msr_val; + old_msr_val = atomic64_cmpxchg(&am->prev_msr, start_msr_val, + msr_val); + if (old_msr_val != start_msr_val) + goto interrupted; + + chunks = mbm_overflow_count(start_msr_val, msr_val, + hw_res->mbm_width); + atomic64_add(chunks, &am->chunks); + + chunks = get_corrected_mbm_count(rmid, + atomic64_read(&am->chunks)); } else { chunks = msr_val; }
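Stripped of the resctrl plumbing, the prev_msr update follows a standard optimistic-retry shape; a sketch only, with read_hw() standing in for __rmid_read():

    static u64 read_hw(void);       /* stands in for __rmid_read() */

    static void accumulate(atomic64_t *prev, atomic64_t *total,
                           unsigned int width)
    {
            u64 start, now;

            do {
                    start = atomic64_read(prev);
                    now = read_hw();
                    /* Retry if another context updated *prev meanwhile. */
            } while (atomic64_cmpxchg(prev, start, now) != start);

            atomic64_add(mbm_overflow_count(start, now, width), total);
    }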
From patchwork Fri Jul 28 16:42:43 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127767
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 13/24] x86/resctrl: Queue mon_event_read() instead of sending an IPI
Date: Fri, 28 Jul 2023 16:42:43 +0000
Message-Id: <20230728164254.27562-14-james.morse@arm.com>
In-Reply-To: <20230728164254.27562-1-james.morse@arm.com>

Intel is blessed with an abundance of monitors, one per RMID, that can be read from any CPU in the domain. MPAM's monitors reside in the MMIO MSC, and the number implemented is up to the manufacturer. This means that when there are fewer monitors than needed, they need to be allocated and freed. MPAM's CSU monitors are used to back the 'llc_occupancy' monitor file. The CSU counter is allowed to return 'not ready' for a small number of microseconds after programming. To allow one CSU hardware monitor to be used for multiple control or monitor groups, the CPU accessing the monitor needs to be able to block when configuring and reading the counter. Worse, the domain may be broken up into slices, and the MMIO accesses for each slice may need performing from different CPUs.
These two details mean MPAM's monitor code needs to be able to sleep, and to IPI another CPU in the domain to read from a resource that has been sliced. mon_event_read() already invokes mon_event_count() via IPI, which means this isn't possible. On systems using nohz_full, some CPUs need to be interrupted to run kernel work as they otherwise stay in user-space running realtime workloads. Interrupting these CPUs should be avoided, and scheduling work on them may never complete. Change mon_event_read() to pick a housekeeping CPU (one that is not using nohz_full), schedule mon_event_count() on it, and wait. If all the CPUs in a domain are using nohz_full, then an IPI is used as the fallback. This function is only used in response to a user-space filesystem request (not the timing-sensitive overflow code). This allows MPAM to hide the slice behaviour from resctrl, and to keep the monitor allocation in monitor.c. When the IPI fallback is used on machines where MPAM needs to make an access on multiple CPUs, the counter read will always fail. Tested-by: Shaopeng Tan Reviewed-by: Peter Newman Tested-by: Peter Newman Signed-off-by: James Morse --- Changes since v2: * Use cpumask_any_housekeeping() and fall back to an IPI if needed. Changes since v3: * Actually include the IPI fallback code. Changes since v4: * Tinkered with existing capitalisation. --- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 28 +++++++++++++++++++++-- arch/x86/kernel/cpu/resctrl/monitor.c | 2 +- 2 files changed, 27 insertions(+), 3 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c index b44c487727d4..bd263b9a0abd 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -19,6 +19,7 @@ #include #include #include +#include #include "internal.h" /* @@ -520,12 +521,24 @@ int rdtgroup_schemata_show(struct kernfs_open_file *of, return ret; } +static int smp_mon_event_count(void *arg) +{ + mon_event_count(arg); + + return 0; +} + void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, struct rdt_domain *d, struct rdtgroup *rdtgrp, int evtid, int first) { + int cpu; + + /* When picking a CPU from cpu_mask, ensure it can't race with cpuhp */ + lockdep_assert_held(&rdtgroup_mutex); + /* - * setup the parameters to send to the IPI to read the data. + * Setup the parameters to pass to mon_event_count() to read the data. */ rr->rgrp = rdtgrp; rr->evtid = evtid; @@ -534,7 +547,18 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, rr->val = 0; rr->first = first; - smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1); + cpu = cpumask_any_housekeeping(&d->cpu_mask); + + /* + * cpumask_any_housekeeping() prefers housekeeping CPUs, but + * are all the CPUs nohz_full? If yes, pick a CPU to IPI. + * MPAM's resctrl_arch_rmid_read() is unable to read the + * counters on some platforms if it's called in irq context.
+ */ + if (tick_nohz_full_cpu(cpu)) + smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1); + else + smp_call_on_cpu(cpu, smp_mon_event_count, rr, false); } int rdtgroup_mondata_show(struct seq_file *m, void *arg) diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 62350bbd23e0..32569354c4f1 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -597,7 +597,7 @@ static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr) } /* - * This is called via IPI to read the CQM/MBM counters + * This is scheduled by mon_event_read() to read the CQM/MBM counters * on a domain. */ void mon_event_count(void *info)
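Taken together, the new dispatch logic in mon_event_read() reduces to the following shape — a condensed restatement of the patch, shown in isolation (read_on_domain() is an illustrative name):

    /* smp_call_on_cpu() needs an int-returning function, hence the wrapper. */
    static int smp_mon_event_count(void *arg)
    {
            mon_event_count(arg);
            return 0;
    }

    static void read_on_domain(struct rdt_domain *d, struct rmid_read *rr)
    {
            int cpu = cpumask_any_housekeeping(&d->cpu_mask);

            if (tick_nohz_full_cpu(cpu))
                    /* Every CPU in the domain is nohz_full: IPI one anyway. */
                    smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1);
            else
                    /* Queue on the housekeeping CPU and sleep until done. */
                    smp_call_on_cpu(cpu, smp_mon_event_count, rr, false);
    }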
From patchwork Fri Jul 28 16:42:44 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127764
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 14/24] x86/resctrl: Allow resctrl_arch_rmid_read() to sleep
Date: Fri, 28 Jul 2023 16:42:44 +0000
Message-Id: <20230728164254.27562-15-james.morse@arm.com>
In-Reply-To: <20230728164254.27562-1-james.morse@arm.com>

MPAM's cache occupancy counters can take a little while to settle once the monitor has been configured. The maximum settling time is described to the driver via a firmware table. The value could be large enough that it makes sense to sleep. To avoid exposing this to resctrl, it should be hidden behind MPAM's resctrl_arch_rmid_read(). resctrl_arch_rmid_read() may be called via IPI, meaning it is unable to sleep. In this case resctrl_arch_rmid_read() should return an error if it needs to sleep. This will only affect MPAM platforms where the cache occupancy counter isn't available immediately, nohz_full is in use, and there are no housekeeping CPUs in the necessary domain.
There are three callers of resctrl_arch_rmid_read(): __mon_event_count() and __check_limbo() are both called from a non-migratable context. mon_event_read() invokes __mon_event_count() using smp_call_on_cpu(), which adds work to the target CPU's workqueue. rdtgroup_mutex is held, meaning this cannot race with the resctrl cpuhp callback. __check_limbo() is invoked via schedule_delayed_work_on(), which also adds work to a per-CPU workqueue. The remaining caller is add_rmid_to_limbo(), which is called in response to a user-space syscall that frees an RMID. This opportunistically reads the LLC occupancy counter on the current domain to see if the RMID is over the dirty threshold. This has to disable preemption to avoid reading the wrong domain's value. Disabling preemption here prevents resctrl_arch_rmid_read() from sleeping. add_rmid_to_limbo() walks each domain, but only reads the counter on one domain. If the system has more than one domain, the RMID will always be added to the limbo list. If the RMID's usage was not over the threshold, it will be removed from the list when __check_limbo() runs. Make this the default behaviour: free RMIDs are always added to the limbo list for each domain. The user-visible effect of this is that a clean RMID is not available for re-allocation immediately after 'rmdir()' completes; this behaviour was never portable, as it never happened on a machine with multiple domains. Removing this path allows resctrl_arch_rmid_read() to sleep if it's called with interrupts unmasked. Document that this is the expected behaviour, and add a might_sleep() annotation to catch changes that won't work on arm64. Signed-off-by: James Morse --- The previous version allowed resctrl_arch_rmid_read() to be called on the wrong CPUs, but now that this needs to take nohz_full and housekeeping into account, it is too complex. Changes since v3: * Removed error handling for smp_call_function_any(); this can't race with the cpuhp callbacks as both hold rdtgroup_mutex. * Switched to the alternative of removing the counter read; this simplifies things dramatically. Changes since v4: * Messed with capitalisation. * Removed some dead code now that entry->busy will never be zero in add_rmid_to_limbo(). * Rephrased the comment above resctrl_arch_rmid_read_context_check().
--- arch/x86/kernel/cpu/resctrl/monitor.c | 24 +++++------------------- include/linux/resctrl.h | 18 +++++++++++++++++- 2 files changed, 22 insertions(+), 20 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 32569354c4f1..08e3307863c3 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -283,6 +283,8 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, struct arch_mbm_state *am; int ret = 0; + resctrl_arch_rmid_read_context_check(); + if (!cpumask_test_cpu(smp_processor_id(), &d->cpu_mask)) return -EINVAL; @@ -470,8 +472,6 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) { struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; struct rdt_domain *d; - int cpu, err; - u64 val = 0; u32 idx; lockdep_assert_held(&rdtgroup_mutex); @@ -479,17 +479,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) idx = resctrl_arch_rmid_idx_encode(entry->closid, entry->rmid); entry->busy = 0; - cpu = get_cpu(); list_for_each_entry(d, &r->domains, list) { - if (cpumask_test_cpu(cpu, &d->cpu_mask)) { - err = resctrl_arch_rmid_read(r, d, entry->closid, - entry->rmid, - QOS_L3_OCCUP_EVENT_ID, - &val); - if (err || val <= resctrl_rmid_realloc_threshold) - continue; - } - /* * For the first limbo RMID in the domain, * setup up the limbo worker. */ @@ -499,14 +489,10 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) set_bit(idx, d->rmid_busy_llc); entry->busy++; } - put_cpu(); - if (entry->busy) { - rmid_limbo_count++; - if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) - closid_num_dirty_rmid[entry->closid]++; - } else - list_add_tail(&entry->list, &rmid_free_lru); + rmid_limbo_count++; + if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) + closid_num_dirty_rmid[entry->closid]++; } void free_rmid(u32 closid, u32 rmid) diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index 660752406174..f7311102e94c 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -236,7 +236,12 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d); * @eventid: eventid to read, e.g. L3 occupancy. * @val: result of the counter read in bytes. * - * Call from process context on a CPU that belongs to domain @d. + * Some architectures need to sleep when first programming some of the counters. + * (specifically: arm64's MPAM cache occupancy counters can return 'not ready' + * for a short period of time). Call from a non-migratable process context on + * a CPU that belongs to domain @d. e.g. use smp_call_on_cpu() or + * schedule_work_on(). This function can be called with interrupts masked, + * e.g. using smp_call_function_any(), but may consistently return an error. * * Return: * 0 on success, or -EIO, -EINVAL etc on error. @@ -245,6 +250,17 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, u32 closid, u32 rmid, enum resctrl_event_id eventid, u64 *val); +/** + * resctrl_arch_rmid_read_context_check() - warn about invalid contexts + * + * When built with CONFIG_DEBUG_ATOMIC_SLEEP, generate a warning when + * resctrl_arch_rmid_read() is called with preemption disabled.
+ */ +static inline void resctrl_arch_rmid_read_context_check(void) +{ + if (!irqs_disabled()) + might_sleep(); +} /** * resctrl_arch_reset_rmid() - Reset any private state associated with rmid
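To see how this contract plays out, a hypothetical MPAM-style reader honouring it might look like the following; counter_ready() and read_hw_counter() are assumed names, not real functions:

    static int example_read_counter(u64 *val)
    {
            /* Warns under CONFIG_DEBUG_ATOMIC_SLEEP in a bad context. */
            resctrl_arch_rmid_read_context_check();

            while (!counter_ready()) {              /* assumed helper */
                    if (irqs_disabled())
                            return -EINVAL;         /* cannot sleep here */
                    usleep_range(10, 50);           /* settling is brief */
            }

            *val = read_hw_counter();               /* assumed helper */
            return 0;
    }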
From patchwork Fri Jul 28 16:42:45 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127754
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 15/24] x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read()
Date: Fri, 28 Jul 2023 16:42:45 +0000
Message-Id: <20230728164254.27562-16-james.morse@arm.com>
In-Reply-To: <20230728164254.27562-1-james.morse@arm.com>

Depending on the number of monitors available, Arm's MPAM may need to allocate a monitor prior to reading the counter value. Allocating a contended resource may involve sleeping. add_rmid_to_limbo() calls resctrl_arch_rmid_read() for multiple domains, so the allocation should be valid for all domains. __check_limbo() and mon_event_count() each make multiple calls to resctrl_arch_rmid_read(); to avoid extra work on contended systems, the allocation should remain valid across multiple invocations of resctrl_arch_rmid_read(). Add arch hooks for this allocation, which need calling before resctrl_arch_rmid_read(). The allocated monitor is passed to resctrl_arch_rmid_read(), then freed again afterwards. The helper can be called on any CPU, and can sleep.
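The expected calling pattern, condensed into a sketch of the API usage this patch introduces (example_read() is an illustrative name):

    static int example_read(struct rdt_resource *r, struct rdt_domain *d,
                            u32 closid, u32 rmid, u64 *val)
    {
            void *ctx = resctrl_arch_mon_ctx_alloc(r, QOS_L3_OCCUP_EVENT_ID);
            int err;

            if (IS_ERR(ctx))
                    return PTR_ERR(ctx);

            /* The context stays valid across repeated reads. */
            err = resctrl_arch_rmid_read(r, d, closid, rmid,
                                         QOS_L3_OCCUP_EVENT_ID, val, ctx);

            resctrl_arch_mon_ctx_free(r, QOS_L3_OCCUP_EVENT_ID, ctx);
            return err;
    }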
Tested-by: Shaopeng Tan Signed-off-by: James Morse --- Changes since v3: * Expanded comment. * Removed stray header include. * Reworded commit message. * Made ctx a void * instead of an int. Changes since v4: * Used IS_ERR() in more places. --- arch/x86/include/asm/resctrl.h | 11 ++++++++++ arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 5 +++++ arch/x86/kernel/cpu/resctrl/internal.h | 1 + arch/x86/kernel/cpu/resctrl/monitor.c | 25 ++++++++++++++++++++--- include/linux/resctrl.h | 5 ++++- 5 files changed, 43 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h index 66d9e18cdc61..0986b5208d76 100644 --- a/arch/x86/include/asm/resctrl.h +++ b/arch/x86/include/asm/resctrl.h @@ -136,6 +136,17 @@ static inline u32 resctrl_arch_rmid_idx_encode(u32 ignored, u32 rmid) return rmid; } +/* x86 can always read an rmid, nothing needs allocating */ +struct rdt_resource; +static inline void *resctrl_arch_mon_ctx_alloc(struct rdt_resource *r, int evtid) +{ + might_sleep(); + return NULL; +}; + +static inline void resctrl_arch_mon_ctx_free(struct rdt_resource *r, int evtid, + void *ctx) { }; + void resctrl_cpu_detect(struct cpuinfo_x86 *c); #else diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c index bd263b9a0abd..55bad57a7bd5 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -546,6 +546,9 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, rr->d = d; rr->val = 0; rr->first = first; + rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, evtid); + if (IS_ERR(rr->arch_mon_ctx)) + return; cpu = cpumask_any_housekeeping(&d->cpu_mask); @@ -559,6 +562,8 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1); else smp_call_on_cpu(cpu, smp_mon_event_count, rr, false); + + resctrl_arch_mon_ctx_free(r, evtid, rr->arch_mon_ctx); } int rdtgroup_mondata_show(struct seq_file *m, void *arg) diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 7012f42a82ee..45db51280ff4 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -136,6 +136,7 @@ struct rmid_read { bool first; int err; u64 val; + void *arch_mon_ctx; }; extern bool rdt_alloc_capable; diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 08e3307863c3..5eed8d0cbf36 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -275,7 +275,7 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width) int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, u32 closid, u32 rmid, enum resctrl_event_id eventid, - u64 *val) + u64 *val, void *ignored) { struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d); @@ -342,9 +342,14 @@ void __check_limbo(struct rdt_domain *d, bool force_free) u32 idx_limit = resctrl_arch_system_num_rmid_idx(); struct rmid_entry *entry; u32 idx, cur_idx = 1; + void *arch_mon_ctx; bool rmid_dirty; u64 val = 0; + arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, QOS_L3_OCCUP_EVENT_ID); + if (IS_ERR(arch_mon_ctx)) + return; + /* * Skip RMID 0 and start from RMID 1 and check all the RMIDs that * are marked as busy for occupancy < threshold. 
If the occupancy @@ -358,7 +363,8 @@ void __check_limbo(struct rdt_domain *d, bool force_free) entry = __rmid_entry(idx); if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid, - QOS_L3_OCCUP_EVENT_ID, &val)) { + QOS_L3_OCCUP_EVENT_ID, &val, + arch_mon_ctx)) { rmid_dirty = true; } else { rmid_dirty = (val >= resctrl_rmid_realloc_threshold); @@ -371,6 +377,8 @@ void __check_limbo(struct rdt_domain *d, bool force_free) } cur_idx = idx + 1; } + + resctrl_arch_mon_ctx_free(r, QOS_L3_OCCUP_EVENT_ID, arch_mon_ctx); } bool has_busy_rmid(struct rdt_domain *d) @@ -544,7 +552,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr) } rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid, rr->evtid, - &tval); + &tval, rr->arch_mon_ctx); if (rr->err) return rr->err; @@ -754,11 +762,21 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, if (is_mbm_total_enabled()) { rr.evtid = QOS_L3_MBM_TOTAL_EVENT_ID; rr.val = 0; + rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid); + if (IS_ERR(rr.arch_mon_ctx)) + return; + __mon_event_count(closid, rmid, &rr); + + resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx); } if (is_mbm_local_enabled()) { rr.evtid = QOS_L3_MBM_LOCAL_EVENT_ID; rr.val = 0; + rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid); + if (IS_ERR(rr.arch_mon_ctx)) + return; + __mon_event_count(closid, rmid, &rr); /* @@ -768,6 +786,7 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, */ if (is_mba_sc(NULL)) mbm_bw_count(closid, rmid, &rr); + resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx); } } diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index f7311102e94c..5e4b4df9610b 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -235,6 +235,9 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d); * @rmid: rmid of the counter to read. * @eventid: eventid to read, e.g. L3 occupancy. * @val: result of the counter read in bytes. + * @arch_mon_ctx: An architecture specific value from + * resctrl_arch_mon_ctx_alloc(), for MPAM this identifies + * the hardware monitor allocated for this read request. * * Some architectures need to sleep when first programming some of the counters. 
 * (specifically: arm64's MPAM cache occupancy counters can return 'not ready' @@ -248,7 +251,7 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d); */ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, u32 closid, u32 rmid, enum resctrl_event_id eventid, - u64 *val); + u64 *val, void *arch_mon_ctx); /** * resctrl_arch_rmid_read_context_check() - warn about invalid contexts
From patchwork Fri Jul 28 16:42:46 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127747
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 16/24] x86/resctrl: Make resctrl_mounted checks explicit
Date: Fri, 28 Jul 2023 16:42:46 +0000
Message-Id: <20230728164254.27562-17-james.morse@arm.com>
In-Reply-To: <20230728164254.27562-1-james.morse@arm.com>

The rdt_enable_key is switched when resctrl is mounted, and used to prevent a second mount of the filesystem. It also enables the architecture's context switch code. This requires another architecture to have the same set of static keys, as resctrl depends on them too. The existing users of these static keys are implicitly also checking if the filesystem is mounted. Make the resctrl_mounted checks explicit: resctrl can keep track of whether it has been mounted once. This doesn't need to be combined with whether the arch code is context switching the CLOSID. rdt_mon_enable_key is never used just to test that resctrl is mounted, but it does have this implication. Add a resctrl_mounted check to all uses of rdt_mon_enable_key. This will allow rdt_mon_enable_key to be swapped with a helper in a subsequent patch.
This will allow the static-key changing to be moved behind resctrl_arch_ calls. Tested-by: Shaopeng Tan Signed-off-by: James Morse --- Changes since v3: * Removed a newline. * Rephrased commit message Changes since v4: * Rephrased comment. --- arch/x86/kernel/cpu/resctrl/internal.h | 1 + arch/x86/kernel/cpu/resctrl/monitor.c | 12 ++++++++++-- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 23 +++++++++++++++++------ 3 files changed, 28 insertions(+), 8 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 45db51280ff4..28751579abe6 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -143,6 +143,7 @@ extern bool rdt_alloc_capable; extern bool rdt_mon_capable; extern unsigned int rdt_mon_features; extern struct list_head resctrl_schema_all; +extern bool resctrl_mounted; enum rdt_group_type { RDTCTRL_GROUP = 0, diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 5eed8d0cbf36..5350d44b16b6 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -838,7 +838,11 @@ void mbm_handle_overflow(struct work_struct *work) mutex_lock(&rdtgroup_mutex); - if (!static_branch_likely(&rdt_mon_enable_key)) + /* + * If the filesystem has been unmounted this work no longer needs to + * run. + */ + if (!resctrl_mounted || !static_branch_likely(&rdt_mon_enable_key)) goto out_unlock; r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; @@ -871,7 +875,11 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms) unsigned long delay = msecs_to_jiffies(delay_ms); int cpu; - if (!static_branch_likely(&rdt_mon_enable_key)) + /* + * When a domain comes online there is no guarantee the filesystem is + * mounted. If not, there is no need to catch counter overflow. + */ + if (!resctrl_mounted || !static_branch_likely(&rdt_mon_enable_key)) return; cpu = cpumask_any_housekeeping(&dom->cpu_mask); dom->mbm_work_cpu = cpu; diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 775f6bede6f8..68fe2dde8887 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -42,6 +42,9 @@ LIST_HEAD(rdt_all_groups); /* list of entries for the schemata file */ LIST_HEAD(resctrl_schema_all); +/* The filesystem can only be mounted once. */ +bool resctrl_mounted; + /* Kernel fs node for "info" directory under root */ static struct kernfs_node *kn_info; @@ -819,7 +822,7 @@ int proc_resctrl_show(struct seq_file *s, struct pid_namespace *ns, mutex_lock(&rdtgroup_mutex); /* Return empty if resctrl has not been mounted. */ - if (!static_branch_unlikely(&rdt_enable_key)) { + if (!resctrl_mounted) { seq_puts(s, "res:\nmon:\n"); goto unlock; } @@ -2495,7 +2498,7 @@ static int rdt_get_tree(struct fs_context *fc) /* * resctrl file system can only be mounted once. 
 */ - if (static_branch_unlikely(&rdt_enable_key)) { + if (resctrl_mounted) { ret = -EBUSY; goto out; } @@ -2543,8 +2546,10 @@ static int rdt_get_tree(struct fs_context *fc) if (rdt_mon_capable) static_branch_enable_cpuslocked(&rdt_mon_enable_key); - if (rdt_alloc_capable || rdt_mon_capable) + if (rdt_alloc_capable || rdt_mon_capable) { static_branch_enable_cpuslocked(&rdt_enable_key); + resctrl_mounted = true; + } if (is_mbm_enabled()) { r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; @@ -2815,6 +2820,7 @@ static void rdt_kill_sb(struct super_block *sb) static_branch_disable_cpuslocked(&rdt_alloc_enable_key); static_branch_disable_cpuslocked(&rdt_mon_enable_key); static_branch_disable_cpuslocked(&rdt_enable_key); + resctrl_mounted = false; kernfs_kill_sb(sb); mutex_unlock(&rdtgroup_mutex); cpus_read_unlock(); @@ -3774,7 +3780,7 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d) * If resctrl is mounted, remove all the * per domain monitor data directories. */ - if (static_branch_unlikely(&rdt_mon_enable_key)) + if (resctrl_mounted && static_branch_unlikely(&rdt_mon_enable_key)) rmdir_mondata_subdir_allrdtgrp(r, d->id); if (is_mbm_enabled()) @@ -3851,8 +3857,13 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) if (is_llc_occupancy_enabled()) INIT_DELAYED_WORK(&d->cqm_limbo, cqm_handle_limbo); - /* If resctrl is mounted, add per domain monitor data directories. */ - if (static_branch_unlikely(&rdt_mon_enable_key)) + /* + * If the filesystem is not mounted then only the default resource group + * exists. Creation of its directories is deferred until mount time + * by rdt_get_tree() calling mkdir_mondata_all(). + * If resctrl is mounted, add per domain monitor data directories. + */ + if (resctrl_mounted && static_branch_unlikely(&rdt_mon_enable_key)) mkdir_mondata_subdir_allrdtgrp(r, d); return 0;
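The recurring guard the deferred workers now share can be read in isolation; a sketch of the combined condition (monitor_work_should_run() is an illustrative name):

    /* True only when the filesystem is mounted and monitoring is enabled. */
    static bool monitor_work_should_run(void)
    {
            lockdep_assert_held(&rdtgroup_mutex);

            return resctrl_mounted &&
                   static_branch_likely(&rdt_mon_enable_key);
    }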
From patchwork Fri Jul 28 16:42:47 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127770
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 17/24] x86/resctrl: Move alloc/mon static keys into helpers
Date: Fri, 28 Jul 2023 16:42:47 +0000
Message-Id: <20230728164254.27562-18-james.morse@arm.com>
In-Reply-To: <20230728164254.27562-1-james.morse@arm.com>

resctrl enables three static keys depending on the features it has enabled. Another architecture's context switch code may look different; any static keys that control it should be buried behind helpers. Move the alloc/mon logic into arch-specific helpers as a preparatory step for making the rdt_enable_key's status something the arch code decides. This means other architectures don't have to mirror the static keys.
Tested-by: Shaopeng Tan Signed-off-by: James Morse --- arch/x86/include/asm/resctrl.h | 20 ++++++++++++++++++++ arch/x86/kernel/cpu/resctrl/internal.h | 5 ----- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 8 ++++---- 3 files changed, 24 insertions(+), 9 deletions(-) diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h index 0986b5208d76..23010fce5f8f 100644 --- a/arch/x86/include/asm/resctrl.h +++ b/arch/x86/include/asm/resctrl.h @@ -42,6 +42,26 @@ DECLARE_STATIC_KEY_FALSE(rdt_enable_key); DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key); DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key); +static inline void resctrl_arch_enable_alloc(void) +{ + static_branch_enable_cpuslocked(&rdt_alloc_enable_key); +} + +static inline void resctrl_arch_disable_alloc(void) +{ + static_branch_disable_cpuslocked(&rdt_alloc_enable_key); +} + +static inline void resctrl_arch_enable_mon(void) +{ + static_branch_enable_cpuslocked(&rdt_mon_enable_key); +} + +static inline void resctrl_arch_disable_mon(void) +{ + static_branch_disable_cpuslocked(&rdt_mon_enable_key); +} + /* * __resctrl_sched_in() - Writes the task's CLOSid/RMID to IA32_PQR_MSR * diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 28751579abe6..ac39fecba4ca 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -93,9 +93,6 @@ static inline struct rdt_fs_context *rdt_fc2context(struct fs_context *fc) return container_of(kfc, struct rdt_fs_context, kfc); } -DECLARE_STATIC_KEY_FALSE(rdt_enable_key); -DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key); - /** * struct mon_evt - Entry in the event list of a resource * @evtid: event id @@ -453,8 +450,6 @@ extern struct mutex rdtgroup_mutex; extern struct rdt_hw_resource rdt_resources_all[]; extern struct rdtgroup rdtgroup_default; -DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key); - extern struct dentry *debugfs_resctrl; enum resctrl_res_level { diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 68fe2dde8887..91a740e10865 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -2542,9 +2542,9 @@ static int rdt_get_tree(struct fs_context *fc) goto out_psl; if (rdt_alloc_capable) - static_branch_enable_cpuslocked(&rdt_alloc_enable_key); + resctrl_arch_enable_alloc(); if (rdt_mon_capable) - static_branch_enable_cpuslocked(&rdt_mon_enable_key); + resctrl_arch_enable_mon(); if (rdt_alloc_capable || rdt_mon_capable) { static_branch_enable_cpuslocked(&rdt_enable_key); @@ -2817,8 +2817,8 @@ static void rdt_kill_sb(struct super_block *sb) rdt_pseudo_lock_release(); rdtgroup_default.mode = RDT_MODE_SHAREABLE; schemata_list_destroy(); - static_branch_disable_cpuslocked(&rdt_alloc_enable_key); - static_branch_disable_cpuslocked(&rdt_mon_enable_key); + resctrl_arch_disable_alloc(); + resctrl_arch_disable_mon(); static_branch_disable_cpuslocked(&rdt_enable_key); resctrl_mounted = false; kernfs_kill_sb(sb);
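With the static keys hidden behind these helpers, an architecture whose context-switch path doesn't use them could plausibly supply empty stubs; a hypothetical sketch, not part of this series:

    /* Hypothetical no-op implementations for an arch without these keys. */
    static inline void resctrl_arch_enable_alloc(void) { }
    static inline void resctrl_arch_disable_alloc(void) { }
    static inline void resctrl_arch_enable_mon(void) { }
    static inline void resctrl_arch_disable_mon(void) { }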
From patchwork Fri Jul 28 16:42:48 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127791
From: James Morse
Subject: [PATCH v5 18/24] x86/resctrl: Make rdt_enable_key the arch's decision to switch
Date: Fri, 28 Jul 2023 16:42:48 +0000
Message-Id: <20230728164254.27562-19-james.morse@arm.com>

rdt_enable_key is switched when resctrl is mounted. It was also
previously used to prevent a second mount of the filesystem.

Any other architecture that wants to support resctrl has to provide an
identical static key.

Now that there are helpers for enabling and disabling the alloc/mon
keys, resctrl doesn't need to switch this extra key; that can be done
by the arch code instead. Use the static-key increment and decrement
helpers, and change resctrl to ensure the calls are balanced.

Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
 arch/x86/include/asm/resctrl.h         |  4 ++++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 11 +++++------
 2 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 23010fce5f8f..3876d4bb4bed 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -45,21 +45,25 @@ DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
 static inline void resctrl_arch_enable_alloc(void)
 {
 	static_branch_enable_cpuslocked(&rdt_alloc_enable_key);
+	static_branch_inc_cpuslocked(&rdt_enable_key);
 }
 
 static inline void resctrl_arch_disable_alloc(void)
 {
 	static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
+	static_branch_dec_cpuslocked(&rdt_enable_key);
 }
 
 static inline void resctrl_arch_enable_mon(void)
 {
 	static_branch_enable_cpuslocked(&rdt_mon_enable_key);
+	static_branch_inc_cpuslocked(&rdt_enable_key);
 }
 
 static inline void resctrl_arch_disable_mon(void)
 {
 	static_branch_disable_cpuslocked(&rdt_mon_enable_key);
+	static_branch_dec_cpuslocked(&rdt_enable_key);
 }
 
 /*
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 91a740e10865..ce1ed485e4f7 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2546,10 +2546,8 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (rdt_mon_capable)
 		resctrl_arch_enable_mon();
 
-	if (rdt_alloc_capable || rdt_mon_capable) {
-		static_branch_enable_cpuslocked(&rdt_enable_key);
+	if (rdt_alloc_capable || rdt_mon_capable)
 		resctrl_mounted = true;
-	}
 
 	if (is_mbm_enabled()) {
 		r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
@@ -2817,9 +2815,10 @@ static void rdt_kill_sb(struct super_block *sb)
 	rdt_pseudo_lock_release();
 	rdtgroup_default.mode = RDT_MODE_SHAREABLE;
 	schemata_list_destroy();
-	resctrl_arch_disable_alloc();
-	resctrl_arch_disable_mon();
-	static_branch_disable_cpuslocked(&rdt_enable_key);
+	if (rdt_alloc_capable)
+		resctrl_arch_disable_alloc();
+	if (rdt_mon_capable)
+		resctrl_arch_disable_mon();
 	resctrl_mounted = false;
 	kernfs_kill_sb(sb);
 	mutex_unlock(&rdtgroup_mutex);
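
The inc/dec helpers make rdt_enable_key behave like a reference count:
it stays enabled while either the alloc or the mon key holds a
reference, which is why resctrl must keep the calls balanced. A
runnable toy model of that semantic (plain userspace C, not kernel
code; the names are invented for illustration):

#include <stdbool.h>
#include <stdio.h>

static int refcount;	/* models the static key's enable count */

static void key_inc(void) { refcount++; }	/* static_branch_inc() */
static void key_dec(void) { refcount--; }	/* static_branch_dec() */
static bool key_on(void) { return refcount > 0; }

int main(void)
{
	key_inc();				/* resctrl_arch_enable_alloc()  */
	key_inc();				/* resctrl_arch_enable_mon()    */
	key_dec();				/* resctrl_arch_disable_alloc() */
	printf("after one dec: %d\n", key_on());	/* prints 1 */
	key_dec();				/* resctrl_arch_disable_mon()   */
	printf("after both dec: %d\n", key_on());	/* prints 0 */
	return 0;
}

An unbalanced disable would underflow the count, which is why
rdt_kill_sb() only calls the disable helpers for capabilities that
were actually enabled at mount time.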
From patchwork Fri Jul 28 16:42:49 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127778
From: James Morse
Subject: [PATCH v5 19/24] x86/resctrl: Add helpers for system wide mon/alloc capable
Date: Fri, 28 Jul 2023 16:42:49 +0000
Message-Id: <20230728164254.27562-20-james.morse@arm.com>

resctrl reads rdt_alloc_capable or rdt_mon_capable to determine
whether any of the resources support the corresponding features.
resctrl also uses the static keys that affect the architecture's
context-switch code to determine the same thing.

This forces another architecture to have the same static keys.

As the static key is enabled based on the capable flag, and none of
the filesystem uses of these are in the scheduler path, move the
capable flags behind helpers, and use these in the filesystem code
instead of the static key.

After this change, only the architecture code manages and uses the
static keys to ensure __resctrl_sched_in() does not need runtime
checks. This avoids multiple architectures having to define the same
static keys.
Cases where the static key implicitly tested if the resctrl
filesystem was mounted all have an explicit check added by a previous
patch.

Tested-by: Shaopeng Tan
Reviewed-by: Shaopeng Tan
Signed-off-by: James Morse
---
Changes since v1:
 * Added missing conversion in mkdir_rdt_prepare_rmid_free()

Changes since v3:
 * Expanded the commit message.
---
 arch/x86/include/asm/resctrl.h            | 13 +++++++++
 arch/x86/kernel/cpu/resctrl/internal.h    |  2 --
 arch/x86/kernel/cpu/resctrl/monitor.c     |  4 +--
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  6 ++--
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 34 +++++++++++------------
 5 files changed, 35 insertions(+), 24 deletions(-)

diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 3876d4bb4bed..63a4a2332d61 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -38,10 +38,18 @@ struct resctrl_pqr_state {
 
 DECLARE_PER_CPU(struct resctrl_pqr_state, pqr_state);
 
+extern bool rdt_alloc_capable;
+extern bool rdt_mon_capable;
+
 DECLARE_STATIC_KEY_FALSE(rdt_enable_key);
 DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key);
 DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key);
 
+static inline bool resctrl_arch_alloc_capable(void)
+{
+	return rdt_alloc_capable;
+}
+
 static inline void resctrl_arch_enable_alloc(void)
 {
 	static_branch_enable_cpuslocked(&rdt_alloc_enable_key);
@@ -54,6 +62,11 @@ static inline void resctrl_arch_disable_alloc(void)
 	static_branch_dec_cpuslocked(&rdt_enable_key);
 }
 
+static inline bool resctrl_arch_mon_capable(void)
+{
+	return rdt_mon_capable;
+}
+
 static inline void resctrl_arch_enable_mon(void)
 {
 	static_branch_enable_cpuslocked(&rdt_mon_enable_key);
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index ac39fecba4ca..f99e0a1f39c8 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -136,8 +136,6 @@ struct rmid_read {
 	void *arch_mon_ctx;
 };
 
-extern bool rdt_alloc_capable;
-extern bool rdt_mon_capable;
 extern unsigned int rdt_mon_features;
 extern struct list_head resctrl_schema_all;
 extern bool resctrl_mounted;
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 5350d44b16b6..c0b1ad8d8f6d 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -842,7 +842,7 @@ void mbm_handle_overflow(struct work_struct *work)
 	 * If the filesystem has been unmounted this work no longer needs to
 	 * run.
 	 */
-	if (!resctrl_mounted || !static_branch_likely(&rdt_mon_enable_key))
+	if (!resctrl_mounted || !resctrl_arch_mon_capable())
 		goto out_unlock;
 
 	r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
@@ -879,7 +879,7 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)
 	 * When a domain comes online there is no guarantee the filesystem is
 	 * mounted. If not, there is no need to catch counter overflow.
 	 */
-	if (!resctrl_mounted || !static_branch_likely(&rdt_mon_enable_key))
+	if (!resctrl_mounted || !resctrl_arch_mon_capable())
 		return;
 	cpu = cpumask_any_housekeeping(&dom->cpu_mask);
 	dom->mbm_work_cpu = cpu;
diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 5ebd6e54c7f2..460421051abf 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -567,7 +567,7 @@ static int rdtgroup_locksetup_user_restrict(struct rdtgroup *rdtgrp)
 	if (ret)
 		goto err_cpus;
 
-	if (rdt_mon_capable) {
+	if (resctrl_arch_mon_capable()) {
 		ret = rdtgroup_kn_mode_restrict(rdtgrp, "mon_groups");
 		if (ret)
 			goto err_cpus_list;
@@ -614,7 +614,7 @@ static int rdtgroup_locksetup_user_restore(struct rdtgroup *rdtgrp)
 	if (ret)
 		goto err_cpus;
 
-	if (rdt_mon_capable) {
+	if (resctrl_arch_mon_capable()) {
 		ret = rdtgroup_kn_mode_restore(rdtgrp, "mon_groups", 0777);
 		if (ret)
 			goto err_cpus_list;
@@ -762,7 +762,7 @@ int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp)
 {
 	int ret;
 
-	if (rdt_mon_capable) {
+	if (resctrl_arch_mon_capable()) {
 		ret = alloc_rmid(rdtgrp->closid);
 		if (ret < 0) {
 			rdt_last_cmd_puts("Out of RMIDs\n");
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index ce1ed485e4f7..fef78a3dc632 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -630,13 +630,13 @@ static int __rdtgroup_move_task(struct task_struct *tsk,
 
 static bool is_closid_match(struct task_struct *t, struct rdtgroup *r)
 {
-	return (rdt_alloc_capable && (r->type == RDTCTRL_GROUP) &&
+	return (resctrl_arch_alloc_capable() && (r->type == RDTCTRL_GROUP) &&
 		resctrl_arch_match_closid(t, r->closid));
 }
 
 static bool is_rmid_match(struct task_struct *t, struct rdtgroup *r)
 {
-	return (rdt_mon_capable && (r->type == RDTMON_GROUP) &&
+	return (resctrl_arch_mon_capable() && (r->type == RDTMON_GROUP) &&
 		resctrl_arch_match_rmid(t, r->mon.parent->closid,
 					r->mon.rmid));
 }
@@ -2519,7 +2519,7 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (ret < 0)
 		goto out_schemata_free;
 
-	if (rdt_mon_capable) {
+	if (resctrl_arch_mon_capable()) {
 		ret = mongroup_create_dir(rdtgroup_default.kn,
 					  &rdtgroup_default, "mon_groups",
 					  &kn_mongrp);
@@ -2541,12 +2541,12 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (ret < 0)
 		goto out_psl;
 
-	if (rdt_alloc_capable)
+	if (resctrl_arch_alloc_capable())
 		resctrl_arch_enable_alloc();
-	if (rdt_mon_capable)
+	if (resctrl_arch_mon_capable())
 		resctrl_arch_enable_mon();
 
-	if (rdt_alloc_capable || rdt_mon_capable)
+	if (resctrl_arch_alloc_capable() || resctrl_arch_mon_capable())
 		resctrl_mounted = true;
 
 	if (is_mbm_enabled()) {
@@ -2560,10 +2560,10 @@ static int rdt_get_tree(struct fs_context *fc)
 out_psl:
 	rdt_pseudo_lock_release();
 out_mondata:
-	if (rdt_mon_capable)
+	if (resctrl_arch_mon_capable())
 		kernfs_remove(kn_mondata);
out_mongrp:
-	if (rdt_mon_capable)
+	if (resctrl_arch_mon_capable())
 		kernfs_remove(kn_mongrp);
 out_info:
 	kernfs_remove(kn_info);
@@ -2815,9 +2815,9 @@ static void rdt_kill_sb(struct super_block *sb)
 	rdt_pseudo_lock_release();
 	rdtgroup_default.mode = RDT_MODE_SHAREABLE;
 	schemata_list_destroy();
-	if (rdt_alloc_capable)
+	if (resctrl_arch_alloc_capable())
 		resctrl_arch_disable_alloc();
-	if (rdt_mon_capable)
+	if (resctrl_arch_mon_capable())
 		resctrl_arch_disable_mon();
 	resctrl_mounted = false;
 	kernfs_kill_sb(sb);
@@ -3197,7 +3197,7 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
 {
 	int ret;
 
-	if (!rdt_mon_capable)
+	if (!resctrl_arch_mon_capable())
 		return 0;
 
 	ret = alloc_rmid(rdtgrp->closid);
@@ -3219,7 +3219,7 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
 
 static void mkdir_rdt_prepare_rmid_free(struct rdtgroup *rgrp)
 {
-	if (rdt_mon_capable)
+	if (resctrl_arch_mon_capable())
 		free_rmid(rgrp->closid, rgrp->mon.rmid);
 }
 
@@ -3385,7 +3385,7 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
 
 	list_add(&rdtgrp->rdtgroup_list, &rdt_all_groups);
 
-	if (rdt_mon_capable) {
+	if (resctrl_arch_mon_capable()) {
 		/*
 		 * Create an empty mon_groups directory to hold the subset
 		 * of tasks and cpus to monitor.
@@ -3440,14 +3440,14 @@ static int rdtgroup_mkdir(struct kernfs_node *parent_kn, const char *name,
 	 * allocation is supported, add a control and monitoring
 	 * subdirectory
 	 */
-	if (rdt_alloc_capable && parent_kn == rdtgroup_default.kn)
+	if (resctrl_arch_alloc_capable() && parent_kn == rdtgroup_default.kn)
 		return rdtgroup_mkdir_ctrl_mon(parent_kn, name, mode);
 
 	/*
 	 * If RDT monitoring is supported and the parent directory is a valid
 	 * "mon_groups" directory, add a monitoring subdirectory.
 	 */
-	if (rdt_mon_capable && is_mon_groups(parent_kn, name))
+	if (resctrl_arch_mon_capable() && is_mon_groups(parent_kn, name))
 		return rdtgroup_mkdir_mon(parent_kn, name, mode);
 
 	return -EPERM;
@@ -3779,7 +3779,7 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d)
 	 * If resctrl is mounted, remove all the
 	 * per domain monitor data directories.
 	 */
-	if (resctrl_mounted && static_branch_unlikely(&rdt_mon_enable_key))
+	if (resctrl_mounted && resctrl_arch_mon_capable())
 		rmdir_mondata_subdir_allrdtgrp(r, d->id);
 
 	if (is_mbm_enabled())
@@ -3862,7 +3862,7 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
 	 * by rdt_get_tree() calling mkdir_mondata_all().
 	 * If resctrl is mounted, add per domain monitor data directories.
 	 */
-	if (resctrl_mounted && static_branch_unlikely(&rdt_mon_enable_key))
+	if (resctrl_mounted && resctrl_arch_mon_capable())
 		mkdir_mondata_subdir_allrdtgrp(r, d);
 
 	return 0;
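
With the capable flags behind helpers, the filesystem code compiles
against a single interface and each architecture decides how to answer.
A sketch of what another architecture's header could look like; the
mpam_* names are invented for illustration and are not from this
series:

/* Hypothetical arch header: capability comes from a firmware probe
 * rather than x86's global flags. */
extern bool mpam_has_partid;	/* set while probing the platform */
extern bool mpam_has_pmg;

static inline bool resctrl_arch_alloc_capable(void)
{
	return mpam_has_partid;
}

static inline bool resctrl_arch_mon_capable(void)
{
	return mpam_has_pmg;
}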
From patchwork Fri Jul 28 16:42:50 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127794
From: James Morse
Subject: [PATCH v5 20/24] x86/resctrl: Add cpu online callback for resctrl work
Date: Fri, 28 Jul 2023 16:42:50 +0000
Message-Id: <20230728164254.27562-21-james.morse@arm.com>

The resctrl architecture-specific code may need to create a domain
when a CPU comes online; it also needs to reset the CPU's PQR_ASSOC
register. The resctrl filesystem code needs to update the
rdtgroup_default CPU mask when CPUs are brought online.

Currently this is all done in one function, resctrl_online_cpu().
This will need to be split into architecture and filesystem parts
before resctrl can be moved to /fs/.

Pull the rdtgroup_default update work out as a filesystem specific
cpu_online helper. resctrl_online_cpu() is the obvious name for this,
which means the version in core.c needs renaming.

resctrl_online_cpu() is called by the arch code once it has done the
work to add the new CPU to any domains.

In future patches, resctrl_online_cpu() will take the rdtgroup_mutex
itself.
Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
Changes since v3:
 * Renamed err to ret

Changes since v4:
 * Changes in capitalisation.
---
 arch/x86/kernel/cpu/resctrl/core.c     | 11 ++++++-----
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 ++++++++++
 include/linux/resctrl.h               |  1 +
 3 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 8dfede01b0c9..a694563d3929 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -603,19 +603,20 @@ static void clear_closid_rmid(int cpu)
 	wrmsr(MSR_IA32_PQR_ASSOC, 0, RESCTRL_RESERVED_CLOSID);
 }
 
-static int resctrl_online_cpu(unsigned int cpu)
+static int resctrl_arch_online_cpu(unsigned int cpu)
 {
 	struct rdt_resource *r;
+	int ret;
 
 	mutex_lock(&rdtgroup_mutex);
 	for_each_capable_rdt_resource(r)
 		domain_add_cpu(cpu, r);
-	/* The cpu is set in default rdtgroup after online. */
-	cpumask_set_cpu(cpu, &rdtgroup_default.cpu_mask);
 	clear_closid_rmid(cpu);
+
+	ret = resctrl_online_cpu(cpu);
 	mutex_unlock(&rdtgroup_mutex);
 
-	return 0;
+	return ret;
 }
 
 static void clear_childcpus(struct rdtgroup *r, unsigned int cpu)
@@ -965,7 +966,7 @@ static int __init resctrl_late_init(void)
 
 	state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
 				  "x86/resctrl/cat:online:",
-				  resctrl_online_cpu, resctrl_offline_cpu);
+				  resctrl_arch_online_cpu, resctrl_offline_cpu);
 	if (state < 0)
 		return state;
 
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index fef78a3dc632..7bd3a3dc0f44 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3868,6 +3868,16 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
 	return 0;
 }
 
+int resctrl_online_cpu(unsigned int cpu)
+{
+	lockdep_assert_held(&rdtgroup_mutex);
+
+	/* The CPU is set in default rdtgroup after online. */
+	cpumask_set_cpu(cpu, &rdtgroup_default.cpu_mask);
+
+	return 0;
+}
+
 /*
  * rdtgroup_init - rdtgroup initialization
  *
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 5e4b4df9610b..35d3c97df212 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -223,6 +223,7 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
 			    u32 closid, enum resctrl_conf_type type);
 int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d);
 void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d);
+int resctrl_online_cpu(unsigned int cpu);
 
 /**
 * resctrl_arch_rmid_read() - Read the eventid counter corresponding to rmid
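
After this split the filesystem half is arch-agnostic: any
architecture's hotplug callback can invoke it once its own domain
bookkeeping is done, holding rdtgroup_mutex, which resctrl_online_cpu()
asserts via lockdep. A sketch of a hypothetical non-x86 caller (the
mpam_* names are invented, not from this series):

static int mpam_cpu_online(unsigned int cpu)
{
	int ret;

	mutex_lock(&rdtgroup_mutex);
	mpam_add_cpu_to_domains(cpu);	/* invented arch-side work */
	ret = resctrl_online_cpu(cpu);	/* shared filesystem work */
	mutex_unlock(&rdtgroup_mutex);

	return ret;
}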
From patchwork Fri Jul 28 16:42:51 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127771
From: James Morse
Subject: [PATCH v5 21/24] x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but cpu
Date: Fri, 28 Jul 2023 16:42:51 +0000
Message-Id: <20230728164254.27562-22-james.morse@arm.com>

When a CPU is taken offline resctrl may need to move the overflow or
limbo handlers to run on a different CPU.

Once the offline callbacks have been split, cqm_setup_limbo_handler()
will be called while the CPU that is going offline is still present
in the cpu_mask.

Pass the CPU to exclude to cqm_setup_limbo_handler() and
mbm_setup_overflow_handler(). These functions can use a variant of
cpumask_any_but() when selecting the CPU. -1 is used to indicate no
CPUs need excluding.

A subsequent patch moves these calls to be before CPUs have been
removed, so this exclude_cpu behaviour is temporary.

Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
Changes since v2:
 * Rephrased a comment to avoid a two letter bad-word. (we)
 * Avoid assigning mbm_work_cpu if the domain is going to be free()d
 * Added cpumask_any_housekeeping_but(), I dislike the name

Changes since v3:
 * Marked an explanatory comment as temporary as the subsequent patch is
   no longer adjacent.

Changes since v4:
 * Check against RESCTRL_PICK_ANY_CPU instead of -1.
 * Leave cqm_work_cpu as nr_cpu_ids when no CPU is available.
 * Made cpumask_any_housekeeping_but() more readable.
---
 arch/x86/kernel/cpu/resctrl/core.c     |  8 +++--
 arch/x86/kernel/cpu/resctrl/internal.h | 36 ++++++++++++++++++++--
 arch/x86/kernel/cpu/resctrl/monitor.c  | 42 +++++++++++++++++++++-----
 arch/x86/kernel/cpu/resctrl/rdtgroup.c |  6 ++--
 include/linux/resctrl.h                |  2 ++
 5 files changed, 81 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index a694563d3929..d39572a0a3cd 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -582,12 +582,16 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 	if (r == &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl) {
 		if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
 			cancel_delayed_work(&d->mbm_over);
-			mbm_setup_overflow_handler(d, 0);
+			/*
+			 * temporary: exclude_cpu=-1 as this CPU has already
+			 * been removed by cpumask_clear_cpu()
+			 */
+			mbm_setup_overflow_handler(d, 0, RESCTRL_PICK_ANY_CPU);
 		}
 		if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
 		    has_busy_rmid(d)) {
 			cancel_delayed_work(&d->cqm_limbo);
-			cqm_setup_limbo_handler(d, 0);
+			cqm_setup_limbo_handler(d, 0, RESCTRL_PICK_ANY_CPU);
 		}
 	}
 }
diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h
index f99e0a1f39c8..655418c23c0e 100644
--- a/arch/x86/kernel/cpu/resctrl/internal.h
+++ b/arch/x86/kernel/cpu/resctrl/internal.h
@@ -79,6 +79,36 @@ static inline unsigned int cpumask_any_housekeeping(const struct cpumask *mask)
 	return cpu;
 }
 
+/**
+ * cpumask_any_housekeeping_but() - Choose any CPU in @mask, preferring those
+ *                                  that aren't marked nohz_full, excluding
+ *                                  the provided CPU
+ * @mask:        The mask to pick a CPU from.
+ * @exclude_cpu: The CPU to avoid picking.
+ *
+ * Returns a CPU from @mask, but not @exclude_cpu. If there are housekeeping
+ * CPUs that don't use nohz_full, these are preferred.
+ * Returns >= nr_cpu_ids if no CPUs are available.
+ */
+static inline unsigned int
+cpumask_any_housekeeping_but(const struct cpumask *mask, int exclude_cpu)
+{
+	unsigned int cpu, hk_cpu;
+
+	cpu = cpumask_any_but(mask, exclude_cpu);
+	if (!tick_nohz_full_cpu(cpu))
+		return cpu;
+
+	hk_cpu = cpumask_nth_andnot(0, mask, tick_nohz_full_mask);
+	if (hk_cpu == exclude_cpu)
+		hk_cpu = cpumask_nth_andnot(1, mask, tick_nohz_full_mask);
+
+	if (hk_cpu < nr_cpu_ids)
+		cpu = hk_cpu;
+
+	return cpu;
+}
+
 struct rdt_fs_context {
 	struct kernfs_fs_context	kfc;
 	bool				enable_cdpl2;
@@ -564,11 +594,13 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
 		    struct rdt_domain *d, struct rdtgroup *rdtgrp,
 		    int evtid, int first);
 void mbm_setup_overflow_handler(struct rdt_domain *dom,
-				unsigned long delay_ms);
+				unsigned long delay_ms,
+				int exclude_cpu);
 void mbm_handle_overflow(struct work_struct *work);
 void __init intel_rdt_mbm_apply_quirk(void);
 bool is_mba_sc(struct rdt_resource *r);
-void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms);
+void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms,
+			     int exclude_cpu);
 void cqm_handle_limbo(struct work_struct *work);
 bool has_busy_rmid(struct rdt_domain *d);
 void __check_limbo(struct rdt_domain *d, bool force_free);
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index c0b1ad8d8f6d..471cdc4e4eae 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -493,7 +493,8 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)
 	 * setup up the limbo worker.
 	 */
 	if (!has_busy_rmid(d))
-		cqm_setup_limbo_handler(d, CQM_LIMBOCHECK_INTERVAL);
+		cqm_setup_limbo_handler(d, CQM_LIMBOCHECK_INTERVAL,
+					RESCTRL_PICK_ANY_CPU);
 	set_bit(idx, d->rmid_busy_llc);
 	entry->busy++;
 }
@@ -816,15 +817,28 @@ void cqm_handle_limbo(struct work_struct *work)
 	mutex_unlock(&rdtgroup_mutex);
 }
 
-void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms)
+/**
+ * cqm_setup_limbo_handler() - Schedule the limbo handler to run for this
+ *                             domain.
+ * @delay_ms:    How far in the future the handler should run.
+ * @exclude_cpu: Which CPU the handler should not run on,
+ *               RESCTRL_PICK_ANY_CPU to pick any CPU.
+ */
+void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms,
+			     int exclude_cpu)
 {
 	unsigned long delay = msecs_to_jiffies(delay_ms);
 	int cpu;
 
-	cpu = cpumask_any_housekeeping(&dom->cpu_mask);
+	if (exclude_cpu == RESCTRL_PICK_ANY_CPU)
+		cpu = cpumask_any_housekeeping(&dom->cpu_mask);
+	else
+		cpu = cpumask_any_housekeeping_but(&dom->cpu_mask,
+						   exclude_cpu);
 	dom->cqm_work_cpu = cpu;
 
-	schedule_delayed_work_on(cpu, &dom->cqm_limbo, delay);
+	if (cpu < nr_cpu_ids)
+		schedule_delayed_work_on(cpu, &dom->cqm_limbo, delay);
 }
 
 void mbm_handle_overflow(struct work_struct *work)
@@ -870,7 +884,15 @@ void mbm_handle_overflow(struct work_struct *work)
 	mutex_unlock(&rdtgroup_mutex);
 }
 
-void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)
+/**
+ * mbm_setup_overflow_handler() - Schedule the overflow handler to run for this
+ *                                domain.
+ * @delay_ms:    How far in the future the handler should run.
+ * @exclude_cpu: Which CPU the handler should not run on,
+ *               RESCTRL_PICK_ANY_CPU to pick any CPU.
+ */
+void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms,
+				int exclude_cpu)
 {
 	unsigned long delay = msecs_to_jiffies(delay_ms);
 	int cpu;
@@ -881,9 +903,15 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms)
 	 */
 	if (!resctrl_mounted || !resctrl_arch_mon_capable())
 		return;
-	cpu = cpumask_any_housekeeping(&dom->cpu_mask);
+	if (exclude_cpu == RESCTRL_PICK_ANY_CPU)
+		cpu = cpumask_any_housekeeping(&dom->cpu_mask);
+	else
+		cpu = cpumask_any_housekeeping_but(&dom->cpu_mask,
+						   exclude_cpu);
 	dom->mbm_work_cpu = cpu;
-	schedule_delayed_work_on(cpu, &dom->mbm_over, delay);
+
+	if (cpu < nr_cpu_ids)
+		schedule_delayed_work_on(cpu, &dom->mbm_over, delay);
 }
 
 static int dom_data_init(struct rdt_resource *r)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 7bd3a3dc0f44..dac7ed7ac71a 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -2552,7 +2552,8 @@ static int rdt_get_tree(struct fs_context *fc)
 	if (is_mbm_enabled()) {
 		r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
 		list_for_each_entry(dom, &r->domains, list)
-			mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL);
+			mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL,
+						   RESCTRL_PICK_ANY_CPU);
 	}
 
 	goto out;
@@ -3850,7 +3851,8 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
 
 	if (is_mbm_enabled()) {
 		INIT_DELAYED_WORK(&d->mbm_over, mbm_handle_overflow);
-		mbm_setup_overflow_handler(d, MBM_OVERFLOW_INTERVAL);
+		mbm_setup_overflow_handler(d, MBM_OVERFLOW_INTERVAL,
+					   RESCTRL_PICK_ANY_CPU);
 	}
 
 	if (is_llc_occupancy_enabled())
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 35d3c97df212..56b4645940a7 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -10,6 +10,8 @@
 #define RESCTRL_RESERVED_CLOSID		0
 #define RESCTRL_RESERVED_RMID		0
 
+#define RESCTRL_PICK_ANY_CPU		-1
+
 #ifdef CONFIG_PROC_CPU_RESCTRL
 
 int proc_resctrl_show(struct seq_file *m,
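
A worked example of the selection logic, with illustrative values:
domain cpu_mask = {2,3,5}, CPU 3 going offline (the excluded CPU),
and CPU 2 marked nohz_full:

/* cpumask_any_but(mask, 3) may return 2, but 2 is nohz_full, so:
 *   cpumask_nth_andnot(0, mask, tick_nohz_full_mask) -> 3 (== exclude_cpu)
 *   cpumask_nth_andnot(1, mask, tick_nohz_full_mask) -> 5
 * 5 < nr_cpu_ids, so CPU 5 is chosen and the work lands there. */
cqm_setup_limbo_handler(d, CQM_LIMBOCHECK_INTERVAL, 3);
/* d->cqm_work_cpu == 5 */

If no CPU other than the excluded one remains in the mask, the
>= nr_cpu_ids result makes the callers skip scheduling the work
entirely, which is why cqm_work_cpu can legitimately be left as
nr_cpu_ids.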
From patchwork Fri Jul 28 16:42:52 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127772
From: James Morse
Subject: [PATCH v5 22/24] x86/resctrl: Add cpu offline callback for resctrl work
Date: Fri, 28 Jul 2023 16:42:52 +0000
Message-Id: <20230728164254.27562-23-james.morse@arm.com>

The resctrl architecture-specific code may need to free a domain when
a CPU goes offline; it also needs to reset the CPU's PQR_ASSOC
register. Amongst other things, the resctrl filesystem code needs to
clear this CPU from the cpu_mask of any control and monitor groups.

Currently this is all done in core.c and called from
resctrl_offline_cpu(), making the split between architecture and
filesystem code unclear.

Move the filesystem work to remove the CPU from the control and
monitor groups into a filesystem helper called resctrl_offline_cpu(),
and rename the one in core.c resctrl_arch_offline_cpu().

The rdtgroup_mutex is unlocked and locked again in the call in
preparation for changing the locking rules for the architecture code.

Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/core.c     | 25 +++++--------------------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 24 ++++++++++++++++++++++++
 include/linux/resctrl.h               |  1 +
 3 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index d39572a0a3cd..6eb9408a942a 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -623,31 +623,15 @@ static int resctrl_arch_online_cpu(unsigned int cpu)
 	return ret;
 }
 
-static void clear_childcpus(struct rdtgroup *r, unsigned int cpu)
+static int resctrl_arch_offline_cpu(unsigned int cpu)
 {
-	struct rdtgroup *cr;
-
-	list_for_each_entry(cr, &r->mon.crdtgrp_list, mon.crdtgrp_list) {
-		if (cpumask_test_and_clear_cpu(cpu, &cr->cpu_mask)) {
-			break;
-		}
-	}
-}
-
-static int resctrl_offline_cpu(unsigned int cpu)
-{
-	struct rdtgroup *rdtgrp;
 	struct rdt_resource *r;
 
 	mutex_lock(&rdtgroup_mutex);
+	resctrl_offline_cpu(cpu);
+
 	for_each_capable_rdt_resource(r)
 		domain_remove_cpu(cpu, r);
-	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
-		if (cpumask_test_and_clear_cpu(cpu, &rdtgrp->cpu_mask)) {
-			clear_childcpus(rdtgrp, cpu);
-			break;
-		}
-	}
 	clear_closid_rmid(cpu);
 	mutex_unlock(&rdtgroup_mutex);
 
@@ -970,7 +954,8 @@ static int __init resctrl_late_init(void)
 
 	state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN,
 				  "x86/resctrl/cat:online:",
-				  resctrl_arch_online_cpu, resctrl_offline_cpu);
+				  resctrl_arch_online_cpu,
+				  resctrl_arch_offline_cpu);
 	if (state < 0)
 		return state;
 
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index dac7ed7ac71a..12a628b5d476 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3880,6 +3880,30 @@ int resctrl_online_cpu(unsigned int cpu)
 	return 0;
 }
 
+static void clear_childcpus(struct rdtgroup *r, unsigned int cpu)
+{
+	struct rdtgroup *cr;
+
+	list_for_each_entry(cr, &r->mon.crdtgrp_list, mon.crdtgrp_list) {
+		if (cpumask_test_and_clear_cpu(cpu, &cr->cpu_mask))
+			break;
+	}
+}
+
+void resctrl_offline_cpu(unsigned int cpu)
+{
+	struct rdtgroup *rdtgrp;
+
+	lockdep_assert_held(&rdtgroup_mutex);
+
+	list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
+		if (cpumask_test_and_clear_cpu(cpu, &rdtgrp->cpu_mask)) {
+			clear_childcpus(rdtgrp, cpu);
+			break;
+		}
+	}
+}
+
 /*
  * rdtgroup_init - rdtgroup initialization
  *
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 56b4645940a7..f3ef3ceb9c5e 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -226,6 +226,7 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d,
 int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d);
 void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d);
 int resctrl_online_cpu(unsigned int cpu);
+void resctrl_offline_cpu(unsigned int cpu);
 
 /**
 * resctrl_arch_rmid_read() - Read the eventid counter corresponding to rmid
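
The resulting arch-side ordering is worth spelling out: the filesystem
hook runs first, while the dying CPU is still present in the domain
cpu_masks, and only then does the architecture tear its state down.
Paraphrasing the new resctrl_arch_offline_cpu() above as a sketch:

	mutex_lock(&rdtgroup_mutex);
	resctrl_offline_cpu(cpu);	/* fs: drop cpu from rdtgroups   */

	for_each_capable_rdt_resource(r)
		domain_remove_cpu(cpu, r);	/* arch: may kfree() the domain */
	clear_closid_rmid(cpu);		/* arch: reset PQR_ASSOC         */
	mutex_unlock(&rdtgroup_mutex);

The next patch relies on exactly this ordering when it moves the
overflow/limbo handler migration into resctrl_offline_cpu().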
From patchwork Fri Jul 28 16:42:53 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 127797
From: James Morse
Subject: [PATCH v5 23/24] x86/resctrl: Move domain helper migration into resctrl_offline_cpu()
Date: Fri, 28 Jul 2023 16:42:53 +0000
Message-Id: <20230728164254.27562-24-james.morse@arm.com>

When a CPU is taken offline the resctrl filesystem code needs to check
if it was the CPU nominated to perform the periodic overflow and limbo
work. If so, another CPU needs to be chosen to do this work.

This is currently done in core.c, mixed in with the code that removes
the CPU from the domain's mask, and potentially free()s the domain.

Move the migration of the overflow and limbo helpers into the
filesystem code, into resctrl_offline_cpu(). As resctrl_offline_cpu()
runs before the architecture code has removed the CPU from the domain
mask, the callers need to be told which CPU is being removed, to avoid
picking it as the new CPU. This uses the exclude_cpu feature previously
added.
Signed-off-by: James Morse
---
 arch/x86/kernel/cpu/resctrl/core.c     | 16 ----------------
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 15 +++++++++++++++
 2 files changed, 15 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index 6eb9408a942a..edc0dd123317 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -578,22 +578,6 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 		return;
 	}
-
-	if (r == &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl) {
-		if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
-			cancel_delayed_work(&d->mbm_over);
-			/*
-			 * temporary: exclude_cpu=-1 as this CPU has already
-			 * been removed by cpumask_clear_cpu()
-			 */
-			mbm_setup_overflow_handler(d, 0, RESCTRL_PICK_ANY_CPU);
-		}
-		if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
-		    has_busy_rmid(d)) {
-			cancel_delayed_work(&d->cqm_limbo);
-			cqm_setup_limbo_handler(d, 0, RESCTRL_PICK_ANY_CPU);
-		}
-	}
 }
 
 static void clear_closid_rmid(int cpu)
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 12a628b5d476..a256a96df487 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3892,7 +3892,9 @@ static void clear_childcpus(struct rdtgroup *r, unsigned int cpu)
 
 void resctrl_offline_cpu(unsigned int cpu)
 {
+	struct rdt_domain *d;
 	struct rdtgroup *rdtgrp;
+	struct rdt_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;
 
 	lockdep_assert_held(&rdtgroup_mutex);
 
@@ -3902,6 +3904,19 @@ void resctrl_offline_cpu(unsigned int cpu)
 			break;
 		}
 	}
+
+	d = get_domain_from_cpu(cpu, l3);
+	if (d) {
+		if (is_mbm_enabled() && cpu == d->mbm_work_cpu) {
+			cancel_delayed_work(&d->mbm_over);
+			mbm_setup_overflow_handler(d, 0, cpu);
+		}
+		if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu &&
+		    has_busy_rmid(d)) {
+			cancel_delayed_work(&d->cqm_limbo);
+			cqm_setup_limbo_handler(d, 0, cpu);
+		}
+	}
 }
 
 /*
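
Note why the exclude_cpu argument matters here, where the removed
core.c call sites could use RESCTRL_PICK_ANY_CPU: resctrl_offline_cpu()
runs before domain_remove_cpu(), so the dying CPU is still set in
d->cpu_mask and could otherwise be picked again. An illustrative trace
for CPU 3 going offline while it owns the overflow work:

/* cpu == 3, d->mbm_work_cpu == 3, and 3 is still set in d->cpu_mask */
cancel_delayed_work(&d->mbm_over);
mbm_setup_overflow_handler(d, 0, 3);	/* delay 0: re-run promptly on
					 * any CPU in the domain except
					 * the dying CPU 3 */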
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v5 24/24] x86/resctrl: Separate arch and fs resctrl locks
Date: Fri, 28 Jul 2023 16:42:54 +0000
Message-Id: <20230728164254.27562-25-james.morse@arm.com>
In-Reply-To: <20230728164254.27562-1-james.morse@arm.com>
References: <20230728164254.27562-1-james.morse@arm.com>

resctrl has one mutex that is taken by both the architecture-specific code
and the filesystem parts. The two interact via cpuhp, where the
architecture code updates the domain list.
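[Editorial sketch of the interaction described above, condensed from the
'-' lines of the core.c hunk further down; error handling omitted. It shows
the problem being solved: the arch cpuhp callback takes the filesystem's
rdtgroup_mutex around its domain-list update, exposing a filesystem lock to
the architecture code.]

/* Before this patch: arch cpuhp callback serialises on the fs mutex. */
static int resctrl_arch_online_cpu(unsigned int cpu)
{
        struct rdt_resource *r;
        int ret;

        mutex_lock(&rdtgroup_mutex);            /* fs lock, taken by arch code */
        for_each_capable_rdt_resource(r)
                domain_add_cpu(cpu, r);         /* updates the domain list */
        clear_closid_rmid(cpu);
        ret = resctrl_online_cpu(cpu);          /* fs side, same lock held */
        mutex_unlock(&rdtgroup_mutex);

        return ret;
}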
Filesystem handlers that walk the domains list should not run concurrently
with the cpuhp callback modifying the list. Exposing a lock from the
filesystem code means the interface is not cleanly defined, and creates the
possibility of cross-architecture lock-ordering headaches. The interaction
only exists so that certain filesystem paths are serialised against CPU
hotplug. The CPU hotplug code already has a mechanism to do this using
cpus_read_lock().

MPAM's monitors have an overflow interrupt, so it needs to be possible to
walk the domains list in IRQ context. RCU is ideal for this, but some
paths need to be able to sleep to allocate memory.

Because resctrl_{on,off}line_cpu() take the rdtgroup_mutex as part of a
cpuhp callback, cpus_read_lock() must always be taken first.
rdtgroup_schemata_write() already does this.

Most of the filesystem code's domain list walkers are currently protected
by the rdtgroup_mutex taken in rdtgroup_kn_lock_live(). The exceptions are
rdt_bit_usage_show() and the mon_config helpers, which take the lock
directly.

Make the domain list protected by RCU. An architecture-specific lock
prevents concurrent writers. rdt_bit_usage_show() can then walk the domain
list under rcu_read_lock(). The mon_config helpers send multiple IPIs;
take cpus_read_lock() in these cases.

The other filesystem list walkers need to be able to sleep. Add
cpus_read_lock() to rdtgroup_kn_lock_live() so that the cpuhp callbacks
can't be invoked while filesystem operations are in progress. Add
lockdep_assert_cpus_held() in the cases where the rdtgroup_kn_lock_live()
call isn't obvious.

Resctrl's domain online/offline calls now need to take the rdtgroup_mutex
themselves.

Tested-by: Shaopeng Tan
Signed-off-by: James Morse
---
Changes since v2:
 * Reworded a comment
 * Added a lockdep assertion
 * Moved clear_closid_rmid() outside the locked region of cpu online/offline

Changes since v3:
 * Added a header include
---
 arch/x86/kernel/cpu/resctrl/core.c        | 38 +++++++++-----
 arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 16 ++++--
 arch/x86/kernel/cpu/resctrl/monitor.c     |  4 ++
 arch/x86/kernel/cpu/resctrl/pseudo_lock.c |  3 ++
 arch/x86/kernel/cpu/resctrl/rdtgroup.c    | 64 ++++++++++++++++++++---
 include/linux/resctrl.h                   |  2 +-
 6 files changed, 101 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c
index edc0dd123317..f106c68a9be8 100644
--- a/arch/x86/kernel/cpu/resctrl/core.c
+++ b/arch/x86/kernel/cpu/resctrl/core.c
@@ -25,8 +25,15 @@
 #include
 #include "internal.h"

-/* Mutex to protect rdtgroup access. */
-DEFINE_MUTEX(rdtgroup_mutex);
+/*
+ * rdt_domain structures are kfree()d when their last CPU goes offline,
+ * and allocated when the first CPU in a new domain comes online.
+ * The rdt_resource's domain list is updated when this happens. Readers of
+ * the domain list must either take cpus_read_lock(), or rely on an RCU
+ * read-side critical section, to avoid observing concurrent modification.
+ * All writers take this mutex:
+ */
+static DEFINE_MUTEX(domain_list_lock);

 /*
  * The cached resctrl_pqr_state is strictly per CPU and can never be
@@ -508,6 +515,8 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
        struct rdt_domain *d;
        int err;

+       lockdep_assert_held(&domain_list_lock);
+
        d = rdt_find_domain(r, id, &add_pos);
        if (IS_ERR(d)) {
                pr_warn("Couldn't find cache id for CPU %d\n", cpu);
@@ -541,11 +550,12 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r)
                return;
        }

-       list_add_tail(&d->list, add_pos);
+       list_add_tail_rcu(&d->list, add_pos);

        err = resctrl_online_domain(r, d);
        if (err) {
-               list_del(&d->list);
+               list_del_rcu(&d->list);
+               synchronize_rcu();
                domain_free(hw_dom);
        }
 }
@@ -556,6 +566,8 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
        struct rdt_hw_domain *hw_dom;
        struct rdt_domain *d;

+       lockdep_assert_held(&domain_list_lock);
+
        d = rdt_find_domain(r, id, NULL);
        if (IS_ERR_OR_NULL(d)) {
                pr_warn("Couldn't find cache id for CPU %d\n", cpu);
@@ -566,7 +578,8 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
        cpumask_clear_cpu(cpu, &d->cpu_mask);
        if (cpumask_empty(&d->cpu_mask)) {
                resctrl_offline_domain(r, d);
-               list_del(&d->list);
+               list_del_rcu(&d->list);
+               synchronize_rcu();

                /*
                 * rdt_domain "d" is going to be freed below, so clear
@@ -594,30 +607,29 @@ static void clear_closid_rmid(int cpu)
 static int resctrl_arch_online_cpu(unsigned int cpu)
 {
        struct rdt_resource *r;
-       int ret;

-       mutex_lock(&rdtgroup_mutex);
+       mutex_lock(&domain_list_lock);
        for_each_capable_rdt_resource(r)
                domain_add_cpu(cpu, r);
+       mutex_unlock(&domain_list_lock);
+
        clear_closid_rmid(cpu);
-       ret = resctrl_online_cpu(cpu);
-       mutex_unlock(&rdtgroup_mutex);
-
-       return ret;
+       return resctrl_online_cpu(cpu);
 }

 static int resctrl_arch_offline_cpu(unsigned int cpu)
 {
        struct rdt_resource *r;

-       mutex_lock(&rdtgroup_mutex);
        resctrl_offline_cpu(cpu);

+       mutex_lock(&domain_list_lock);
        for_each_capable_rdt_resource(r)
                domain_remove_cpu(cpu, r);
+       mutex_unlock(&domain_list_lock);
+
        clear_closid_rmid(cpu);
-       mutex_unlock(&rdtgroup_mutex);

        return 0;
 }

diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
index 55bad57a7bd5..b4f611359d1e 100644
--- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
+++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c
@@ -209,6 +209,9 @@ static int parse_line(char *line, struct resctrl_schema *s,
        struct rdt_domain *d;
        unsigned long dom_id;

+       /* Walking r->domains, ensure it can't race with cpuhp */
+       lockdep_assert_cpus_held();
+
        if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP &&
            (r->rid == RDT_RESOURCE_MBA || r->rid == RDT_RESOURCE_SMBA)) {
                rdt_last_cmd_puts("Cannot pseudo-lock MBA resource\n");
@@ -313,6 +316,9 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
        struct rdt_domain *d;
        u32 idx;

+       /* Walking r->domains, ensure it can't race with cpuhp */
+       lockdep_assert_cpus_held();
+
        if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
                return -ENOMEM;

@@ -378,11 +384,9 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
                return -EINVAL;
        buf[nbytes - 1] = '\0';

-       cpus_read_lock();
        rdtgrp = rdtgroup_kn_lock_live(of->kn);
        if (!rdtgrp) {
                rdtgroup_kn_unlock(of->kn);
-               cpus_read_unlock();
                return -ENOENT;
        }
        rdt_last_cmd_clear();
@@ -444,7 +448,6 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of,
 out:
        rdt_staged_configs_clear();
        rdtgroup_kn_unlock(of->kn);
-       cpus_read_unlock();
        return ret ?: nbytes;
 }

@@ -464,6 +467,9 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int closid)
        bool sep = false;
        u32 ctrl_val;

+       /* Walking r->domains, ensure it can't race with cpuhp */
+       lockdep_assert_cpus_held();
+
        seq_printf(s, "%*s:", max_name_width, schema->name);
        list_for_each_entry(dom, &r->domains, list) {
                if (sep)
@@ -534,8 +540,8 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
 {
        int cpu;

-       /* When picking a CPU from cpu_mask, ensure it can't race with cpuhp */
-       lockdep_assert_held(&rdtgroup_mutex);
+       /* When picking a cpu from cpu_mask, ensure it can't race with cpuhp */
+       lockdep_assert_cpus_held();

        /*
         * Setup the parameters to pass to mon_event_count() to read the data.

diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 471cdc4e4eae..11b5da93044d 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -15,6 +15,7 @@
  * Software Developer Manual June 2016, volume 3, section 17.17.
  */

+#include
 #include
 #include
 #include
@@ -484,6 +485,9 @@ static void add_rmid_to_limbo(struct rmid_entry *entry)

        lockdep_assert_held(&rdtgroup_mutex);

+       /* Walking r->domains, ensure it can't race with cpuhp */
+       lockdep_assert_cpus_held();
+
        idx = resctrl_arch_rmid_idx_encode(entry->closid, entry->rmid);

        entry->busy = 0;

diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
index 460421051abf..fc3ed917d173 100644
--- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
+++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c
@@ -830,6 +830,9 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d)
        struct rdt_domain *d_i;
        bool ret = false;

+       /* Walking r->domains, ensure it can't race with cpuhp */
+       lockdep_assert_cpus_held();
+
        if (!zalloc_cpumask_var(&cpu_with_psl, GFP_KERNEL))
                return true;

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index a256a96df487..47dcf2cb76ca 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -35,6 +35,10 @@ DEFINE_STATIC_KEY_FALSE(rdt_enable_key);
 DEFINE_STATIC_KEY_FALSE(rdt_mon_enable_key);
 DEFINE_STATIC_KEY_FALSE(rdt_alloc_enable_key);

+/* Mutex to protect rdtgroup access. */
+DEFINE_MUTEX(rdtgroup_mutex);
+
 static struct kernfs_root *rdt_root;
 struct rdtgroup rdtgroup_default;
 LIST_HEAD(rdt_all_groups);
@@ -954,7 +958,8 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
        mutex_lock(&rdtgroup_mutex);
        hw_shareable = r->cache.shareable_bits;
-       list_for_each_entry(dom, &r->domains, list) {
+       rcu_read_lock();
+       list_for_each_entry_rcu(dom, &r->domains, list) {
                if (sep)
                        seq_putc(seq, ';');
                sw_shareable = 0;
@@ -1010,8 +1015,10 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of,
                }
                sep = true;
        }
+       rcu_read_unlock();
        seq_putc(seq, '\n');
        mutex_unlock(&rdtgroup_mutex);
+
        return 0;
 }
@@ -1254,6 +1261,9 @@ static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp)
        struct rdt_domain *d;
        u32 ctrl;

+       /* Walking r->domains, ensure it can't race with cpuhp */
+       lockdep_assert_cpus_held();
+
        list_for_each_entry(s, &resctrl_schema_all, list) {
                r = s->res;
                if (r->rid == RDT_RESOURCE_MBA || r->rid == RDT_RESOURCE_SMBA)
@@ -1520,6 +1530,7 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid)
        struct rdt_domain *dom;
        bool sep = false;

+       cpus_read_lock();
        mutex_lock(&rdtgroup_mutex);

        list_for_each_entry(dom, &r->domains, list) {
@@ -1536,6 +1547,7 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid)
        seq_puts(s, "\n");

        mutex_unlock(&rdtgroup_mutex);
+       cpus_read_unlock();

        return 0;
 }
@@ -1627,6 +1639,9 @@ static int mon_config_write(struct rdt_resource *r, char *tok, u32 evtid)
        struct rdt_domain *d;
        int ret = 0;

+       /* Walking r->domains, ensure it can't race with cpuhp */
+       lockdep_assert_cpus_held();
+
 next:
        if (!tok || tok[0] == '\0')
                return 0;
@@ -1668,6 +1683,7 @@ static ssize_t mbm_total_bytes_config_write(struct kernfs_open_file *of,
        if (nbytes == 0 || buf[nbytes - 1] != '\n')
                return -EINVAL;

+       cpus_read_lock();
        mutex_lock(&rdtgroup_mutex);

        rdt_last_cmd_clear();
@@ -1677,6 +1693,7 @@ static ssize_t mbm_total_bytes_config_write(struct kernfs_open_file *of,
        ret = mon_config_write(r, buf, QOS_L3_MBM_TOTAL_EVENT_ID);

        mutex_unlock(&rdtgroup_mutex);
+       cpus_read_unlock();

        return ret ?: nbytes;
 }
@@ -1692,6 +1709,7 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
        if (nbytes == 0 || buf[nbytes - 1] != '\n')
                return -EINVAL;

+       cpus_read_lock();
        mutex_lock(&rdtgroup_mutex);

        rdt_last_cmd_clear();
@@ -1701,6 +1719,7 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of,
        ret = mon_config_write(r, buf, QOS_L3_MBM_LOCAL_EVENT_ID);

        mutex_unlock(&rdtgroup_mutex);
+       cpus_read_unlock();

        return ret ?: nbytes;
 }
@@ -2153,6 +2172,9 @@ static int set_cache_qos_cfg(int level, bool enable)
        struct rdt_domain *d;
        int cpu;

+       /* Walking r->domains, ensure it can't race with cpuhp */
+       lockdep_assert_cpus_held();
+
        if (level == RDT_RESOURCE_L3)
                update = l3_qos_cfg_update;
        else if (level == RDT_RESOURCE_L2)
@@ -2360,6 +2382,7 @@ struct rdtgroup *rdtgroup_kn_lock_live(struct kernfs_node *kn)

        rdtgroup_kn_get(rdtgrp, kn);

+       cpus_read_lock();
        mutex_lock(&rdtgroup_mutex);

        /* Was this group deleted while we waited? */
@@ -2377,6 +2400,8 @@
                return;

        mutex_unlock(&rdtgroup_mutex);
+       cpus_read_unlock();
+
        rdtgroup_kn_put(rdtgrp, kn);
 }
@@ -2664,6 +2689,9 @@ static int reset_all_ctrls(struct rdt_resource *r)
        struct rdt_domain *d;
        int i;

+       /* Walking r->domains, ensure it can't race with cpuhp */
+       lockdep_assert_cpus_held();
+
        if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
                return -ENOMEM;
@@ -2948,6 +2976,9 @@ static int mkdir_mondata_subdir_alldom(struct kernfs_node *parent_kn,
        struct rdt_domain *dom;
        int ret;

+       /* Walking r->domains, ensure it can't race with cpuhp */
+       lockdep_assert_cpus_held();
+
        list_for_each_entry(dom, &r->domains, list) {
                ret = mkdir_mondata_subdir(parent_kn, dom, r, prgrp);
                if (ret)
@@ -3766,7 +3797,8 @@ static void domain_destroy_mon_state(struct rdt_domain *d)
        kfree(d->mbm_local);
 }

-void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d)
+static void _resctrl_offline_domain(struct rdt_resource *r,
+                                   struct rdt_domain *d)
 {
        lockdep_assert_held(&rdtgroup_mutex);
@@ -3801,6 +3833,13 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d)
                domain_destroy_mon_state(d);
 }

+void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d)
+{
+       mutex_lock(&rdtgroup_mutex);
+       _resctrl_offline_domain(r, d);
+       mutex_unlock(&rdtgroup_mutex);
+}
+
 static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d)
 {
        u32 idx_limit = resctrl_arch_system_num_rmid_idx();
@@ -3832,7 +3871,7 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d)
        return 0;
 }

-int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
+static int _resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
 {
        int err;
@@ -3870,12 +3909,23 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
        return 0;
 }

+int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d)
+{
+       int err;
+
+       mutex_lock(&rdtgroup_mutex);
+       err = _resctrl_online_domain(r, d);
+       mutex_unlock(&rdtgroup_mutex);
+
+       return err;
+}
+
 int resctrl_online_cpu(unsigned int cpu)
 {
-       lockdep_assert_held(&rdtgroup_mutex);
-
+       mutex_lock(&rdtgroup_mutex);
        /* The CPU is set in default rdtgroup after online. */
        cpumask_set_cpu(cpu, &rdtgroup_default.cpu_mask);
+       mutex_unlock(&rdtgroup_mutex);

        return 0;
 }
@@ -3896,8 +3946,7 @@ void resctrl_offline_cpu(unsigned int cpu)
        struct rdtgroup *rdtgrp;
        struct rdt_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl;

-       lockdep_assert_held(&rdtgroup_mutex);
-
+       mutex_lock(&rdtgroup_mutex);
        list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) {
                if (cpumask_test_and_clear_cpu(cpu, &rdtgrp->cpu_mask)) {
                        clear_childcpus(rdtgrp, cpu);
@@ -3917,6 +3966,7 @@
                        cqm_setup_limbo_handler(d, 0, cpu);
                }
        }
+       mutex_unlock(&rdtgroup_mutex);
 }

 /*

diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index f3ef3ceb9c5e..2bbdd3d591ea 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -159,7 +159,7 @@ struct resctrl_schema;
  * @cache_level:   Which cache level defines scope of this resource
  * @cache:         Cache allocation related data
  * @membw:         If the component has bandwidth controls, their properties.
- * @domains:       All domains for this resource
+ * @domains:       RCU list of all domains for this resource
  * @name:          Name to use in "schemata" file.
  * @data_width:    Character width of data when displaying
  * @default_ctrl:  Specifies default cache cbm or memory B/W percent.
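
[Editorial summary of the locking scheme this final patch establishes,
condensed from the hunks above. The locking primitives are the ones the
patch actually uses; do_something_sleepable() and do_something_atomic()
are placeholders for whatever work a given path performs.]

/*
 * 1. Sleepable filesystem paths: hold cpus_read_lock(), directly or via
 *    rdtgroup_kn_lock_live(), so the cpuhp callbacks cannot modify the
 *    domain list underneath them.
 */
lockdep_assert_cpus_held();
list_for_each_entry(d, &r->domains, list)
        do_something_sleepable(d);              /* placeholder */

/*
 * 2. Non-sleeping readers, e.g. MPAM's overflow interrupt: an RCU
 *    read-side critical section is sufficient.
 */
rcu_read_lock();
list_for_each_entry_rcu(d, &r->domains, list)
        do_something_atomic(d);                 /* placeholder */
rcu_read_unlock();

/*
 * 3. Writers: the architecture's domain_list_lock serialises updates,
 *    which are published with list_add_tail_rcu()/list_del_rcu(), with
 *    synchronize_rcu() before a removed domain is freed.
 */
mutex_lock(&domain_list_lock);
list_add_tail_rcu(&d->list, &r->domains);
mutex_unlock(&domain_list_lock);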