From patchwork Mon Mar 20 17:26:02 2023
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi@huawei.com,
    D Scott Phillips OS, carl@os.amperecomputing.com, lcherian@marvell.com,
    bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org,
    baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao, peternewman@google.com
Subject: [PATCH v3 01/19] x86/resctrl: Track the closid with the rmid
Date: Mon, 20 Mar 2023 17:26:02 +0000
Message-Id: <20230320172620.18254-2-james.morse@arm.com>
In-Reply-To: <20230320172620.18254-1-james.morse@arm.com>
References: <20230320172620.18254-1-james.morse@arm.com>

x86's RMIDs are independent of the CLOSID. An RMID can be allocated, used
and freed without considering the CLOSID.

MPAM's equivalent feature is PMG, which is not an independent number: it
extends the CLOSID/PARTID space. For MPAM, only PMG-bits worth of 'RMID'
can be allocated for a single CLOSID, i.e. if there is 1 bit of PMG space,
then each CLOSID can have two monitor groups.

To allow resctrl to disambiguate RMID values for different CLOSIDs,
everything in resctrl that keeps an RMID value needs to know the CLOSID
too. This will always be ignored on x86.

Tested-by: Shaopeng Tan
Reviewed-by: Xin Hao
Signed-off-by: James Morse
---
Is there a better term for 'the unique identifier for a monitor group'?
Using RMID for that here may be confusing...
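The PMG arithmetic above can be spelled out with a small sketch. This is
purely illustrative (the helper name and its pmg_bits parameter are
assumptions for the example, not part of this patch or of MPAM support):

/*
 * Illustrative only: PMG extends the PARTID/CLOSID space, so each CLOSID
 * gets 1 << pmg_bits monitor groups, e.g. 1 bit of PMG -> 2 groups.
 */
static inline unsigned int example_rmids_per_closid(unsigned int pmg_bits)
{
        return 1U << pmg_bits;
}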
Changes since v1: * Added comment in struct rmid_entry Changes since v2: * Moved X86_RESCTRL_BAD_CLOSID from a subsequent patch --- arch/x86/include/asm/resctrl.h | 7 +++ arch/x86/kernel/cpu/resctrl/internal.h | 2 +- arch/x86/kernel/cpu/resctrl/monitor.c | 59 ++++++++++++++--------- arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 4 +- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 12 ++--- include/linux/resctrl.h | 11 ++++- 6 files changed, 61 insertions(+), 34 deletions(-) diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h index 52788f79786f..cbe986d23df6 100644 --- a/arch/x86/include/asm/resctrl.h +++ b/arch/x86/include/asm/resctrl.h @@ -7,6 +7,13 @@ #include #include +/* + * This value can never be a valid CLOSID, and is used when mapping a + * (closid, rmid) pair to an index and back. On x86 only the RMID is + * needed. + */ +#define X86_RESCTRL_BAD_CLOSID ((u32)~0) + /** * struct resctrl_pqr_state - State cache for the PQR MSR * @cur_rmid: The cached Resource Monitoring ID diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 8edecc5763d8..c64097947994 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -535,7 +535,7 @@ struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r); int closids_supported(void); void closid_free(int closid); int alloc_rmid(void); -void free_rmid(u32 rmid); +void free_rmid(u32 closid, u32 rmid); int rdt_get_mon_l3_config(struct rdt_resource *r); bool __init rdt_cpu_has(int flag); void mon_event_count(void *info); diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 7fe51488e136..18c37d364030 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -25,6 +25,12 @@ #include "internal.h" struct rmid_entry { + /* + * Some architectures's resctrl_arch_rmid_read() needs the CLOSID value + * in order to access the correct monitor. This field provides the + * value to list walkers like __check_limbo(). On x86 this is ignored. 
+ */ + u32 closid; u32 rmid; int busy; struct list_head list; @@ -136,7 +142,7 @@ static inline u64 get_corrected_mbm_count(u32 rmid, unsigned long val) return val; } -static inline struct rmid_entry *__rmid_entry(u32 rmid) +static inline struct rmid_entry *__rmid_entry(u32 closid, u32 rmid) { struct rmid_entry *entry; @@ -190,7 +196,8 @@ static struct arch_mbm_state *get_arch_mbm_state(struct rdt_hw_domain *hw_dom, } void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d, - u32 rmid, enum resctrl_event_id eventid) + u32 closid, u32 rmid, + enum resctrl_event_id eventid) { struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d); struct arch_mbm_state *am; @@ -230,7 +237,8 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width) } int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, - u32 rmid, enum resctrl_event_id eventid, u64 *val) + u32 closid, u32 rmid, enum resctrl_event_id eventid, + u64 *val) { struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d); @@ -285,9 +293,9 @@ void __check_limbo(struct rdt_domain *d, bool force_free) if (nrmid >= r->num_rmid) break; - entry = __rmid_entry(nrmid); + entry = __rmid_entry(X86_RESCTRL_BAD_CLOSID, nrmid);// temporary - if (resctrl_arch_rmid_read(r, d, entry->rmid, + if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid, QOS_L3_OCCUP_EVENT_ID, &val)) { rmid_dirty = true; } else { @@ -342,7 +350,8 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) cpu = get_cpu(); list_for_each_entry(d, &r->domains, list) { if (cpumask_test_cpu(cpu, &d->cpu_mask)) { - err = resctrl_arch_rmid_read(r, d, entry->rmid, + err = resctrl_arch_rmid_read(r, d, entry->closid, + entry->rmid, QOS_L3_OCCUP_EVENT_ID, &val); if (err || val <= resctrl_rmid_realloc_threshold) @@ -366,7 +375,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) list_add_tail(&entry->list, &rmid_free_lru); } -void free_rmid(u32 rmid) +void free_rmid(u32 closid, u32 rmid) { struct rmid_entry *entry; @@ -375,7 +384,7 @@ void free_rmid(u32 rmid) lockdep_assert_held(&rdtgroup_mutex); - entry = __rmid_entry(rmid); + entry = __rmid_entry(closid, rmid); if (is_llc_occupancy_enabled()) add_rmid_to_limbo(entry); @@ -383,15 +392,16 @@ void free_rmid(u32 rmid) list_add_tail(&entry->list, &rmid_free_lru); } -static int __mon_event_count(u32 rmid, struct rmid_read *rr) +static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr) { struct mbm_state *m; u64 tval = 0; if (rr->first) - resctrl_arch_reset_rmid(rr->r, rr->d, rmid, rr->evtid); + resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid); - rr->err = resctrl_arch_rmid_read(rr->r, rr->d, rmid, rr->evtid, &tval); + rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid, rr->evtid, + &tval); if (rr->err) return rr->err; @@ -434,7 +444,7 @@ static int __mon_event_count(u32 rmid, struct rmid_read *rr) * __mon_event_count() is compared with the chunks value from the previous * invocation. This must be called once per second to maintain values in MBps. 
*/ -static void mbm_bw_count(u32 rmid, struct rmid_read *rr) +static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr) { struct mbm_state *m = &rr->d->mbm_local[rmid]; u64 cur_bw, bytes, cur_bytes; @@ -464,7 +474,7 @@ void mon_event_count(void *info) rdtgrp = rr->rgrp; - ret = __mon_event_count(rdtgrp->mon.rmid, rr); + ret = __mon_event_count(rdtgrp->closid, rdtgrp->mon.rmid, rr); /* * For Ctrl groups read data from child monitor groups and @@ -475,7 +485,8 @@ void mon_event_count(void *info) if (rdtgrp->type == RDTCTRL_GROUP) { list_for_each_entry(entry, head, mon.crdtgrp_list) { - if (__mon_event_count(entry->mon.rmid, rr) == 0) + if (__mon_event_count(rdtgrp->closid, entry->mon.rmid, + rr) == 0) ret = 0; } } @@ -605,7 +616,8 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm) } } -static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, int rmid) +static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, + u32 closid, u32 rmid) { struct rmid_read rr; @@ -620,12 +632,12 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, int rmid) if (is_mbm_total_enabled()) { rr.evtid = QOS_L3_MBM_TOTAL_EVENT_ID; rr.val = 0; - __mon_event_count(rmid, &rr); + __mon_event_count(closid, rmid, &rr); } if (is_mbm_local_enabled()) { rr.evtid = QOS_L3_MBM_LOCAL_EVENT_ID; rr.val = 0; - __mon_event_count(rmid, &rr); + __mon_event_count(closid, rmid, &rr); /* * Call the MBA software controller only for the @@ -633,7 +645,7 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, int rmid) * the software controller explicitly. */ if (is_mba_sc(NULL)) - mbm_bw_count(rmid, &rr); + mbm_bw_count(closid, rmid, &rr); } } @@ -690,11 +702,11 @@ void mbm_handle_overflow(struct work_struct *work) d = container_of(work, struct rdt_domain, mbm_over.work); list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) { - mbm_update(r, d, prgrp->mon.rmid); + mbm_update(r, d, prgrp->closid, prgrp->mon.rmid); head = &prgrp->mon.crdtgrp_list; list_for_each_entry(crgrp, head, mon.crdtgrp_list) - mbm_update(r, d, crgrp->mon.rmid); + mbm_update(r, d, crgrp->closid, crgrp->mon.rmid); if (is_mba_sc(NULL)) update_mba_bw(prgrp, d); @@ -737,10 +749,11 @@ static int dom_data_init(struct rdt_resource *r) } /* - * RMID 0 is special and is always allocated. It's used for all - * tasks that are not monitored. + * RMID 0 is special and is always allocated. It's used for the + * default_rdtgroup control group, which will be setup later. See + * rdtgroup_setup_root(). */ - entry = __rmid_entry(0); + entry = __rmid_entry(0, 0); list_del(&entry->list); return 0; diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c index 524f8ff3e69c..c51932516965 100644 --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -738,7 +738,7 @@ int rdtgroup_locksetup_enter(struct rdtgroup *rdtgrp) * anymore when this group would be used for pseudo-locking. This * is safe to call on platforms not capable of monitoring. 
*/ - free_rmid(rdtgrp->mon.rmid); + free_rmid(rdtgrp->closid, rdtgrp->mon.rmid); ret = 0; goto out; @@ -773,7 +773,7 @@ int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp) ret = rdtgroup_locksetup_user_restore(rdtgrp); if (ret) { - free_rmid(rdtgrp->mon.rmid); + free_rmid(rdtgrp->closid, rdtgrp->mon.rmid); return ret; } diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index e2c1599d1b37..23e6b3a373b0 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -2688,7 +2688,7 @@ static void free_all_child_rdtgrp(struct rdtgroup *rdtgrp) head = &rdtgrp->mon.crdtgrp_list; list_for_each_entry_safe(sentry, stmp, head, mon.crdtgrp_list) { - free_rmid(sentry->mon.rmid); + free_rmid(sentry->closid, sentry->mon.rmid); list_del(&sentry->mon.crdtgrp_list); if (atomic_read(&sentry->waitcount) != 0) @@ -2728,7 +2728,7 @@ static void rmdir_all_sub(void) cpumask_or(&rdtgroup_default.cpu_mask, &rdtgroup_default.cpu_mask, &rdtgrp->cpu_mask); - free_rmid(rdtgrp->mon.rmid); + free_rmid(rdtgrp->closid, rdtgrp->mon.rmid); kernfs_remove(rdtgrp->kn); list_del(&rdtgrp->rdtgroup_list); @@ -3222,7 +3222,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn, return 0; out_idfree: - free_rmid(rdtgrp->mon.rmid); + free_rmid(rdtgrp->closid, rdtgrp->mon.rmid); out_destroy: kernfs_put(rdtgrp->kn); kernfs_remove(rdtgrp->kn); @@ -3236,7 +3236,7 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn, static void mkdir_rdt_prepare_clean(struct rdtgroup *rgrp) { kernfs_remove(rgrp->kn); - free_rmid(rgrp->mon.rmid); + free_rmid(rgrp->closid, rgrp->mon.rmid); rdtgroup_remove(rgrp); } @@ -3385,7 +3385,7 @@ static int rdtgroup_rmdir_mon(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask) update_closid_rmid(tmpmask, NULL); rdtgrp->flags = RDT_DELETED; - free_rmid(rdtgrp->mon.rmid); + free_rmid(rdtgrp->closid, rdtgrp->mon.rmid); /* * Remove the rdtgrp from the parent ctrl_mon group's list @@ -3431,8 +3431,8 @@ static int rdtgroup_rmdir_ctrl(struct rdtgroup *rdtgrp, cpumask_var_t tmpmask) cpumask_or(tmpmask, tmpmask, &rdtgrp->cpu_mask); update_closid_rmid(tmpmask, NULL); + free_rmid(rdtgrp->closid, rdtgrp->mon.rmid); closid_free(rdtgrp->closid); - free_rmid(rdtgrp->mon.rmid); rdtgroup_ctrl_remove(rdtgrp); diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index 8334eeacfec5..7d80bae05f59 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -225,6 +225,8 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d); * for this resource and domain. * @r: resource that the counter should be read from. * @d: domain that the counter should be read from. + * @closid: closid that matches the rmid. The counter may + * match traffic of both closid and rmid, or rmid only. * @rmid: rmid of the counter to read. * @eventid: eventid to read, e.g. L3 occupancy. * @val: result of the counter read in bytes. @@ -235,20 +237,25 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d); * 0 on success, or -EIO, -EINVAL etc on error. */ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, - u32 rmid, enum resctrl_event_id eventid, u64 *val); + u32 closid, u32 rmid, enum resctrl_event_id eventid, + u64 *val); + /** * resctrl_arch_reset_rmid() - Reset any private state associated with rmid * and eventid. * @r: The domain's resource. * @d: The rmid's domain. + * @closid: The closid that matches the rmid. Counters may match both + * closid and rmid, or rmid only. 
 * @rmid:      The rmid whose counter values should be reset.
 * @eventid:   The eventid whose counter values should be reset.
 *
 * This can be called from any CPU.
 */
 void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_domain *d,
-                            u32 rmid, enum resctrl_event_id eventid);
+                            u32 closid, u32 rmid,
+                            enum resctrl_event_id eventid);

 /**
  * resctrl_arch_reset_rmid_all() - Reset all private state associated with

From patchwork Mon Mar 20 17:26:03 2023
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi@huawei.com,
    D Scott Phillips OS, carl@os.amperecomputing.com, lcherian@marvell.com,
    bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org,
    baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao, peternewman@google.com
Subject: [PATCH v3 02/19] x86/resctrl: Access per-rmid structures by index
Date: Mon, 20 Mar 2023 17:26:03 +0000
Message-Id: <20230320172620.18254-3-james.morse@arm.com>
In-Reply-To: <20230320172620.18254-1-james.morse@arm.com>
References: <20230320172620.18254-1-james.morse@arm.com>

Because of the differences between Intel RDT/AMD QoS and Arm's MPAM
monitors, RMID values on arm64 are not unique unless the CLOSID is also
included. Bitmaps like rmid_busy_llc need to be sized by the number of
unique entries for this resource.

Add helpers to encode/decode the CLOSID and RMID to an index. The domain's
rmid_busy_llc and the rmid_ptrs[] array are then sized by index, as are the
domain mbm_local and mbm_total arrays. On x86, the index is always just the
RMID, so all these structures remain the same size.

The index gives resctrl a unique value it can use to store monitor values,
and allows MPAM to decode the closid when reading the hardware counters.
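One plausible way to picture such an index on an MPAM-like system is a block
of RMID values per CLOSID. The sketch below is illustrative only: the names
and the assumed 4-bit PMG space are not the arm64 implementation, and the x86
helpers added by this patch simply return the RMID unchanged.

/* Illustrative sketch: index = closid * num_rmid + rmid. */
#define EXAMPLE_NUM_RMID        16      /* hypothetical 4 bits of PMG space */

static inline unsigned int example_idx_encode(unsigned int closid,
                                              unsigned int rmid)
{
        return closid * EXAMPLE_NUM_RMID + rmid;
}

static inline void example_idx_decode(unsigned int idx, unsigned int *closid,
                                      unsigned int *rmid)
{
        *closid = idx / EXAMPLE_NUM_RMID;
        *rmid = idx % EXAMPLE_NUM_RMID;
}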
Tested-by: Shaopeng Tan Signed-off-by: James Morse --- Changes since v1: * Added X86_BAD_CLOSID macro to make it clear what this value means * Added second WARN_ON() for closid checking, and made both _ONCE() Changes since v2: * Added RESCTRL_RESERVED_CLOSID * Removed a newline * Repharsed some comments * Renamed a variable 'ignore'd * Moved X86_RESCTRL_BAD_CLOSID to a previous patch --- arch/x86/include/asm/resctrl.h | 17 ++++++ arch/x86/kernel/cpu/resctrl/core.c | 2 +- arch/x86/kernel/cpu/resctrl/internal.h | 1 + arch/x86/kernel/cpu/resctrl/monitor.c | 83 +++++++++++++++++--------- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 7 ++- include/linux/resctrl.h | 3 + 6 files changed, 82 insertions(+), 31 deletions(-) diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h index cbe986d23df6..3ca40be41a0a 100644 --- a/arch/x86/include/asm/resctrl.h +++ b/arch/x86/include/asm/resctrl.h @@ -101,6 +101,23 @@ static inline void resctrl_sched_in(void) __resctrl_sched_in(); } +static inline u32 resctrl_arch_system_num_rmid_idx(void) +{ + /* RMID are independent numbers for x86. num_rmid_idx==num_rmid */ + return boot_cpu_data.x86_cache_max_rmid + 1; +} + +static inline void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid) +{ + *rmid = idx; + *closid = X86_RESCTRL_BAD_CLOSID; +} + +static inline u32 resctrl_arch_rmid_idx_encode(u32 ignored, u32 rmid) +{ + return rmid; +} + void resctrl_cpu_detect(struct cpuinfo_x86 *c); #else diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index 030d3b409768..351319403f84 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -600,7 +600,7 @@ static void clear_closid_rmid(int cpu) state->default_rmid = 0; state->cur_closid = 0; state->cur_rmid = 0; - wrmsr(MSR_IA32_PQR_ASSOC, 0, 0); + wrmsr(MSR_IA32_PQR_ASSOC, RESCTRL_RESERVED_CLOSID, 0); } static int resctrl_online_cpu(unsigned int cpu) diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index c64097947994..47506e2afd59 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -7,6 +7,7 @@ #include #include #include +#include #define L3_QOS_CDP_ENABLE 0x01ULL diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 18c37d364030..03a7d13dd653 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -142,12 +142,29 @@ static inline u64 get_corrected_mbm_count(u32 rmid, unsigned long val) return val; } -static inline struct rmid_entry *__rmid_entry(u32 closid, u32 rmid) +/* + * x86 and arm64 differ in their handling of monitoring. + * x86's RMID are an independent number, there is only one source of traffic + * an RMID value of '1'. + * arm64's PMG extend the PARTID/CLOSID space, there are multiple sources of + * traffic with a PMG value of '1', one for each CLOSID, meaining the RMID + * value is no longer unique. + * To account for this, resctrl uses an index. On x86 this is just the RMID, + * on arm64 it encodes the CLOSID and RMID. This gives a unique number. + * + * The domain's rmid_busy_llc and rmid_ptrs are sized by index. The arch code + * must accept an attempt to read every index. 
+ */ +static inline struct rmid_entry *__rmid_entry(u32 idx) { struct rmid_entry *entry; + u32 closid, rmid; - entry = &rmid_ptrs[rmid]; - WARN_ON(entry->rmid != rmid); + entry = &rmid_ptrs[idx]; + resctrl_arch_rmid_idx_decode(idx, &closid, &rmid); + + WARN_ON_ONCE(entry->closid != closid); + WARN_ON_ONCE(entry->rmid != rmid); return entry; } @@ -277,8 +294,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, void __check_limbo(struct rdt_domain *d, bool force_free) { struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; + u32 idx_limit = resctrl_arch_system_num_rmid_idx(); struct rmid_entry *entry; - u32 crmid = 1, nrmid; + u32 idx, cur_idx = 1; bool rmid_dirty; u64 val = 0; @@ -289,12 +307,11 @@ void __check_limbo(struct rdt_domain *d, bool force_free) * RMID and move it to the free list when the counter reaches 0. */ for (;;) { - nrmid = find_next_bit(d->rmid_busy_llc, r->num_rmid, crmid); - if (nrmid >= r->num_rmid) + idx = find_next_bit(d->rmid_busy_llc, idx_limit, cur_idx); + if (idx >= idx_limit) break; - entry = __rmid_entry(X86_RESCTRL_BAD_CLOSID, nrmid);// temporary - + entry = __rmid_entry(idx); if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid, QOS_L3_OCCUP_EVENT_ID, &val)) { rmid_dirty = true; @@ -303,19 +320,21 @@ void __check_limbo(struct rdt_domain *d, bool force_free) } if (force_free || !rmid_dirty) { - clear_bit(entry->rmid, d->rmid_busy_llc); + clear_bit(idx, d->rmid_busy_llc); if (!--entry->busy) { rmid_limbo_count--; list_add_tail(&entry->list, &rmid_free_lru); } } - crmid = nrmid + 1; + cur_idx = idx + 1; } } bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d) { - return find_first_bit(d->rmid_busy_llc, r->num_rmid) != r->num_rmid; + u32 idx_limit = resctrl_arch_system_num_rmid_idx(); + + return find_first_bit(d->rmid_busy_llc, idx_limit) != idx_limit; } /* @@ -345,6 +364,9 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) struct rdt_domain *d; int cpu, err; u64 val = 0; + u32 idx; + + idx = resctrl_arch_rmid_idx_encode(entry->closid, entry->rmid); entry->busy = 0; cpu = get_cpu(); @@ -364,7 +386,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) */ if (!has_busy_rmid(r, d)) cqm_setup_limbo_handler(d, CQM_LIMBOCHECK_INTERVAL); - set_bit(entry->rmid, d->rmid_busy_llc); + set_bit(idx, d->rmid_busy_llc); entry->busy++; } put_cpu(); @@ -377,14 +399,16 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) void free_rmid(u32 closid, u32 rmid) { + u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid); struct rmid_entry *entry; - if (!rmid) - return; - lockdep_assert_held(&rdtgroup_mutex); - entry = __rmid_entry(closid, rmid); + /* do not allow the default rmid to be free'd */ + if (!idx) + return; + + entry = __rmid_entry(idx); if (is_llc_occupancy_enabled()) add_rmid_to_limbo(entry); @@ -394,6 +418,7 @@ void free_rmid(u32 closid, u32 rmid) static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr) { + u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid); struct mbm_state *m; u64 tval = 0; @@ -410,10 +435,10 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr) rr->val += tval; return 0; case QOS_L3_MBM_TOTAL_EVENT_ID: - m = &rr->d->mbm_total[rmid]; + m = &rr->d->mbm_total[idx]; break; case QOS_L3_MBM_LOCAL_EVENT_ID: - m = &rr->d->mbm_local[rmid]; + m = &rr->d->mbm_local[idx]; break; default: /* @@ -446,7 +471,8 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr) */ static void mbm_bw_count(u32 closid, u32 rmid, struct 
rmid_read *rr) { - struct mbm_state *m = &rr->d->mbm_local[rmid]; + u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid); + struct mbm_state *m = &rr->d->mbm_local[idx]; u64 cur_bw, bytes, cur_bytes; cur_bytes = rr->val; @@ -536,7 +562,7 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm) { u32 closid, rmid, cur_msr_val, new_msr_val; struct mbm_state *pmbm_data, *cmbm_data; - u32 cur_bw, delta_bw, user_bw; + u32 cur_bw, delta_bw, user_bw, idx; struct rdt_resource *r_mba; struct rdt_domain *dom_mba; struct list_head *head; @@ -549,7 +575,8 @@ static void update_mba_bw(struct rdtgroup *rgrp, struct rdt_domain *dom_mbm) closid = rgrp->closid; rmid = rgrp->mon.rmid; - pmbm_data = &dom_mbm->mbm_local[rmid]; + idx = resctrl_arch_rmid_idx_encode(closid, rmid); + pmbm_data = &dom_mbm->mbm_local[idx]; dom_mba = get_domain_from_cpu(smp_processor_id(), r_mba); if (!dom_mba) { @@ -732,19 +759,20 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms) static int dom_data_init(struct rdt_resource *r) { + u32 nr_idx = resctrl_arch_system_num_rmid_idx(); struct rmid_entry *entry = NULL; - int i, nr_rmids; + u32 idx; + int i; - nr_rmids = r->num_rmid; - rmid_ptrs = kcalloc(nr_rmids, sizeof(struct rmid_entry), GFP_KERNEL); + rmid_ptrs = kcalloc(nr_idx, sizeof(struct rmid_entry), GFP_KERNEL); if (!rmid_ptrs) return -ENOMEM; - for (i = 0; i < nr_rmids; i++) { + for (i = 0; i < nr_idx; i++) { entry = &rmid_ptrs[i]; INIT_LIST_HEAD(&entry->list); - entry->rmid = i; + resctrl_arch_rmid_idx_decode(i, &entry->closid, &entry->rmid); list_add_tail(&entry->list, &rmid_free_lru); } @@ -753,7 +781,8 @@ static int dom_data_init(struct rdt_resource *r) * default_rdtgroup control group, which will be setup later. See * rdtgroup_setup_root(). 
         */
-       entry = __rmid_entry(0, 0);
+       idx = resctrl_arch_rmid_idx_encode(RESCTRL_RESERVED_CLOSID, 0);
+       entry = __rmid_entry(idx);
        list_del(&entry->list);

        return 0;
diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 23e6b3a373b0..6ecaf34a4e32 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3587,16 +3587,17 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d)
 static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d)
 {
+       u32 idx_limit = resctrl_arch_system_num_rmid_idx();
        size_t tsize;

        if (is_llc_occupancy_enabled()) {
-               d->rmid_busy_llc = bitmap_zalloc(r->num_rmid, GFP_KERNEL);
+               d->rmid_busy_llc = bitmap_zalloc(idx_limit, GFP_KERNEL);
                if (!d->rmid_busy_llc)
                        return -ENOMEM;
        }
        if (is_mbm_total_enabled()) {
                tsize = sizeof(*d->mbm_total);
-               d->mbm_total = kcalloc(r->num_rmid, tsize, GFP_KERNEL);
+               d->mbm_total = kcalloc(idx_limit, tsize, GFP_KERNEL);
                if (!d->mbm_total) {
                        bitmap_free(d->rmid_busy_llc);
                        return -ENOMEM;
@@ -3604,7 +3605,7 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d)
        }
        if (is_mbm_local_enabled()) {
                tsize = sizeof(*d->mbm_local);
-               d->mbm_local = kcalloc(r->num_rmid, tsize, GFP_KERNEL);
+               d->mbm_local = kcalloc(idx_limit, tsize, GFP_KERNEL);
                if (!d->mbm_local) {
                        bitmap_free(d->rmid_busy_llc);
                        kfree(d->mbm_total);
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 7d80bae05f59..ff7452f644e4 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -6,6 +6,9 @@
 #include
 #include

+/* CLOSID value used by the default control group */
+#define RESCTRL_RESERVED_CLOSID                0
+
 #ifdef CONFIG_PROC_CPU_RESCTRL

 int proc_resctrl_show(struct seq_file *m,

From patchwork Mon Mar 20 17:26:04 2023
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi@huawei.com,
    D Scott Phillips OS, carl@os.amperecomputing.com, lcherian@marvell.com,
    bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org,
    baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao, peternewman@google.com
Subject: [PATCH v3 03/19] x86/resctrl: Create helper for RMID allocation and mondata dir creation
Date: Mon, 20 Mar 2023 17:26:04 +0000
Message-Id: <20230320172620.18254-4-james.morse@arm.com>
In-Reply-To: <20230320172620.18254-1-james.morse@arm.com>
References: <20230320172620.18254-1-james.morse@arm.com>

RMIDs are allocated for each monitor or control group directory, because
each of these needs its own RMID. For control groups,
rdtgroup_mkdir_ctrl_mon() later goes on to allocate the CLOSID.

MPAM's equivalent of RMID is not an independent number, so it can't be
allocated until the CLOSID is known. An RMID allocation for one CLOSID may
fail, whereas another may succeed, depending on how many monitor groups a
control group has. The RMID allocation needs to move to after the CLOSID
has been allocated.
To make a subsequent change that does this easier to read, move the RMID
allocation and mondata dir creation to a helper.

Tested-by: Shaopeng Tan
Signed-off-by: James Morse
Reviewed-by: Ilpo Järvinen
---
 arch/x86/kernel/cpu/resctrl/rdtgroup.c | 42 +++++++++++++++++---------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
index 6ecaf34a4e32..b785beb0db26 100644
--- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c
+++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c
@@ -3135,6 +3135,30 @@ static int rdtgroup_init_alloc(struct rdtgroup *rdtgrp)
        return 0;
 }

+static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp)
+{
+       int ret;
+
+       if (!rdt_mon_capable)
+               return 0;
+
+       ret = alloc_rmid();
+       if (ret < 0) {
+               rdt_last_cmd_puts("Out of RMIDs\n");
+               return ret;
+       }
+       rdtgrp->mon.rmid = ret;
+
+       ret = mkdir_mondata_all(rdtgrp->kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
+       if (ret) {
+               rdt_last_cmd_puts("kernfs subdir error\n");
+               free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
+               return ret;
+       }
+
+       return 0;
+}
+
 static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
                             const char *name, umode_t mode,
                             enum rdt_group_type rtype, struct rdtgroup **r)
@@ -3200,20 +3224,10 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
                goto out_destroy;
        }

-       if (rdt_mon_capable) {
-               ret = alloc_rmid();
-               if (ret < 0) {
-                       rdt_last_cmd_puts("Out of RMIDs\n");
-                       goto out_destroy;
-               }
-               rdtgrp->mon.rmid = ret;
+       ret = mkdir_rdt_prepare_rmid_alloc(rdtgrp);
+       if (ret)
+               goto out_destroy;

-               ret = mkdir_mondata_all(kn, rdtgrp, &rdtgrp->mon.mon_data_kn);
-               if (ret) {
-                       rdt_last_cmd_puts("kernfs subdir error\n");
-                       goto out_idfree;
-               }
-       }
        kernfs_activate(kn);

        /*
@@ -3221,8 +3235,6 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn,
         */
        return 0;

-out_idfree:
-       free_rmid(rdtgrp->closid, rdtgrp->mon.rmid);
 out_destroy:
        kernfs_put(rdtgrp->kn);
        kernfs_remove(rdtgrp->kn);

From patchwork Mon Mar 20 17:26:05 2023
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi@huawei.com,
    D Scott Phillips OS, carl@os.amperecomputing.com, lcherian@marvell.com,
    bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org,
    baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao, peternewman@google.com
Subject: [PATCH v3 04/19] x86/resctrl: Move rmid allocation out of mkdir_rdt_prepare()
Date: Mon, 20 Mar 2023 17:26:05 +0000
Message-Id: <20230320172620.18254-5-james.morse@arm.com>
In-Reply-To: <20230320172620.18254-1-james.morse@arm.com>
References: <20230320172620.18254-1-james.morse@arm.com>

RMIDs are allocated for each monitor or control group directory, because
each of these needs its own RMID. For control groups,
rdtgroup_mkdir_ctrl_mon() later goes on to allocate the CLOSID.

MPAM's equivalent of RMID is not an independent number, so it can't be
allocated until the CLOSID is known.
An RMID allocation for one CLOSID may fail, whereas another may succeed depending on how many monitor groups a control group has. The RMID allocation needs to move to be after the CLOSID has been allocated. Move the RMID allocation out of mkdir_rdt_prepare() to occur in its caller, after the mkdir_rdt_prepare() call. This allows the RMID allocator to know the CLOSID. Tested-by: Shaopeng Tan Signed-off-by: James Morse --- Changes since v2: * Moved kernfs_activate() later to preserve atomicity of files being visible --- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 35 +++++++++++++++++++------- 1 file changed, 26 insertions(+), 9 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index b785beb0db26..16c8ca135b37 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -3159,6 +3159,12 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp) return 0; } +static void mkdir_rdt_prepare_rmid_free(struct rdtgroup *rgrp) +{ + if (rdt_mon_capable) + free_rmid(rgrp->closid, rgrp->mon.rmid); +} + static int mkdir_rdt_prepare(struct kernfs_node *parent_kn, const char *name, umode_t mode, enum rdt_group_type rtype, struct rdtgroup **r) @@ -3224,12 +3230,6 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn, goto out_destroy; } - ret = mkdir_rdt_prepare_rmid_alloc(rdtgrp); - if (ret) - goto out_destroy; - - kernfs_activate(kn); - /* * The caller unlocks the parent_kn upon success. */ @@ -3248,7 +3248,6 @@ static int mkdir_rdt_prepare(struct kernfs_node *parent_kn, static void mkdir_rdt_prepare_clean(struct rdtgroup *rgrp) { kernfs_remove(rgrp->kn); - free_rmid(rgrp->closid, rgrp->mon.rmid); rdtgroup_remove(rgrp); } @@ -3270,12 +3269,21 @@ static int rdtgroup_mkdir_mon(struct kernfs_node *parent_kn, prgrp = rdtgrp->mon.parent; rdtgrp->closid = prgrp->closid; + ret = mkdir_rdt_prepare_rmid_alloc(rdtgrp); + if (ret) { + mkdir_rdt_prepare_clean(rdtgrp); + goto out_unlock; + } + + kernfs_activate(rdtgrp->kn); + /* * Add the rdtgrp to the list of rdtgrps the parent * ctrl_mon group has to track. 
         */
        list_add_tail(&rdtgrp->mon.crdtgrp_list, &prgrp->mon.crdtgrp_list);

+out_unlock:
        rdtgroup_kn_unlock(parent_kn);
        return ret;
 }
@@ -3306,10 +3314,17 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,
        ret = 0;

        rdtgrp->closid = closid;
-       ret = rdtgroup_init_alloc(rdtgrp);
-       if (ret < 0)
+
+       ret = mkdir_rdt_prepare_rmid_alloc(rdtgrp);
+       if (ret)
                goto out_id_free;

+       kernfs_activate(rdtgrp->kn);
+
+       ret = rdtgroup_init_alloc(rdtgrp);
+       if (ret < 0)
+               goto out_rmid_free;
+
        list_add(&rdtgrp->rdtgroup_list, &rdt_all_groups);

        if (rdt_mon_capable) {
@@ -3328,6 +3343,8 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn,

 out_del_list:
        list_del(&rdtgrp->rdtgroup_list);
+out_rmid_free:
+       mkdir_rdt_prepare_rmid_free(rdtgrp);
 out_id_free:
        closid_free(closid);
 out_common_fail:

From patchwork Mon Mar 20 17:26:06 2023
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu, Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
    H Peter Anvin, Babu Moger, James Morse, shameerali.kolothum.thodi@huawei.com,
    D Scott Phillips OS, carl@os.amperecomputing.com, lcherian@marvell.com,
    bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org,
    baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao, peternewman@google.com
Subject: [PATCH v3 05/19] x86/resctrl: Allow RMID allocation to be scoped by CLOSID
Date: Mon, 20 Mar 2023 17:26:06 +0000
Message-Id: <20230320172620.18254-6-james.morse@arm.com>
In-Reply-To: <20230320172620.18254-1-james.morse@arm.com>
References: <20230320172620.18254-1-james.morse@arm.com>

MPAM's RMID values are not unique unless the CLOSID is considered as well.
alloc_rmid() expects the RMID to be an independent number.

Pass the CLOSID in to alloc_rmid(). Use this to compare indexes when
allocating. If the CLOSID is not relevant to the index, this ends up
comparing the free RMID with itself, and the first free entry will be used.
With MPAM the CLOSID is included in the index, so this becomes a walk of
the free RMID entries, until one that matches the supplied CLOSID is found.
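The "compare the free RMID with itself" behaviour on x86 follows directly
from the encode helper ignoring the CLOSID. A small standalone sketch of that
degenerate case (the names below are illustrative, not the kernel's):

#include <stdio.h>

/* x86-like behaviour: the index ignores the closid entirely. */
static unsigned int x86_like_idx_encode(unsigned int closid, unsigned int rmid)
{
        (void)closid;
        return rmid;
}

int main(void)
{
        unsigned int free_closid = 0, wanted_closid = 5, rmid = 3;

        /* Both indices are 3, so the first free entry always matches. */
        printf("%u == %u\n", x86_like_idx_encode(free_closid, rmid),
               x86_like_idx_encode(wanted_closid, rmid));
        return 0;
}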
Tested-by: Shaopeng Tan Signed-off-by: James Morse --- Changes since v2; * Rephrased comment in resctrl_find_free_rmid() to describe this in terms of list_entry_first() * Rephrased comment above alloc_rmid() --- arch/x86/kernel/cpu/resctrl/internal.h | 2 +- arch/x86/kernel/cpu/resctrl/monitor.c | 54 +++++++++++++++++------ arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 2 +- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 2 +- 4 files changed, 43 insertions(+), 17 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 47506e2afd59..e11d9ce943d3 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -535,7 +535,7 @@ void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp); struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r); int closids_supported(void); void closid_free(int closid); -int alloc_rmid(void); +int alloc_rmid(u32 closid); void free_rmid(u32 closid, u32 rmid); int rdt_get_mon_l3_config(struct rdt_resource *r); bool __init rdt_cpu_has(int flag); diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 03a7d13dd653..ca58a433c668 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -337,25 +337,51 @@ bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d) return find_first_bit(d->rmid_busy_llc, idx_limit) != idx_limit; } -/* - * As of now the RMIDs allocation is global. - * However we keep track of which packages the RMIDs - * are used to optimize the limbo list management. - */ -int alloc_rmid(void) +static struct rmid_entry *resctrl_find_free_rmid(u32 closid) { - struct rmid_entry *entry; - - lockdep_assert_held(&rdtgroup_mutex); + struct rmid_entry *itr; + u32 itr_idx, cmp_idx; if (list_empty(&rmid_free_lru)) - return rmid_limbo_count ? -EBUSY : -ENOSPC; + return rmid_limbo_count ? ERR_PTR(-EBUSY) : ERR_PTR(-ENOSPC); - entry = list_first_entry(&rmid_free_lru, - struct rmid_entry, list); - list_del(&entry->list); + list_for_each_entry(itr, &rmid_free_lru, list) { + /* + * get the index of this free RMID, and the index it would need + * to be if it were used with this CLOSID. + * If the CLOSID is irrelevant on this architecture, these will + * always be the same meaning the compiler can reduce this loop + * to a single list_entry_first() call. + */ + itr_idx = resctrl_arch_rmid_idx_encode(itr->closid, itr->rmid); + cmp_idx = resctrl_arch_rmid_idx_encode(closid, itr->rmid); - return entry->rmid; + if (itr_idx == cmp_idx) + return itr; + } + + return ERR_PTR(-ENOSPC); +} + +/* + * For MPAM the RMID value is not unique, and has to be considered with + * the CLOSID. The (CLOSID, RMID) pair is allocated on all domains, which + * allows all domains to be managed by a single limbo list. + * Each domain also has a rmid_busy_llc to reduce the work of the limbo handler. 
+ */ +int alloc_rmid(u32 closid) +{ + struct rmid_entry *entry; + + lockdep_assert_held(&rdtgroup_mutex); + + entry = resctrl_find_free_rmid(closid); + if (!IS_ERR(entry)) { + list_del(&entry->list); + return entry->rmid; + } + + return PTR_ERR(entry); } static void add_rmid_to_limbo(struct rmid_entry *entry) diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c index c51932516965..3b724a40d3a2 100644 --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -763,7 +763,7 @@ int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp) int ret; if (rdt_mon_capable) { - ret = alloc_rmid(); + ret = alloc_rmid(rdtgrp->closid); if (ret < 0) { rdt_last_cmd_puts("Out of RMIDs\n"); return ret; diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 16c8ca135b37..bcd27610bb77 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -3142,7 +3142,7 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp) if (!rdt_mon_capable) return 0; - ret = alloc_rmid(); + ret = alloc_rmid(rdtgrp->closid); if (ret < 0) { rdt_last_cmd_puts("Out of RMIDs\n"); return ret; From patchwork Mon Mar 20 17:26:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Morse X-Patchwork-Id: 72345 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp1360643wrt; Mon, 20 Mar 2023 11:12:50 -0700 (PDT) X-Google-Smtp-Source: AK7set8VZcm+a6/H1xOm0019rYe4W9qOnFxEO3dAWaOoTwSdsksd+EniZVJUgUF4aTZGg3E1iflW X-Received: by 2002:a05:6a20:779e:b0:d9:7749:63b6 with SMTP id c30-20020a056a20779e00b000d9774963b6mr4315381pzg.34.1679335970230; Mon, 20 Mar 2023 11:12:50 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1679335970; cv=none; d=google.com; s=arc-20160816; b=CN83dxHGZSoLR/X0cTqkqTqJOC6aWPEiOKc54TCvGfN8hoPMeR4sKrANoyE+GiPGNN QRW96pojgtVxxyex+hj+3yt3uADcpM8rxXYAYgUr22s9feV/YlJI+T9op2KGH7LWmF6F 5cjQI5dnNb9K+h9Nayz49jRw/s7rtWbKd7pUFBrLivHD3jo9Kd5OeHnD/9Yv1XnHJQoM ZntXmSgXdrkVqw1U7NrJQEP9pfW++BvR1vKqsLIRqGoxyXs919Mk0MavG2JJrr8RlbCt PjBq6Bnx+SNBvhz93kwBUgtw2oFYmaN4iP09rFFD8qmZkr/GvwL/ZXWMvxo4I19jI6MW hXBw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=MUTskXgyXDfMYre+A0K186o79vn5avGQyG1AdxhYK7I=; b=vb1E6W6Ye6NgsmdytCYbtOBoZUROjjXXEG9EJEa8eKTRKBveZpw4/jngylX5z/LQWc BxbY6RqK7dUt4GjILcULWzQ89wcqz8E1WYM8k7jh+ieTpRmiKT0+4PFSIxb3yiSHozFH IQwrd+upJFmJlvyW7asA7zg19BIzhRMDyFLytJBekRl/iHZzL6QD8XRLQcokGxncg4o2 GL4pHATWm0N1i8KdOwDIlXAgPEoUF0idjK9OcXG5GQNcN40QdQOmuE8q0+4SswykJiQc DU5is6o73SdGUVLweH1FN3eP6x8oKCANWPMEN6GQlYYyDWohWLqFepIYmkuQ1LW+pP2o 4vNA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id 3-20020a631143000000b004fd72ef0180si10861673pgr.99.2023.03.20.11.12.35; Mon, 20 Mar 2023 11:12:50 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230160AbjCTRrq (ORCPT + 99 others); Mon, 20 Mar 2023 13:47:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34306 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229735AbjCTRrK (ORCPT ); Mon, 20 Mar 2023 13:47:10 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 0D8013B22A for ; Mon, 20 Mar 2023 10:42:48 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id DF5C015DB; Mon, 20 Mar 2023 10:28:11 -0700 (PDT) Received: from merodach.members.linode.com (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6AF793F67D; Mon, 20 Mar 2023 10:27:25 -0700 (PDT) From: James Morse To: x86@kernel.org, linux-kernel@vger.kernel.org Cc: Fenghua Yu , Reinette Chatre , Thomas Gleixner , Ingo Molnar , Borislav Petkov , H Peter Anvin , Babu Moger , James Morse , shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS , carl@os.amperecomputing.com, lcherian@marvell.com, bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com, Jamie Iles , Xin Hao , peternewman@google.com Subject: [PATCH v3 06/19] x86/resctrl: Allow the allocator to check if a CLOSID can allocate clean RMID Date: Mon, 20 Mar 2023 17:26:07 +0000 Message-Id: <20230320172620.18254-7-james.morse@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20230320172620.18254-1-james.morse@arm.com> References: <20230320172620.18254-1-james.morse@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760911394056997258?= X-GMAIL-MSGID: =?utf-8?q?1760911394056997258?= MPAM's PMG bits extend its PARTID space, meaning the same PMG value can be used for different control groups. This means once a CLOSID is allocated, all its monitoring ids may still be dirty, and held in limbo. Add a helper to allow the CLOSID allocator to check if a CLOSID has dirty RMID values. This behaviour is enabled by a kconfig option selected by the architecture, which avoids a pointless search for x86. Tested-by: Shaopeng Tan Signed-off-by: James Morse --- Changes since v1: * Removed superflous IS_ENABLED(). Changes since v2: * Reworded comment over resctrl_closid_is_dirty() to reflect this is all RMID. 
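To make the premise above concrete, here is a rough sketch (not part of this patch; the single PMG bit and the example_* names are invented for illustration) of how the (CLOSID, RMID) index differs between the two kinds of architecture the new helper has to cope with:

/* Hypothetical MPAM-style encoding: the PMG bits extend the PARTID space */
static inline u32 example_mpam_rmid_idx_encode(u32 closid, u32 rmid)
{
	return (closid << 1) | rmid;	/* assumes 1 bit of PMG */
}

/* x86: RMIDs form one independent space, the CLOSID is ignored */
static inline u32 example_x86_rmid_idx_encode(u32 ignored, u32 rmid)
{
	return rmid;
}

With the MPAM-style encoding, freeing a CLOSID does not free anything in a global RMID space; that CLOSID's monitor ids may still be sitting in limbo, which is exactly what the helper added in the diff below checks for.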
--- arch/x86/kernel/cpu/resctrl/internal.h | 1 + arch/x86/kernel/cpu/resctrl/monitor.c | 36 ++++++++++++++++++++++++++ arch/x86/kernel/cpu/resctrl/rdtgroup.c | 17 +++++++----- 3 files changed, 47 insertions(+), 7 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index e11d9ce943d3..87545e4beb70 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -534,6 +534,7 @@ int rdtgroup_pseudo_lock_create(struct rdtgroup *rdtgrp); void rdtgroup_pseudo_lock_remove(struct rdtgroup *rdtgrp); struct rdt_domain *get_domain_from_cpu(int cpu, struct rdt_resource *r); int closids_supported(void); +bool resctrl_closid_is_dirty(u32 closid); void closid_free(int closid); int alloc_rmid(u32 closid); void free_rmid(u32 closid, u32 rmid); diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index ca58a433c668..a2ae4be4b2ba 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -363,6 +363,42 @@ static struct rmid_entry *resctrl_find_free_rmid(u32 closid) return ERR_PTR(-ENOSPC); } +/** + * resctrl_closid_is_dirty - Determine if all RMID associated with this CLOSID + * are available. + * @closid: The CLOSID that is being queried. + * + * MPAM's equivalent of RMID are per-CLOSID, meaning a freshly allocated CLOSID + * may not be able to allocate clean RMID. To avoid this the allocator will + * only return clean CLOSID. This is enough for now as it allows MPAM systems + * to use resctrl. This suffers from the problem that there may be no CLOSID + * where all the RMID are clean, causing the CLOSID allocation to fail. + * This can be improved (once MPAM support is upstream) to return the cleanest + * CLOSID where PMG=0 is clean. This would allow the CLOSID allocation to + * succeed, but subsequent monitor-group allocations may fail. + */ +bool resctrl_closid_is_dirty(u32 closid) +{ + struct rmid_entry *entry; + int i; + + lockdep_assert_held(&rdtgroup_mutex); + + if (!IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID)) + return false; + + for (i = 0; i < resctrl_arch_system_num_rmid_idx(); i++) { + entry = &rmid_ptrs[i]; + if (entry->closid != closid) + continue; + + if (entry->busy) + return true; + } + + return false; +} + /* * For MPAM the RMID value is not unique, and has to be considered with * the CLOSID. The (CLOSID, RMID) pair is allocated on all domains, which diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index bcd27610bb77..e741bc47bae9 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -93,7 +93,7 @@ void rdt_last_cmd_printf(const char *fmt, ...) * - Our choices on how to configure each resource become progressively more * limited as the number of resources grows. 
*/ -static int closid_free_map; +static unsigned long closid_free_map; static int closid_free_map_len; int closids_supported(void) @@ -119,14 +119,17 @@ static void closid_init(void) static int closid_alloc(void) { - u32 closid = ffs(closid_free_map); + u32 closid; - if (closid == 0) - return -ENOSPC; - closid--; - closid_free_map &= ~(1 << closid); + for_each_set_bit(closid, &closid_free_map, closid_free_map_len) { + if (resctrl_closid_is_dirty(closid)) + continue; - return closid; + clear_bit(closid, &closid_free_map); + return closid; + } + + return -ENOSPC; } void closid_free(int closid) From patchwork Mon Mar 20 17:26:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Morse X-Patchwork-Id: 72344 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp1360200wrt; Mon, 20 Mar 2023 11:11:49 -0700 (PDT) X-Google-Smtp-Source: AK7set8BG6PxLxCllud7+WDzz7Opfwit/PUioLOawFYxsi6IJ1xCe7z+y8FMxdrfVhMNs7h0A4M6 X-Received: by 2002:a17:902:f687:b0:1a1:c551:bd00 with SMTP id l7-20020a170902f68700b001a1c551bd00mr7392906plg.35.1679335908791; Mon, 20 Mar 2023 11:11:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1679335908; cv=none; d=google.com; s=arc-20160816; b=pKZ0TsASpYJ7EEWq1C2RqH6lRrPTNy+9XWiXARG3FMI5xudDxBBKC0yD2cxIdDwuC6 7krrodRtublFZ3LvjC9jlxrH4p7b3/RIdJ1P89tp2nnaw9B2k6EsSH00qk8HHRFTUetk LIBBI8mPN5Y2nmjGy6nPSLj8zzmzj0GP4CVLz7t04TMlbfBb0vyTy3KCDlLybDKAN1NS eW3nrrycFyK+2wK0mFZiyRrJy7n06Q6ppdA6lSaQlobXERHLvMqAra1V//yoEvqqD1JX +JDCFIBnNEMR1MxxVi0AfSbf/Ewgka3kySy0DR/+AonrtlaOxZgDQVdJc3305xkbi8tt ABVw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=4G9cZ5pKDoUHi7VAJ7maNqrEyJyXjvfRQqff90VjQCo=; b=w/7D2T/CsE4YEeno2YKs5xwBZviotaDaEuJkC50y8UjijWaIocIhinduWFxL9MeMwn gkV/qg5QGO7bA3myS4lTlv/kh5Ujam/eqRv5PMZ9NnTt8bb+yvUw4dO/boN6sUKeh3JL kzt9kUwF28i+hnsrJnBf2EgDv9uFB1HtB1br7XKBsdHwpxJX3xmx3sT+I6r7DZXXULa0 i24efwocGgS5J6FTSUj6bTaog1niCwf1CeK9AkuhlRlOWq5OvKqflEQnkC2lnEEWvYay GFZiGH5vS+KjFl+Xluc6xajTR6eVf9vEIx8muMIBjVI8tF0pojFLfMZsvwKuz5CUx1ZW Qh5Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id s13-20020a170902ea0d00b001a1cbc1a94csi4526346plg.554.2023.03.20.11.11.31; Mon, 20 Mar 2023 11:11:48 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229896AbjCTRqe (ORCPT + 99 others); Mon, 20 Mar 2023 13:46:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56632 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229931AbjCTRqB (ORCPT ); Mon, 20 Mar 2023 13:46:01 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id E5ADE3A850 for ; Mon, 20 Mar 2023 10:41:58 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id AD0F71650; Mon, 20 Mar 2023 10:28:14 -0700 (PDT) Received: from merodach.members.linode.com (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 371883F67D; Mon, 20 Mar 2023 10:27:28 -0700 (PDT) From: James Morse To: x86@kernel.org, linux-kernel@vger.kernel.org Cc: Fenghua Yu , Reinette Chatre , Thomas Gleixner , Ingo Molnar , Borislav Petkov , H Peter Anvin , Babu Moger , James Morse , shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS , carl@os.amperecomputing.com, lcherian@marvell.com, bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com, Jamie Iles , Xin Hao , peternewman@google.com Subject: [PATCH v3 07/19] x86/resctrl: Move CLOSID/RMID matching and setting to use helpers Date: Mon, 20 Mar 2023 17:26:08 +0000 Message-Id: <20230320172620.18254-8-james.morse@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20230320172620.18254-1-james.morse@arm.com> References: <20230320172620.18254-1-james.morse@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760911329793429108?= X-GMAIL-MSGID: =?utf-8?q?1760911329793429108?= When switching tasks, the CLOSID and RMID that the new task should use are stored in struct task_struct. For x86 the CLOSID known by resctrl, the value in task_struct, and the value written to the CPU register are all the same thing. MPAM's CPU interface has two different PARTID's one for data accesses the other for instruction fetch. Storing resctrl's CLOSID value in struct task_struct implies the arch code knows whether resctrl is using CDP. Move the matching and setting of the struct task_struct properties to use helpers. This allows arm64 to store the hardware format of the register, instead of having to convert it each time. __rdtgroup_move_task()s use of READ_ONCE()/WRITE_ONCE() ensures torn values aren't seen as another CPU may schedule the task being moved while the value is being changed. 
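As a rough illustration of what storing the hardware format could mean (a sketch only, not code from this series: the register layout, the EXAMPLE_* masks and the task_struct field are invented), an arm64 implementation of the setter helper might pre-build the MPAM register value when the task is moved:

static inline void example_arm64_set_closid_rmid(struct task_struct *tsk,
						 u32 closid, u32 rmid)
{
	u64 regval;

	/* Hypothetical layout: separate data/instruction PARTID and PMG fields */
	regval = FIELD_PREP(EXAMPLE_PARTID_D_MASK, closid) |
		 FIELD_PREP(EXAMPLE_PARTID_I_MASK, closid) |
		 FIELD_PREP(EXAMPLE_PMG_D_MASK, rmid) |
		 FIELD_PREP(EXAMPLE_PMG_I_MASK, rmid);

	/* Pre-formatted, so no conversion is needed on the switch path */
	WRITE_ONCE(tsk->thread.example_mpam_regval, regval);
}

The context-switch code can then copy this single pre-built value into the CPU register without knowing whether resctrl is using CDP.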
MPAM has an additional corner-case here as the PMG bits extend the PARTID space. If the scheduler sees a new-CLOSID but old-RMID, the task will dirty an RMID that the limbo code is not watching causing an inaccurate count. x86's RMID are independent values, so the limbo code will still be watching the old-RMID in this circumstance. To avoid this, arm64 needs both the CLOSID/RMID WRITE_ONCE()d together. Both values must be provided together. Because MPAM's RMID values are not unique, the CLOSID must be provided when matching the RMID. Tested-by: Shaopeng Tan Signed-off-by: James Morse --- Changes since v2: * __rdtgroup_move_task() changed to set CLOSID from different CLOSID place depending on group type --- arch/x86/include/asm/resctrl.h | 18 ++++++++ arch/x86/kernel/cpu/resctrl/rdtgroup.c | 62 ++++++++++++++++---------- 2 files changed, 56 insertions(+), 24 deletions(-) diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h index 3ca40be41a0a..752123b0ce40 100644 --- a/arch/x86/include/asm/resctrl.h +++ b/arch/x86/include/asm/resctrl.h @@ -95,6 +95,24 @@ static inline unsigned int resctrl_arch_round_mon_val(unsigned int val) return val * scale; } +static inline void resctrl_arch_set_closid_rmid(struct task_struct *tsk, + u32 closid, u32 rmid) +{ + WRITE_ONCE(tsk->closid, closid); + WRITE_ONCE(tsk->rmid, rmid); +} + +static inline bool resctrl_arch_match_closid(struct task_struct *tsk, u32 closid) +{ + return READ_ONCE(tsk->closid) == closid; +} + +static inline bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 ignored, + u32 rmid) +{ + return READ_ONCE(tsk->rmid) == rmid; +} + static inline void resctrl_sched_in(void) { if (static_branch_likely(&rdt_enable_key)) diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index e741bc47bae9..2306fbc9a9bb 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -84,7 +84,7 @@ void rdt_last_cmd_printf(const char *fmt, ...) * * Using a global CLOSID across all resources has some advantages and * some drawbacks: - * + We can simply set "current->closid" to assign a task to a resource + * + We can simply set current's closid to assign a task to a resource * group. * + Context switch code can avoid extra memory references deciding which * CLOSID to load into the PQR_ASSOC MSR @@ -544,14 +544,26 @@ static void update_task_closid_rmid(struct task_struct *t) _update_task_closid_rmid(t); } +static bool task_in_rdtgroup(struct task_struct *tsk, struct rdtgroup *rdtgrp) +{ + u32 closid, rmid = rdtgrp->mon.rmid; + + if (rdtgrp->type == RDTCTRL_GROUP) + closid = rdtgrp->closid; + else if (rdtgrp->type == RDTMON_GROUP) + closid = rdtgrp->mon.parent->closid; + else + return false; + + return resctrl_arch_match_closid(tsk, closid) && + resctrl_arch_match_rmid(tsk, closid, rmid); +} + static int __rdtgroup_move_task(struct task_struct *tsk, struct rdtgroup *rdtgrp) { /* If the task is already in rdtgrp, no need to move the task. */ - if ((rdtgrp->type == RDTCTRL_GROUP && tsk->closid == rdtgrp->closid && - tsk->rmid == rdtgrp->mon.rmid) || - (rdtgrp->type == RDTMON_GROUP && tsk->rmid == rdtgrp->mon.rmid && - tsk->closid == rdtgrp->mon.parent->closid)) + if (task_in_rdtgroup(tsk, rdtgrp)) return 0; /* @@ -562,19 +574,19 @@ static int __rdtgroup_move_task(struct task_struct *tsk, * For monitor groups, can move the tasks only from * their parent CTRL group. 
*/ - - if (rdtgrp->type == RDTCTRL_GROUP) { - WRITE_ONCE(tsk->closid, rdtgrp->closid); - WRITE_ONCE(tsk->rmid, rdtgrp->mon.rmid); - } else if (rdtgrp->type == RDTMON_GROUP) { - if (rdtgrp->mon.parent->closid == tsk->closid) { - WRITE_ONCE(tsk->rmid, rdtgrp->mon.rmid); - } else { - rdt_last_cmd_puts("Can't move task to different control group\n"); - return -EINVAL; - } + if (rdtgrp->type == RDTMON_GROUP && + !resctrl_arch_match_closid(tsk, rdtgrp->mon.parent->closid)) { + rdt_last_cmd_puts("Can't move task to different control group\n"); + return -EINVAL; } + if (rdtgrp->type == RDTMON_GROUP) + resctrl_arch_set_closid_rmid(tsk, rdtgrp->mon.parent->closid, + rdtgrp->mon.rmid); + else + resctrl_arch_set_closid_rmid(tsk, rdtgrp->closid, + rdtgrp->mon.rmid); + /* * Ensure the task's closid and rmid are written before determining if * the task is current that will decide if it will be interrupted. @@ -596,14 +608,15 @@ static int __rdtgroup_move_task(struct task_struct *tsk, static bool is_closid_match(struct task_struct *t, struct rdtgroup *r) { - return (rdt_alloc_capable && - (r->type == RDTCTRL_GROUP) && (t->closid == r->closid)); + return (rdt_alloc_capable && (r->type == RDTCTRL_GROUP) && + resctrl_arch_match_closid(t, r->closid)); } static bool is_rmid_match(struct task_struct *t, struct rdtgroup *r) { - return (rdt_mon_capable && - (r->type == RDTMON_GROUP) && (t->rmid == r->mon.rmid)); + return (rdt_mon_capable && (r->type == RDTMON_GROUP) && + resctrl_arch_match_rmid(t, r->mon.parent->closid, + r->mon.rmid)); } /** @@ -799,7 +812,7 @@ int proc_resctrl_show(struct seq_file *s, struct pid_namespace *ns, rdtg->mode != RDT_MODE_EXCLUSIVE) continue; - if (rdtg->closid != tsk->closid) + if (!resctrl_arch_match_closid(tsk, rdtg->closid)) continue; seq_printf(s, "res:%s%s\n", (rdtg == &rdtgroup_default) ? 
"/" : "", @@ -807,7 +820,8 @@ int proc_resctrl_show(struct seq_file *s, struct pid_namespace *ns, seq_puts(s, "mon:"); list_for_each_entry(crg, &rdtg->mon.crdtgrp_list, mon.crdtgrp_list) { - if (tsk->rmid != crg->mon.rmid) + if (!resctrl_arch_match_rmid(tsk, crg->mon.parent->closid, + crg->mon.rmid)) continue; seq_printf(s, "%s", crg->kn->name); break; @@ -2659,8 +2673,8 @@ static void rdt_move_group_tasks(struct rdtgroup *from, struct rdtgroup *to, for_each_process_thread(p, t) { if (!from || is_closid_match(t, from) || is_rmid_match(t, from)) { - WRITE_ONCE(t->closid, to->closid); - WRITE_ONCE(t->rmid, to->mon.rmid); + resctrl_arch_set_closid_rmid(t, to->closid, + to->mon.rmid); /* * Order the closid/rmid stores above before the loads From patchwork Mon Mar 20 17:26:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Morse X-Patchwork-Id: 72343 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp1359901wrt; Mon, 20 Mar 2023 11:11:15 -0700 (PDT) X-Google-Smtp-Source: AK7set/zCs8/U7pUCdZek9VauLyUU0x5FPD5Kw6nl13PFx4XRCnQL2dGZKD+hlpGgGKhuKM3LMKC X-Received: by 2002:a17:90b:1e4a:b0:23a:340d:fa49 with SMTP id pi10-20020a17090b1e4a00b0023a340dfa49mr22003pjb.32.1679335874784; Mon, 20 Mar 2023 11:11:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1679335874; cv=none; d=google.com; s=arc-20160816; b=fFJ05oBczDdEH280gBqZyjJZwyD8fX5N5AZe2mTZLEyJvDCUUV16zm2jckIqqNg2wt ZJnKpQfFyv0bJX1lkSjffK6n6a15QI/wk2N7A/S2tKaBMSV///pkBAGC/1xtxyORTcYJ uHjdkpvo4P1nDMVl0OGJlSQfW4GV3o8jdAM5Hy2ucJ3SvUQR9spt411QD1k5/bmmSQ53 zKRHXZXVIgZiccfb3dEwEFDPV/1fMBd9Q/V2bHVqGLXJr1VDBWN6NgO2evnlcUzWCiNg L6sM3+gIxWqt1vquBAl0rBH1HXnWOogc90ZUPZFNKIe+sm5hr4eGA0iPCp9xzdtlN81k K9ZA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=841BaiXo8m199LFtGaPTiiCTlvcUh7hzuC9d3GvgaTI=; b=cxQ9zGW1qsilEc588VAZG4TVQkV+iA93DkMqS4k+TzxxV+k9bPW12U+C8RkxKR0fqX 8sB0ohRPIzvRH6ZIt5fl5Lon+y0XPm0bjzYJd65MQqh+HoyF4o9wRVk+x4+JzDlo4NsJ LgRxLubFwcTTthcujaKCE9eF0UtfjBjYicYxXrs3S30F8K8esd4ftJ8hmg84oiz7z8j5 EhmmoehQvbcBoQLaqoMAeJqwc97gghQX4QHw9ug8WmaxtSV4fj7WgX+GvYicIHGz3QFn 8flC+ZrxvLGpyzuXIQUIL8JWOkCDZs8vo8ZW6PRWRBaM80RDSg6lsWzF7vEOg40my4rI VZQA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id b7-20020a17090a488700b0023d008fe931si15976662pjh.150.2023.03.20.11.10.55; Mon, 20 Mar 2023 11:11:14 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229685AbjCTRu0 (ORCPT + 99 others); Mon, 20 Mar 2023 13:50:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34660 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230323AbjCTRtO (ORCPT ); Mon, 20 Mar 2023 13:49:14 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 474093AA9 for ; Mon, 20 Mar 2023 10:43:56 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 805591655; Mon, 20 Mar 2023 10:28:17 -0700 (PDT) Received: from merodach.members.linode.com (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 0AEE33F67D; Mon, 20 Mar 2023 10:27:30 -0700 (PDT) From: James Morse To: x86@kernel.org, linux-kernel@vger.kernel.org Cc: Fenghua Yu , Reinette Chatre , Thomas Gleixner , Ingo Molnar , Borislav Petkov , H Peter Anvin , Babu Moger , James Morse , shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS , carl@os.amperecomputing.com, lcherian@marvell.com, bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com, Jamie Iles , Xin Hao , peternewman@google.com Subject: [PATCH v3 08/19] x86/resctrl: Add cpumask_any_housekeeping() for limbo/overflow Date: Mon, 20 Mar 2023 17:26:09 +0000 Message-Id: <20230320172620.18254-9-james.morse@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20230320172620.18254-1-james.morse@arm.com> References: <20230320172620.18254-1-james.morse@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760911294339715590?= X-GMAIL-MSGID: =?utf-8?q?1760911294339715590?= The limbo and overflow code picks a CPU to use from the domain's list of online CPUs. Work is then scheduled on these CPUs to maintain the limbo list and any counters that may overflow. cpumask_any() may pick a CPU that is marked nohz_full, which will either penalise the work that CPU was dedicated to, or delay the processing of limbo list or counters that may overflow. Perhaps indefinitely. Delaying the overflow handling will skew the bandwidth values calculated by mba_sc, which expects to be called once a second. Add cpumask_any_housekeeping() as a replacement for cpumask_any() that prefers housekeeping CPUs. This helper will still return a nohz_full CPU if that is the only option. The CPU to use is re-evaluated each time the limbo/overflow work runs. 
This ensures the work will move off a nohz_full CPU once a housekeeping CPU is available. Signed-off-by: James Morse --- arch/x86/kernel/cpu/resctrl/internal.h | 23 +++++++++++++++++++++++ arch/x86/kernel/cpu/resctrl/monitor.c | 17 ++++++++++++----- include/linux/tick.h | 3 ++- 3 files changed, 37 insertions(+), 6 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 87545e4beb70..0b5fd5a0cda2 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -7,6 +7,7 @@ #include #include #include +#include #include #define L3_QOS_CDP_ENABLE 0x01ULL @@ -55,6 +56,28 @@ /* Max event bits supported */ #define MAX_EVT_CONFIG_BITS GENMASK(6, 0) +/** + * cpumask_any_housekeeping() - Choose any CPU in @mask, preferring those that + * aren't marked nohz_full + * @mask: The mask to pick a CPU from. + * + * Returns a CPU in @mask. If there are housekeeping CPUs that don't use + * nohz_full, these are preferred. + */ +static inline unsigned int cpumask_any_housekeeping(const struct cpumask *mask) +{ + int cpu, hk_cpu; + + cpu = cpumask_any(mask); + if (tick_nohz_full_cpu(cpu)) { + hk_cpu = cpumask_nth_andnot(0, mask, tick_nohz_full_mask); + if (hk_cpu < nr_cpu_ids) + cpu = hk_cpu; + } + + return cpu; +} + struct rdt_fs_context { struct kernfs_fs_context kfc; bool enable_cdpl2; diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index a2ae4be4b2ba..3bec5c59ca0e 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -745,9 +745,9 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, void cqm_handle_limbo(struct work_struct *work) { unsigned long delay = msecs_to_jiffies(CQM_LIMBOCHECK_INTERVAL); - int cpu = smp_processor_id(); struct rdt_resource *r; struct rdt_domain *d; + int cpu; mutex_lock(&rdtgroup_mutex); @@ -756,8 +756,10 @@ void cqm_handle_limbo(struct work_struct *work) __check_limbo(d, false); - if (has_busy_rmid(r, d)) + if (has_busy_rmid(r, d)) { + cpu = cpumask_any_housekeeping(&d->cpu_mask); schedule_delayed_work_on(cpu, &d->cqm_limbo, delay); + } mutex_unlock(&rdtgroup_mutex); } @@ -767,7 +769,7 @@ void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms) unsigned long delay = msecs_to_jiffies(delay_ms); int cpu; - cpu = cpumask_any(&dom->cpu_mask); + cpu = cpumask_any_housekeeping(&dom->cpu_mask); dom->cqm_work_cpu = cpu; schedule_delayed_work_on(cpu, &dom->cqm_limbo, delay); @@ -777,10 +779,10 @@ void mbm_handle_overflow(struct work_struct *work) { unsigned long delay = msecs_to_jiffies(MBM_OVERFLOW_INTERVAL); struct rdtgroup *prgrp, *crgrp; - int cpu = smp_processor_id(); struct list_head *head; struct rdt_resource *r; struct rdt_domain *d; + int cpu; mutex_lock(&rdtgroup_mutex); @@ -801,6 +803,11 @@ void mbm_handle_overflow(struct work_struct *work) update_mba_bw(prgrp, d); } + /* + * Re-check for housekeeping CPUs. This allows the overflow handler to + * move off a nohz_full CPU quickly.
+ */ + cpu = cpumask_any_housekeeping(&d->cpu_mask); schedule_delayed_work_on(cpu, &d->mbm_over, delay); out_unlock: @@ -814,7 +821,7 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms) if (!static_branch_likely(&rdt_mon_enable_key)) return; - cpu = cpumask_any(&dom->cpu_mask); + cpu = cpumask_any_housekeeping(&dom->cpu_mask); dom->mbm_work_cpu = cpu; schedule_delayed_work_on(cpu, &dom->mbm_over, delay); } diff --git a/include/linux/tick.h b/include/linux/tick.h index bfd571f18cfd..ae2e9019fc18 100644 --- a/include/linux/tick.h +++ b/include/linux/tick.h @@ -174,9 +174,10 @@ static inline u64 get_cpu_iowait_time_us(int cpu, u64 *unused) { return -1; } static inline void tick_nohz_idle_stop_tick_protected(void) { } #endif /* !CONFIG_NO_HZ_COMMON */ +extern cpumask_var_t tick_nohz_full_mask; + #ifdef CONFIG_NO_HZ_FULL extern bool tick_nohz_full_running; -extern cpumask_var_t tick_nohz_full_mask; static inline bool tick_nohz_full_enabled(void) { From patchwork Mon Mar 20 17:26:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Morse X-Patchwork-Id: 72353 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp1362209wrt; Mon, 20 Mar 2023 11:16:09 -0700 (PDT) X-Google-Smtp-Source: AK7set8+JJsIu1D8FlgynUv40fL6xQbbfnba5/xl/oWcaIBqKwF0VDvhrda4XwJ5TrGAjNPw6S+9 X-Received: by 2002:a05:6a20:a8a5:b0:d9:3937:43a7 with SMTP id ca37-20020a056a20a8a500b000d9393743a7mr6301917pzb.55.1679336168966; Mon, 20 Mar 2023 11:16:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1679336168; cv=none; d=google.com; s=arc-20160816; b=tqnZQoYDBOhAvfLVIzLLu9Ru8dTEN5yFBo6DqNUbPLaXDfKcA6SmnlRxo/YpRHcYVh s+AcaccQNiQI7n3xT4jOPWBiUAKoVvYa4oG57qpkHdheVy3IIL0Nbr5g2/spHKhA96C8 m2+o5XEXxkHpQMT7EHcq0syDrGDesLJswPLKOACWQFdnNfJR0deauTrB9NT7i+VMQMwB OYVfbzPqw5cTtN270R92s2ktno9nFwOKKGdgRr2F1Rk9pwZ1EjBmlIbFtWri7pfiLMrU IujBfegmsIICO6IOxS2c25v6IIbIIq5CsWbiu5Atdswq7RQms0bIIJNNRhy9jbp+U2hS 5QXw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=J0oWX2oJ0lCWBH/JrsoUEX1fN+42U2i6DMT9wL+N/Xg=; b=EhfXHxo53ot+36Mmw1wJzR7hQAtwBdTUsKwbUTIgJw3m41931jTnzJnNfdMJr3qxc7 hAx8EOG+LJegfDYPB0YfWGHuq3t/X8X9mTwJ7RuRpPbBwyG+ZwbsYLT5nCq8S6+VlhE7 2jaJ77d/IMea435dCbVQRdDs9qhSPMRCyiBeqHSgr7qjvKjkSPwj2wfeXH5IS0NO/g8D elxT+y8Jw80wU2BehR/xpQ+/pnpwWJ1x+dDnBQnrv06ZiOYEio7j4bTuEmU8ZCrRoh52 5rm1VVf3UJVM3Lq20YuYpAZOzSG0ryNjlKI517ZiuTI+5JZYndeiTU7DmbNXtWrUslnC dGhg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id s23-20020a632c17000000b004f143cb44a2si10786898pgs.625.2023.03.20.11.15.39; Mon, 20 Mar 2023 11:16:08 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230210AbjCTRsC (ORCPT + 99 others); Mon, 20 Mar 2023 13:48:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36096 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229914AbjCTRrO (ORCPT ); Mon, 20 Mar 2023 13:47:14 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id AEAD6C163 for ; Mon, 20 Mar 2023 10:42:56 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 53CD4165C; Mon, 20 Mar 2023 10:28:20 -0700 (PDT) Received: from merodach.members.linode.com (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id CCF183F67D; Mon, 20 Mar 2023 10:27:33 -0700 (PDT) From: James Morse To: x86@kernel.org, linux-kernel@vger.kernel.org Cc: Fenghua Yu , Reinette Chatre , Thomas Gleixner , Ingo Molnar , Borislav Petkov , H Peter Anvin , Babu Moger , James Morse , shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS , carl@os.amperecomputing.com, lcherian@marvell.com, bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com, Jamie Iles , Xin Hao , peternewman@google.com Subject: [PATCH v3 09/19] x86/resctrl: Queue mon_event_read() instead of sending an IPI Date: Mon, 20 Mar 2023 17:26:10 +0000 Message-Id: <20230320172620.18254-10-james.morse@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20230320172620.18254-1-james.morse@arm.com> References: <20230320172620.18254-1-james.morse@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760911602444410586?= X-GMAIL-MSGID: =?utf-8?q?1760911602444410586?= x86 is blessed with an abundance of monitors, one per RMID, that can be read from any CPU in the domain. MPAMs monitors reside in the MMIO MSC, the number implemented is up to the manufacturer. This means when there are fewer monitors than needed, they need to be allocated and freed. Worse, the domain may be broken up into slices, and the MMIO accesses for each slice may need performing from different CPUs. These two details mean MPAMs monitor code needs to be able to sleep, and IPI another CPU in the domain to read from a resource that has been sliced. mon_event_read() already invokes mon_event_count() via IPI, which means this isn't possible. On systems using nohz-full, some CPUs need to be interrupted to run kernel work as they otherwise stay in user-space running realtime workloads. 
Interrupting these CPUs should be avoided, and scheduling work on them may never complete. Change mon_event_read() to pick a housekeeping CPU, (one that is not using nohz_full) and schedule mon_event_count() and wait. If all the CPUs in a domain are using nohz-full, then an IPI is used as the fallback. This function is only used in response to a user-space filesystem request (not the timing sensitive overflow code). This allows MPAM to hide the slice behaviour from resctrl, and to keep the monitor-allocation in monitor.c. When the IPI fallback is used on machines where MPAM needs to make an access on multiple CPUs, the counter read will always fail. Tested-by: Shaopeng Tan Signed-off-by: James Morse Reviewed-By: Peter Newman Tested-By: Peter Newman --- Changes since v2: * Use cpumask_any_housekeeping() and fallback to an IPI if needed --- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 19 +++++++++++++++++-- arch/x86/kernel/cpu/resctrl/internal.h | 2 +- arch/x86/kernel/cpu/resctrl/monitor.c | 6 ++++-- 3 files changed, 22 insertions(+), 5 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c index eb07d4435391..b06e86839d00 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -19,6 +19,7 @@ #include #include #include +#include #include "internal.h" /* @@ -527,8 +528,13 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, struct rdt_domain *d, struct rdtgroup *rdtgrp, int evtid, int first) { + int cpu; + + /* When picking a CPU from cpu_mask, ensure it can't race with cpuhp */ + lockdep_assert_held(&rdtgroup_mutex); + /* - * setup the parameters to send to the IPI to read the data. + * setup the parameters to pass to mon_event_count() to read the data. */ rr->rgrp = rdtgrp; rr->evtid = evtid; @@ -537,7 +543,16 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, rr->val = 0; rr->first = first; - smp_call_function_any(&d->cpu_mask, mon_event_count, rr, 1); + cpu = get_cpu(); + if (cpumask_test_cpu(cpu, &d->cpu_mask)) { + mon_event_count(rr); + put_cpu(); + } else { + put_cpu(); + + cpu = cpumask_any_housekeeping(&d->cpu_mask); + smp_call_on_cpu(cpu, mon_event_count, rr, false); + } } int rdtgroup_mondata_show(struct seq_file *m, void *arg) diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 0b5fd5a0cda2..a07557390895 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -563,7 +563,7 @@ int alloc_rmid(u32 closid); void free_rmid(u32 closid, u32 rmid); int rdt_get_mon_l3_config(struct rdt_resource *r); bool __init rdt_cpu_has(int flag); -void mon_event_count(void *info); +int mon_event_count(void *info); int rdtgroup_mondata_show(struct seq_file *m, void *arg); void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, struct rdt_domain *d, struct rdtgroup *rdtgrp, diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 3bec5c59ca0e..5e9e876c3409 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -550,10 +550,10 @@ static void mbm_bw_count(u32 closid, u32 rmid, struct rmid_read *rr) } /* - * This is called via IPI to read the CQM/MBM counters + * This is scheduled by mon_event_read() to read the CQM/MBM counters * on a domain. 
*/ -void mon_event_count(void *info) +int mon_event_count(void *info) { struct rdtgroup *rdtgrp, *entry; struct rmid_read *rr = info; @@ -586,6 +586,8 @@ void mon_event_count(void *info) */ if (ret == 0) rr->err = 0; + + return 0; } /* From patchwork Mon Mar 20 17:26:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Morse X-Patchwork-Id: 72333 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp1355672wrt; Mon, 20 Mar 2023 11:03:40 -0700 (PDT) X-Google-Smtp-Source: AK7set/NBRUjqW4Y8nB/3IcAadUmsIIPv3cuQEN7dY2J9KbdAQKq/CMwi9EQZAAyd2Ymrg7wRQsd X-Received: by 2002:a05:6a20:1b30:b0:d3:a13a:4c35 with SMTP id ch48-20020a056a201b3000b000d3a13a4c35mr348554pzb.6.1679335419825; Mon, 20 Mar 2023 11:03:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1679335419; cv=none; d=google.com; s=arc-20160816; b=dloLXKizPDXuzrqbhL9W30BUS4AsmkihFqcbG/iDHRyrfY61aXuncKQtpJBtUu8xde pjEWYvRYfQVdEVlt95wRu7T9Q43uNGN9emqRU/cfwpqOGQSpZ6iyeFgyA0ZclQ5kyZAm nvn/72h4VSdOcqloSrU5BnpB+Vl3VbgaXqRwOHxCu7lK6c0wM/SksHqJtlDtODHnzd/K AH5ahiOjuhkUTbVQyjXfLv+HF4/+tKbSBiYougN9rk1RqtarAmPqk2bmVkq5Q57rQ/yP K8x9gaw7UAyP+noNBvEFzgqxVEWoAUPTxNYzZh+VSBOzK5TzoCojvUMudXfsEI07VQIT utGA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=a6lmvxTaH/CgYqU6cQ9uWiCyW+tuo0ab4N1J0wE19Jc=; b=egVk5ZSxkULxdEo5AUrPer3BxYnDRI6t7ytdFB3/m4ypYepILQJRc+rQc07fbteDdP SBhZnSHkykP7hGqE4X1SHBBhvbjjXL+bJKZt1GZOvVxdrAV80LON9+oCSG0AV8uiH11L w6Oy2OZD/zsp1JgjJrSdJEIBcewe1gV9s0JJIJ6NIA8wJ2ARjczJ/NdThb51OY731/7F it7tWJoELXEqV3f/kQ8mBJ25uM9bP09G7Q6lB6cAnr9zORnzmGJBsduju6XMYEkLCkLP FPDs7MVSPym++eQ0sZUe/UN390w3vxefysFs4WCzZJXrak5Down4A3gW3IFdCitKLQqE KPLA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id h11-20020a65480b000000b00502d6b2edd4si10392243pgs.804.2023.03.20.11.03.22; Mon, 20 Mar 2023 11:03:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229993AbjCTRrQ (ORCPT + 99 others); Mon, 20 Mar 2023 13:47:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33590 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229872AbjCTRqb (ORCPT ); Mon, 20 Mar 2023 13:46:31 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 76C1A3B220 for ; Mon, 20 Mar 2023 10:42:48 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 220151682; Mon, 20 Mar 2023 10:28:23 -0700 (PDT) Received: from merodach.members.linode.com (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id A0DC53F67D; Mon, 20 Mar 2023 10:27:36 -0700 (PDT) From: James Morse To: x86@kernel.org, linux-kernel@vger.kernel.org Cc: Fenghua Yu , Reinette Chatre , Thomas Gleixner , Ingo Molnar , Borislav Petkov , H Peter Anvin , Babu Moger , James Morse , shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS , carl@os.amperecomputing.com, lcherian@marvell.com, bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com, Jamie Iles , Xin Hao , peternewman@google.com Subject: [PATCH v3 10/19] x86/resctrl: Allow resctrl_arch_rmid_read() to sleep Date: Mon, 20 Mar 2023 17:26:11 +0000 Message-Id: <20230320172620.18254-11-james.morse@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20230320172620.18254-1-james.morse@arm.com> References: <20230320172620.18254-1-james.morse@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760910817116735123?= X-GMAIL-MSGID: =?utf-8?q?1760910817116735123?= MPAM's cache occupancy counters can take a little while to settle once the monitor has been configured. The maximum settling time is described to the driver via a firmware table. The value could be large enough that it makes sense to sleep. To avoid exposing this to resctrl, it should be hidden behind MPAM's resctrl_arch_rmid_read(). But add_rmid_to_limbo() calls resctrl_arch_rmid_read() from a non-preemptible context. add_rmid_to_limbo() is opportunistically reading the L3 occupancy counter on this domain to avoid adding the RMID to limbo if this domain's value has drifted below resctrl_rmid_realloc_threshold since the limbo handler last ran. Determining 'this domain' involves disabling preeption to prevent the thread being migrated to CPUs in a different domain between the check and resctrl_arch_rmid_read() call. 
The check is skipped for all remote domains. Instead, call resctrl_arch_rmid_read() for each domain, and get it to read the arch specific counter via IPI if its called on a CPU outside the target domain. By covering remote domains, this change stops the limbo handler from being started unnecessarily if a remote domain is below the threshold. This also allows resctrl_arch_rmid_read() to sleep. Tested-by: Shaopeng Tan Signed-off-by: James Morse --- The alternative is to remove the counter read from this path altogether, and assume user-space would never try to re-allocate the last RMID before the limbo handler runs next. --- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 12 +----- arch/x86/kernel/cpu/resctrl/monitor.c | 48 +++++++++++++++-------- 2 files changed, 33 insertions(+), 27 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c index b06e86839d00..9161bc95eea7 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -543,16 +543,8 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, rr->val = 0; rr->first = first; - cpu = get_cpu(); - if (cpumask_test_cpu(cpu, &d->cpu_mask)) { - mon_event_count(rr); - put_cpu(); - } else { - put_cpu(); - - cpu = cpumask_any_housekeeping(&d->cpu_mask); - smp_call_on_cpu(cpu, mon_event_count, rr, false); - } + cpu = cpumask_any_housekeeping(&d->cpu_mask); + smp_call_on_cpu(cpu, mon_event_count, rr, false); } int rdtgroup_mondata_show(struct seq_file *m, void *arg) diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 5e9e876c3409..de72df06b37b 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -253,22 +253,42 @@ static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width) return chunks >> shift; } +struct __rmid_read_arg +{ + u32 rmid; + enum resctrl_event_id eventid; + + u64 msr_val; + int err; +}; + +static void smp_call_rmid_read(void *_arg) +{ + struct __rmid_read_arg *arg = _arg; + + arg->err = __rmid_read(arg->rmid, arg->eventid, &arg->msr_val); +} + int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, u32 closid, u32 rmid, enum resctrl_event_id eventid, u64 *val) { struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d); + struct __rmid_read_arg arg; struct arch_mbm_state *am; u64 msr_val, chunks; - int ret; + int err; - if (!cpumask_test_cpu(smp_processor_id(), &d->cpu_mask)) - return -EINVAL; + arg.rmid = rmid; + arg.eventid = eventid; - ret = __rmid_read(rmid, eventid, &msr_val); - if (ret) - return ret; + err = smp_call_function_any(&d->cpu_mask, smp_call_rmid_read, &arg, true); + if (err) + return err; + if (arg.err) + return arg.err; + msr_val = arg.msr_val; am = get_arch_mbm_state(hw_dom, rmid, eventid); if (am) { @@ -424,23 +444,18 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) { struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; struct rdt_domain *d; - int cpu, err; u64 val = 0; u32 idx; + int err; idx = resctrl_arch_rmid_idx_encode(entry->closid, entry->rmid); entry->busy = 0; - cpu = get_cpu(); list_for_each_entry(d, &r->domains, list) { - if (cpumask_test_cpu(cpu, &d->cpu_mask)) { - err = resctrl_arch_rmid_read(r, d, entry->closid, - entry->rmid, - QOS_L3_OCCUP_EVENT_ID, - &val); - if (err || val <= resctrl_rmid_realloc_threshold) - continue; - } + err = resctrl_arch_rmid_read(r, d, entry->closid, 
entry->rmid, + QOS_L3_OCCUP_EVENT_ID, &val); + if (err || val <= resctrl_rmid_realloc_threshold) + continue; /* * For the first limbo RMID in the domain, @@ -451,7 +466,6 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) set_bit(idx, d->rmid_busy_llc); entry->busy++; } - put_cpu(); if (entry->busy) rmid_limbo_count++; From patchwork Mon Mar 20 17:26:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Morse X-Patchwork-Id: 72335 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp1357850wrt; Mon, 20 Mar 2023 11:07:28 -0700 (PDT) X-Google-Smtp-Source: AK7set+Pge0DrS1+MjCaZY9LX2eDWtWFpXKwsiYvObp94m0pD1IC1SF6wOREniZ+n31Pt66m/3WU X-Received: by 2002:a17:90b:1d08:b0:23f:7ff6:eb8 with SMTP id on8-20020a17090b1d0800b0023f7ff60eb8mr7797pjb.31.1679335648034; Mon, 20 Mar 2023 11:07:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1679335648; cv=none; d=google.com; s=arc-20160816; b=u44MjynVkkPFVoTsbJOtCBMP0YM2028k7evGyhggf0oTwHlcdds7dlD4hd9UL/T2ec lOJ1oH3Cmh5hWo4yiuYrtSxsfy/rD/rgu4SY6/uleSnq507W22xDPkqZs4DnX3/SuviF LhUf/ffeObJc9ipZJQLAZlWUhEnvXqbRgoyqXtUCnVQNumMltX2WA5Cd5eFa0vKoste1 QF0NBZBzAzwtq4ba23YF04bDi25wUjKLMJqBT9ACBW63cxj1J/XgDyyU/2qZ+fYHGze0 hg6smWLIVbLRFWn/R35exz61XsH0sKzOFXA5zYTT558LkNGnfgFc6WY4s7/586MxY71F 38aw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=bJItYQHAP+7aKuy5H7MThtPKHiarZSddVhWkv5t0F3I=; b=CqIkPLRBuGvX7hkWWBIXY24hgGcJZKrPeVvYo0g6NQDd+J0IUWTAm9yuyE5MI8FY8Q 3NBndKkJSUaFlTaHgYxOgkRAPcAAwnLn5HPWKI51neR7QDQHu+Ls8Y760y0LGicplHMY RIVExjkMkJETpBZBJm6iZtyZJMAS3TNBZhSy9r4dVEQ4c7g9TGbjy04XvEquDXX77mR6 skXTyVM+Hcql7xHEdMRy5N9DX4AvBBwjhU91C+AMv9X8GPERCIcFavUgPP3LeDaPDNgO r6mdI4MwpdlDdMk/fQhMRBj0hwhVEg8A9/DOaTPFDA4D+RHXMmdYIZLx//MMTn8rN1cE sOYA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id c13-20020a17090ad90d00b0023b30b28b89si10369405pjv.56.2023.03.20.11.07.14; Mon, 20 Mar 2023 11:07:28 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=arm.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229617AbjCTRsV (ORCPT + 99 others); Mon, 20 Mar 2023 13:48:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230135AbjCTRrU (ORCPT ); Mon, 20 Mar 2023 13:47:20 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id ECB8712850 for ; Mon, 20 Mar 2023 10:43:01 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id E22DC168F; Mon, 20 Mar 2023 10:28:25 -0700 (PDT) Received: from merodach.members.linode.com (unknown [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6CE773F67D; Mon, 20 Mar 2023 10:27:39 -0700 (PDT) From: James Morse To: x86@kernel.org, linux-kernel@vger.kernel.org Cc: Fenghua Yu , Reinette Chatre , Thomas Gleixner , Ingo Molnar , Borislav Petkov , H Peter Anvin , Babu Moger , James Morse , shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS , carl@os.amperecomputing.com, lcherian@marvell.com, bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com, Jamie Iles , Xin Hao , peternewman@google.com Subject: [PATCH v3 11/19] x86/resctrl: Allow arch to allocate memory needed in resctrl_arch_rmid_read() Date: Mon, 20 Mar 2023 17:26:12 +0000 Message-Id: <20230320172620.18254-12-james.morse@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20230320172620.18254-1-james.morse@arm.com> References: <20230320172620.18254-1-james.morse@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760911056515193091?= X-GMAIL-MSGID: =?utf-8?q?1760911056515193091?= Depending on the number of monitors available, Arm's MPAM may need to allocate a monitor prior to reading the counter value. Allocating a contended resource may involve sleeping. All callers of resctrl_arch_rmid_read() read the counter on more than one domain. If the monitor is allocated globally, there is no need to allocate and free it for each call to resctrl_arch_rmid_read(). Add arch hooks for this allocation, which need calling before resctrl_arch_rmid_read(). The allocated monitor is passed to resctrl_arch_rmid_read(), then freed again afterwards. The helper can be called on any CPU, and can sleep. 
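Purely as an illustration of the shape these hooks could take on an architecture with a limited pool of monitors (everything below is an assumption: the pool size, the example_* names and the locking are not part of this series), the allocation side might look like:

#define EXAMPLE_NUM_MON	16	/* invented pool size */

static DEFINE_MUTEX(example_mon_lock);
static unsigned long example_mon_map[BITS_TO_LONGS(EXAMPLE_NUM_MON)];

static int example_arch_mon_ctx_alloc(struct rdt_resource *r, int evtid)
{
	int mon;

	might_sleep();	/* taking the mutex may sleep, as described above */

	mutex_lock(&example_mon_lock);
	mon = find_first_zero_bit(example_mon_map, EXAMPLE_NUM_MON);
	if (mon < EXAMPLE_NUM_MON)
		set_bit(mon, example_mon_map);
	mutex_unlock(&example_mon_lock);

	return mon < EXAMPLE_NUM_MON ? mon : -ENOSPC;
}

static void example_arch_mon_ctx_free(struct rdt_resource *r, int evtid, int ctx)
{
	mutex_lock(&example_mon_lock);
	clear_bit(ctx, example_mon_map);
	mutex_unlock(&example_mon_lock);
}

The returned integer is what this patch threads through struct rmid_read as arch_mon_ctx and passes to resctrl_arch_rmid_read().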
Tested-by: Shaopeng Tan Signed-off-by: James Morse --- arch/x86/include/asm/resctrl.h | 11 +++++++ arch/x86/kernel/cpu/resctrl/internal.h | 1 + arch/x86/kernel/cpu/resctrl/monitor.c | 40 +++++++++++++++++++++++--- include/linux/resctrl.h | 4 +-- 4 files changed, 50 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h index 752123b0ce40..1c87f1626456 100644 --- a/arch/x86/include/asm/resctrl.h +++ b/arch/x86/include/asm/resctrl.h @@ -136,6 +136,17 @@ static inline u32 resctrl_arch_rmid_idx_encode(u32 ignored, u32 rmid) return rmid; } +/* x86 can always read an rmid, nothing needs allocating */ +struct rdt_resource; +static inline int resctrl_arch_mon_ctx_alloc(struct rdt_resource *r, int evtid) +{ + might_sleep(); + return 0; +}; + +static inline void resctrl_arch_mon_ctx_free(struct rdt_resource *r, int evtid, + int ctx) { }; + void resctrl_cpu_detect(struct cpuinfo_x86 *c); #else diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index a07557390895..7262b355e128 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -135,6 +135,7 @@ struct rmid_read { bool first; int err; u64 val; + int arch_mon_ctx; }; extern bool rdt_alloc_capable; diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index de72df06b37b..f38cd2f12285 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -15,6 +15,7 @@ * Software Developer Manual June 2016, volume 3, section 17.17. */ +#include #include #include #include @@ -271,7 +272,7 @@ static void smp_call_rmid_read(void *_arg) int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, u32 closid, u32 rmid, enum resctrl_event_id eventid, - u64 *val) + u64 *val, int ignored) { struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r); struct rdt_hw_domain *hw_dom = resctrl_to_arch_dom(d); @@ -317,9 +318,14 @@ void __check_limbo(struct rdt_domain *d, bool force_free) u32 idx_limit = resctrl_arch_system_num_rmid_idx(); struct rmid_entry *entry; u32 idx, cur_idx = 1; + int arch_mon_ctx; bool rmid_dirty; u64 val = 0; + arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, QOS_L3_OCCUP_EVENT_ID); + if (arch_mon_ctx < 0) + return; + /* * Skip RMID 0 and start from RMID 1 and check all the RMIDs that * are marked as busy for occupancy < threshold. 
If the occupancy @@ -333,7 +339,8 @@ void __check_limbo(struct rdt_domain *d, bool force_free) entry = __rmid_entry(idx); if (resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid, - QOS_L3_OCCUP_EVENT_ID, &val)) { + QOS_L3_OCCUP_EVENT_ID, &val, + arch_mon_ctx)) { rmid_dirty = true; } else { rmid_dirty = (val >= resctrl_rmid_realloc_threshold); @@ -348,6 +355,8 @@ void __check_limbo(struct rdt_domain *d, bool force_free) } cur_idx = idx + 1; } + + resctrl_arch_mon_ctx_free(r, QOS_L3_OCCUP_EVENT_ID, arch_mon_ctx); } bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d) @@ -444,16 +453,22 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) { struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; struct rdt_domain *d; + int arch_mon_ctx; u64 val = 0; u32 idx; int err; idx = resctrl_arch_rmid_idx_encode(entry->closid, entry->rmid); + arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, QOS_L3_OCCUP_EVENT_ID); + if (arch_mon_ctx < 0) + return; + entry->busy = 0; list_for_each_entry(d, &r->domains, list) { err = resctrl_arch_rmid_read(r, d, entry->closid, entry->rmid, - QOS_L3_OCCUP_EVENT_ID, &val); + QOS_L3_OCCUP_EVENT_ID, &val, + arch_mon_ctx); if (err || val <= resctrl_rmid_realloc_threshold) continue; @@ -466,6 +481,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) set_bit(idx, d->rmid_busy_llc); entry->busy++; } + resctrl_arch_mon_ctx_free(r, QOS_L3_OCCUP_EVENT_ID, arch_mon_ctx); if (entry->busy) rmid_limbo_count++; @@ -502,7 +518,7 @@ static int __mon_event_count(u32 closid, u32 rmid, struct rmid_read *rr) resctrl_arch_reset_rmid(rr->r, rr->d, closid, rmid, rr->evtid); rr->err = resctrl_arch_rmid_read(rr->r, rr->d, closid, rmid, rr->evtid, - &tval); + &tval, rr->arch_mon_ctx); if (rr->err) return rr->err; @@ -575,6 +591,9 @@ int mon_event_count(void *info) int ret; rdtgrp = rr->rgrp; + rr->arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr->r, rr->evtid); + if (rr->arch_mon_ctx < 0) + return rr->arch_mon_ctx; ret = __mon_event_count(rdtgrp->closid, rdtgrp->mon.rmid, rr); @@ -601,6 +620,8 @@ int mon_event_count(void *info) if (ret == 0) rr->err = 0; + resctrl_arch_mon_ctx_free(rr->r, rr->evtid, rr->arch_mon_ctx); + return 0; } @@ -737,11 +758,21 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, if (is_mbm_total_enabled()) { rr.evtid = QOS_L3_MBM_TOTAL_EVENT_ID; rr.val = 0; + rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid); + if (rr.arch_mon_ctx < 0) + return; + __mon_event_count(closid, rmid, &rr); + + resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx); } if (is_mbm_local_enabled()) { rr.evtid = QOS_L3_MBM_LOCAL_EVENT_ID; rr.val = 0; + rr.arch_mon_ctx = resctrl_arch_mon_ctx_alloc(rr.r, rr.evtid); + if (rr.arch_mon_ctx < 0) + return; + __mon_event_count(closid, rmid, &rr); /* @@ -751,6 +782,7 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, */ if (is_mba_sc(NULL)) mbm_bw_count(closid, rmid, &rr); + resctrl_arch_mon_ctx_free(rr.r, rr.evtid, rr.arch_mon_ctx); } } diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index ff7452f644e4..03e4f41cd336 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -233,6 +233,7 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d); * @rmid: rmid of the counter to read. * @eventid: eventid to read, e.g. L3 occupancy. * @val: result of the counter read in bytes. + * @arch_mon_ctx: An allocated context from resctrl_arch_mon_ctx_alloc(). * * Call from process context on a CPU that belongs to domain @d. 
* @@ -241,8 +242,7 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d); */ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain *d, u32 closid, u32 rmid, enum resctrl_event_id eventid, - u64 *val); - + u64 *val, int arch_mon_ctx); /** * resctrl_arch_reset_rmid() - Reset any private state associated with rmid

From patchwork Mon Mar 20 17:26:13 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 72339
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 12/19] x86/resctrl: Make resctrl_mounted checks explicit
Date: Mon, 20 Mar 2023 17:26:13 +0000
Message-Id: <20230320172620.18254-13-james.morse@arm.com>
In-Reply-To: <20230320172620.18254-1-james.morse@arm.com>
References: <20230320172620.18254-1-james.morse@arm.com>

The rdt_enable_key is switched when resctrl is mounted, and used to prevent a second mount of the filesystem. It also enables the architecture's context switch code. This requires another architecture to have the same set of static-keys, as resctrl depends on them too.

Make the resctrl_mounted checks explicit: resctrl can keep track of whether it has been mounted once. This doesn't need to be combined with whether the arch code is context switching the CLOSID.

Tests against the rdt_mon_enable_key become a test that resctrl is mounted and that monitoring is enabled. This will allow the static-key changing to be moved behind resctrl_arch_ calls.
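To make the intent concrete, here is a minimal user-space C sketch of the pattern this patch introduces: a plain boolean records whether the filesystem has been mounted once, independent of any context-switch static keys. The names demo_mounted, demo_mount() and demo_show_status() are invented for the illustration and do not appear in the kernel; only the "mounted once" rule and the empty "res:\nmon:\n" output come from the patch below.

#include <stdbool.h>
#include <stdio.h>

/* Illustrative only: a plain flag records "mounted once"; no static key is needed. */
static bool demo_mounted;

static int demo_mount(void)
{
	if (demo_mounted)
		return -1;	/* the filesystem can only be mounted once (-EBUSY in the kernel) */
	demo_mounted = true;
	return 0;
}

static void demo_show_status(void)
{
	/* Print the empty form if the filesystem has not been mounted. */
	if (!demo_mounted) {
		printf("res:\nmon:\n");
		return;
	}
	printf("res:L3\nmon:mon_L3_00\n");	/* invented output, for illustration */
}

int main(void)
{
	demo_show_status();		/* not mounted yet: prints the empty form */
	if (demo_mount() == 0)
		demo_show_status();	/* mounted: prints the populated form */
	return 0;
}

Whether the architecture also flips a static key for its context-switch path is a separate decision, which is the point of keeping the two apart.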
Tested-by: Shaopeng Tan Signed-off-by: James Morse --- arch/x86/kernel/cpu/resctrl/internal.h | 1 + arch/x86/kernel/cpu/resctrl/monitor.c | 5 +++-- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 17 +++++++++++------ 3 files changed, 15 insertions(+), 8 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 7262b355e128..7d5188e8bec3 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -142,6 +142,7 @@ extern bool rdt_alloc_capable; extern bool rdt_mon_capable; extern unsigned int rdt_mon_features; extern struct list_head resctrl_schema_all; +extern bool resctrl_mounted; enum rdt_group_type { RDTCTRL_GROUP = 0, diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index f38cd2f12285..6279f5c98b39 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -834,7 +834,7 @@ void mbm_handle_overflow(struct work_struct *work) mutex_lock(&rdtgroup_mutex); - if (!static_branch_likely(&rdt_mon_enable_key)) + if (!resctrl_mounted || !static_branch_likely(&rdt_mon_enable_key)) goto out_unlock; r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; @@ -867,8 +867,9 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms) unsigned long delay = msecs_to_jiffies(delay_ms); int cpu; - if (!static_branch_likely(&rdt_mon_enable_key)) + if (!resctrl_mounted || !static_branch_likely(&rdt_mon_enable_key)) return; + cpu = cpumask_any_housekeeping(&dom->cpu_mask); dom->mbm_work_cpu = cpu; schedule_delayed_work_on(cpu, &dom->mbm_over, delay); diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 2306fbc9a9bb..5176a85f281c 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -42,6 +42,9 @@ LIST_HEAD(rdt_all_groups); /* list of entries for the schemata file */ LIST_HEAD(resctrl_schema_all); +/* the filesystem can only be mounted once */ +bool resctrl_mounted; + /* Kernel fs node for "info" directory under root */ static struct kernfs_node *kn_info; @@ -796,7 +799,7 @@ int proc_resctrl_show(struct seq_file *s, struct pid_namespace *ns, mutex_lock(&rdtgroup_mutex); /* Return empty if resctrl has not been mounted. */ - if (!static_branch_unlikely(&rdt_enable_key)) { + if (!resctrl_mounted) { seq_puts(s, "res:\nmon:\n"); goto unlock; } @@ -2463,7 +2466,7 @@ static int rdt_get_tree(struct fs_context *fc) /* * resctrl file system can only be mounted once. */ - if (static_branch_unlikely(&rdt_enable_key)) { + if (resctrl_mounted) { ret = -EBUSY; goto out; } @@ -2511,8 +2514,10 @@ static int rdt_get_tree(struct fs_context *fc) if (rdt_mon_capable) static_branch_enable_cpuslocked(&rdt_mon_enable_key); - if (rdt_alloc_capable || rdt_mon_capable) + if (rdt_alloc_capable || rdt_mon_capable) { static_branch_enable_cpuslocked(&rdt_enable_key); + resctrl_mounted = true; + } if (is_mbm_enabled()) { r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; @@ -2783,6 +2788,7 @@ static void rdt_kill_sb(struct super_block *sb) static_branch_disable_cpuslocked(&rdt_alloc_enable_key); static_branch_disable_cpuslocked(&rdt_mon_enable_key); static_branch_disable_cpuslocked(&rdt_enable_key); + resctrl_mounted = false; kernfs_kill_sb(sb); mutex_unlock(&rdtgroup_mutex); cpus_read_unlock(); @@ -3610,7 +3616,7 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d) * If resctrl is mounted, remove all the * per domain monitor data directories. 
*/ - if (static_branch_unlikely(&rdt_mon_enable_key)) + if (resctrl_mounted && static_branch_unlikely(&rdt_mon_enable_key)) rmdir_mondata_subdir_allrdtgrp(r, d->id); if (is_mbm_enabled()) @@ -3687,8 +3693,7 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) if (is_llc_occupancy_enabled()) INIT_DELAYED_WORK(&d->cqm_limbo, cqm_handle_limbo); - /* If resctrl is mounted, add per domain monitor data directories. */ - if (static_branch_unlikely(&rdt_mon_enable_key)) + if (resctrl_mounted && static_branch_unlikely(&rdt_mon_enable_key)) mkdir_mondata_subdir_allrdtgrp(r, d); return 0;

From patchwork Mon Mar 20 17:26:14 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 72331
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 13/19] x86/resctrl: Move alloc/mon static keys into helpers
Date: Mon, 20 Mar 2023 17:26:14 +0000
Message-Id: <20230320172620.18254-14-james.morse@arm.com>
In-Reply-To: <20230320172620.18254-1-james.morse@arm.com>
References: <20230320172620.18254-1-james.morse@arm.com>

resctrl enables three static keys depending on the features it has enabled. Another architecture's context switch code may look different; any static keys that control it should be buried behind helpers.

Move the alloc/mon logic into arch-specific helpers as a preparatory step for making the rdt_enable_key's status something the arch code decides.

This means other architectures don't have to mirror the static keys.
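As a sketch of what this buys other architectures, the stubs below show one hypothetical implementation of the same helper names for an architecture whose context-switch path does not use static keys. Only the helper names come from this patch; the bodies and the example_* flags are invented for illustration.

/*
 * Hypothetical arch-side implementation: resctrl calls the same helpers,
 * but this architecture tracks the state in plain flags instead of the
 * x86 static keys.
 */
static bool example_alloc_enabled;
static bool example_mon_enabled;

static inline void resctrl_arch_enable_alloc(void)
{
	example_alloc_enabled = true;
}

static inline void resctrl_arch_disable_alloc(void)
{
	example_alloc_enabled = false;
}

static inline void resctrl_arch_enable_mon(void)
{
	example_mon_enabled = true;
}

static inline void resctrl_arch_disable_mon(void)
{
	example_mon_enabled = false;
}

resctrl only calls the helpers; how the enable/disable state is represented, and whether it feeds a static key, becomes the architecture's business.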
Tested-by: Shaopeng Tan Signed-off-by: James Morse --- arch/x86/include/asm/resctrl.h | 20 ++++++++++++++++++++ arch/x86/kernel/cpu/resctrl/internal.h | 5 ----- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 8 ++++---- 3 files changed, 24 insertions(+), 9 deletions(-) diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h index 1c87f1626456..5fdfcd5f943e 100644 --- a/arch/x86/include/asm/resctrl.h +++ b/arch/x86/include/asm/resctrl.h @@ -42,6 +42,26 @@ DECLARE_STATIC_KEY_FALSE(rdt_enable_key); DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key); DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key); +static inline void resctrl_arch_enable_alloc(void) +{ + static_branch_enable_cpuslocked(&rdt_alloc_enable_key); +} + +static inline void resctrl_arch_disable_alloc(void) +{ + static_branch_disable_cpuslocked(&rdt_alloc_enable_key); +} + +static inline void resctrl_arch_enable_mon(void) +{ + static_branch_enable_cpuslocked(&rdt_mon_enable_key); +} + +static inline void resctrl_arch_disable_mon(void) +{ + static_branch_disable_cpuslocked(&rdt_mon_enable_key); +} + /* * __resctrl_sched_in() - Writes the task's CLOSid/RMID to IA32_PQR_MSR * diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 7d5188e8bec3..c83bd581c1d5 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -92,9 +92,6 @@ static inline struct rdt_fs_context *rdt_fc2context(struct fs_context *fc) return container_of(kfc, struct rdt_fs_context, kfc); } -DECLARE_STATIC_KEY_FALSE(rdt_enable_key); -DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key); - /** * struct mon_evt - Entry in the event list of a resource * @evtid: event id @@ -452,8 +449,6 @@ extern struct mutex rdtgroup_mutex; extern struct rdt_hw_resource rdt_resources_all[]; extern struct rdtgroup rdtgroup_default; -DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key); - extern struct dentry *debugfs_resctrl; enum resctrl_res_level { diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 5176a85f281c..c6c31efb85ac 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -2510,9 +2510,9 @@ static int rdt_get_tree(struct fs_context *fc) goto out_psl; if (rdt_alloc_capable) - static_branch_enable_cpuslocked(&rdt_alloc_enable_key); + resctrl_arch_enable_alloc(); if (rdt_mon_capable) - static_branch_enable_cpuslocked(&rdt_mon_enable_key); + resctrl_arch_enable_mon(); if (rdt_alloc_capable || rdt_mon_capable) { static_branch_enable_cpuslocked(&rdt_enable_key); @@ -2785,8 +2785,8 @@ static void rdt_kill_sb(struct super_block *sb) rdt_pseudo_lock_release(); rdtgroup_default.mode = RDT_MODE_SHAREABLE; schemata_list_destroy(); - static_branch_disable_cpuslocked(&rdt_alloc_enable_key); - static_branch_disable_cpuslocked(&rdt_mon_enable_key); + resctrl_arch_disable_alloc(); + resctrl_arch_disable_mon(); static_branch_disable_cpuslocked(&rdt_enable_key); resctrl_mounted = false; kernfs_kill_sb(sb); From patchwork Mon Mar 20 17:26:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Morse X-Patchwork-Id: 72348 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp1361183wrt; Mon, 20 Mar 2023 11:13:58 -0700 (PDT) X-Google-Smtp-Source: AK7set/HJaSJ9vlnX4HFA8ZAW9qY0y/g/VhFIhvAq4KUxlWHOcnU5gvxWVNgR1itJREMdN0k/fDq X-Received: by 2002:a05:6a20:4fa3:b0:da:2fdf:385e with SMTP id 
gh35-20020a056a204fa300b000da2fdf385emr1615464pzb.49.1679336037864; Mon, 20 Mar 2023 11:13:57 -0700 (PDT)
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 14/19] x86/resctrl: Make rdt_enable_key the arch's decision to switch
Date: Mon, 20 Mar 2023 17:26:15 +0000
Message-Id:
<20230320172620.18254-15-james.morse@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20230320172620.18254-1-james.morse@arm.com> References: <20230320172620.18254-1-james.morse@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760911465010858297?= X-GMAIL-MSGID: =?utf-8?q?1760911465010858297?= rdt_enable_key is switched when resctrl is mounted. It was also previously used to prevent a second mount of the filesystem. Any other architecture that wants to support resctrl has to provide identical static keys. Now that there are helpers for enabling and disabling the alloc/mon keys, resctrl doesn't need to switch this extra key, it can be done by the arch code. Use the static-key increment and decrement helpers, and change resctrl to ensure the calls are balanced. Tested-by: Shaopeng Tan Signed-off-by: James Morse --- arch/x86/include/asm/resctrl.h | 4 ++++ arch/x86/kernel/cpu/resctrl/rdtgroup.c | 11 +++++------ 2 files changed, 9 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h index 5fdfcd5f943e..147af2b43385 100644 --- a/arch/x86/include/asm/resctrl.h +++ b/arch/x86/include/asm/resctrl.h @@ -45,21 +45,25 @@ DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key); static inline void resctrl_arch_enable_alloc(void) { static_branch_enable_cpuslocked(&rdt_alloc_enable_key); + static_branch_inc_cpuslocked(&rdt_enable_key); } static inline void resctrl_arch_disable_alloc(void) { static_branch_disable_cpuslocked(&rdt_alloc_enable_key); + static_branch_dec_cpuslocked(&rdt_enable_key); } static inline void resctrl_arch_enable_mon(void) { static_branch_enable_cpuslocked(&rdt_mon_enable_key); + static_branch_inc_cpuslocked(&rdt_enable_key); } static inline void resctrl_arch_disable_mon(void) { static_branch_disable_cpuslocked(&rdt_mon_enable_key); + static_branch_dec_cpuslocked(&rdt_enable_key); } /* diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index c6c31efb85ac..2ca8981c7d0d 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -2514,10 +2514,8 @@ static int rdt_get_tree(struct fs_context *fc) if (rdt_mon_capable) resctrl_arch_enable_mon(); - if (rdt_alloc_capable || rdt_mon_capable) { - static_branch_enable_cpuslocked(&rdt_enable_key); + if (rdt_alloc_capable || rdt_mon_capable) resctrl_mounted = true; - } if (is_mbm_enabled()) { r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; @@ -2785,9 +2783,10 @@ static void rdt_kill_sb(struct super_block *sb) rdt_pseudo_lock_release(); rdtgroup_default.mode = RDT_MODE_SHAREABLE; schemata_list_destroy(); - resctrl_arch_disable_alloc(); - resctrl_arch_disable_mon(); - static_branch_disable_cpuslocked(&rdt_enable_key); + if (rdt_alloc_capable) + resctrl_arch_disable_alloc(); + if (rdt_mon_capable) + resctrl_arch_disable_mon(); resctrl_mounted = false; kernfs_kill_sb(sb); mutex_unlock(&rdtgroup_mutex); From patchwork Mon Mar 20 17:26:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Morse X-Patchwork-Id: 72332 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 
2002:a5d:604a:0:0:0:0:0 with SMTP id j10csp1351132wrt; Mon, 20 Mar 2023 10:54:35 -0700 (PDT)
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Fenghua Yu , Reinette Chatre , Thomas Gleixner , Ingo Molnar , Borislav Petkov , H Peter Anvin , Babu Moger , James Morse , shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS , carl@os.amperecomputing.com, lcherian@marvell.com, bobo.shaobowang@huawei.com,
tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com, Jamie Iles , Xin Hao , peternewman@google.com Subject: [PATCH v3 15/19] x86/resctrl: Add helpers for system wide mon/alloc capable Date: Mon, 20 Mar 2023 17:26:16 +0000 Message-Id: <20230320172620.18254-16-james.morse@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20230320172620.18254-1-james.morse@arm.com> References: <20230320172620.18254-1-james.morse@arm.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1760910245705569634?= X-GMAIL-MSGID: =?utf-8?q?1760910245705569634?= resctrl reads rdt_alloc_capable or rdt_mon_capable to determine whether any of the resources support the corresponding features. resctrl also uses the static-keys that affect the architecture's context-switch code to determine the same thing. This forces another architecture to have the same static-keys. As the static-key is enabled based on the capable flag, and none of the filesystem uses of these are in the scheduler path, move the capable flags behind helpers, and use these in the filesystem code instead of the static-key. After this change, only the architecture code manages and uses the static-keys to ensure __resctrl_sched_in() does not need runtime checks. This avoids multiple architectures having to define the same static-keys. Tested-by: Shaopeng Tan Reviewed-by: Shaopeng Tan Signed-off-by: James Morse --- Changes since v1: * Added missing conversion in mkdir_rdt_prepare_rmid_free() --- arch/x86/include/asm/resctrl.h | 13 +++++++++ arch/x86/kernel/cpu/resctrl/internal.h | 2 -- arch/x86/kernel/cpu/resctrl/monitor.c | 4 +-- arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 6 ++-- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 34 +++++++++++------------ 5 files changed, 35 insertions(+), 24 deletions(-) diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h index 147af2b43385..4355245652c9 100644 --- a/arch/x86/include/asm/resctrl.h +++ b/arch/x86/include/asm/resctrl.h @@ -38,10 +38,18 @@ struct resctrl_pqr_state { DECLARE_PER_CPU(struct resctrl_pqr_state, pqr_state); +extern bool rdt_alloc_capable; +extern bool rdt_mon_capable; + DECLARE_STATIC_KEY_FALSE(rdt_enable_key); DECLARE_STATIC_KEY_FALSE(rdt_alloc_enable_key); DECLARE_STATIC_KEY_FALSE(rdt_mon_enable_key); +static inline bool resctrl_arch_alloc_capable(void) +{ + return rdt_alloc_capable; +} + static inline void resctrl_arch_enable_alloc(void) { static_branch_enable_cpuslocked(&rdt_alloc_enable_key); @@ -54,6 +62,11 @@ static inline void resctrl_arch_disable_alloc(void) static_branch_dec_cpuslocked(&rdt_enable_key); } +static inline bool resctrl_arch_mon_capable(void) +{ + return rdt_mon_capable; +} + static inline void resctrl_arch_enable_mon(void) { static_branch_enable_cpuslocked(&rdt_mon_enable_key); diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index c83bd581c1d5..3eb5b307b809 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -135,8 +135,6 @@ struct rmid_read { int arch_mon_ctx; }; -extern bool rdt_alloc_capable; -extern bool rdt_mon_capable; extern unsigned int rdt_mon_features; extern struct list_head 
resctrl_schema_all; extern bool resctrl_mounted; diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 6279f5c98b39..f0f2e61b15d5 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -834,7 +834,7 @@ void mbm_handle_overflow(struct work_struct *work) mutex_lock(&rdtgroup_mutex); - if (!resctrl_mounted || !static_branch_likely(&rdt_mon_enable_key)) + if (!resctrl_mounted || !resctrl_arch_mon_capable()) goto out_unlock; r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; @@ -867,7 +867,7 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms) unsigned long delay = msecs_to_jiffies(delay_ms); int cpu; - if (!resctrl_mounted || !static_branch_likely(&rdt_mon_enable_key)) + if (!resctrl_mounted || !resctrl_arch_mon_capable()) return; cpu = cpumask_any_housekeeping(&dom->cpu_mask); diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c index 3b724a40d3a2..0b4fdb118643 100644 --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -567,7 +567,7 @@ static int rdtgroup_locksetup_user_restrict(struct rdtgroup *rdtgrp) if (ret) goto err_cpus; - if (rdt_mon_capable) { + if (resctrl_arch_mon_capable()) { ret = rdtgroup_kn_mode_restrict(rdtgrp, "mon_groups"); if (ret) goto err_cpus_list; @@ -614,7 +614,7 @@ static int rdtgroup_locksetup_user_restore(struct rdtgroup *rdtgrp) if (ret) goto err_cpus; - if (rdt_mon_capable) { + if (resctrl_arch_mon_capable()) { ret = rdtgroup_kn_mode_restore(rdtgrp, "mon_groups", 0777); if (ret) goto err_cpus_list; @@ -762,7 +762,7 @@ int rdtgroup_locksetup_exit(struct rdtgroup *rdtgrp) { int ret; - if (rdt_mon_capable) { + if (resctrl_arch_mon_capable()) { ret = alloc_rmid(rdtgrp->closid); if (ret < 0) { rdt_last_cmd_puts("Out of RMIDs\n"); diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 2ca8981c7d0d..8f319e03b449 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -611,13 +611,13 @@ static int __rdtgroup_move_task(struct task_struct *tsk, static bool is_closid_match(struct task_struct *t, struct rdtgroup *r) { - return (rdt_alloc_capable && (r->type == RDTCTRL_GROUP) && + return (resctrl_arch_alloc_capable() && (r->type == RDTCTRL_GROUP) && resctrl_arch_match_closid(t, r->closid)); } static bool is_rmid_match(struct task_struct *t, struct rdtgroup *r) { - return (rdt_mon_capable && (r->type == RDTMON_GROUP) && + return (resctrl_arch_mon_capable() && (r->type == RDTMON_GROUP) && resctrl_arch_match_rmid(t, r->mon.parent->closid, r->mon.rmid)); } @@ -2487,7 +2487,7 @@ static int rdt_get_tree(struct fs_context *fc) if (ret < 0) goto out_schemata_free; - if (rdt_mon_capable) { + if (resctrl_arch_mon_capable()) { ret = mongroup_create_dir(rdtgroup_default.kn, &rdtgroup_default, "mon_groups", &kn_mongrp); @@ -2509,12 +2509,12 @@ static int rdt_get_tree(struct fs_context *fc) if (ret < 0) goto out_psl; - if (rdt_alloc_capable) + if (resctrl_arch_alloc_capable()) resctrl_arch_enable_alloc(); - if (rdt_mon_capable) + if (resctrl_arch_mon_capable()) resctrl_arch_enable_mon(); - if (rdt_alloc_capable || rdt_mon_capable) + if (resctrl_arch_alloc_capable() || resctrl_arch_mon_capable()) resctrl_mounted = true; if (is_mbm_enabled()) { @@ -2528,10 +2528,10 @@ static int rdt_get_tree(struct fs_context *fc) out_psl: rdt_pseudo_lock_release(); out_mondata: - if (rdt_mon_capable) + if 
(resctrl_arch_mon_capable()) kernfs_remove(kn_mondata); out_mongrp: - if (rdt_mon_capable) + if (resctrl_arch_mon_capable()) kernfs_remove(kn_mongrp); out_info: kernfs_remove(kn_info); @@ -2783,9 +2783,9 @@ static void rdt_kill_sb(struct super_block *sb) rdt_pseudo_lock_release(); rdtgroup_default.mode = RDT_MODE_SHAREABLE; schemata_list_destroy(); - if (rdt_alloc_capable) + if (resctrl_arch_alloc_capable()) resctrl_arch_disable_alloc(); - if (rdt_mon_capable) + if (resctrl_arch_mon_capable()) resctrl_arch_disable_mon(); resctrl_mounted = false; kernfs_kill_sb(sb); @@ -3161,7 +3161,7 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp) { int ret; - if (!rdt_mon_capable) + if (!resctrl_arch_mon_capable()) return 0; ret = alloc_rmid(rdtgrp->closid); @@ -3183,7 +3183,7 @@ static int mkdir_rdt_prepare_rmid_alloc(struct rdtgroup *rdtgrp) static void mkdir_rdt_prepare_rmid_free(struct rdtgroup *rgrp) { - if (rdt_mon_capable) + if (resctrl_arch_mon_capable()) free_rmid(rgrp->closid, rgrp->mon.rmid); } @@ -3349,7 +3349,7 @@ static int rdtgroup_mkdir_ctrl_mon(struct kernfs_node *parent_kn, list_add(&rdtgrp->rdtgroup_list, &rdt_all_groups); - if (rdt_mon_capable) { + if (resctrl_arch_mon_capable()) { /* * Create an empty mon_groups directory to hold the subset * of tasks and cpus to monitor. @@ -3404,14 +3404,14 @@ static int rdtgroup_mkdir(struct kernfs_node *parent_kn, const char *name, * allocation is supported, add a control and monitoring * subdirectory */ - if (rdt_alloc_capable && parent_kn == rdtgroup_default.kn) + if (resctrl_arch_alloc_capable() && parent_kn == rdtgroup_default.kn) return rdtgroup_mkdir_ctrl_mon(parent_kn, name, mode); /* * If RDT monitoring is supported and the parent directory is a valid * "mon_groups" directory, add a monitoring subdirectory. */ - if (rdt_mon_capable && is_mon_groups(parent_kn, name)) + if (resctrl_arch_mon_capable() && is_mon_groups(parent_kn, name)) return rdtgroup_mkdir_mon(parent_kn, name, mode); return -EPERM; @@ -3615,7 +3615,7 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d) * If resctrl is mounted, remove all the * per domain monitor data directories. 
*/ - if (resctrl_mounted && static_branch_unlikely(&rdt_mon_enable_key)) + if (resctrl_mounted && resctrl_arch_mon_capable()) rmdir_mondata_subdir_allrdtgrp(r, d->id); if (is_mbm_enabled()) @@ -3692,7 +3692,7 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) if (is_llc_occupancy_enabled()) INIT_DELAYED_WORK(&d->cqm_limbo, cqm_handle_limbo); - if (resctrl_mounted && static_branch_unlikely(&rdt_mon_enable_key)) + if (resctrl_mounted && resctrl_arch_mon_capable()) mkdir_mondata_subdir_allrdtgrp(r, d); return 0;

From patchwork Mon Mar 20 17:26:17 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 72330
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 16/19] x86/resctrl: Add cpu online callback for resctrl work
Date: Mon, 20 Mar 2023 17:26:17 +0000
Message-Id: <20230320172620.18254-17-james.morse@arm.com>
In-Reply-To: <20230320172620.18254-1-james.morse@arm.com>
References: <20230320172620.18254-1-james.morse@arm.com>

The resctrl architecture specific code may need to create a domain when a CPU comes online; it also needs to reset the CPU's PQR_ASSOC register. The resctrl filesystem code needs to update the rdtgroup_default CPU mask when CPUs are brought online.

Currently this is all done in one function, resctrl_online_cpu(). This will need to be split into architecture and filesystem parts before resctrl can be moved to /fs/.

Pull the rdtgroup_default update work out as a filesystem specific cpu_online helper. resctrl_online_cpu() is the obvious name for this, which means the version in core.c needs renaming.

resctrl_online_cpu() is called by the arch code once it has done the work to add the new CPU to any domains. In future patches, resctrl_online_cpu() will take the rdtgroup_mutex itself.
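To illustrate the split, the sketch below shows how an architecture's hotplug callback could do its own work and then hand over to the filesystem helper added here. example_arch_add_cpu_to_domains() and example_arch_reset_cpu() are invented stand-ins for the architecture-specific steps; only resctrl_online_cpu() and rdtgroup_mutex come from this series.

/* Invented stand-ins for the architecture-specific steps: */
static void example_arch_add_cpu_to_domains(unsigned int cpu);
static void example_arch_reset_cpu(unsigned int cpu);

/* Hypothetical arch-side hotplug callback, mirroring the shape of this patch. */
static int example_resctrl_arch_online_cpu(unsigned int cpu)
{
	int err;

	mutex_lock(&rdtgroup_mutex);

	example_arch_add_cpu_to_domains(cpu);	/* like domain_add_cpu() on x86 */
	example_arch_reset_cpu(cpu);		/* like clear_closid_rmid() on x86 */

	/* Filesystem part: updates rdtgroup_default's CPU mask. */
	err = resctrl_online_cpu(cpu);

	mutex_unlock(&rdtgroup_mutex);

	return err;
}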
Tested-by: Shaopeng Tan Signed-off-by: James Morse --- arch/x86/kernel/cpu/resctrl/core.c | 11 ++++++----- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 10 ++++++++++ include/linux/resctrl.h | 1 + 3 files changed, 17 insertions(+), 5 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index 351319403f84..8e25ea49372e 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -603,19 +603,20 @@ static void clear_closid_rmid(int cpu) wrmsr(MSR_IA32_PQR_ASSOC, RESCTRL_RESERVED_CLOSID, 0); } -static int resctrl_online_cpu(unsigned int cpu) +static int resctrl_arch_online_cpu(unsigned int cpu) { struct rdt_resource *r; + int err; mutex_lock(&rdtgroup_mutex); for_each_capable_rdt_resource(r) domain_add_cpu(cpu, r); - /* The cpu is set in default rdtgroup after online. */ - cpumask_set_cpu(cpu, &rdtgroup_default.cpu_mask); clear_closid_rmid(cpu); + + err = resctrl_online_cpu(cpu); mutex_unlock(&rdtgroup_mutex); - return 0; + return err; } static void clear_childcpus(struct rdtgroup *r, unsigned int cpu) @@ -965,7 +966,7 @@ static int __init resctrl_late_init(void) state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/resctrl/cat:online:", - resctrl_online_cpu, resctrl_offline_cpu); + resctrl_arch_online_cpu, resctrl_offline_cpu); if (state < 0) return state; diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 8f319e03b449..410b2b451c30 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -3698,6 +3698,16 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) return 0; } +int resctrl_online_cpu(unsigned int cpu) +{ + lockdep_assert_held(&rdtgroup_mutex); + + /* The cpu is set in default rdtgroup after online. 
*/ + cpumask_set_cpu(cpu, &rdtgroup_default.cpu_mask); + + return 0; +} + /* * rdtgroup_init - rdtgroup initialization * diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index 03e4f41cd336..5a66d034aa61 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -222,6 +222,7 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d, u32 closid, enum resctrl_conf_type type); int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d); void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d); +int resctrl_online_cpu(unsigned int cpu); /** * resctrl_arch_rmid_read() - Read the eventid counter corresponding to rmid

From patchwork Mon Mar 20 17:26:18 2023
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 72347
From: James Morse
To: x86@kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v3 17/19] x86/resctrl: Allow overflow/limbo handlers to be scheduled on any-but cpu
Date: Mon, 20 Mar 2023 17:26:18 +0000
Message-Id: <20230320172620.18254-18-james.morse@arm.com>
In-Reply-To: <20230320172620.18254-1-james.morse@arm.com>
References: <20230320172620.18254-1-james.morse@arm.com>

When a CPU is taken offline resctrl may need to move the overflow or limbo handlers to run on a different CPU.

Once the offline callbacks have been split, cqm_setup_limbo_handler() will be called while the CPU that is going offline is still present in the cpu_mask.

Pass the CPU to exclude to cqm_setup_limbo_handler() and mbm_setup_overflow_handler(). These functions can use a variant of cpumask_any_but() when selecting the CPU. -1 is used to indicate no CPUs need excluding.

Tested-by: Shaopeng Tan Signed-off-by: James Morse --- Changes since v2: * Rephrased a comment to avoid a two letter bad-word.
(we) * Avoid assigning mbm_work_cpu if the domain is going to be free()d * Added cpumask_any_housekeeping_but(), I dislike the name --- arch/x86/kernel/cpu/resctrl/core.c | 8 +++-- arch/x86/kernel/cpu/resctrl/internal.h | 37 ++++++++++++++++++++-- arch/x86/kernel/cpu/resctrl/monitor.c | 43 +++++++++++++++++++++----- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 6 ++-- include/linux/resctrl.h | 3 ++ 5 files changed, 83 insertions(+), 14 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index 8e25ea49372e..aafe4b74587c 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -582,12 +582,16 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r) if (r == &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl) { if (is_mbm_enabled() && cpu == d->mbm_work_cpu) { cancel_delayed_work(&d->mbm_over); - mbm_setup_overflow_handler(d, 0); + /* + * exclude_cpu=-1 as this CPU has already been removed + * by cpumask_clear_cpu()d + */ + mbm_setup_overflow_handler(d, 0, RESCTRL_PICK_ANY_CPU); } if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu && has_busy_rmid(r, d)) { cancel_delayed_work(&d->cqm_limbo); - cqm_setup_limbo_handler(d, 0); + cqm_setup_limbo_handler(d, 0, RESCTRL_PICK_ANY_CPU); } } } diff --git a/arch/x86/kernel/cpu/resctrl/internal.h b/arch/x86/kernel/cpu/resctrl/internal.h index 3eb5b307b809..47838ba6876e 100644 --- a/arch/x86/kernel/cpu/resctrl/internal.h +++ b/arch/x86/kernel/cpu/resctrl/internal.h @@ -78,6 +78,37 @@ static inline unsigned int cpumask_any_housekeeping(const struct cpumask *mask) return cpu; } +/** + * cpumask_any_housekeeping_but() - Chose any cpu in @mask, preferring those + * that aren't marked nohz_full, excluding + * the provided CPU + * @mask: The mask to pick a CPU from. + * @exclude_cpu:The CPU to avoid picking. + * + * Returns a CPU from @mask, but not @but. If there are houskeeping CPUs that + * don't use nohz_full, these are preferred. + * Returns >= nr_cpu_ids if no CPUs are available. 
+ */ +static inline unsigned int +cpumask_any_housekeeping_but(const struct cpumask *mask, int exclude_cpu) +{ + int cpu, hk_cpu; + + cpu = cpumask_any_but(mask, exclude_cpu); + if (tick_nohz_full_cpu(cpu)) { + hk_cpu = cpumask_nth_andnot(0, mask, tick_nohz_full_mask); + if (hk_cpu == exclude_cpu) { + hk_cpu = cpumask_nth_andnot(1, mask, + tick_nohz_full_mask); + } + + if (hk_cpu < nr_cpu_ids) + cpu = hk_cpu; + } + + return cpu; +} + struct rdt_fs_context { struct kernfs_fs_context kfc; bool enable_cdpl2; @@ -564,11 +595,13 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, struct rdt_domain *d, struct rdtgroup *rdtgrp, int evtid, int first); void mbm_setup_overflow_handler(struct rdt_domain *dom, - unsigned long delay_ms); + unsigned long delay_ms, + int exclude_cpu); void mbm_handle_overflow(struct work_struct *work); void __init intel_rdt_mbm_apply_quirk(void); bool is_mba_sc(struct rdt_resource *r); -void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms); +void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms, + int exclude_cpu); void cqm_handle_limbo(struct work_struct *work); bool has_busy_rmid(struct rdt_resource *r, struct rdt_domain *d); void __check_limbo(struct rdt_domain *d, bool force_free); diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index f0f2e61b15d5..11fa5d79c81d 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -477,7 +477,7 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) * setup up the limbo worker. */ if (!has_busy_rmid(r, d)) - cqm_setup_limbo_handler(d, CQM_LIMBOCHECK_INTERVAL); + cqm_setup_limbo_handler(d, CQM_LIMBOCHECK_INTERVAL, -1); set_bit(idx, d->rmid_busy_llc); entry->busy++; } @@ -812,15 +812,28 @@ void cqm_handle_limbo(struct work_struct *work) mutex_unlock(&rdtgroup_mutex); } -void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms) +/** + * cqm_setup_limbo_handler() - Schedule the limbo handler to run for this + * domain. + * @delay_ms: How far in the future the handler should run. + * @exclude_cpu: Which CPU the handler should not run on, -1 to pick any CPU. + */ +void cqm_setup_limbo_handler(struct rdt_domain *dom, unsigned long delay_ms, + int exclude_cpu) { unsigned long delay = msecs_to_jiffies(delay_ms); int cpu; - cpu = cpumask_any_housekeeping(&dom->cpu_mask); - dom->cqm_work_cpu = cpu; + if (exclude_cpu == RESCTRL_PICK_ANY_CPU) + cpu = cpumask_any_housekeeping(&dom->cpu_mask); + else + cpu = cpumask_any_housekeeping_but(&dom->cpu_mask, + exclude_cpu); - schedule_delayed_work_on(cpu, &dom->cqm_limbo, delay); + if (cpu < nr_cpu_ids) { + dom->cqm_work_cpu = cpu; + schedule_delayed_work_on(cpu, &dom->cqm_limbo, delay); + } } void mbm_handle_overflow(struct work_struct *work) @@ -862,7 +875,14 @@ void mbm_handle_overflow(struct work_struct *work) mutex_unlock(&rdtgroup_mutex); } -void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms) +/** + * mbm_setup_overflow_handler() - Schedule the overflow handler to run for this + * domain. + * @delay_ms: How far in the future the handler should run. + * @exclude_cpu: Which CPU the handler should not run on, -1 to pick any CPU. 
+ */ +void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms, + int exclude_cpu) { unsigned long delay = msecs_to_jiffies(delay_ms); int cpu; @@ -870,9 +890,16 @@ void mbm_setup_overflow_handler(struct rdt_domain *dom, unsigned long delay_ms) if (!resctrl_mounted || !resctrl_arch_mon_capable()) return; - cpu = cpumask_any_housekeeping(&dom->cpu_mask); + if (exclude_cpu == -1) + cpu = cpumask_any_housekeeping(&dom->cpu_mask); + else + cpu = cpumask_any_housekeeping_but(&dom->cpu_mask, + exclude_cpu); + dom->mbm_work_cpu = cpu; - schedule_delayed_work_on(cpu, &dom->mbm_over, delay); + + if (cpu < nr_cpu_ids) + schedule_delayed_work_on(cpu, &dom->mbm_over, delay); } static int dom_data_init(struct rdt_resource *r) diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 410b2b451c30..bf206bdb21ee 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -2520,7 +2520,8 @@ static int rdt_get_tree(struct fs_context *fc) if (is_mbm_enabled()) { r = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; list_for_each_entry(dom, &r->domains, list) - mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL); + mbm_setup_overflow_handler(dom, MBM_OVERFLOW_INTERVAL, + RESCTRL_PICK_ANY_CPU); } goto out; @@ -3686,7 +3687,8 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) if (is_mbm_enabled()) { INIT_DELAYED_WORK(&d->mbm_over, mbm_handle_overflow); - mbm_setup_overflow_handler(d, MBM_OVERFLOW_INTERVAL); + mbm_setup_overflow_handler(d, MBM_OVERFLOW_INTERVAL, + RESCTRL_PICK_ANY_CPU); } if (is_llc_occupancy_enabled()) diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index 5a66d034aa61..3ea7d618f33f 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -9,6 +9,9 @@ /* CLOSID value used by the default control group */ #define RESCTRL_RESERVED_CLOSID 0 +/* Indicates no CPU needs to be excluded */ +#define RESCTRL_PICK_ANY_CPU -1 + #ifdef CONFIG_PROC_CPU_RESCTRL int proc_resctrl_show(struct seq_file *m,

From patchwork Mon Mar 20 17:26:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Morse X-Patchwork-Id: 72338 From: James Morse To: x86@kernel.org, linux-kernel@vger.kernel.org Cc: Fenghua Yu , Reinette Chatre , Thomas Gleixner , Ingo Molnar , Borislav Petkov , H Peter Anvin , Babu Moger , James Morse , shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS , carl@os.amperecomputing.com, lcherian@marvell.com, bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com, Jamie Iles , Xin Hao , peternewman@google.com Subject: [PATCH v3 18/19] x86/resctrl: Add cpu offline callback for resctrl work Date: Mon, 20 Mar 2023 17:26:19 +0000 Message-Id: <20230320172620.18254-19-james.morse@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20230320172620.18254-1-james.morse@arm.com> References: <20230320172620.18254-1-james.morse@arm.com>

The resctrl architecture specific code may need to free a domain when a CPU goes offline, it also needs to reset the CPUs PQR_ASSOC register.
The resctrl filesystem code needs to move the overflow and limbo work to run on a different CPU, and clear this CPU from the cpu_mask of control and monitor groups. Currently this is all done in core.c and called from resctrl_offline_cpu(), making the split between architecture and filesystem code unclear. Move the filesystem work into a filesystem helper called resctrl_offline_cpu(), and rename the one in core.c resctrl_arch_offline_cpu(). The rdtgroup_mutex is unlocked and locked again in the call in preparation for changing the locking rules for the architecture code. resctrl_offline_cpu() is called before any of the resource/domains are updated, and makes use of the exclude_cpu feature that was previously added. Tested-by: Shaopeng Tan Signed-off-by: James Morse --- arch/x86/kernel/cpu/resctrl/core.c | 41 ++++---------------------- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 39 ++++++++++++++++++++++++ include/linux/resctrl.h | 1 + 3 files changed, 45 insertions(+), 36 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index aafe4b74587c..4e5fc89dab6d 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -578,22 +578,6 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r) return; } - - if (r == &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl) { - if (is_mbm_enabled() && cpu == d->mbm_work_cpu) { - cancel_delayed_work(&d->mbm_over); - /* - * exclude_cpu=-1 as this CPU has already been removed - * by cpumask_clear_cpu()d - */ - mbm_setup_overflow_handler(d, 0, RESCTRL_PICK_ANY_CPU); - } - if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu && - has_busy_rmid(r, d)) { - cancel_delayed_work(&d->cqm_limbo); - cqm_setup_limbo_handler(d, 0, RESCTRL_PICK_ANY_CPU); - } - } } static void clear_closid_rmid(int cpu) @@ -623,31 +607,15 @@ static int resctrl_arch_online_cpu(unsigned int cpu) return err; } -static void clear_childcpus(struct rdtgroup *r, unsigned int cpu) +static int resctrl_arch_offline_cpu(unsigned int cpu) { - struct rdtgroup *cr; - - list_for_each_entry(cr, &r->mon.crdtgrp_list, mon.crdtgrp_list) { - if (cpumask_test_and_clear_cpu(cpu, &cr->cpu_mask)) { - break; - } - } -} - -static int resctrl_offline_cpu(unsigned int cpu) -{ - struct rdtgroup *rdtgrp; struct rdt_resource *r; mutex_lock(&rdtgroup_mutex); + resctrl_offline_cpu(cpu); + for_each_capable_rdt_resource(r) domain_remove_cpu(cpu, r); - list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) { - if (cpumask_test_and_clear_cpu(cpu, &rdtgrp->cpu_mask)) { - clear_childcpus(rdtgrp, cpu); - break; - } - } clear_closid_rmid(cpu); mutex_unlock(&rdtgroup_mutex); @@ -970,7 +938,8 @@ static int __init resctrl_late_init(void) state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/resctrl/cat:online:", - resctrl_arch_online_cpu, resctrl_offline_cpu); + resctrl_arch_online_cpu, + resctrl_arch_offline_cpu); if (state < 0) return state; diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index bf206bdb21ee..c27ec56c6c60 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -3710,6 +3710,45 @@ int resctrl_online_cpu(unsigned int cpu) return 0; } +static void clear_childcpus(struct rdtgroup *r, unsigned int cpu) +{ + struct rdtgroup *cr; + + list_for_each_entry(cr, &r->mon.crdtgrp_list, mon.crdtgrp_list) { + if (cpumask_test_and_clear_cpu(cpu, &cr->cpu_mask)) + break; + } +} + +void resctrl_offline_cpu(unsigned int cpu) +{ + struct rdt_domain *d; + struct 
rdtgroup *rdtgrp; + struct rdt_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; + + lockdep_assert_held(&rdtgroup_mutex); + + list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) { + if (cpumask_test_and_clear_cpu(cpu, &rdtgrp->cpu_mask)) { + clear_childcpus(rdtgrp, cpu); + break; + } + } + + d = get_domain_from_cpu(cpu, l3); + if (d) { + if (is_mbm_enabled() && cpu == d->mbm_work_cpu) { + cancel_delayed_work(&d->mbm_over); + mbm_setup_overflow_handler(d, 0, cpu); + } + if (is_llc_occupancy_enabled() && cpu == d->cqm_work_cpu && + has_busy_rmid(l3, d)) { + cancel_delayed_work(&d->cqm_limbo); + cqm_setup_limbo_handler(d, 0, cpu); + } + } +} + /* * rdtgroup_init - rdtgroup initialization * diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index 3ea7d618f33f..f053527aaa5b 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -226,6 +226,7 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_domain *d, int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d); void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d); int resctrl_online_cpu(unsigned int cpu); +void resctrl_offline_cpu(unsigned int cpu); /** * resctrl_arch_rmid_read() - Read the eventid counter corresponding to rmid

From patchwork Mon Mar 20 17:26:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Morse X-Patchwork-Id: 72349 From: James Morse To: x86@kernel.org, linux-kernel@vger.kernel.org Cc: Fenghua Yu , Reinette Chatre , Thomas Gleixner , Ingo Molnar , Borislav Petkov , H Peter Anvin , Babu Moger , James Morse , shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS , carl@os.amperecomputing.com, lcherian@marvell.com, bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com, xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com, Jamie Iles , Xin Hao , peternewman@google.com Subject: [PATCH v3 19/19] x86/resctrl: Separate arch and fs resctrl locks Date: Mon, 20 Mar 2023 17:26:20 +0000 Message-Id: <20230320172620.18254-20-james.morse@arm.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20230320172620.18254-1-james.morse@arm.com> References: <20230320172620.18254-1-james.morse@arm.com>

resctrl has one mutex that is taken by the architecture specific code, and the filesystem parts. The two interact via cpuhp, where the architecture code updates the domain list. Filesystem handlers that walk the domains list should not run concurrently with the cpuhp callback modifying the list. Exposing a lock from the filesystem code means the interface is not cleanly defined, and creates the possibility of cross-architecture lock ordering headaches. The interaction only exists so that certain filesystem paths are serialised against cpu hotplug. The cpu hotplug code already has a mechanism to do this using cpus_read_lock(). MPAM's monitors have an overflow interrupt, so it needs to be possible to walk the domains list in irq context. RCU is ideal for this, but some paths need to be able to sleep to allocate memory.
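To make the two access patterns concrete, here is a minimal illustrative sketch (not taken from the patch; example_domain and example_domains are made-up stand-ins for struct rdt_domain and a resource's domains list): an overflow-interrupt style reader walks the list under rcu_read_lock() and must not sleep, while a filesystem style reader that may sleep instead holds cpus_read_lock() to hold off the cpuhp writer.

/*
 * Illustrative only: simplified stand-ins for a resource's domain list.
 * Not part of this patch; the real code uses struct rdt_domain and
 * rdt_resource::domains.
 */
#include <linux/cpu.h>
#include <linux/list.h>
#include <linux/printk.h>
#include <linux/rculist.h>
#include <linux/rcupdate.h>

struct example_domain {
	struct list_head	list;
	int			id;
};

static LIST_HEAD(example_domains);

/* Overflow-interrupt style reader: cannot sleep, so RCU protects the walk. */
static void walk_domains_in_irq(void)
{
	struct example_domain *d;

	rcu_read_lock();
	list_for_each_entry_rcu(d, &example_domains, list)
		pr_debug("domain %d\n", d->id);
	rcu_read_unlock();
}

/* Filesystem style reader that may sleep: hold off cpuhp instead of using RCU. */
static void walk_domains_may_sleep(void)
{
	struct example_domain *d;

	cpus_read_lock();
	list_for_each_entry(d, &example_domains, list)
		pr_debug("domain %d\n", d->id);	/* safe to block between entries */
	cpus_read_unlock();
}

In this scheme the writer updates the list with list_add_tail_rcu()/list_del_rcu() followed by synchronize_rcu(), serialised by a mutex on the writer side, which is what the hunks below do with domain_list_lock.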
Because resctrl_{on,off}line_cpu() take the rdtgroup_mutex as part of a cpuhp callback, cpus_read_lock() must always be taken first. rdtgroup_schemata_write() already does this. Most of the filesystem code's domain list walkers are currently protected by the rdtgroup_mutex taken in rdtgroup_kn_lock_live(). The exceptions are rdt_bit_usage_show() and the mon_config helpers which take the lock directly. Make the domain list protected by RCU. An architecture-specific lock prevents concurrent writers. rdt_bit_usage_show() can walk the domain list under rcu_read_lock(). The mon_config helpers send multiple IPIs, take the cpus_read_lock() in these cases. The other filesystem list walkers need to be able to sleep. Add cpus_read_lock() to rdtgroup_kn_lock_live() so that the cpuhp callbacks can't be invoked when file system operations are occurring. Add lockdep_assert_cpus_held() in the cases where the rdtgroup_kn_lock_live() call isn't obvious. Resctrl's domain online/offline calls now need to take the rdtgroup_mutex themselves. Tested-by: Shaopeng Tan Signed-off-by: James Morse --- Changes since v2: * Reworded a comment, * Added a lockdep assertion * Moved clear_closid_rmid() outside the locked region of cpu online/offline --- arch/x86/kernel/cpu/resctrl/core.c | 38 +++++++++----- arch/x86/kernel/cpu/resctrl/ctrlmondata.c | 16 ++++-- arch/x86/kernel/cpu/resctrl/monitor.c | 3 ++ arch/x86/kernel/cpu/resctrl/pseudo_lock.c | 3 ++ arch/x86/kernel/cpu/resctrl/rdtgroup.c | 63 ++++++++++++++++++++--- include/linux/resctrl.h | 2 +- 6 files changed, 99 insertions(+), 26 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/core.c b/arch/x86/kernel/cpu/resctrl/core.c index 4e5fc89dab6d..85216091228a 100644 --- a/arch/x86/kernel/cpu/resctrl/core.c +++ b/arch/x86/kernel/cpu/resctrl/core.c @@ -25,8 +25,15 @@ #include #include "internal.h" -/* Mutex to protect rdtgroup access. */ -DEFINE_MUTEX(rdtgroup_mutex); +/* + * rdt_domain structures are kfree()d when their last CPU goes offline, + * and allocated when the first CPU in a new domain comes online. + * The rdt_resource's domain list is updated when this happens. Readers of + * the domain list must either take cpus_read_lock(), or rely on an RCU + * read-side critical section, to avoid observing concurrent modification. 
+ * All writers take this mutex: + */ +static DEFINE_MUTEX(domain_list_lock); /* * The cached resctrl_pqr_state is strictly per CPU and can never be @@ -508,6 +515,8 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r) struct rdt_domain *d; int err; + lockdep_assert_held(&domain_list_lock); + d = rdt_find_domain(r, id, &add_pos); if (IS_ERR(d)) { pr_warn("Couldn't find cache id for CPU %d\n", cpu); @@ -541,11 +550,12 @@ static void domain_add_cpu(int cpu, struct rdt_resource *r) return; } - list_add_tail(&d->list, add_pos); + list_add_tail_rcu(&d->list, add_pos); err = resctrl_online_domain(r, d); if (err) { - list_del(&d->list); + list_del_rcu(&d->list); + synchronize_rcu(); domain_free(hw_dom); } } @@ -556,6 +566,8 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r) struct rdt_hw_domain *hw_dom; struct rdt_domain *d; + lockdep_assert_held(&domain_list_lock); + d = rdt_find_domain(r, id, NULL); if (IS_ERR_OR_NULL(d)) { pr_warn("Couldn't find cache id for CPU %d\n", cpu); @@ -566,7 +578,8 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r) cpumask_clear_cpu(cpu, &d->cpu_mask); if (cpumask_empty(&d->cpu_mask)) { resctrl_offline_domain(r, d); - list_del(&d->list); + list_del_rcu(&d->list); + synchronize_rcu(); /* * rdt_domain "d" is going to be freed below, so clear @@ -594,30 +607,29 @@ static void clear_closid_rmid(int cpu) static int resctrl_arch_online_cpu(unsigned int cpu) { struct rdt_resource *r; - int err; - mutex_lock(&rdtgroup_mutex); + mutex_lock(&domain_list_lock); for_each_capable_rdt_resource(r) domain_add_cpu(cpu, r); + mutex_unlock(&domain_list_lock); + clear_closid_rmid(cpu); - err = resctrl_online_cpu(cpu); - mutex_unlock(&rdtgroup_mutex); - - return err; + return resctrl_online_cpu(cpu); } static int resctrl_arch_offline_cpu(unsigned int cpu) { struct rdt_resource *r; - mutex_lock(&rdtgroup_mutex); resctrl_offline_cpu(cpu); + mutex_lock(&domain_list_lock); for_each_capable_rdt_resource(r) domain_remove_cpu(cpu, r); + mutex_unlock(&domain_list_lock); + clear_closid_rmid(cpu); - mutex_unlock(&rdtgroup_mutex); return 0; } diff --git a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c index 9161bc95eea7..7c582fafa526 100644 --- a/arch/x86/kernel/cpu/resctrl/ctrlmondata.c +++ b/arch/x86/kernel/cpu/resctrl/ctrlmondata.c @@ -209,6 +209,9 @@ static int parse_line(char *line, struct resctrl_schema *s, struct rdt_domain *d; unsigned long dom_id; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + if (rdtgrp->mode == RDT_MODE_PSEUDO_LOCKSETUP && (r->rid == RDT_RESOURCE_MBA || r->rid == RDT_RESOURCE_SMBA)) { rdt_last_cmd_puts("Cannot pseudo-lock MBA resource\n"); @@ -313,6 +316,9 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid) struct rdt_domain *d; u32 idx; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL)) return -ENOMEM; @@ -379,11 +385,9 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of, return -EINVAL; buf[nbytes - 1] = '\0'; - cpus_read_lock(); rdtgrp = rdtgroup_kn_lock_live(of->kn); if (!rdtgrp) { rdtgroup_kn_unlock(of->kn); - cpus_read_unlock(); return -ENOENT; } rdt_last_cmd_clear(); @@ -447,7 +451,6 @@ ssize_t rdtgroup_schemata_write(struct kernfs_open_file *of, out: rdtgroup_kn_unlock(of->kn); - cpus_read_unlock(); return ret ?: nbytes; } @@ -467,6 +470,9 @@ static void show_doms(struct seq_file *s, struct resctrl_schema *schema, int clo 
bool sep = false; u32 ctrl_val; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + seq_printf(s, "%*s:", max_name_width, schema->name); list_for_each_entry(dom, &r->domains, list) { if (sep) @@ -530,8 +536,8 @@ void mon_event_read(struct rmid_read *rr, struct rdt_resource *r, { int cpu; - /* When picking a CPU from cpu_mask, ensure it can't race with cpuhp */ - lockdep_assert_held(&rdtgroup_mutex); + /* When picking a cpu from cpu_mask, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); /* * setup the parameters to pass to mon_event_count() to read the data. diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c index 11fa5d79c81d..58f665ce7a0a 100644 --- a/arch/x86/kernel/cpu/resctrl/monitor.c +++ b/arch/x86/kernel/cpu/resctrl/monitor.c @@ -458,6 +458,9 @@ static void add_rmid_to_limbo(struct rmid_entry *entry) u32 idx; int err; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + idx = resctrl_arch_rmid_idx_encode(entry->closid, entry->rmid); arch_mon_ctx = resctrl_arch_mon_ctx_alloc(r, QOS_L3_OCCUP_EVENT_ID); diff --git a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c index 0b4fdb118643..f8864626d593 100644 --- a/arch/x86/kernel/cpu/resctrl/pseudo_lock.c +++ b/arch/x86/kernel/cpu/resctrl/pseudo_lock.c @@ -830,6 +830,9 @@ bool rdtgroup_pseudo_locked_in_hierarchy(struct rdt_domain *d) struct rdt_domain *d_i; bool ret = false; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + if (!zalloc_cpumask_var(&cpu_with_psl, GFP_KERNEL)) return true; diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index c27ec56c6c60..52c610426181 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -35,6 +35,10 @@ DEFINE_STATIC_KEY_FALSE(rdt_enable_key); DEFINE_STATIC_KEY_FALSE(rdt_mon_enable_key); DEFINE_STATIC_KEY_FALSE(rdt_alloc_enable_key); + +/* Mutex to protect rdtgroup access. 
*/ +DEFINE_MUTEX(rdtgroup_mutex); + static struct kernfs_root *rdt_root; struct rdtgroup rdtgroup_default; LIST_HEAD(rdt_all_groups); @@ -931,7 +935,8 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of, mutex_lock(&rdtgroup_mutex); hw_shareable = r->cache.shareable_bits; - list_for_each_entry(dom, &r->domains, list) { + rcu_read_lock(); + list_for_each_entry_rcu(dom, &r->domains, list) { if (sep) seq_putc(seq, ';'); sw_shareable = 0; @@ -987,8 +992,10 @@ static int rdt_bit_usage_show(struct kernfs_open_file *of, } sep = true; } + rcu_read_unlock(); seq_putc(seq, '\n'); mutex_unlock(&rdtgroup_mutex); + return 0; } @@ -1231,6 +1238,9 @@ static bool rdtgroup_mode_test_exclusive(struct rdtgroup *rdtgrp) struct rdt_domain *d; u32 ctrl; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + list_for_each_entry(s, &resctrl_schema_all, list) { r = s->res; if (r->rid == RDT_RESOURCE_MBA || r->rid == RDT_RESOURCE_SMBA) @@ -1497,6 +1507,7 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid struct rdt_domain *dom; bool sep = false; + cpus_read_lock(); mutex_lock(&rdtgroup_mutex); list_for_each_entry(dom, &r->domains, list) { @@ -1513,6 +1524,7 @@ static int mbm_config_show(struct seq_file *s, struct rdt_resource *r, u32 evtid seq_puts(s, "\n"); mutex_unlock(&rdtgroup_mutex); + cpus_read_unlock(); return 0; } @@ -1604,6 +1616,9 @@ static int mon_config_write(struct rdt_resource *r, char *tok, u32 evtid) struct rdt_domain *d; int ret = 0; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + next: if (!tok || tok[0] == '\0') return 0; @@ -1645,6 +1660,7 @@ static ssize_t mbm_total_bytes_config_write(struct kernfs_open_file *of, if (nbytes == 0 || buf[nbytes - 1] != '\n') return -EINVAL; + cpus_read_lock(); mutex_lock(&rdtgroup_mutex); rdt_last_cmd_clear(); @@ -1654,6 +1670,7 @@ static ssize_t mbm_total_bytes_config_write(struct kernfs_open_file *of, ret = mon_config_write(r, buf, QOS_L3_MBM_TOTAL_EVENT_ID); mutex_unlock(&rdtgroup_mutex); + cpus_read_unlock(); return ret ?: nbytes; } @@ -1669,6 +1686,7 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of, if (nbytes == 0 || buf[nbytes - 1] != '\n') return -EINVAL; + cpus_read_lock(); mutex_lock(&rdtgroup_mutex); rdt_last_cmd_clear(); @@ -1678,6 +1696,7 @@ static ssize_t mbm_local_bytes_config_write(struct kernfs_open_file *of, ret = mon_config_write(r, buf, QOS_L3_MBM_LOCAL_EVENT_ID); mutex_unlock(&rdtgroup_mutex); + cpus_read_unlock(); return ret ?: nbytes; } @@ -2130,6 +2149,9 @@ static int set_cache_qos_cfg(int level, bool enable) struct rdt_domain *d; int cpu; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + if (level == RDT_RESOURCE_L3) update = l3_qos_cfg_update; else if (level == RDT_RESOURCE_L2) @@ -2318,6 +2340,7 @@ struct rdtgroup *rdtgroup_kn_lock_live(struct kernfs_node *kn) atomic_inc(&rdtgrp->waitcount); kernfs_break_active_protection(kn); + cpus_read_lock(); mutex_lock(&rdtgroup_mutex); /* Was this group deleted while we waited? 
*/ @@ -2335,6 +2358,7 @@ void rdtgroup_kn_unlock(struct kernfs_node *kn) return; mutex_unlock(&rdtgroup_mutex); + cpus_read_unlock(); if (atomic_dec_and_test(&rdtgrp->waitcount) && (rdtgrp->flags & RDT_DELETED)) { @@ -2632,6 +2656,9 @@ static int reset_all_ctrls(struct rdt_resource *r) struct rdt_domain *d; int i; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL)) return -ENOMEM; @@ -2916,6 +2943,9 @@ static int mkdir_mondata_subdir_alldom(struct kernfs_node *parent_kn, struct rdt_domain *dom; int ret; + /* Walking r->domains, ensure it can't race with cpuhp */ + lockdep_assert_cpus_held(); + list_for_each_entry(dom, &r->domains, list) { ret = mkdir_mondata_subdir(parent_kn, dom, r, prgrp); if (ret) @@ -3602,7 +3632,8 @@ static void domain_destroy_mon_state(struct rdt_domain *d) kfree(d->mbm_local); } -void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d) +static void _resctrl_offline_domain(struct rdt_resource *r, + struct rdt_domain *d) { lockdep_assert_held(&rdtgroup_mutex); @@ -3637,6 +3668,13 @@ void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d) domain_destroy_mon_state(d); } +void resctrl_offline_domain(struct rdt_resource *r, struct rdt_domain *d) +{ + mutex_lock(&rdtgroup_mutex); + _resctrl_offline_domain(r, d); + mutex_unlock(&rdtgroup_mutex); +} + static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d) { u32 idx_limit = resctrl_arch_system_num_rmid_idx(); @@ -3668,7 +3706,7 @@ static int domain_setup_mon_state(struct rdt_resource *r, struct rdt_domain *d) return 0; } -int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) +static int _resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) { int err; @@ -3700,12 +3738,23 @@ int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) return 0; } +int resctrl_online_domain(struct rdt_resource *r, struct rdt_domain *d) +{ + int err; + + mutex_lock(&rdtgroup_mutex); + err = _resctrl_online_domain(r, d); + mutex_unlock(&rdtgroup_mutex); + + return err; +} + int resctrl_online_cpu(unsigned int cpu) { - lockdep_assert_held(&rdtgroup_mutex); - + mutex_lock(&rdtgroup_mutex); /* The cpu is set in default rdtgroup after online. */ cpumask_set_cpu(cpu, &rdtgroup_default.cpu_mask); + mutex_unlock(&rdtgroup_mutex); return 0; } @@ -3726,8 +3775,7 @@ void resctrl_offline_cpu(unsigned int cpu) struct rdtgroup *rdtgrp; struct rdt_resource *l3 = &rdt_resources_all[RDT_RESOURCE_L3].r_resctrl; - lockdep_assert_held(&rdtgroup_mutex); - + mutex_lock(&rdtgroup_mutex); list_for_each_entry(rdtgrp, &rdt_all_groups, rdtgroup_list) { if (cpumask_test_and_clear_cpu(cpu, &rdtgrp->cpu_mask)) { clear_childcpus(rdtgrp, cpu); @@ -3747,6 +3795,7 @@ void resctrl_offline_cpu(unsigned int cpu) cqm_setup_limbo_handler(d, 0, cpu); } } + mutex_unlock(&rdtgroup_mutex); } /* diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h index f053527aaa5b..4d35798effef 100644 --- a/include/linux/resctrl.h +++ b/include/linux/resctrl.h @@ -159,7 +159,7 @@ struct resctrl_schema; * @cache_level: Which cache level defines scope of this resource * @cache: Cache allocation related data * @membw: If the component has bandwidth controls, their properties. - * @domains: All domains for this resource + * @domains: RCU list of all domains for this resource * @name: Name to use in "schemata" file. 
* @data_width: Character width of data when displaying * @default_ctrl: Specifies default cache cbm or memory B/W percent.