Message ID | 20230913040635.28815-2-haitao.huang@linux.intel.com |
---|---|
State | New |
Headers | From: Haitao Huang <haitao.huang@linux.intel.com>; To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, x86@kernel.org, cgroups@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, sohil.mehta@intel.com; Cc: zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com, zhanb@microsoft.com, anakrish@microsoft.com, mikko.ylinen@linux.intel.com, yangjie@microsoft.com; Subject: [PATCH v4 01/18] cgroup/misc: Add per resource callbacks for CSS events; Date: Tue, 12 Sep 2023 21:06:18 -0700; Message-Id: <20230913040635.28815-2-haitao.huang@linux.intel.com>; In-Reply-To: <20230913040635.28815-1-haitao.huang@linux.intel.com>; References: <20230913040635.28815-1-haitao.huang@linux.intel.com>; X-Mailer: git-send-email 2.25.1; List-ID: <linux-kernel.vger.kernel.org> |
Series | Add Cgroup support for SGX EPC memory |
Commit Message
Haitao Huang
Sept. 13, 2023, 4:06 a.m. UTC
From: Kristen Carlson Accardi <kristen@linux.intel.com>

Consumers of the misc cgroup controller might need to perform separate
actions for Cgroups Subsystem State(CSS) events: cgroup alloc and free.
In addition, writes to the max value may also need separate action. Add
the ability to allow downstream users to setup callbacks for these
operations, and call the corresponding per-resource-type callback when
appropriate.

This code will be utilized by the SGX driver in a future patch.

Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
---
V4:
- Moved this to the front of the series.
- Applies on cgroup/for-6.6 with the overflow fix for misc.

V3:
- Removed the released() callback
---
 include/linux/misc_cgroup.h |  5 +++++
 kernel/cgroup/misc.c        | 32 +++++++++++++++++++++++++++++---
 2 files changed, 34 insertions(+), 3 deletions(-)
Comments
On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> From: Kristen Carlson Accardi <kristen@linux.intel.com>
>
> Consumers of the misc cgroup controller might need to perform separate
> actions for Cgroups Subsystem State(CSS) events: cgroup alloc and free.

nit: s/State(CSS)/State (CSS)/

"cgroup alloc" and "cgroup free" mean absolutely nothing.

> In addition, writes to the max value may also need separate action. Add

What "the max value"?

> the ability to allow downstream users to setup callbacks for these
> operations, and call the corresponding per-resource-type callback when
> appropriate.

Who are "the downstream users" and what sort of callbacks they setup?

>
> This code will be utilized by the SGX driver in a future patch.
>
> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com>
> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com>
> ---
> V4:
> - Moved this to the front of the series.
> - Applies on cgroup/for-6.6 with the overflow fix for misc.
>
> V3:
> - Removed the released() callback
> ---
>  include/linux/misc_cgroup.h |  5 +++++
>  kernel/cgroup/misc.c        | 32 +++++++++++++++++++++++++++++---
>  2 files changed, 34 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/misc_cgroup.h b/include/linux/misc_cgroup.h
> index e799b1f8d05b..e1bcd176c2de 100644
> --- a/include/linux/misc_cgroup.h
> +++ b/include/linux/misc_cgroup.h
> @@ -37,6 +37,11 @@ struct misc_res {
>  	u64 max;
>  	atomic64_t usage;
>  	atomic64_t events;
> +
> +	/* per resource callback ops */
> +	int (*misc_cg_alloc)(struct misc_cg *cg);
> +	void (*misc_cg_free)(struct misc_cg *cg);
> +	void (*misc_cg_max_write)(struct misc_cg *cg);
>  };
>
>  /**
> diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c
> index 79a3717a5803..e0092170d0dd 100644
> --- a/kernel/cgroup/misc.c
> +++ b/kernel/cgroup/misc.c
> @@ -276,10 +276,13 @@ static ssize_t misc_cg_max_write(struct kernfs_open_file *of, char *buf,
>
>  	cg = css_misc(of_css(of));
>
> -	if (READ_ONCE(misc_res_capacity[type]))
> +	if (READ_ONCE(misc_res_capacity[type])) {
>  		WRITE_ONCE(cg->res[type].max, max);
> -	else
> +		if (cg->res[type].misc_cg_max_write)
> +			cg->res[type].misc_cg_max_write(cg);
> +	} else {
>  		ret = -EINVAL;
> +	}
>
>  	return ret ? ret : nbytes;
>  }
> @@ -383,23 +386,39 @@ static struct cftype misc_cg_files[] = {
>  static struct cgroup_subsys_state *
>  misc_cg_alloc(struct cgroup_subsys_state *parent_css)
>  {
> +	struct misc_cg *parent_cg;
>  	enum misc_res_type i;
>  	struct misc_cg *cg;
> +	int ret;
>
>  	if (!parent_css) {
>  		cg = &root_cg;
> +		parent_cg = &root_cg;
>  	} else {
>  		cg = kzalloc(sizeof(*cg), GFP_KERNEL);
>  		if (!cg)
>  			return ERR_PTR(-ENOMEM);
> +		parent_cg = css_misc(parent_css);
>  	}
>
>  	for (i = 0; i < MISC_CG_RES_TYPES; i++) {
>  		WRITE_ONCE(cg->res[i].max, MAX_NUM);
>  		atomic64_set(&cg->res[i].usage, 0);
> +		if (parent_cg->res[i].misc_cg_alloc) {
> +			ret = parent_cg->res[i].misc_cg_alloc(cg);
> +			if (ret)
> +				goto alloc_err;
> +		}
>  	}
>
>  	return &cg->css;
> +
> +alloc_err:
> +	for (i = 0; i < MISC_CG_RES_TYPES; i++)
> +		if (parent_cg->res[i].misc_cg_free)
> +			cg->res[i].misc_cg_free(cg);
> +	kfree(cg);
> +	return ERR_PTR(ret);
>  }
>
>  /**
> @@ -410,7 +429,14 @@ misc_cg_alloc(struct cgroup_subsys_state *parent_css)
>   */
>  static void misc_cg_free(struct cgroup_subsys_state *css)
>  {
> -	kfree(css_misc(css));
> +	struct misc_cg *cg = css_misc(css);
> +	enum misc_res_type i;
> +
> +	for (i = 0; i < MISC_CG_RES_TYPES; i++)
> +		if (cg->res[i].misc_cg_free)
> +			cg->res[i].misc_cg_free(cg);
> +
> +	kfree(cg);
>  }
>
>  /* Cgroup controller callbacks */
> --
> 2.25.1

BR, Jarkko
On Tue, Sep 12, 2023 at 09:06:18PM -0700, Haitao Huang wrote:
> @@ -37,6 +37,11 @@ struct misc_res {
>  	u64 max;
>  	atomic64_t usage;
>  	atomic64_t events;
> +
> +	/* per resource callback ops */
> +	int (*misc_cg_alloc)(struct misc_cg *cg);
> +	void (*misc_cg_free)(struct misc_cg *cg);
> +	void (*misc_cg_max_write)(struct misc_cg *cg);

A nit about naming. These are already in misc_res and cgroup_ and cgrp_
prefixes are a lot more common. So, maybe go for sth like cgrp_alloc?

Thanks.
On Fri, Sep 15, 2023 at 07:55:45AM -1000, Tejun Heo wrote:
> On Tue, Sep 12, 2023 at 09:06:18PM -0700, Haitao Huang wrote:
> > @@ -37,6 +37,11 @@ struct misc_res {
> >  	u64 max;
> >  	atomic64_t usage;
> >  	atomic64_t events;
> > +
> > +	/* per resource callback ops */
> > +	int (*misc_cg_alloc)(struct misc_cg *cg);
> > +	void (*misc_cg_free)(struct misc_cg *cg);
> > +	void (*misc_cg_max_write)(struct misc_cg *cg);
>
> A nit about naming. These are already in misc_res and cgroup_ and cgrp_
> prefixes are a lot more common. So, maybe go for sth like cgrp_alloc?

Ah, never mind about the prefix part. misc is using cg_ prefix widely
already.

Thanks.
On Fri, 15 Sep 2023 12:58:11 -0500, Tejun Heo <tj@kernel.org> wrote:

> On Fri, Sep 15, 2023 at 07:55:45AM -1000, Tejun Heo wrote:
>> On Tue, Sep 12, 2023 at 09:06:18PM -0700, Haitao Huang wrote:
>> > @@ -37,6 +37,11 @@ struct misc_res {
>> >  	u64 max;
>> >  	atomic64_t usage;
>> >  	atomic64_t events;
>> > +
>> > +	/* per resource callback ops */
>> > +	int (*misc_cg_alloc)(struct misc_cg *cg);
>> > +	void (*misc_cg_free)(struct misc_cg *cg);
>> > +	void (*misc_cg_max_write)(struct misc_cg *cg);
>>
>> A nit about naming. These are already in misc_res and cgroup_ and cgrp_
>> prefixes are a lot more common. So, maybe go for sth like cgrp_alloc?
>
> Ah, never mind about the prefix part. misc is using cg_ prefix widely
> already.
>

Change them to plain alloc, free, max_write? As they are per resource type,
not per cgroup. Also following no-prefix naming scheme like "open" for fops,
vma_ops, etc.

Thanks for your review.
Haitao
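
For illustration, the unprefixed naming suggested above might look like the sketch below. It is a standalone userspace model, not kernel code: `struct misc_res` is cut down to the fields needed here, and `misc_res_call_alloc()`, `demo_alloc()` and `demo_alloc_calls` are hypothetical names that do not appear in the patch.

```c
/* Userspace sketch of the naming idea only; not the kernel definition. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct misc_cg;	/* opaque here: callbacks only pass the pointer through */

/* Reduced stand-in for the kernel's struct misc_res. */
struct misc_res {
	uint64_t max;

	/* per resource callback ops, named like fops members (no prefix) */
	int (*alloc)(struct misc_cg *cg);
	void (*free)(struct misc_cg *cg);
	void (*max_write)(struct misc_cg *cg);
};

/* Hypothetical helper: run the alloc callback if one is registered. */
static int misc_res_call_alloc(const struct misc_res *res, struct misc_cg *cg)
{
	assert(res != NULL);
	return res->alloc ? res->alloc(cg) : 0;
}

/* Hypothetical provider callback that only counts its invocations. */
static int demo_alloc_calls;

static int demo_alloc(struct misc_cg *cg)
{
	(void)cg;
	demo_alloc_calls++;
	return 0;
}
```

With this naming a call site reads `res->alloc(cg)` rather than `res->misc_cg_alloc(cg)`; the containing structure already says which subsystem the callback belongs to.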
Hi Jarkko

On Wed, 13 Sep 2023 04:39:06 -0500, Jarkko Sakkinen <jarkko@kernel.org>
wrote:

> On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
>> From: Kristen Carlson Accardi <kristen@linux.intel.com>
>>
>> Consumers of the misc cgroup controller might need to perform separate
>> actions for Cgroups Subsystem State(CSS) events: cgroup alloc and free.
>
> nit: s/State(CSS)/State (CSS)/
>
> "cgroup alloc" and "cgroup free" mean absolutely nothing.
>
>
>> In addition, writes to the max value may also need separate action. Add
>
> What "the max value"?
>
>> the ability to allow downstream users to setup callbacks for these
>> operations, and call the corresponding per-resource-type callback when
>> appropriate.
>
> Who are "the downstream users" and what sort of callbacks they setup?

How about this?

The misc cgroup controller (subsystem) currently does not perform resource
type specific action for Cgroups Subsystem State (CSS) events: the
'css_alloc' event when a cgroup is created and the 'css_free' event when a
cgroup is destroyed, or in event of user writing the max value to the
misc.max file to set the consumption limit of a specific resource
[admin-guide/cgroup-v2.rst, 5-9. Misc].

Define callbacks for those events and allow resource providers to register
the callbacks per resource type as needed. This will be utilized later by
the EPC misc cgroup support implemented in the SGX driver:
- On cgroup alloc, allocate and initialize necessary structures for EPC
reclaiming, e.g., LRU list, work queue, etc.
- On cgroup free, cleanup and free those structures created in alloc.
- On max write, trigger EPC reclaiming if the new limit is at or below
current consumption.

Thanks
Haitao
On Sat Sep 16, 2023 at 7:11 AM EEST, Haitao Huang wrote:
> Hi Jarkko
>
> On Wed, 13 Sep 2023 04:39:06 -0500, Jarkko Sakkinen <jarkko@kernel.org>
> wrote:
>
> > On Wed Sep 13, 2023 at 7:06 AM EEST, Haitao Huang wrote:
> >> From: Kristen Carlson Accardi <kristen@linux.intel.com>
> >>
> >> Consumers of the misc cgroup controller might need to perform separate
> >> actions for Cgroups Subsystem State(CSS) events: cgroup alloc and free.
> >
> > nit: s/State(CSS)/State (CSS)/
> >
> > "cgroup alloc" and "cgroup free" mean absolutely nothing.
> >
> >
> >> In addition, writes to the max value may also need separate action. Add
> >
> > What "the max value"?
> >
> >> the ability to allow downstream users to setup callbacks for these
> >> operations, and call the corresponding per-resource-type callback when
> >> appropriate.
> >
> > Who are "the downstream users" and what sort of callbacks they setup?
>
> How about this?
>
> The misc cgroup controller (subsystem) currently does not perform resource
> type specific action for Cgroups Subsystem State (CSS) events: the
> 'css_alloc' event when a cgroup is created and the 'css_free' event when a
> cgroup is destroyed, or in event of user writing the max value to the
> misc.max file to set the consumption limit of a specific resource
> [admin-guide/cgroup-v2.rst, 5-9. Misc].
>
> Define callbacks for those events and allow resource providers to register
> the callbacks per resource type as needed. This will be utilized later by
> the EPC misc cgroup support implemented in the SGX driver:
> - On cgroup alloc, allocate and initialize necessary structures for EPC
> reclaiming, e.g., LRU list, work queue, etc.
> - On cgroup free, cleanup and free those structures created in alloc.
> - On max write, trigger EPC reclaiming if the new limit is at or below
> current consumption.

Yeah, this is much better (I was on holiday, thus the delay on response).

> Thanks
> Haitao

BR, Jarkko
diff --git a/include/linux/misc_cgroup.h b/include/linux/misc_cgroup.h
index e799b1f8d05b..e1bcd176c2de 100644
--- a/include/linux/misc_cgroup.h
+++ b/include/linux/misc_cgroup.h
@@ -37,6 +37,11 @@ struct misc_res {
 	u64 max;
 	atomic64_t usage;
 	atomic64_t events;
+
+	/* per resource callback ops */
+	int (*misc_cg_alloc)(struct misc_cg *cg);
+	void (*misc_cg_free)(struct misc_cg *cg);
+	void (*misc_cg_max_write)(struct misc_cg *cg);
 };
 
 /**
diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c
index 79a3717a5803..e0092170d0dd 100644
--- a/kernel/cgroup/misc.c
+++ b/kernel/cgroup/misc.c
@@ -276,10 +276,13 @@ static ssize_t misc_cg_max_write(struct kernfs_open_file *of, char *buf,
 
 	cg = css_misc(of_css(of));
 
-	if (READ_ONCE(misc_res_capacity[type]))
+	if (READ_ONCE(misc_res_capacity[type])) {
 		WRITE_ONCE(cg->res[type].max, max);
-	else
+		if (cg->res[type].misc_cg_max_write)
+			cg->res[type].misc_cg_max_write(cg);
+	} else {
 		ret = -EINVAL;
+	}
 
 	return ret ? ret : nbytes;
 }
@@ -383,23 +386,39 @@ static struct cftype misc_cg_files[] = {
 static struct cgroup_subsys_state *
 misc_cg_alloc(struct cgroup_subsys_state *parent_css)
 {
+	struct misc_cg *parent_cg;
 	enum misc_res_type i;
 	struct misc_cg *cg;
+	int ret;
 
 	if (!parent_css) {
 		cg = &root_cg;
+		parent_cg = &root_cg;
 	} else {
 		cg = kzalloc(sizeof(*cg), GFP_KERNEL);
 		if (!cg)
 			return ERR_PTR(-ENOMEM);
+		parent_cg = css_misc(parent_css);
 	}
 
 	for (i = 0; i < MISC_CG_RES_TYPES; i++) {
 		WRITE_ONCE(cg->res[i].max, MAX_NUM);
 		atomic64_set(&cg->res[i].usage, 0);
+		if (parent_cg->res[i].misc_cg_alloc) {
+			ret = parent_cg->res[i].misc_cg_alloc(cg);
+			if (ret)
+				goto alloc_err;
+		}
 	}
 
 	return &cg->css;
+
+alloc_err:
+	for (i = 0; i < MISC_CG_RES_TYPES; i++)
+		if (parent_cg->res[i].misc_cg_free)
+			cg->res[i].misc_cg_free(cg);
+	kfree(cg);
+	return ERR_PTR(ret);
 }
 
 /**
@@ -410,7 +429,14 @@ misc_cg_alloc(struct cgroup_subsys_state *parent_css)
  */
 static void misc_cg_free(struct cgroup_subsys_state *css)
 {
-	kfree(css_misc(css));
+	struct misc_cg *cg = css_misc(css);
+	enum misc_res_type i;
+
+	for (i = 0; i < MISC_CG_RES_TYPES; i++)
+		if (cg->res[i].misc_cg_free)
+			cg->res[i].misc_cg_free(cg);
+
+	kfree(cg);
 }
 
 /* Cgroup controller callbacks */
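
To make the dispatch in the diff easier to follow, here is a minimal userspace model of it, written as a sketch rather than as kernel code. In the diff, misc_cg_alloc() consults the parent's res[] entry for the alloc callback, while misc_cg_free() and misc_cg_max_write() consult the cgroup's own res[] entry. The patch does not show how the callbacks get installed, so the model assumes that a provider's alloc callback copies them into the new group. All `demo_*` and `model_*` names and the `MISC_CG_RES_DEMO` resource type are hypothetical, and the error unwind is simplified compared with the `alloc_err` path.

```c
/* Userspace model of the callback dispatch in this patch; not kernel code. */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

enum misc_res_type { MISC_CG_RES_DEMO, MISC_CG_RES_TYPES };

struct misc_cg;

struct misc_res {
	uint64_t max;
	uint64_t usage;
	/* per resource callback ops, as added by the patch */
	int (*misc_cg_alloc)(struct misc_cg *cg);
	void (*misc_cg_free)(struct misc_cg *cg);
	void (*misc_cg_max_write)(struct misc_cg *cg);
};

struct misc_cg {
	struct misc_res res[MISC_CG_RES_TYPES];
};

/* Hypothetical provider state: live groups and reclaim requests. */
static int demo_live_groups;
static int demo_reclaim_runs;

static void demo_free(struct misc_cg *cg)
{
	(void)cg;
	demo_live_groups--;
}

static void demo_max_write(struct misc_cg *cg)
{
	struct misc_res *res = &cg->res[MISC_CG_RES_DEMO];

	/* Reclaim when the new limit is at or below current consumption. */
	if (res->max <= res->usage)
		demo_reclaim_runs++;
}

static int demo_alloc(struct misc_cg *cg)
{
	struct misc_res *res = &cg->res[MISC_CG_RES_DEMO];

	/* Assumption: the provider copies its callbacks into the new group. */
	res->misc_cg_alloc = demo_alloc;
	res->misc_cg_free = demo_free;
	res->misc_cg_max_write = demo_max_write;
	demo_live_groups++;
	return 0;
}

/* Model of misc_cg_free(): run each registered free callback, then free. */
static void model_cg_free(struct misc_cg *cg)
{
	int i;

	for (i = 0; i < MISC_CG_RES_TYPES; i++)
		if (cg->res[i].misc_cg_free)
			cg->res[i].misc_cg_free(cg);
	free(cg);
}

/* Model of misc_cg_alloc() for a non-root group. */
static struct misc_cg *model_cg_alloc(struct misc_cg *parent)
{
	struct misc_cg *cg = calloc(1, sizeof(*cg));
	int i;

	if (!cg)
		return NULL;

	for (i = 0; i < MISC_CG_RES_TYPES; i++) {
		cg->res[i].max = UINT64_MAX;
		if (parent->res[i].misc_cg_alloc &&
		    parent->res[i].misc_cg_alloc(cg)) {
			/* Simplified unwind through the group's own callbacks. */
			model_cg_free(cg);
			return NULL;
		}
	}
	return cg;
}

/* Model of the misc.max write path: store the limit, then notify. */
static void model_max_write(struct misc_cg *cg, enum misc_res_type type,
			    uint64_t max)
{
	assert(type < MISC_CG_RES_TYPES);
	cg->res[type].max = max;
	if (cg->res[type].misc_cg_max_write)
		cg->res[type].misc_cg_max_write(cg);
}
```

In this model a provider would set `misc_cg_alloc` on the root group once; each child created with `model_cg_alloc()` then inherits the callbacks through `demo_alloc()`, so later limit writes and frees on that child reach the provider.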