Message ID | 20231030182013.40086-2-haitao.huang@linux.intel.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:d641:0:b0:403:3b70:6f57 with SMTP id cy1csp2415297vqb; Mon, 30 Oct 2023 11:21:21 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEkXfddfsBAWbQ/wmAKCU4jqpjoVGSiRP30hxa51QCIXyzpWkirtq00DRtLojhBPC2SrhS+ X-Received: by 2002:a17:90b:38ca:b0:280:1de0:8a86 with SMTP id nn10-20020a17090b38ca00b002801de08a86mr5197458pjb.20.1698690080853; Mon, 30 Oct 2023 11:21:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698690080; cv=none; d=google.com; s=arc-20160816; b=sDEwge16mm3SYBTbioJ8fmJ+37TX6WwYedfyuvoQ6trfr2VnGCIA3SdQiv/O1aqSds om5sJ6SCrAip37WIOoXEAGIeJUEnGPGs3h0kKYV+yMjNtwvHfGny4va8JQ4RI7nmUPpI a4WwvYGKDzCiBEFAfO/eh868pPhsmAYPEyjpQhX7rNZx25P74SIWsv+1JSGLwaIj/jDw y/dz5w2dP9X/P8D4HR0jaJ8L5//fZIYzHlqmw+QHlzdMITCw/fd0W2dxfavhnURPLuD2 TNW+7Pn+LQ6QiHZF4tbrtVIAf23SCYpeKr+jtXeK9TNX1t4edBj4Hvy89nB4IlMKsTS8 dyMA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=Pe+sBBU2XxaVJRSSBOPIoXqkDo1ImRgF5DvLiYV96kU=; fh=sNqM0rzpS3sahstSbhU1PMo5LJZgUrUMK8BDlDCJMTE=; b=Jf/w4xBqgSNFkCYax77gXDVSjh9U76+Wv2tVZcGtJq7HmSMNsIYTwaeHCbV0OvoaMf RwZEJ8NdLy/j15QvsrIJInK7fj5ewfutgS4XJ7nlGDo2ljcfND+odS5xfi6BiQJtdhjt j7K36+htolpsRPBJRsiweB9zbTSSrr1AICk2vdxHVPtfp4vMoxO+rFwOoAuUuQWTgO+F FLJr0iCwhq22iWqNOeJVJ7q+sulL7TiAysUDCFGYddLlc6Ti7p+BYZPd5MrkVLjaeA46 C9l1LeUFrk5aEt8i19VeQNEFgqGl9y56rrheIBYHiIDxAgD/t4Vdm98HjYblQTvQb/bq n1cw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=m1sarAKo; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from groat.vger.email (groat.vger.email. [23.128.96.35]) by mx.google.com with ESMTPS id h18-20020a17090acf1200b0028031758019si3312757pju.32.2023.10.30.11.21.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Oct 2023 11:21:20 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) client-ip=23.128.96.35; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=m1sarAKo; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by groat.vger.email (Postfix) with ESMTP id B2EAA8058A3B; Mon, 30 Oct 2023 11:21:07 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at groat.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232013AbjJ3SUn (ORCPT <rfc822;chrisjones.unixmen@gmail.com> + 32 others); Mon, 30 Oct 2023 14:20:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33594 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231827AbjJ3SUe (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 30 Oct 2023 14:20:34 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 32848C2; Mon, 30 Oct 2023 11:20:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698690030; x=1730226030; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=530RSKYG0q88gdwnkVm1khu5QkPGWZ8B6AWmOKTXhCE=; b=m1sarAKoduGVRWMmATKMuXdQxiZ3mLGqae29t6wqjr2Ha/sKk8rJ+JeQ KuQKmBJjJqNDzstteoQVeBpYlvuDP/e4pW9VfqSnUrCg1Pk3yYuTC05C8 f2pSHYntvfgrHTXSPv+U03cCqBqPaPh5qfKRMWpETHYju6aw6mTq0GFOs Keyi7qmB72Mdvz9Jvn/IkzHhXdQWuikbE31V+NZCdKIzZaQCqIPW6Q4I9 1UikWyzUaft832Sa6oE+nEiwS3Zrc+lcCmfd8n7ia/lB6x61wLho/aROn T+grI1hBVIw7go/R3n1oQkAx35mya4fSVZo9tHn38JMWxI96QI/MtFxU5 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10879"; a="367479523" X-IronPort-AV: E=Sophos;i="6.03,263,1694761200"; d="scan'208";a="367479523" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 30 Oct 2023 11:20:28 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10879"; a="789529497" X-IronPort-AV: E=Sophos;i="6.03,263,1694761200"; d="scan'208";a="789529497" Received: from b4969161e530.jf.intel.com ([10.165.56.46]) by orsmga008.jf.intel.com with ESMTP; 30 Oct 2023 11:20:28 -0700 From: Haitao Huang <haitao.huang@linux.intel.com> To: jarkko@kernel.org, dave.hansen@linux.intel.com, tj@kernel.org, mkoutny@suse.com, linux-kernel@vger.kernel.org, linux-sgx@vger.kernel.org, x86@kernel.org, cgroups@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, sohil.mehta@intel.com Cc: zhiquan1.li@intel.com, kristen@linux.intel.com, seanjc@google.com, zhanb@microsoft.com, anakrish@microsoft.com, mikko.ylinen@linux.intel.com, yangjie@microsoft.com, Haitao Huang <haitao.huang@linux.intel.com> Subject: [PATCH v6 01/12] cgroup/misc: Add per resource callbacks for CSS events Date: Mon, 30 Oct 2023 11:20:02 -0700 Message-Id: <20231030182013.40086-2-haitao.huang@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231030182013.40086-1-haitao.huang@linux.intel.com> References: <20231030182013.40086-1-haitao.huang@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-1.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on groat.vger.email Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (groat.vger.email [0.0.0.0]); Mon, 30 Oct 2023 11:21:07 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1781205650617352005 X-GMAIL-MSGID: 1781205650617352005 |
Series |
Add Cgroup support for SGX EPC memory
|
|
Commit Message
Haitao Huang
Oct. 30, 2023, 6:20 p.m. UTC
From: Kristen Carlson Accardi <kristen@linux.intel.com> The misc cgroup controller (subsystem) currently does not perform resource type specific action for Cgroups Subsystem State (CSS) events: the 'css_alloc' event when a cgroup is created and the 'css_free' event when a cgroup is destroyed. Define callbacks for those events and allow resource providers to register the callbacks per resource type as needed. This will be utilized later by the EPC misc cgroup support implemented in the SGX driver. Also add per resource type private data for those callbacks to store and access resource specific data. Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> Co-developed-by: Haitao Huang <haitao.huang@linux.intel.com> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> --- V6: - Create ops struct for per resource callbacks (Jarkko) - Drop max_write callback (Dave, Michal) - Style fixes (Kai) --- include/linux/misc_cgroup.h | 14 ++++++++++++++ kernel/cgroup/misc.c | 27 ++++++++++++++++++++++++--- 2 files changed, 38 insertions(+), 3 deletions(-)
Comments
On Mon Oct 30, 2023 at 8:20 PM EET, Haitao Huang wrote: > From: Kristen Carlson Accardi <kristen@linux.intel.com> > > The misc cgroup controller (subsystem) currently does not perform > resource type specific action for Cgroups Subsystem State (CSS) events: > the 'css_alloc' event when a cgroup is created and the 'css_free' event > when a cgroup is destroyed. > > Define callbacks for those events and allow resource providers to > register the callbacks per resource type as needed. This will be > utilized later by the EPC misc cgroup support implemented in the SGX > driver. > > Also add per resource type private data for those callbacks to store and > access resource specific data. > > Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> > Co-developed-by: Haitao Huang <haitao.huang@linux.intel.com> > Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> > --- > V6: > - Create ops struct for per resource callbacks (Jarkko) > - Drop max_write callback (Dave, Michal) > - Style fixes (Kai) > --- > include/linux/misc_cgroup.h | 14 ++++++++++++++ > kernel/cgroup/misc.c | 27 ++++++++++++++++++++++++--- > 2 files changed, 38 insertions(+), 3 deletions(-) > > diff --git a/include/linux/misc_cgroup.h b/include/linux/misc_cgroup.h > index e799b1f8d05b..5dc509c27c3d 100644 > --- a/include/linux/misc_cgroup.h > +++ b/include/linux/misc_cgroup.h > @@ -27,16 +27,30 @@ struct misc_cg; > > #include <linux/cgroup.h> > > +/** > + * struct misc_operations_struct: per resource callback ops. > + * @alloc: invoked for resource specific initialization when cgroup is allocated. > + * @free: invoked for resource specific cleanup when cgroup is deallocated. > + */ > +struct misc_operations_struct { > + int (*alloc)(struct misc_cg *cg); > + void (*free)(struct misc_cg *cg); > +}; Maybe just misc_operations, or even misc_ops? > + > /** > * struct misc_res: Per cgroup per misc type resource > * @max: Maximum limit on the resource. > * @usage: Current usage of the resource. > * @events: Number of times, the resource limit exceeded. > + * @priv: resource specific data. > + * @misc_ops: resource specific operations. > */ > struct misc_res { > u64 max; > atomic64_t usage; > atomic64_t events; > + void *priv; priv is the wrong patch, it just confuses the overall picture heere. please move it to 04/12. Let's deal with the callbacks here. > + const struct misc_operations_struct *misc_ops; > }; misc_ops would be at least consistent with this, as misc_res also has an acronym. > > /** > diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c > index 79a3717a5803..d971ede44ebf 100644 > --- a/kernel/cgroup/misc.c > +++ b/kernel/cgroup/misc.c > @@ -383,23 +383,37 @@ static struct cftype misc_cg_files[] = { > static struct cgroup_subsys_state * > misc_cg_alloc(struct cgroup_subsys_state *parent_css) > { > + struct misc_cg *parent_cg, *cg; > enum misc_res_type i; > - struct misc_cg *cg; > + int ret; > > if (!parent_css) { > - cg = &root_cg; > + parent_cg = cg = &root_cg; > } else { > cg = kzalloc(sizeof(*cg), GFP_KERNEL); > if (!cg) > return ERR_PTR(-ENOMEM); > + parent_cg = css_misc(parent_css); > } > > for (i = 0; i < MISC_CG_RES_TYPES; i++) { > WRITE_ONCE(cg->res[i].max, MAX_NUM); > atomic64_set(&cg->res[i].usage, 0); > + if (parent_cg->res[i].misc_ops && parent_cg->res[i].misc_ops->alloc) { > + ret = parent_cg->res[i].misc_ops->alloc(cg); > + if (ret) > + goto alloc_err; The patch set only has a use case for both operations defined - any partial combinations should never be allowed. To enforce this invariant you could create a set of operations (written out of top of my head): static int misc_res_init(struct misc_res *res, struct misc_ops *ops) { if (!misc_ops->alloc) { pr_err("%s: alloc missing\n", __func__); return -EINVAL; } if (!misc_ops->free) { pr_err("%s: free missing\n", __func__); return -EINVAL; } res->misc_ops = misc_ops; return 0; } static inline int misc_res_alloc(struct misc_cg *cg, struct misc_res *res) { int ret; if (!res->misc_ops) return 0; return res->misc_ops->alloc(cg); } static inline void misc_res_free(struct misc_cg *cg, struct misc_res *res) { int ret; if (!res->misc_ops) return 0; return res->misc_ops->alloc(cg); } Now if anything has misc_ops, it will also always have *both* callback, and nothing half-baked gets in. The above loops would be then: for (i = 0; i < MISC_CG_RES_TYPES; i++) { WRITE_ONCE(cg->res[i].max, MAX_NUM); atomic64_set(&cg->res[i].usage, 0); ret = misc_res_alloc(&parent_cg->res[i]); if (ret) goto alloc_err; Cleaner and better guards for state consistency. In 04/12 you need to then call misc_res_init() instead of direct assignment. BR, Jarkko
On Mon, Oct 30, 2023 at 11:20:02AM -0700, Haitao Huang <haitao.huang@linux.intel.com> wrote: > From: Kristen Carlson Accardi <kristen@linux.intel.com> > > The misc cgroup controller (subsystem) currently does not perform > resource type specific action for Cgroups Subsystem State (CSS) events: > the 'css_alloc' event when a cgroup is created and the 'css_free' event > when a cgroup is destroyed. > > Define callbacks for those events and allow resource providers to > register the callbacks per resource type as needed. This will be > utilized later by the EPC misc cgroup support implemented in the SGX > driver. I notice now that the callbacks are per resource and per cgroup. I think that is an overkill that complicates misc_cg allocation code with parent's callbacks while freeing is done by own callbacks. It's possibly too much liberal to keep objects' lifecycle understandable in the long term too. For some uniformity, I'd suggest struct blkcg_policy array in block/blk-cgroup.c as the precedent. (Perhaps indexed with static enum misc_res_type instead of dynamic index assignment as blkcg_policy_registeer() does.) I hope you don't really need per-cgroup callbacks :-) Michal
On Fri, 05 Jan 2024 03:45:02 -0600, Michal Koutný <mkoutny@suse.com> wrote: > On Mon, Oct 30, 2023 at 11:20:02AM -0700, Haitao Huang > <haitao.huang@linux.intel.com> wrote: >> From: Kristen Carlson Accardi <kristen@linux.intel.com> >> >> The misc cgroup controller (subsystem) currently does not perform >> resource type specific action for Cgroups Subsystem State (CSS) events: >> the 'css_alloc' event when a cgroup is created and the 'css_free' event >> when a cgroup is destroyed. >> >> Define callbacks for those events and allow resource providers to >> register the callbacks per resource type as needed. This will be >> utilized later by the EPC misc cgroup support implemented in the SGX >> driver. > > I notice now that the callbacks are per resource and per cgroup. > I think that is an overkill that complicates misc_cg allocation code > with parent's callbacks while freeing is done by own callbacks. > It's possibly too much liberal to keep objects' lifecycle understandable > in the long term too. > > For some uniformity, I'd suggest struct blkcg_policy array in > block/blk-cgroup.c as the precedent. (Perhaps indexed with static enum > misc_res_type instead of dynamic index assignment as > blkcg_policy_registeer() does.) > > I hope you don't really need per-cgroup callbacks :-) > > Michal Sure I can do that. IIUC, you are suggesting something similar how capacity is set via misc_cg_set_capacity for each resource type. Here we make a misc_cg_set_ops() which stores the ops pointers in a static array, then use them indexed with the resource type. Thanks Haitao
On Wed, 15 Nov 2023 14:25:59 -0600, Jarkko Sakkinen <jarkko@kernel.org> wrote: > On Mon Oct 30, 2023 at 8:20 PM EET, Haitao Huang wrote: >> From: Kristen Carlson Accardi <kristen@linux.intel.com> >> >> The misc cgroup controller (subsystem) currently does not perform >> resource type specific action for Cgroups Subsystem State (CSS) events: >> the 'css_alloc' event when a cgroup is created and the 'css_free' event >> when a cgroup is destroyed. >> >> Define callbacks for those events and allow resource providers to >> register the callbacks per resource type as needed. This will be >> utilized later by the EPC misc cgroup support implemented in the SGX >> driver. >> >> Also add per resource type private data for those callbacks to store and >> access resource specific data. >> >> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> >> Co-developed-by: Haitao Huang <haitao.huang@linux.intel.com> >> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> >> --- >> V6: >> - Create ops struct for per resource callbacks (Jarkko) >> - Drop max_write callback (Dave, Michal) >> - Style fixes (Kai) >> --- >> include/linux/misc_cgroup.h | 14 ++++++++++++++ >> kernel/cgroup/misc.c | 27 ++++++++++++++++++++++++--- >> 2 files changed, 38 insertions(+), 3 deletions(-) >> >> diff --git a/include/linux/misc_cgroup.h b/include/linux/misc_cgroup.h >> index e799b1f8d05b..5dc509c27c3d 100644 >> --- a/include/linux/misc_cgroup.h >> +++ b/include/linux/misc_cgroup.h >> @@ -27,16 +27,30 @@ struct misc_cg; >> >> #include <linux/cgroup.h> >> >> +/** >> + * struct misc_operations_struct: per resource callback ops. >> + * @alloc: invoked for resource specific initialization when cgroup is >> allocated. >> + * @free: invoked for resource specific cleanup when cgroup is >> deallocated. >> + */ >> +struct misc_operations_struct { >> + int (*alloc)(struct misc_cg *cg); >> + void (*free)(struct misc_cg *cg); >> +}; > > Maybe just misc_operations, or even misc_ops? > With Michal's suggestion to make ops per-resource-type, I'll rename this misc_res_ops (I was following vm_operations_struct as example) >> + >> /** >> * struct misc_res: Per cgroup per misc type resource >> * @max: Maximum limit on the resource. >> * @usage: Current usage of the resource. >> * @events: Number of times, the resource limit exceeded. >> + * @priv: resource specific data. >> + * @misc_ops: resource specific operations. >> */ >> struct misc_res { >> u64 max; >> atomic64_t usage; >> atomic64_t events; >> + void *priv; > > priv is the wrong patch, it just confuses the overall picture heere. > please move it to 04/12. Let's deal with the callbacks here. > Ok >> + const struct misc_operations_struct *misc_ops; >> }; > > misc_ops would be at least consistent with this, as misc_res also has an > acronym. > >> >> /** >> diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c >> index 79a3717a5803..d971ede44ebf 100644 >> --- a/kernel/cgroup/misc.c >> +++ b/kernel/cgroup/misc.c >> @@ -383,23 +383,37 @@ static struct cftype misc_cg_files[] = { >> static struct cgroup_subsys_state * >> misc_cg_alloc(struct cgroup_subsys_state *parent_css) >> { >> + struct misc_cg *parent_cg, *cg; >> enum misc_res_type i; >> - struct misc_cg *cg; >> + int ret; >> >> if (!parent_css) { >> - cg = &root_cg; >> + parent_cg = cg = &root_cg; >> } else { >> cg = kzalloc(sizeof(*cg), GFP_KERNEL); >> if (!cg) >> return ERR_PTR(-ENOMEM); >> + parent_cg = css_misc(parent_css); >> } >> >> for (i = 0; i < MISC_CG_RES_TYPES; i++) { >> WRITE_ONCE(cg->res[i].max, MAX_NUM); >> atomic64_set(&cg->res[i].usage, 0); >> + if (parent_cg->res[i].misc_ops && parent_cg->res[i].misc_ops->alloc) >> { >> + ret = parent_cg->res[i].misc_ops->alloc(cg); >> + if (ret) >> + goto alloc_err; > > The patch set only has a use case for both operations defined - any > partial combinations should never be allowed. > > To enforce this invariant you could create a set of operations (written > out of top of my head): > > static int misc_res_init(struct misc_res *res, struct misc_ops *ops) > { > if (!misc_ops->alloc) { > pr_err("%s: alloc missing\n", __func__); > return -EINVAL; > } > > if (!misc_ops->free) { > pr_err("%s: free missing\n", __func__); > return -EINVAL; > } > > res->misc_ops = misc_ops; > return 0; > } > > static inline int misc_res_alloc(struct misc_cg *cg, struct misc_res > *res) > { > int ret; > > if (!res->misc_ops) > return 0; > > return res->misc_ops->alloc(cg); > } > > static inline void misc_res_free(struct misc_cg *cg, struct misc_res > *res) > { > int ret; > > if (!res->misc_ops) > return 0; > > return res->misc_ops->alloc(cg); > } > > Now if anything has misc_ops, it will also always have *both* callback, > and nothing half-baked gets in. > > The above loops would be then: > > for (i = 0; i < MISC_CG_RES_TYPES; i++) { > WRITE_ONCE(cg->res[i].max, MAX_NUM); > atomic64_set(&cg->res[i].usage, 0); > ret = misc_res_alloc(&parent_cg->res[i]); > if (ret) > goto alloc_err; > > Cleaner and better guards for state consistency. In 04/12 you need to > then call misc_res_init() instead of direct assignment. > > BR, Jarkko Will combine these with the use of a static operations array suggested by Michal. Thanks Haitao
On Tue Jan 9, 2024 at 5:37 AM EET, Haitao Huang wrote: > On Wed, 15 Nov 2023 14:25:59 -0600, Jarkko Sakkinen <jarkko@kernel.org> > wrote: > > > On Mon Oct 30, 2023 at 8:20 PM EET, Haitao Huang wrote: > >> From: Kristen Carlson Accardi <kristen@linux.intel.com> > >> > >> The misc cgroup controller (subsystem) currently does not perform > >> resource type specific action for Cgroups Subsystem State (CSS) events: > >> the 'css_alloc' event when a cgroup is created and the 'css_free' event > >> when a cgroup is destroyed. > >> > >> Define callbacks for those events and allow resource providers to > >> register the callbacks per resource type as needed. This will be > >> utilized later by the EPC misc cgroup support implemented in the SGX > >> driver. > >> > >> Also add per resource type private data for those callbacks to store and > >> access resource specific data. > >> > >> Signed-off-by: Kristen Carlson Accardi <kristen@linux.intel.com> > >> Co-developed-by: Haitao Huang <haitao.huang@linux.intel.com> > >> Signed-off-by: Haitao Huang <haitao.huang@linux.intel.com> > >> --- > >> V6: > >> - Create ops struct for per resource callbacks (Jarkko) > >> - Drop max_write callback (Dave, Michal) > >> - Style fixes (Kai) > >> --- > >> include/linux/misc_cgroup.h | 14 ++++++++++++++ > >> kernel/cgroup/misc.c | 27 ++++++++++++++++++++++++--- > >> 2 files changed, 38 insertions(+), 3 deletions(-) > >> > >> diff --git a/include/linux/misc_cgroup.h b/include/linux/misc_cgroup.h > >> index e799b1f8d05b..5dc509c27c3d 100644 > >> --- a/include/linux/misc_cgroup.h > >> +++ b/include/linux/misc_cgroup.h > >> @@ -27,16 +27,30 @@ struct misc_cg; > >> > >> #include <linux/cgroup.h> > >> > >> +/** > >> + * struct misc_operations_struct: per resource callback ops. > >> + * @alloc: invoked for resource specific initialization when cgroup is > >> allocated. > >> + * @free: invoked for resource specific cleanup when cgroup is > >> deallocated. > >> + */ > >> +struct misc_operations_struct { > >> + int (*alloc)(struct misc_cg *cg); > >> + void (*free)(struct misc_cg *cg); > >> +}; > > > > Maybe just misc_operations, or even misc_ops? > > > > With Michal's suggestion to make ops per-resource-type, I'll rename this > misc_res_ops (I was following vm_operations_struct as example) > > >> + > >> /** > >> * struct misc_res: Per cgroup per misc type resource > >> * @max: Maximum limit on the resource. > >> * @usage: Current usage of the resource. > >> * @events: Number of times, the resource limit exceeded. > >> + * @priv: resource specific data. > >> + * @misc_ops: resource specific operations. > >> */ > >> struct misc_res { > >> u64 max; > >> atomic64_t usage; > >> atomic64_t events; > >> + void *priv; > > > > priv is the wrong patch, it just confuses the overall picture heere. > > please move it to 04/12. Let's deal with the callbacks here. > > > > Ok > > >> + const struct misc_operations_struct *misc_ops; > >> }; > > > > misc_ops would be at least consistent with this, as misc_res also has an > > acronym. > > > >> > >> /** > >> diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c > >> index 79a3717a5803..d971ede44ebf 100644 > >> --- a/kernel/cgroup/misc.c > >> +++ b/kernel/cgroup/misc.c > >> @@ -383,23 +383,37 @@ static struct cftype misc_cg_files[] = { > >> static struct cgroup_subsys_state * > >> misc_cg_alloc(struct cgroup_subsys_state *parent_css) > >> { > >> + struct misc_cg *parent_cg, *cg; > >> enum misc_res_type i; > >> - struct misc_cg *cg; > >> + int ret; > >> > >> if (!parent_css) { > >> - cg = &root_cg; > >> + parent_cg = cg = &root_cg; > >> } else { > >> cg = kzalloc(sizeof(*cg), GFP_KERNEL); > >> if (!cg) > >> return ERR_PTR(-ENOMEM); > >> + parent_cg = css_misc(parent_css); > >> } > >> > >> for (i = 0; i < MISC_CG_RES_TYPES; i++) { > >> WRITE_ONCE(cg->res[i].max, MAX_NUM); > >> atomic64_set(&cg->res[i].usage, 0); > >> + if (parent_cg->res[i].misc_ops && parent_cg->res[i].misc_ops->alloc) > >> { > >> + ret = parent_cg->res[i].misc_ops->alloc(cg); > >> + if (ret) > >> + goto alloc_err; > > > > The patch set only has a use case for both operations defined - any > > partial combinations should never be allowed. > > > > To enforce this invariant you could create a set of operations (written > > out of top of my head): > > > > static int misc_res_init(struct misc_res *res, struct misc_ops *ops) > > { > > if (!misc_ops->alloc) { > > pr_err("%s: alloc missing\n", __func__); > > return -EINVAL; > > } > > > > if (!misc_ops->free) { > > pr_err("%s: free missing\n", __func__); > > return -EINVAL; > > } > > > > res->misc_ops = misc_ops; > > return 0; > > } > > > > static inline int misc_res_alloc(struct misc_cg *cg, struct misc_res > > *res) > > { > > int ret; > > > > if (!res->misc_ops) > > return 0; > > > > return res->misc_ops->alloc(cg); > > } > > > > static inline void misc_res_free(struct misc_cg *cg, struct misc_res > > *res) > > { > > int ret; > > > > if (!res->misc_ops) > > return 0; > > > > return res->misc_ops->alloc(cg); > > } > > > > Now if anything has misc_ops, it will also always have *both* callback, > > and nothing half-baked gets in. > > > > The above loops would be then: > > > > for (i = 0; i < MISC_CG_RES_TYPES; i++) { > > WRITE_ONCE(cg->res[i].max, MAX_NUM); > > atomic64_set(&cg->res[i].usage, 0); > > ret = misc_res_alloc(&parent_cg->res[i]); > > if (ret) > > goto alloc_err; > > > > Cleaner and better guards for state consistency. In 04/12 you need to > > then call misc_res_init() instead of direct assignment. > > > > BR, Jarkko > > Will combine these with the use of a static operations array suggested by > Michal. OK, great, thanks! BR, Jarkko
diff --git a/include/linux/misc_cgroup.h b/include/linux/misc_cgroup.h index e799b1f8d05b..5dc509c27c3d 100644 --- a/include/linux/misc_cgroup.h +++ b/include/linux/misc_cgroup.h @@ -27,16 +27,30 @@ struct misc_cg; #include <linux/cgroup.h> +/** + * struct misc_operations_struct: per resource callback ops. + * @alloc: invoked for resource specific initialization when cgroup is allocated. + * @free: invoked for resource specific cleanup when cgroup is deallocated. + */ +struct misc_operations_struct { + int (*alloc)(struct misc_cg *cg); + void (*free)(struct misc_cg *cg); +}; + /** * struct misc_res: Per cgroup per misc type resource * @max: Maximum limit on the resource. * @usage: Current usage of the resource. * @events: Number of times, the resource limit exceeded. + * @priv: resource specific data. + * @misc_ops: resource specific operations. */ struct misc_res { u64 max; atomic64_t usage; atomic64_t events; + void *priv; + const struct misc_operations_struct *misc_ops; }; /** diff --git a/kernel/cgroup/misc.c b/kernel/cgroup/misc.c index 79a3717a5803..d971ede44ebf 100644 --- a/kernel/cgroup/misc.c +++ b/kernel/cgroup/misc.c @@ -383,23 +383,37 @@ static struct cftype misc_cg_files[] = { static struct cgroup_subsys_state * misc_cg_alloc(struct cgroup_subsys_state *parent_css) { + struct misc_cg *parent_cg, *cg; enum misc_res_type i; - struct misc_cg *cg; + int ret; if (!parent_css) { - cg = &root_cg; + parent_cg = cg = &root_cg; } else { cg = kzalloc(sizeof(*cg), GFP_KERNEL); if (!cg) return ERR_PTR(-ENOMEM); + parent_cg = css_misc(parent_css); } for (i = 0; i < MISC_CG_RES_TYPES; i++) { WRITE_ONCE(cg->res[i].max, MAX_NUM); atomic64_set(&cg->res[i].usage, 0); + if (parent_cg->res[i].misc_ops && parent_cg->res[i].misc_ops->alloc) { + ret = parent_cg->res[i].misc_ops->alloc(cg); + if (ret) + goto alloc_err; + } } return &cg->css; + +alloc_err: + for (i = 0; i < MISC_CG_RES_TYPES; i++) + if (parent_cg->res[i].misc_ops && parent_cg->res[i].misc_ops->free) + cg->res[i].misc_ops->free(cg); + kfree(cg); + return ERR_PTR(ret); } /** @@ -410,7 +424,14 @@ misc_cg_alloc(struct cgroup_subsys_state *parent_css) */ static void misc_cg_free(struct cgroup_subsys_state *css) { - kfree(css_misc(css)); + struct misc_cg *cg = css_misc(css); + enum misc_res_type i; + + for (i = 0; i < MISC_CG_RES_TYPES; i++) + if (cg->res[i].misc_ops && cg->res[i].misc_ops->free) + cg->res[i].misc_ops->free(cg); + + kfree(cg); } /* Cgroup controller callbacks */