From patchwork Mon Oct 16 22:18:56 2023
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 153785
From: Roman Gushchin
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    Johannes Weiner, Michal Hocko, Shakeel Butt <shakeelb@google.com>,
    Muchun Song, Dennis Zhou, David Rientjes, Vlastimil Babka,
    Naresh Kamboju, Roman Gushchin
Subject: [PATCH v3 1/5] mm: kmem: optimize get_obj_cgroup_from_current()
Date: Mon, 16 Oct 2023 15:18:56 -0700
Message-ID: <20231016221900.4031141-2-roman.gushchin@linux.dev>
In-Reply-To: <20231016221900.4031141-1-roman.gushchin@linux.dev>
References: <20231016221900.4031141-1-roman.gushchin@linux.dev>

Manually inline memcg_kmem_bypass() and active_memcg() to speed up
get_obj_cgroup_from_current() by avoiding duplicate in_task() checks
and active_memcg() readings.
Also add a likely() macro to __get_obj_cgroup_from_memcg():
obj_cgroup_tryget() should succeed at almost all times except for a
very unlikely race with the memcg deletion path.

Signed-off-by: Roman Gushchin (Cruise)
Tested-by: Naresh Kamboju
Acked-by: Shakeel Butt
Acked-by: Johannes Weiner
Reviewed-by: Vlastimil Babka
---
 mm/memcontrol.c | 34 ++++++++++++++--------------------
 1 file changed, 14 insertions(+), 20 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9741d62d0424..16ac2a5838fb 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1068,19 +1068,6 @@ struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm)
 }
 EXPORT_SYMBOL(get_mem_cgroup_from_mm);
 
-static __always_inline bool memcg_kmem_bypass(void)
-{
-        /* Allow remote memcg charging from any context. */
-        if (unlikely(active_memcg()))
-                return false;
-
-        /* Memcg to charge can't be determined. */
-        if (!in_task() || !current->mm || (current->flags & PF_KTHREAD))
-                return true;
-
-        return false;
-}
-
 /**
  * mem_cgroup_iter - iterate over memory cgroup hierarchy
  * @root: hierarchy root
@@ -3007,7 +2994,7 @@ static struct obj_cgroup *__get_obj_cgroup_from_memcg(struct mem_cgroup *memcg)
 
         for (; !mem_cgroup_is_root(memcg); memcg = parent_mem_cgroup(memcg)) {
                 objcg = rcu_dereference(memcg->objcg);
-                if (objcg && obj_cgroup_tryget(objcg))
+                if (likely(objcg && obj_cgroup_tryget(objcg)))
                         break;
                 objcg = NULL;
         }
@@ -3016,16 +3003,23 @@ static struct obj_cgroup *__get_obj_cgroup_from_memcg(struct mem_cgroup *memcg)
 
 __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
 {
-        struct obj_cgroup *objcg = NULL;
         struct mem_cgroup *memcg;
+        struct obj_cgroup *objcg;
 
-        if (memcg_kmem_bypass())
-                return NULL;
+        if (in_task()) {
+                memcg = current->active_memcg;
+
+                /* Memcg to charge can't be determined. */
+                if (likely(!memcg) && (!current->mm || (current->flags & PF_KTHREAD)))
+                        return NULL;
+        } else {
+                memcg = this_cpu_read(int_active_memcg);
+                if (likely(!memcg))
+                        return NULL;
+        }
 
         rcu_read_lock();
-        if (unlikely(active_memcg()))
-                memcg = active_memcg();
-        else
+        if (!memcg)
                 memcg = mem_cgroup_from_task(current);
         objcg = __get_obj_cgroup_from_memcg(memcg);
         rcu_read_unlock();
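
To make the control-flow change easier to follow outside of the diff, here is a
minimal stand-alone userspace model of the patched fast path. This is not kernel
code: struct memcg, the globals and pick_memcg() are simplified stand-ins
invented for illustration; it only mirrors the single in_task() branch and the
single active-memcg read described above.

/* objcg_fastpath_model.c - stand-alone userspace model, not kernel code. */
#include <stdbool.h>
#include <stdio.h>

struct memcg { const char *name; };

/* Simplified stand-ins for kernel state. */
static bool in_task_ctx = true;          /* models in_task() */
static bool has_mm = true;               /* models current->mm != NULL */
static bool is_kthread;                  /* models PF_KTHREAD */
static struct memcg *task_active_memcg;  /* models current->active_memcg */
static struct memcg *irq_active_memcg;   /* models this_cpu_read(int_active_memcg) */
static struct memcg task_memcg = { "task-memcg" };

/*
 * One in_task() branch and one read of the active memcg, mirroring the
 * patched get_obj_cgroup_from_current(); the old code performed the
 * in_task() check and the active_memcg() read twice.
 */
static struct memcg *pick_memcg(void)
{
        struct memcg *memcg;

        if (in_task_ctx) {
                memcg = task_active_memcg;
                /* Memcg to charge can't be determined. */
                if (!memcg && (!has_mm || is_kthread))
                        return NULL;
        } else {
                memcg = irq_active_memcg;
                if (!memcg)
                        return NULL;
        }
        return memcg ? memcg : &task_memcg; /* fall back to the task's own memcg */
}

int main(void)
{
        struct memcg *memcg = pick_memcg();

        printf("charging to: %s\n", memcg ? memcg->name : "(bypassed)");
        return 0;
}

Compiled with any C compiler it just prints which memcg a charge would go to;
the point is that the bypass decision and the active-memcg lookup now happen in
a single pass.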

From patchwork Mon Oct 16 22:18:57 2023
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 153789
From: Roman Gushchin
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    Johannes Weiner, Michal Hocko, Shakeel Butt <shakeelb@google.com>,
    Muchun Song, Dennis Zhou, David Rientjes, Vlastimil Babka,
    Naresh Kamboju, Roman Gushchin
Subject: [PATCH v3 2/5] mm: kmem: add direct objcg pointer to task_struct
Date: Mon, 16 Oct 2023 15:18:57 -0700
Message-ID: <20231016221900.4031141-3-roman.gushchin@linux.dev>
In-Reply-To: <20231016221900.4031141-1-roman.gushchin@linux.dev>
References: <20231016221900.4031141-1-roman.gushchin@linux.dev>

To charge a freshly allocated kernel object to a memory cgroup, the
kernel needs to obtain an objcg pointer.
Currently it does so indirectly by obtaining the memcg pointer first
and then calling __get_obj_cgroup_from_memcg().

Usually tasks spend their entire life belonging to the same object
cgroup, so it makes sense to save the objcg pointer on task_struct
directly, where it can be obtained faster. This requires some work on
the fork, exit and cgroup migration paths, but those paths are far
colder.

To avoid any costly synchronization the following rules are applied:
1) A task sets its objcg pointer itself.
2) If a task is being migrated to another cgroup, the least
   significant bit of the objcg pointer is set atomically.
3) On the allocation path the objcg pointer is obtained locklessly
   using the READ_ONCE() macro and the least significant bit is
   checked. If it's set, the following procedure is used to update
   it locklessly:
   - task->objcg is zeroed using xchg
   - the new objcg pointer is obtained
   - task->objcg is updated using try_cmpxchg
   - the operation is repeated if try_cmpxchg fails

This guarantees that no updates will be lost if task migration races
against an objcg pointer update, and it keeps both the read and write
paths fully lockless. Because the task keeps a reference to the objcg,
the objcg can't go away while the task is alive.

This commit doesn't change the way the remote memcg charging works.

Signed-off-by: Roman Gushchin (Cruise)
Tested-by: Naresh Kamboju
Acked-by: Johannes Weiner
Acked-by: Shakeel Butt
---
 include/linux/sched.h |   4 ++
 mm/memcontrol.c       | 130 +++++++++++++++++++++++++++++++++++++++---
 2 files changed, 125 insertions(+), 9 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 77f01ac385f7..60de42715b56 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1443,6 +1443,10 @@ struct task_struct {
         struct mem_cgroup               *active_memcg;
 #endif
 
+#ifdef CONFIG_MEMCG_KMEM
+        struct obj_cgroup               *objcg;
+#endif
+
 #ifdef CONFIG_BLK_CGROUP
         struct gendisk                  *throttle_disk;
 #endif
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 16ac2a5838fb..0605e45bd4a2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -249,6 +249,8 @@ struct mem_cgroup *vmpressure_to_memcg(struct vmpressure *vmpr)
         return container_of(vmpr, struct mem_cgroup, vmpressure);
 }
 
+#define CURRENT_OBJCG_UPDATE_FLAG 0x1UL
+
 #ifdef CONFIG_MEMCG_KMEM
 static DEFINE_SPINLOCK(objcg_lock);
 
@@ -3001,6 +3003,50 @@ static struct obj_cgroup *__get_obj_cgroup_from_memcg(struct mem_cgroup *memcg)
         return objcg;
 }
 
+static struct obj_cgroup *current_objcg_update(void)
+{
+        struct mem_cgroup *memcg;
+        struct obj_cgroup *old, *objcg = NULL;
+
+        do {
+                /* Atomically drop the update bit. */
+                old = xchg(&current->objcg, NULL);
+                if (old) {
+                        old = (struct obj_cgroup *)
+                                ((unsigned long)old & ~CURRENT_OBJCG_UPDATE_FLAG);
+                        if (old)
+                                obj_cgroup_put(old);
+
+                        old = NULL;
+                }
+
+                /* Obtain the new objcg pointer. */
+                rcu_read_lock();
+                memcg = mem_cgroup_from_task(current);
+                /*
+                 * The current task can be asynchronously moved to another
+                 * memcg and the previous memcg can be offlined. So let's
+                 * get the memcg pointer and try to get a reference to the
+                 * objcg under an rcu read lock.
+                 */
+                for (; memcg != root_mem_cgroup; memcg = parent_mem_cgroup(memcg)) {
+                        objcg = rcu_dereference(memcg->objcg);
+                        if (likely(objcg && obj_cgroup_tryget(objcg)))
+                                break;
+                        objcg = NULL;
+                }
+                rcu_read_unlock();
+
+                /*
+                 * Try to set up a new objcg pointer atomically. If it
+                 * fails, it means the update flag was set concurrently, so
+                 * the whole procedure should be repeated.
+                 */
+        } while (!try_cmpxchg(&current->objcg, &old, objcg));
+
+        return objcg;
+}
+
 __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
 {
         struct mem_cgroup *memcg;
@@ -3008,19 +3054,26 @@ __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
 
         if (in_task()) {
                 memcg = current->active_memcg;
+                if (unlikely(memcg))
+                        goto from_memcg;
 
-                /* Memcg to charge can't be determined. */
-                if (likely(!memcg) && (!current->mm || (current->flags & PF_KTHREAD)))
-                        return NULL;
+                objcg = READ_ONCE(current->objcg);
+                if (unlikely((unsigned long)objcg & CURRENT_OBJCG_UPDATE_FLAG))
+                        objcg = current_objcg_update();
+
+                if (objcg) {
+                        obj_cgroup_get(objcg);
+                        return objcg;
+                }
         } else {
                 memcg = this_cpu_read(int_active_memcg);
-                if (likely(!memcg))
-                        return NULL;
+                if (unlikely(memcg))
+                        goto from_memcg;
         }
 
+        return NULL;
+
+from_memcg:
         rcu_read_lock();
-        if (!memcg)
-                memcg = mem_cgroup_from_task(current);
         objcg = __get_obj_cgroup_from_memcg(memcg);
         rcu_read_unlock();
         return objcg;
@@ -6345,6 +6398,7 @@ static void mem_cgroup_move_task(void)
                 mem_cgroup_clear_mc();
         }
 }
+
 #else   /* !CONFIG_MMU */
 static int mem_cgroup_can_attach(struct cgroup_taskset *tset)
 {
@@ -6358,8 +6412,39 @@ static void mem_cgroup_move_task(void)
 }
 #endif
 
+#ifdef CONFIG_MEMCG_KMEM
+static void mem_cgroup_fork(struct task_struct *task)
+{
+        /*
+         * Set the update flag to cause task->objcg to be initialized lazily
+         * on the first allocation. It can be done without any synchronization
+         * because it's always performed on the current task, as is
+         * current_objcg_update().
+         */
+        task->objcg = (struct obj_cgroup *)CURRENT_OBJCG_UPDATE_FLAG;
+}
+
+static void mem_cgroup_exit(struct task_struct *task)
+{
+        struct obj_cgroup *objcg = task->objcg;
+
+        objcg = (struct obj_cgroup *)
+                ((unsigned long)objcg & ~CURRENT_OBJCG_UPDATE_FLAG);
+        if (objcg)
+                obj_cgroup_put(objcg);
+
+        /*
+         * Some kernel allocations can happen after this point,
+         * but let's ignore them. It can be done without any synchronization
+         * because it's always performed on the current task, as is
+         * current_objcg_update().
+         */
+        task->objcg = NULL;
+}
+#endif
+
 #ifdef CONFIG_LRU_GEN
-static void mem_cgroup_attach(struct cgroup_taskset *tset)
+static void mem_cgroup_lru_gen_attach(struct cgroup_taskset *tset)
 {
         struct task_struct *task;
         struct cgroup_subsys_state *css;
@@ -6377,10 +6462,31 @@ static void mem_cgroup_attach(struct cgroup_taskset *tset)
         task_unlock(task);
 }
 #else
+static void mem_cgroup_lru_gen_attach(struct cgroup_taskset *tset) {}
+#endif /* CONFIG_LRU_GEN */
+
+#ifdef CONFIG_MEMCG_KMEM
+static void mem_cgroup_kmem_attach(struct cgroup_taskset *tset)
+{
+        struct task_struct *task;
+        struct cgroup_subsys_state *css;
+
+        cgroup_taskset_for_each(task, css, tset) {
+                /* atomically set the update bit */
+                set_bit(0, (unsigned long *)&task->objcg);
+        }
+}
+#else
+static void mem_cgroup_kmem_attach(struct cgroup_taskset *tset) {}
+#endif /* CONFIG_MEMCG_KMEM */
+
+#if defined(CONFIG_LRU_GEN) || defined(CONFIG_MEMCG_KMEM)
 static void mem_cgroup_attach(struct cgroup_taskset *tset)
 {
+        mem_cgroup_lru_gen_attach(tset);
+        mem_cgroup_kmem_attach(tset);
 }
-#endif /* CONFIG_LRU_GEN */
+#endif
 
 static int seq_puts_memcg_tunable(struct seq_file *m, unsigned long value)
 {
@@ -6824,9 +6930,15 @@ struct cgroup_subsys memory_cgrp_subsys = {
         .css_reset = mem_cgroup_css_reset,
         .css_rstat_flush = mem_cgroup_css_rstat_flush,
         .can_attach = mem_cgroup_can_attach,
+#if defined(CONFIG_LRU_GEN) || defined(CONFIG_MEMCG_KMEM)
         .attach = mem_cgroup_attach,
+#endif
         .cancel_attach = mem_cgroup_cancel_attach,
         .post_attach = mem_cgroup_move_task,
+#ifdef CONFIG_MEMCG_KMEM
+        .fork = mem_cgroup_fork,
+        .exit = mem_cgroup_exit,
+#endif
         .dfl_cftypes = memory_files,
         .legacy_cftypes = mem_cgroup_legacy_files,
         .early_init = 0,
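
The update protocol described in the commit message can be modeled outside the
kernel. The sketch below is a stand-alone, single-threaded userspace
approximation and not kernel code: struct objcg, lookup_current_objcg() and the
other helpers are invented stand-ins, C11 atomics stand in for xchg() and
try_cmpxchg(), and bit 0 of the cached pointer plays the role of
CURRENT_OBJCG_UPDATE_FLAG.

/* task_objcg_model.c - stand-alone userspace sketch, not kernel code. */
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define UPDATE_FLAG 0x1UL       /* plays the role of CURRENT_OBJCG_UPDATE_FLAG */

struct objcg { atomic_int refcnt; int id; };

static struct objcg *objcg_get(struct objcg *o)
{
        atomic_fetch_add(&o->refcnt, 1);
        return o;
}

static void objcg_put(struct objcg *o)
{
        if (atomic_fetch_sub(&o->refcnt, 1) == 1) {
                printf("objcg %d freed\n", o->id);
                free(o);
        }
}

/* Models task->objcg; bit 0 means "re-resolve on the next allocation". */
static _Atomic uintptr_t task_objcg;

/* Models "resolve the objcg of the task's current memcg and take a reference". */
static struct objcg *lookup_current_objcg(void)
{
        struct objcg *o = malloc(sizeof(*o));

        atomic_init(&o->refcnt, 1);
        o->id = rand() % 1000;
        return o;
}

/* Migration path: only sets the update bit, cf. mem_cgroup_kmem_attach(). */
static void mark_for_update(void)
{
        atomic_fetch_or(&task_objcg, UPDATE_FLAG);
}

/* Allocation-path slow branch, cf. current_objcg_update(). */
static struct objcg *refresh_objcg(void)
{
        struct objcg *fresh = NULL;
        uintptr_t old, expected;

        do {
                /* Atomically take away the cached value together with the flag. */
                old = atomic_exchange(&task_objcg, 0);
                if (old & ~UPDATE_FLAG)
                        objcg_put((struct objcg *)(old & ~UPDATE_FLAG));
                if (fresh)                      /* candidate from a failed round */
                        objcg_put(fresh);

                fresh = lookup_current_objcg();
                expected = 0;
                /*
                 * If a migration sets the flag between the exchange above and
                 * this compare-exchange, the store fails and the whole round
                 * is repeated, so no update can be lost.
                 */
        } while (!atomic_compare_exchange_strong(&task_objcg, &expected,
                                                 (uintptr_t)fresh));

        return fresh;
}

int main(void)
{
        struct objcg *o;

        mark_for_update();              /* fresh "task": flag set, nothing cached */
        o = refresh_objcg();
        printf("cached objcg %d\n", o->id);

        mark_for_update();              /* the "task" migrated to another cgroup */
        o = refresh_objcg();
        printf("cached objcg %d after migration\n", o->id);

        objcg_put(objcg_get(o));        /* the allocation path would get/put here */
        objcg_put((struct objcg *)atomic_load(&task_objcg));   /* final cleanup */
        return 0;
}

The property mirrored here is the retry loop: a concurrent cgroup migration can
only set the flag, and a set flag makes the final compare-exchange fail, so the
cached pointer is always re-resolved.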

From patchwork Mon Oct 16 22:18:58 2023
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 153786
From: Roman Gushchin
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    Johannes Weiner, Michal Hocko, Shakeel Butt <shakeelb@google.com>,
    Muchun Song, Dennis Zhou, David Rientjes, Vlastimil Babka,
    Naresh Kamboju, Roman Gushchin
Subject: [PATCH v3 3/5] mm: kmem: make memcg keep a reference to the original objcg
Date: Mon, 16 Oct 2023 15:18:58 -0700
Message-ID: <20231016221900.4031141-4-roman.gushchin@linux.dev>
In-Reply-To: <20231016221900.4031141-1-roman.gushchin@linux.dev>
References: <20231016221900.4031141-1-roman.gushchin@linux.dev>

Keep a reference to the original objcg object for the entire life of a
memcg structure. This allows simplifying the synchronization on the
kernel memory allocation paths: pinning a (live) memcg will also pin
the corresponding objcg.

The memory overhead of this change is minimal because object cgroups
usually outlive their corresponding memory cgroups even without this
change, so it's only an additional pointer per memcg.

Signed-off-by: Roman Gushchin (Cruise)
Tested-by: Naresh Kamboju
Acked-by: Shakeel Butt
Reviewed-by: Vlastimil Babka
---
 include/linux/memcontrol.h | 8 +++++++-
 mm/memcontrol.c            | 5 +++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index ab94ad4597d0..277690af383d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -299,7 +299,13 @@ struct mem_cgroup {
 
 #ifdef CONFIG_MEMCG_KMEM
         int kmemcg_id;
-        struct obj_cgroup __rcu *objcg;
+        /*
+         * memcg->objcg is wiped out as a part of the objcg reparenting
+         * process. memcg->orig_objcg preserves a pointer (and a reference)
+         * to the original objcg until the end of life of the memcg.
+         */
+        struct obj_cgroup __rcu *objcg;
+        struct obj_cgroup *orig_objcg;
         /* list of inherited objcgs, protected by objcg_lock */
         struct list_head objcg_list;
 #endif
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 0605e45bd4a2..d90cc19e4113 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3808,6 +3808,8 @@ static int memcg_online_kmem(struct mem_cgroup *memcg)
 
         objcg->memcg = memcg;
         rcu_assign_pointer(memcg->objcg, objcg);
+        obj_cgroup_get(objcg);
+        memcg->orig_objcg = objcg;
 
         static_branch_enable(&memcg_kmem_online_key);
 
@@ -5302,6 +5304,9 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg)
 {
         int node;
 
+        if (memcg->orig_objcg)
+                obj_cgroup_put(memcg->orig_objcg);
+
         for_each_node(node)
                 free_mem_cgroup_per_node_info(memcg, node);
         kfree(memcg->vmstats);
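
As a rough illustration of the lifetime rule this patch introduces, here is a
tiny userspace model (not kernel code; both structs and the helpers are
invented stand-ins): the memcg owns one reference to its original objcg, so
code that pins the memcg may use memcg->orig_objcg without taking a reference
of its own.

/* orig_objcg_model.c - stand-alone userspace model, not kernel code. */
#include <stdio.h>
#include <stdlib.h>

struct objcg { int refcnt; };
struct memcg { int refcnt; struct objcg *orig_objcg; };

static void objcg_put(struct objcg *o)
{
        if (--o->refcnt == 0) {
                puts("objcg freed");
                free(o);
        }
}

static struct memcg *memcg_create(void)
{
        struct memcg *m = malloc(sizeof(*m));
        struct objcg *o = malloc(sizeof(*o));

        o->refcnt = 1;                  /* reference held by memcg->orig_objcg */
        m->refcnt = 1;
        m->orig_objcg = o;
        return m;
}

static void memcg_put(struct memcg *m)
{
        if (--m->refcnt == 0) {
                objcg_put(m->orig_objcg);   /* dropped only when the memcg dies */
                free(m);
                puts("memcg freed");
        }
}

int main(void)
{
        struct memcg *m = memcg_create();

        m->refcnt++;                            /* caller pins the memcg ... */
        printf("using objcg %p without taking a ref\n", (void *)m->orig_objcg);
        memcg_put(m);                           /* ... objcg stayed valid throughout */

        memcg_put(m);           /* last ref: memcg and its objcg ref go away together */
        return 0;
}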

From patchwork Mon Oct 16 22:18:59 2023
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 153787
From: Roman Gushchin
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    Johannes Weiner, Michal Hocko, Shakeel Butt <shakeelb@google.com>,
    Muchun Song, Dennis Zhou, David Rientjes, Vlastimil Babka,
    Naresh Kamboju, Roman Gushchin
Subject: [PATCH v3 4/5] mm: kmem: scoped objcg protection
Date: Mon, 16 Oct 2023 15:18:59 -0700
Message-ID: <20231016221900.4031141-5-roman.gushchin@linux.dev>
In-Reply-To: <20231016221900.4031141-1-roman.gushchin@linux.dev>
References: <20231016221900.4031141-1-roman.gushchin@linux.dev>

Switch to a scope-based protection of the objcg pointer on the
slab/kmem allocation paths. Instead of using the get_() semantics in
the pre-allocation hook and putting the reference afterwards, rely on
the fact that the objcg is pinned by the scope.

This is possible because:
1) if the objcg is obtained from the current task struct, the task
   keeps a reference to it;
2) if the objcg is obtained from an active memcg (remote charging),
   the memcg is pinned by the scope and holds a reference to the
   corresponding objcg.

Signed-off-by: Roman Gushchin (Cruise)
Tested-by: Naresh Kamboju
Acked-by: Shakeel Butt
Reviewed-by: Vlastimil Babka
---
 include/linux/memcontrol.h |  9 ++++++++
 include/linux/sched/mm.h   |  4 ++++
 mm/memcontrol.c            | 47 ++++++++++++++++++++++++++++++++++++--
 mm/slab.h                  | 15 ++++++------
 4 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 277690af383d..a89df289144d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1769,6 +1769,15 @@ bool mem_cgroup_kmem_disabled(void);
 
 int __memcg_kmem_charge_page(struct page *page, gfp_t gfp, int order);
 void __memcg_kmem_uncharge_page(struct page *page, int order);
 
+/*
+ * The returned objcg pointer is safe to use without additional
+ * protection within a scope. The scope is defined either by
+ * the current task (similar to the "current" global variable)
+ * or by a set_active_memcg() pair.
+ * Please use obj_cgroup_get() to get a reference if the pointer
+ * needs to be used outside of the local scope.
+ */
+struct obj_cgroup *current_obj_cgroup(void);
 struct obj_cgroup *get_obj_cgroup_from_current(void);
 struct obj_cgroup *get_obj_cgroup_from_folio(struct folio *folio);
 
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 8d89c8c4fac1..9a19f1b42f64 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -403,6 +403,10 @@ DECLARE_PER_CPU(struct mem_cgroup *, int_active_memcg);
  * __GFP_ACCOUNT allocations till the end of the scope will be charged to the
  * given memcg.
  *
+ * Please make sure that the caller has a reference to the passed memcg
+ * structure, so its lifetime is guaranteed to exceed the scope between two
+ * set_active_memcg() calls.
+ *
  * NOTE: This function can nest. Users must save the return value and
  * reset the previous value after their own charging scope is over.
  */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d90cc19e4113..852fe6918f82 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3079,6 +3079,49 @@ __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
         return objcg;
 }
 
+__always_inline struct obj_cgroup *current_obj_cgroup(void)
+{
+        struct mem_cgroup *memcg;
+        struct obj_cgroup *objcg;
+
+        if (in_task()) {
+                memcg = current->active_memcg;
+                if (unlikely(memcg))
+                        goto from_memcg;
+
+                objcg = READ_ONCE(current->objcg);
+                if (unlikely((unsigned long)objcg & CURRENT_OBJCG_UPDATE_FLAG))
+                        objcg = current_objcg_update();
+                /*
+                 * Objcg reference is kept by the task, so it's safe
+                 * for the current task to use the objcg.
+                 */
+                return objcg;
+        }
+
+        memcg = this_cpu_read(int_active_memcg);
+        if (unlikely(memcg))
+                goto from_memcg;
+
+        return NULL;
+
+from_memcg:
+        for (; !mem_cgroup_is_root(memcg); memcg = parent_mem_cgroup(memcg)) {
+                /*
+                 * Memcg pointer is protected by scope (see set_active_memcg())
+                 * and is pinning the corresponding objcg, so objcg can't go
+                 * away and can be used within the scope without any additional
+                 * protection.
+                 */
+                objcg = rcu_dereference_check(memcg->objcg, 1);
+                if (likely(objcg))
+                        break;
+                objcg = NULL;
+        }
+
+        return objcg;
+}
+
 struct obj_cgroup *get_obj_cgroup_from_folio(struct folio *folio)
 {
         struct obj_cgroup *objcg;
@@ -3173,15 +3216,15 @@ int __memcg_kmem_charge_page(struct page *page, gfp_t gfp, int order)
         struct obj_cgroup *objcg;
         int ret = 0;
 
-        objcg = get_obj_cgroup_from_current();
+        objcg = current_obj_cgroup();
         if (objcg) {
                 ret = obj_cgroup_charge_pages(objcg, gfp, 1 << order);
                 if (!ret) {
+                        obj_cgroup_get(objcg);
                         page->memcg_data = (unsigned long)objcg |
                                 MEMCG_DATA_KMEM;
                         return 0;
                 }
-                obj_cgroup_put(objcg);
         }
         return ret;
 }
diff --git a/mm/slab.h b/mm/slab.h
index 799a315695c6..3d07fb428393 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -484,7 +484,12 @@ static inline bool memcg_slab_pre_alloc_hook(struct kmem_cache *s,
         if (!(flags & __GFP_ACCOUNT) && !(s->flags & SLAB_ACCOUNT))
                 return true;
 
-        objcg = get_obj_cgroup_from_current();
+        /*
+         * The obtained objcg pointer is safe to use within the current scope,
+         * defined by the current task or a set_active_memcg() pair.
+         * obj_cgroup_get() is used to get a permanent reference.
+         */
+        objcg = current_obj_cgroup();
         if (!objcg)
                 return true;
 
@@ -497,17 +502,14 @@ static inline bool memcg_slab_pre_alloc_hook(struct kmem_cache *s,
                 css_put(&memcg->css);
 
                 if (ret)
-                        goto out;
+                        return false;
         }
 
         if (obj_cgroup_charge(objcg, flags, objects * obj_full_size(s)))
-                goto out;
+                return false;
 
         *objcgp = objcg;
         return true;
-out:
-        obj_cgroup_put(objcg);
-        return false;
 }
 
 static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s,
@@ -542,7 +544,6 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s,
                         obj_cgroup_uncharge(objcg, obj_full_size(s));
                 }
         }
-        obj_cgroup_put(objcg);
 }
 
 static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
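
The get/put-free pattern enabled by this patch can be sketched in isolation.
The userspace model below is not kernel code: current_obj_cgroup(), the hooks
and the refcount are simplified stand-ins. It only illustrates that the
pre-allocation hook borrows a scope-pinned pointer, while a long-lived
reference is taken in the post-allocation hook, and only when the pointer is
actually stored.

/* scoped_objcg_model.c - stand-alone userspace sketch, not kernel code. */
#include <stdbool.h>
#include <stdio.h>

struct objcg { int refcnt; };

static struct objcg task_objcg = { .refcnt = 1 };       /* pinned by the "task" */

/* Borrow only: the pointer stays valid for the calling scope, no get needed. */
static struct objcg *current_obj_cgroup(void) { return &task_objcg; }

static void objcg_get(struct objcg *o) { o->refcnt++; }
static bool objcg_charge(struct objcg *o, unsigned int bytes)
{
        (void)o; (void)bytes;
        return true;            /* pretend the charge always succeeds */
}

/* Pre-allocation hook: only charges; takes no reference. */
static struct objcg *pre_alloc_hook(unsigned int bytes)
{
        struct objcg *objcg = current_obj_cgroup();

        if (!objcg || !objcg_charge(objcg, bytes))
                return NULL;
        return objcg;           /* valid for the rest of this scope */
}

/* Post-allocation hook: a reference is taken only when the pointer is stored
 * next to the allocated object, where it outlives the scope. */
static void post_alloc_hook(struct objcg *objcg, struct objcg **slot)
{
        objcg_get(objcg);
        *slot = objcg;
}

int main(void)
{
        struct objcg *slot = NULL;
        struct objcg *objcg = pre_alloc_hook(64);

        if (objcg)
                post_alloc_hook(objcg, &slot);
        printf("refcnt=%d (1 task pin + 1 per stored pointer)\n", task_objcg.refcnt);
        return 0;
}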

From patchwork Mon Oct 16 22:19:00 2023
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 153788
From: Roman Gushchin
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    Johannes Weiner, Michal Hocko, Shakeel Butt <shakeelb@google.com>,
    Muchun Song, Dennis Zhou, David Rientjes, Vlastimil Babka,
    Naresh Kamboju, Roman Gushchin
Subject: [PATCH v3 5/5] percpu: scoped objcg protection
Date: Mon, 16 Oct 2023 15:19:00 -0700
Message-ID: <20231016221900.4031141-6-roman.gushchin@linux.dev>
In-Reply-To: <20231016221900.4031141-1-roman.gushchin@linux.dev>
References: <20231016221900.4031141-1-roman.gushchin@linux.dev>

Similar to slab and kmem, switch to a scope-based protection of the
objcg pointer to avoid an unnecessary reference get/put pair on the
percpu allocation path.

Signed-off-by: Roman Gushchin (Cruise)
Tested-by: Naresh Kamboju
Acked-by: Shakeel Butt
Reviewed-by: Vlastimil Babka
---
 mm/percpu.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/mm/percpu.c b/mm/percpu.c
index a7665de8485f..f53ba692d67a 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1628,14 +1628,12 @@ static bool pcpu_memcg_pre_alloc_hook(size_t size, gfp_t gfp,
         if (!memcg_kmem_online() || !(gfp & __GFP_ACCOUNT))
                 return true;
 
-        objcg = get_obj_cgroup_from_current();
+        objcg = current_obj_cgroup();
         if (!objcg)
                 return true;
 
-        if (obj_cgroup_charge(objcg, gfp, pcpu_obj_full_size(size))) {
-                obj_cgroup_put(objcg);
+        if (obj_cgroup_charge(objcg, gfp, pcpu_obj_full_size(size)))
                 return false;
-        }
 
         *objcgp = objcg;
         return true;
@@ -1649,6 +1647,7 @@ static void pcpu_memcg_post_alloc_hook(struct obj_cgroup *objcg,
                 return;
 
         if (likely(chunk && chunk->obj_cgroups)) {
+                obj_cgroup_get(objcg);
                 chunk->obj_cgroups[off >> PCPU_MIN_ALLOC_SHIFT] = objcg;
 
                 rcu_read_lock();
@@ -1657,7 +1656,6 @@ static void pcpu_memcg_post_alloc_hook(struct obj_cgroup *objcg,
                 rcu_read_unlock();
         } else {
                 obj_cgroup_uncharge(objcg, pcpu_obj_full_size(size));
-                obj_cgroup_put(objcg);
         }
 }
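
For the percpu case the same idea looks roughly like the userspace sketch below
(not kernel code; all names are invented stand-ins): the reference is taken only
on the success branch where the objcg pointer is stored in the chunk, while the
failure branch merely undoes the charge, since the pre-allocation hook no longer
took a reference.

/* pcpu_objcg_model.c - stand-alone userspace sketch, not kernel code. */
#include <stdio.h>

struct objcg { int refcnt; };

static void objcg_get(struct objcg *o) { o->refcnt++; }
static void objcg_uncharge(struct objcg *o, unsigned int bytes)
{
        (void)o; (void)bytes;   /* only the accounting would be undone here */
}

static void pcpu_post_alloc_hook(struct objcg *objcg, struct objcg **chunk_slot,
                                 unsigned int bytes)
{
        if (chunk_slot) {               /* allocation landed in a chunk */
                objcg_get(objcg);       /* reference pinned by the stored pointer */
                *chunk_slot = objcg;
        } else {                        /* allocation failed: just undo the charge */
                objcg_uncharge(objcg, bytes);
        }
}

int main(void)
{
        struct objcg o = { .refcnt = 1 };       /* pinned by the calling scope */
        struct objcg *slot = NULL;

        pcpu_post_alloc_hook(&o, &slot, 64);    /* success path */
        pcpu_post_alloc_hook(&o, NULL, 64);     /* failure path: no put required */
        printf("refcnt=%d\n", o.refcnt);        /* 2: scope pin + stored pointer */
        return 0;
}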