From patchwork Thu Oct 19 22:53:41 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 155736
From: Roman Gushchin
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song, Dennis Zhou,
 David Rientjes, Vlastimil Babka, Naresh Kamboju, Roman Gushchin
Subject: [PATCH v5 1/6] mm: kmem: optimize get_obj_cgroup_from_current()
Date: Thu, 19 Oct 2023 15:53:41 -0700
Message-ID: <20231019225346.1822282-2-roman.gushchin@linux.dev>
In-Reply-To: <20231019225346.1822282-1-roman.gushchin@linux.dev>
References: <20231019225346.1822282-1-roman.gushchin@linux.dev>

Manually inline memcg_kmem_bypass() and active_memcg() to speed up
get_obj_cgroup_from_current() by avoiding duplicate in_task() checks and
active_memcg() readings.
Also add a likely() macro to __get_obj_cgroup_from_memcg():
obj_cgroup_tryget() should succeed almost always, except for a very
unlikely race with the memcg deletion path.

Signed-off-by: Roman Gushchin (Cruise)
Tested-by: Naresh Kamboju
Acked-by: Shakeel Butt
Acked-by: Johannes Weiner
Reviewed-by: Vlastimil Babka
---
 mm/memcontrol.c | 34 ++++++++++++++--------------------
 1 file changed, 14 insertions(+), 20 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9741d62d0424..16ac2a5838fb 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1068,19 +1068,6 @@ struct mem_cgroup *get_mem_cgroup_from_mm(struct mm_struct *mm)
 }
 EXPORT_SYMBOL(get_mem_cgroup_from_mm);
 
-static __always_inline bool memcg_kmem_bypass(void)
-{
-	/* Allow remote memcg charging from any context. */
-	if (unlikely(active_memcg()))
-		return false;
-
-	/* Memcg to charge can't be determined. */
-	if (!in_task() || !current->mm || (current->flags & PF_KTHREAD))
-		return true;
-
-	return false;
-}
-
 /**
  * mem_cgroup_iter - iterate over memory cgroup hierarchy
  * @root: hierarchy root
@@ -3007,7 +2994,7 @@ static struct obj_cgroup *__get_obj_cgroup_from_memcg(struct mem_cgroup *memcg)
 
 	for (; !mem_cgroup_is_root(memcg); memcg = parent_mem_cgroup(memcg)) {
 		objcg = rcu_dereference(memcg->objcg);
-		if (objcg && obj_cgroup_tryget(objcg))
+		if (likely(objcg && obj_cgroup_tryget(objcg)))
 			break;
 		objcg = NULL;
 	}
@@ -3016,16 +3003,23 @@ static struct obj_cgroup *__get_obj_cgroup_from_memcg(struct mem_cgroup *memcg)
 
 __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
 {
-	struct obj_cgroup *objcg = NULL;
 	struct mem_cgroup *memcg;
+	struct obj_cgroup *objcg;
 
-	if (memcg_kmem_bypass())
-		return NULL;
+	if (in_task()) {
+		memcg = current->active_memcg;
+
+		/* Memcg to charge can't be determined. */
+		if (likely(!memcg) && (!current->mm || (current->flags & PF_KTHREAD)))
+			return NULL;
+	} else {
+		memcg = this_cpu_read(int_active_memcg);
+		if (likely(!memcg))
+			return NULL;
+	}
 
 	rcu_read_lock();
-	if (unlikely(active_memcg()))
-		memcg = active_memcg();
-	else
+	if (!memcg)
 		memcg = mem_cgroup_from_task(current);
 	objcg = __get_obj_cgroup_from_memcg(memcg);
 	rcu_read_unlock();

From patchwork Thu Oct 19 22:53:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 155731
From: Roman Gushchin
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song, Dennis Zhou,
 David Rientjes, Vlastimil Babka, Naresh Kamboju, Roman Gushchin
Subject: [PATCH v5 2/6] mm: kmem: add direct objcg pointer to task_struct
Date: Thu, 19 Oct 2023 15:53:42 -0700
Message-ID: <20231019225346.1822282-3-roman.gushchin@linux.dev>
In-Reply-To: <20231019225346.1822282-1-roman.gushchin@linux.dev>
References: <20231019225346.1822282-1-roman.gushchin@linux.dev>

To charge a freshly allocated kernel object to a memory cgroup, the
kernel needs to obtain an objcg pointer.
Currently it does so indirectly, by obtaining the memcg pointer first
and then calling __get_obj_cgroup_from_memcg().

Usually tasks spend their entire life belonging to the same object
cgroup. So it makes sense to save the objcg pointer on task_struct
directly, so it can be obtained faster. This requires some work on the
fork, exit and cgroup migration paths, but these paths are much colder.

To avoid any costly synchronization, the following rules are applied:
1) A task sets its objcg pointer itself.

2) If a task is being migrated to another cgroup, the least
   significant bit of the objcg pointer is set atomically.

3) On the allocation path the objcg pointer is obtained locklessly
   using the READ_ONCE() macro and the least significant bit is
   checked. If it's set, the following procedure is used to update
   it locklessly:
       - task->objcg is zeroed using xchg
       - new objcg pointer is obtained
       - task->objcg is updated using try_cmpxchg
       - operation is repeated if try_cmpxchg fails
   This guarantees that no updates will be lost if task migration
   is racing against the objcg pointer update. It also keeps both
   the read and write paths fully lockless.

Because the task is keeping a reference to the objcg, it can't go away
while the task is alive.

This commit doesn't change the way the remote memcg charging works.
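For readers who want to experiment with the three rules above outside
the kernel, here is a minimal userspace sketch using C11 atomics. It is
not the patch's code: struct task, its "home" field, task_init(),
task_migrate(), task_objcg() and task_objcg_update() are hypothetical
names, struct objcg is reduced to a bare reference count, and the RCU
protected memcg hierarchy walk is replaced by a plain atomic load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 0 of the tagged pointer means "an update is pending" (rule 2). */
#define UPDATE_FLAG ((uintptr_t)1)

/* Hypothetical stand-in for struct obj_cgroup: just a reference count. */
struct objcg {
	atomic_int refcnt;
};

/* Hypothetical stand-in for the relevant parts of task_struct. */
struct task {
	_Atomic uintptr_t objcg;      /* cached objcg pointer plus update flag */
	_Atomic(struct objcg *) home; /* objcg of the cgroup the task is in */
};

static void objcg_init(struct objcg *o) { atomic_init(&o->refcnt, 1); }
static void objcg_get(struct objcg *o) { atomic_fetch_add(&o->refcnt, 1); }
static void objcg_put(struct objcg *o) { atomic_fetch_sub(&o->refcnt, 1); }

/* Fork: start with only the flag set, so the pointer is resolved lazily. */
static void task_init(struct task *t, struct objcg *home)
{
	atomic_init(&t->home, home);
	atomic_init(&t->objcg, UPDATE_FLAG);
}

/* Migration (any thread): switch cgroups, then atomically set the flag. */
static void task_migrate(struct task *t, struct objcg *new_home)
{
	atomic_store(&t->home, new_home);
	atomic_fetch_or(&t->objcg, UPDATE_FLAG);
}

/* Slow path (rule 3), meant to be run only by the task itself (rule 1). */
static struct objcg *task_objcg_update(struct task *t)
{
	struct objcg *objcg = NULL;
	uintptr_t old, expected;

	do {
		/* Atomically take the old value and clear the flag. */
		old = atomic_exchange(&t->objcg, 0);
		old &= ~UPDATE_FLAG;
		if (old)
			objcg_put((struct objcg *)old);

		/* Drop the candidate left over from a failed iteration. */
		if (objcg)
			objcg_put(objcg);

		/* Look up the current objcg and take the task's reference. */
		objcg = atomic_load(&t->home);
		objcg_get(objcg);

		/* Fails if a migration set the flag in the meantime. */
		expected = 0;
	} while (!atomic_compare_exchange_strong(&t->objcg, &expected,
						 (uintptr_t)objcg));

	return objcg;
}

/* Fast path: one load and one bit test when no migration is pending. */
static struct objcg *task_objcg(struct task *t)
{
	uintptr_t v = atomic_load(&t->objcg);

	if (v & UPDATE_FLAG)
		return task_objcg_update(t);
	return (struct objcg *)v;
}
```

In this sketch a caller would run task_init() once, call task_objcg()
on every allocation, and let another thread call task_migrate(); the
next task_objcg() call then drops the reference to the old objcg and
caches the new one.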
Signed-off-by: Roman Gushchin (Cruise)
Tested-by: Naresh Kamboju
Acked-by: Johannes Weiner
Acked-by: Shakeel Butt
Reviewed-by: Vlastimil Babka
---
 include/linux/sched.h |   4 ++
 mm/memcontrol.c       | 139 +++++++++++++++++++++++++++++++++++++++---
 2 files changed, 134 insertions(+), 9 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 77f01ac385f7..60de42715b56 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1443,6 +1443,10 @@ struct task_struct {
 	struct mem_cgroup *active_memcg;
 #endif
 
+#ifdef CONFIG_MEMCG_KMEM
+	struct obj_cgroup *objcg;
+#endif
+
 #ifdef CONFIG_BLK_CGROUP
 	struct gendisk *throttle_disk;
 #endif
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 16ac2a5838fb..4c4b1f85f939 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -249,6 +249,9 @@ struct mem_cgroup *vmpressure_to_memcg(struct vmpressure *vmpr)
 	return container_of(vmpr, struct mem_cgroup, vmpressure);
 }
 
+#define CURRENT_OBJCG_UPDATE_BIT 0
+#define CURRENT_OBJCG_UPDATE_FLAG (1UL << CURRENT_OBJCG_UPDATE_BIT)
+
 #ifdef CONFIG_MEMCG_KMEM
 static DEFINE_SPINLOCK(objcg_lock);
 
@@ -3001,6 +3004,58 @@ static struct obj_cgroup *__get_obj_cgroup_from_memcg(struct mem_cgroup *memcg)
 	return objcg;
 }
 
+static struct obj_cgroup *current_objcg_update(void)
+{
+	struct mem_cgroup *memcg;
+	struct obj_cgroup *old, *objcg = NULL;
+
+	do {
+		/* Atomically drop the update bit. */
+		old = xchg(&current->objcg, NULL);
+		if (old) {
+			old = (struct obj_cgroup *)
+				((unsigned long)old & ~CURRENT_OBJCG_UPDATE_FLAG);
+			if (old)
+				obj_cgroup_put(old);
+
+			old = NULL;
+		}
+
+		/* If new objcg is NULL, no reason for the second atomic update. */
+		if (!current->mm || (current->flags & PF_KTHREAD))
+			return NULL;
+
+		/*
+		 * Release the objcg pointer from the previous iteration,
+		 * if try_cmpxchg() below fails.
+		 */
+		if (unlikely(objcg)) {
+			obj_cgroup_put(objcg);
+			objcg = NULL;
+		}
+
+		/*
+		 * Obtain the new objcg pointer. The current task can be
+		 * asynchronously moved to another memcg and the previous
+		 * memcg can be offlined. So let's get the memcg pointer
+		 * and try get a reference to objcg under a rcu read lock.
+		 */
+
+		rcu_read_lock();
+		memcg = mem_cgroup_from_task(current);
+		objcg = __get_obj_cgroup_from_memcg(memcg);
+		rcu_read_unlock();
+
+		/*
+		 * Try set up a new objcg pointer atomically. If it
+		 * fails, it means the update flag was set concurrently, so
+		 * the whole procedure should be repeated.
+		 */
+	} while (!try_cmpxchg(&current->objcg, &old, objcg));
+
+	return objcg;
+}
+
 __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
 {
 	struct mem_cgroup *memcg;
@@ -3008,19 +3063,26 @@ __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
 
 	if (in_task()) {
 		memcg = current->active_memcg;
+		if (unlikely(memcg))
+			goto from_memcg;
 
-		/* Memcg to charge can't be determined. */
-		if (likely(!memcg) && (!current->mm || (current->flags & PF_KTHREAD)))
-			return NULL;
+		objcg = READ_ONCE(current->objcg);
+		if (unlikely((unsigned long)objcg & CURRENT_OBJCG_UPDATE_FLAG))
+			objcg = current_objcg_update();
+
+		if (objcg) {
+			obj_cgroup_get(objcg);
+			return objcg;
+		}
 	} else {
 		memcg = this_cpu_read(int_active_memcg);
-		if (likely(!memcg))
-			return NULL;
+		if (unlikely(memcg))
+			goto from_memcg;
 	}
+	return NULL;
 
+from_memcg:
 	rcu_read_lock();
-	if (!memcg)
-		memcg = mem_cgroup_from_task(current);
 	objcg = __get_obj_cgroup_from_memcg(memcg);
 	rcu_read_unlock();
 	return objcg;
@@ -6345,6 +6407,7 @@ static void mem_cgroup_move_task(void)
 		mem_cgroup_clear_mc();
 	}
 }
+
 #else /* !CONFIG_MMU */
 static int mem_cgroup_can_attach(struct cgroup_taskset *tset)
 {
@@ -6358,8 +6421,39 @@ static void mem_cgroup_move_task(void)
 }
 #endif
 
+#ifdef CONFIG_MEMCG_KMEM
+static void mem_cgroup_fork(struct task_struct *task)
+{
+	/*
+	 * Set the update flag to cause task->objcg to be initialized lazily
+	 * on the first allocation. It can be done without any synchronization
+	 * because it's always performed on the current task, as is
+	 * current_objcg_update().
+	 */
+	task->objcg = (struct obj_cgroup *)CURRENT_OBJCG_UPDATE_FLAG;
+}
+
+static void mem_cgroup_exit(struct task_struct *task)
+{
+	struct obj_cgroup *objcg = task->objcg;
+
+	objcg = (struct obj_cgroup *)
+		((unsigned long)objcg & ~CURRENT_OBJCG_UPDATE_FLAG);
+	if (objcg)
+		obj_cgroup_put(objcg);
+
+	/*
+	 * Some kernel allocations can happen after this point,
+	 * but let's ignore them. It can be done without any synchronization
+	 * because it's always performed on the current task, as is
+	 * current_objcg_update().
+	 */
+	task->objcg = NULL;
+}
+#endif
+
 #ifdef CONFIG_LRU_GEN
-static void mem_cgroup_attach(struct cgroup_taskset *tset)
+static void mem_cgroup_lru_gen_attach(struct cgroup_taskset *tset)
 {
 	struct task_struct *task;
 	struct cgroup_subsys_state *css;
@@ -6377,10 +6471,31 @@ static void mem_cgroup_attach(struct cgroup_taskset *tset)
 	task_unlock(task);
 }
 #else
+static void mem_cgroup_lru_gen_attach(struct cgroup_taskset *tset) {}
+#endif /* CONFIG_LRU_GEN */
+
+#ifdef CONFIG_MEMCG_KMEM
+static void mem_cgroup_kmem_attach(struct cgroup_taskset *tset)
+{
+	struct task_struct *task;
+	struct cgroup_subsys_state *css;
+
+	cgroup_taskset_for_each(task, css, tset) {
+		/* atomically set the update bit */
+		set_bit(CURRENT_OBJCG_UPDATE_BIT, (unsigned long *)&task->objcg);
+	}
+}
+#else
+static void mem_cgroup_kmem_attach(struct cgroup_taskset *tset) {}
+#endif /* CONFIG_MEMCG_KMEM */
+
+#if defined(CONFIG_LRU_GEN) || defined(CONFIG_MEMCG_KMEM)
 static void mem_cgroup_attach(struct cgroup_taskset *tset)
 {
+	mem_cgroup_lru_gen_attach(tset);
+	mem_cgroup_kmem_attach(tset);
 }
-#endif /* CONFIG_LRU_GEN */
+#endif
 
 static int seq_puts_memcg_tunable(struct seq_file *m, unsigned long value)
 {
@@ -6824,9 +6939,15 @@ struct cgroup_subsys memory_cgrp_subsys = {
 	.css_reset = mem_cgroup_css_reset,
 	.css_rstat_flush = mem_cgroup_css_rstat_flush,
 	.can_attach = mem_cgroup_can_attach,
+#if defined(CONFIG_LRU_GEN) || defined(CONFIG_MEMCG_KMEM)
 	.attach = mem_cgroup_attach,
+#endif
 	.cancel_attach = mem_cgroup_cancel_attach,
 	.post_attach = mem_cgroup_move_task,
+#ifdef CONFIG_MEMCG_KMEM
+	.fork = mem_cgroup_fork,
+	.exit = mem_cgroup_exit,
+#endif
 	.dfl_cftypes = memory_files,
 	.legacy_cftypes = mem_cgroup_legacy_files,
 	.early_init = 0,

From patchwork Thu Oct 19 22:53:43 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 155732
From: Roman Gushchin
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song, Dennis Zhou,
 David Rientjes, Vlastimil Babka, Naresh Kamboju, Roman Gushchin
Subject: [PATCH v5 3/6] mm: kmem: make memcg keep a reference to the original objcg
Date: Thu, 19 Oct 2023 15:53:43 -0700
Message-ID: <20231019225346.1822282-4-roman.gushchin@linux.dev>
In-Reply-To: <20231019225346.1822282-1-roman.gushchin@linux.dev>
References: <20231019225346.1822282-1-roman.gushchin@linux.dev>

Keep a reference to the original objcg object
for the entire life of a memcg structure.

This simplifies the synchronization on the kernel memory allocation
paths: pinning a (live) memcg will also pin the corresponding objcg.

The memory overhead of this change is minimal because object cgroups
usually outlive their corresponding memory cgroups even without this
change, so it's only an additional pointer per memcg.

Signed-off-by: Roman Gushchin (Cruise)
Tested-by: Naresh Kamboju
Acked-by: Shakeel Butt
Reviewed-by: Vlastimil Babka
---
 include/linux/memcontrol.h | 8 +++++++-
 mm/memcontrol.c            | 5 +++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index ab94ad4597d0..277690af383d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -299,7 +299,13 @@ struct mem_cgroup {
 
 #ifdef CONFIG_MEMCG_KMEM
 	int kmemcg_id;
-	struct obj_cgroup __rcu *objcg;
+	/*
+	 * memcg->objcg is wiped out as a part of the objcg reparenting
+	 * process. memcg->orig_objcg preserves a pointer (and a reference)
+	 * to the original objcg until the end of life of memcg.
+	 */
+	struct obj_cgroup __rcu *objcg;
+	struct obj_cgroup *orig_objcg;
 	/* list of inherited objcgs, protected by objcg_lock */
 	struct list_head objcg_list;
 #endif
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 4c4b1f85f939..d964b91f00c8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3817,6 +3817,8 @@ static int memcg_online_kmem(struct mem_cgroup *memcg)
 
 	objcg->memcg = memcg;
 	rcu_assign_pointer(memcg->objcg, objcg);
+	obj_cgroup_get(objcg);
+	memcg->orig_objcg = objcg;
 
 	static_branch_enable(&memcg_kmem_online_key);
 
@@ -5311,6 +5313,9 @@ static void __mem_cgroup_free(struct mem_cgroup *memcg)
 {
 	int node;
 
+	if (memcg->orig_objcg)
+		obj_cgroup_put(memcg->orig_objcg);
+
 	for_each_node(node)
 		free_mem_cgroup_per_node_info(memcg, node);
 	kfree(memcg->vmstats);

From patchwork Thu Oct 19 22:53:44 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roman Gushchin
X-Patchwork-Id: 155734
From: Roman Gushchin
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song, Dennis Zhou,
 David Rientjes, Vlastimil Babka, Naresh Kamboju, Roman Gushchin
Subject: [PATCH v5 4/6] mm: kmem: scoped objcg protection
Date: Thu, 19 Oct 2023 15:53:44 -0700
Message-ID: <20231019225346.1822282-5-roman.gushchin@linux.dev>
In-Reply-To: <20231019225346.1822282-1-roman.gushchin@linux.dev>
References: <20231019225346.1822282-1-roman.gushchin@linux.dev>

Switch to a scope-based protection of the objcg pointer on slab/kmem
allocation paths. Instead of using the get_() semantics in the
pre-allocation hook and putting the reference afterwards, let's rely
on the fact that objcg is pinned by the scope.

It's possible because:
1) if the objcg is received from the current task struct, the task is
   keeping a reference to the objcg.
2) if the objcg is received from an active memcg (remote charging),
   the memcg is pinned by the scope and has a reference to the
   corresponding objcg.

Signed-off-by: Roman Gushchin (Cruise)
Tested-by: Naresh Kamboju
Acked-by: Shakeel Butt
Reviewed-by: Vlastimil Babka
---
 include/linux/memcontrol.h |  9 ++++++++
 include/linux/sched/mm.h   |  4 ++++
 mm/memcontrol.c            | 47 ++++++++++++++++++++++++++++++++++++--
 mm/slab.h                  | 15 ++++++------
 4 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 277690af383d..a89df289144d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1769,6 +1769,15 @@ bool mem_cgroup_kmem_disabled(void);
 int __memcg_kmem_charge_page(struct page *page, gfp_t gfp, int order);
 void __memcg_kmem_uncharge_page(struct page *page, int order);
 
+/*
+ * The returned objcg pointer is safe to use without additional
+ * protection within a scope. The scope is defined either by
+ * the current task (similar to the "current" global variable)
+ * or by a set_active_memcg() pair.
+ * Please use obj_cgroup_get() to get a reference if the pointer
+ * needs to be used outside of the local scope.
+ */ +struct obj_cgroup *current_obj_cgroup(void); struct obj_cgroup *get_obj_cgroup_from_current(void); struct obj_cgroup *get_obj_cgroup_from_folio(struct folio *folio); diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h index 8d89c8c4fac1..9a19f1b42f64 100644 --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -403,6 +403,10 @@ DECLARE_PER_CPU(struct mem_cgroup *, int_active_memcg); * __GFP_ACCOUNT allocations till the end of the scope will be charged to the * given memcg. * + * Please, make sure that caller has a reference to the passed memcg structure, + * so its lifetime is guaranteed to exceed the scope between two + * set_active_memcg() calls. + * * NOTE: This function can nest. Users must save the return value and * reset the previous value after their own charging scope is over. */ diff --git a/mm/memcontrol.c b/mm/memcontrol.c index d964b91f00c8..e3d4b7fabb7d 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3088,6 +3088,49 @@ __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void) return objcg; } +__always_inline struct obj_cgroup *current_obj_cgroup(void) +{ + struct mem_cgroup *memcg; + struct obj_cgroup *objcg; + + if (in_task()) { + memcg = current->active_memcg; + if (unlikely(memcg)) + goto from_memcg; + + objcg = READ_ONCE(current->objcg); + if (unlikely((unsigned long)objcg & CURRENT_OBJCG_UPDATE_FLAG)) + objcg = current_objcg_update(); + /* + * Objcg reference is kept by the task, so it's safe + * to use the objcg by the current task. + */ + return objcg; + } + + memcg = this_cpu_read(int_active_memcg); + if (unlikely(memcg)) + goto from_memcg; + + return NULL; + +from_memcg: + for (; !mem_cgroup_is_root(memcg); memcg = parent_mem_cgroup(memcg)) { + /* + * Memcg pointer is protected by scope (see set_active_memcg()) + * and is pinning the corresponding objcg, so objcg can't go + * away and can be used within the scope without any additional + * protection. 
+ */ + objcg = rcu_dereference_check(memcg->objcg, 1); + if (likely(objcg)) + break; + objcg = NULL; + } + + return objcg; +} + struct obj_cgroup *get_obj_cgroup_from_folio(struct folio *folio) { struct obj_cgroup *objcg; @@ -3182,15 +3225,15 @@ int __memcg_kmem_charge_page(struct page *page, gfp_t gfp, int order) struct obj_cgroup *objcg; int ret = 0; - objcg = get_obj_cgroup_from_current(); + objcg = current_obj_cgroup(); if (objcg) { ret = obj_cgroup_charge_pages(objcg, gfp, 1 << order); if (!ret) { + obj_cgroup_get(objcg); page->memcg_data = (unsigned long)objcg | MEMCG_DATA_KMEM; return 0; } - obj_cgroup_put(objcg); } return ret; } diff --git a/mm/slab.h b/mm/slab.h index 799a315695c6..3d07fb428393 100644 --- a/mm/slab.h +++ b/mm/slab.h @@ -484,7 +484,12 @@ static inline bool memcg_slab_pre_alloc_hook(struct kmem_cache *s, if (!(flags & __GFP_ACCOUNT) && !(s->flags & SLAB_ACCOUNT)) return true; - objcg = get_obj_cgroup_from_current(); + /* + * The obtained objcg pointer is safe to use within the current scope, + * defined by current task or set_active_memcg() pair. + * obj_cgroup_get() is used to get a permanent reference. 
+ */ + objcg = current_obj_cgroup(); if (!objcg) return true; @@ -497,17 +502,14 @@ static inline bool memcg_slab_pre_alloc_hook(struct kmem_cache *s, css_put(&memcg->css); if (ret) - goto out; + return false; } if (obj_cgroup_charge(objcg, flags, objects * obj_full_size(s))) - goto out; + return false; *objcgp = objcg; return true; -out: - obj_cgroup_put(objcg); - return false; } static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s, @@ -542,7 +544,6 @@ static inline void memcg_slab_post_alloc_hook(struct kmem_cache *s, obj_cgroup_uncharge(objcg, obj_full_size(s)); } } - obj_cgroup_put(objcg); } static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab, From patchwork Thu Oct 19 22:53:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Roman Gushchin X-Patchwork-Id: 155733 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:2010:b0:403:3b70:6f57 with SMTP id fe16csp698470vqb; Thu, 19 Oct 2023 15:54:45 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGmNe7mGeHWIL7Lkag1zTXTUyta73rmwFF7y7luzJVSM8YnK3BDZCBpafIkTikZeOvUF1W3 X-Received: by 2002:a05:6a20:144c:b0:17b:2b7e:923c with SMTP id a12-20020a056a20144c00b0017b2b7e923cmr186629pzi.16.1697756085404; Thu, 19 Oct 2023 15:54:45 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1697756085; cv=none; d=google.com; s=arc-20160816; b=lCT+mP2BwogvpF7I/rkWP7O1OSQGu2MZAdya3bk4MoB/sSdlNw9cmRLa5YFnrywiZm nyaMuQTw//ct+En1jjzMNa0+G0DCiYWFtUYyOnw/Hjj2sgV9TW3hWxCn0m2KRBFVkSAq hlSn+AgaEr1BCTWF/blcVhP5J1U5R7nqRgChzisqW9xm65FNC4YZ/XL7+0OZOz7UkQy/ B11xgvf9F7VKpeUpAHtbbQhrdFKzANu1Lx3APQDRKJUqCFKChGpNo6DZGBbabWCLz9v1 4dAnHDKQ83rF+kbhMqNMdoE0g6PeRcoUx8z/ERAXNg3Z1KZzlfJh+G16AcWXlHfvfJbP HzEw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; 
bh=BnzSFyH1H48PSMDn7iqVZlFC9YDx4r15eixWsUbXpKs=; fh=0sD7u+6ZeVLabXSJ0/ta2yf8mUfUXWwh1sJBhHvFyso=; b=OPX/MTnnw2APqYslJBrwG/zP/VLgGrQZL1JngJpOsBYCulnqcQmotcxVSjCoTJbNht KL3gQoeKrywN00fo1OLKCENQOfEbP9ZfILpDd1U/UY75e6mpzLxHdkAxb1OVJZwFtxPb zzFJ6lwby/6YkWAP8Ijq33r544R75omTACEWEDYrW3xuPOW0ogvMqtLEi7qMDbmpb/2r aPfY22ZHM3mXqV0p6IlYnsqHhJXQdvKo+DT23iO/1HhCTaheQm0DOBM2DdblwhepCmoI hmw62bMNCBZpfO8tEfqKCgejrNjSih1euYBpo3Eciw05X4YwZxDeyLxB1NwjeXWv2A/6 4oyw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=JnrPJVXC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id d124-20020a633682000000b005abac05ba94si549350pga.776.2023.10.19.15.54.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 19 Oct 2023 15:54:45 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=JnrPJVXC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 876B48280D65; Thu, 19 Oct 2023 15:54:44 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1346719AbjJSWyi (ORCPT + 26 others); Thu, 19 Oct 2023 18:54:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33714 "EHLO 
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1346741AbjJSWy3 (ORCPT ); Thu, 19 Oct 2023 18:54:29 -0400 Received: from out-197.mta0.migadu.com (out-197.mta0.migadu.com [IPv6:2001:41d0:1004:224b::c5]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 227D2126 for ; Thu, 19 Oct 2023 15:54:25 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1697756064; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BnzSFyH1H48PSMDn7iqVZlFC9YDx4r15eixWsUbXpKs=; b=JnrPJVXCVoN8MXZPI6RkQGBCNH63eYwmUrhXUzgISfnhW0IuWEsi18lYzeeVWxfLpgH4GW AMLpenUkuseR/R6nGzgHYvx/b8MQBc71GCbNQ7H4HmhTQkLQ3nhzStsySvA+ZV2v2EjjBo skR/Nl40nvBP7E2e2M3bFod11TDnBcY= From: Roman Gushchin To: Andrew Morton Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, Johannes Weiner , Michal Hocko , Shakeel Butt , Muchun Song , Dennis Zhou , David Rientjes , Vlastimil Babka , Naresh Kamboju , Roman Gushchin Subject: [PATCH v5 5/6] percpu: scoped objcg protection Date: Thu, 19 Oct 2023 15:53:45 -0700 Message-ID: <20231019225346.1822282-6-roman.gushchin@linux.dev> In-Reply-To: <20231019225346.1822282-1-roman.gushchin@linux.dev> References: <20231019225346.1822282-1-roman.gushchin@linux.dev> MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Thu, 19 
Oct 2023 15:54:44 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780226284909752932 X-GMAIL-MSGID: 1780226284909752932 Similar to slab and kmem, switch to a scope-based protection of the objcg pointer to avoid. Signed-off-by: Roman Gushchin (Cruise) Tested-by: Naresh Kamboju Acked-by: Shakeel Butt Reviewed-by: Vlastimil Babka --- mm/percpu.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/mm/percpu.c b/mm/percpu.c index a7665de8485f..f53ba692d67a 100644 --- a/mm/percpu.c +++ b/mm/percpu.c @@ -1628,14 +1628,12 @@ static bool pcpu_memcg_pre_alloc_hook(size_t size, gfp_t gfp, if (!memcg_kmem_online() || !(gfp & __GFP_ACCOUNT)) return true; - objcg = get_obj_cgroup_from_current(); + objcg = current_obj_cgroup(); if (!objcg) return true; - if (obj_cgroup_charge(objcg, gfp, pcpu_obj_full_size(size))) { - obj_cgroup_put(objcg); + if (obj_cgroup_charge(objcg, gfp, pcpu_obj_full_size(size))) return false; - } *objcgp = objcg; return true; @@ -1649,6 +1647,7 @@ static void pcpu_memcg_post_alloc_hook(struct obj_cgroup *objcg, return; if (likely(chunk && chunk->obj_cgroups)) { + obj_cgroup_get(objcg); chunk->obj_cgroups[off >> PCPU_MIN_ALLOC_SHIFT] = objcg; rcu_read_lock(); @@ -1657,7 +1656,6 @@ static void pcpu_memcg_post_alloc_hook(struct obj_cgroup *objcg, rcu_read_unlock(); } else { obj_cgroup_uncharge(objcg, pcpu_obj_full_size(size)); - obj_cgroup_put(objcg); } } From patchwork Thu Oct 19 22:53:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Roman Gushchin X-Patchwork-Id: 155735 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:2010:b0:403:3b70:6f57 with SMTP id fe16csp698484vqb; Thu, 19 Oct 2023 15:54:47 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEsilCBEYea2q2QpTGjriATrDBdjQ5Yw5k/8frE70v7sf0bNEqXgpc9n5JaWYJJrMQxLtUJ X-Received: by 2002:a17:90a:69c1:b0:27d:f85:9505 with SMTP id 
s59-20020a17090a69c100b0027d0f859505mr280419pjj.24.1697756087198; Thu, 19 Oct 2023 15:54:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1697756087; cv=none; d=google.com; s=arc-20160816; b=OBl1Pg+81t2BvY5vgrixCcCSxysXO/XfMqrAgWNp+hIuPV1813SCSTdzuXRaRjmXig /k96nnLGQ73pBYoJgyxJottgwgGGnVZQUD7meBW+kM3vPc1eDAU4Ltz3bVREkC/tBGDv spa6Ev9q1xZAQaQafXbYCc9iY/ZKmpVsdO8g78b7fMVI4zkCAJcQIgawk61Gu4zuEfLr VbcSWFen69zut+nOek8n9cTGMPZg3P7PjErX+zp5t1QktksuitDtQXci0M45xoTHv6nX sNyph30h3w+6YMohLYSzOX1nJMjHEG4UmK19IM3UqvUv4amu1K/PbPR4H41bcCSoTteT m8vQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=dS7frJRdSMuPJ7DdWWAjQ+cQ8KaRwRE0njPFfiYUqhI=; fh=0sD7u+6ZeVLabXSJ0/ta2yf8mUfUXWwh1sJBhHvFyso=; b=xGYxdvomebr+sZBulrTu2ApRGLU0Folm0o3xP59QNQwYyjllW/8E+HOiagDeZSnnbA qd6mGXb9Qe20x9G376Aup5NQFy1XIYCI9KulPr8jyRXR/Y2kcK9T3q9BdMXTsOAN6IuY GZCPrlAjRjV/QYX0N4gApZyJigukT9gh+3m0q5DRF4IO8WcxEWrPU9apJAWgzk1bJAZH NsqjmalFcWe1NFSuvxUTrE0JJpnKsvsTOOGAW4xngQer53qOkF50S1Zhs+q75KBWmaY9 rfmUas36wn6rBGk6vHrIPbU2n0/UewY5tmx3dI7gZTYTJpEtqCsrTqo73jl3xO4s4maH 6CJQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=swiyjDzb; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: from snail.vger.email (snail.vger.email. 
[2620:137:e000::3:7]) by mx.google.com with ESMTPS id me5-20020a17090b17c500b002777ccd05bcsi3364194pjb.25.2023.10.19.15.54.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 19 Oct 2023 15:54:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=swiyjDzb; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 35D218282A05; Thu, 19 Oct 2023 15:54:46 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1346739AbjJSWyl (ORCPT + 26 others); Thu, 19 Oct 2023 18:54:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33812 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235553AbjJSWyb (ORCPT ); Thu, 19 Oct 2023 18:54:31 -0400 Received: from out-194.mta0.migadu.com (out-194.mta0.migadu.com [91.218.175.194]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 826E2184 for ; Thu, 19 Oct 2023 15:54:28 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1697756066; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dS7frJRdSMuPJ7DdWWAjQ+cQ8KaRwRE0njPFfiYUqhI=; b=swiyjDzbwvO1ryyH0LOxBsG3sRDrHo8MwhzTCH/GhQa+yLHgJt9YruoXPCzep7ToRdRcIe f2eQNq6gGnHGE7yLa7z373yvV6fk04/R4k49Aiuuzgl3U3AyV/OKYwgN/BLnwOmIAr+Hpk tNxwryf0vISlCXMm0JNVfO9Q0ZShvgY= From: Roman Gushchin To: Andrew Morton Cc: linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, Johannes Weiner , Michal Hocko , Shakeel Butt , Muchun Song , Dennis Zhou , David Rientjes , Vlastimil Babka , Naresh Kamboju , Roman Gushchin Subject: [PATCH v5 6/6] mm: kmem: reimplement get_obj_cgroup_from_current() Date: Thu, 19 Oct 2023 15:53:46 -0700 Message-ID: <20231019225346.1822282-7-roman.gushchin@linux.dev> In-Reply-To: <20231019225346.1822282-1-roman.gushchin@linux.dev> References: <20231019225346.1822282-1-roman.gushchin@linux.dev> MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Thu, 19 Oct 2023 15:54:46 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780226286623162321 X-GMAIL-MSGID: 1780226286623162321 Reimplement get_obj_cgroup_from_current() using current_obj_cgroup(). get_obj_cgroup_from_current() and current_obj_cgroup() share 80% of the code, so the new implementation is almost trivial. 
get_obj_cgroup_from_current() is a convenient function used by the bpf subsystem, so there is no reason to get rid of it completely. Signed-off-by: Roman Gushchin (Cruise) Acked-by: Shakeel Butt --- include/linux/memcontrol.h | 11 ++++++++++- mm/memcontrol.c | 32 -------------------------------- 2 files changed, 10 insertions(+), 33 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index a89df289144d..ef26551a633f 100644 --- a/include/linux/memcontrol.h +++ b/include/linux/memcontrol.h @@ -1778,9 +1778,18 @@ void __memcg_kmem_uncharge_page(struct page *page, int order); * needs to be used outside of the local scope. */ struct obj_cgroup *current_obj_cgroup(void); -struct obj_cgroup *get_obj_cgroup_from_current(void); struct obj_cgroup *get_obj_cgroup_from_folio(struct folio *folio); +static inline struct obj_cgroup *get_obj_cgroup_from_current(void) +{ + struct obj_cgroup *objcg = current_obj_cgroup(); + + if (objcg) + obj_cgroup_get(objcg); + + return objcg; +} + int obj_cgroup_charge(struct obj_cgroup *objcg, gfp_t gfp, size_t size); void obj_cgroup_uncharge(struct obj_cgroup *objcg, size_t size); diff --git a/mm/memcontrol.c b/mm/memcontrol.c index e3d4b7fabb7d..e13c10912c16 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3056,38 +3056,6 @@ static struct obj_cgroup *current_objcg_update(void) return objcg; } -__always_inline struct obj_cgroup *get_obj_cgroup_from_current(void) -{ - struct mem_cgroup *memcg; - struct obj_cgroup *objcg; - - if (in_task()) { - memcg = current->active_memcg; - if (unlikely(memcg)) - goto from_memcg; - - objcg = READ_ONCE(current->objcg); - if (unlikely((unsigned long)objcg & CURRENT_OBJCG_UPDATE_FLAG)) - objcg = current_objcg_update(); - - if (objcg) { - obj_cgroup_get(objcg); - return objcg; - } - } else { - memcg = this_cpu_read(int_active_memcg); - if (unlikely(memcg)) - goto from_memcg; - } - return NULL; - -from_memcg: - rcu_read_lock(); - objcg = __get_obj_cgroup_from_memcg(memcg); - 
rcu_read_unlock(); - return objcg; -} - __always_inline struct obj_cgroup *current_obj_cgroup(void) { struct mem_cgroup *memcg;