Message ID | 20230622085335.77010-3-zhengqi.arch@bytedance.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp4936542vqr; Thu, 22 Jun 2023 02:31:28 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7KMczOsclddSzNScJTnrUowhdpCW1xEt0GSlKpnsszjo5v3r+N2RLoGmFg7J42qhKivllq X-Received: by 2002:a17:90a:2d8f:b0:255:2dde:17cc with SMTP id p15-20020a17090a2d8f00b002552dde17ccmr17412747pjd.47.1687426288454; Thu, 22 Jun 2023 02:31:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687426288; cv=none; d=google.com; s=arc-20160816; b=YBTKy5fr9sjdOKpx/00fpdaKXcvY8GX+MFZfOhlDXJu09XwK4eeTl3sYDC3WXL6cyb Kpsmlrfdyk7vylse1nOCTFs5CTysHGCVH3Pr1QjJsmkODaDfngZkGwz1XJIKG/HSTu9V 2ynD20comiRUZmKu/Q2/akNnCtl2pFIShTGUToB5bwIxXEkJ8PdVmXgQKOe+nAIlltBm vabmP80DuXW0Fxtcjdsh1eA3CqRxmfDtlSX+s/x3pbG8PUBGvxqlTDyeKFV4mI6qoNj0 LhmxIDkESzPqiAvj4Rgm7xaJrv/UUn6SB+zKljqTQK4tTYMvvjKZlrD6MUqjDzwb+f3j 9chA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=4+KB+8A1a3nPwmndVeoDnF0UUBrNJStZPfOFWRboZRM=; b=dYx9nm83MBVu50UlYo0SHfHVpYH+kBkjyTKLD9oTBI1+7pwXMLfbUHi+vxkPc4EQn7 F5MUa1FA/vLMu/Utt/Ok4Ys0acaH8NS3uQEcPC+kU7JNUUZvIj4MZzEj8Aa1sCGPXLor 2tPapSHOoD+NJV5A1atcaUh2OZDpFfzstAUTWByqP1ZEb2nPO2Pje7Tp28NBszQlYzkE aFMfVARnvhSx8K/HBnmgYMq/4JFVhiAzd/RVewUYPqRKh64gCis0dkHK5LiyolKMsa/x ORHWFfOHLdMJfS6nDuZjR4QSd6dGcw31NER5RATgI9Q5R8TwL7UKvIreblOf+WNS3ar/ nvkg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=Sk9j4D1M; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l1-20020a17090a070100b00251662efc9dsi13238950pjl.53.2023.06.22.02.31.14; Thu, 22 Jun 2023 02:31:28 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=Sk9j4D1M; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231421AbjFVIzM (ORCPT <rfc822;maxin.john@gmail.com> + 99 others); Thu, 22 Jun 2023 04:55:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38904 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230281AbjFVIy2 (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 22 Jun 2023 04:54:28 -0400 Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com [IPv6:2607:f8b0:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 992282683 for <linux-kernel@vger.kernel.org>; Thu, 22 Jun 2023 01:54:19 -0700 (PDT) Received: by mail-pl1-x633.google.com with SMTP id d9443c01a7336-1b5585e84b4so6826795ad.0 for <linux-kernel@vger.kernel.org>; Thu, 22 Jun 2023 01:54:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1687424059; x=1690016059; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=4+KB+8A1a3nPwmndVeoDnF0UUBrNJStZPfOFWRboZRM=; b=Sk9j4D1M64xyAQoy0v1SMiRRuiugSGVsgHhv7vYv9c3IikKaUFVDiFJczRjS+t8QJh 8bRZCGyYvkRGHDm8uUJh3JrpsRJfZFHSfxrg783G1zZ8ag+Nrop0sZP8BnmbcU+0S25q ZIKtGKgKPIH7cGpK2fHrvYtY9p82kt8TosN3NcpFb8CLbiW0zQyLzdahaQYLKutpL1Kb iV83VCrLSsoe4tZLXCgr7bzLBi5xWr4Iu6AgSr+ApJLZMKUxhfto21PwHxkzor+s4sBV FrT1a+LNzQzeAZGp0p/ozv8VRhpBFeAgfX/MFGdkWk3iK2t7SrfDgjAEjXYPkdmChJzo lkUw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687424059; x=1690016059; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4+KB+8A1a3nPwmndVeoDnF0UUBrNJStZPfOFWRboZRM=; b=UdArrdnbqeXWh5jwiJAOu+ff3FKT935JkmNwXUM5pNnvxjpPe5HFaSS4tAr0KW2OSf 31WW+G4qIt4RQJ0u1OwXuJEvqc7k3TjgqhZw3TfolgnhOEbFk0wZ0uqfEEPtqCu2WWW6 iquCzwXyCClItutL74y8PHIwGxDTv/VvHzR9ZI33cBOyun43AkWW42WeXbGStFOyfkGH kOgMvj1gskukDpnBeay/Tx85pT7er9DWYPu1YmeYpJ9ui9/PE3m+/tgKAiDFTR9OIuad 0/Scoc4h4GLJMXwfTLzFpNbiUoVyKQuWFS2NC4zA0OAlD+aMPeleFKoSZpmZJjTxs/wK GBfg== X-Gm-Message-State: AC+VfDzCO2MW6QrM9aYd6slnse7hydRkdT9nrfaU4fBq4+sk99SdYWag QK2oLfa1/EmqBmvt7bimNxRylA== X-Received: by 2002:a17:903:2451:b0:1b0:34c6:3bf2 with SMTP id l17-20020a170903245100b001b034c63bf2mr21537674pls.5.1687424059079; Thu, 22 Jun 2023 01:54:19 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([139.177.225.254]) by smtp.gmail.com with ESMTPSA id h2-20020a170902f7c200b001b549fce345sm4806971plw.230.2023.06.22.01.54.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 01:54:18 -0700 (PDT) From: Qi Zheng <zhengqi.arch@bytedance.com> To: akpm@linux-foundation.org, david@fromorbit.com, tkhai@ya.ru, vbabka@suse.cz, roman.gushchin@linux.dev, djwong@kernel.org, brauner@kernel.org, paulmck@kernel.org, tytso@mit.edu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-msm@vger.kernel.org, dm-devel@redhat.com, linux-raid@vger.kernel.org, linux-bcache@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-nfs@vger.kernel.org, linux-xfs@vger.kernel.org, linux-btrfs@vger.kernel.org, Qi Zheng <zhengqi.arch@bytedance.com> Subject: [PATCH 02/29] mm: vmscan: introduce some helpers for dynamically allocating shrinker Date: Thu, 22 Jun 2023 16:53:08 +0800 Message-Id: <20230622085335.77010-3-zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: <20230622085335.77010-1-zhengqi.arch@bytedance.com> References: <20230622085335.77010-1-zhengqi.arch@bytedance.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769394707643695327?= X-GMAIL-MSGID: =?utf-8?q?1769394707643695327?= |
Series |
use refcount+RCU method to implement lockless slab shrink
|
|
Commit Message
Qi Zheng
June 22, 2023, 8:53 a.m. UTC
Introduce some helpers for dynamically allocating shrinker instance,
and their uses are as follows:
1. shrinker_alloc_and_init()
Used to allocate and initialize a shrinker instance, the priv_data
parameter is used to pass the pointer of the previously embedded
structure of the shrinker instance.
2. shrinker_free()
Used to free the shrinker instance when the registration of shrinker
fails.
3. unregister_and_free_shrinker()
Used to unregister and free the shrinker instance, and the kfree()
will be changed to kfree_rcu() later.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
include/linux/shrinker.h | 12 ++++++++++++
mm/vmscan.c | 35 +++++++++++++++++++++++++++++++++++
2 files changed, 47 insertions(+)
Comments
On Thu, Jun 22, 2023 at 04:53:08PM +0800, Qi Zheng wrote: > Introduce some helpers for dynamically allocating shrinker instance, > and their uses are as follows: > > 1. shrinker_alloc_and_init() > > Used to allocate and initialize a shrinker instance, the priv_data > parameter is used to pass the pointer of the previously embedded > structure of the shrinker instance. > > 2. shrinker_free() > > Used to free the shrinker instance when the registration of shrinker > fails. > > 3. unregister_and_free_shrinker() > > Used to unregister and free the shrinker instance, and the kfree() > will be changed to kfree_rcu() later. > > Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> > --- > include/linux/shrinker.h | 12 ++++++++++++ > mm/vmscan.c | 35 +++++++++++++++++++++++++++++++++++ > 2 files changed, 47 insertions(+) > > diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h > index 43e6fcabbf51..8e9ba6fa3fcc 100644 > --- a/include/linux/shrinker.h > +++ b/include/linux/shrinker.h > @@ -107,6 +107,18 @@ extern void unregister_shrinker(struct shrinker *shrinker); > extern void free_prealloced_shrinker(struct shrinker *shrinker); > extern void synchronize_shrinkers(void); > > +typedef unsigned long (*count_objects_cb)(struct shrinker *s, > + struct shrink_control *sc); > +typedef unsigned long (*scan_objects_cb)(struct shrinker *s, > + struct shrink_control *sc); > + > +struct shrinker *shrinker_alloc_and_init(count_objects_cb count, > + scan_objects_cb scan, long batch, > + int seeks, unsigned flags, > + void *priv_data); > +void shrinker_free(struct shrinker *shrinker); > +void unregister_and_free_shrinker(struct shrinker *shrinker); Hmmmm. Not exactly how I envisioned this to be done. Ok, this will definitely work, but I don't think it is an improvement. It's certainly not what I was thinking of when I suggested dynamically allocating shrinkers. The main issue is that this doesn't simplify the API - it expands it and creates a minefield of old and new functions that have to be used in exactly the right order for the right things to happen. What I was thinking of was moving the entire shrinker setup code over to the prealloc/register_prepared() algorithm, where the setup is already separated from the activation of the shrinker. That is, we start by renaming prealloc_shrinker() to shrinker_alloc(), adding a flags field to tell it everything that it needs to alloc (i.e. the NUMA/MEMCG_AWARE flags) and having it returned a fully allocated shrinker ready to register. Initially this also contains an internal flag to say the shrinker was allocated so that unregister_shrinker() knows to free it. The caller then fills out the shrinker functions, seeks, etc. just like the do now, and then calls register_shrinker_prepared() to make the shrinker active when it wants to turn it on. When it is time to tear down the shrinker, no API needs to change. unregister_shrinker() does all the shutdown and frees all the internal memory like it does now. If the shrinker is also marked as allocated, it frees the shrinker via RCU, too. Once everything is converted to this API, we then remove register_shrinker(), rename register_shrinker_prepared() to shrinker_register(), rename unregister_shrinker to shrinker_unregister(), get rid of the internal "allocated" flag and always free the shrinker. At the end of the patchset, every shrinker should be set up in a manner like this: sb->shrinker = shrinker_alloc(SHRINKER_MEMCG_AWARE|SHRINKER_NUMA_AWARE, "sb-%s", type->name); if (!sb->shrinker) return -ENOMEM; sb->shrinker->count_objects = super_cache_count; sb->shrinker->scan_objects = super_cache_scan; sb->shrinker->batch = 1024; sb->shrinker->private = sb; ..... shrinker_register(sb->shrinker); And teardown is just a call to shrinker_unregister(sb->shrinker) as it is now. i.e. the entire shrinker regsitration API is now just three functions, down from the current four, and much simpler than the the seven functions this patch set results in... The other advantage of this is that it will break all the existing out of tree code and third party modules using the old API and will no longer work with a kernel using lockless slab shrinkers. They need to break (both at the source and binary levels) to stop bad things from happening due to using uncoverted shrinkers in the new setup. -Dave.
Hi Dave, On 2023/6/23 14:12, Dave Chinner wrote: > On Thu, Jun 22, 2023 at 04:53:08PM +0800, Qi Zheng wrote: >> Introduce some helpers for dynamically allocating shrinker instance, >> and their uses are as follows: >> >> 1. shrinker_alloc_and_init() >> >> Used to allocate and initialize a shrinker instance, the priv_data >> parameter is used to pass the pointer of the previously embedded >> structure of the shrinker instance. >> >> 2. shrinker_free() >> >> Used to free the shrinker instance when the registration of shrinker >> fails. >> >> 3. unregister_and_free_shrinker() >> >> Used to unregister and free the shrinker instance, and the kfree() >> will be changed to kfree_rcu() later. >> >> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> >> --- >> include/linux/shrinker.h | 12 ++++++++++++ >> mm/vmscan.c | 35 +++++++++++++++++++++++++++++++++++ >> 2 files changed, 47 insertions(+) >> >> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h >> index 43e6fcabbf51..8e9ba6fa3fcc 100644 >> --- a/include/linux/shrinker.h >> +++ b/include/linux/shrinker.h >> @@ -107,6 +107,18 @@ extern void unregister_shrinker(struct shrinker *shrinker); >> extern void free_prealloced_shrinker(struct shrinker *shrinker); >> extern void synchronize_shrinkers(void); >> >> +typedef unsigned long (*count_objects_cb)(struct shrinker *s, >> + struct shrink_control *sc); >> +typedef unsigned long (*scan_objects_cb)(struct shrinker *s, >> + struct shrink_control *sc); >> + >> +struct shrinker *shrinker_alloc_and_init(count_objects_cb count, >> + scan_objects_cb scan, long batch, >> + int seeks, unsigned flags, >> + void *priv_data); >> +void shrinker_free(struct shrinker *shrinker); >> +void unregister_and_free_shrinker(struct shrinker *shrinker); > > Hmmmm. Not exactly how I envisioned this to be done. > > Ok, this will definitely work, but I don't think it is an > improvement. It's certainly not what I was thinking of when I > suggested dynamically allocating shrinkers. > > The main issue is that this doesn't simplify the API - it expands it > and creates a minefield of old and new functions that have to be > used in exactly the right order for the right things to happen. > > What I was thinking of was moving the entire shrinker setup code > over to the prealloc/register_prepared() algorithm, where the setup > is already separated from the activation of the shrinker. > > That is, we start by renaming prealloc_shrinker() to > shrinker_alloc(), adding a flags field to tell it everything that it > needs to alloc (i.e. the NUMA/MEMCG_AWARE flags) and having it > returned a fully allocated shrinker ready to register. Initially > this also contains an internal flag to say the shrinker was > allocated so that unregister_shrinker() knows to free it. > > The caller then fills out the shrinker functions, seeks, etc. just > like the do now, and then calls register_shrinker_prepared() to make > the shrinker active when it wants to turn it on. > > When it is time to tear down the shrinker, no API needs to change. > unregister_shrinker() does all the shutdown and frees all the > internal memory like it does now. If the shrinker is also marked as > allocated, it frees the shrinker via RCU, too. > > Once everything is converted to this API, we then remove > register_shrinker(), rename register_shrinker_prepared() to > shrinker_register(), rename unregister_shrinker to > shrinker_unregister(), get rid of the internal "allocated" flag > and always free the shrinker. IIUC, you mean that we also need to convert the original statically defined shrinker instances to dynamically allocated. I think this is a good idea, it helps to simplify the APIs and also remove special handling for case a and b (mentioned in cover letter). > > At the end of the patchset, every shrinker should be set > up in a manner like this: > > > sb->shrinker = shrinker_alloc(SHRINKER_MEMCG_AWARE|SHRINKER_NUMA_AWARE, > "sb-%s", type->name); > if (!sb->shrinker) > return -ENOMEM; > > sb->shrinker->count_objects = super_cache_count; > sb->shrinker->scan_objects = super_cache_scan; > sb->shrinker->batch = 1024; > sb->shrinker->private = sb; > > ..... > > shrinker_register(sb->shrinker); > > And teardown is just a call to shrinker_unregister(sb->shrinker) > as it is now. > > i.e. the entire shrinker regsitration API is now just three > functions, down from the current four, and much simpler than the > the seven functions this patch set results in... > > The other advantage of this is that it will break all the existing > out of tree code and third party modules using the old API and will > no longer work with a kernel using lockless slab shrinkers. They > need to break (both at the source and binary levels) to stop bad > things from happening due to using uncoverted shrinkers in the new > setup. Got it. And totally agree. I will do it in the v2. Thanks, Qi > > -Dave.
diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h index 43e6fcabbf51..8e9ba6fa3fcc 100644 --- a/include/linux/shrinker.h +++ b/include/linux/shrinker.h @@ -107,6 +107,18 @@ extern void unregister_shrinker(struct shrinker *shrinker); extern void free_prealloced_shrinker(struct shrinker *shrinker); extern void synchronize_shrinkers(void); +typedef unsigned long (*count_objects_cb)(struct shrinker *s, + struct shrink_control *sc); +typedef unsigned long (*scan_objects_cb)(struct shrinker *s, + struct shrink_control *sc); + +struct shrinker *shrinker_alloc_and_init(count_objects_cb count, + scan_objects_cb scan, long batch, + int seeks, unsigned flags, + void *priv_data); +void shrinker_free(struct shrinker *shrinker); +void unregister_and_free_shrinker(struct shrinker *shrinker); + #ifdef CONFIG_SHRINKER_DEBUG extern int shrinker_debugfs_add(struct shrinker *shrinker); extern struct dentry *shrinker_debugfs_detach(struct shrinker *shrinker, diff --git a/mm/vmscan.c b/mm/vmscan.c index 45d17c7cc555..64ff598fbad9 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -809,6 +809,41 @@ void unregister_shrinker(struct shrinker *shrinker) } EXPORT_SYMBOL(unregister_shrinker); +struct shrinker *shrinker_alloc_and_init(count_objects_cb count, + scan_objects_cb scan, long batch, + int seeks, unsigned flags, + void *priv_data) +{ + struct shrinker *shrinker; + + shrinker = kzalloc(sizeof(struct shrinker), GFP_KERNEL); + if (!shrinker) + return NULL; + + shrinker->count_objects = count; + shrinker->scan_objects = scan; + shrinker->batch = batch; + shrinker->seeks = seeks; + shrinker->flags = flags; + shrinker->private_data = priv_data; + + return shrinker; +} +EXPORT_SYMBOL(shrinker_alloc_and_init); + +void shrinker_free(struct shrinker *shrinker) +{ + kfree(shrinker); +} +EXPORT_SYMBOL(shrinker_free); + +void unregister_and_free_shrinker(struct shrinker *shrinker) +{ + unregister_shrinker(shrinker); + kfree(shrinker); +} +EXPORT_SYMBOL(unregister_and_free_shrinker); + /** * synchronize_shrinkers - Wait for all running shrinkers to complete. *