[V5,1/2] mm: compaction: move compaction sysctl to its own file

Message ID 202303221046286197958@zte.com.cn
State New
Headers
Series [V5,1/2] mm: compaction: move compaction sysctl to its own file |

Commit Message

ye.xingchen@zte.com.cn March 22, 2023, 2:46 a.m. UTC
  From: Minghao Chi <chi.minghao@zte.com.cn>

This moves all compaction sysctls to its own file.

Link: https://lore.kernel.org/lkml/9952bbf8-cf59-7bea-ce50-0200d4f4165e@suse.cz/
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Signed-off-by: Ye Xingchen <ye.xingchen@zte.com.cn>
---
 include/linux/compaction.h |  7 ----
 kernel/sysctl.c            | 59 -----------------------------
 mm/compaction.c            | 77 ++++++++++++++++++++++++++++++++++----
 3 files changed, 69 insertions(+), 74 deletions(-)
  

Comments

Christoph Hellwig March 22, 2023, 8:35 a.m. UTC | #1
On Wed, Mar 22, 2023 at 10:46:28AM +0800, ye.xingchen@zte.com.cn wrote:
> From: Minghao Chi <chi.minghao@zte.com.cn>
> 
> This moves all compaction sysctls to its own file.

So there's a whole lot of these 'move sysctrls to their own file'
patches, but no actual explanation of why that is desirable.  Please
explain why we'd want to split code that is closely related, and now
requires marking symbols non-static just to create a new tiny source
file.
  
Vlastimil Babka March 23, 2023, 4:19 p.m. UTC | #2
On 3/22/23 09:35, Christoph Hellwig wrote:
> On Wed, Mar 22, 2023 at 10:46:28AM +0800, ye.xingchen@zte.com.cn wrote:
>> From: Minghao Chi <chi.minghao@zte.com.cn>
>> 
>> This moves all compaction sysctls to its own file.
> 
> So there's a whole lot of these 'move sysctrls to their own file'
> patches, but no actual explanation of why that is desirable.  Please

I think Luis started this initiative, maybe he can provide the canonical
reasoning :)

> explain why we'd want to split code that is closely related, and now
> requires marking symbols non-static just to create a new tiny source
> file.

Hmm? I can see the opposite, at least in the compaction patch here. Related
code and variables are moved closer together, made static, declarations
removed from headers. It looks like an improvement to me.
  
Luis Chamberlain March 23, 2023, 7:39 p.m. UTC | #3
On Thu, Mar 23, 2023 at 05:19:15PM +0100, Vlastimil Babka wrote:
> On 3/22/23 09:35, Christoph Hellwig wrote:
> > On Wed, Mar 22, 2023 at 10:46:28AM +0800, ye.xingchen@zte.com.cn wrote:
> >> From: Minghao Chi <chi.minghao@zte.com.cn>
> >> 
> >> This moves all compaction sysctls to its own file.
> > 
> > So there's a whole lot of these 'move sysctrls to their own file'
> > patches, but no actual explanation of why that is desirable.  Please
> 
> I think Luis started this initiative, maybe he can provide the canonical
> reasoning :)

The kernel/sysctl.c is flooded now with commit log entries which
describe a proper rationale, however some folks forget to also include
similar rationale in new patches. I try to remind folks when I can,
thanks for reminding them to continue to do that. That needs to be fixed
in this patch. The summary is its hard to coordiante merge conflicts
with all the syctls in one place, best to just put them where they are
used.

There is a small hidden penalty increase in size to the kernel with the
sysctls moves today though and one for which Matthew Wilcox has recently asked
for us to pause these moves until we can save more memory. The extra memory
is caused by the extra empty struct ctl_table added to the end of the new
array. The way to avoid that penalty is to deprecate all APIs in sysctl
registation which deal with complex array structures. I have some some
of that addressed on my sysctl-next tree (and merged on linux-next) but
much more work is required to deprecate the older APIs. I was ready to
pause the kernel/sysctl.c moves until those APIs are deprecated and we
start having sysctl APIs which allow us to not have the empty array at
the end. But as I thought about this just now, the use cases that move
the sysctls where __init is used could benefit already in size.

For this patch there seems to be a savings of 4 bytes:

$ ./scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 1/0 grow/shrink: 1/2 up/down: 346/-350 (-4)
Function                                     old     new   delta
vm_compaction                                  -     320    +320
kcompactd_init                               167     193     +26
proc_dointvec_minmax_warn_RT_change          104      10     -94
vm_table                                    2112    1856    -256
Total: Before=19287558, After=19287554, chg -0.00%

So I don't think we need to pause this move or others where are have savings.

Minghao, can you fix the commit log, and explain how you are also saving
4 bytes as per the above bloat-o-meter results?

> > explain why we'd want to split code that is closely related, and now
> > requires marking symbols non-static just to create a new tiny source
> > file.
> 
> Hmm? I can see the opposite, at least in the compaction patch here. Related
> code and variables are moved closer together, made static, declarations
> removed from headers. It looks like an improvement to me.

Glad this helps.

  Luis
  
ye xingchen March 24, 2023, 6:24 a.m. UTC | #4
>$ ./scripts/bloat-o-meter vmlinux.old vmlinux
>add/remove: 1/0 grow/shrink: 1/2 up/down: 346/-350 (-4)
>Function                                     old     new   delta
>vm_compaction                                  -     320    +320
>kcompactd_init                               167     193     +26
>proc_dointvec_minmax_warn_RT_change          104      10     -94
>vm_table                                    2112    1856    -256
>Total: Before=19287558, After=19287554, chg -0.00%
>
>So I don't think we need to pause this move or others where are have savings.
>
>Minghao, can you fix the commit log, and explain how you are also saving
>4 bytes as per the above bloat-o-meter results?

$ ./scripts/bloat-o-meter vmlinux vmlinux.new
add/remove: 1/0 grow/shrink: 1/1 up/down: 350/-256 (94)
Function                                     old     new   delta
vm_compaction                                  -     320    +320
kcompactd_init                               180     210     +30
vm_table                                    2112    1856    -256
Total: Before=21104198, After=21104292, chg +0.00%

In my environment, kcompactd_init increases by 30 instead of 26.
And proc_dointvec_minmax_warn_RT_change No expansion.

  minghao
  
Luis Chamberlain March 24, 2023, 6:11 p.m. UTC | #5
On Fri, Mar 24, 2023 at 06:24:08AM +0000, ye xingchen wrote:
> >$ ./scripts/bloat-o-meter vmlinux.old vmlinux
> >add/remove: 1/0 grow/shrink: 1/2 up/down: 346/-350 (-4)
> >Function                                     old     new   delta
> >vm_compaction                                  -     320    +320
> >kcompactd_init                               167     193     +26
> >proc_dointvec_minmax_warn_RT_change          104      10     -94
> >vm_table                                    2112    1856    -256
> >Total: Before=19287558, After=19287554, chg -0.00%
> >
> >So I don't think we need to pause this move or others where are have savings.
> >
> >Minghao, can you fix the commit log, and explain how you are also saving
> >4 bytes as per the above bloat-o-meter results?
> 
> $ ./scripts/bloat-o-meter vmlinux vmlinux.new
> add/remove: 1/0 grow/shrink: 1/1 up/down: 350/-256 (94)
> Function                                     old     new   delta
> vm_compaction                                  -     320    +320
> kcompactd_init                               180     210     +30
> vm_table                                    2112    1856    -256
> Total: Before=21104198, After=21104292, chg +0.00%
> 
> In my environment, kcompactd_init increases by 30 instead of 26.
> And proc_dointvec_minmax_warn_RT_change No expansion.

How about a defconfig + compaction enabled? Provide that information
and let Vlastimal ACK/NACK the patch.

  Luis
  
ye xingchen March 27, 2023, 2:49 a.m. UTC | #6
>> >$ ./scripts/bloat-o-meter vmlinux.old vmlinux
>> >add/remove: 1/0 grow/shrink: 1/2 up/down: 346/-350 (-4)
>> >Function                                     old     new   delta
>> >vm_compaction                                  -     320    +320
>> >kcompactd_init                               167     193     +26
>> >proc_dointvec_minmax_warn_RT_change          104      10     -94
>> >vm_table                                    2112    1856    -256
>> >Total: Before=19287558, After=19287554, chg -0.00%
>> >
>> >So I don't think we need to pause this move or others where are have savings.
>> >
>> >Minghao, can you fix the commit log, and explain how you are also saving
>> >4 bytes as per the above bloat-o-meter results?
>> 
>> $ ./scripts/bloat-o-meter vmlinux vmlinux.new
>> add/remove: 1/0 grow/shrink: 1/1 up/down: 350/-256 (94)
>> Function                                     old     new   delta
>> vm_compaction                                  -     320    +320
>> kcompactd_init                               180     210     +30
>> vm_table                                    2112    1856    -256
>> Total: Before=21104198, After=21104292, chg +0.00%
>> 
>> In my environment, kcompactd_init increases by 30 instead of 26.
>> And proc_dointvec_minmax_warn_RT_change No expansion.
>
>How about a defconfig + compaction enabled? Provide that information
>and let Vlastimal ACK/NACK the patch.
I use x86_defconfig and linux-next-20230327 branch
$ make defconfig;make all -j120
CONFIG_COMPACTION=y

add/remove: 1/0 grow/shrink: 1/1 up/down: 350/-256 (94)
Function                                     old     new   delta
vm_compaction                                  -     320    +320
kcompactd_init                               180     210     +30
vm_table                                    2112    1856    -256
Total: Before=21119987, After=21120081, chg +0.00%
  
Christoph Hellwig March 27, 2023, 3:39 a.m. UTC | #7
On Thu, Mar 23, 2023 at 05:19:15PM +0100, Vlastimil Babka wrote:

> I think Luis started this initiative, maybe he can provide the canonical
> reasoning :)

No matter who provides it, it needs to go into the commit log.
  
Vlastimil Babka March 27, 2023, 10:27 a.m. UTC | #8
On 3/27/23 04:49, ye xingchen wrote:
>>> >$ ./scripts/bloat-o-meter vmlinux.old vmlinux
>>> >add/remove: 1/0 grow/shrink: 1/2 up/down: 346/-350 (-4)
>>> >Function                                     old     new   delta
>>> >vm_compaction                                  -     320    +320
>>> >kcompactd_init                               167     193     +26
>>> >proc_dointvec_minmax_warn_RT_change          104      10     -94
>>> >vm_table                                    2112    1856    -256
>>> >Total: Before=19287558, After=19287554, chg -0.00%
>>> >
>>> >So I don't think we need to pause this move or others where are have savings.
>>> >
>>> >Minghao, can you fix the commit log, and explain how you are also saving
>>> >4 bytes as per the above bloat-o-meter results?
>>> 
>>> $ ./scripts/bloat-o-meter vmlinux vmlinux.new
>>> add/remove: 1/0 grow/shrink: 1/1 up/down: 350/-256 (94)
>>> Function                                     old     new   delta
>>> vm_compaction                                  -     320    +320
>>> kcompactd_init                               180     210     +30
>>> vm_table                                    2112    1856    -256
>>> Total: Before=21104198, After=21104292, chg +0.00%
>>> 
>>> In my environment, kcompactd_init increases by 30 instead of 26.
>>> And proc_dointvec_minmax_warn_RT_change No expansion.
>>
>>How about a defconfig + compaction enabled? Provide that information
>>and let Vlastimal ACK/NACK the patch.
> I use x86_defconfig and linux-next-20230327 branch
> $ make defconfig;make all -j120
> CONFIG_COMPACTION=y
> 
> add/remove: 1/0 grow/shrink: 1/1 up/down: 350/-256 (94)
> Function                                     old     new   delta
> vm_compaction                                  -     320    +320
> kcompactd_init                               180     210     +30
> vm_table                                    2112    1856    -256
> Total: Before=21119987, After=21120081, chg +0.00%

No savings then, but to me the patch still seems a worthwile cleanup. But if
others think the 94 bytes are an issue, it can wait for the new APIs.
  

Patch

diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index 52a9ff65faee..a6e512cfb670 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -81,13 +81,6 @@  static inline unsigned long compact_gap(unsigned int order)
 }

 #ifdef CONFIG_COMPACTION
-extern unsigned int sysctl_compaction_proactiveness;
-extern int sysctl_compaction_handler(struct ctl_table *table, int write,
-			void *buffer, size_t *length, loff_t *ppos);
-extern int compaction_proactiveness_sysctl_handler(struct ctl_table *table,
-		int write, void *buffer, size_t *length, loff_t *ppos);
-extern int sysctl_extfrag_threshold;
-extern int sysctl_compact_unevictable_allowed;

 extern unsigned int extfrag_for_order(struct zone *zone, unsigned int order);
 extern int fragmentation_index(struct zone *zone, unsigned int order);
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index ce0297acf97c..e23061f33237 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -42,7 +42,6 @@ 
 #include <linux/highuid.h>
 #include <linux/writeback.h>
 #include <linux/ratelimit.h>
-#include <linux/compaction.h>
 #include <linux/hugetlb.h>
 #include <linux/initrd.h>
 #include <linux/key.h>
@@ -746,27 +745,6 @@  int proc_dointvec(struct ctl_table *table, int write, void *buffer,
 	return do_proc_dointvec(table, write, buffer, lenp, ppos, NULL, NULL);
 }

-#ifdef CONFIG_COMPACTION
-static int proc_dointvec_minmax_warn_RT_change(struct ctl_table *table,
-		int write, void *buffer, size_t *lenp, loff_t *ppos)
-{
-	int ret, old;
-
-	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || !write)
-		return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
-
-	old = *(int *)table->data;
-	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
-	if (ret)
-		return ret;
-	if (old != *(int *)table->data)
-		pr_warn_once("sysctl attribute %s changed by %s[%d]\n",
-			     table->procname, current->comm,
-			     task_pid_nr(current));
-	return ret;
-}
-#endif
-
 /**
  * proc_douintvec - read a vector of unsigned integers
  * @table: the sysctl table
@@ -2157,43 +2135,6 @@  static struct ctl_table vm_table[] = {
 		.extra1		= SYSCTL_ONE,
 		.extra2		= SYSCTL_FOUR,
 	},
-#ifdef CONFIG_COMPACTION
-	{
-		.procname	= "compact_memory",
-		.data		= NULL,
-		.maxlen		= sizeof(int),
-		.mode		= 0200,
-		.proc_handler	= sysctl_compaction_handler,
-	},
-	{
-		.procname	= "compaction_proactiveness",
-		.data		= &sysctl_compaction_proactiveness,
-		.maxlen		= sizeof(sysctl_compaction_proactiveness),
-		.mode		= 0644,
-		.proc_handler	= compaction_proactiveness_sysctl_handler,
-		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_ONE_HUNDRED,
-	},
-	{
-		.procname	= "extfrag_threshold",
-		.data		= &sysctl_extfrag_threshold,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax,
-		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_ONE_THOUSAND,
-	},
-	{
-		.procname	= "compact_unevictable_allowed",
-		.data		= &sysctl_compact_unevictable_allowed,
-		.maxlen		= sizeof(int),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_minmax_warn_RT_change,
-		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_ONE,
-	},
-
-#endif /* CONFIG_COMPACTION */
 	{
 		.procname	= "min_free_kbytes",
 		.data		= &min_free_kbytes,
diff --git a/mm/compaction.c b/mm/compaction.c
index e689d66cedf4..ad5838c7fce7 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1728,7 +1728,9 @@  typedef enum {
  * Allow userspace to control policy on scanning the unevictable LRU for
  * compactable pages.
  */
-int sysctl_compact_unevictable_allowed __read_mostly = CONFIG_COMPACT_UNEVICTABLE_DEFAULT;
+static int sysctl_compact_unevictable_allowed __read_mostly = CONFIG_COMPACT_UNEVICTABLE_DEFAULT;
+unsigned int sysctl_compaction_proactiveness;
+static int sysctl_extfrag_threshold = 500;

 static inline void
 update_fast_start_pfn(struct compact_control *cc, unsigned long pfn)
@@ -2052,7 +2054,7 @@  static unsigned int fragmentation_score_node(pg_data_t *pgdat)

 	return score;
 }
-
+unsigned int sysctl_compaction_proactiveness;
 static unsigned int fragmentation_score_wmark(pg_data_t *pgdat, bool low)
 {
 	unsigned int wmark_low;
@@ -2228,7 +2230,7 @@  static enum compact_result __compaction_suitable(struct zone *zone, int order,

 	return COMPACT_CONTINUE;
 }
-
+static int sysctl_extfrag_threshold;
 /*
  * compaction_suitable: Is this suitable to run compaction on this zone now?
  * Returns
@@ -2584,8 +2586,6 @@  static enum compact_result compact_zone_order(struct zone *zone, int order,
 	return ret;
 }

-int sysctl_extfrag_threshold = 500;
-
 /**
  * try_to_compact_pages - Direct compact to satisfy a high-order allocation
  * @gfp_mask: The GFP mask of the current allocation
@@ -2748,8 +2748,7 @@  static void compact_nodes(void)
  * background. It takes values in the range [0, 100].
  */
 unsigned int __read_mostly sysctl_compaction_proactiveness = 20;
-
-int compaction_proactiveness_sysctl_handler(struct ctl_table *table, int write,
+static int compaction_proactiveness_sysctl_handler(struct ctl_table *table, int write,
 		void *buffer, size_t *length, loff_t *ppos)
 {
 	int rc, nid;
@@ -2779,7 +2778,7 @@  int compaction_proactiveness_sysctl_handler(struct ctl_table *table, int write,
  * This is the entry point for compacting all nodes via
  * /proc/sys/vm/compact_memory
  */
-int sysctl_compaction_handler(struct ctl_table *table, int write,
+static int sysctl_compaction_handler(struct ctl_table *table, int write,
 			void *buffer, size_t *length, loff_t *ppos)
 {
 	if (write)
@@ -3075,6 +3074,65 @@  static int kcompactd_cpu_online(unsigned int cpu)
 	return 0;
 }

+static int proc_dointvec_minmax_warn_RT_change(struct ctl_table *table,
+		int write, void *buffer, size_t *lenp, loff_t *ppos)
+{
+	int ret, old;
+
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT) || !write)
+		return proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+
+	old = *(int *)table->data;
+	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
+	if (ret)
+		return ret;
+	if (old != *(int *)table->data)
+		pr_warn_once("sysctl attribute %s changed by %s[%d]\n",
+			     table->procname, current->comm,
+			     task_pid_nr(current));
+	return ret;
+}
+
+#ifdef CONFIG_SYSCTL
+static struct ctl_table vm_compaction[] = {
+	{
+		.procname	= "compact_memory",
+		.data		= NULL,
+		.maxlen		= sizeof(int),
+		.mode		= 0200,
+		.proc_handler	= sysctl_compaction_handler,
+	},
+	{
+		.procname	= "compaction_proactiveness",
+		.data		= &sysctl_compaction_proactiveness,
+		.maxlen		= sizeof(sysctl_compaction_proactiveness),
+		.mode		= 0644,
+		.proc_handler	= compaction_proactiveness_sysctl_handler,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE_HUNDRED,
+	},
+	{
+		.procname	= "extfrag_threshold",
+		.data		= &sysctl_extfrag_threshold,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE_THOUSAND,
+	},
+	{
+		.procname	= "compact_unevictable_allowed",
+		.data		= &sysctl_compact_unevictable_allowed,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax_warn_RT_change,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
+	{ }
+};
+#endif
+
 static int __init kcompactd_init(void)
 {
 	int nid;
@@ -3090,6 +3148,9 @@  static int __init kcompactd_init(void)

 	for_each_node_state(nid, N_MEMORY)
 		kcompactd_run(nid);
+#ifdef CONFIG_SYSCTL
+	register_sysctl_init("vm", vm_compaction);
+#endif
 	return 0;
 }
 subsys_initcall(kcompactd_init)