[v1,2/2] hugetlb: process multiple lists in gather_bootmem_prealloc_parallel

Message ID 20240213111347.3189206-3-gang.li@linux.dev
State New
Headers
Series hugetlb: two small improvements of hugetlb init parallelization |

Commit Message

Gang Li Feb. 13, 2024, 11:13 a.m. UTC
  gather_bootmem_prealloc_node currently only process one list in
huge_boot_pages array. So gather_bootmem_prealloc expects
padata_do_multithreaded to run num_node_state(N_MEMORY) instances of
gather_bootmem_prealloc_node to process all lists in huge_boot_pages.

This works well in current padata_do_multithreaded implementation.
It guarantees that size/min_chunk <= thread num <= max_threads.

```
/* Ensure at least one thread when size < min_chunk. */
nworks = max(job->size / max(job->min_chunk, job->align), 1ul);
nworks = min(nworks, job->max_threads);

ps.nworks      = padata_work_alloc_mt(nworks, &ps, &works);
```

However, the comment of padata_do_multithreaded API only promises a
maximum value for the number of threads and does not specify a
minimum value. Which may pass multiple nodes to
gather_bootmem_prealloc_node and only one node will be processed.

To avoid potential errors, introduce gather_bootmem_prealloc_parallel
to handle the case where the number of threads does not meet the
requirement of max_threads.

Fixes: 0306f03dcbd7 ("hugetlb: parallelize 1G hugetlb initialization")
Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>
---
 mm/hugetlb.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)
  

Comments

Muchun Song Feb. 13, 2024, 2:54 p.m. UTC | #1
> On Feb 13, 2024, at 19:13, Gang Li <gang.li@linux.dev> wrote:
> 
> gather_bootmem_prealloc_node currently only process one list in
> huge_boot_pages array. So gather_bootmem_prealloc expects
> padata_do_multithreaded to run num_node_state(N_MEMORY) instances of
> gather_bootmem_prealloc_node to process all lists in huge_boot_pages.
> 
> This works well in current padata_do_multithreaded implementation.
> It guarantees that size/min_chunk <= thread num <= max_threads.
> 
> ```
> /* Ensure at least one thread when size < min_chunk. */
> nworks = max(job->size / max(job->min_chunk, job->align), 1ul);
> nworks = min(nworks, job->max_threads);
> 
> ps.nworks      = padata_work_alloc_mt(nworks, &ps, &works);
> ```
> 
> However, the comment of padata_do_multithreaded API only promises a
> maximum value for the number of threads and does not specify a
> minimum value. Which may pass multiple nodes to
> gather_bootmem_prealloc_node and only one node will be processed.
> 
> To avoid potential errors, introduce gather_bootmem_prealloc_parallel
> to handle the case where the number of threads does not meet the
> requirement of max_threads.
> 
> Fixes: 0306f03dcbd7 ("hugetlb: parallelize 1G hugetlb initialization")
> Signed-off-by: Gang Li <ligang.bdlg@bytedance.com>

Reviewed-by: Muchun Song <muchun.song@linux.dev>

Thanks.
  

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 25069ca6ec248..2799a7ea098c1 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3414,10 +3414,8 @@  static void __init prep_and_add_bootmem_folios(struct hstate *h,
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
  */
-static void __init gather_bootmem_prealloc_node(unsigned long start, unsigned long end, void *arg)
-
+static void __init gather_bootmem_prealloc_node(unsigned long nid)
 {
-	int nid = start;
 	LIST_HEAD(folio_list);
 	struct huge_bootmem_page *m;
 	struct hstate *h = NULL, *prev_h = NULL;
@@ -3455,10 +3453,19 @@  static void __init gather_bootmem_prealloc_node(unsigned long start, unsigned lo
 	prep_and_add_bootmem_folios(h, &folio_list);
 }
 
+static void __init gather_bootmem_prealloc_parallel(unsigned long start,
+						    unsigned long end, void *arg)
+{
+	int nid;
+
+	for (nid = start; nid < end; nid++)
+		gather_bootmem_prealloc_node(nid);
+}
+
 static void __init gather_bootmem_prealloc(void)
 {
 	struct padata_mt_job job = {
-		.thread_fn	= gather_bootmem_prealloc_node,
+		.thread_fn	= gather_bootmem_prealloc_parallel,
 		.fn_arg		= NULL,
 		.start		= 0,
 		.size		= num_node_state(N_MEMORY),