From patchwork Tue Jan 2 13:12:43 2024
X-Patchwork-Submitter: Gang Li
X-Patchwork-Id: 184326
From: Gang Li
To: David Hildenbrand, David Rientjes, Mike Kravetz, Muchun Song, Andrew Morton, Tim Chen
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com, Gang Li
Subject: [PATCH v3 1/7] hugetlb: code clean for hugetlb_hstate_alloc_pages
Date: Tue, 2 Jan 2024 21:12:43 +0800
Message-Id: <20240102131249.76622-2-gang.li@linux.dev>
In-Reply-To: <20240102131249.76622-1-gang.li@linux.dev>
References: <20240102131249.76622-1-gang.li@linux.dev>

The readability of hugetlb_hstate_alloc_pages() is poor. Cleaning up the
code improves its readability and makes future modifications easier.

This patch extracts two functions to reduce the complexity of
hugetlb_hstate_alloc_pages() and has no functional changes; the resulting
flow is sketched below.

- hugetlb_hstate_alloc_pages_node_specific() iterates through each online
  node and performs the node-specific allocation if one was requested.
- hugetlb_hstate_alloc_pages_report() reports errors during allocation and
  updates h->max_huge_pages accordingly.
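For illustration only, the refactored top-level flow reads roughly as
follows (abridged from the diff below, not a verbatim copy; elided parts
are marked "..."):

  static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
  {
      /* skip gigantic hugepages allocation if hugetlb_cma enabled */
      ...
      /* node specific requests short-circuit the balanced allocation */
      if (hugetlb_hstate_alloc_pages_node_specific(h))
          return;

      /* node balanced allocation loop, then report any shortfall */
      ...
      hugetlb_hstate_alloc_pages_report(i, h);
      kfree(node_alloc_noretry);
  }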
Signed-off-by: Gang Li
Reviewed-by: Muchun Song
Reviewed-by: Tim Chen
---
 mm/hugetlb.c | 46 +++++++++++++++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ed1581b670d42..2606135ec55e6 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3482,6 +3482,33 @@ static void __init hugetlb_hstate_alloc_pages_onenode(struct hstate *h, int nid)
     h->max_huge_pages_node[nid] = i;
 }
 
+static bool __init hugetlb_hstate_alloc_pages_node_specific(struct hstate *h)
+{
+    int i;
+    bool node_specific_alloc = false;
+
+    for_each_online_node(i) {
+        if (h->max_huge_pages_node[i] > 0) {
+            hugetlb_hstate_alloc_pages_onenode(h, i);
+            node_specific_alloc = true;
+        }
+    }
+
+    return node_specific_alloc;
+}
+
+static void __init hugetlb_hstate_alloc_pages_report(unsigned long allocated, struct hstate *h)
+{
+    if (allocated < h->max_huge_pages) {
+        char buf[32];
+
+        string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
+        pr_warn("HugeTLB: allocating %lu of page size %s failed.  Only allocated %lu hugepages.\n",
+            h->max_huge_pages, buf, allocated);
+        h->max_huge_pages = allocated;
+    }
+}
+
 /*
  * NOTE: this routine is called in different contexts for gigantic and
  * non-gigantic pages.
@@ -3499,7 +3526,6 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
     struct folio *folio;
     LIST_HEAD(folio_list);
     nodemask_t *node_alloc_noretry;
-    bool node_specific_alloc = false;
 
     /* skip gigantic hugepages allocation if hugetlb_cma enabled */
     if (hstate_is_gigantic(h) && hugetlb_cma_size) {
@@ -3508,14 +3534,7 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
     }
 
     /* do node specific alloc */
-    for_each_online_node(i) {
-        if (h->max_huge_pages_node[i] > 0) {
-            hugetlb_hstate_alloc_pages_onenode(h, i);
-            node_specific_alloc = true;
-        }
-    }
-
-    if (node_specific_alloc)
+    if (hugetlb_hstate_alloc_pages_node_specific(h))
         return;
 
     /* below will do all node balanced alloc */
@@ -3558,14 +3577,7 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
     /* list will be empty if hstate_is_gigantic */
     prep_and_add_allocated_folios(h, &folio_list);
 
-    if (i < h->max_huge_pages) {
-        char buf[32];
-
-        string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
-        pr_warn("HugeTLB: allocating %lu of page size %s failed.  Only allocated %lu hugepages.\n",
-            h->max_huge_pages, buf, i);
-        h->max_huge_pages = i;
-    }
+    hugetlb_hstate_alloc_pages_report(i, h);
     kfree(node_alloc_noretry);
 }
From patchwork Tue Jan 2 13:12:44 2024
X-Patchwork-Submitter: Gang Li
X-Patchwork-Id: 184327
From: Gang Li
To: David Hildenbrand, David Rientjes, Mike Kravetz, Muchun Song, Andrew Morton, Tim Chen
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com, Gang Li
Subject: [PATCH v3 2/7] hugetlb: split hugetlb_hstate_alloc_pages
Date: Tue, 2 Jan 2024 21:12:44 +0800
Message-Id: <20240102131249.76622-3-gang.li@linux.dev>
In-Reply-To: <20240102131249.76622-1-gang.li@linux.dev>
References: <20240102131249.76622-1-gang.li@linux.dev>

1G and 2M huge pages have different allocation and initialization logic,
which leads to subtle differences in parallelization. Therefore, it is
appropriate to split hugetlb_hstate_alloc_pages() into gigantic and
non-gigantic paths (see the sketch below). This patch has no functional
changes.
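For illustration (abridged from the diff below), after the split the entry
point reduces to a simple dispatch:

  if (!hstate_is_gigantic(h))
      allocated = hugetlb_hstate_alloc_pages_non_gigantic(h);
  else
      allocated = hugetlb_hstate_alloc_pages_gigantic(h);

  hugetlb_hstate_alloc_pages_report(allocated, h);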
Signed-off-by: Gang Li
---
 mm/hugetlb.c | 86 +++++++++++++++++++++++++++-------------------------
 1 file changed, 45 insertions(+), 41 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 2606135ec55e6..92448e747991d 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3509,6 +3509,47 @@ static void __init hugetlb_hstate_alloc_pages_report(unsigned long allocated, st
     }
 }
 
+static unsigned long __init hugetlb_hstate_alloc_pages_gigantic(struct hstate *h)
+{
+    unsigned long i;
+
+    for (i = 0; i < h->max_huge_pages; ++i) {
+        /*
+         * gigantic pages not added to list as they are not
+         * added to pools now.
+         */
+        if (!alloc_bootmem_huge_page(h, NUMA_NO_NODE))
+            break;
+        cond_resched();
+    }
+
+    return i;
+}
+
+static unsigned long __init hugetlb_hstate_alloc_pages_non_gigantic(struct hstate *h)
+{
+    unsigned long i;
+    struct folio *folio;
+    LIST_HEAD(folio_list);
+    nodemask_t node_alloc_noretry;
+
+    /* Bit mask controlling how hard we retry per-node allocations.*/
+    nodes_clear(node_alloc_noretry);
+
+    for (i = 0; i < h->max_huge_pages; ++i) {
+        folio = alloc_pool_huge_folio(h, &node_states[N_MEMORY],
+                        &node_alloc_noretry);
+        if (!folio)
+            break;
+        list_add(&folio->lru, &folio_list);
+        cond_resched();
+    }
+
+    prep_and_add_allocated_folios(h, &folio_list);
+
+    return i;
+}
+
 /*
  * NOTE: this routine is called in different contexts for gigantic and
  * non-gigantic pages.
@@ -3522,10 +3563,7 @@ static void __init hugetlb_hstate_alloc_pages_report(unsigned long allocated, st
  */
 static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 {
-    unsigned long i;
-    struct folio *folio;
-    LIST_HEAD(folio_list);
-    nodemask_t *node_alloc_noretry;
+    unsigned long allocated;
 
     /* skip gigantic hugepages allocation if hugetlb_cma enabled */
     if (hstate_is_gigantic(h) && hugetlb_cma_size) {
@@ -3539,46 +3577,12 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 
     /* below will do all node balanced alloc */
     if (!hstate_is_gigantic(h)) {
-        /*
-         * Bit mask controlling how hard we retry per-node allocations.
-         * Ignore errors as lower level routines can deal with
-         * node_alloc_noretry == NULL.  If this kmalloc fails at boot
-         * time, we are likely in bigger trouble.
-         */
-        node_alloc_noretry = kmalloc(sizeof(*node_alloc_noretry),
-                        GFP_KERNEL);
+        allocated = hugetlb_hstate_alloc_pages_non_gigantic(h);
     } else {
-        /* allocations done at boot time */
-        node_alloc_noretry = NULL;
-    }
-
-    /* bit mask controlling how hard we retry per-node allocations */
-    if (node_alloc_noretry)
-        nodes_clear(*node_alloc_noretry);
-
-    for (i = 0; i < h->max_huge_pages; ++i) {
-        if (hstate_is_gigantic(h)) {
-            /*
-             * gigantic pages not added to list as they are not
-             * added to pools now.
-             */
-            if (!alloc_bootmem_huge_page(h, NUMA_NO_NODE))
-                break;
-        } else {
-            folio = alloc_pool_huge_folio(h, &node_states[N_MEMORY],
-                            node_alloc_noretry);
-            if (!folio)
-                break;
-            list_add(&folio->lru, &folio_list);
-        }
-        cond_resched();
+        allocated = hugetlb_hstate_alloc_pages_gigantic(h);
     }
 
-    /* list will be empty if hstate_is_gigantic */
-    prep_and_add_allocated_folios(h, &folio_list);
-
-    hugetlb_hstate_alloc_pages_report(i, h);
-    kfree(node_alloc_noretry);
+    hugetlb_hstate_alloc_pages_report(allocated, h);
 }
 
 static void __init hugetlb_init_hstates(void)

From patchwork Tue Jan 2 13:12:45 2024
X-Patchwork-Submitter: Gang Li
X-Patchwork-Id: 184328
From: Gang Li
To: David Hildenbrand, David Rientjes, Mike Kravetz, Muchun Song, Andrew Morton, Tim Chen
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com, Gang Li
Subject: [PATCH v3 3/7] padata: dispatch works on different nodes
Date: Tue, 2 Jan 2024 21:12:45 +0800
Message-Id: <20240102131249.76622-4-gang.li@linux.dev>
In-Reply-To: <20240102131249.76622-1-gang.li@linux.dev>
References: <20240102131249.76622-1-gang.li@linux.dev>

When a group of tasks that access different nodes is scheduled on the same
node, they may encounter bandwidth bottlenecks and access latency.

Thus, a numa_aware flag is introduced here, allowing tasks to be
distributed across different nodes to fully utilize the advantage of
multi-node systems. A sketch of how a caller opts in follows.
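As a minimal usage sketch (the demo_* names are hypothetical and for
illustration only; the real user is the hugetlb caller added later in this
series):

  /* Hypothetical boot-time caller opting in to NUMA-aware dispatch. */
  static void __init demo_thread_fn(unsigned long start, unsigned long end,
                                    void *arg)
  {
      /* process items [start, end); workers are spread across nodes */
  }

  static void __init demo_init(unsigned long nr_items)
  {
      struct padata_mt_job job = {
          .thread_fn   = demo_thread_fn,
          .fn_arg      = NULL,
          .start       = 0,
          .size        = nr_items,
          .align       = 1,
          .min_chunk   = nr_items / num_node_state(N_MEMORY),
          .max_threads = num_node_state(N_MEMORY),
          .numa_aware  = true,  /* new: queue each work on a different node */
      };

      padata_do_multithreaded(&job);
  }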
Signed-off-by: Gang Li
---
 include/linux/padata.h | 3 +++
 kernel/padata.c        | 8 ++++++--
 mm/mm_init.c           | 1 +
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/linux/padata.h b/include/linux/padata.h
index 495b16b6b4d72..f79ccd50e7f40 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -137,6 +137,8 @@ struct padata_shell {
  *             appropriate for one worker thread to do at once.
  * @max_threads: Max threads to use for the job, actual number may be less
  *               depending on task size and minimum chunk size.
+ * @numa_aware: Dispatch jobs to different nodes. If a node only has memory but
+ *              no CPU, dispatch its jobs to a random CPU.
  */
 struct padata_mt_job {
     void (*thread_fn)(unsigned long start, unsigned long end, void *arg);
@@ -146,6 +148,7 @@ struct padata_mt_job {
     unsigned long align;
     unsigned long min_chunk;
     int max_threads;
+    bool numa_aware;
 };
 
 /**
diff --git a/kernel/padata.c b/kernel/padata.c
index 179fb1518070c..1c2b3a337479e 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -485,7 +485,7 @@ void __init padata_do_multithreaded(struct padata_mt_job *job)
     struct padata_work my_work, *pw;
     struct padata_mt_job_state ps;
     LIST_HEAD(works);
-    int nworks;
+    int nworks, nid = 0;
 
     if (job->size == 0)
         return;
@@ -517,7 +517,11 @@ void __init padata_do_multithreaded(struct padata_mt_job *job)
     ps.chunk_size = roundup(ps.chunk_size, job->align);
 
     list_for_each_entry(pw, &works, pw_list)
-        queue_work(system_unbound_wq, &pw->pw_work);
+        if (job->numa_aware)
+            queue_work_node((++nid % num_node_state(N_MEMORY)),
+                    system_unbound_wq, &pw->pw_work);
+        else
+            queue_work(system_unbound_wq, &pw->pw_work);
 
     /* Use the current thread, which saves starting a workqueue worker. */
     padata_work_init(&my_work, padata_mt_helper, &ps, PADATA_WORK_ONSTACK);
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 89dc29f1e6c6f..59fcffddf65a3 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -2225,6 +2225,7 @@ static int __init deferred_init_memmap(void *data)
         .align = PAGES_PER_SECTION,
         .min_chunk = PAGES_PER_SECTION,
         .max_threads = max_threads,
+        .numa_aware = false,
     };
 
     padata_do_multithreaded(&job);
From patchwork Tue Jan 2 13:12:46 2024
X-Patchwork-Submitter: Gang Li
X-Patchwork-Id: 184329
From: Gang Li
To: David Hildenbrand, David Rientjes, Mike Kravetz, Muchun Song, Andrew Morton, Tim Chen
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com, Gang Li
Subject: [PATCH v3 4/7] hugetlb: pass *next_nid_to_alloc directly to for_each_node_mask_to_alloc
Date: Tue, 2 Jan 2024 21:12:46 +0800
Message-Id: <20240102131249.76622-5-gang.li@linux.dev>
In-Reply-To: <20240102131249.76622-1-gang.li@linux.dev>
References: <20240102131249.76622-1-gang.li@linux.dev>

The parallelization of hugetlb allocation leads to errors when
h->next_nid_to_alloc is shared across different threads. To address this,
it is necessary to assign a separate next_nid_to_alloc to each thread.

Consequently, hstate_next_node_to_alloc() and for_each_node_mask_to_alloc()
have been modified to accept a *next_nid_to_alloc parameter directly,
ensuring thread-specific allocation and avoiding concurrent access issues.
The resulting calling convention is sketched below.
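As an illustrative sketch (not part of the diff): existing serial paths
keep the shared per-hstate cursor by passing its address, while a
boot-time worker thread can use a private cursor on its own stack:

  /* serial paths: shared, per-hstate round-robin cursor */
  folio = alloc_pool_huge_folio(h, nodes_allowed, node_alloc_noretry,
                                &h->next_nid_to_alloc);

  /* a per-thread caller: private cursor, no shared state to race on */
  int next_nid_to_alloc = 0;

  folio = alloc_pool_huge_folio(h, &node_states[N_MEMORY],
                                &node_alloc_noretry, &next_nid_to_alloc);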
Signed-off-by: Gang Li
---
This patch does not seem elegant, but I can't come up with anything
better. Any suggestions will be highly appreciated!
---
 mm/hugetlb.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 92448e747991d..a71bc1622b53b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1464,15 +1464,15 @@ static int get_valid_node_allowed(int nid, nodemask_t *nodes_allowed)
  * next node from which to allocate, handling wrap at end of node
  * mask.
  */
-static int hstate_next_node_to_alloc(struct hstate *h,
+static int hstate_next_node_to_alloc(int *next_nid_to_alloc,
                     nodemask_t *nodes_allowed)
 {
     int nid;
 
     VM_BUG_ON(!nodes_allowed);
 
-    nid = get_valid_node_allowed(h->next_nid_to_alloc, nodes_allowed);
-    h->next_nid_to_alloc = next_node_allowed(nid, nodes_allowed);
+    nid = get_valid_node_allowed(*next_nid_to_alloc, nodes_allowed);
+    *next_nid_to_alloc = next_node_allowed(nid, nodes_allowed);
 
     return nid;
 }
@@ -1495,10 +1495,10 @@ static int hstate_next_node_to_free(struct hstate *h, nodemask_t *nodes_allowed)
     return nid;
 }
 
-#define for_each_node_mask_to_alloc(hs, nr_nodes, node, mask)		\
+#define for_each_node_mask_to_alloc(next_nid_to_alloc, nr_nodes, node, mask)	\
     for (nr_nodes = nodes_weight(*mask);				\
         nr_nodes > 0 &&							\
-        ((node = hstate_next_node_to_alloc(hs, mask)) || 1);		\
+        ((node = hstate_next_node_to_alloc(next_nid_to_alloc, mask)) || 1);	\
         nr_nodes--)
 
 #define for_each_node_mask_to_free(hs, nr_nodes, node, mask)		\
@@ -2350,12 +2350,13 @@ static void prep_and_add_allocated_folios(struct hstate *h,
  */
 static struct folio *alloc_pool_huge_folio(struct hstate *h,
                     nodemask_t *nodes_allowed,
-                    nodemask_t *node_alloc_noretry)
+                    nodemask_t *node_alloc_noretry,
+                    int *next_nid_to_alloc)
 {
     gfp_t gfp_mask = htlb_alloc_mask(h) | __GFP_THISNODE;
     int nr_nodes, node;
 
-    for_each_node_mask_to_alloc(h, nr_nodes, node, nodes_allowed) {
+    for_each_node_mask_to_alloc(next_nid_to_alloc, nr_nodes, node, nodes_allowed) {
         struct folio *folio;
 
         folio = only_alloc_fresh_hugetlb_folio(h, gfp_mask, node,
@@ -3310,7 +3311,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
         goto found;
     }
     /* allocate from next node when distributing huge pages */
-    for_each_node_mask_to_alloc(h, nr_nodes, node, &node_states[N_MEMORY]) {
+    for_each_node_mask_to_alloc(&h->next_nid_to_alloc, nr_nodes, node, &node_states[N_MEMORY]) {
         m = memblock_alloc_try_nid_raw(
                 huge_page_size(h), huge_page_size(h),
                 0, MEMBLOCK_ALLOC_ACCESSIBLE, node);
@@ -3684,7 +3685,7 @@ static int adjust_pool_surplus(struct hstate *h, nodemask_t *nodes_allowed,
     VM_BUG_ON(delta != -1 && delta != 1);
 
     if (delta < 0) {
-        for_each_node_mask_to_alloc(h, nr_nodes, node, nodes_allowed) {
+        for_each_node_mask_to_alloc(&h->next_nid_to_alloc, nr_nodes, node, nodes_allowed) {
             if (h->surplus_huge_pages_node[node])
                 goto found;
         }
@@ -3799,7 +3800,8 @@ static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
         cond_resched();
 
         folio = alloc_pool_huge_folio(h, nodes_allowed,
-                        node_alloc_noretry);
+                        node_alloc_noretry,
+                        &h->next_nid_to_alloc);
         if (!folio) {
             prep_and_add_allocated_folios(h, &page_list);
             spin_lock_irq(&hugetlb_lock);

From patchwork Tue Jan 2 13:12:47 2024
X-Patchwork-Submitter: Gang Li
X-Patchwork-Id: 184330
From: Gang Li
To: David Hildenbrand, David Rientjes, Mike Kravetz, Muchun Song, Andrew Morton, Tim Chen
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com, Gang Li
Subject: [PATCH v3 5/7] hugetlb: have CONFIG_HUGETLBFS select CONFIG_PADATA
Date: Tue, 2 Jan 2024 21:12:47 +0800
Message-Id: <20240102131249.76622-6-gang.li@linux.dev>
In-Reply-To: <20240102131249.76622-1-gang.li@linux.dev>
References: <20240102131249.76622-1-gang.li@linux.dev>

hugetlb now uses padata_do_multithreaded() for parallel initialization,
so have HUGETLBFS select CONFIG_PADATA.
Signed-off-by: Gang Li
Reviewed-by: Muchun Song
---
 fs/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/Kconfig b/fs/Kconfig
index 89fdbefd1075f..a57d6e6c41e6f 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -262,6 +262,7 @@ menuconfig HUGETLBFS
     depends on X86 || SPARC64 || ARCH_SUPPORTS_HUGETLBFS || BROKEN
     depends on (SYSFS || SYSCTL)
     select MEMFD_CREATE
+    select PADATA
     help
       hugetlbfs is a filesystem backing for HugeTLB pages, based on
       ramfs. For architectures that support it, say Y here and read

From patchwork Tue Jan 2 13:12:48 2024
X-Patchwork-Submitter: Gang Li
X-Patchwork-Id: 184331
From: Gang Li
To: David Hildenbrand, David Rientjes, Mike Kravetz, Muchun Song, Andrew Morton, Tim Chen
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com, Gang Li
Subject: [PATCH v3 6/7] hugetlb: parallelize 2M hugetlb allocation and initialization
Date: Tue, 2 Jan 2024 21:12:48 +0800
Message-Id: <20240102131249.76622-7-gang.li@linux.dev>
In-Reply-To: <20240102131249.76622-1-gang.li@linux.dev>
References: <20240102131249.76622-1-gang.li@linux.dev>

By distributing both the allocation and the initialization tasks across
multiple threads, the initialization of 2M hugetlb becomes faster, thereby
improving boot speed.

Here are some test results:

  test                 no patch(ms)   patched(ms)   saved
  -------------------  -------------  ------------  --------
  256c2t(4 node) 2M    3336           1051          68.52%
  128c1t(2 node) 2M    1943           716           63.15%
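As an illustrative sizing example (the numbers are hypothetical, not taken
from the measurements above): on a 2-node machine booting with
hugepages=10240, the allocation job below runs with max_threads = 2 * 2 = 4
and min_chunk = 10240 / 2 / 2 = 2560, so up to four NUMA-aware workers each
allocate at least 2560 pages; a second padata job then runs one worker per
memory node to batch the vmemmap optimization of that node's free list.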
Signed-off-by: Gang Li
---
 mm/hugetlb.c | 72 ++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 53 insertions(+), 19 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a71bc1622b53b..d1629df5f399f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -35,6 +35,7 @@
 #include
 #include
 #include
+#include <linux/padata.h>
 
 #include
 #include
@@ -3510,6 +3511,38 @@ static void __init hugetlb_hstate_alloc_pages_report(unsigned long allocated, st
     }
 }
 
+static void __init hugetlb_alloc_node(unsigned long start, unsigned long end, void *arg)
+{
+    struct hstate *h = (struct hstate *)arg;
+    int i, num = end - start;
+    nodemask_t node_alloc_noretry;
+    unsigned long flags;
+    int next_nid_to_alloc = 0;
+
+    /* Bit mask controlling how hard we retry per-node allocations.*/
+    nodes_clear(node_alloc_noretry);
+
+    for (i = 0; i < num; ++i) {
+        struct folio *folio = alloc_pool_huge_folio(h, &node_states[N_MEMORY],
+                        &node_alloc_noretry, &next_nid_to_alloc);
+        if (!folio)
+            break;
+        spin_lock_irqsave(&hugetlb_lock, flags);
+        __prep_account_new_huge_page(h, folio_nid(folio));
+        enqueue_hugetlb_folio(h, folio);
+        spin_unlock_irqrestore(&hugetlb_lock, flags);
+        cond_resched();
+    }
+}
+
+static void __init hugetlb_vmemmap_optimize_node(unsigned long start, unsigned long end, void *arg)
+{
+    struct hstate *h = (struct hstate *)arg;
+    int nid = start;
+
+    hugetlb_vmemmap_optimize_folios(h, &h->hugepage_freelists[nid]);
+}
+
 static unsigned long __init hugetlb_hstate_alloc_pages_gigantic(struct hstate *h)
 {
     unsigned long i;
@@ -3529,26 +3562,27 @@ static unsigned long __init hugetlb_hstate_alloc_pages_gigantic(struct hstate *h
 
 static unsigned long __init hugetlb_hstate_alloc_pages_non_gigantic(struct hstate *h)
 {
-    unsigned long i;
-    struct folio *folio;
-    LIST_HEAD(folio_list);
-    nodemask_t node_alloc_noretry;
-
-    /* Bit mask controlling how hard we retry per-node allocations.*/
-    nodes_clear(node_alloc_noretry);
-
-    for (i = 0; i < h->max_huge_pages; ++i) {
-        folio = alloc_pool_huge_folio(h, &node_states[N_MEMORY],
-                        &node_alloc_noretry);
-        if (!folio)
-            break;
-        list_add(&folio->lru, &folio_list);
-        cond_resched();
-    }
-
-    prep_and_add_allocated_folios(h, &folio_list);
+    struct padata_mt_job job = {
+        .fn_arg = h,
+        .align = 1,
+        .numa_aware = true
+    };
 
-    return i;
+    job.thread_fn = hugetlb_alloc_node;
+    job.start = 0;
+    job.size = h->max_huge_pages;
+    job.min_chunk = h->max_huge_pages / num_node_state(N_MEMORY) / 2;
+    job.max_threads = num_node_state(N_MEMORY) * 2;
+    padata_do_multithreaded(&job);
+
+    job.thread_fn = hugetlb_vmemmap_optimize_node;
+    job.start = 0;
+    job.size = num_node_state(N_MEMORY);
+    job.min_chunk = 1;
+    job.max_threads = num_node_state(N_MEMORY);
+    padata_do_multithreaded(&job);
+
+    return h->nr_huge_pages;
 }
 
 /*

From patchwork Tue Jan 2 13:12:49 2024
X-Patchwork-Submitter: Gang Li
X-Patchwork-Id: 184332
From: Gang Li
To: David Hildenbrand, David Rientjes, Mike Kravetz, Muchun Song, Andrew Morton, Tim Chen
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com, Gang Li
Subject: [PATCH v3 7/7] hugetlb: parallelize 1G hugetlb initialization
Date: Tue, 2 Jan 2024 21:12:49 +0800
Message-Id: <20240102131249.76622-8-gang.li@linux.dev>
In-Reply-To: <20240102131249.76622-1-gang.li@linux.dev>
References: <20240102131249.76622-1-gang.li@linux.dev>

Optimize the initialization speed of 1G huge pages through parallelization.

1G hugetlb pages are allocated from bootmem, a process that is already very
fast and does not currently require optimization. Therefore, we focus on
parallelizing only the initialization phase in `gather_bootmem_prealloc`.

Here are some test results:

  test                 no patch(ms)   patched(ms)   saved
  -------------------  -------------  ------------  --------
  256c2t(4 node) 1G    4745           2024          57.34%
  128c1t(2 node) 1G    3358           1712          49.02%
  12t            1G    77000          18300         76.23%
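To illustrate how the work is divided (derived from the diff below): the
padata job uses start = 0, size = num_node_state(N_MEMORY), align = 1 and
min_chunk = 1, so on a 4-node machine up to four workers run
__gather_bootmem_prealloc(), worker nid walking only huge_boot_pages[nid].
This is why the single huge_boot_pages list becomes a per-node array, and
why hugetlb_lock is now taken per folio instead of once around the whole
loop: concurrent workers must not share one list or serialize on a single
long critical section.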
Signed-off-by: Gang Li
---
 include/linux/hugetlb.h |  2 +-
 mm/hugetlb.c            | 40 +++++++++++++++++++++++++++++++++-------
 2 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index c1ee640d87b11..77b30a8c6076b 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -178,7 +178,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 struct address_space *hugetlb_page_mapping_lock_write(struct page *hpage);
 
 extern int sysctl_hugetlb_shm_group;
-extern struct list_head huge_boot_pages;
+extern struct list_head huge_boot_pages[MAX_NUMNODES];
 
 /* arch callbacks */
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d1629df5f399f..e5a55707f8814 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -69,7 +69,7 @@ static bool hugetlb_cma_folio(struct folio *folio, unsigned int order)
 #endif
 static unsigned long hugetlb_cma_size __initdata;
 
-__initdata LIST_HEAD(huge_boot_pages);
+__initdata struct list_head huge_boot_pages[MAX_NUMNODES];
 
 /* for command line parsing */
 static struct hstate * __initdata parsed_hstate;
@@ -3339,7 +3339,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
         huge_page_size(h) - PAGE_SIZE);
     /* Put them into a private list first because mem_map is not up yet */
     INIT_LIST_HEAD(&m->list);
-    list_add(&m->list, &huge_boot_pages);
+    list_add(&m->list, &huge_boot_pages[node]);
     m->hstate = h;
     return 1;
 }
@@ -3390,8 +3390,6 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
     /* Send list for bulk vmemmap optimization processing */
     hugetlb_vmemmap_optimize_folios(h, folio_list);
 
-    /* Add all new pool pages to free lists in one lock cycle */
-    spin_lock_irqsave(&hugetlb_lock, flags);
     list_for_each_entry_safe(folio, tmp_f, folio_list, lru) {
         if (!folio_test_hugetlb_vmemmap_optimized(folio)) {
             /*
@@ -3404,23 +3402,27 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
                     HUGETLB_VMEMMAP_RESERVE_PAGES,
                     pages_per_huge_page(h));
         }
+        /* Subdivide locks to achieve better parallel performance */
+        spin_lock_irqsave(&hugetlb_lock, flags);
         __prep_account_new_huge_page(h, folio_nid(folio));
         enqueue_hugetlb_folio(h, folio);
+        spin_unlock_irqrestore(&hugetlb_lock, flags);
     }
-    spin_unlock_irqrestore(&hugetlb_lock, flags);
 }
 
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
  */
-static void __init gather_bootmem_prealloc(void)
+static void __init __gather_bootmem_prealloc(unsigned long start, unsigned long end, void *arg)
+
 {
+    int nid = start;
     LIST_HEAD(folio_list);
     struct huge_bootmem_page *m;
     struct hstate *h = NULL, *prev_h = NULL;
 
-    list_for_each_entry(m, &huge_boot_pages, list) {
+    list_for_each_entry(m, &huge_boot_pages[nid], list) {
         struct page *page = virt_to_page(m);
         struct folio *folio = (void *)page;
 
@@ -3453,6 +3455,22 @@ static void __init gather_bootmem_prealloc(void)
     prep_and_add_bootmem_folios(h, &folio_list);
 }
 
+static void __init gather_bootmem_prealloc(void)
+{
+    struct padata_mt_job job = {
+        .thread_fn	= __gather_bootmem_prealloc,
+        .fn_arg		= NULL,
+        .start		= 0,
+        .size		= num_node_state(N_MEMORY),
+        .align		= 1,
+        .min_chunk	= 1,
+        .max_threads	= num_node_state(N_MEMORY),
+        .numa_aware	= true,
+    };
+
+    padata_do_multithreaded(&job);
+}
+
 static void __init hugetlb_hstate_alloc_pages_onenode(struct hstate *h, int nid)
 {
     unsigned long i;
@@ -3606,6 +3624,14 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
         return;
     }
 
+    /* hugetlb_hstate_alloc_pages will be called many times, init huge_boot_pages once */
+    if (huge_boot_pages[0].next == NULL) {
+        int i = 0;
+
+        for (i = 0; i < MAX_NUMNODES; i++)
+            INIT_LIST_HEAD(&huge_boot_pages[i]);
+    }
+
     /* do node specific alloc */
     if (hugetlb_hstate_alloc_pages_node_specific(h))
         return;