From patchwork Fri Dec 8 02:52:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gang Li X-Patchwork-Id: 175538 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp5207755vqy; Thu, 7 Dec 2023 18:53:34 -0800 (PST) X-Google-Smtp-Source: AGHT+IFLrfEkj/1ALtkdkVS9piTR5dr9EeAqUtiqm5TVzKwod/qgkzYaztcwvvGMWjr9MzaG379Z X-Received: by 2002:a05:6358:248b:b0:16d:bd2f:a832 with SMTP id m11-20020a056358248b00b0016dbd2fa832mr2649957rwc.17.1702004013712; Thu, 07 Dec 2023 18:53:33 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1702004013; cv=none; d=google.com; s=arc-20160816; b=QZdBE8S74pn+ZzGVhRL+/pyTRXjzO0mFq/6o4itWZxZp6dOa5/V/p1IKguTp7hToUV r7tD9iUJZtBsFs/B86hXW/h52DqQNt9hFuOub+v6UKzriQYRzRCzaHsrPzge5dxcDZMr fv2bYor8pqBbp6iLmw3m8BYptQyF5u7uSHl7jJKxDhav8hAGij08CkSPHcAtI/U9V1IW SX5ahRaX51WpGP8e2ExY0qtxV9bFrBgpD6ZWZAwbZinxHdfEuvj+aF91ncOkMKOt7may WxAY0LEw7/9VhAgIX8Xhw5/AaF0SX0gbjWzpO9NynNyBd6QqsM1bEsMb69bKurnnzj5h n1bw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=nScQm/fmNOjGhlLnCGYZOl5/YmPviWpS9XmgJR18QP0=; fh=ALVB2amNZc7wpwPN7EU8i4C3eL0OH515L0QoCIRQrQE=; b=X2E+9l5Cx44XlwgvLEc9fTtESTor9bHgpRC6pqMWBl3L/61PAIZj1OZEv9UP48GbXn cr9xnT8bRKeL3oeLyv9EpBfwpzCQM8oluK4NYepauRrhv9V1KYswuM/Va4yE4PSKIei1 5fnFgmSOTvdTchu0Cv4xKe6cP9usa/Co8NVkl0fpRccO5KCQ2n45OeUbaFbE4tOszx7c WXlUVJ/C8Srb38/MbrRKKFTz2QmAJhrlYsb8kNH+A4KmZFeBe1bLlUTcTLlUyZ+tvVUK p4UVGfDVDaiPkxsV3rwVK6/DU9Vsuy8Vv4ZEy5cPouuvU1Q6WuOEvqSuNEq7fQ0NtKKI 8mvQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=lfAm3Ag0; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) 
smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: from agentk.vger.email (agentk.vger.email. [2620:137:e000::3:2]) by mx.google.com with ESMTPS id 19-20020a631253000000b0056da0ae25a2si717827pgs.32.2023.12.07.18.53.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 07 Dec 2023 18:53:33 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) client-ip=2620:137:e000::3:2; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=lfAm3Ag0; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by agentk.vger.email (Postfix) with ESMTP id 5003380E9E83; Thu, 7 Dec 2023 18:53:29 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at agentk.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1573007AbjLHCxF (ORCPT + 99 others); Thu, 7 Dec 2023 21:53:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44728 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230225AbjLHCxC (ORCPT ); Thu, 7 Dec 2023 21:53:02 -0500 Received: from out-171.mta0.migadu.com (out-171.mta0.migadu.com [IPv6:2001:41d0:1004:224b::ab]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 591D4171E for ; Thu, 7 Dec 2023 18:53:08 -0800 (PST) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1702003986; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nScQm/fmNOjGhlLnCGYZOl5/YmPviWpS9XmgJR18QP0=; b=lfAm3Ag0u+nVvzsnpysTBQspAGEV7RQMWtcMnK+sgx5xG663oyDL2ImjlVJztS/qUUC2Bu ASnstCjN2e7RvnSs5QQnB61n9SzrCi2SL7VTkiqKPo9WPDD641CyExewkH4Cks64v1NnaA Y94xV0mU8qY+cPzuNp+xPfcAHW1VUQI= From: Gang Li To: David Hildenbrand , David Rientjes , Mike Kravetz , Muchun Song , Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com, Gang Li Subject: [RFC PATCH v2 1/5] hugetlb: code clean for hugetlb_hstate_alloc_pages Date: Fri, 8 Dec 2023 10:52:36 +0800 Message-Id: <20231208025240.4744-2-gang.li@linux.dev> In-Reply-To: <20231208025240.4744-1-gang.li@linux.dev> References: <20231208025240.4744-1-gang.li@linux.dev> MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on agentk.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (agentk.vger.email [0.0.0.0]); Thu, 07 Dec 2023 18:53:29 -0800 (PST) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784680561015927778 X-GMAIL-MSGID: 1784680561015927778 The readability of `hugetlb_hstate_alloc_pages` is poor. By cleaning the code, its readability can be improved, facilitating future modifications. This patch extracts two functions to reduce the complexity of `hugetlb_hstate_alloc_pages` and has no functional changes. 
- hugetlb_hstate_alloc_pages_node_specific() iterates through each online node and performs node-specific allocation if necessary. - hugetlb_hstate_alloc_pages_report() reports an allocation shortfall and updates the value of h->max_huge_pages accordingly. Signed-off-by: Gang Li --- mm/hugetlb.c | 46 +++++++++++++++++++++++++++++----------------- 1 file changed, 29 insertions(+), 17 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 51f50bb3dc092..252d6866a0af8 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3475,6 +3475,33 @@ static void __init hugetlb_hstate_alloc_pages_onenode(struct hstate *h, int nid) h->max_huge_pages_node[nid] = i; } +static bool __init hugetlb_hstate_alloc_pages_node_specific(struct hstate *h) +{ + int i; + bool node_specific_alloc = false; + + for_each_online_node(i) { + if (h->max_huge_pages_node[i] > 0) { + hugetlb_hstate_alloc_pages_onenode(h, i); + node_specific_alloc = true; + } + } + + return node_specific_alloc; +} + +static void __init hugetlb_hstate_alloc_pages_report(unsigned long allocated, struct hstate *h) +{ + if (allocated < h->max_huge_pages) { + char buf[32]; + + string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32); + pr_warn("HugeTLB: allocating %lu of page size %s failed. Only allocated %lu hugepages.\n", + h->max_huge_pages, buf, allocated); + h->max_huge_pages = allocated; + } +} + /* * NOTE: this routine is called in different contexts for gigantic and * non-gigantic pages. 
@@ -3492,7 +3519,6 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h) struct folio *folio; LIST_HEAD(folio_list); nodemask_t *node_alloc_noretry; - bool node_specific_alloc = false; /* skip gigantic hugepages allocation if hugetlb_cma enabled */ if (hstate_is_gigantic(h) && hugetlb_cma_size) { @@ -3501,14 +3527,7 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h) } /* do node specific alloc */ - for_each_online_node(i) { - if (h->max_huge_pages_node[i] > 0) { - hugetlb_hstate_alloc_pages_onenode(h, i); - node_specific_alloc = true; - } - } - - if (node_specific_alloc) + if (hugetlb_hstate_alloc_pages_node_specific(h)) return; /* below will do all node balanced alloc */ @@ -3551,14 +3570,7 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h) /* list will be empty if hstate_is_gigantic */ prep_and_add_allocated_folios(h, &folio_list); - if (i < h->max_huge_pages) { - char buf[32]; - - string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32); - pr_warn("HugeTLB: allocating %lu of page size %s failed. 
Only allocated %lu hugepages.\n", - h->max_huge_pages, buf, i); - h->max_huge_pages = i; - } + hugetlb_hstate_alloc_pages_report(i, h); kfree(node_alloc_noretry); } From patchwork Fri Dec 8 02:52:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gang Li X-Patchwork-Id: 175536 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp5207726vqy; Thu, 7 Dec 2023 18:53:29 -0800 (PST) X-Google-Smtp-Source: AGHT+IHNVsu1OiK85k0aiGX7remCu75otjHOwJwcVSxOfpC5SSvqGFjHV1501rfPcghbF3cwPnUo X-Received: by 2002:a17:903:24c:b0:1d0:ae84:611b with SMTP id j12-20020a170903024c00b001d0ae84611bmr3856232plh.54.1702004008997; Thu, 07 Dec 2023 18:53:28 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1702004008; cv=none; d=google.com; s=arc-20160816; b=gVCU8IuvCG0P4rAiFCXqsZd9YM6x+vNrbBWCpSZeyDVAebr/q8IQRvV//WubjEJ8gM CLivc54QeRYBP4UkY7fKMbcM8FfpbcRNOaehNtbMpQRlX+iYsELeSaXaM1nlSdCAorNg r0ksGKAmAt6UWWE4qfgHmFzd1RiWPBwkzuKYkp1iFdHGMSzVxaqYOyGLGY9zUWc+oWOL LDnImj28NB09PfZfMWHVOXmSCNTGhwGuVkxMXBoBY+UkakW3EGJgaajoIk2mZtsaJP7c oLEegCQUvvnekBNq4Ik6X80P8nDZQqMEeNjqscbiIK/BRmhf0UZjfX3y7H24/ih6jyrS AZ3A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=n5rlcvDOEWSjvMRr90AdsF4YIrGIogPwB5/wbnheBkQ=; fh=ALVB2amNZc7wpwPN7EU8i4C3eL0OH515L0QoCIRQrQE=; b=vC90S32dF3mPq5TahZ7Tj9vjB+YCimFp6ho7IgMBwhMmqIKzCSndKbyLD3qCTSRFrq gusIyHsKoZQaTUhNSTVX3ihHT99yilrEKrjSiTXmTNoGZWqIhq//rLglmm5P8Gzx9hcb FkkiafhXsvd/INKR9/szQYtfvG/oSrKLWjyZuh3LnVoyVpG94rx9Up2e4aPUMoF6GtXe MIXhSygYnVcswUBcb5674BCcm74Vj1tHKKtD1pHsjtrdv52WmRAppoeD7hASrq1ut2iX LVzBehE3XVBd9mRt+TNyvnklxik5rTtsovmhVeSTwFIPJpNHPl3Hi9oFnRttMRi3vdLo fAQw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=IHnb3dgP; 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1702003990; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=n5rlcvDOEWSjvMRr90AdsF4YIrGIogPwB5/wbnheBkQ=; b=IHnb3dgPdv0XvvyaI7QPgxVi6IP+FjK3KVXTcR25SvOJ6PmLr1x9emJ+YfIcImJVapgyzg uJKyXe9nXu/y0CFNFbZWwF2f89s3kEfY8JD/9hR223k4PKPEJhAu3rW0LY4r//U8QAMS8M e2PUBAGxjgBmp5nM2aTp82P5yZ+s6bI= From: Gang Li To: David Hildenbrand , David Rientjes , Mike Kravetz , Muchun Song , Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com, Gang Li Subject: [RFC PATCH v2 2/5] hugetlb: split hugetlb_hstate_alloc_pages Date: Fri, 8 Dec 2023 10:52:37 +0800 Message-Id: <20231208025240.4744-3-gang.li@linux.dev> In-Reply-To: <20231208025240.4744-1-gang.li@linux.dev> References: <20231208025240.4744-1-gang.li@linux.dev> MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on fry.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (fry.vger.email [0.0.0.0]); Thu, 07 Dec 2023 18:53:25 -0800 (PST) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784680555704260258 X-GMAIL-MSGID: 1784680555704260258 1G and 2M huge pages have different allocation and initialization logic, which leads to subtle differences in parallelization. Therefore, it is appropriate to split hugetlb_hstate_alloc_pages into gigantic and non-gigantic. This patch has no functional changes. 
Signed-off-by: Gang Li --- mm/hugetlb.c | 86 +++++++++++++++++++++++++++------------------------- 1 file changed, 45 insertions(+), 41 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 252d6866a0af8..8de1653fc4c4f 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3502,6 +3502,47 @@ static void __init hugetlb_hstate_alloc_pages_report(unsigned long allocated, st } } +static unsigned long __init hugetlb_hstate_alloc_pages_gigantic(struct hstate *h) +{ + unsigned long i; + + for (i = 0; i < h->max_huge_pages; ++i) { + /* + * gigantic pages not added to list as they are not + * added to pools now. + */ + if (!alloc_bootmem_huge_page(h, NUMA_NO_NODE)) + break; + cond_resched(); + } + + return i; +} + +static unsigned long __init hugetlb_hstate_alloc_pages_non_gigantic(struct hstate *h) +{ + unsigned long i; + struct folio *folio; + LIST_HEAD(folio_list); + nodemask_t node_alloc_noretry; + + /* Bit mask controlling how hard we retry per-node allocations.*/ + nodes_clear(node_alloc_noretry); + + for (i = 0; i < h->max_huge_pages; ++i) { + folio = alloc_pool_huge_folio(h, &node_states[N_MEMORY], + &node_alloc_noretry); + if (!folio) + break; + list_add(&folio->lru, &folio_list); + cond_resched(); + } + + prep_and_add_allocated_folios(h, &folio_list); + + return i; +} + /* * NOTE: this routine is called in different contexts for gigantic and * non-gigantic pages. 
@@ -3515,10 +3556,7 @@ static void __init hugetlb_hstate_alloc_pages_report(unsigned long allocated, st */ static void __init hugetlb_hstate_alloc_pages(struct hstate *h) { - unsigned long i; - struct folio *folio; - LIST_HEAD(folio_list); - nodemask_t *node_alloc_noretry; + unsigned long allocated; /* skip gigantic hugepages allocation if hugetlb_cma enabled */ if (hstate_is_gigantic(h) && hugetlb_cma_size) { @@ -3532,46 +3570,12 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h) /* below will do all node balanced alloc */ if (!hstate_is_gigantic(h)) { - /* - * Bit mask controlling how hard we retry per-node allocations. - * Ignore errors as lower level routines can deal with - * node_alloc_noretry == NULL. If this kmalloc fails at boot - * time, we are likely in bigger trouble. - */ - node_alloc_noretry = kmalloc(sizeof(*node_alloc_noretry), - GFP_KERNEL); + allocated = hugetlb_hstate_alloc_pages_non_gigantic(h); } else { - /* allocations done at boot time */ - node_alloc_noretry = NULL; - } - - /* bit mask controlling how hard we retry per-node allocations */ - if (node_alloc_noretry) - nodes_clear(*node_alloc_noretry); - - for (i = 0; i < h->max_huge_pages; ++i) { - if (hstate_is_gigantic(h)) { - /* - * gigantic pages not added to list as they are not - * added to pools now. 
- */ - if (!alloc_bootmem_huge_page(h, NUMA_NO_NODE)) - break; - } else { - folio = alloc_pool_huge_folio(h, &node_states[N_MEMORY], - node_alloc_noretry); - if (!folio) - break; - list_add(&folio->lru, &folio_list); - } - cond_resched(); + allocated = hugetlb_hstate_alloc_pages_gigantic(h); } - /* list will be empty if hstate_is_gigantic */ - prep_and_add_allocated_folios(h, &folio_list); - - hugetlb_hstate_alloc_pages_report(i, h); - kfree(node_alloc_noretry); + hugetlb_hstate_alloc_pages_report(allocated, h); } static void __init hugetlb_init_hstates(void) From patchwork Fri Dec 8 02:52:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gang Li X-Patchwork-Id: 175537 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp5207746vqy; Thu, 7 Dec 2023 18:53:32 -0800 (PST) X-Google-Smtp-Source: AGHT+IE5+MgUcrlypI0LKBMC5trRioFawUB+oH0cUbXSwyZrp28cvkemDsMXUmHJeY/7lSyGep8i X-Received: by 2002:a17:90a:9406:b0:286:6cc0:62a3 with SMTP id r6-20020a17090a940600b002866cc062a3mr296767pjo.34.1702004012528; Thu, 07 Dec 2023 18:53:32 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1702004012; cv=none; d=google.com; s=arc-20160816; b=nkna1QQtg57QDz1iLQRoy2i9mv8a/8nN09kUoWxHE0RWwoj7vx30uFUDlkZBNtOI7V ObBZYhcD8LwQc400EEGkVYCczZO2LPJ3vexd21ps2ted87fLpstGiydUGkATtgKEdfi1 Nb/f6lISUzxHniNzdA8tvG2kD8EyJXGx05AjxJvfK/JOzoqWQoXZldwZdDza1hd2L+/i /UitidSVSnCMVUTBMsI6H3T/bGZHCEg2iqtLNh205faUaPCNpy7gcg+ByATHTLt6CUzY 9eHXIX/64AMSHURsyhiOwexiktFJ4DIKNYEejQYZFjmmPqDiQs3hdn5zKxEwApp8Vn8g WQ4Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=xBnw5RqwYO2IqY80uM+vPTSc8/TzM8Sh0FOmLemG2G4=; fh=ALVB2amNZc7wpwPN7EU8i4C3eL0OH515L0QoCIRQrQE=; 
(ORCPT ); Thu, 7 Dec 2023 21:53:09 -0500 Received: from out-187.mta0.migadu.com (out-187.mta0.migadu.com [IPv6:2001:41d0:1004:224b::bb]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 298C010CA for ; Thu, 7 Dec 2023 18:53:15 -0800 (PST) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1702003993; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=xBnw5RqwYO2IqY80uM+vPTSc8/TzM8Sh0FOmLemG2G4=; b=esddMIrNubz11Tjk5hcSDYsiI76jCdFHcT7RuRX5vkHNO/tt71jh8T95AJcRJ0EkjT5FPP nyp6uRRHoc0E4ynoEJCj3cDTTNvgLfqjFHV2Cv3uc8gZ6cNzKUTvW7ji1pnYkE7HVckU18 dtXx/KCzGEPds0Qw2r7tjuJ+Dt6mSrc= From: Gang Li To: David Hildenbrand , David Rientjes , Mike Kravetz , Muchun Song , Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com, Gang Li Subject: [RFC PATCH v2 3/5] padata: dispatch works on different nodes Date: Fri, 8 Dec 2023 10:52:38 +0800 Message-Id: <20231208025240.4744-4-gang.li@linux.dev> In-Reply-To: <20231208025240.4744-1-gang.li@linux.dev> References: <20231208025240.4744-1-gang.li@linux.dev> MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lipwig.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (lipwig.vger.email [0.0.0.0]); Thu, 07 Dec 2023 18:53:29 -0800 (PST) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784680559326301080 X-GMAIL-MSGID: 1784680559326301080 
When a group of tasks that access different nodes are scheduled on the same node, they may encounter bandwidth bottlenecks and increased access latency. Thus, a numa_aware flag is introduced here, allowing tasks to be distributed across different nodes to take full advantage of multi-node systems. Signed-off-by: Gang Li --- include/linux/padata.h | 2 ++ kernel/padata.c | 8 ++++++-- mm/mm_init.c | 1 + 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/include/linux/padata.h b/include/linux/padata.h index 495b16b6b4d72..f6c58c30ed96a 100644 --- a/include/linux/padata.h +++ b/include/linux/padata.h @@ -137,6 +137,7 @@ struct padata_shell { * appropriate for one worker thread to do at once. * @max_threads: Max threads to use for the job, actual number may be less * depending on task size and minimum chunk size. + * @numa_aware: Dispatch jobs to different nodes. */ struct padata_mt_job { void (*thread_fn)(unsigned long start, unsigned long end, void *arg); @@ -146,6 +147,7 @@ struct padata_mt_job { unsigned long align; unsigned long min_chunk; int max_threads; + bool numa_aware; }; /** diff --git a/kernel/padata.c b/kernel/padata.c index 179fb1518070c..80f82c563e46a 100644 --- a/kernel/padata.c +++ b/kernel/padata.c @@ -485,7 +485,7 @@ void __init padata_do_multithreaded(struct padata_mt_job *job) struct padata_work my_work, *pw; struct padata_mt_job_state ps; LIST_HEAD(works); - int nworks; + int nworks, nid = 0; if (job->size == 0) return; @@ -517,7 +517,11 @@ void __init padata_do_multithreaded(struct padata_mt_job *job) ps.chunk_size = roundup(ps.chunk_size, job->align); list_for_each_entry(pw, &works, pw_list) - queue_work(system_unbound_wq, &pw->pw_work); + if (job->numa_aware) + queue_work_node((++nid % num_node_state(N_MEMORY)), + system_unbound_wq, &pw->pw_work); + else + queue_work(system_unbound_wq, &pw->pw_work); /* Use the current thread, which saves starting a workqueue worker. 
*/ padata_work_init(&my_work, padata_mt_helper, &ps, PADATA_WORK_ONSTACK); diff --git a/mm/mm_init.c b/mm/mm_init.c index 077bfe393b5e2..1226f0c81fcb3 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -2234,6 +2234,7 @@ static int __init deferred_init_memmap(void *data) .align = PAGES_PER_SECTION, .min_chunk = PAGES_PER_SECTION, .max_threads = max_threads, + .numa_aware = false, }; padata_do_multithreaded(&job); From patchwork Fri Dec 8 02:52:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gang Li X-Patchwork-Id: 175539 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp5208087vqy; Thu, 7 Dec 2023 18:54:40 -0800 (PST) X-Google-Smtp-Source: AGHT+IFmtsEClywLr3NdYGo2KyuPo52AowQurbQOgqfEGaTAGGY0RH4BKb8vAXWlSZJFv6WZ9c4V X-Received: by 2002:a05:6a00:23c1:b0:6ce:de7d:e942 with SMTP id g1-20020a056a0023c100b006cede7de942mr170262pfc.33.1702004080516; Thu, 07 Dec 2023 18:54:40 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1702004080; cv=none; d=google.com; s=arc-20160816; b=fzB8uCnQM7qT2RqakQ2nnbKWLtaYiuZ/n9G+/xLRy9Hhf2zHAAs7NnF/Fclyb845At Nje8UL9T8ngUeLJv9Mhu0/j8NJ65/0n5qT3mVP7eVfgVgnYlomvAARRwL0wBpxCuw+Zo k3Qqpg33g6mFM/wFykIu8ovmee/V0jWyQRvWaNJEhm8FtzwjsbMn9a7lTI9QjLPQ8UjB VTE7UXCL1iJ8zO1VE5LQpGY0mw+RrGF8vZsiDTpeoLpnhgaag6mpVrRbp1NZjjd8r3p0 F+hDFAwY3KnVrPsPFwXKMANogWY4mCCmYKz1X4nU/iYox4d2AWDKHj7UILvw+XLNAU6T eexQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=kGZs/UomIOv18ew/FrkoM04DuS4bMFxPVmfT4OFKf7o=; fh=ALVB2amNZc7wpwPN7EU8i4C3eL0OH515L0QoCIRQrQE=; b=H7HUucsNFJF9h+TtmbCo7dUgjr/gEQefy5q5fuFwhZbKxwQH1E/IamM1ys67006Gr5 +X4mie477WPAB0MDfQgUEkSxlLsLIXKVWKgOARQ1SoMMEudkG8ZsNkRk1bB11SwKrEWy yEThbODD816isw3tlFzp/90DBFSQcEpFeyhkYdA5nlkTT0dvp1WJaTmIb0N6HeOWyipO 
Dec 2023 18:53:20 -0800 (PST) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1702003997; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kGZs/UomIOv18ew/FrkoM04DuS4bMFxPVmfT4OFKf7o=; b=chBueub5MeUxcphaWjUQxqFROsevyxpVPZfmiPXveRFoaHIrVUEHPmnvrGUbaR4eTRBiWA f94Dvmb/dAYuSx8K1f3+h/VIZx1F4a/NHLYb+wLTgBl21NY0lhI2iPxvxgQs9MN7KjUjor XiwWvmtR111DRnYq7EyqfBimXVLu1UU= From: Gang Li To: David Hildenbrand , David Rientjes , Mike Kravetz , Muchun Song , Andrew Morton Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com, Gang Li Subject: [RFC PATCH v2 4/5] hugetlb: parallelize 2M hugetlb allocation and initialization Date: Fri, 8 Dec 2023 10:52:39 +0800 Message-Id: <20231208025240.4744-5-gang.li@linux.dev> In-Reply-To: <20231208025240.4744-1-gang.li@linux.dev> References: <20231208025240.4744-1-gang.li@linux.dev> MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Thu, 07 Dec 2023 18:53:36 -0800 (PST) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784680630619844465 X-GMAIL-MSGID: 1784680630619844465 By distributing both the allocation and the initialization tasks across multiple threads, the initialization of 2M hugetlb will be faster, thereby improving the boot speed. 
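For illustration only, the following userspace sketch shows one way such a job could be divided among workers. It assumes an even split across at most 2 * nodes workers (the max_threads value this patch passes to padata) and round-robin node assignment in the spirit of the numa_aware flag. `struct worker_plan` and `plan_workers()` are hypothetical names introduced here, and padata's real chunking logic is not reproduced.

```c
#include <stddef.h>

/* Hypothetical description of one worker's share of the allocation job. */
struct worker_plan {
	unsigned long start;	/* first page index handled by this worker */
	unsigned long end;	/* one past the last page index */
	int node;		/* node the worker is queued on (round-robin) */
};

/*
 * Split total_pages across up to 2 * nr_nodes workers (an assumption
 * taken from the max_threads value in this patch). Writes at most cap
 * entries into plan and returns the number of workers planned.
 */
static size_t plan_workers(unsigned long total_pages, int nr_nodes,
			   struct worker_plan *plan, size_t cap)
{
	size_t nworkers, w;
	unsigned long base, extra, pos = 0;

	if (nr_nodes <= 0 || total_pages == 0 || cap == 0)
		return 0;

	nworkers = (size_t)nr_nodes * 2;
	if (nworkers > total_pages)
		nworkers = total_pages;	/* never plan an empty range */
	if (nworkers > cap)
		nworkers = cap;

	base = total_pages / nworkers;
	extra = total_pages % nworkers;	/* first 'extra' workers take one more page */

	for (w = 0; w < nworkers; w++) {
		unsigned long len = base + (w < extra ? 1 : 0);

		plan[w].start = pos;
		plan[w].end = pos + len;
		plan[w].node = (int)(w % (size_t)nr_nodes);
		pos += len;
	}

	return nworkers;
}
```

With 1000 pages and 4 nodes, this plans 8 workers of 125 pages each, two per node. The ranges are contiguous and cover every page exactly once, which is the property a parallel allocator needs regardless of how the chunks are actually sized.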
This patch reduces the time spent on 2M hugetlb allocation and initialization by more than 60% in the tests below.

test                 no patch(ms)   patched(ms)   saved
-------------------  -------------  ------------  --------
256c2t(4 node) 2M    2624           956           63.57%
128c1t(2 node) 2M    1788           684           61.74%

Signed-off-by: Gang Li --- mm/hugetlb.c | 71 ++++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 52 insertions(+), 19 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 8de1653fc4c4f..033e359fdb86b 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -35,6 +35,7 @@ #include #include #include +#include #include #include @@ -3502,6 +3503,37 @@ static void __init hugetlb_hstate_alloc_pages_report(unsigned long allocated, st } } +static void __init hugetlb_alloc_node(unsigned long start, unsigned long end, void *arg) +{ + struct hstate *h = (struct hstate *)arg; + int i, num = end - start; + nodemask_t node_alloc_noretry; + unsigned long flags; + + /* Bit mask controlling how hard we retry per-node allocations.*/ + nodes_clear(node_alloc_noretry); + + for (i = 0; i < num; ++i) { + struct folio *folio = alloc_pool_huge_folio(h, &node_states[N_MEMORY], + &node_alloc_noretry); + if (!folio) + break; + spin_lock_irqsave(&hugetlb_lock, flags); + __prep_account_new_huge_page(h, folio_nid(folio)); + enqueue_hugetlb_folio(h, folio); + spin_unlock_irqrestore(&hugetlb_lock, flags); + cond_resched(); + } +} + +static void __init hugetlb_vmemmap_optimize_node(unsigned long start, unsigned long end, void *arg) +{ + struct hstate *h = (struct hstate *)arg; + int nid = start; + + hugetlb_vmemmap_optimize_folios(h, &h->hugepage_freelists[nid]); +} + static unsigned long __init hugetlb_hstate_alloc_pages_gigantic(struct hstate *h) { unsigned long i; @@ -3521,26 +3553,27 @@ static unsigned long __init hugetlb_hstate_alloc_pages_gigantic(struct hstate *h static unsigned long __init hugetlb_hstate_alloc_pages_non_gigantic(struct hstate *h) { - unsigned long i; - struct folio *folio; - LIST_HEAD(folio_list); - nodemask_t node_alloc_noretry; - - /* Bit mask 
controlling how hard we retry per-node allocations.*/ - nodes_clear(node_alloc_noretry); - - for (i = 0; i < h->max_huge_pages; ++i) { - folio = alloc_pool_huge_folio(h, &node_states[N_MEMORY], - &node_alloc_noretry); - if (!folio) - break; - list_add(&folio->lru, &folio_list); - cond_resched(); - } - - prep_and_add_allocated_folios(h, &folio_list); + struct padata_mt_job job = { + .fn_arg = h, + .align = 1, + .numa_aware = true, + }; - return i; + job.thread_fn = hugetlb_alloc_node, + job.start = 0, + job.size = h->max_huge_pages, + job.min_chunk = h->max_huge_pages / num_node_state(N_MEMORY) / 2, + job.max_threads = num_node_state(N_MEMORY) * 2, + padata_do_multithreaded(&job); + + job.thread_fn = hugetlb_vmemmap_optimize_node, + job.start = 0, + job.size = num_node_state(N_MEMORY), + job.min_chunk = 1, + job.max_threads = num_node_state(N_MEMORY), + padata_do_multithreaded(&job); + + return h->nr_huge_pages; } /* From patchwork Fri Dec 8 02:52:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gang Li X-Patchwork-Id: 175540 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp5208113vqy; Thu, 7 Dec 2023 18:54:45 -0800 (PST) X-Google-Smtp-Source: AGHT+IFdvwD2LiV+o7E3u66bIpcJYepeSs39Sc+io+xNxkd0/dwCMgVkWeEJ4cdioUB2OYfqHsQ+ X-Received: by 2002:a17:902:f814:b0:1d0:b6d1:d465 with SMTP id ix20-20020a170902f81400b001d0b6d1d465mr3418769plb.56.1702004085305; Thu, 07 Dec 2023 18:54:45 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1702004085; cv=none; d=google.com; s=arc-20160816; b=dqmoFrjwSXcILkO6O9aCZYkJ5A/sOieba3jJE7UCTuSdUX9MCOGQsUjFC3UFeQEtuh fdRkrcCeLWdBdN3Gjon0jBTSsLHN0T368Tajw4Lv4V4tM4XTnXyVjHuXhHvR0J5PrwMz DESUso+YX7EJyjMpqsOZcdIs+nirsuRM7OBIFIdNLg84Oxx3GfFa9UHXQQxqYBVEyVmZ aR48jJQTcRpHuKMgUcdey2N3t38XhU1CboZG/BmJUK2ecqhJCkEYD8iuCmsB2PNezHTS h7G8KC9oFN6jJuWuN4ZFCWFOL8LjR77+98oWZf5Ilii3hEJfEe+9jXngH0jz5NhEYF4Z EnaA== 
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=JPiol85rH/HHWH3MtNGJ1vVzJRIHoY4UOz/xBdoQAl8=; fh=ALVB2amNZc7wpwPN7EU8i4C3eL0OH515L0QoCIRQrQE=; b=raMTJkU4yttBku994+CHljnGMF8oZ7GhEkdSbpTFpVP9xFyp7vpIGexRwG8yk0Mk7U anNWNExXvPi5tU3FZ2DCJs9SpaV/KHwjZGqtKWQQeyfKHx/1k11QyK9tgYVz4/z5MSdL ojqKtFir1p1kleFHwxCM7SwppNeoJ03F0PnB2PqjqH1l51Tr2aiQYykP2g51rnAK8wQQ SZmf4jFW9DazKl+dTEwqNiW2dsAimFleQJaWBOfbAO0h1Ylgh23+ZDE//Tuudq5ei5IU 0dObUmnbkcdZHGa9vbQkwwZRrpg47qTUMCA25sT28c0fI8S/Bmgb1XroVxk3E5NSCHP7 fZ3A==
ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=NxMVWG21; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev
Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id u13-20020a170902e5cd00b001cfb4bd0e36si763011plf.341.2023.12.07.18.54.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 07 Dec 2023 18:54:45 -0800 (PST)
Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37;
Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=NxMVWG21; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev
Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 1FF79805F9C6; Thu, 7 Dec 2023 18:53:43 -0800 (PST)
X-Virus-Status: Clean
X-Virus-Scanned: clamav-milter 0.103.11 at snail.vger.email
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235734AbjLHCx3 (ORCPT + 99 others); Thu, 7 Dec 2023 21:53:29 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60938 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573023AbjLHCxR (ORCPT ); Thu, 7 Dec 2023 21:53:17 -0500
Received: from out-183.mta0.migadu.com (out-183.mta0.migadu.com [IPv6:2001:41d0:1004:224b::b7]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 736221723 for ; Thu, 7 Dec 2023 18:53:22 -0800 (PST)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1702004000; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=JPiol85rH/HHWH3MtNGJ1vVzJRIHoY4UOz/xBdoQAl8=; b=NxMVWG21gW8upSMu6W1+UqWPM1mu3TNNK7ySpDZBuJgxSJQJnlV784GJ4eVMKSUVd0+zvX 1DuMiRjvpJps0PvFoWiZGOezZxshmLOm4vijHRinD2+eRO6AOMJh3fsAb46TUi6BMzI3VH gXkzeo3fmyYW7grIYjzDRsnY4fGCZgc=
From: Gang Li
To: David Hildenbrand, David Rientjes, Mike Kravetz, Muchun Song, Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, ligang.bdlg@bytedance.com, Gang Li
Subject: [RFC PATCH v2 5/5] hugetlb: parallelize 1G hugetlb initialization
Date: Fri, 8 Dec 2023 10:52:40 +0800
Message-Id: <20231208025240.4744-6-gang.li@linux.dev>
In-Reply-To: <20231208025240.4744-1-gang.li@linux.dev>
References: <20231208025240.4744-1-gang.li@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Thu, 07 Dec 2023 18:53:43 -0800 (PST)
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1784680635844591712
X-GMAIL-MSGID: 1784680635844591712

Optimize the initialization speed of 1G huge pages through
parallelization.

1G hugetlbs are allocated from bootmem, a process that is already very
fast and does not currently require optimization. Therefore, we focus on
parallelizing only the initialization phase in `gather_bootmem_prealloc`.

In the tests below, this patch reduces the measured time by 40%-50%.

        test          no patch(ms)   patched(ms)   saved
 ------------------- -------------- ------------- --------
  256c2t(4 node) 1G       2679           1582      40.95%
  128c1t(2 node) 1G       3160           1618      48.80%

Signed-off-by: Gang Li
---
 include/linux/hugetlb.h |  2 +-
 mm/hugetlb.c            | 40 +++++++++++++++++++++++++++++++++-------
 2 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index d3acecc5db4b3..ca94c43a63b84 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -178,7 +178,7 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 struct address_space *hugetlb_page_mapping_lock_write(struct page *hpage);

 extern int sysctl_hugetlb_shm_group;
-extern struct list_head huge_boot_pages;
+extern struct list_head huge_boot_pages[MAX_NUMNODES];

 /* arch callbacks */

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 033e359fdb86b..eb33cb15dce61 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -69,7 +69,7 @@ static bool hugetlb_cma_folio(struct folio *folio, unsigned int order)
 #endif
 static unsigned long hugetlb_cma_size __initdata;

-__initdata LIST_HEAD(huge_boot_pages);
+__initdata struct list_head huge_boot_pages[MAX_NUMNODES];

 /* for command line parsing */
 static struct hstate * __initdata parsed_hstate;
@@ -3331,7 +3331,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 		huge_page_size(h) - PAGE_SIZE);
 	/* Put them into a private list first because mem_map is not up yet */
 	INIT_LIST_HEAD(&m->list);
-	list_add(&m->list, &huge_boot_pages);
+	list_add(&m->list, &huge_boot_pages[node]);
 	m->hstate = h;
 	return 1;
 }
@@ -3382,8 +3382,6 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 	/* Send list for bulk vmemmap optimization processing */
 	hugetlb_vmemmap_optimize_folios(h, folio_list);

-	/* Add all new pool pages to free lists in one lock cycle */
-	spin_lock_irqsave(&hugetlb_lock, flags);
 	list_for_each_entry_safe(folio, tmp_f, folio_list, lru) {
 		if (!folio_test_hugetlb_vmemmap_optimized(folio)) {
 			/*
@@ -3396,23 +3394,27 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 					HUGETLB_VMEMMAP_RESERVE_PAGES,
 					pages_per_huge_page(h));
 		}
+		/* Subdivide locks to achieve better parallel performance */
+		spin_lock_irqsave(&hugetlb_lock, flags);
 		__prep_account_new_huge_page(h, folio_nid(folio));
 		enqueue_hugetlb_folio(h, folio);
+		spin_unlock_irqrestore(&hugetlb_lock, flags);
 	}
-	spin_unlock_irqrestore(&hugetlb_lock, flags);
 }

 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_ORDER) pages.
  */
-static void __init gather_bootmem_prealloc(void)
+static void __init __gather_bootmem_prealloc(unsigned long start, unsigned long end, void *arg)
+
 {
+	int nid = start;
 	LIST_HEAD(folio_list);
 	struct huge_bootmem_page *m;
 	struct hstate *h = NULL, *prev_h = NULL;

-	list_for_each_entry(m, &huge_boot_pages, list) {
+	list_for_each_entry(m, &huge_boot_pages[nid], list) {
 		struct page *page = virt_to_page(m);
 		struct folio *folio = (void *)page;

@@ -3445,6 +3447,22 @@ static void __init gather_bootmem_prealloc(void)
 	prep_and_add_bootmem_folios(h, &folio_list);
 }

+static void __init gather_bootmem_prealloc(void)
+{
+	struct padata_mt_job job = {
+		.thread_fn = __gather_bootmem_prealloc,
+		.fn_arg = NULL,
+		.start = 0,
+		.size = num_node_state(N_MEMORY),
+		.align = 1,
+		.min_chunk = 1,
+		.max_threads = num_node_state(N_MEMORY),
+		.numa_aware = true,
+	};
+
+	padata_do_multithreaded(&job);
+}
+
 static void __init hugetlb_hstate_alloc_pages_onenode(struct hstate *h, int nid)
 {
 	unsigned long i;
@@ -3597,6 +3615,14 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 		return;
 	}

+	/* hugetlb_hstate_alloc_pages will be called many times, init huge_boot_pages once*/
+	if (huge_boot_pages[0].next == NULL) {
+		int i = 0;
+
+		for (i = 0; i < MAX_NUMNODES; i++)
+			INIT_LIST_HEAD(&huge_boot_pages[i]);
+	}
+
 	/* do node specific alloc */
 	if (hugetlb_hstate_alloc_pages_node_specific(h))
 		return;
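
Note on the per-node list pattern used above: the patch turns the single
huge_boot_pages list into one list per node and initializes the array
lazily by testing whether huge_boot_pages[0].next is still NULL. The
following is a minimal user-space sketch of that idea, not kernel code.
The names SKETCH_MAX_NODES, boot_pages, init_list_head, list_add_head,
boot_pages_init_once, boot_pages_add and boot_pages_count are hypothetical
stand-ins introduced only for this illustration, and the list type is a
simplified re-implementation rather than the kernel's <linux/list.h>.

```c
/*
 * User-space sketch only. Every name here is a hypothetical stand-in,
 * not the kernel implementation.
 */
#include <stddef.h>

#define SKETCH_MAX_NODES 4	/* stand-in for MAX_NUMNODES */

struct list_head {
	struct list_head *next, *prev;
};

/* Static storage is zero-filled, so .next starts out as NULL. */
static struct list_head boot_pages[SKETCH_MAX_NODES];

static void init_list_head(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry right after head. */
static void list_add_head(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Initialize all per-node lists on the first call; later calls do nothing. */
static void boot_pages_init_once(void)
{
	if (boot_pages[0].next != NULL)
		return;
	for (int i = 0; i < SKETCH_MAX_NODES; i++)
		init_list_head(&boot_pages[i]);
}

/* Queue an entry on the list that belongs to node nid. */
static void boot_pages_add(struct list_head *entry, int nid)
{
	boot_pages_init_once();
	list_add_head(entry, &boot_pages[nid]);
}

/* Count the entries of one node; a per-node worker walks only this list. */
static size_t boot_pages_count(int nid)
{
	size_t n = 0;

	boot_pages_init_once();
	for (struct list_head *p = boot_pages[nid].next;
	     p != &boot_pages[nid]; p = p->next)
		n++;
	return n;
}
```

Because each node has its own list, a worker that handles node nid never
touches the entries of another node, so only the final step that adds
pages to shared pool state still needs a common lock. That is the
motivation the patch gives for taking hugetlb_lock per folio instead of
around the whole loop.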