From patchwork Mon Jul 24 13:46:41 2023
X-Patchwork-Submitter: Usama Arif <usama.arif@bytedance.com>
X-Patchwork-Id: 125026
From: Usama Arif <usama.arif@bytedance.com>
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com,
    rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, fam.zheng@bytedance.com,
    liangma@liangbit.com, simon.evans@bytedance.com,
    punit.agrawal@bytedance.com, Usama Arif <usama.arif@bytedance.com>
Subject: [RFC 1/4] mm/hugetlb: Skip prep of tail pages when HVO is enabled
Date: Mon, 24 Jul 2023 14:46:41 +0100
Message-Id: <20230724134644.1299963-2-usama.arif@bytedance.com>
In-Reply-To: <20230724134644.1299963-1-usama.arif@bytedance.com>
References: <20230724134644.1299963-1-usama.arif@bytedance.com>

When the vmemmap is optimizable, hugetlb_vmemmap_optimize will free all
the duplicated tail pages while preparing the new hugepage, so there is
no need to prepare them in the first place. For 1G x86 hugepages, this
avoids preparing 262144 - 64 = 262080 struct pages per hugepage.

Signed-off-by: Usama Arif <usama.arif@bytedance.com>
---
 mm/hugetlb.c         | 30 +++++++++++++++++++++---------
 mm/hugetlb_vmemmap.c |  2 +-
 mm/hugetlb_vmemmap.h |  1 +
 3 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 64a3239b6407..24352abbb9e5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1943,13 +1943,22 @@ static void prep_new_hugetlb_folio(struct hstate *h, struct folio *folio, int ni
 }
 
 static bool __prep_compound_gigantic_folio(struct folio *folio,
-				unsigned int order, bool demote)
+				unsigned int order, bool demote,
+				bool hugetlb_vmemmap_optimizable)
 {
 	int i, j;
 	int nr_pages = 1 << order;
 	struct page *p;
 
 	__folio_clear_reserved(folio);
+
+	/*
+	 * No need to prep pages that will be freed later by hugetlb_vmemmap_optimize
+	 * in prep_new_huge_page. Hence, reduce nr_pages to the pages that will be kept.
+	 */
+	if (hugetlb_vmemmap_optimizable)
+		nr_pages = HUGETLB_VMEMMAP_RESERVE_SIZE / sizeof(struct page);
+
 	for (i = 0; i < nr_pages; i++) {
 		p = folio_page(folio, i);
@@ -2020,15 +2029,15 @@ static bool __prep_compound_gigantic_folio(struct folio *folio,
 }
 
 static bool prep_compound_gigantic_folio(struct folio *folio,
-				unsigned int order)
+				unsigned int order, bool hugetlb_vmemmap_optimizable)
 {
-	return __prep_compound_gigantic_folio(folio, order, false);
+	return __prep_compound_gigantic_folio(folio, order, false, hugetlb_vmemmap_optimizable);
 }
 
 static bool prep_compound_gigantic_folio_for_demote(struct folio *folio,
-				unsigned int order)
+				unsigned int order, bool hugetlb_vmemmap_optimizable)
 {
-	return __prep_compound_gigantic_folio(folio, order, true);
+	return __prep_compound_gigantic_folio(folio, order, true, hugetlb_vmemmap_optimizable);
 }
 
 /*
@@ -2185,7 +2194,8 @@ static struct folio *alloc_fresh_hugetlb_folio(struct hstate *h,
 	if (!folio)
 		return NULL;
 	if (hstate_is_gigantic(h)) {
-		if (!prep_compound_gigantic_folio(folio, huge_page_order(h))) {
+		if (!prep_compound_gigantic_folio(folio, huge_page_order(h),
+				vmemmap_should_optimize(h, &folio->page))) {
 			/*
 			 * Rare failure to convert pages to compound page.
 			 * Free pages and try again - ONCE!
@@ -3201,7 +3211,8 @@ static void __init gather_bootmem_prealloc(void)
 		VM_BUG_ON(!hstate_is_gigantic(h));
 		WARN_ON(folio_ref_count(folio) != 1);
-		if (prep_compound_gigantic_folio(folio, huge_page_order(h))) {
+		if (prep_compound_gigantic_folio(folio, huge_page_order(h),
+					vmemmap_should_optimize(h, page))) {
 			WARN_ON(folio_test_reserved(folio));
 			prep_new_hugetlb_folio(h, folio, folio_nid(folio));
 			free_huge_page(page);	/* add to the hugepage allocator */
@@ -3624,8 +3635,9 @@ static int demote_free_hugetlb_folio(struct hstate *h, struct folio *folio)
 		subpage = folio_page(folio, i);
 		inner_folio = page_folio(subpage);
 		if (hstate_is_gigantic(target_hstate))
-			prep_compound_gigantic_folio_for_demote(inner_folio,
-							target_hstate->order);
+			prep_compound_gigantic_folio_for_demote(folio,
+							target_hstate->order,
+							vmemmap_should_optimize(target_hstate, subpage));
 		else
 			prep_compound_page(subpage, target_hstate->order);
 		folio_change_private(inner_folio, NULL);
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index c2007ef5e9b0..b721e87de2b3 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -486,7 +486,7 @@ int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head)
 }
 
 /* Return true iff a HugeTLB whose vmemmap should and can be optimized. */
-static bool vmemmap_should_optimize(const struct hstate *h, const struct page *head)
+bool vmemmap_should_optimize(const struct hstate *h, const struct page *head)
 {
 	if (!READ_ONCE(vmemmap_optimize_enabled))
 		return false;
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 25bd0e002431..3525c514c061 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -57,4 +57,5 @@ static inline bool hugetlb_vmemmap_optimizable(const struct hstate *h)
 {
 	return hugetlb_vmemmap_optimizable_size(h) != 0;
 }
+bool vmemmap_should_optimize(const struct hstate *h, const struct page *head);
 #endif /* _LINUX_HUGETLB_VMEMMAP_H */
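[Editorial sketch] As a back-of-the-envelope check of the numbers in the
commit message above, the snippet below models the saving in plain
userspace C (not kernel code). It assumes 4 KiB base pages, a 64-byte
struct page, and a one-base-page HUGETLB_VMEMMAP_RESERVE_SIZE, which
match x86-64 at the time of this series.

#include <stdio.h>

#define PAGE_SIZE		4096UL
#define STRUCT_PAGE_SIZE	64UL		/* assumed sizeof(struct page) */
#define RESERVE_SIZE		PAGE_SIZE	/* HUGETLB_VMEMMAP_RESERVE_SIZE */

int main(void)
{
	unsigned long order = 18;		/* 1G hugepage = 2^18 4K pages */
	unsigned long nr_pages = 1UL << order;	/* 262144 */
	unsigned long kept = RESERVE_SIZE / STRUCT_PAGE_SIZE;	/* 64 */

	/* Matches the 262144 - 64 = 262080 figure in the commit message. */
	printf("struct pages prepped: %lu\n", kept);
	printf("struct pages skipped: %lu\n", nr_pages - kept);
	return 0;
}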
From patchwork Mon Jul 24 13:46:42 2023
X-Patchwork-Submitter: Usama Arif <usama.arif@bytedance.com>
X-Patchwork-Id: 125038
From: Usama Arif <usama.arif@bytedance.com>
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com,
    rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, fam.zheng@bytedance.com,
    liangma@liangbit.com, simon.evans@bytedance.com,
    punit.agrawal@bytedance.com, Usama Arif <usama.arif@bytedance.com>
Subject: [RFC 2/4] mm/memblock: Add hugepage_size member to struct memblock_region
Date: Mon, 24 Jul 2023 14:46:42 +0100
Message-Id: <20230724134644.1299963-3-usama.arif@bytedance.com>
In-Reply-To: <20230724134644.1299963-1-usama.arif@bytedance.com>
References: <20230724134644.1299963-1-usama.arif@bytedance.com>

This propagates the hugepage size through the memblock APIs
(memblock_alloc_try_nid_raw and memblock_alloc_range_nid) so that it
can be stored in struct memblock_region. It introduces no functional
change: hugepage_size is not used in this commit. It is just setup for
a later commit in the series, where hugepage_size is used to skip the
initialization of struct pages that will be freed later when HVO is
enabled.
Signed-off-by: Usama Arif <usama.arif@bytedance.com>
---
 arch/arm64/mm/kasan_init.c                   |  2 +-
 arch/powerpc/platforms/pasemi/iommu.c        |  2 +-
 arch/powerpc/platforms/pseries/setup.c       |  4 +-
 arch/powerpc/sysdev/dart_iommu.c             |  2 +-
 include/linux/memblock.h                     |  8 ++-
 mm/cma.c                                     |  4 +-
 mm/hugetlb.c                                 |  6 +-
 mm/memblock.c                                | 60 ++++++++++++--------
 mm/mm_init.c                                 |  2 +-
 mm/sparse-vmemmap.c                          |  2 +-
 tools/testing/memblock/tests/alloc_nid_api.c |  2 +-
 11 files changed, 56 insertions(+), 38 deletions(-)

diff --git a/arch/arm64/mm/kasan_init.c b/arch/arm64/mm/kasan_init.c
index f17d066e85eb..39992a418891 100644
--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ -50,7 +50,7 @@ static phys_addr_t __init kasan_alloc_raw_page(int node)
 	void *p = memblock_alloc_try_nid_raw(PAGE_SIZE, PAGE_SIZE,
 					__pa(MAX_DMA_ADDRESS),
 					MEMBLOCK_ALLOC_NOLEAKTRACE,
-					node);
+					node, 0);
 	if (!p)
 		panic("%s: Failed to allocate %lu bytes align=0x%lx nid=%d from=%llx\n",
 		      __func__, PAGE_SIZE, PAGE_SIZE, node,
diff --git a/arch/powerpc/platforms/pasemi/iommu.c b/arch/powerpc/platforms/pasemi/iommu.c
index 375487cba874..6963cdf76bce 100644
--- a/arch/powerpc/platforms/pasemi/iommu.c
+++ b/arch/powerpc/platforms/pasemi/iommu.c
@@ -201,7 +201,7 @@ static int __init iob_init(struct device_node *dn)
 	/* For 2G space, 8x64 pages (2^21 bytes) is max total l2 size */
 	iob_l2_base = memblock_alloc_try_nid_raw(1UL << 21, 1UL << 21,
 					MEMBLOCK_LOW_LIMIT, 0x80000000,
-					NUMA_NO_NODE);
+					NUMA_NO_NODE, 0);
 	if (!iob_l2_base)
 		panic("%s: Failed to allocate %lu bytes align=0x%lx max_addr=%x\n",
 		      __func__, 1UL << 21, 1UL << 21, 0x80000000);
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index e2a57cfa6c83..cec7198b59d2 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -160,7 +160,7 @@ static void __init fwnmi_init(void)
 	 */
 	mce_data_buf = memblock_alloc_try_nid_raw(RTAS_ERROR_LOG_MAX * nr_cpus,
 					RTAS_ERROR_LOG_MAX, MEMBLOCK_LOW_LIMIT,
-					ppc64_rma_size, NUMA_NO_NODE);
+					ppc64_rma_size, NUMA_NO_NODE, 0);
 	if (!mce_data_buf)
 		panic("Failed to allocate %d bytes below %pa for MCE buffer\n",
 		      RTAS_ERROR_LOG_MAX * nr_cpus, &ppc64_rma_size);
@@ -176,7 +176,7 @@ static void __init fwnmi_init(void)
 	size = sizeof(struct slb_entry) * mmu_slb_size * nr_cpus;
 	slb_ptr = memblock_alloc_try_nid_raw(size, sizeof(struct slb_entry),
 					MEMBLOCK_LOW_LIMIT,
-					ppc64_rma_size, NUMA_NO_NODE);
+					ppc64_rma_size, NUMA_NO_NODE, 0);
 	if (!slb_ptr)
 		panic("Failed to allocate %zu bytes below %pa for slb area\n",
 		      size, &ppc64_rma_size);
diff --git a/arch/powerpc/sysdev/dart_iommu.c b/arch/powerpc/sysdev/dart_iommu.c
index 98096bbfd62e..86c676b61899 100644
--- a/arch/powerpc/sysdev/dart_iommu.c
+++ b/arch/powerpc/sysdev/dart_iommu.c
@@ -239,7 +239,7 @@ static void __init allocate_dart(void)
 	 */
 	dart_tablebase = memblock_alloc_try_nid_raw(SZ_16M, SZ_16M,
 					MEMBLOCK_LOW_LIMIT, SZ_2G,
-					NUMA_NO_NODE);
+					NUMA_NO_NODE, 0);
 	if (!dart_tablebase)
 		panic("Failed to allocate 16MB below 2GB for DART table\n");
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index f71ff9f0ec81..bb8019540d73 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -63,6 +63,7 @@ struct memblock_region {
 #ifdef CONFIG_NUMA
 	int nid;
 #endif
+	phys_addr_t hugepage_size;
 };
 
 /**
@@ -400,7 +401,8 @@ phys_addr_t memblock_phys_alloc_range(phys_addr_t size, phys_addr_t align,
 				      phys_addr_t start, phys_addr_t end);
 phys_addr_t memblock_alloc_range_nid(phys_addr_t size,
 				     phys_addr_t align, phys_addr_t start,
-				     phys_addr_t end, int nid, bool exact_nid);
+				     phys_addr_t end, int nid, bool exact_nid,
+				     phys_addr_t hugepage_size);
 phys_addr_t memblock_phys_alloc_try_nid(phys_addr_t size, phys_addr_t align, int nid);
 
 static __always_inline phys_addr_t memblock_phys_alloc(phys_addr_t size,
@@ -415,7 +417,7 @@ void *memblock_alloc_exact_nid_raw(phys_addr_t size, phys_addr_t align,
 				   int nid);
 void *memblock_alloc_try_nid_raw(phys_addr_t size, phys_addr_t align,
 				 phys_addr_t min_addr, phys_addr_t max_addr,
-				 int nid);
+				 int nid, phys_addr_t hugepage_size);
 void *memblock_alloc_try_nid(phys_addr_t size, phys_addr_t align,
 			     phys_addr_t min_addr, phys_addr_t max_addr,
 			     int nid);
@@ -431,7 +433,7 @@ static inline void *memblock_alloc_raw(phys_addr_t size,
 {
 	return memblock_alloc_try_nid_raw(size, align, MEMBLOCK_LOW_LIMIT,
 					  MEMBLOCK_ALLOC_ACCESSIBLE,
-					  NUMA_NO_NODE);
+					  NUMA_NO_NODE, 0);
 }
 
 static inline void *memblock_alloc_from(phys_addr_t size,
diff --git a/mm/cma.c b/mm/cma.c
index a4cfe995e11e..a270905aa7f2 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -334,7 +334,7 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 		if (!memblock_bottom_up() && memblock_end >= SZ_4G + size) {
 			memblock_set_bottom_up(true);
 			addr = memblock_alloc_range_nid(size, alignment, SZ_4G,
-							limit, nid, true);
+							limit, nid, true, 0);
 			memblock_set_bottom_up(false);
 		}
 #endif
@@ -353,7 +353,7 @@ int __init cma_declare_contiguous_nid(phys_addr_t base,
 
 		if (!addr) {
 			addr = memblock_alloc_range_nid(size, alignment, base,
-					limit, nid, true);
+					limit, nid, true, 0);
 			if (!addr) {
 				ret = -ENOMEM;
 				goto err;
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 24352abbb9e5..5ba7fd702458 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3168,7 +3168,8 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	/* do node specific alloc */
 	if (nid != NUMA_NO_NODE) {
 		m = memblock_alloc_try_nid_raw(huge_page_size(h), huge_page_size(h),
-				0, MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+				0, MEMBLOCK_ALLOC_ACCESSIBLE, nid,
+				hugetlb_vmemmap_optimizable(h) ? huge_page_size(h) : 0);
 		if (!m)
 			return 0;
 		goto found;
@@ -3177,7 +3178,8 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	for_each_node_mask_to_alloc(h, nr_nodes, node, &node_states[N_MEMORY]) {
 		m = memblock_alloc_try_nid_raw(
 				huge_page_size(h), huge_page_size(h),
-				0, MEMBLOCK_ALLOC_ACCESSIBLE, node);
+				0, MEMBLOCK_ALLOC_ACCESSIBLE, node,
+				hugetlb_vmemmap_optimizable(h) ? huge_page_size(h) : 0);
 		/*
 		 * Use the beginning of the huge page to store the
 		 * huge_bootmem_page struct (until gather_bootmem
diff --git a/mm/memblock.c b/mm/memblock.c
index f9e61e565a53..e92d437bcb51 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -549,7 +549,8 @@ static void __init_memblock memblock_insert_region(struct memblock_type *type,
 						   int idx, phys_addr_t base,
 						   phys_addr_t size,
 						   int nid,
-						   enum memblock_flags flags)
+						   enum memblock_flags flags,
+						   phys_addr_t hugepage_size)
 {
 	struct memblock_region *rgn = &type->regions[idx];
 
@@ -558,6 +559,7 @@ static void __init_memblock memblock_insert_region(struct memblock_type *type,
 	rgn->base = base;
 	rgn->size = size;
 	rgn->flags = flags;
+	rgn->hugepage_size = hugepage_size;
 	memblock_set_region_node(rgn, nid);
 	type->cnt++;
 	type->total_size += size;
@@ -581,7 +583,7 @@ static void __init_memblock memblock_insert_region(struct memblock_type *type,
  */
 static int __init_memblock memblock_add_range(struct memblock_type *type,
 				phys_addr_t base, phys_addr_t size,
-				int nid, enum memblock_flags flags)
+				int nid, enum memblock_flags flags, phys_addr_t hugepage_size)
 {
 	bool insert = false;
 	phys_addr_t obase = base;
@@ -598,6 +600,7 @@ static int __init_memblock memblock_add_range(struct memblock_type *type,
 		type->regions[0].base = base;
 		type->regions[0].size = size;
 		type->regions[0].flags = flags;
+		type->regions[0].hugepage_size = hugepage_size;
 		memblock_set_region_node(&type->regions[0], nid);
 		type->total_size = size;
 		return 0;
@@ -646,7 +649,7 @@ static int __init_memblock memblock_add_range(struct memblock_type *type,
 				end_rgn = idx + 1;
 			memblock_insert_region(type, idx++, base,
 					       rbase - base, nid,
-					       flags);
+					       flags, hugepage_size);
 		}
 	}
 	/* area below @rend is dealt with, forget about it */
@@ -661,7 +664,7 @@ static int __init_memblock memblock_add_range(struct memblock_type *type,
 			start_rgn = idx;
 		end_rgn = idx + 1;
 		memblock_insert_region(type, idx, base, end - base,
-				       nid, flags);
+				       nid, flags, hugepage_size);
 	}
 }
@@ -705,7 +708,7 @@ int __init_memblock memblock_add_node(phys_addr_t base, phys_addr_t size,
 	memblock_dbg("%s: [%pa-%pa] nid=%d flags=%x %pS\n", __func__,
 		     &base, &end, nid, flags, (void *)_RET_IP_);
 
-	return memblock_add_range(&memblock.memory, base, size, nid, flags);
+	return memblock_add_range(&memblock.memory, base, size, nid, flags, 0);
 }
 
 /**
@@ -726,7 +729,7 @@ int __init_memblock memblock_add(phys_addr_t base, phys_addr_t size)
 	memblock_dbg("%s: [%pa-%pa] %pS\n", __func__,
 		     &base, &end, (void *)_RET_IP_);
 
-	return memblock_add_range(&memblock.memory, base, size, MAX_NUMNODES, 0);
+	return memblock_add_range(&memblock.memory, base, size, MAX_NUMNODES, 0, 0);
 }
 
 /**
@@ -782,7 +785,7 @@ static int __init_memblock memblock_isolate_range(struct memblock_type *type,
 			type->total_size -= base - rbase;
 			memblock_insert_region(type, idx, rbase, base - rbase,
 					       memblock_get_region_node(rgn),
-					       rgn->flags);
+					       rgn->flags, 0);
 		} else if (rend > end) {
 			/*
 			 * @rgn intersects from above.  Split and redo the
@@ -793,7 +796,7 @@ static int __init_memblock memblock_isolate_range(struct memblock_type *type,
 			type->total_size -= end - rbase;
 			memblock_insert_region(type, idx--, rbase, end - rbase,
 					       memblock_get_region_node(rgn),
-					       rgn->flags);
+					       rgn->flags, 0);
 		} else {
 			/* @rgn is fully contained, record it */
 			if (!*end_rgn)
@@ -863,14 +866,20 @@ int __init_memblock memblock_phys_free(phys_addr_t base, phys_addr_t size)
 	return memblock_remove_range(&memblock.reserved, base, size);
 }
 
-int __init_memblock memblock_reserve(phys_addr_t base, phys_addr_t size)
+int __init_memblock memblock_reserve_huge(phys_addr_t base, phys_addr_t size,
+					  phys_addr_t hugepage_size)
 {
 	phys_addr_t end = base + size - 1;
 
 	memblock_dbg("%s: [%pa-%pa] %pS\n", __func__,
 		     &base, &end, (void *)_RET_IP_);
 
-	return memblock_add_range(&memblock.reserved, base, size, MAX_NUMNODES, 0);
+	return memblock_add_range(&memblock.reserved, base, size, MAX_NUMNODES, 0, hugepage_size);
+}
+
+int __init_memblock memblock_reserve(phys_addr_t base, phys_addr_t size)
+{
+	return memblock_reserve_huge(base, size, 0);
 }
 
 #ifdef CONFIG_HAVE_MEMBLOCK_PHYS_MAP
@@ -881,7 +890,7 @@ int __init_memblock memblock_physmem_add(phys_addr_t base, phys_addr_t size)
 	memblock_dbg("%s: [%pa-%pa] %pS\n", __func__,
 		     &base, &end, (void *)_RET_IP_);
 
-	return memblock_add_range(&physmem, base, size, MAX_NUMNODES, 0);
+	return memblock_add_range(&physmem, base, size, MAX_NUMNODES, 0, 0);
 }
 #endif
@@ -1365,6 +1374,7 @@ __next_mem_pfn_range_in_zone(u64 *idx, struct zone *zone,
  * @end: the upper bound of the memory region to allocate (phys address)
  * @nid: nid of the free area to find, %NUMA_NO_NODE for any node
  * @exact_nid: control the allocation fall back to other nodes
+ * @hugepage_size: size of the hugepages in bytes
  *
  * The allocation is performed from memory region limited by
  * memblock.current_limit if @end == %MEMBLOCK_ALLOC_ACCESSIBLE.
@@ -1385,7 +1395,7 @@ __next_mem_pfn_range_in_zone(u64 *idx, struct zone *zone,
 phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
 					phys_addr_t align, phys_addr_t start,
 					phys_addr_t end, int nid,
-					bool exact_nid)
+					bool exact_nid, phys_addr_t hugepage_size)
 {
 	enum memblock_flags flags = choose_memblock_flags();
 	phys_addr_t found;
@@ -1402,14 +1412,14 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
 again:
 	found = memblock_find_in_range_node(size, align, start, end, nid,
 					    flags);
-	if (found && !memblock_reserve(found, size))
+	if (found && !memblock_reserve_huge(found, size, hugepage_size))
 		goto done;
 
 	if (nid != NUMA_NO_NODE && !exact_nid) {
 		found = memblock_find_in_range_node(size, align, start,
 						    end, NUMA_NO_NODE,
 						    flags);
-		if (found && !memblock_reserve(found, size))
+		if (found && !memblock_reserve_huge(found, size, hugepage_size))
 			goto done;
 	}
@@ -1469,7 +1479,7 @@ phys_addr_t __init memblock_phys_alloc_range(phys_addr_t size,
 		     __func__, (u64)size, (u64)align, &start, &end,
 		     (void *)_RET_IP_);
 	return memblock_alloc_range_nid(size, align, start, end, NUMA_NO_NODE,
-					false);
+					false, 0);
 }
 
 /**
@@ -1488,7 +1498,7 @@ phys_addr_t __init memblock_phys_alloc_range(phys_addr_t size,
 phys_addr_t __init memblock_phys_alloc_try_nid(phys_addr_t size, phys_addr_t align, int nid)
 {
 	return memblock_alloc_range_nid(size, align, 0,
-					MEMBLOCK_ALLOC_ACCESSIBLE, nid, false);
+					MEMBLOCK_ALLOC_ACCESSIBLE, nid, false, 0);
 }
 
 /**
@@ -1514,7 +1524,7 @@ phys_addr_t __init memblock_phys_alloc_try_nid(phys_addr_t size, phys_addr_t ali
 static void * __init memblock_alloc_internal(
 				phys_addr_t size, phys_addr_t align,
 				phys_addr_t min_addr, phys_addr_t max_addr,
-				int nid, bool exact_nid)
+				int nid, bool exact_nid, phys_addr_t hugepage_size)
 {
 	phys_addr_t alloc;
@@ -1530,12 +1540,12 @@ static void * __init memblock_alloc_internal(
 		max_addr = memblock.current_limit;
 
 	alloc = memblock_alloc_range_nid(size, align, min_addr, max_addr, nid,
-					 exact_nid);
+					 exact_nid, hugepage_size);
 
 	/* retry allocation without lower limit */
 	if (!alloc && min_addr)
 		alloc = memblock_alloc_range_nid(size, align, 0, max_addr, nid,
-						 exact_nid);
+						 exact_nid, hugepage_size);
 
 	if (!alloc)
 		return NULL;
@@ -1571,7 +1581,7 @@ void * __init memblock_alloc_exact_nid_raw(
 		     &max_addr, (void *)_RET_IP_);
 
 	return memblock_alloc_internal(size, align, min_addr, max_addr, nid,
-				       true);
+				       true, 0);
 }
 
 /**
@@ -1585,25 +1595,29 @@ void * __init memblock_alloc_exact_nid_raw(
 * is preferred (phys address), or %MEMBLOCK_ALLOC_ACCESSIBLE to
 * allocate only from memory limited by memblock.current_limit value
 * @nid: nid of the free area to find, %NUMA_NO_NODE for any node
+ * @hugepage_size: size of the hugepages in bytes
 *
 * Public function, provides additional debug information (including caller
 * info), if enabled. Does not zero allocated memory, does not panic if request
 * cannot be satisfied.
 *
+ * If hugepage_size is not 0 and HVO is enabled, then only the struct pages
+ * that are not freed by HVO are initialized using the hugepage_size parameter.
+ *
 * Return:
 * Virtual address of allocated memory block on success, NULL on failure.
 */
 void * __init memblock_alloc_try_nid_raw(
 			phys_addr_t size, phys_addr_t align,
 			phys_addr_t min_addr, phys_addr_t max_addr,
-			int nid)
+			int nid, phys_addr_t hugepage_size)
 {
 	memblock_dbg("%s: %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa %pS\n",
 		     __func__, (u64)size, (u64)align, nid, &min_addr,
 		     &max_addr, (void *)_RET_IP_);
 
 	return memblock_alloc_internal(size, align, min_addr, max_addr, nid,
-				       false);
+				       false, hugepage_size);
 }
 
 /**
@@ -1634,7 +1648,7 @@ void * __init memblock_alloc_try_nid(
 		     __func__, (u64)size, (u64)align, nid, &min_addr,
 		     &max_addr, (void *)_RET_IP_);
 	ptr = memblock_alloc_internal(size, align,
-					   min_addr, max_addr, nid, false);
+					   min_addr, max_addr, nid, false, 0);
 	if (ptr)
 		memset(ptr, 0, size);
diff --git a/mm/mm_init.c b/mm/mm_init.c
index a1963c3322af..c36d768bb671 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1615,7 +1615,7 @@ void __init *memmap_alloc(phys_addr_t size, phys_addr_t align,
 	else
 		ptr = memblock_alloc_try_nid_raw(size, align, min_addr,
 						 MEMBLOCK_ALLOC_ACCESSIBLE,
-						 nid);
+						 nid, 0);
 
 	if (ptr && size > 0)
 		page_init_poison(ptr, size);
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index a044a130405b..56b8b8e684df 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -43,7 +43,7 @@ static void * __ref __earlyonly_bootmem_alloc(int node,
 				unsigned long goal)
 {
 	return memblock_alloc_try_nid_raw(size, align, goal,
-					  MEMBLOCK_ALLOC_ACCESSIBLE, node);
+					  MEMBLOCK_ALLOC_ACCESSIBLE, node, 0);
 }
 
 void * __meminit vmemmap_alloc_block(unsigned long size, int node)
diff --git a/tools/testing/memblock/tests/alloc_nid_api.c b/tools/testing/memblock/tests/alloc_nid_api.c
index 49bb416d34ff..225044366fbb 100644
--- a/tools/testing/memblock/tests/alloc_nid_api.c
+++ b/tools/testing/memblock/tests/alloc_nid_api.c
@@ -43,7 +43,7 @@ static inline void *run_memblock_alloc_nid(phys_addr_t size,
 					   max_addr, nid);
 	if (alloc_nid_test_flags & TEST_F_RAW)
 		return memblock_alloc_try_nid_raw(size, align, min_addr,
-						  max_addr, nid);
+						  max_addr, nid, 0);
 	return memblock_alloc_try_nid(size, align, min_addr,
 				      max_addr, nid);
 }
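[Editorial sketch] A minimal userspace model of what the new member buys:
two physically contiguous reservations may merge only if their
hugepage_size tags also match, mirroring the extra check the last patch
of this series adds to memblock_merge_regions(). The types, names and
addresses below are illustrative, not the kernel's.

#include <stdbool.h>
#include <stdio.h>

typedef unsigned long long phys_addr_t;

struct region {
	phys_addr_t base;
	phys_addr_t size;
	phys_addr_t hugepage_size;	/* 0 for ordinary reservations */
};

/* Contiguity alone is no longer enough; the hugepage tag must match too. */
static bool can_merge(const struct region *this, const struct region *next)
{
	return this->base + this->size == next->base &&
	       this->hugepage_size == next->hugepage_size;
}

int main(void)
{
	struct region plain = { 0x40000000ULL, 0x40000000ULL, 0 };
	struct region huge  = { 0x80000000ULL, 0x40000000ULL, 1ULL << 30 };

	/* Contiguous, but one holds 1G hugepages: must stay separate. */
	printf("merge? %s\n", can_merge(&plain, &huge) ? "yes" : "no");
	return 0;
}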
From patchwork Mon Jul 24 13:46:43 2023
X-Patchwork-Submitter: Usama Arif <usama.arif@bytedance.com>
X-Patchwork-Id: 125014
From: Usama Arif <usama.arif@bytedance.com>
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com,
    rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, fam.zheng@bytedance.com,
    liangma@liangbit.com, simon.evans@bytedance.com,
    punit.agrawal@bytedance.com, Usama Arif <usama.arif@bytedance.com>
Subject: [RFC 3/4] mm/hugetlb_vmemmap: Use nid of the head page to reallocate it
Date: Mon, 24 Jul 2023 14:46:43 +0100
Message-Id: <20230724134644.1299963-4-usama.arif@bytedance.com>
In-Reply-To: <20230724134644.1299963-1-usama.arif@bytedance.com>
References: <20230724134644.1299963-1-usama.arif@bytedance.com>

If tail page prep and initialization are skipped, then the "start" page
will not contain the correct nid. Use the nid from the first vmemmap
page instead.

Signed-off-by: Usama Arif <usama.arif@bytedance.com>
---
 mm/hugetlb_vmemmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index b721e87de2b3..bdf750a4786b 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -324,7 +324,7 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
 		.reuse_addr	= reuse,
 		.vmemmap_pages	= &vmemmap_pages,
 	};
-	int nid = page_to_nid((struct page *)start);
+	int nid = page_to_nid((struct page *)reuse);
 	gfp_t gfp_mask = GFP_KERNEL | __GFP_THISNODE | __GFP_NORETRY |
 			__GFP_NOWARN;
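[Editorial sketch] A userspace illustration of why "reuse" is the safe
place to read the nid from once patch 1 is applied: only the head struct
pages are prepped, so a struct page at or past "start" may hold
unwritten data. The fake_page type, the poison pattern, and the
hardcoded nid are illustrative only; in the kernel the nid lives in the
page flags, not in a dedicated field.

#include <stdio.h>
#include <string.h>

struct fake_page { int nid; };	/* stand-in for struct page */

int main(void)
{
	static struct fake_page vmemmap[262144];	/* 1G hugepage's memmap */

	/* Simulate patch 1: only the first 64 struct pages get prepped. */
	memset(vmemmap, 0xff, sizeof(vmemmap));		/* unwritten memory */
	for (int i = 0; i < 64; i++)
		vmemmap[i].nid = 1;			/* real nid */

	struct fake_page *reuse = &vmemmap[0];	/* head page: initialized */
	struct fake_page *start = &vmemmap[64];	/* first to-be-freed tail */

	printf("nid via reuse: %d (valid)\n", reuse->nid);
	printf("nid via start: %d (garbage)\n", start->nid);
	return 0;
}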
From patchwork Mon Jul 24 13:46:44 2023
X-Patchwork-Submitter: Usama Arif <usama.arif@bytedance.com>
X-Patchwork-Id: 125021
From: Usama Arif <usama.arif@bytedance.com>
To: linux-mm@kvack.org, muchun.song@linux.dev, mike.kravetz@oracle.com,
    rppt@kernel.org
Cc: linux-kernel@vger.kernel.org, fam.zheng@bytedance.com,
    liangma@liangbit.com, simon.evans@bytedance.com,
    punit.agrawal@bytedance.com, Usama Arif <usama.arif@bytedance.com>
Subject: [RFC 4/4] mm/memblock: Skip initialization of struct pages freed later by HVO
Date: Mon, 24 Jul 2023 14:46:44 +0100
Message-Id: <20230724134644.1299963-5-usama.arif@bytedance.com>
In-Reply-To: <20230724134644.1299963-1-usama.arif@bytedance.com>
References: <20230724134644.1299963-1-usama.arif@bytedance.com>

If a reserved region is for hugepages and HVO is enabled, then the
struct pages which will be freed later by HVO don't need to be
initialized. This can save significant time when a large number of
hugepages are allocated at boot time. As memmap_init_reserved_pages is
only called at boot time, we don't need to worry about memory hotplug.

Hugepage regions are kept separate from non-hugepage regions in
memblock_merge_regions, so that the initialization of unused struct
pages can be skipped for the entire region.
Signed-off-by: Usama Arif <usama.arif@bytedance.com>
---
 mm/hugetlb_vmemmap.c |  2 +-
 mm/hugetlb_vmemmap.h |  3 +++
 mm/memblock.c        | 27 ++++++++++++++++++++++-----
 3 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index bdf750a4786b..b5b7834e0f42 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -443,7 +443,7 @@ static int vmemmap_remap_alloc(unsigned long start, unsigned long end,
 DEFINE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key);
 EXPORT_SYMBOL(hugetlb_optimize_vmemmap_key);
 
-static bool vmemmap_optimize_enabled = IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON);
+bool vmemmap_optimize_enabled = IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON);
 core_param(hugetlb_free_vmemmap, vmemmap_optimize_enabled, bool, 0);
 
 /**
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 3525c514c061..8b9a1563f7b9 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -58,4 +58,7 @@ static inline bool hugetlb_vmemmap_optimizable(const struct hstate *h)
 	return hugetlb_vmemmap_optimizable_size(h) != 0;
 }
 bool vmemmap_should_optimize(const struct hstate *h, const struct page *head);
+
+extern bool vmemmap_optimize_enabled;
+
 #endif /* _LINUX_HUGETLB_VMEMMAP_H */
diff --git a/mm/memblock.c b/mm/memblock.c
index e92d437bcb51..62072a0226de 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -21,6 +21,7 @@
 #include <linux/io.h>
 
 #include "internal.h"
+#include "hugetlb_vmemmap.h"
 
 #define INIT_MEMBLOCK_REGIONS			128
 #define INIT_PHYSMEM_REGIONS			4
@@ -519,7 +520,8 @@ static void __init_memblock memblock_merge_regions(struct memblock_type *type,
 		if (this->base + this->size != next->base ||
 		    memblock_get_region_node(this) !=
 		    memblock_get_region_node(next) ||
-		    this->flags != next->flags) {
+		    this->flags != next->flags ||
+		    this->hugepage_size != next->hugepage_size) {
 			BUG_ON(this->base + this->size > next->base);
 			i++;
 			continue;
@@ -2125,10 +2127,25 @@ static void __init memmap_init_reserved_pages(void)
 	/* initialize struct pages for the reserved regions */
 	for_each_reserved_mem_region(region) {
 		nid = memblock_get_region_node(region);
-		start = region->base;
-		end = start + region->size;
-
-		reserve_bootmem_region(start, end, nid);
+		/*
+		 * If the region is for hugepages and if HVO is enabled, then those
+		 * struct pages which will be freed later don't need to be initialized.
+		 * This can save significant time when a large number of hugepages are
+		 * allocated at boot time. As this is at boot time, we don't need to
+		 * worry about memory hotplug.
+		 */
+		if (region->hugepage_size && vmemmap_optimize_enabled) {
+			for (start = region->base;
+			     start < region->base + region->size;
+			     start += region->hugepage_size) {
+				end = start + HUGETLB_VMEMMAP_RESERVE_SIZE * sizeof(struct page);
+				reserve_bootmem_region(start, end, nid);
+			}
+		} else {
+			start = region->base;
+			end = start + region->size;
+			reserve_bootmem_region(start, end, nid);
+		}
 	}
 }
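[Editorial sketch] To size the effect of the stride loop added to
memmap_init_reserved_pages() above, the snippet below replays its
arithmetic in userspace. It assumes 4 KiB base pages, a 64-byte struct
page, a one-base-page HUGETLB_VMEMMAP_RESERVE_SIZE, and a hypothetical
10G boot-time reservation of 1G hugepages; it is arithmetic only, not
kernel code.

#include <stdio.h>

#define PAGE_SIZE		4096ULL
#define STRUCT_PAGE_SIZE	64ULL		/* assumed sizeof(struct page) */
#define RESERVE_SIZE		PAGE_SIZE	/* HUGETLB_VMEMMAP_RESERVE_SIZE */

int main(void)
{
	unsigned long long region_size = 10ULL << 30;	/* 10G of hugepages */
	unsigned long long hugepage_size = 1ULL << 30;	/* 1G pages */
	unsigned long long inited = 0;

	/* Walk the region in hugepage strides, as the kernel loop does. */
	for (unsigned long long start = 0; start < region_size;
	     start += hugepage_size)
		inited += RESERVE_SIZE / STRUCT_PAGE_SIZE;	/* kept head pages */

	unsigned long long total = region_size / PAGE_SIZE;
	printf("struct pages initialized: %llu of %llu (%llu skipped)\n",
	       inited, total, total - inited);
	return 0;
}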