Message ID | 20230614131746.3670303-1-songshuaishuai@tinylab.org |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp1309466vqr; Wed, 14 Jun 2023 07:36:55 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5P79euaNj+1zw6bgLXNrRdGRAQtZIgt5Ob7qMHcg8mLfTV+8SKleIkXAHqhOx0k7UpXzQp X-Received: by 2002:a17:907:16a0:b0:978:4027:57eb with SMTP id hc32-20020a17090716a000b00978402757ebmr17491874ejc.47.1686753415659; Wed, 14 Jun 2023 07:36:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1686753415; cv=none; d=google.com; s=arc-20160816; b=J93lL5PrEuN5NRMiO8g0/8nMYJ54lGqPA3mCFxoQy0ceVqhGpC9/iH1YUflQ5pvp0Z b9ayEx5cSw3YGGp43oEwMMWinAcobiTr9sjP0cHwD3qcO70DjJ00YIc6LNBQ5sxV73+v Chjxm04hVjhEnikXMMWFCiAKCDw+p1/U0dCmFUv1Ca6WG0HYJk+DioFkcemtJs7sdxrP DBMKdPXwxTFcCgtbOi/D9zUKutm2UZDkMe0XS1wqHI0Ss5PVRokULayTq/BkrzDYzRs1 N1VCjCfIP0t047iugU0CU3RKZjn8guofWHa9H3fL9yc0Mk6vCw3mGb7idVg6tXgMPhJH dMsg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:feedback-id:content-transfer-encoding :mime-version:message-id:date:subject:cc:to:from; bh=r4hWL23ji32lCDz23JfaFZHvZsLTOhEsNqFSL7ZH/EM=; b=OQg9HD65wk2axwS6SnlwNzyeXq9om2uX26yazY/je1ZAfNysAL2TEgzHKVeb4FkC3m pvs0C6jp3RBUOqn0EW8yqgPENBrgZ4c6+Q2jL/rvdndqCHrUEQgc1UKxqsFkTcxxpblj r9vGzZsegD00tGFZTJt1gctu+rvMe4wIxsavo0/WcFFKhYnIpKskoBhpB4Vs09RvDr5x x0swWMjM1CPc1yBgCkDqaVNVhZIF9jOIw0F/d276UYU3iwUHQzl0qVvl2IvQSZme0fXl p41kleujItOZVbhxIypHMWnFtniKMqM5HFsRTnX8X9/yIqbnA6tXhB67PtqcN2CDxozk adCg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id bw18-20020a170906c1d200b009745c464a64si8287385ejb.472.2023.06.14.07.36.30; Wed, 14 Jun 2023 07:36:55 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244842AbjFNNS3 (ORCPT <rfc822;n2h9z4@gmail.com> + 99 others); Wed, 14 Jun 2023 09:18:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38394 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234810AbjFNNSY (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 14 Jun 2023 09:18:24 -0400 Received: from bg4.exmail.qq.com (bg4.exmail.qq.com [43.155.65.254]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 815B1184 for <linux-kernel@vger.kernel.org>; Wed, 14 Jun 2023 06:18:22 -0700 (PDT) X-QQ-mid: bizesmtp89t1686748693tz1bp2p7 Received: from localhost.localdomain ( [58.240.82.166]) by bizesmtp.qq.com (ESMTP) with id ; Wed, 14 Jun 2023 21:18:11 +0800 (CST) X-QQ-SSF: 01200000000000B08000000A0000000 X-QQ-FEAT: rZJGTgY0+YOEKJckuG39pki7gnHxgidzDka2DFvr3e+5oWDc4cW1ifbtlpg1V PaKRRHApQvtMF2/mV5rCE2ua/TK5fMZp5F1XsJ8cCdgKBiPWBLhMCsu79Zpic840oPALn2b oVbGbJ8JVs8414qiXHNxm82vXtQTO6BpSASSyDipGU7SYOTo6ht929MkB9yFeIyVI3MlVb6 gb80s3+VFfu9paZ2etZzJgMx6miVRN3ZyzTiryRh8TJXX6VxPPkTcrA4sz6e+OVXbFb/rpu lJ14l/StXJbfq38kfwaIIeyaZ4MA8Ci9mICfq0okcIUTlvC3WOjKh1w8X/dR8gnZZM4165x IMrsQ6BHBY4aUu5kr+GA2FVXUWF0itnxQF4yGWEMkafUjBUx7o= X-QQ-GoodBg: 0 X-BIZMAIL-ID: 17931232792479222361 From: Song Shuai <songshuaishuai@tinylab.org> To: rppt@kernel.org, akpm@linux-foundation.org Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Song Shuai <songshuaishuai@tinylab.org> Subject: [PATCH] memblock: Add error message when memblock_can_resize is not ready Date: Wed, 14 Jun 2023 21:17:46 +0800 Message-Id: <20230614131746.3670303-1-songshuaishuai@tinylab.org> X-Mailer: git-send-email 2.20.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:tinylab.org:qybglogicsvrgz:qybglogicsvrgz5a-3 X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1768689149453204487?= X-GMAIL-MSGID: =?utf-8?q?1768689149453204487?= |
Series |
memblock: Add error message when memblock_can_resize is not ready
|
|
Commit Message
Song Shuai
June 14, 2023, 1:17 p.m. UTC
The memblock APIs are always correct, thus the callers usually don't
handle the return code. But the failure caused by unready memblock_can_resize
is hard to recognize without the return code. Like this piece of log:
```
[ 0.000000] memblock_phys_alloc_range: 4096 bytes align=0x1000 from=0x0000000000000000 max_addr=0x0000000000000000 alloc_pmd_fixmap+0x14/0x1c
[ 0.000000] memblock_reserve: [0x000000017ffff000-0x000000017fffffff] memblock_alloc_range_nid+0xb8/0x128
[ 0.000000] Oops - store (or AMO) access fault [#1]
```
So add an error message for this kind of failure:
```
[ 0.000000] memblock_phys_alloc_range: 4096 bytes align=0x1000 from=0x0000000000000000 max_addr=0x0000000000000000 alloc_pmd_fixmap+0x14/0x1c
[ 0.000000] memblock_reserve: [0x000000017ffff000-0x000000017fffffff] memblock_alloc_range_nid+0xb8/0x128
[ 0.000000] memblock: Can't double reserved array for area start 0x000000017ffff000 size 4096
[ 0.000000] Oops - store (or AMO) access fault [#1]
```
Signed-off-by: Song Shuai <songshuaishuai@tinylab.org>
---
mm/memblock.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
Comments
Hi, On Wed, Jun 14, 2023 at 09:17:46PM +0800, Song Shuai wrote: > The memblock APIs are always correct, thus the callers usually don't > handle the return code. But the failure caused by unready memblock_can_resize > is hard to recognize without the return code. Like this piece of log: Please make it clear that failure is in memblock_double_array(), e.g. But when memblock_double_array() is called before memblock_can_resize is true, it is hard to understand the actual reason for the failure. > > ``` > [ 0.000000] memblock_phys_alloc_range: 4096 bytes align=0x1000 from=0x0000000000000000 max_addr=0x0000000000000000 alloc_pmd_fixmap+0x14/0x1c > [ 0.000000] memblock_reserve: [0x000000017ffff000-0x000000017fffffff] memblock_alloc_range_nid+0xb8/0x128 > [ 0.000000] Oops - store (or AMO) access fault [#1] > ``` > > So add an error message for this kind of failure: > > ``` > [ 0.000000] memblock_phys_alloc_range: 4096 bytes align=0x1000 from=0x0000000000000000 max_addr=0x0000000000000000 alloc_pmd_fixmap+0x14/0x1c > [ 0.000000] memblock_reserve: [0x000000017ffff000-0x000000017fffffff] memblock_alloc_range_nid+0xb8/0x128 > [ 0.000000] memblock: Can't double reserved array for area start 0x000000017ffff000 size 4096 > [ 0.000000] Oops - store (or AMO) access fault [#1] > ``` > > Signed-off-by: Song Shuai <songshuaishuai@tinylab.org> > --- > mm/memblock.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/mm/memblock.c b/mm/memblock.c > index 3feafea06ab2..ab952a164f62 100644 > --- a/mm/memblock.c > +++ b/mm/memblock.c > @@ -418,8 +418,11 @@ static int __init_memblock memblock_double_array(struct memblock_type *type, > /* We don't allow resizing until we know about the reserved regions > * of memory that aren't suitable for allocation > */ > - if (!memblock_can_resize) > + if (!memblock_can_resize) { > + pr_err("memblock: Can't double %s array for area start %pa size %ld\n", > + type->name, &new_area_start, (unsigned long)new_area_size); Most of the time memblock uses %llu and cast to u64 to print size, please make this consistent. > return -1; > + } > > /* Calculate new doubled size */ > old_size = type->max * sizeof(struct memblock_region); > -- > 2.20.1 > >
Sorry for not replying to you in time 在 2023/6/15 00:07, Mike Rapoport 写道: > Hi, > > On Wed, Jun 14, 2023 at 09:17:46PM +0800, Song Shuai wrote: >> The memblock APIs are always correct, thus the callers usually don't >> handle the return code. But the failure caused by unready memblock_can_resize >> is hard to recognize without the return code. Like this piece of log: > > Please make it clear that failure is in memblock_double_array(), e.g. > Having numerous memblock reservations at early boot where memblock_can_resize is unset may exhaust the INIT_MEMBLOCK_REGIONS sized memblock.reserved regions and try to double the region array via memblock_double_array() which fails and returns -1 to the caller. You can find the numerous memblock reservations reported by this commit 24cc61d8cb5a ("arm64: memblock: don't permit memblock resizing until linear mapping is up"). And the similar test sense can be simulated by a constructed dtb with numerous discrete /memreserve/ or /reserved-memory regions. > But when memblock_double_array() is called before memblock_can_resize > is true, it is hard to understand the actual reason for the failure. > >> >> ``` >> [ 0.000000] memblock_phys_alloc_range: 4096 bytes align=0x1000 from=0x0000000000000000 max_addr=0x0000000000000000 alloc_pmd_fixmap+0x14/0x1c >> [ 0.000000] memblock_reserve: [0x000000017ffff000-0x000000017fffffff] memblock_alloc_range_nid+0xb8/0x128 >> [ 0.000000] Oops - store (or AMO) access fault [#1] >> ``` >> >> So add an error message for this kind of failure: >> >> ``` >> [ 0.000000] memblock_phys_alloc_range: 4096 bytes align=0x1000 from=0x0000000000000000 max_addr=0x0000000000000000 alloc_pmd_fixmap+0x14/0x1c >> [ 0.000000] memblock_reserve: [0x000000017ffff000-0x000000017fffffff] memblock_alloc_range_nid+0xb8/0x128 >> [ 0.000000] memblock: Can't double reserved array for area start 0x000000017ffff000 size 4096 >> [ 0.000000] Oops - store (or AMO) access fault [#1] >> ``` >> >> Signed-off-by: Song Shuai <songshuaishuai@tinylab.org> >> --- >> mm/memblock.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/mm/memblock.c b/mm/memblock.c >> index 3feafea06ab2..ab952a164f62 100644 >> --- a/mm/memblock.c >> +++ b/mm/memblock.c >> @@ -418,8 +418,11 @@ static int __init_memblock memblock_double_array(struct memblock_type *type, >> /* We don't allow resizing until we know about the reserved regions >> * of memory that aren't suitable for allocation >> */ >> - if (!memblock_can_resize) >> + if (!memblock_can_resize) { >> + pr_err("memblock: Can't double %s array for area start %pa size %ld\n", >> + type->name, &new_area_start, (unsigned long)new_area_size); > > Most of the time memblock uses %llu and cast to u64 to print size, please > make this consistent. I will fix it in next version if the above description is ok for you. > >> return -1; >> + } >> >> /* Calculate new doubled size */ >> old_size = type->max * sizeof(struct memblock_region); >> -- >> 2.20.1 >> >> >
On Tue, Jun 20, 2023 at 03:04:55PM +0800, Song Shuai wrote: > Sorry for not replying to you in time > > 在 2023/6/15 00:07, Mike Rapoport 写道: > > Hi, > > > > On Wed, Jun 14, 2023 at 09:17:46PM +0800, Song Shuai wrote: > > > The memblock APIs are always correct, thus the callers usually don't > > > handle the return code. But the failure caused by unready memblock_can_resize > > > is hard to recognize without the return code. Like this piece of log: > > > > Please make it clear that failure is in memblock_double_array(), e.g. > > > > Having numerous memblock reservations at early boot where > memblock_can_resize is unset > may exhaust the INIT_MEMBLOCK_REGIONS sized memblock.reserved regions and > try to > double the region array via memblock_double_array() which fails and returns > -1 to the caller. > > You can find the numerous memblock reservations reported by this commit > 24cc61d8cb5a ("arm64: memblock: don't permit memblock resizing until linear > mapping is up"). > And the similar test sense can be simulated by a constructed dtb with > numerous discrete > /memreserve/ or /reserved-memory regions. Ideally, the callers of memblock_reserve() should check the return value and panic with a meaningful message if it fails. Still, for now something like this patch is an improvement. How about we make the changelog to be something like: Subject: memblock: report failures when memblock_can_resize is not set The callers of memblock_reserve() do not check the return value presuming that memblock_reserve() always succeeds, but there are cases where it may fail. Having numerous memblock reservations at early boot where memblock_can_resize is unset may exhaust the INIT_MEMBLOCK_REGIONS sized memblock.reserved regions array and an attempt to double this array via memblock_double_array() will fail and will return -1 to the caller. When this happens the system crashes anyway, but it's hard to identify the reason for the crash. Add a panic message to memblock_double_array() to aid debugging of the cases when too many regions are reserved before memblock can resize memblock.reserved array. > > But when memblock_double_array() is called before memblock_can_resize > > is true, it is hard to understand the actual reason for the failure. > > > > > > > > ``` > > > [ 0.000000] memblock_phys_alloc_range: 4096 bytes align=0x1000 from=0x0000000000000000 max_addr=0x0000000000000000 alloc_pmd_fixmap+0x14/0x1c > > > [ 0.000000] memblock_reserve: [0x000000017ffff000-0x000000017fffffff] memblock_alloc_range_nid+0xb8/0x128 > > > [ 0.000000] Oops - store (or AMO) access fault [#1] > > > ``` > > > > > > So add an error message for this kind of failure: > > > > > > ``` > > > [ 0.000000] memblock_phys_alloc_range: 4096 bytes align=0x1000 from=0x0000000000000000 max_addr=0x0000000000000000 alloc_pmd_fixmap+0x14/0x1c > > > [ 0.000000] memblock_reserve: [0x000000017ffff000-0x000000017fffffff] memblock_alloc_range_nid+0xb8/0x128 > > > [ 0.000000] memblock: Can't double reserved array for area start 0x000000017ffff000 size 4096 > > > [ 0.000000] Oops - store (or AMO) access fault [#1] > > > ``` > > > > > > Signed-off-by: Song Shuai <songshuaishuai@tinylab.org> > > > --- > > > mm/memblock.c | 5 ++++- > > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > > > diff --git a/mm/memblock.c b/mm/memblock.c > > > index 3feafea06ab2..ab952a164f62 100644 > > > --- a/mm/memblock.c > > > +++ b/mm/memblock.c > > > @@ -418,8 +418,11 @@ static int __init_memblock memblock_double_array(struct memblock_type *type, > > > /* We don't allow resizing until we know about the reserved regions > > > * of memory that aren't suitable for allocation > > > */ > > > - if (!memblock_can_resize) > > > + if (!memblock_can_resize) { > > > + pr_err("memblock: Can't double %s array for area start %pa size %ld\n", > > > + type->name, &new_area_start, (unsigned long)new_area_size); The system will crash anyway if we get, here, so why won't use panic? Also, dumping new_area_start here does not add any information but rather confuses. How about panic("memblock: cannot resize %s array\n", type->name); > > > > Most of the time memblock uses %llu and cast to u64 to print size, please > > make this consistent. > I will fix it in next version if the above description is ok for you. > > > > > return -1; > > > + } > > > /* Calculate new doubled size */ > > > old_size = type->max * sizeof(struct memblock_region); > > -- > Thanks > Song Shuai > >
diff --git a/mm/memblock.c b/mm/memblock.c index 3feafea06ab2..ab952a164f62 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -418,8 +418,11 @@ static int __init_memblock memblock_double_array(struct memblock_type *type, /* We don't allow resizing until we know about the reserved regions * of memory that aren't suitable for allocation */ - if (!memblock_can_resize) + if (!memblock_can_resize) { + pr_err("memblock: Can't double %s array for area start %pa size %ld\n", + type->name, &new_area_start, (unsigned long)new_area_size); return -1; + } /* Calculate new doubled size */ old_size = type->max * sizeof(struct memblock_region);