[2/2] perf/ring_buffer: Fix high-order allocations for AUX space with correct MAX_ORDER limit
Message ID | 20230612052452.53425-3-xueshuai@linux.alibaba.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp2372708vqr; Sun, 11 Jun 2023 22:40:29 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6D3bdUYCpaC97li3T/49tdKcWoZK6qrrP/e3+wX02WofPF8XbU7Epto/Ebtd+FK9fP79t1 X-Received: by 2002:a05:6402:550:b0:514:9563:a32b with SMTP id i16-20020a056402055000b005149563a32bmr4317697edx.16.1686548429693; Sun, 11 Jun 2023 22:40:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1686548429; cv=none; d=google.com; s=arc-20160816; b=auJ5zqHq+74DlSkWDt2kmZFRT0tRZ6ApTtOK3rdNVn8bIuVGlX1ni9EdnKCiO/GohE ZFHNYh9f9bPROMdeLR5lYS8/G1F4F5b/G5uYYmjBeivTa5n0R80i65/Z3/SZ0OlStRUL 4SaZU10jMDh1GEPIGVcGWT5RPpNg720faNUo5qsl3663XMBUWeIlXJDvsGwMA9hSOSzN KBV+Z+yAqAAfzAZ1yDZpOChk4M+AtV1ry2aZ2gc8uAVPZIh1NKplxex5wq4vOgt+ZF9s 2oL5WyDfVUUbkGfBPGXBErew0+STn/SlNTFkL1XwBp6LJhb+gkeUYIxoMuHUX6WmIgrO qljw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=JQMbkzuOQ5kslAbCs/mvFCzCMlfNfOJ/3gBDG2ldZyc=; b=af/RaEC545V5Ev8G9uQmCY73/eD1+WEPFfTNeBPq49VPIPUodPzV68PXxTlXvXgZni ypmBAAQJR9yBmBAd/KUNEs9yhRqJE8l/L2OdsfBDEw+gLpOrATv7c3Jxg8kjruCwkdVL AEVGdeFCx1UcDV5aabMU1domHVEU0WKRmnoh5NIB+4BBKGJKEQ7/yLk60JNCacfl1yNR ++PUl4wo/geheVq87gecFgyCzzTCKxOEjM0o5RTbCAHVTLBucxyV95fy7UDmyQC2WFQ4 n5J4pJFT+pvOG3wrFxQUAsXbSDma3F/5VWvK4S1FHZ4XN3uqR9O6+q9V+w1TjdvINoo9 Ef2A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id x24-20020aa7dad8000000b0050d0ddf6582si5557425eds.474.2023.06.11.22.40.04; Sun, 11 Jun 2023 22:40:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=alibaba.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234150AbjFLFZa (ORCPT <rfc822;rust.linux@gmail.com> + 99 others); Mon, 12 Jun 2023 01:25:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35646 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234629AbjFLFZG (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 12 Jun 2023 01:25:06 -0400 Received: from out30-98.freemail.mail.aliyun.com (out30-98.freemail.mail.aliyun.com [115.124.30.98]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 417AAE1; Sun, 11 Jun 2023 22:25:04 -0700 (PDT) X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R191e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045170;MF=xueshuai@linux.alibaba.com;NM=1;PH=DS;RN=12;SR=0;TI=SMTPD_---0VkrnYHA_1686547498; Received: from localhost.localdomain(mailfrom:xueshuai@linux.alibaba.com fp:SMTPD_---0VkrnYHA_1686547498) by smtp.aliyun-inc.com; Mon, 12 Jun 2023 13:25:00 +0800 From: Shuai Xue <xueshuai@linux.alibaba.com> To: alexander.shishkin@linux.intel.com, peterz@infradead.org Cc: mingo@redhat.com, acme@kernel.org, mark.rutland@arm.com, jolsa@kernel.org, namhyung@kernel.org, irogers@google.com, adrian.hunter@intel.com, linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org, bpf@vger.kernel.org Subject: [PATCH 2/2] perf/ring_buffer: Fix high-order allocations for AUX space with correct MAX_ORDER limit Date: Mon, 12 Jun 2023 13:24:52 +0800 Message-Id: <20230612052452.53425-3-xueshuai@linux.alibaba.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230612052452.53425-1-xueshuai@linux.alibaba.com> References: <20230612052452.53425-1-xueshuai@linux.alibaba.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-9.9 required=5.0 tests=BAYES_00, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE,UNPARSEABLE_RELAY,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1768474206277304510?= X-GMAIL-MSGID: =?utf-8?q?1768474206277304510?= |
Series |
Fix high-order allocations for AUX space
|
|
Commit Message
Shuai Xue
June 12, 2023, 5:24 a.m. UTC
When perf-record with a large AUX area, e.g 2GB:
#perf record -C 0 -m ,2G -e arm_spe_0// -- sleep 1
it reveals a WARNING with __alloc_pages:
[ 48.888553] ------------[ cut here ]------------
[ 48.888559] WARNING: CPU: 39 PID: 17438 at mm/page_alloc.c:5568 __alloc_pages+0x1ec/0x248
[ 48.888569] Modules linked in: ip6table_filter(E) ip6_tables(E) iptable_filter(E) ebtable_nat(E) ebtables(E) vfat(E) fat(E) aes_ce_blk(E) aes_ce_cipher(E) crct10dif_ce(E) ghash_ce(E) sm4_ce_cipher(E) sm4(E) sha2_ce(E) sha256_arm64(E) sha1_ce(E) acpi_ipmi(E) sbsa_gwdt(E) sg(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) ip_tables(E) sd_mod(E) ast(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) nvme(E) sysimgblt(E) i2c_algo_bit(E) nvme_core(E) drm_shmem_helper(E) ahci(E) t10_pi(E) libahci(E) drm(E) crc64_rocksoft(E) i40e(E) crc64(E) libata(E) i2c_core(E)
[ 48.888610] CPU: 39 PID: 17438 Comm: perf Kdump: loaded Tainted: G E 6.3.0-rc4+ #56
[ 48.888613] Hardware name: Default Default/Default, BIOS 1.2.M1.AL.P.139.00 03/22/2023
[ 48.888614] pstate: 23400009 (nzCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[ 48.888616] pc : __alloc_pages+0x1ec/0x248
[ 48.888619] lr : rb_alloc_aux_page+0x78/0xe0
[ 48.888622] sp : ffff800037c5b9e0
[ 48.888623] pmr_save: 000000e0
[ 48.888624] x29: ffff800037c5b9e0 x28: ffff00082743b800 x27: 0000000000000000
[ 48.888627] x26: 0000000000080000 x25: ffff000000000000 x24: ffff000860b21380
[ 48.888629] x23: ffff8000093fd3c0 x22: ffff00081a4a2080 x21: 000000000000000b
[ 48.888631] x20: 0000000000000000 x19: 000000000000000b x18: 0000000000000020
[ 48.888634] x17: 0000000000000000 x16: ffff800008f05be8 x15: ffff00081a4a2610
[ 48.888636] x14: 0000000000000000 x13: 323173656761705f x12: ffff00477fffeb90
[ 48.888638] x11: 0000000000000000 x10: 0000000000000000 x9 : ffff8000083608a0
[ 48.888641] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000040
[ 48.888643] x5 : 000000000007f400 x4 : 0000000000000000 x3 : 0000000000000000
[ 48.888645] x2 : 0000000000000000 x1 : 0000000000000001 x0 : 0000000000012dc0
[ 48.888648] Call trace:
[ 48.888650] __alloc_pages+0x1ec/0x248
[ 48.888653] rb_alloc_aux_page+0x78/0xe0
[ 48.888654] rb_alloc_aux+0x13c/0x298
[ 48.888656] perf_mmap+0x468/0x688
[ 48.888659] mmap_region+0x308/0x8a8
[ 48.888662] do_mmap+0x3c0/0x528
[ 48.888664] vm_mmap_pgoff+0xf4/0x1b8
[ 48.888666] ksys_mmap_pgoff+0x18c/0x218
[ 48.888668] __arm64_sys_mmap+0x38/0x58
[ 48.888671] invoke_syscall+0x50/0x128
[ 48.888673] el0_svc_common.constprop.0+0x58/0x188
[ 48.888674] do_el0_svc+0x34/0x50
[ 48.888676] el0_svc+0x34/0x108
[ 48.888679] el0t_64_sync_handler+0xb8/0xc0
[ 48.888681] el0t_64_sync+0x1a4/0x1a8
[ 48.888686] ---[ end trace 0000000000000000 ]---
The problem is that the high-order pages for AUX area is allocated with
an order of MAX_ORDER. Obviously, this is a bogus.
Fix it with an order of MAX_ORDER - 1 at maximum.
Fixes: 0a4e38e64f5e ("perf: Support high-order allocations for AUX space")
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
kernel/events/ring_buffer.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
Comments
On Mon, Jun 12, 2023 at 01:24:52PM +0800, Shuai Xue wrote: > When perf-record with a large AUX area, e.g 2GB: > > #perf record -C 0 -m ,2G -e arm_spe_0// -- sleep 1 > > it reveals a WARNING with __alloc_pages: > > [ 48.888553] ------------[ cut here ]------------ > [ 48.888559] WARNING: CPU: 39 PID: 17438 at mm/page_alloc.c:5568 __alloc_pages+0x1ec/0x248 > [ 48.888569] Modules linked in: ip6table_filter(E) ip6_tables(E) iptable_filter(E) ebtable_nat(E) ebtables(E) vfat(E) fat(E) aes_ce_blk(E) aes_ce_cipher(E) crct10dif_ce(E) ghash_ce(E) sm4_ce_cipher(E) sm4(E) sha2_ce(E) sha256_arm64(E) sha1_ce(E) acpi_ipmi(E) sbsa_gwdt(E) sg(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) ip_tables(E) sd_mod(E) ast(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) nvme(E) sysimgblt(E) i2c_algo_bit(E) nvme_core(E) drm_shmem_helper(E) ahci(E) t10_pi(E) libahci(E) drm(E) crc64_rocksoft(E) i40e(E) crc64(E) libata(E) i2c_core(E) > [ 48.888610] CPU: 39 PID: 17438 Comm: perf Kdump: loaded Tainted: G E 6.3.0-rc4+ #56 > [ 48.888613] Hardware name: Default Default/Default, BIOS 1.2.M1.AL.P.139.00 03/22/2023 > [ 48.888614] pstate: 23400009 (nzCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--) > [ 48.888616] pc : __alloc_pages+0x1ec/0x248 > [ 48.888619] lr : rb_alloc_aux_page+0x78/0xe0 > [ 48.888622] sp : ffff800037c5b9e0 > [ 48.888623] pmr_save: 000000e0 > [ 48.888624] x29: ffff800037c5b9e0 x28: ffff00082743b800 x27: 0000000000000000 > [ 48.888627] x26: 0000000000080000 x25: ffff000000000000 x24: ffff000860b21380 > [ 48.888629] x23: ffff8000093fd3c0 x22: ffff00081a4a2080 x21: 000000000000000b > [ 48.888631] x20: 0000000000000000 x19: 000000000000000b x18: 0000000000000020 > [ 48.888634] x17: 0000000000000000 x16: ffff800008f05be8 x15: ffff00081a4a2610 > [ 48.888636] x14: 0000000000000000 x13: 323173656761705f x12: ffff00477fffeb90 > [ 48.888638] x11: 0000000000000000 x10: 0000000000000000 x9 : ffff8000083608a0 > [ 48.888641] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000040 > [ 48.888643] x5 : 000000000007f400 x4 : 0000000000000000 x3 : 0000000000000000 > [ 48.888645] x2 : 0000000000000000 x1 : 0000000000000001 x0 : 0000000000012dc0 > [ 48.888648] Call trace: > [ 48.888650] __alloc_pages+0x1ec/0x248 > [ 48.888653] rb_alloc_aux_page+0x78/0xe0 > [ 48.888654] rb_alloc_aux+0x13c/0x298 > [ 48.888656] perf_mmap+0x468/0x688 > [ 48.888659] mmap_region+0x308/0x8a8 > [ 48.888662] do_mmap+0x3c0/0x528 > [ 48.888664] vm_mmap_pgoff+0xf4/0x1b8 > [ 48.888666] ksys_mmap_pgoff+0x18c/0x218 > [ 48.888668] __arm64_sys_mmap+0x38/0x58 > [ 48.888671] invoke_syscall+0x50/0x128 > [ 48.888673] el0_svc_common.constprop.0+0x58/0x188 > [ 48.888674] do_el0_svc+0x34/0x50 > [ 48.888676] el0_svc+0x34/0x108 > [ 48.888679] el0t_64_sync_handler+0xb8/0xc0 > [ 48.888681] el0t_64_sync+0x1a4/0x1a8 > [ 48.888686] ---[ end trace 0000000000000000 ]--- > > The problem is that the high-order pages for AUX area is allocated with > an order of MAX_ORDER. Obviously, this is a bogus. > > Fix it with an order of MAX_ORDER - 1 at maximum. > > Fixes: 0a4e38e64f5e ("perf: Support high-order allocations for AUX space") > Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> LGTM: Reviewed-by: Leo Yan <leo.yan@linaro.org> > --- > kernel/events/ring_buffer.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c > index 273a0fe..d6bbdb7 100644 > --- a/kernel/events/ring_buffer.c > +++ b/kernel/events/ring_buffer.c > @@ -609,8 +609,8 @@ static struct page *rb_alloc_aux_page(int node, int order) > { > struct page *page; > > - if (order > MAX_ORDER) > - order = MAX_ORDER; > + if (order >= MAX_ORDER) > + order = MAX_ORDER - 1; > > do { > page = alloc_pages_node(node, PERF_AUX_GFP, order); > -- > 1.8.3.1 >
On 2023/6/12 16:30, Leo Yan wrote: > On Mon, Jun 12, 2023 at 01:24:52PM +0800, Shuai Xue wrote: >> When perf-record with a large AUX area, e.g 2GB: >> >> #perf record -C 0 -m ,2G -e arm_spe_0// -- sleep 1 >> >> it reveals a WARNING with __alloc_pages: >> >> [ 48.888553] ------------[ cut here ]------------ >> [ 48.888559] WARNING: CPU: 39 PID: 17438 at mm/page_alloc.c:5568 __alloc_pages+0x1ec/0x248 >> [ 48.888569] Modules linked in: ip6table_filter(E) ip6_tables(E) iptable_filter(E) ebtable_nat(E) ebtables(E) vfat(E) fat(E) aes_ce_blk(E) aes_ce_cipher(E) crct10dif_ce(E) ghash_ce(E) sm4_ce_cipher(E) sm4(E) sha2_ce(E) sha256_arm64(E) sha1_ce(E) acpi_ipmi(E) sbsa_gwdt(E) sg(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) ip_tables(E) sd_mod(E) ast(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) nvme(E) sysimgblt(E) i2c_algo_bit(E) nvme_core(E) drm_shmem_helper(E) ahci(E) t10_pi(E) libahci(E) drm(E) crc64_rocksoft(E) i40e(E) crc64(E) libata(E) i2c_core(E) >> [ 48.888610] CPU: 39 PID: 17438 Comm: perf Kdump: loaded Tainted: G E 6.3.0-rc4+ #56 >> [ 48.888613] Hardware name: Default Default/Default, BIOS 1.2.M1.AL.P.139.00 03/22/2023 >> [ 48.888614] pstate: 23400009 (nzCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--) >> [ 48.888616] pc : __alloc_pages+0x1ec/0x248 >> [ 48.888619] lr : rb_alloc_aux_page+0x78/0xe0 >> [ 48.888622] sp : ffff800037c5b9e0 >> [ 48.888623] pmr_save: 000000e0 >> [ 48.888624] x29: ffff800037c5b9e0 x28: ffff00082743b800 x27: 0000000000000000 >> [ 48.888627] x26: 0000000000080000 x25: ffff000000000000 x24: ffff000860b21380 >> [ 48.888629] x23: ffff8000093fd3c0 x22: ffff00081a4a2080 x21: 000000000000000b >> [ 48.888631] x20: 0000000000000000 x19: 000000000000000b x18: 0000000000000020 >> [ 48.888634] x17: 0000000000000000 x16: ffff800008f05be8 x15: ffff00081a4a2610 >> [ 48.888636] x14: 0000000000000000 x13: 323173656761705f x12: ffff00477fffeb90 >> [ 48.888638] x11: 0000000000000000 x10: 0000000000000000 x9 : ffff8000083608a0 >> [ 48.888641] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000040 >> [ 48.888643] x5 : 000000000007f400 x4 : 0000000000000000 x3 : 0000000000000000 >> [ 48.888645] x2 : 0000000000000000 x1 : 0000000000000001 x0 : 0000000000012dc0 >> [ 48.888648] Call trace: >> [ 48.888650] __alloc_pages+0x1ec/0x248 >> [ 48.888653] rb_alloc_aux_page+0x78/0xe0 >> [ 48.888654] rb_alloc_aux+0x13c/0x298 >> [ 48.888656] perf_mmap+0x468/0x688 >> [ 48.888659] mmap_region+0x308/0x8a8 >> [ 48.888662] do_mmap+0x3c0/0x528 >> [ 48.888664] vm_mmap_pgoff+0xf4/0x1b8 >> [ 48.888666] ksys_mmap_pgoff+0x18c/0x218 >> [ 48.888668] __arm64_sys_mmap+0x38/0x58 >> [ 48.888671] invoke_syscall+0x50/0x128 >> [ 48.888673] el0_svc_common.constprop.0+0x58/0x188 >> [ 48.888674] do_el0_svc+0x34/0x50 >> [ 48.888676] el0_svc+0x34/0x108 >> [ 48.888679] el0t_64_sync_handler+0xb8/0xc0 >> [ 48.888681] el0t_64_sync+0x1a4/0x1a8 >> [ 48.888686] ---[ end trace 0000000000000000 ]--- >> >> The problem is that the high-order pages for AUX area is allocated with >> an order of MAX_ORDER. Obviously, this is a bogus. >> >> Fix it with an order of MAX_ORDER - 1 at maximum. >> >> Fixes: 0a4e38e64f5e ("perf: Support high-order allocations for AUX space") >> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> > > LGTM: > > Reviewed-by: Leo Yan <leo.yan@linaro.org> Thank you :) Cheers Shuai > >> --- >> kernel/events/ring_buffer.c | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c >> index 273a0fe..d6bbdb7 100644 >> --- a/kernel/events/ring_buffer.c >> +++ b/kernel/events/ring_buffer.c >> @@ -609,8 +609,8 @@ static struct page *rb_alloc_aux_page(int node, int order) >> { >> struct page *page; >> >> - if (order > MAX_ORDER) >> - order = MAX_ORDER; >> + if (order >= MAX_ORDER) >> + order = MAX_ORDER - 1; >> >> do { >> page = alloc_pages_node(node, PERF_AUX_GFP, order); >> -- >> 1.8.3.1 >>
On 12/06/2023 06:24, Shuai Xue wrote: > When perf-record with a large AUX area, e.g 2GB: > > #perf record -C 0 -m ,2G -e arm_spe_0// -- sleep 1 > > it reveals a WARNING with __alloc_pages: > > [ 48.888553] ------------[ cut here ]------------ > [ 48.888559] WARNING: CPU: 39 PID: 17438 at mm/page_alloc.c:5568 __alloc_pages+0x1ec/0x248 > [ 48.888569] Modules linked in: ip6table_filter(E) ip6_tables(E) iptable_filter(E) ebtable_nat(E) ebtables(E) vfat(E) fat(E) aes_ce_blk(E) aes_ce_cipher(E) crct10dif_ce(E) ghash_ce(E) sm4_ce_cipher(E) sm4(E) sha2_ce(E) sha256_arm64(E) sha1_ce(E) acpi_ipmi(E) sbsa_gwdt(E) sg(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) ip_tables(E) sd_mod(E) ast(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) nvme(E) sysimgblt(E) i2c_algo_bit(E) nvme_core(E) drm_shmem_helper(E) ahci(E) t10_pi(E) libahci(E) drm(E) crc64_rocksoft(E) i40e(E) crc64(E) libata(E) i2c_core(E) > [ 48.888610] CPU: 39 PID: 17438 Comm: perf Kdump: loaded Tainted: G E 6.3.0-rc4+ #56 > [ 48.888613] Hardware name: Default Default/Default, BIOS 1.2.M1.AL.P.139.00 03/22/2023 > [ 48.888614] pstate: 23400009 (nzCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--) > [ 48.888616] pc : __alloc_pages+0x1ec/0x248 > [ 48.888619] lr : rb_alloc_aux_page+0x78/0xe0 > [ 48.888622] sp : ffff800037c5b9e0 > [ 48.888623] pmr_save: 000000e0 > [ 48.888624] x29: ffff800037c5b9e0 x28: ffff00082743b800 x27: 0000000000000000 > [ 48.888627] x26: 0000000000080000 x25: ffff000000000000 x24: ffff000860b21380 > [ 48.888629] x23: ffff8000093fd3c0 x22: ffff00081a4a2080 x21: 000000000000000b > [ 48.888631] x20: 0000000000000000 x19: 000000000000000b x18: 0000000000000020 > [ 48.888634] x17: 0000000000000000 x16: ffff800008f05be8 x15: ffff00081a4a2610 > [ 48.888636] x14: 0000000000000000 x13: 323173656761705f x12: ffff00477fffeb90 > [ 48.888638] x11: 0000000000000000 x10: 0000000000000000 x9 : ffff8000083608a0 > [ 48.888641] x8 : 0000000000000000 x7 : 0000000000000000 x6 : 0000000000000040 > [ 48.888643] x5 : 000000000007f400 x4 : 0000000000000000 x3 : 0000000000000000 > [ 48.888645] x2 : 0000000000000000 x1 : 0000000000000001 x0 : 0000000000012dc0 > [ 48.888648] Call trace: > [ 48.888650] __alloc_pages+0x1ec/0x248 > [ 48.888653] rb_alloc_aux_page+0x78/0xe0 > [ 48.888654] rb_alloc_aux+0x13c/0x298 > [ 48.888656] perf_mmap+0x468/0x688 > [ 48.888659] mmap_region+0x308/0x8a8 > [ 48.888662] do_mmap+0x3c0/0x528 > [ 48.888664] vm_mmap_pgoff+0xf4/0x1b8 > [ 48.888666] ksys_mmap_pgoff+0x18c/0x218 > [ 48.888668] __arm64_sys_mmap+0x38/0x58 > [ 48.888671] invoke_syscall+0x50/0x128 > [ 48.888673] el0_svc_common.constprop.0+0x58/0x188 > [ 48.888674] do_el0_svc+0x34/0x50 > [ 48.888676] el0_svc+0x34/0x108 > [ 48.888679] el0t_64_sync_handler+0xb8/0xc0 > [ 48.888681] el0t_64_sync+0x1a4/0x1a8 > [ 48.888686] ---[ end trace 0000000000000000 ]--- > > The problem is that the high-order pages for AUX area is allocated with > an order of MAX_ORDER. Obviously, this is a bogus. > > Fix it with an order of MAX_ORDER - 1 at maximum. > > Fixes: 0a4e38e64f5e ("perf: Support high-order allocations for AUX space") > Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com> > --- > kernel/events/ring_buffer.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c > index 273a0fe..d6bbdb7 100644 > --- a/kernel/events/ring_buffer.c > +++ b/kernel/events/ring_buffer.c > @@ -609,8 +609,8 @@ static struct page *rb_alloc_aux_page(int node, int order) > { > struct page *page; > > - if (order > MAX_ORDER) > - order = MAX_ORDER; > + if (order >= MAX_ORDER) > + order = MAX_ORDER - 1; > > do { > page = alloc_pages_node(node, PERF_AUX_GFP, order); It seems like this was only just recently changed with this as the commit message (23baf83): mm, treewide: redefine MAX_ORDER sanely MAX_ORDER currently defined as number of orders page allocator supports: user can ask buddy allocator for page order between 0 and MAX_ORDER-1. This definition is counter-intuitive and lead to number of bugs all over the kernel. Change the definition of MAX_ORDER to be inclusive: the range of orders user can ask from buddy allocator is 0..MAX_ORDER now. It might be worth referring to this in the commit message or adding a fixes: reference. Or maybe this new change isn't quite right? James
On Mon, Jun 12, 2023 at 09:45:38AM +0100, James Clark wrote: [...] > > @@ -609,8 +609,8 @@ static struct page *rb_alloc_aux_page(int node, int order) > > { > > struct page *page; > > > > - if (order > MAX_ORDER) > > - order = MAX_ORDER; > > + if (order >= MAX_ORDER) > > + order = MAX_ORDER - 1; > > > > do { > > page = alloc_pages_node(node, PERF_AUX_GFP, order); > > > It seems like this was only just recently changed with this as the > commit message (23baf83): > > mm, treewide: redefine MAX_ORDER sanely > > MAX_ORDER currently defined as number of orders page allocator > supports: user can ask buddy allocator for page order between 0 and > MAX_ORDER-1. > > This definition is counter-intuitive and lead to number of bugs all > over the kernel. > > Change the definition of MAX_ORDER to be inclusive: the range of > orders user can ask from buddy allocator is 0..MAX_ORDER now. > > It might be worth referring to this in the commit message or adding a > fixes: reference. Or maybe this new change isn't quite right? Good point. If so, we don't need this patch anymore. Thanks for reminding, James. Leo
On 2023/6/12 17:09, Leo Yan wrote: > On Mon, Jun 12, 2023 at 09:45:38AM +0100, James Clark wrote: > > [...] > >>> @@ -609,8 +609,8 @@ static struct page *rb_alloc_aux_page(int node, int order) >>> { >>> struct page *page; >>> >>> - if (order > MAX_ORDER) >>> - order = MAX_ORDER; >>> + if (order >= MAX_ORDER) >>> + order = MAX_ORDER - 1; >>> >>> do { >>> page = alloc_pages_node(node, PERF_AUX_GFP, order); >> >> >> It seems like this was only just recently changed with this as the >> commit message (23baf83): >> >> mm, treewide: redefine MAX_ORDER sanely >> >> MAX_ORDER currently defined as number of orders page allocator >> supports: user can ask buddy allocator for page order between 0 and >> MAX_ORDER-1. >> >> This definition is counter-intuitive and lead to number of bugs all >> over the kernel. >> >> Change the definition of MAX_ORDER to be inclusive: the range of >> orders user can ask from buddy allocator is 0..MAX_ORDER now. >> >> It might be worth referring to this in the commit message or adding a >> fixes: reference. Or maybe this new change isn't quite right? > > Good point. If so, we don't need this patch anymore. > > Thanks for reminding, James. > > Leo Hi, Leo and James, I tested on the Linus master tree, the mentioned commit 23baf83 ("mm, treewide: redefine MAX_ORDER sanely") has fix this oops. I will drop out this patch, thank you :) Cheers, Shuai
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c index 273a0fe..d6bbdb7 100644 --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -609,8 +609,8 @@ static struct page *rb_alloc_aux_page(int node, int order) { struct page *page; - if (order > MAX_ORDER) - order = MAX_ORDER; + if (order >= MAX_ORDER) + order = MAX_ORDER - 1; do { page = alloc_pages_node(node, PERF_AUX_GFP, order);