Message ID: <20240103170002.1793197-1-enachman@marvell.com>
State: New
Headers:
From: Elad Nachman <enachman@marvell.com>
To: catalin.marinas@arm.com, will@kernel.org, thunder.leizhen@huawei.com, bhe@redhat.com, akpm@linux-foundation.org, yajun.deng@linux.dev, chris.zjh@huawei.com, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: enachman@marvell.com
Subject: [PATCH] arm64: mm: Fix SOCs with DDR starting above zero
Date: Wed, 3 Jan 2024 19:00:02 +0200
Message-ID: <20240103170002.1793197-1-enachman@marvell.com>
Series: arm64: mm: Fix SOCs with DDR starting above zero
Commit Message
Elad Nachman
Jan. 3, 2024, 5 p.m. UTC
From: Elad Nachman <enachman@marvell.com>

Some SOCs, like the Marvell AC5/X/IM, have a combination of DDR starting
at 0x2_0000_0000 coupled with DMA controllers limited to 31 and 32 bits
of addressing. ZONE_DMA and ZONE_DMA32 must be arranged accordingly on
these SOCs so that swiotlb and coherent DMA allocation work properly.

Change the initialization so that the device tree DMA zone bits are taken
as a function of the offset from the DRAM start, and, when calculating
the maximal zone physical RAM address for physical DDR starting above
32 bits, combine the physical address start with the zone mask passed as
parameter.

This creates the proper zone split for these SOCs:
0..2GB for ZONE_DMA
2GB..4GB for ZONE_DMA32
4GB..8GB for ZONE_NORMAL

Signed-off-by: Elad Nachman <enachman@marvell.com>
---
 arch/arm64/mm/init.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)
Comments
On Wed, Jan 03, 2024 at 07:00:02PM +0200, Elad Nachman wrote:
> From: Elad Nachman <enachman@marvell.com>
>
> Some SOCs, like the Marvell AC5/X/IM, have a combination
> of DDR starting at 0x2_0000_0000 coupled with DMA controllers
> limited to 31 and 32 bit of addressing.
> This requires to properly arrange ZONE_DMA and ZONE_DMA32 for
> these SOCs, so swiotlb and coherent DMA allocation would work
> properly.
> Change initialization so device tree dma zone bits are taken as
> function of offset from DRAM start, and when calculating the
> maximal zone physical RAM address for physical DDR starting above
> 32-bit, combine the physical address start plus the zone mask
> passed as parameter.
> This creates the proper zone splitting for these SOCs:
> 0..2GB for ZONE_DMA
> 2GB..4GB for ZONE_DMA32
> 4GB..8GB for ZONE_NORMAL
>
> Signed-off-by: Elad Nachman <enachman@marvell.com>
> ---
>  arch/arm64/mm/init.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 74c1db8ce271..8288c778916e 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -115,20 +115,21 @@ static void __init arch_reserve_crashkernel(void)
>
>  /*
>   * Return the maximum physical address for a zone accessible by the given bits
> - * limit. If DRAM starts above 32-bit, expand the zone to the maximum
> - * available memory, otherwise cap it at 32-bit.
> + * limit. If DRAM starts above 32-bit, expand the zone to the available memory
> + * start limited by the zone bits mask, otherwise cap it at 32-bit.
>   */
>  static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
>  {
>  	phys_addr_t zone_mask = DMA_BIT_MASK(zone_bits);
>  	phys_addr_t phys_start = memblock_start_of_DRAM();
> +	phys_addr_t phys_end = memblock_end_of_DRAM();
>
>  	if (phys_start > U32_MAX)
> -		zone_mask = PHYS_ADDR_MAX;
> +		zone_mask = phys_start | zone_mask;
>  	else if (phys_start > zone_mask)
>  		zone_mask = U32_MAX;
>
> -	return min(zone_mask, memblock_end_of_DRAM() - 1) + 1;
> +	return min(zone_mask, phys_end - 1) + 1;
>  }
>
>  static void __init zone_sizes_init(void)
> @@ -140,7 +141,16 @@ static void __init zone_sizes_init(void)
>
>  #ifdef CONFIG_ZONE_DMA
>  	acpi_zone_dma_bits = fls64(acpi_iort_dma_get_max_cpu_address());
> -	dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL));
> +	/*
> +	 * When calculating the dma zone bits from the device tree, subtract
> +	 * the DRAM start address, in case it does not start from address
> +	 * zero. This way. we pass only the zone size related bits to
> +	 * max_zone_phys(), which will add them to the base of the DRAM.
> +	 * This prevents miscalculations on arm64 SOCs which combines
> +	 * DDR starting above 4GB with memory controllers limited to
> +	 * 32-bits or less:
> +	 */
> +	dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL) - memblock_start_of_DRAM());
>  	zone_dma_bits = min3(32U, dt_zone_dma_bits, acpi_zone_dma_bits);
>  	arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
>  	max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);

Hmm, I'm a bit worried this could regress other platforms since you seem
to be assuming that DMA address 0 corresponds to the physical start of
DRAM. What if that isn't the case?

In any case, we should try hard to avoid different behaviour between DT
and ACPI here.

Will
On 1/3/24 09:45, Will Deacon wrote:
> On Wed, Jan 03, 2024 at 07:00:02PM +0200, Elad Nachman wrote:
>> From: Elad Nachman <enachman@marvell.com>
>>
>> Some SOCs, like the Marvell AC5/X/IM, have a combination
>> of DDR starting at 0x2_0000_0000 coupled with DMA controllers
>> limited to 31 and 32 bit of addressing.
>> [...]
>
> Hmm, I'm a bit worried this could regress other platforms since you seem
> to be assuming that DMA address 0 corresponds to the physical start of
> DRAM. What if that isn't the case?

All of our most recent Set-top-box SoCs map DRAM starting at PA
0x4000_0000 FWIW. We have the following memory maps:

1. DRAM mapped at PA 0x0000_0000
2. DRAM mapped at PA 0x4000_0000 with another memory controller starting
   at 0x3_0000_0000
3. DRAM mapped at PA 0x4000_0000 with a single memory controller.

Here is the before/after diff with debugging prints introduced to print
start, end, min, mask and the dt_zone_dma_bits value:

Memory map 1. with 2GB -> no change in output

Memory map 2. with 2+2GB -> no change in output

Memory map 3. with 4GB:

@@ -39,7 +39,7 @@
 [    0.000000] OF: reserved mem: 0x00000000fdfff000..0x00000000fdffffff (4 KiB) nomap non-reusable NWMBOX
 [    0.000000] OF: reserved mem: 0x00000000fe000000..0x00000000ffffffff (32768 KiB) nomap non-reusable SRR@fe000000
 [    0.000000] max_zone_phys: start: 0x0000000040000000, end: 0x0000000140000000, min: 0x0000000100000000, mask: 0x00000000ffffffff
-[    0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000021
+[    0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000020
 [    0.000000] max_zone_phys: start: 0x0000000040000000, end: 0x0000000140000000, min: 0x0000000100000000, mask: 0x00000000ffffffff
 [    0.000000] Zone ranges:
 [    0.000000]   DMA      [mem 0x0000000040000000-0x00000000ffffffff]
@@ -88,243 +88,243 @@

Memory map 3. with 2GB:

@@ -39,11 +39,11 @@
 [    0.000000] OF: reserved mem: 0x00000000bdfff000..0x00000000bdffffff (4 KiB) nomap non-reusable NWMBOX
 [    0.000000] OF: reserved mem: 0x00000000be000000..0x00000000bfffffff (32768 KiB) nomap non-reusable SRR@be000000
 [    0.000000] max_zone_phys: start: 0x0000000040000000, end: 0x00000000c0000000, min: 0x00000000c0000000, mask: 0x00000000ffffffff
-[    0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000020
-[    0.000000] max_zone_phys: start: 0x0000000040000000, end: 0x00000000c0000000, min: 0x00000000c0000000, mask: 0x00000000ffffffff
+[    0.000000] zone_sizes_init: dt_zone_dma_bits: 0x0000001f
+[    0.000000] max_zone_phys: start: 0x0000000040000000, end: 0x00000000c0000000, min: 0x0000000080000000, mask: 0x000000007fffffff
 [    0.000000] Zone ranges:
-[    0.000000]   DMA      [mem 0x0000000040000000-0x00000000bfffffff]
-[    0.000000]   DMA32    empty
+[    0.000000]   DMA      [mem 0x0000000040000000-0x000000007fffffff]
+[    0.000000]   DMA32    [mem 0x0000000080000000-0x00000000bfffffff]
 [    0.000000]   Normal   empty
 [    0.000000] Movable zone start for each node
 [    0.000000] Early memory node ranges
@@ -54,7 +54,7 @@
Hi Elad,

On Wed, Jan 03 2024, Elad Nachman wrote:
> From: Elad Nachman <enachman@marvell.com>
>
> Some SOCs, like the Marvell AC5/X/IM, have a combination
> of DDR starting at 0x2_0000_0000 coupled with DMA controllers
> limited to 31 and 32 bit of addressing.
> [...]
>
> Signed-off-by: Elad Nachman <enachman@marvell.com>

For an alternative approach see
https://lore.kernel.org/all/cover.1703683642.git.baruch@tkos.co.il/

baruch
> -----Original Message-----
> From: Florian Fainelli <florian.fainelli@broadcom.com>
> Sent: Wednesday, January 3, 2024 8:32 PM
> To: Will Deacon <will@kernel.org>; Elad Nachman <enachman@marvell.com>
> Cc: catalin.marinas@arm.com; thunder.leizhen@huawei.com;
> bhe@redhat.com; akpm@linux-foundation.org; yajun.deng@linux.dev;
> chris.zjh@huawei.com; linux-kernel@vger.kernel.org;
> linux-arm-kernel@lists.infradead.org
> Subject: [EXT] Re: [PATCH] arm64: mm: Fix SOCs with DDR starting above zero
>
> On 1/3/24 09:45, Will Deacon wrote:
> > On Wed, Jan 03, 2024 at 07:00:02PM +0200, Elad Nachman wrote:
> > [...]
> >
> > Hmm, I'm a bit worried this could regress other platforms since you seem
> > to be assuming that DMA address 0 corresponds to the physical start of
> > DRAM. What if that isn't the case?
>
> All of our most recent Set-top-box SoCs map DRAM starting at PA
> 0x4000_0000 FWIW. We have the following memory maps:

That is below 4GB

> 1. DRAM mapped at PA 0x0000_0000
> 2. DRAM mapped at PA 0x4000_0000 with another memory controller starting
>    at 0x3_0000_0000
> 3. DRAM mapped at PA 0x4000_0000 with a single memory controller.
>
> Here is the before/after diff with debugging prints introduced to print
> start, end, min, mask and the dt_zone_dma_bits value:
>
> Memory map 1. with 2GB -> no change in output
>
> Memory map 2. with 2+2GB -> no change in output
>
> Memory map 3. with 4GB:
> [...]
>
> Memory map 3. with 2GB:
> [...]
> --
> Florian

Do you have an example where this works when DDR memory starts above 4GB?
The code in max_zone_phys() has separate if clauses for memory starting
above 4GB, and for memory starting above zone mask but below 4GB...

Elad.
On 1/4/24 04:38, Elad Nachman wrote:
>> -----Original Message-----
>> From: Florian Fainelli <florian.fainelli@broadcom.com>
>> [...]
>>
>> On 1/3/24 09:45, Will Deacon wrote:
>>> Hmm, I'm a bit worried this could regress other platforms since you seem
>>> to be assuming that DMA address 0 corresponds to the physical start of
>>> DRAM. What if that isn't the case?
>>
>> All of our most recent Set-top-box SoCs map DRAM starting at PA
>> 0x4000_0000 FWIW. We have the following memory maps:
>
> That is below 4GB

Right, I was just making sure that there would be no regressions with
non-zero base addresses for DRAM, not that there should be, but getting
Tested-by tags for such changes is always a good thing IMHO.

>> 1. DRAM mapped at PA 0x0000_0000
>> 2. DRAM mapped at PA 0x4000_0000 with another memory controller starting
>>    at 0x3_0000_0000
>> 3. DRAM mapped at PA 0x4000_0000 with a single memory controller.
>>
>> Here is the before/after diff with debugging prints introduced to print
>> start, end, min, mask and the dt_zone_dma_bits value:
>> [...]

> Do you have an example where this works when DDR memory starts above 4GB?
> The code in max_zone_phys() has separate if clauses for memory starting
> above 4GB, and for memory starting above zone mask but below 4GB...

I am afraid I don't have such a system handy. The one with 2+2GB with the
second 2GB starting at 12GB might be a tad difficult to start without the
first 2GB due to a number of critical reserved memory regions being there.
> -----Original Message-----
> From: Florian Fainelli <florian.fainelli@broadcom.com>
> Sent: Thursday, January 4, 2024 8:29 PM
> To: Elad Nachman <enachman@marvell.com>; Will Deacon <will@kernel.org>
> Cc: catalin.marinas@arm.com; thunder.leizhen@huawei.com; bhe@redhat.com;
> akpm@linux-foundation.org; yajun.deng@linux.dev; chris.zjh@huawei.com;
> linux-kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org
> Subject: Re: [EXT] Re: [PATCH] arm64: mm: Fix SOCs with DDR starting above zero
>
> On 1/4/24 04:38, Elad Nachman wrote:
> >> -----Original Message-----
> >> From: Florian Fainelli <florian.fainelli@broadcom.com>
> >> Sent: Wednesday, January 3, 2024 8:32 PM
> >> To: Will Deacon <will@kernel.org>; Elad Nachman <enachman@marvell.com>
> >> Cc: catalin.marinas@arm.com; thunder.leizhen@huawei.com; bhe@redhat.com;
> >> akpm@linux-foundation.org; yajun.deng@linux.dev; chris.zjh@huawei.com;
> >> linux-kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org
> >> Subject: [EXT] Re: [PATCH] arm64: mm: Fix SOCs with DDR starting above zero
> >>
> >> External Email
> >>
> >> ----------------------------------------------------------------------
> >> On 1/3/24 09:45, Will Deacon wrote:
> >>> On Wed, Jan 03, 2024 at 07:00:02PM +0200, Elad Nachman wrote:
> >>>> From: Elad Nachman <enachman@marvell.com>
> >>>>
> >>>> Some SOCs, like the Marvell AC5/X/IM, have a combination
> >>>> of DDR starting at 0x2_0000_0000 coupled with DMA controllers
> >>>> limited to 31 and 32 bit of addressing.
> >>>> This requires to properly arrange ZONE_DMA and ZONE_DMA32 for
> >>>> these SOCs, so swiotlb and coherent DMA allocation would work
> >>>> properly.
> >>>> Change initialization so device tree dma zone bits are taken as
> >>>> function of offset from DRAM start, and when calculating the
> >>>> maximal zone physical RAM address for physical DDR starting above
> >>>> 32-bit, combine the physical address start plus the zone mask
> >>>> passed as parameter.
> >>>> This creates the proper zone splitting for these SOCs:
> >>>> 0..2GB for ZONE_DMA
> >>>> 2GB..4GB for ZONE_DMA32
> >>>> 4GB..8GB for ZONE_NORMAL
> >>>>
> >>>> Signed-off-by: Elad Nachman <enachman@marvell.com>
> >>>> ---
> >>>>  arch/arm64/mm/init.c | 20 +++++++++++++++-----
> >>>>  1 file changed, 15 insertions(+), 5 deletions(-)
> >>>>
> >>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> >>>> index 74c1db8ce271..8288c778916e 100644
> >>>> --- a/arch/arm64/mm/init.c
> >>>> +++ b/arch/arm64/mm/init.c
> >>>> @@ -115,20 +115,21 @@ static void __init arch_reserve_crashkernel(void)
> >>>>
> >>>>  /*
> >>>>   * Return the maximum physical address for a zone accessible by the given bits
> >>>> - * limit. If DRAM starts above 32-bit, expand the zone to the maximum
> >>>> - * available memory, otherwise cap it at 32-bit.
> >>>> + * limit. If DRAM starts above 32-bit, expand the zone to the available memory
> >>>> + * start limited by the zone bits mask, otherwise cap it at 32-bit.
> >>>>   */
> >>>>  static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
> >>>>  {
> >>>>  	phys_addr_t zone_mask = DMA_BIT_MASK(zone_bits);
> >>>>  	phys_addr_t phys_start = memblock_start_of_DRAM();
> >>>> +	phys_addr_t phys_end = memblock_end_of_DRAM();
> >>>>
> >>>>  	if (phys_start > U32_MAX)
> >>>> -		zone_mask = PHYS_ADDR_MAX;
> >>>> +		zone_mask = phys_start | zone_mask;
> >>>>  	else if (phys_start > zone_mask)
> >>>>  		zone_mask = U32_MAX;
> >>>>
> >>>> -	return min(zone_mask, memblock_end_of_DRAM() - 1) + 1;
> >>>> +	return min(zone_mask, phys_end - 1) + 1;
> >>>>  }
> >>>>
> >>>>  static void __init zone_sizes_init(void)
> >>>> @@ -140,7 +141,16 @@ static void __init zone_sizes_init(void)
> >>>>
> >>>>  #ifdef CONFIG_ZONE_DMA
> >>>>  	acpi_zone_dma_bits = fls64(acpi_iort_dma_get_max_cpu_address());
> >>>> -	dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL));
> >>>> +	/*
> >>>> +	 * When calculating the dma zone bits from the device tree, subtract
> >>>> +	 * the DRAM start address, in case it does not start from address
> >>>> +	 * zero. This way, we pass only the zone size related bits to
> >>>> +	 * max_zone_phys(), which will add them to the base of the DRAM.
> >>>> +	 * This prevents miscalculations on arm64 SOCs which combine
> >>>> +	 * DDR starting above 4GB with memory controllers limited to
> >>>> +	 * 32-bits or less:
> >>>> +	 */
> >>>> +	dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL) - memblock_start_of_DRAM());
> >>>>  	zone_dma_bits = min3(32U, dt_zone_dma_bits, acpi_zone_dma_bits);
> >>>>  	arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
> >>>>  	max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);
> >>>
> >>> Hmm, I'm a bit worried this could regress other platforms since you seem
> >>> to be assuming that DMA address 0 corresponds to the physical start of
> >>> DRAM. What if that isn't the case?
> >>
> >> All of our most recent Set-top-box SoCs map DRAM starting at PA
> >> 0x4000_0000 FWIW. We have the following memory maps:
> >
> > That is below 4GB
>
> Right, I was just making sure that there would be no regressions with
> non-zero base addresses for DRAM, not that there should be, but getting
> Tested-by tags for such changes is always a good thing IMHO.

I do not have such board nor access to one.
Perhaps someone on your side can test this patch on the SOC which has DDR
starting at 0x4000_0000 for regression? Probably a quick one, boot + DMA
sanity test.

Thanks,

> >
> >> 1. DRAM mapped at PA 0x0000_0000
> >> 2. DRAM mapped at PA 0x4000_0000 with another memory controller starting
> >>    at 0x3_0000_0000
> >> 3. DRAM mapped at PA 0x4000_0000 with a single memory controller.
> >>
> >> Here is the before/after diff with debugging prints introduced to print
> >> start, end, min, mask and the dt_zone_dma_bits value:
> >>
> >> Memory map 1. with 2GB -> no change in output
> >>
> >> Memory map 2. with 2+2GB -> no change in output
> >>
> >> Memory map 3. with 4GB:
> >>
> >> @@ -39,7 +39,7 @@
> >>  [    0.000000] OF: reserved mem: 0x00000000fdfff000..0x00000000fdffffff (4 KiB) nomap non-reusable NWMBOX
> >>  [    0.000000] OF: reserved mem: 0x00000000fe000000..0x00000000ffffffff (32768 KiB) nomap non-reusable SRR@fe000000
> >>  [    0.000000] max_zone_phys: start: 0x0000000040000000, end: 0x0000000140000000, min: 0x0000000100000000, mask: 0x00000000ffffffff
> >> -[    0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000021
> >> +[    0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000020
> >>  [    0.000000] max_zone_phys: start: 0x0000000040000000, end: 0x0000000140000000, min: 0x0000000100000000, mask: 0x00000000ffffffff
> >>  [    0.000000] Zone ranges:
> >>  [    0.000000]   DMA      [mem 0x0000000040000000-0x00000000ffffffff]
> >> @@ -88,243 +88,243 @@
> >>
> >> Memory map 3. with 2GB:
> >>
> >> @@ -39,11 +39,11 @@
> >>  [    0.000000] OF: reserved mem: 0x00000000bdfff000..0x00000000bdffffff (4 KiB) nomap non-reusable NWMBOX
> >>  [    0.000000] OF: reserved mem: 0x00000000be000000..0x00000000bfffffff (32768 KiB) nomap non-reusable SRR@be000000
> >>  [    0.000000] max_zone_phys: start: 0x0000000040000000, end: 0x00000000c0000000, min: 0x00000000c0000000, mask: 0x00000000ffffffff
> >> -[    0.000000] zone_sizes_init: dt_zone_dma_bits: 0x00000020
> >> -[    0.000000] max_zone_phys: start: 0x0000000040000000, end: 0x00000000c0000000, min: 0x00000000c0000000, mask: 0x00000000ffffffff
> >> +[    0.000000] zone_sizes_init: dt_zone_dma_bits: 0x0000001f
> >> +[    0.000000] max_zone_phys: start: 0x0000000040000000, end: 0x00000000c0000000, min: 0x0000000080000000, mask: 0x000000007fffffff
> >>  [    0.000000] Zone ranges:
> >> -[    0.000000]   DMA      [mem 0x0000000040000000-0x00000000bfffffff]
> >> -[    0.000000]   DMA32    empty
> >> +[    0.000000]   DMA      [mem 0x0000000040000000-0x000000007fffffff]
> >> +[    0.000000]   DMA32    [mem 0x0000000080000000-0x00000000bfffffff]
> >>  [    0.000000]   Normal   empty
> >>  [    0.000000] Movable zone start for each node
> >>  [    0.000000] Early memory node ranges
> >> @@ -54,7 +54,7 @@
> >> --
> >> Florian
> >
> > Do you have an example where this works when DDR memory starts above 4GB?
> > The code in max_zone_phys() has separate if clauses for memory starting above 4GB,
> > and for memory starting above zone mask but below 4GB...
>
> I am afraid I don't have such a system handy. The one with 2+2GB with
> the second 2GB starting at 12GB might be a tad difficult to start
> without the first 2GB due to a number of critical reserved memory
> regions being there.
> --
> Florian

Elad.
On 1/4/24 10:34, Elad Nachman wrote:
>> -----Original Message-----
>> From: Florian Fainelli <florian.fainelli@broadcom.com>
>> Sent: Thursday, January 4, 2024 8:29 PM
>> To: Elad Nachman <enachman@marvell.com>; Will Deacon <will@kernel.org>
>> Cc: catalin.marinas@arm.com; thunder.leizhen@huawei.com; bhe@redhat.com;
>> akpm@linux-foundation.org; yajun.deng@linux.dev; chris.zjh@huawei.com;
>> linux-kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org
>> Subject: Re: [EXT] Re: [PATCH] arm64: mm: Fix SOCs with DDR starting above zero
>>
>> On 1/4/24 04:38, Elad Nachman wrote:
>>>> -----Original Message-----
>>>> From: Florian Fainelli <florian.fainelli@broadcom.com>
>>>> Sent: Wednesday, January 3, 2024 8:32 PM
>>>> To: Will Deacon <will@kernel.org>; Elad Nachman <enachman@marvell.com>
>>>> Cc: catalin.marinas@arm.com; thunder.leizhen@huawei.com; bhe@redhat.com;
>>>> akpm@linux-foundation.org; yajun.deng@linux.dev; chris.zjh@huawei.com;
>>>> linux-kernel@vger.kernel.org; linux-arm-kernel@lists.infradead.org
>>>> Subject: [EXT] Re: [PATCH] arm64: mm: Fix SOCs with DDR starting above zero
>>>>
>>>> External Email
>>>>
>>>> ----------------------------------------------------------------------
>>>> On 1/3/24 09:45, Will Deacon wrote:
>>>>> On Wed, Jan 03, 2024 at 07:00:02PM +0200, Elad Nachman wrote:
>>>>>> From: Elad Nachman <enachman@marvell.com>
>>>>>>
>>>>>> Some SOCs, like the Marvell AC5/X/IM, have a combination
>>>>>> of DDR starting at 0x2_0000_0000 coupled with DMA controllers
>>>>>> limited to 31 and 32 bit of addressing.
>>>>>> This requires to properly arrange ZONE_DMA and ZONE_DMA32 for
>>>>>> these SOCs, so swiotlb and coherent DMA allocation would work
>>>>>> properly.
>>>>>> Change initialization so device tree dma zone bits are taken as
>>>>>> function of offset from DRAM start, and when calculating the
>>>>>> maximal zone physical RAM address for physical DDR starting above
>>>>>> 32-bit, combine the physical address start plus the zone mask
>>>>>> passed as parameter.
>>>>>> This creates the proper zone splitting for these SOCs:
>>>>>> 0..2GB for ZONE_DMA
>>>>>> 2GB..4GB for ZONE_DMA32
>>>>>> 4GB..8GB for ZONE_NORMAL
>>>>>>
>>>>>> Signed-off-by: Elad Nachman <enachman@marvell.com>
>>>>>> ---
>>>>>>  arch/arm64/mm/init.c | 20 +++++++++++++++-----
>>>>>>  1 file changed, 15 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>>>>>> index 74c1db8ce271..8288c778916e 100644
>>>>>> --- a/arch/arm64/mm/init.c
>>>>>> +++ b/arch/arm64/mm/init.c
>>>>>> @@ -115,20 +115,21 @@ static void __init arch_reserve_crashkernel(void)
>>>>>>
>>>>>>  /*
>>>>>>   * Return the maximum physical address for a zone accessible by the given bits
>>>>>> - * limit. If DRAM starts above 32-bit, expand the zone to the maximum
>>>>>> - * available memory, otherwise cap it at 32-bit.
>>>>>> + * limit. If DRAM starts above 32-bit, expand the zone to the available memory
>>>>>> + * start limited by the zone bits mask, otherwise cap it at 32-bit.
>>>>>>   */
>>>>>>  static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
>>>>>>  {
>>>>>>  	phys_addr_t zone_mask = DMA_BIT_MASK(zone_bits);
>>>>>>  	phys_addr_t phys_start = memblock_start_of_DRAM();
>>>>>> +	phys_addr_t phys_end = memblock_end_of_DRAM();
>>>>>>
>>>>>>  	if (phys_start > U32_MAX)
>>>>>> -		zone_mask = PHYS_ADDR_MAX;
>>>>>> +		zone_mask = phys_start | zone_mask;
>>>>>>  	else if (phys_start > zone_mask)
>>>>>>  		zone_mask = U32_MAX;
>>>>>>
>>>>>> -	return min(zone_mask, memblock_end_of_DRAM() - 1) + 1;
>>>>>> +	return min(zone_mask, phys_end - 1) + 1;
>>>>>>  }
>>>>>>
>>>>>>  static void __init zone_sizes_init(void)
>>>>>> @@ -140,7 +141,16 @@ static void __init zone_sizes_init(void)
>>>>>>
>>>>>>  #ifdef CONFIG_ZONE_DMA
>>>>>>  	acpi_zone_dma_bits = fls64(acpi_iort_dma_get_max_cpu_address());
>>>>>> -	dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL));
>>>>>> +	/*
>>>>>> +	 * When calculating the dma zone bits from the device tree, subtract
>>>>>> +	 * the DRAM start address, in case it does not start from address
>>>>>> +	 * zero. This way, we pass only the zone size related bits to
>>>>>> +	 * max_zone_phys(), which will add them to the base of the DRAM.
>>>>>> +	 * This prevents miscalculations on arm64 SOCs which combine
>>>>>> +	 * DDR starting above 4GB with memory controllers limited to
>>>>>> +	 * 32-bits or less:
>>>>>> +	 */
>>>>>> +	dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL) - memblock_start_of_DRAM());
>>>>>>  	zone_dma_bits = min3(32U, dt_zone_dma_bits, acpi_zone_dma_bits);
>>>>>>  	arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
>>>>>>  	max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);
>>>>>
>>>>> Hmm, I'm a bit worried this could regress other platforms since you seem
>>>>> to be assuming that DMA address 0 corresponds to the physical start of
>>>>> DRAM. What if that isn't the case?
>>>>
>>>> All of our most recent Set-top-box SoCs map DRAM starting at PA
>>>> 0x4000_0000 FWIW. We have the following memory maps:
>>>
>>> That is below 4GB
>>
>> Right, I was just making sure that there would be no regressions with
>> non-zero base addresses for DRAM, not that there should be, but getting
>> Tested-by tags for such changes is always a good thing IMHO.
>
> I do not have such board nor access to one.
> Perhaps someone on your side can test this patch on the SOC which has DDR starting at 0x4000_0000 for regression?

I suppose I was not clear with the intent of my email. It was not just
comparing the dmesg, but also ran our usual test suite which exercises
Ethernet, USB, NAND, eMMC and did not get any regressions from that.

This was done on all of the boards/systems that were described in my
email, including the two systems with DDR starting at 0x4000_0000.
On Wed, Jan 03, 2024 at 07:00:02PM +0200, Elad Nachman wrote:
> From: Elad Nachman <enachman@marvell.com>
>
> Some SOCs, like the Marvell AC5/X/IM, have a combination
> of DDR starting at 0x2_0000_0000 coupled with DMA controllers
> limited to 31 and 32 bit of addressing.
> This requires to properly arrange ZONE_DMA and ZONE_DMA32 for
> these SOCs, so swiotlb and coherent DMA allocation would work
> properly.
> Change initialization so device tree dma zone bits are taken as
> function of offset from DRAM start, and when calculating the
> maximal zone physical RAM address for physical DDR starting above
> 32-bit, combine the physical address start plus the zone mask
> passed as parameter.
> This creates the proper zone splitting for these SOCs:
> 0..2GB for ZONE_DMA
> 2GB..4GB for ZONE_DMA32
> 4GB..8GB for ZONE_NORMAL

Please see this discussion:

https://lore.kernel.org/all/ZU0QEL9ByWNYVki1@arm.com/

and follow-up patches from Baruch, though I haven't reviewed them yet:

https://lore.kernel.org/all/fae5b1180161a7d8cd626a96f5df80b0a0796b8b.1703683642.git.baruch@tkos.co.il/

The problem is that the core code pretty much assumes that DRAM starts
from 0. No matter how you massage the zones in the arm64 kernel for your
case, memblock_start_of_DRAM() + (2 << zone_dma_bits) won't be a power
of two and therefore zone_dma_bits in the core code cannot describe what
you need. I can see Baruch added a zone_dma_off assuming it's the same
for all DMA-capable devices on that SoC (well, those with a coherent
mask smaller than 64-bit). I need to think a bit more about this.

Anyway, we first need to address the mask/bits comparisons in the core
code, maybe changing bits to a physical limit instead and taking the
device DMA offset into account. After that we can look at how to
correctly set up the DMA zones on arm64.
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 74c1db8ce271..8288c778916e 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -115,20 +115,21 @@ static void __init arch_reserve_crashkernel(void)
 
 /*
  * Return the maximum physical address for a zone accessible by the given bits
- * limit. If DRAM starts above 32-bit, expand the zone to the maximum
- * available memory, otherwise cap it at 32-bit.
+ * limit. If DRAM starts above 32-bit, expand the zone to the available memory
+ * start limited by the zone bits mask, otherwise cap it at 32-bit.
  */
 static phys_addr_t __init max_zone_phys(unsigned int zone_bits)
 {
 	phys_addr_t zone_mask = DMA_BIT_MASK(zone_bits);
 	phys_addr_t phys_start = memblock_start_of_DRAM();
+	phys_addr_t phys_end = memblock_end_of_DRAM();
 
 	if (phys_start > U32_MAX)
-		zone_mask = PHYS_ADDR_MAX;
+		zone_mask = phys_start | zone_mask;
 	else if (phys_start > zone_mask)
 		zone_mask = U32_MAX;
 
-	return min(zone_mask, memblock_end_of_DRAM() - 1) + 1;
+	return min(zone_mask, phys_end - 1) + 1;
 }
 
 static void __init zone_sizes_init(void)
@@ -140,7 +141,16 @@ static void __init zone_sizes_init(void)
 
 #ifdef CONFIG_ZONE_DMA
 	acpi_zone_dma_bits = fls64(acpi_iort_dma_get_max_cpu_address());
-	dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL));
+	/*
+	 * When calculating the dma zone bits from the device tree, subtract
+	 * the DRAM start address, in case it does not start from address
+	 * zero. This way, we pass only the zone size related bits to
+	 * max_zone_phys(), which will add them to the base of the DRAM.
+	 * This prevents miscalculations on arm64 SOCs which combine
+	 * DDR starting above 4GB with memory controllers limited to
+	 * 32-bits or less:
+	 */
+	dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL) - memblock_start_of_DRAM());
 	zone_dma_bits = min3(32U, dt_zone_dma_bits, acpi_zone_dma_bits);
 	arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);
 	max_zone_pfns[ZONE_DMA] = PFN_DOWN(arm64_dma_phys_limit);