Message ID | 20240202135030.42265-1-csd4492@csd.uoc.gr |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-49926-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7301:9bc1:b0:106:209c:c626 with SMTP id op1csp453274dyc; Fri, 2 Feb 2024 06:12:21 -0800 (PST) X-Google-Smtp-Source: AGHT+IHl1kGbZb1ZyIakMpbtDYlPjuNQpJd4moav1p4VFTjZi0i04JXyN51xTWC7sxnQiW0uWXAy X-Received: by 2002:a17:906:4ca:b0:a37:2696:b3e with SMTP id g10-20020a17090604ca00b00a3726960b3emr482137eja.59.1706883141196; Fri, 02 Feb 2024 06:12:21 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1706883141; cv=pass; d=google.com; s=arc-20160816; b=rMaKwcuITcWEo0WLHopLmDVC7/Ep/90CLdYqSQlwjg7MGtGXNYfrYMYqu7ex5TZJls pI3tbdpJrxwjMhvPFtUsIMklfx2lTYfCWr4fXQMn5Iks3Tb14TGmg1ceQSRIwYWDrdm2 3Tlqkhjk/EyX6cf/z4tvQBuyFya7Wi3JDdRGFjDs11T2GJ9qLaMR2ML2Tp/AzZBxnkh6 wQ0F0Yzsm7m0IGeUUAN7q5ft7UMsIj1B0X32bcNUBlZL5GlQvJ8fzSkLF/0qd4sGO5ka hF0wIZIaJx0sg0qBu+vSjeFUPCNFeOXkM0b7f/htSnwTZoQYveyh/laow8BkeYSJjAdF a0Cg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:message-id:date:subject:cc:to :from:dkim-signature; bh=8xiDaKPe3EcE5yGO2LEkBF3Kug+H9effrggz/KsrocE=; fh=HD5KnK7o/3HOglMeeQD+sgU/EEPDR7kCFF6OoJtgSkU=; b=lo5/XSIWTkID1yYUTKT3CYtTmorr1wdQ+aZEqULC3iQUU2X2li+kcHHgp/qD1Fz4Iu cLe6PX9QZTMkYuGM/IuErUfQvRedo4BPN9K15KDi++JvUDNBQHD8TmSXqNaZj8U3Qfo9 KRBjIvANoJoXyRrbT/xCVF1fgi6SwYlPhFQlC0l4904XGKMhwTJqyQAtowk1X6qGqgE6 gudce/ihb/6BbO4vH6f4w/BklWlG3o2KYA8pMayE+f6060d8XyKibr3pWLf1OMQeNKIi PEpDt5KFR6S1kcXV0FOgrLT3Xdpv7bzMg3Bch3xMvxlsFajH6lpwxqf6/HvKA96V0wzS +37A==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@csd.uoc.gr header.s=csd header.b=G7vOWtx0; arc=pass (i=1 spf=pass spfdomain=csd.uoc.gr dkim=pass dkdomain=csd.uoc.gr dmarc=pass fromdomain=csd.uoc.gr); spf=pass (google.com: domain of linux-kernel+bounces-49926-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-49926-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=csd.uoc.gr X-Forwarded-Encrypted: i=1; AJvYcCWuDRFiltawj/wYEUSTsQPfBxmqN6OELyPxqiJjhi/K0/2igmYSeONZ+cz1OOYiVnbKKQLK0IczKkCph0iAhLB+NsVjsQ== Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id j6-20020a170906474600b00a36c39234bdsi864181ejs.488.2024.02.02.06.12.21 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 02 Feb 2024 06:12:21 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-49926-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; dkim=pass header.i=@csd.uoc.gr header.s=csd header.b=G7vOWtx0; arc=pass (i=1 spf=pass spfdomain=csd.uoc.gr dkim=pass dkdomain=csd.uoc.gr dmarc=pass fromdomain=csd.uoc.gr); spf=pass (google.com: domain of linux-kernel+bounces-49926-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-49926-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=csd.uoc.gr Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 364191F2AFB1 for <ouuuleilei@gmail.com>; Fri, 2 Feb 2024 14:12:01 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id D3C41145328; Fri, 2 Feb 2024 14:11:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=csd.uoc.gr header.i=@csd.uoc.gr header.b="G7vOWtx0" Received: from ermis.csd.uoc.gr (ermis.csd.uoc.gr [147.52.203.66]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 040181420A5 for <linux-kernel@vger.kernel.org>; Fri, 2 Feb 2024 14:11:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=147.52.203.66 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706883094; cv=none; b=uhSF17/LYCl9eYD6p1vEEQHuKDlar12acQD9EC+sIOq48YB7X/Ok9K6loCunK2RLyARRXdcHgdlzLKfE0P1vQxVAwWOyTQltnMQjC+GQjXWrfmPc0oE8oTF6kjYH+cCrTjQnLezPY23iaT4nljnpGstKelUv87iHrFNOIDRvZyQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706883094; c=relaxed/simple; bh=0fLShIaYA46L2v+Ga5N4MTBmAYSvwPEcuI+RHi2CSIw=; h=From:To:Cc:Subject:Date:Message-Id:MIME-Version:Content-Type; b=V4VGjRpXsdTQ7DSmRskYLRdMOGKcm9Y5goKqa9Wrgv98ewevhSl16iKJoI5VWGWGjBHSxtUnwfPWoavfhXG/JNOaLr61ajcsVgGcM5RkWD/PeHCZN+1LVAhi79/nfL4nki2tYf5C6MoF/k5Weon9VOOceqzwUG6+OtXf+hNTjqM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=csd.uoc.gr; spf=pass smtp.mailfrom=csd.uoc.gr; dkim=pass (2048-bit key) header.d=csd.uoc.gr header.i=@csd.uoc.gr header.b=G7vOWtx0; arc=none smtp.client-ip=147.52.203.66 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=csd.uoc.gr Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=csd.uoc.gr Received: from cave.168.239.147 (ppp178059052035.access.hol.gr [178.59.52.35]) by ermis.csd.uoc.gr (Postfix) with ESMTPSA id 818193C07E6; Fri, 2 Feb 2024 15:50:30 +0200 (EET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=csd.uoc.gr; s=csd; t=1706881833; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=8xiDaKPe3EcE5yGO2LEkBF3Kug+H9effrggz/KsrocE=; b=G7vOWtx0lFu4MpEoOLGB2l0LVjYeOEEP8SLtdxbFCuJBUsoFkr5PapgIdpBgZ9HGKHdovC T9hiqPlubuPwxlRkDirmTTclG7A9VLAOONzptl7QzWzQ9dbWqtHXR96D1FAmmM1qlMmu83 EDdIJuzldAFuNP9eARfbbbaV4fk5oNai1Bb6l2WBZbaxS2s+hBADclXx0g1hSYlggOSwr7 Ly7lT+jew6Q6ym6RuEAmFHbcOdFXSjxy+ih4dRAKwAeDtWXAbRwmm+5/Sef6rJwUt5a4G1 6PsTMaT9EE5kGhAOaJvOYXUnUlVHi4mWS2HQksueYpcDuHaLq6A6+C/VgTcPgw== Authentication-Results: ORIGINATING; auth=pass smtp.auth=csd4492 smtp.mailfrom=csd4492@csd.uoc.gr From: Dimitris Vlachos <csd4492@csd.uoc.gr> To: csd4492@csd.uoc.gr, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org Cc: clameter@sgi.com, akpm@linux-foundation.org, rppt@kernel.org, arnd@arndb.de, paul.walmsley@sifive.com, palmer@dabbelt.com, alexghiti@rivosinc.com, mick@ics.forth.gr Subject: [PATCH] [RFC] sparsemem: warn on out-of-bounds initialization Date: Fri, 2 Feb 2024 15:50:30 +0200 Message-Id: <20240202135030.42265-1-csd4492@csd.uoc.gr> X-Mailer: git-send-email 2.39.2 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1789796696808240638 X-GMAIL-MSGID: 1789796696808240638 |
Series |
[RFC] sparsemem: warn on out-of-bounds initialization
|
|
Commit Message
Dimitris Vlachos
Feb. 2, 2024, 1:50 p.m. UTC
From: Dimitris Vlachos <dvlachos@ics.forth.gr>
Hello all
I am sending this email with regards to a bug that I discovered in the Sparse Memory Model configuration and more specifically, the Virtual Memory Map optimization. Moreover, I would like to inquire about possible ways of fixing it.
I work as a pre-graduate research assistant at ICS-FORTH in the Computer Architecture and VLSI Systems laboratory.
We were running some tests in our prototype hardware (RISC-V), where we noticed that the Kernel crashes early in the boot process with the following setup:
We are using the default Kconfig configurations (defconfig) that includes Sparse Memory + Virtual Memory Map.
The DRAM base address of our system is : 0x800000000000
A 3-level page table is used (Sv39).
When the vmemmap optimization is enabled the macro pfn_to_page() is called, which offsets the vmemmap with the pfn argument to acquire a struct page pointer.
As our DRAM starts at 0x800000000000, the initial pfn will be 0x800000000000 divided by PAGE_SIZE. The calculation result will be:
0xffffffcf00000000 (vmemmap start) + (0x800000000 (pfn) * 64 (sizeof(struct page))
This causes an overflow as the number is very large, the resulting address is 0x1c700000000 which is not a valid Sv39 address (all bits above bit 38 should be set) and does not belong to the kernel’s virtual address space.
The same will happen with all valid pfns as the memory is being initialized. When the Kernel will try to access a page using pfn_to_page, a page fault will occur crashing the system.
It should be noted that when the vmemmap is disabled the system boots normally.
However, I would like to emphasize another important detail. When the DRAM base is small enough to avoid an overflow, the virtual memory map mappings will be initialized out of bounds with regard to the vmemmap address space which is 4GiB in Sv39.
The system will not crash, but the address space used for this purpose will be outside of the designated range.
Everything mentioned above holds true when Sv48 is used as well. Given a high enough DRAM base the system will either crash or perform false initializations. Given the fact that virtual memory map is not only used by RISC-V systems, this could be an issue for other architectures as well.
The kernel already contains mminit_validate_memmodel_limits() that checks memory limits.
However, it only checks physical memory ranges. I added a few checks, provided that virtual memory map is enabled, to determine whether offset start and offset end fall inside the range of virtual memory map. Otherwise, a warning will be printed.
Finally, I would like to ask you for any information/advice that could help me fix/prevent the issues that I mentioned (if it’s possible) or recover from them at runtime by disabling the optimization and reverting back to the non-vmemmap mappings.
Thank you in advance.
Best regards,
Dimitris Vlachos
---
mm/sparse.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
Comments
Hi Dimitris, On Fri, Feb 2, 2024 at 2:50 PM Dimitris Vlachos <csd4492@csd.uoc.gr> wrote: > > From: Dimitris Vlachos <dvlachos@ics.forth.gr> > > Hello all > > I am sending this email with regards to a bug that I discovered in the Sparse Memory Model configuration and more specifically, the Virtual Memory Map optimization. Moreover, I would like to inquire about possible ways of fixing it. > > I work as a pre-graduate research assistant at ICS-FORTH in the Computer Architecture and VLSI Systems laboratory. > We were running some tests in our prototype hardware (RISC-V), where we noticed that the Kernel crashes early in the boot process with the following setup: > > We are using the default Kconfig configurations (defconfig) that includes Sparse Memory + Virtual Memory Map. > The DRAM base address of our system is : 0x800000000000 > A 3-level page table is used (Sv39). > > When the vmemmap optimization is enabled the macro pfn_to_page() is called, which offsets the vmemmap with the pfn argument to acquire a struct page pointer. > > As our DRAM starts at 0x800000000000, the initial pfn will be 0x800000000000 divided by PAGE_SIZE. The calculation result will be: > 0xffffffcf00000000 (vmemmap start) + (0x800000000 (pfn) * 64 (sizeof(struct page)) > > This causes an overflow as the number is very large, the resulting address is 0x1c700000000 which is not a valid Sv39 address (all bits above bit 38 should be set) and does not belong to the kernel’s virtual address space. > > The same will happen with all valid pfns as the memory is being initialized. When the Kernel will try to access a page using pfn_to_page, a page fault will occur crashing the system. > It should be noted that when the vmemmap is disabled the system boots normally. > > However, I would like to emphasize another important detail. When the DRAM base is small enough to avoid an overflow, the virtual memory map mappings will be initialized out of bounds with regard to the vmemmap address space which is 4GiB in Sv39. > The system will not crash, but the address space used for this purpose will be outside of the designated range. > > Everything mentioned above holds true when Sv48 is used as well. Given a high enough DRAM base the system will either crash or perform false initializations. Given the fact that virtual memory map is not only used by RISC-V systems, this could be an issue for other architectures as well. > > The kernel already contains mminit_validate_memmodel_limits() that checks memory limits. > However, it only checks physical memory ranges. I added a few checks, provided that virtual memory map is enabled, to determine whether offset start and offset end fall inside the range of virtual memory map. Otherwise, a warning will be printed. > > Finally, I would like to ask you for any information/advice that could help me fix/prevent the issues that I mentioned (if it’s possible) or recover from them at runtime by disabling the optimization and reverting back to the non-vmemmap mappings. > > Thank you in advance. > Best regards, > Dimitris Vlachos > --- > mm/sparse.c | 20 ++++++++++++++++++++ > 1 file changed, 20 insertions(+) > > diff --git a/mm/sparse.c b/mm/sparse.c > index 338cf946d..33b57758e 100644 > --- a/mm/sparse.c > +++ b/mm/sparse.c > @@ -149,6 +149,26 @@ static void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn, > WARN_ON_ONCE(1); > *end_pfn = max_sparsemem_pfn; > } > + > + /*check vmemmap limits*/ > + #ifdef CONFIG_SPARSEMEM_VMEMMAP > + > + unsigned long vmemmap_offset_start = (unsigned long) pfn_to_page(*start_pfn); > + unsigned long vmemmap_offset_end = (unsigned long) pfn_to_page(*end_pfn); > + > + if (vmemmap_offset_start < VMEMMAP_START) { > + mminit_dprintk(MMINIT_WARNING, "pfnvalidation", > + "Start of range %lu -> %lu exceeds bounds of SPARSEMEM virtual memory map address range %lu -> %lu\n", > + vmemmap_offset_start, vmemmap_offset_end, VMEMMAP_START,VMEMMAP_END); > + WARN_ON_ONCE(1); > + > + } else if (vmemmap_offset_end > VMEMMAP_END ) { > + mminit_dprintk(MMINIT_WARNING, "pfnvalidation", > + "End of range %lu -> %lu exceeds bounds of SPARSEMEM virtual memory map address range %lu -> %lu\n", > + vmemmap_offset_start, vmemmap_offset_end, VMEMMAP_START,VMEMMAP_END); > + WARN_ON_ONCE(1); > + } > + #endif > } > > /* > -- > 2.39.2 > Since struct pages are only available for memory, we could offset vmemmap so that VMEMMAP_START actually points to the first pfn?
Hi Dimitris, On Fri, Feb 2, 2024 at 2:59 PM Alexandre Ghiti <alexghiti@rivosinc.com> wrote: > > Hi Dimitris, > > On Fri, Feb 2, 2024 at 2:50 PM Dimitris Vlachos <csd4492@csd.uoc.gr> wrote: > > > > From: Dimitris Vlachos <dvlachos@ics.forth.gr> > > > > Hello all > > > > I am sending this email with regards to a bug that I discovered in the Sparse Memory Model configuration and more specifically, the Virtual Memory Map optimization. Moreover, I would like to inquire about possible ways of fixing it. > > > > I work as a pre-graduate research assistant at ICS-FORTH in the Computer Architecture and VLSI Systems laboratory. > > We were running some tests in our prototype hardware (RISC-V), where we noticed that the Kernel crashes early in the boot process with the following setup: > > > > We are using the default Kconfig configurations (defconfig) that includes Sparse Memory + Virtual Memory Map. > > The DRAM base address of our system is : 0x800000000000 > > A 3-level page table is used (Sv39). > > > > When the vmemmap optimization is enabled the macro pfn_to_page() is called, which offsets the vmemmap with the pfn argument to acquire a struct page pointer. > > > > As our DRAM starts at 0x800000000000, the initial pfn will be 0x800000000000 divided by PAGE_SIZE. The calculation result will be: > > 0xffffffcf00000000 (vmemmap start) + (0x800000000 (pfn) * 64 (sizeof(struct page)) > > > > This causes an overflow as the number is very large, the resulting address is 0x1c700000000 which is not a valid Sv39 address (all bits above bit 38 should be set) and does not belong to the kernel’s virtual address space. > > > > The same will happen with all valid pfns as the memory is being initialized. When the Kernel will try to access a page using pfn_to_page, a page fault will occur crashing the system. > > It should be noted that when the vmemmap is disabled the system boots normally. > > > > However, I would like to emphasize another important detail. When the DRAM base is small enough to avoid an overflow, the virtual memory map mappings will be initialized out of bounds with regard to the vmemmap address space which is 4GiB in Sv39. > > The system will not crash, but the address space used for this purpose will be outside of the designated range. > > > > Everything mentioned above holds true when Sv48 is used as well. Given a high enough DRAM base the system will either crash or perform false initializations. Given the fact that virtual memory map is not only used by RISC-V systems, this could be an issue for other architectures as well. > > > > The kernel already contains mminit_validate_memmodel_limits() that checks memory limits. > > However, it only checks physical memory ranges. I added a few checks, provided that virtual memory map is enabled, to determine whether offset start and offset end fall inside the range of virtual memory map. Otherwise, a warning will be printed. > > > > Finally, I would like to ask you for any information/advice that could help me fix/prevent the issues that I mentioned (if it’s possible) or recover from them at runtime by disabling the optimization and reverting back to the non-vmemmap mappings. > > > > Thank you in advance. > > Best regards, > > Dimitris Vlachos > > --- > > mm/sparse.c | 20 ++++++++++++++++++++ > > 1 file changed, 20 insertions(+) > > > > diff --git a/mm/sparse.c b/mm/sparse.c > > index 338cf946d..33b57758e 100644 > > --- a/mm/sparse.c > > +++ b/mm/sparse.c > > @@ -149,6 +149,26 @@ static void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn, > > WARN_ON_ONCE(1); > > *end_pfn = max_sparsemem_pfn; > > } > > + > > + /*check vmemmap limits*/ > > + #ifdef CONFIG_SPARSEMEM_VMEMMAP > > + > > + unsigned long vmemmap_offset_start = (unsigned long) pfn_to_page(*start_pfn); > > + unsigned long vmemmap_offset_end = (unsigned long) pfn_to_page(*end_pfn); > > + > > + if (vmemmap_offset_start < VMEMMAP_START) { > > + mminit_dprintk(MMINIT_WARNING, "pfnvalidation", > > + "Start of range %lu -> %lu exceeds bounds of SPARSEMEM virtual memory map address range %lu -> %lu\n", > > + vmemmap_offset_start, vmemmap_offset_end, VMEMMAP_START,VMEMMAP_END); > > + WARN_ON_ONCE(1); > > + > > + } else if (vmemmap_offset_end > VMEMMAP_END ) { > > + mminit_dprintk(MMINIT_WARNING, "pfnvalidation", > > + "End of range %lu -> %lu exceeds bounds of SPARSEMEM virtual memory map address range %lu -> %lu\n", > > + vmemmap_offset_start, vmemmap_offset_end, VMEMMAP_START,VMEMMAP_END); > > + WARN_ON_ONCE(1); > > + } > > + #endif > > } > > > > /* > > -- > > 2.39.2 > > > > Since struct pages are only available for memory, we could offset > vmemmap so that VMEMMAP_START actually points to the first pfn? > Hello Alexandre, > My first thought was to use a "better" offset for vmemmap as well.Since pfn is essential for pfn_to_page and page_to_pfn i think that maybe we could get a smaller number out of pfn to offset the vmemmap and ensure that it's bounds are respected for pfn_to_page operation. > If we use only a part of the pfn for the offset we could get the pfn from page_to_pfn by adding PAGE-VMEMMAP_START > to the DRAM base/PAGE_SIZE which is the first pfn. Let's keep the discussion here. IIUC you propose to change pfn_to_page()/page_to_pfn() but what I proposed was to directly offset vmemmap like this: #define vmemmap ((struct page *)VMEMMAP_START - phys_ram_base >> PAGE_SHIFT) So that the first page in vmemmap corresponds to the first page of physical memory, that avoids the burden of changing the conversion macros, WDYT? Thanks, Alex
Alexandre, Yes, you understood correctly, I indeed proposed to change pfn_to_page()/page_to_pfn() but your solution appears to solve the problem without risking any modifications to the conversion macros. In addition, your solution seems to be valid for any phys_ram_base/pfn value inside the limits of physical memory. However, I wanted to note that if the pfn is large enough, vmemmap will not be a valid SV39/48 address unless a pfn offset is applied to it. I can't think of a possible scenario where vmemmap would be used without an offset. I would like to know your opinion on that, does it concern you? Finally, do you want me to send a patch or will you do it? Dimitris On 2/18/2024 10:58 PM, Dimitris Vlachos wrote: > > > > > -------- Forwarded Message -------- > Subject: Re: [PATCH] [RFC] sparsemem: warn on out-of-bounds > initialization > Date: Tue, 6 Feb 2024 22:17:44 +0100 > From: Alexandre Ghiti <alexghiti@rivosinc.com> > To: Dimitris Vlachos <csd4492@csd.uoc.gr> > CC: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, > clameter@sgi.com, akpm@linux-foundation.org, rppt@kernel.org, > arnd@arndb.de, paul.walmsley@sifive.com, palmer@dabbelt.com, > mick@ics.forth.gr > > > > Hi Dimitris, > > On Fri, Feb 2, 2024 at 2:59 PM Alexandre Ghiti > <alexghiti@rivosinc.com> wrote: >> Hi Dimitris, >> >> On Fri, Feb 2, 2024 at 2:50 PM Dimitris Vlachos<csd4492@csd.uoc.gr> wrote: >>> From: Dimitris Vlachos<dvlachos@ics.forth.gr> >>> >>> Hello all >>> >>> I am sending this email with regards to a bug that I discovered in the Sparse Memory Model configuration and more specifically, the Virtual Memory Map optimization. Moreover, I would like to inquire about possible ways of fixing it. >>> >>> I work as a pre-graduate research assistant at ICS-FORTH in the Computer Architecture and VLSI Systems laboratory. >>> We were running some tests in our prototype hardware (RISC-V), where we noticed that the Kernel crashes early in the boot process with the following setup: >>> >>> We are using the default Kconfig configurations (defconfig) that includes Sparse Memory + Virtual Memory Map. >>> The DRAM base address of our system is : 0x800000000000 >>> A 3-level page table is used (Sv39). >>> >>> When the vmemmap optimization is enabled the macro pfn_to_page() is called, which offsets the vmemmap with the pfn argument to acquire a struct page pointer. >>> >>> As our DRAM starts at 0x800000000000, the initial pfn will be 0x800000000000 divided by PAGE_SIZE. The calculation result will be: >>> 0xffffffcf00000000 (vmemmap start) + (0x800000000 (pfn) * 64 (sizeof(struct page)) >>> >>> This causes an overflow as the number is very large, the resulting address is 0x1c700000000 which is not a valid Sv39 address (all bits above bit 38 should be set) and does not belong to the kernel’s virtual address space. >>> >>> The same will happen with all valid pfns as the memory is being initialized. When the Kernel will try to access a page using pfn_to_page, a page fault will occur crashing the system. >>> It should be noted that when the vmemmap is disabled the system boots normally. >>> >>> However, I would like to emphasize another important detail. When the DRAM base is small enough to avoid an overflow, the virtual memory map mappings will be initialized out of bounds with regard to the vmemmap address space which is 4GiB in Sv39. >>> The system will not crash, but the address space used for this purpose will be outside of the designated range. >>> >>> Everything mentioned above holds true when Sv48 is used as well. Given a high enough DRAM base the system will either crash or perform false initializations. Given the fact that virtual memory map is not only used by RISC-V systems, this could be an issue for other architectures as well. >>> >>> The kernel already contains mminit_validate_memmodel_limits() that checks memory limits. >>> However, it only checks physical memory ranges. I added a few checks, provided that virtual memory map is enabled, to determine whether offset start and offset end fall inside the range of virtual memory map. Otherwise, a warning will be printed. >>> >>> Finally, I would like to ask you for any information/advice that could help me fix/prevent the issues that I mentioned (if it’s possible) or recover from them at runtime by disabling the optimization and reverting back to the non-vmemmap mappings. >>> >>> Thank you in advance. >>> Best regards, >>> Dimitris Vlachos >>> --- >>> mm/sparse.c | 20 ++++++++++++++++++++ >>> 1 file changed, 20 insertions(+) >>> >>> diff --git a/mm/sparse.c b/mm/sparse.c >>> index 338cf946d..33b57758e 100644 >>> --- a/mm/sparse.c >>> +++ b/mm/sparse.c >>> @@ -149,6 +149,26 @@ static void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn, >>> WARN_ON_ONCE(1); >>> *end_pfn = max_sparsemem_pfn; >>> } >>> + >>> + /*check vmemmap limits*/ >>> + #ifdef CONFIG_SPARSEMEM_VMEMMAP >>> + >>> + unsigned long vmemmap_offset_start = (unsigned long) pfn_to_page(*start_pfn); >>> + unsigned long vmemmap_offset_end = (unsigned long) pfn_to_page(*end_pfn); >>> + >>> + if (vmemmap_offset_start < VMEMMAP_START) { >>> + mminit_dprintk(MMINIT_WARNING, "pfnvalidation", >>> + "Start of range %lu -> %lu exceeds bounds of SPARSEMEM virtual memory map address range %lu -> %lu\n", >>> + vmemmap_offset_start, vmemmap_offset_end, VMEMMAP_START,VMEMMAP_END); >>> + WARN_ON_ONCE(1); >>> + >>> + } else if (vmemmap_offset_end > VMEMMAP_END ) { >>> + mminit_dprintk(MMINIT_WARNING, "pfnvalidation", >>> + "End of range %lu -> %lu exceeds bounds of SPARSEMEM virtual memory map address range %lu -> %lu\n", >>> + vmemmap_offset_start, vmemmap_offset_end, VMEMMAP_START,VMEMMAP_END); >>> + WARN_ON_ONCE(1); >>> + } >>> + #endif >>> } >>> >>> /* >>> -- >>> 2.39.2 >>> >> >> Since struct pages are only available for memory, we could offset >> vmemmap so that VMEMMAP_START actually points to the first pfn? > >> Hello Alexandre, >> My first thought was to use a "better" offset for vmemmap as well.Since pfn is essential for pfn_to_page and page_to_pfn i think that maybe we could get a smaller number out of pfn to offset the vmemmap and ensure that it's bounds are respected for pfn_to_page operation. >> If we use only a part of the pfn for the offset we could get the pfn from page_to_pfn by adding PAGE-VMEMMAP_START >> to the DRAM base/PAGE_SIZE which is the first pfn. > > Let's keep the discussion here. > > IIUC you propose to change pfn_to_page()/page_to_pfn() but what I > proposed was to directly offset vmemmap like this: > > #define vmemmap ((struct page *)VMEMMAP_START - phys_ram_base >>> PAGE_SHIFT) > > So that the first page in vmemmap corresponds to the first page of > physical memory, that avoids the burden of changing the conversion > macros, WDYT? > > Thanks, > > Alex
Hi Dimitris, On Mon, Feb 19, 2024 at 3:59 PM dvlachos <dvlachos@ics.forth.gr> wrote: > > Alexandre, > > Yes, you understood correctly, I indeed proposed to change > pfn_to_page()/page_to_pfn() but your solution appears to solve the > problem without risking any modifications to the conversion macros. In > addition, your solution seems to be valid for any phys_ram_base/pfn > value inside the limits of physical memory. > > However, I wanted to note that if the pfn is large enough, vmemmap will > not be a valid SV39/48 address unless a pfn offset is applied to it. I > can't think of a possible scenario where vmemmap would be used without > an offset. I would like to know your opinion on that, does it concern you? No, I don't see when that could happen, that would be a mistake and then it would be easy to catch with this patch :) > > Finally, do you want me to send a patch or will you do it? > I'd be happy to help if you need, and if it's too much of a pain for you, I'll do it, let me know. Thanks, Alex > Dimitris > > On 2/18/2024 10:58 PM, Dimitris Vlachos wrote: > > > > > > > > > > -------- Forwarded Message -------- > > Subject: Re: [PATCH] [RFC] sparsemem: warn on out-of-bounds > > initialization > > Date: Tue, 6 Feb 2024 22:17:44 +0100 > > From: Alexandre Ghiti <alexghiti@rivosinc.com> > > To: Dimitris Vlachos <csd4492@csd.uoc.gr> > > CC: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, > > clameter@sgi.com, akpm@linux-foundation.org, rppt@kernel.org, > > arnd@arndb.de, paul.walmsley@sifive.com, palmer@dabbelt.com, > > mick@ics.forth.gr > > > > > > > > Hi Dimitris, > > > > On Fri, Feb 2, 2024 at 2:59 PM Alexandre Ghiti > > <alexghiti@rivosinc.com> wrote: > >> Hi Dimitris, > >> > >> On Fri, Feb 2, 2024 at 2:50 PM Dimitris Vlachos<csd4492@csd.uoc.gr> wrote: > >>> From: Dimitris Vlachos<dvlachos@ics.forth.gr> > >>> > >>> Hello all > >>> > >>> I am sending this email with regards to a bug that I discovered in the Sparse Memory Model configuration and more specifically, the Virtual Memory Map optimization. Moreover, I would like to inquire about possible ways of fixing it. > >>> > >>> I work as a pre-graduate research assistant at ICS-FORTH in the Computer Architecture and VLSI Systems laboratory. > >>> We were running some tests in our prototype hardware (RISC-V), where we noticed that the Kernel crashes early in the boot process with the following setup: > >>> > >>> We are using the default Kconfig configurations (defconfig) that includes Sparse Memory + Virtual Memory Map. > >>> The DRAM base address of our system is : 0x800000000000 > >>> A 3-level page table is used (Sv39). > >>> > >>> When the vmemmap optimization is enabled the macro pfn_to_page() is called, which offsets the vmemmap with the pfn argument to acquire a struct page pointer. > >>> > >>> As our DRAM starts at 0x800000000000, the initial pfn will be 0x800000000000 divided by PAGE_SIZE. The calculation result will be: > >>> 0xffffffcf00000000 (vmemmap start) + (0x800000000 (pfn) * 64 (sizeof(struct page)) > >>> > >>> This causes an overflow as the number is very large, the resulting address is 0x1c700000000 which is not a valid Sv39 address (all bits above bit 38 should be set) and does not belong to the kernel’s virtual address space. > >>> > >>> The same will happen with all valid pfns as the memory is being initialized. When the Kernel will try to access a page using pfn_to_page, a page fault will occur crashing the system. > >>> It should be noted that when the vmemmap is disabled the system boots normally. > >>> > >>> However, I would like to emphasize another important detail. When the DRAM base is small enough to avoid an overflow, the virtual memory map mappings will be initialized out of bounds with regard to the vmemmap address space which is 4GiB in Sv39. > >>> The system will not crash, but the address space used for this purpose will be outside of the designated range. > >>> > >>> Everything mentioned above holds true when Sv48 is used as well. Given a high enough DRAM base the system will either crash or perform false initializations. Given the fact that virtual memory map is not only used by RISC-V systems, this could be an issue for other architectures as well. > >>> > >>> The kernel already contains mminit_validate_memmodel_limits() that checks memory limits. > >>> However, it only checks physical memory ranges. I added a few checks, provided that virtual memory map is enabled, to determine whether offset start and offset end fall inside the range of virtual memory map. Otherwise, a warning will be printed. > >>> > >>> Finally, I would like to ask you for any information/advice that could help me fix/prevent the issues that I mentioned (if it’s possible) or recover from them at runtime by disabling the optimization and reverting back to the non-vmemmap mappings. > >>> > >>> Thank you in advance. > >>> Best regards, > >>> Dimitris Vlachos > >>> --- > >>> mm/sparse.c | 20 ++++++++++++++++++++ > >>> 1 file changed, 20 insertions(+) > >>> > >>> diff --git a/mm/sparse.c b/mm/sparse.c > >>> index 338cf946d..33b57758e 100644 > >>> --- a/mm/sparse.c > >>> +++ b/mm/sparse.c > >>> @@ -149,6 +149,26 @@ static void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn, > >>> WARN_ON_ONCE(1); > >>> *end_pfn = max_sparsemem_pfn; > >>> } > >>> + > >>> + /*check vmemmap limits*/ > >>> + #ifdef CONFIG_SPARSEMEM_VMEMMAP > >>> + > >>> + unsigned long vmemmap_offset_start = (unsigned long) pfn_to_page(*start_pfn); > >>> + unsigned long vmemmap_offset_end = (unsigned long) pfn_to_page(*end_pfn); > >>> + > >>> + if (vmemmap_offset_start < VMEMMAP_START) { > >>> + mminit_dprintk(MMINIT_WARNING, "pfnvalidation", > >>> + "Start of range %lu -> %lu exceeds bounds of SPARSEMEM virtual memory map address range %lu -> %lu\n", > >>> + vmemmap_offset_start, vmemmap_offset_end, VMEMMAP_START,VMEMMAP_END); > >>> + WARN_ON_ONCE(1); > >>> + > >>> + } else if (vmemmap_offset_end > VMEMMAP_END ) { > >>> + mminit_dprintk(MMINIT_WARNING, "pfnvalidation", > >>> + "End of range %lu -> %lu exceeds bounds of SPARSEMEM virtual memory map address range %lu -> %lu\n", > >>> + vmemmap_offset_start, vmemmap_offset_end, VMEMMAP_START,VMEMMAP_END); > >>> + WARN_ON_ONCE(1); > >>> + } > >>> + #endif > >>> } > >>> > >>> /* > >>> -- > >>> 2.39.2 > >>> > >> > >> Since struct pages are only available for memory, we could offset > >> vmemmap so that VMEMMAP_START actually points to the first pfn? > > > >> Hello Alexandre, > >> My first thought was to use a "better" offset for vmemmap as well.Since pfn is essential for pfn_to_page and page_to_pfn i think that maybe we could get a smaller number out of pfn to offset the vmemmap and ensure that it's bounds are respected for pfn_to_page operation. > >> If we use only a part of the pfn for the offset we could get the pfn from page_to_pfn by adding PAGE-VMEMMAP_START > >> to the DRAM base/PAGE_SIZE which is the first pfn. > > > > Let's keep the discussion here. > > > > IIUC you propose to change pfn_to_page()/page_to_pfn() but what I > > proposed was to directly offset vmemmap like this: > > > > #define vmemmap ((struct page *)VMEMMAP_START - phys_ram_base > >>> PAGE_SHIFT) > > > > So that the first page in vmemmap corresponds to the first page of > > physical memory, that avoids the burden of changing the conversion > > macros, WDYT? > > > > Thanks, > > > > Alex
diff --git a/mm/sparse.c b/mm/sparse.c index 338cf946d..33b57758e 100644 --- a/mm/sparse.c +++ b/mm/sparse.c @@ -149,6 +149,26 @@ static void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn, WARN_ON_ONCE(1); *end_pfn = max_sparsemem_pfn; } + + /*check vmemmap limits*/ + #ifdef CONFIG_SPARSEMEM_VMEMMAP + + unsigned long vmemmap_offset_start = (unsigned long) pfn_to_page(*start_pfn); + unsigned long vmemmap_offset_end = (unsigned long) pfn_to_page(*end_pfn); + + if (vmemmap_offset_start < VMEMMAP_START) { + mminit_dprintk(MMINIT_WARNING, "pfnvalidation", + "Start of range %lu -> %lu exceeds bounds of SPARSEMEM virtual memory map address range %lu -> %lu\n", + vmemmap_offset_start, vmemmap_offset_end, VMEMMAP_START,VMEMMAP_END); + WARN_ON_ONCE(1); + + } else if (vmemmap_offset_end > VMEMMAP_END ) { + mminit_dprintk(MMINIT_WARNING, "pfnvalidation", + "End of range %lu -> %lu exceeds bounds of SPARSEMEM virtual memory map address range %lu -> %lu\n", + vmemmap_offset_start, vmemmap_offset_end, VMEMMAP_START,VMEMMAP_END); + WARN_ON_ONCE(1); + } + #endif } /*