From patchwork Tue Nov 22 08:41:41 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexandre Ghiti
X-Patchwork-Id: 24202
From: Alexandre Ghiti
To: Paul Walmsley, Palmer Dabbelt, Albert Ou, Guo Ren, Rob Herring,
 Frank Rowand, linux-riscv@lists.infradead.org,
 linux-kernel@vger.kernel.org, devicetree@vger.kernel.org,
 linux-arch@vger.kernel.org
Cc: Alexandre Ghiti
Subject: [PATCH] riscv: Use PUD/P4D/PGD pages for the linear mapping
Date: Tue, 22 Nov 2022 09:41:41 +0100
Message-Id: <20221122084141.1849421-1-alexghiti@rivosinc.com>
X-Mailer: git-send-email 2.37.2
X-Mailing-List: linux-kernel@vger.kernel.org

During the early page table creation, we used to set the mapping for
PAGE_OFFSET to the kernel load address. But the kernel load address is
always offset by PMD_SIZE, which makes it impossible to use PUD/P4D/PGD
pages, as this physical address is not aligned on PUD/P4D/PGD size
(whereas PAGE_OFFSET is).

But actually we don't have to establish this mapping (i.e. set
va_pa_offset) that early in the boot process because:

- first, setup_vm installs a temporary kernel mapping and, among other
  things, discovers the system memory,
- then, setup_vm_final creates the final kernel mapping and takes
  advantage of the discovered system memory to create the linear
  mapping.

During the first phase, we don't know the start of the system memory.
Until the second phase is finished, we can't use the linear mapping at
all, and phys_to_virt/virt_to_phys translations must not be used because
they would result in a different translation from the 'real' one once
the final mapping is installed.

So here we simply delay the initialization of va_pa_offset until after
the system memory discovery. To make sure no one uses the linear mapping
before that point, we add some guards in the DEBUG_VIRTUAL config.

Finally, we can use PUD/P4D/PGD hugepages when possible, which will
result in better TLB utilization.

Note that we rely on the firmware to protect itself using PMP.

Signed-off-by: Alexandre Ghiti
---

Note that this patch is rebased on top of:
[PATCH v1 1/1] riscv: mm: call best_map_size many times during linear-mapping

 arch/riscv/include/asm/page.h | 16 ++++++++++++++++
 arch/riscv/mm/init.c          | 25 +++++++++++++++++++------
 arch/riscv/mm/physaddr.c      | 16 ++++++++++++++++
 drivers/of/fdt.c              |  5 ++++-
 4 files changed, 55 insertions(+), 7 deletions(-)

diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
index ac70b0fd9a9a..f3af526a149f 100644
--- a/arch/riscv/include/asm/page.h
+++ b/arch/riscv/include/asm/page.h
@@ -90,6 +90,14 @@ typedef struct page *pgtable_t;
 #define PTE_FMT "%08lx"
 #endif
 
+#ifdef CONFIG_64BIT
+/*
+ * We override this value as its generic definition uses __pa too early in
+ * the boot process (before kernel_map.va_pa_offset is set).
+ */
+#define MIN_MEMBLOCK_ADDR	0
+#endif
+
 #ifdef CONFIG_MMU
 extern unsigned long riscv_pfn_base;
 #define ARCH_PFN_OFFSET		(riscv_pfn_base)
@@ -122,7 +130,11 @@ extern phys_addr_t phys_ram_base;
 #define is_linear_mapping(x)	\
 	((x) >= PAGE_OFFSET && (!IS_ENABLED(CONFIG_64BIT) || (x) < PAGE_OFFSET + KERN_VIRT_SIZE))
 
+#ifndef CONFIG_DEBUG_VIRTUAL
 #define linear_mapping_pa_to_va(x)	((void *)((unsigned long)(x) + kernel_map.va_pa_offset))
+#else
+void *linear_mapping_pa_to_va(unsigned long x);
+#endif
 #define kernel_mapping_pa_to_va(y)	({ \
 	unsigned long _y = y; \
 	(IS_ENABLED(CONFIG_XIP_KERNEL) && _y < phys_ram_base) ? \
@@ -131,7 +143,11 @@ extern phys_addr_t phys_ram_base;
 })
 #define __pa_to_va_nodebug(x)		linear_mapping_pa_to_va(x)
 
+#ifndef CONFIG_DEBUG_VIRTUAL
 #define linear_mapping_va_to_pa(x)	((unsigned long)(x) - kernel_map.va_pa_offset)
+#else
+phys_addr_t linear_mapping_va_to_pa(unsigned long x);
+#endif
 #define kernel_mapping_va_to_pa(y) ({ \
 	unsigned long _y = y; \
 	(IS_ENABLED(CONFIG_XIP_KERNEL) && _y < kernel_map.virt_addr + XIP_OFFSET) ? \
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 1b76d3fe4e26..58bcf395efdc 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -213,6 +213,14 @@ static void __init setup_bootmem(void)
 	phys_ram_end = memblock_end_of_DRAM();
 	if (!IS_ENABLED(CONFIG_XIP_KERNEL))
 		phys_ram_base = memblock_start_of_DRAM();
+
+	/*
+	 * Any use of __va/__pa before this point is wrong as we did not know the
+	 * start of DRAM before.
+	 */
+	kernel_map.va_pa_offset = PAGE_OFFSET - phys_ram_base;
+	riscv_pfn_base = PFN_DOWN(phys_ram_base);
+
 	/*
 	 * memblock allocator is not aware of the fact that last 4K bytes of
 	 * the addressable memory can not be mapped because of IS_ERR_VALUE
@@ -672,9 +680,16 @@ void __init create_pgd_mapping(pgd_t *pgdp,
 
 static uintptr_t __init best_map_size(phys_addr_t base, phys_addr_t size)
 {
-	/* Upgrade to PMD_SIZE mappings whenever possible */
-	base &= PMD_SIZE - 1;
-	if (!base && size >= PMD_SIZE)
+	if (!(base & (PGDIR_SIZE - 1)) && size >= PGDIR_SIZE)
+		return PGDIR_SIZE;
+
+	if (!(base & (P4D_SIZE - 1)) && size >= P4D_SIZE)
+		return P4D_SIZE;
+
+	if (!(base & (PUD_SIZE - 1)) && size >= PUD_SIZE)
+		return PUD_SIZE;
+
+	if (!(base & (PMD_SIZE - 1)) && size >= PMD_SIZE)
 		return PMD_SIZE;
 
 	return PAGE_SIZE;
@@ -983,11 +998,9 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
 	set_satp_mode();
 #endif
 
-	kernel_map.va_pa_offset = PAGE_OFFSET - kernel_map.phys_addr;
+	kernel_map.va_pa_offset = 0UL;
 	kernel_map.va_kernel_pa_offset = kernel_map.virt_addr - kernel_map.phys_addr;
 
-	riscv_pfn_base = PFN_DOWN(kernel_map.phys_addr);
-
 	/*
 	 * The default maximal physical memory size is KERN_VIRT_SIZE for 32-bit
 	 * kernel, whereas for 64-bit kernel, the end of the virtual address
diff --git a/arch/riscv/mm/physaddr.c b/arch/riscv/mm/physaddr.c
index 19cf25a74ee2..5ae4bd166e25 100644
--- a/arch/riscv/mm/physaddr.c
+++ b/arch/riscv/mm/physaddr.c
@@ -33,3 +33,19 @@ phys_addr_t __phys_addr_symbol(unsigned long x)
 	return __va_to_pa_nodebug(x);
 }
 EXPORT_SYMBOL(__phys_addr_symbol);
+
+phys_addr_t linear_mapping_va_to_pa(unsigned long x)
+{
+	BUG_ON(!kernel_map.va_pa_offset);
+
+	return ((unsigned long)(x) - kernel_map.va_pa_offset);
+}
+EXPORT_SYMBOL(linear_mapping_va_to_pa);
+
+void *linear_mapping_pa_to_va(unsigned long x)
+{
+	BUG_ON(!kernel_map.va_pa_offset);
+
+	return ((void *)((unsigned long)(x) + kernel_map.va_pa_offset));
+}
+EXPORT_SYMBOL(linear_mapping_pa_to_va);
diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
index 7b571a631639..04e3ecb51722 100644
--- a/drivers/of/fdt.c
+++ b/drivers/of/fdt.c
@@ -895,8 +895,11 @@ static void __early_init_dt_declare_initrd(unsigned long start,
 	 * enabled since __va() is called too early. ARM64 does make use
 	 * of phys_initrd_start/phys_initrd_size so we can skip this
 	 * conversion.
+	 * On RISCV64, the usage of __va() before the linear mapping exists
+	 * is wrong.
 	 */
-	if (!IS_ENABLED(CONFIG_ARM64)) {
+	if (!IS_ENABLED(CONFIG_ARM64) &&
+	    !(IS_ENABLED(CONFIG_RISCV) && IS_ENABLED(CONFIG_64BIT))) {
 		initrd_start = (unsigned long)__va(start);
 		initrd_end = (unsigned long)__va(end);
 		initrd_below_start_ok = 1;
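
Illustration only, not part of the patch: the alignment argument from the
commit message can be checked with a small user-space sketch. Everything
prefixed with EX_/ex_ is hypothetical: the sizes are Sv39-like (4 KiB
page, 2 MiB PMD, 1 GiB PUD), and the addresses are made up for the
example. Unlike best_map_size() in the patch, which only looks at the
physical base, this sketch also checks the virtual address, so that the
effect of a misaligned va_pa_offset becomes visible.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative sizes: 4 KiB pages, 2 MiB PMD, 1 GiB PUD (hypothetical). */
#define EX_PAGE_SIZE	(UINT64_C(1) << 12)
#define EX_PMD_SIZE	(UINT64_C(1) << 21)
#define EX_PUD_SIZE	(UINT64_C(1) << 30)

/* Hypothetical addresses, chosen only for this example. */
#define EX_PAGE_OFFSET	UINT64_C(0xffffffd800000000)	/* PUD-aligned */
#define EX_DRAM_BASE	UINT64_C(0x80000000)		/* PUD-aligned */
#define EX_KERNEL_LOAD	(EX_DRAM_BASE + EX_PMD_SIZE)	/* offset by PMD_SIZE */

/* Nonzero if both pa and its linear-mapping va are aligned on map_size. */
static int ex_both_aligned(uint64_t pa, uint64_t va_pa_offset,
			   uint64_t map_size)
{
	uint64_t va = pa + va_pa_offset;	/* unsigned wrap-around is intended */

	return !(pa & (map_size - 1)) && !(va & (map_size - 1));
}

/* Largest size usable for the next mapping at pa, given the remaining size. */
static uint64_t ex_best_map_size(uint64_t pa, uint64_t va_pa_offset,
				 uint64_t size)
{
	if (size >= EX_PUD_SIZE && ex_both_aligned(pa, va_pa_offset, EX_PUD_SIZE))
		return EX_PUD_SIZE;
	if (size >= EX_PMD_SIZE && ex_both_aligned(pa, va_pa_offset, EX_PMD_SIZE))
		return EX_PMD_SIZE;
	return EX_PAGE_SIZE;
}

/*
 * Number of mappings a greedy walk needs to cover [pa, pa + size).
 * size must be a multiple of EX_PAGE_SIZE.
 */
static uint64_t ex_count_mappings(uint64_t pa, uint64_t va_pa_offset,
				  uint64_t size)
{
	uint64_t n = 0;

	while (size > 0) {
		uint64_t step = ex_best_map_size(pa, va_pa_offset, size);

		pa += step;
		size -= step;
		n++;
	}
	return n;
}

/* Compares the old and the new choice of va_pa_offset on 2 GiB of DRAM. */
static void ex_check(void)
{
	uint64_t old_offset = EX_PAGE_OFFSET - EX_KERNEL_LOAD;
	uint64_t new_offset = EX_PAGE_OFFSET - EX_DRAM_BASE;
	uint64_t size = 2 * EX_PUD_SIZE;

	/* Old offset: pa and va are never PUD-aligned together, PMD only. */
	assert(ex_count_mappings(EX_DRAM_BASE, old_offset, size) == 1024);
	/* New offset: two PUD mappings cover the same range. */
	assert(ex_count_mappings(EX_DRAM_BASE, new_offset, size) == 2);
}
```

Calling ex_check() from a main() exercises both cases. With the offset
derived from the kernel load address, the 2 GiB range needs 1024 PMD
mappings; with the offset derived from the start of DRAM, it needs two
PUD mappings. The second case only works because the example DRAM base
is itself PUD-aligned.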