[RFT,1/2] RISC-V: handle missing "no-map" properties for OpenSBI's PMP protected regions
Message ID | 20230802-detention-second-82ab2b53e07a@wendy |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f41:0:b0:3e4:2afc:c1 with SMTP id v1csp396448vqx; Wed, 2 Aug 2023 04:53:09 -0700 (PDT) X-Google-Smtp-Source: APBJJlGBPw+ZDD+oB0TH1ZJbTULqd2HOOOJDNfVZzHkqnUvLWwY9wF2IbTRpaDBAVohUunbzUyGp X-Received: by 2002:a17:90b:1c87:b0:268:abf:6431 with SMTP id oo7-20020a17090b1c8700b002680abf6431mr16528345pjb.24.1690977189152; Wed, 02 Aug 2023 04:53:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1690977189; cv=none; d=google.com; s=arc-20160816; b=GmupZ5z7tB57RfaKlIFRJz3lv0G3bULLyQyAuWhu6mkjSInzsUqcN34olTTIRctzjk Ce5rds+mPTOGhc/ZLgDIVkcOeVFjcVQJ/l/d1wkLlfzYVVyOMF9BkzmVDsPBSWPxaxjn wPnTo6ctO93VWwOGI5JbT7tJZFsJ/XYY00ui1kahFr6d1b3AMmCBzjfOtG1C/264uhbz yZu5prlhp1XY4FVWN2NSzfQRDElxbt94EkXvjRd+rEC8fWs6MHEu7Mi8BRyd7Dh6bz+N t9AOvMd6yZsaSdkmrbFKE2Fip/4dTErTgY3y74SFUEFSQbIzZ5rirxkzHwhQSjzX46lr Tykw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=IGXBjFNBc3jlSnW2kmvTDvaZGpgBFcfOrpbaRnMFcIA=; fh=nX2wgFIzIbxuMRy4LRJOi+tcsu7af/O6BOQfdpd5QTc=; b=lM2WmSxsNpnq/QcKyIojzfmWOpCURxQo1TElIgrlnSd25Jwrr6nFtXYH898plAaQiN 3rOz0heDkKuBTyrekJTSabuqIpR0204/fqy08Jolzhs3evxzBAkv0Rykzle/ubXtmMey 0n6AoUDKBKYTLglSFlpbIKPKcRjU1htkZbsipiA+RSxlTvHKksZqrcwegNkyijHVHLYd 2E709n7+uypJayYdWjN8zpRCFkEbnDb5TIIsS2ePPnrS1NrMqv0u8VwzlJF0pwKKGg7y tIc2fMWkTa4Rd1Q//NCqzBvOU5VoACKrmStlv+MfpSPNepBam+9hUP8iINGGBcl0eq8m Rbxg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@microchip.com header.s=mchp header.b=ImE58bJV; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=microchip.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id pg5-20020a17090b1e0500b002683f507990si1140902pjb.109.2023.08.02.04.52.53; Wed, 02 Aug 2023 04:53:09 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@microchip.com header.s=mchp header.b=ImE58bJV; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=microchip.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233574AbjHBLOG (ORCPT <rfc822;maxi.paulin@gmail.com> + 99 others); Wed, 2 Aug 2023 07:14:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44444 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232013AbjHBLOB (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 2 Aug 2023 07:14:01 -0400 Received: from esa.microchip.iphmx.com (esa.microchip.iphmx.com [68.232.154.123]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B165C1FED; Wed, 2 Aug 2023 04:13:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=microchip.com; i=@microchip.com; q=dns/txt; s=mchp; t=1690974840; x=1722510840; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hd7d47LxtBd1DyVtDUyG879vKOJH26KzWOIUEmCNKhs=; b=ImE58bJVgMXHTk3KfAO7Os1DfZkD0iEQOS6GnaQh4+IRClNTLYTAThLA wevN+1XXXoRd5l8v3j7YYX0ASfO9wkVuycXyHLpuAsOre8T/BgdBmU3R/ O7MyABPr9zNwmvV7Ek4LdZuHs1bvCCWw1H0uBSZb2cl/alyEHHROpwf7E 6P7GDNurn5+Crg/fUXtSLfye3BTGE4gt4UsBNyXt8Nd1dDJ8b+mYEmfPI umcXxLqmGsfAxgjVo0VHLPVuF9BiBdVSk5gNRjZmhynpXKJmYrFRz1c7Z zKvPcYj7RobxuzITxEmnRBPBKpW4yZzmQN9cuC361ZQ+5qYHkGaO9J5a0 A==; X-IronPort-AV: E=Sophos;i="6.01,249,1684825200"; d="scan'208";a="223462640" X-Amp-Result: SKIPPED(no attachment in message) Received: from unknown (HELO email.microchip.com) ([170.129.1.10]) by esa4.microchip.iphmx.com with ESMTP/TLS/AES256-SHA256; 02 Aug 2023 04:14:00 -0700 Received: from chn-vm-ex03.mchp-main.com (10.10.85.151) by chn-vm-ex03.mchp-main.com (10.10.85.151) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.21; Wed, 2 Aug 2023 04:13:49 -0700 Received: from wendy.microchip.com (10.10.115.15) by chn-vm-ex03.mchp-main.com (10.10.85.151) with Microsoft SMTP Server id 15.1.2507.21 via Frontend Transport; Wed, 2 Aug 2023 04:13:47 -0700 From: Conor Dooley <conor.dooley@microchip.com> To: <palmer@dabbelt.com> CC: <conor@kernel.org>, <conor.dooley@microchip.com>, Paul Walmsley <paul.walmsley@sifive.com>, Atish Patra <atishp@rivosinc.com>, Anup Patel <apatel@ventanamicro.com>, Alexandre Ghiti <alexghiti@rivosinc.com>, =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= <bjorn@rivosinc.com>, Song Shuai <suagrfillet@gmail.com>, JeeHeng Sia <jeeheng.sia@starfivetech.com>, "Petr Tesarik" <petrtesarik@huaweicloud.com>, <linux-riscv@lists.infradead.org>, <linux-kernel@vger.kernel.org>, <stable@vger.kernel.org> Subject: [RFT 1/2] RISC-V: handle missing "no-map" properties for OpenSBI's PMP protected regions Date: Wed, 2 Aug 2023 12:12:52 +0100 Message-ID: <20230802-detention-second-82ab2b53e07a@wendy> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230802-purse-hydrant-6f44f77364b0@wendy> References: <20230802-purse-hydrant-6f44f77364b0@wendy> MIME-Version: 1.0 X-Developer-Signature: v=1; a=openpgp-sha256; l=4807; i=conor.dooley@microchip.com; h=from:subject:message-id; bh=hd7d47LxtBd1DyVtDUyG879vKOJH26KzWOIUEmCNKhs=; b=owGbwMvMwCFWscWwfUFT0iXG02pJDCmnrIw79JoOSM+ft5fLNlJCZf3HzV9/3m0r7wszedRjd5Pr RPK1jlIWBjEOBlkxRZbE230tUuv/uOxw7nkLM4eVCWQIAxenAEzEfiojw1ZRWbnNXZeKb8ttTvKeeK /CpUxJzvRtdHTfims9TGoP7RgZJpY4XDn29Zj+5Gutgk+OH6rZvkg6bfMDS97tE1O8S8yyWQE= X-Developer-Key: i=conor.dooley@microchip.com; a=openpgp; fpr=F9ECA03CF54F12CD01F1655722E2C55B37CF380C Content-Transfer-Encoding: 8bit Content-Type: text/plain X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, RCVD_IN_MSPIKE_H5,RCVD_IN_MSPIKE_WL,SPF_HELO_PASS,SPF_NONE, T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1773118096860479473 X-GMAIL-MSGID: 1773118096860479473 |
Series |
RISC-V: handle missing "no-map" properties for OpenSBI's PMP protected regions
|
|
Commit Message
Conor Dooley
Aug. 2, 2023, 11:12 a.m. UTC
Add an erratum for versions [v0.8 to v1.3) of OpenSBI which fail to add
the "no-map" property to the reserved memory nodes for the regions it
has protected using PMPs.
Our existing fix sweeping hibernation under the carpet by marking it
NONPORTABLE is insufficient as there are other ways to generate
accesses to these reserved memory regions, as Petr discovered [1]
while testing crash kernels & kdump.
Intercede during the boot process when the afflicted versions of OpenSBI
are present & set the "no-map" property in all "mmode_resv" nodes before
the kernel does its reserved memory region initialisation.
Reported-by: Song Shuai <suagrfillet@gmail.com>
Link: https://lore.kernel.org/all/CAAYs2=gQvkhTeioMmqRDVGjdtNF_vhB+vm_1dHJxPNi75YDQ_Q@mail.gmail.com/
Reported-by: JeeHeng Sia <jeeheng.sia@starfivetech.com>
Link: https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/ITXwaKfA6z8
Reported-by: Petr Tesarik <petrtesarik@huaweicloud.com>
Closes: https://lore.kernel.org/linux-riscv/76ff0f51-d6c1-580d-f943-061e93073306@huaweicloud.com/ [1]
CC: stable@vger.kernel.org
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
---
arch/riscv/include/asm/sbi.h | 5 +++++
arch/riscv/kernel/sbi.c | 42 +++++++++++++++++++++++++++++++++++-
arch/riscv/mm/init.c | 3 +++
3 files changed, 49 insertions(+), 1 deletion(-)
Comments
Hi Conor, On 02/08/2023 13:12, Conor Dooley wrote: > Add an erratum for versions [v0.8 to v1.3) of OpenSBI which fail to add > the "no-map" property to the reserved memory nodes for the regions it > has protected using PMPs. > > Our existing fix sweeping hibernation under the carpet by marking it > NONPORTABLE is insufficient as there are other ways to generate > accesses to these reserved memory regions, as Petr discovered [1] > while testing crash kernels & kdump. > > Intercede during the boot process when the afflicted versions of OpenSBI > are present & set the "no-map" property in all "mmode_resv" nodes before > the kernel does its reserved memory region initialisation. > > Reported-by: Song Shuai <suagrfillet@gmail.com> > Link: https://lore.kernel.org/all/CAAYs2=gQvkhTeioMmqRDVGjdtNF_vhB+vm_1dHJxPNi75YDQ_Q@mail.gmail.com/ > Reported-by: JeeHeng Sia <jeeheng.sia@starfivetech.com> > Link: https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/ITXwaKfA6z8 > Reported-by: Petr Tesarik <petrtesarik@huaweicloud.com> > Closes: https://lore.kernel.org/linux-riscv/76ff0f51-d6c1-580d-f943-061e93073306@huaweicloud.com/ [1] > CC: stable@vger.kernel.org > Signed-off-by: Conor Dooley <conor.dooley@microchip.com> > --- > arch/riscv/include/asm/sbi.h | 5 +++++ > arch/riscv/kernel/sbi.c | 42 +++++++++++++++++++++++++++++++++++- > arch/riscv/mm/init.c | 3 +++ > 3 files changed, 49 insertions(+), 1 deletion(-) > > diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h > index 5b4a1bf5f439..5360f3476278 100644 > --- a/arch/riscv/include/asm/sbi.h > +++ b/arch/riscv/include/asm/sbi.h > @@ -252,6 +252,9 @@ enum sbi_pmu_ctr_type { > #define SBI_ERR_ALREADY_STARTED -7 > #define SBI_ERR_ALREADY_STOPPED -8 > > +/* SBI implementation IDs */ > +#define SBI_IMP_OPENSBI 1 > + > extern unsigned long sbi_spec_version; > struct sbiret { > long error; > @@ -259,6 +262,8 @@ struct sbiret { > }; > > void sbi_init(void); > +void sbi_apply_reserved_mem_erratum(void *dtb_va); > + > struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, > unsigned long arg1, unsigned long arg2, > unsigned long arg3, unsigned long arg4, > diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c > index c672c8ba9a2a..aeb27263fa53 100644 > --- a/arch/riscv/kernel/sbi.c > +++ b/arch/riscv/kernel/sbi.c > @@ -5,8 +5,10 @@ > * Copyright (c) 2020 Western Digital Corporation or its affiliates. > */ > > +#include <linux/acpi.h> > #include <linux/bits.h> > #include <linux/init.h> > +#include <linux/libfdt.h> > #include <linux/pm.h> > #include <linux/reboot.h> > #include <asm/sbi.h> > @@ -583,6 +585,40 @@ long sbi_get_mimpid(void) > } > EXPORT_SYMBOL_GPL(sbi_get_mimpid); > > +static long sbi_firmware_id; > +static long sbi_firmware_version; > + > +/* > + * For devicetrees patched by OpenSBI a "mmode_resv" node is added to cover > + * the region OpenSBI has protected by means of a PMP. Some versions of OpenSBI, > + * [v0.8 to v1.3), omitted the "no-map" property, but this trips up hibernation > + * among other things. > + */ > +void __init sbi_apply_reserved_mem_erratum(void *dtb_pa) > +{ > + int child, reserved_mem; > + > + if (sbi_firmware_id != SBI_IMP_OPENSBI) > + return; > + > + if (!acpi_disabled) > + return; > + > + if (sbi_firmware_version >= 0x10003 || sbi_firmware_version < 0x8) > + return; > + > + reserved_mem = fdt_path_offset((void *)dtb_pa, "/reserved-memory"); > + if (reserved_mem < 0) > + return; > + > + fdt_for_each_subnode(child, (void *)dtb_pa, reserved_mem) { > + const char *name = fdt_get_name((void *)dtb_pa, child, NULL); > + > + if (!strncmp(name, "mmode_resv", 10)) I would check that name != NULL before strncmp. > + fdt_setprop((void *)dtb_pa, child, "no-map", NULL, 0); > + } > +}; > + > void __init sbi_init(void) > { > int ret; > @@ -596,8 +632,12 @@ void __init sbi_init(void) > sbi_major_version(), sbi_minor_version()); > > if (!sbi_spec_is_0_1()) { > + sbi_firmware_id = sbi_get_firmware_id(); > + sbi_firmware_version = sbi_get_firmware_version(); > + > pr_info("SBI implementation ID=0x%lx Version=0x%lx\n", > - sbi_get_firmware_id(), sbi_get_firmware_version()); > + sbi_firmware_id, sbi_firmware_version); > + > if (sbi_probe_extension(SBI_EXT_TIME)) { > __sbi_set_timer = __sbi_set_timer_v02; > pr_info("SBI TIME extension detected\n"); > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c > index 70fb31960b63..cb16bfdeacdb 100644 > --- a/arch/riscv/mm/init.c > +++ b/arch/riscv/mm/init.c > @@ -29,6 +29,7 @@ > #include <asm/tlbflush.h> > #include <asm/sections.h> > #include <asm/soc.h> > +#include <asm/sbi.h> > #include <asm/io.h> > #include <asm/ptdump.h> > #include <asm/numa.h> > @@ -253,6 +254,8 @@ static void __init setup_bootmem(void) > * in the device tree, otherwise the allocation could end up in a > * reserved region. > */ > + > + sbi_apply_reserved_mem_erratum(dtb_early_va); > early_init_fdt_scan_reserved_mem(); > > /* Otherwise the patch looks good to me: You can add: Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> I just have one question though (that we discussed privately already): should we fix openSBI in the kernel? If yes, what makes a bug worth being fixed in the kernel? Thanks,
> -----Original Message----- > From: Conor Dooley <conor.dooley@microchip.com> > Sent: Wednesday, August 2, 2023 7:13 PM > To: palmer@dabbelt.com > Cc: conor@kernel.org; conor.dooley@microchip.com; Paul Walmsley <paul.walmsley@sifive.com>; Atish Patra > <atishp@rivosinc.com>; Anup Patel <apatel@ventanamicro.com>; Alexandre Ghiti <alexghiti@rivosinc.com>; Björn Töpel > <bjorn@rivosinc.com>; Song Shuai <suagrfillet@gmail.com>; JeeHeng Sia <jeeheng.sia@starfivetech.com>; Petr Tesarik > <petrtesarik@huaweicloud.com>; linux-riscv@lists.infradead.org; linux-kernel@vger.kernel.org; stable@vger.kernel.org > Subject: [RFT 1/2] RISC-V: handle missing "no-map" properties for OpenSBI's PMP protected regions > > Add an erratum for versions [v0.8 to v1.3) of OpenSBI which fail to add > the "no-map" property to the reserved memory nodes for the regions it > has protected using PMPs. > > Our existing fix sweeping hibernation under the carpet by marking it > NONPORTABLE is insufficient as there are other ways to generate > accesses to these reserved memory regions, as Petr discovered [1] > while testing crash kernels & kdump. > > Intercede during the boot process when the afflicted versions of OpenSBI > are present & set the "no-map" property in all "mmode_resv" nodes before > the kernel does its reserved memory region initialisation. > > Reported-by: Song Shuai <suagrfillet@gmail.com> > Link: https://lore.kernel.org/all/CAAYs2=gQvkhTeioMmqRDVGjdtNF_vhB+vm_1dHJxPNi75YDQ_Q@mail.gmail.com/ > Reported-by: JeeHeng Sia <jeeheng.sia@starfivetech.com> > Link: https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/ITXwaKfA6z8 > Reported-by: Petr Tesarik <petrtesarik@huaweicloud.com> > Closes: https://lore.kernel.org/linux-riscv/76ff0f51-d6c1-580d-f943-061e93073306@huaweicloud.com/ [1] > CC: stable@vger.kernel.org > Signed-off-by: Conor Dooley <conor.dooley@microchip.com> > --- > arch/riscv/include/asm/sbi.h | 5 +++++ > arch/riscv/kernel/sbi.c | 42 +++++++++++++++++++++++++++++++++++- > arch/riscv/mm/init.c | 3 +++ > 3 files changed, 49 insertions(+), 1 deletion(-) > > diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h > index 5b4a1bf5f439..5360f3476278 100644 > --- a/arch/riscv/include/asm/sbi.h > +++ b/arch/riscv/include/asm/sbi.h > @@ -252,6 +252,9 @@ enum sbi_pmu_ctr_type { > #define SBI_ERR_ALREADY_STARTED -7 > #define SBI_ERR_ALREADY_STOPPED -8 > > +/* SBI implementation IDs */ > +#define SBI_IMP_OPENSBI 1 I would suggest to create an enum struct for the SBI Imp ID in the sbi.h file. What do you think? > + > extern unsigned long sbi_spec_version; > struct sbiret { > long error; > @@ -259,6 +262,8 @@ struct sbiret { > }; > > void sbi_init(void); > +void sbi_apply_reserved_mem_erratum(void *dtb_va); > + > struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, > unsigned long arg1, unsigned long arg2, > unsigned long arg3, unsigned long arg4, > diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c > index c672c8ba9a2a..aeb27263fa53 100644 > --- a/arch/riscv/kernel/sbi.c > +++ b/arch/riscv/kernel/sbi.c > @@ -5,8 +5,10 @@ > * Copyright (c) 2020 Western Digital Corporation or its affiliates. > */ > > +#include <linux/acpi.h> > #include <linux/bits.h> > #include <linux/init.h> > +#include <linux/libfdt.h> > #include <linux/pm.h> > #include <linux/reboot.h> > #include <asm/sbi.h> > @@ -583,6 +585,40 @@ long sbi_get_mimpid(void) > } > EXPORT_SYMBOL_GPL(sbi_get_mimpid); > > +static long sbi_firmware_id; > +static long sbi_firmware_version; > + > +/* > + * For devicetrees patched by OpenSBI a "mmode_resv" node is added to cover > + * the region OpenSBI has protected by means of a PMP. Some versions of OpenSBI, > + * [v0.8 to v1.3), omitted the "no-map" property, but this trips up hibernation > + * among other things. > + */ > +void __init sbi_apply_reserved_mem_erratum(void *dtb_pa) > +{ > + int child, reserved_mem; > + > + if (sbi_firmware_id != SBI_IMP_OPENSBI) > + return; > + > + if (!acpi_disabled) > + return; > + > + if (sbi_firmware_version >= 0x10003 || sbi_firmware_version < 0x8) > + return; > + > + reserved_mem = fdt_path_offset((void *)dtb_pa, "/reserved-memory"); > + if (reserved_mem < 0) > + return; > + > + fdt_for_each_subnode(child, (void *)dtb_pa, reserved_mem) { > + const char *name = fdt_get_name((void *)dtb_pa, child, NULL); > + > + if (!strncmp(name, "mmode_resv", 10)) > + fdt_setprop((void *)dtb_pa, child, "no-map", NULL, 0); > + } > +}; > + > void __init sbi_init(void) > { > int ret; > @@ -596,8 +632,12 @@ void __init sbi_init(void) > sbi_major_version(), sbi_minor_version()); > > if (!sbi_spec_is_0_1()) { > + sbi_firmware_id = sbi_get_firmware_id(); > + sbi_firmware_version = sbi_get_firmware_version(); > + > pr_info("SBI implementation ID=0x%lx Version=0x%lx\n", > - sbi_get_firmware_id(), sbi_get_firmware_version()); > + sbi_firmware_id, sbi_firmware_version); > + > if (sbi_probe_extension(SBI_EXT_TIME)) { > __sbi_set_timer = __sbi_set_timer_v02; > pr_info("SBI TIME extension detected\n"); > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c > index 70fb31960b63..cb16bfdeacdb 100644 > --- a/arch/riscv/mm/init.c > +++ b/arch/riscv/mm/init.c > @@ -29,6 +29,7 @@ > #include <asm/tlbflush.h> > #include <asm/sections.h> > #include <asm/soc.h> > +#include <asm/sbi.h> > #include <asm/io.h> > #include <asm/ptdump.h> > #include <asm/numa.h> > @@ -253,6 +254,8 @@ static void __init setup_bootmem(void) > * in the device tree, otherwise the allocation could end up in a > * reserved region. > */ > + > + sbi_apply_reserved_mem_erratum(dtb_early_va); > early_init_fdt_scan_reserved_mem(); > > /* > -- > 2.40.1
On Wed, Aug 2, 2023 at 4:14 AM Conor Dooley <conor.dooley@microchip.com> wrote: > > Add an erratum for versions [v0.8 to v1.3) of OpenSBI which fail to add > the "no-map" property to the reserved memory nodes for the regions it > has protected using PMPs. > > Our existing fix sweeping hibernation under the carpet by marking it > NONPORTABLE is insufficient as there are other ways to generate > accesses to these reserved memory regions, as Petr discovered [1] > while testing crash kernels & kdump. > > Intercede during the boot process when the afflicted versions of OpenSBI > are present & set the "no-map" property in all "mmode_resv" nodes before > the kernel does its reserved memory region initialisation. > We have different mechanisms of DT being passed to the kernel. 1. A prior stage(e.g U-Boot SPL) to M-mode runtime firmware (e.g. OpenSBI, rustSBI) passes the DT to M-mode runtime firmware and it passes to the next stage. In this case, M-mode runtime firmware gets a chance to update the no-map property in DT that the kernel can parse. 2. User loads the DT from the boot loader (e.g EDK2, U-Boot proper). Any DT patching done by the M-mode firmware is useless. If these DTBs don't have the no-map property, hibernation or EFI booting will have issues as well. We are trying to solve only one part of problem #1 in this patch. I don't think any other M-mode runtime firmware patches DT with no-map property as well. Please let me know if I am wrong about that. The problem is not restricted to [v0.8 to v1.3) of OpenSBI. The booting doc now says that no-map property must be set and any usage of DTBs without that (via #1 or #2) will have unintended consequences. > Reported-by: Song Shuai <suagrfillet@gmail.com> > Link: https://lore.kernel.org/all/CAAYs2=gQvkhTeioMmqRDVGjdtNF_vhB+vm_1dHJxPNi75YDQ_Q@mail.gmail.com/ > Reported-by: JeeHeng Sia <jeeheng.sia@starfivetech.com> > Link: https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/ITXwaKfA6z8 > Reported-by: Petr Tesarik <petrtesarik@huaweicloud.com> > Closes: https://lore.kernel.org/linux-riscv/76ff0f51-d6c1-580d-f943-061e93073306@huaweicloud.com/ [1] > CC: stable@vger.kernel.org > Signed-off-by: Conor Dooley <conor.dooley@microchip.com> > --- > arch/riscv/include/asm/sbi.h | 5 +++++ > arch/riscv/kernel/sbi.c | 42 +++++++++++++++++++++++++++++++++++- > arch/riscv/mm/init.c | 3 +++ > 3 files changed, 49 insertions(+), 1 deletion(-) > > diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h > index 5b4a1bf5f439..5360f3476278 100644 > --- a/arch/riscv/include/asm/sbi.h > +++ b/arch/riscv/include/asm/sbi.h > @@ -252,6 +252,9 @@ enum sbi_pmu_ctr_type { > #define SBI_ERR_ALREADY_STARTED -7 > #define SBI_ERR_ALREADY_STOPPED -8 > > +/* SBI implementation IDs */ > +#define SBI_IMP_OPENSBI 1 > + > extern unsigned long sbi_spec_version; > struct sbiret { > long error; > @@ -259,6 +262,8 @@ struct sbiret { > }; > > void sbi_init(void); > +void sbi_apply_reserved_mem_erratum(void *dtb_va); > + > struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, > unsigned long arg1, unsigned long arg2, > unsigned long arg3, unsigned long arg4, > diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c > index c672c8ba9a2a..aeb27263fa53 100644 > --- a/arch/riscv/kernel/sbi.c > +++ b/arch/riscv/kernel/sbi.c > @@ -5,8 +5,10 @@ > * Copyright (c) 2020 Western Digital Corporation or its affiliates. > */ > > +#include <linux/acpi.h> > #include <linux/bits.h> > #include <linux/init.h> > +#include <linux/libfdt.h> > #include <linux/pm.h> > #include <linux/reboot.h> > #include <asm/sbi.h> > @@ -583,6 +585,40 @@ long sbi_get_mimpid(void) > } > EXPORT_SYMBOL_GPL(sbi_get_mimpid); > > +static long sbi_firmware_id; > +static long sbi_firmware_version; > + > +/* > + * For devicetrees patched by OpenSBI a "mmode_resv" node is added to cover > + * the region OpenSBI has protected by means of a PMP. Some versions of OpenSBI, > + * [v0.8 to v1.3), omitted the "no-map" property, but this trips up hibernation > + * among other things. > + */ > +void __init sbi_apply_reserved_mem_erratum(void *dtb_pa) > +{ > + int child, reserved_mem; > + > + if (sbi_firmware_id != SBI_IMP_OPENSBI) > + return; > + > + if (!acpi_disabled) > + return; > + > + if (sbi_firmware_version >= 0x10003 || sbi_firmware_version < 0x8) > + return; > + > + reserved_mem = fdt_path_offset((void *)dtb_pa, "/reserved-memory"); > + if (reserved_mem < 0) > + return; > + > + fdt_for_each_subnode(child, (void *)dtb_pa, reserved_mem) { > + const char *name = fdt_get_name((void *)dtb_pa, child, NULL); > + > + if (!strncmp(name, "mmode_resv", 10)) > + fdt_setprop((void *)dtb_pa, child, "no-map", NULL, 0); > + } > +}; > + > void __init sbi_init(void) > { > int ret; > @@ -596,8 +632,12 @@ void __init sbi_init(void) > sbi_major_version(), sbi_minor_version()); > > if (!sbi_spec_is_0_1()) { > + sbi_firmware_id = sbi_get_firmware_id(); > + sbi_firmware_version = sbi_get_firmware_version(); > + > pr_info("SBI implementation ID=0x%lx Version=0x%lx\n", > - sbi_get_firmware_id(), sbi_get_firmware_version()); > + sbi_firmware_id, sbi_firmware_version); > + > if (sbi_probe_extension(SBI_EXT_TIME)) { > __sbi_set_timer = __sbi_set_timer_v02; > pr_info("SBI TIME extension detected\n"); > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c > index 70fb31960b63..cb16bfdeacdb 100644 > --- a/arch/riscv/mm/init.c > +++ b/arch/riscv/mm/init.c > @@ -29,6 +29,7 @@ > #include <asm/tlbflush.h> > #include <asm/sections.h> > #include <asm/soc.h> > +#include <asm/sbi.h> > #include <asm/io.h> > #include <asm/ptdump.h> > #include <asm/numa.h> > @@ -253,6 +254,8 @@ static void __init setup_bootmem(void) > * in the device tree, otherwise the allocation could end up in a > * reserved region. > */ > + > + sbi_apply_reserved_mem_erratum(dtb_early_va); > early_init_fdt_scan_reserved_mem(); > > /* > -- > 2.40.1 >
On Mon, Aug 07, 2023 at 12:44:07AM +0000, JeeHeng Sia wrote: > > +/* SBI implementation IDs */ > > +#define SBI_IMP_OPENSBI 1 > I would suggest to create an enum struct for the SBI Imp ID in > the sbi.h file. What do you think? I'm not really sure what the advantage of doing so is.
On Tue, Aug 08, 2023 at 12:54:11AM -0700, Atish Kumar Patra wrote: > On Wed, Aug 2, 2023 at 4:14 AM Conor Dooley <conor.dooley@microchip.com> wrote: > > > > Add an erratum for versions [v0.8 to v1.3) of OpenSBI which fail to add > > the "no-map" property to the reserved memory nodes for the regions it > > has protected using PMPs. > > > > Our existing fix sweeping hibernation under the carpet by marking it > > NONPORTABLE is insufficient as there are other ways to generate > > accesses to these reserved memory regions, as Petr discovered [1] > > while testing crash kernels & kdump. > > > > Intercede during the boot process when the afflicted versions of OpenSBI > > are present & set the "no-map" property in all "mmode_resv" nodes before > > the kernel does its reserved memory region initialisation. > > > > We have different mechanisms of DT being passed to the kernel. > > 1. A prior stage(e.g U-Boot SPL) to M-mode runtime firmware (e.g. > OpenSBI, rustSBI) passes the DT to M-mode runtime firmware and it > passes to the next stage. > In this case, M-mode runtime firmware gets a chance to update the > no-map property in DT that the kernel can parse. > > 2. User loads the DT from the boot loader (e.g EDK2, U-Boot proper). > Any DT patching done by the M-mode firmware is useless. If these DTBs > don't have the no-map > property, hibernation or EFI booting will have issues as well. > > We are trying to solve only one part of problem #1 in this patch. Correct. If someone's second stage is also providing an incorrect devicetree then, yeah, this approach would fall apart - but it's the firmware provided devicetree being incorrect that I am trying to account for here. If a person incorrectly constructed one, I am not really sure what we can do for them, they incorrect described their hardware /shrug My patch should of course help in some of the scenarios you mention above if the name of the reserved memory region from OpenSBI is propagated by the second-stage bootloader, but that is just an extension of case 1, not case 2. > I > don't think any other M-mode runtime firmware patches DT with no-map > property as well. > Please let me know if I am wrong about that. The problem is not > restricted to [v0.8 to v1.3) of OpenSBI. It comes down to Alex's question - do we want to fix this kind of firmware issue in the kernel? Ultimately this is a policy decision that "somebody" has to make. Maybe the list of firmwares that need this grows, but hopefully this bug will be fixed everywhere going forward, since, as you say, we now have a document. The reason I've only targeted OpenSBI here is that it's what is being shipped on people's SBCs etc by default & is also what is in the versions of QEMU that people seem to be using. Unless there's some example that I am missing, other firmwares are either not affected or not actually being shipped on to consumers? > The booting doc now says that no-map property must be set and any > usage of DTBs without that (via #1 or #2) will have unintended > consequences. Which is great. It unfortunately does not help one iota with stuff that's already in the wild. Thanks, Conor. > > Reported-by: Song Shuai <suagrfillet@gmail.com> > > Link: https://lore.kernel.org/all/CAAYs2=gQvkhTeioMmqRDVGjdtNF_vhB+vm_1dHJxPNi75YDQ_Q@mail.gmail.com/ > > Reported-by: JeeHeng Sia <jeeheng.sia@starfivetech.com> > > Link: https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/ITXwaKfA6z8 > > Reported-by: Petr Tesarik <petrtesarik@huaweicloud.com> > > Closes: https://lore.kernel.org/linux-riscv/76ff0f51-d6c1-580d-f943-061e93073306@huaweicloud.com/ [1] > > CC: stable@vger.kernel.org > > Signed-off-by: Conor Dooley <conor.dooley@microchip.com> > > --- > > arch/riscv/include/asm/sbi.h | 5 +++++ > > arch/riscv/kernel/sbi.c | 42 +++++++++++++++++++++++++++++++++++- > > arch/riscv/mm/init.c | 3 +++ > > 3 files changed, 49 insertions(+), 1 deletion(-) > > > > diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h > > index 5b4a1bf5f439..5360f3476278 100644 > > --- a/arch/riscv/include/asm/sbi.h > > +++ b/arch/riscv/include/asm/sbi.h > > @@ -252,6 +252,9 @@ enum sbi_pmu_ctr_type { > > #define SBI_ERR_ALREADY_STARTED -7 > > #define SBI_ERR_ALREADY_STOPPED -8 > > > > +/* SBI implementation IDs */ > > +#define SBI_IMP_OPENSBI 1 > > + > > extern unsigned long sbi_spec_version; > > struct sbiret { > > long error; > > @@ -259,6 +262,8 @@ struct sbiret { > > }; > > > > void sbi_init(void); > > +void sbi_apply_reserved_mem_erratum(void *dtb_va); > > + > > struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, > > unsigned long arg1, unsigned long arg2, > > unsigned long arg3, unsigned long arg4, > > diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c > > index c672c8ba9a2a..aeb27263fa53 100644 > > --- a/arch/riscv/kernel/sbi.c > > +++ b/arch/riscv/kernel/sbi.c > > @@ -5,8 +5,10 @@ > > * Copyright (c) 2020 Western Digital Corporation or its affiliates. > > */ > > > > +#include <linux/acpi.h> > > #include <linux/bits.h> > > #include <linux/init.h> > > +#include <linux/libfdt.h> > > #include <linux/pm.h> > > #include <linux/reboot.h> > > #include <asm/sbi.h> > > @@ -583,6 +585,40 @@ long sbi_get_mimpid(void) > > } > > EXPORT_SYMBOL_GPL(sbi_get_mimpid); > > > > +static long sbi_firmware_id; > > +static long sbi_firmware_version; > > + > > +/* > > + * For devicetrees patched by OpenSBI a "mmode_resv" node is added to cover > > + * the region OpenSBI has protected by means of a PMP. Some versions of OpenSBI, > > + * [v0.8 to v1.3), omitted the "no-map" property, but this trips up hibernation > > + * among other things. > > + */ > > +void __init sbi_apply_reserved_mem_erratum(void *dtb_pa) > > +{ > > + int child, reserved_mem; > > + > > + if (sbi_firmware_id != SBI_IMP_OPENSBI) > > + return; > > + > > + if (!acpi_disabled) > > + return; > > + > > + if (sbi_firmware_version >= 0x10003 || sbi_firmware_version < 0x8) > > + return; > > + > > + reserved_mem = fdt_path_offset((void *)dtb_pa, "/reserved-memory"); > > + if (reserved_mem < 0) > > + return; > > + > > + fdt_for_each_subnode(child, (void *)dtb_pa, reserved_mem) { > > + const char *name = fdt_get_name((void *)dtb_pa, child, NULL); > > + > > + if (!strncmp(name, "mmode_resv", 10)) > > + fdt_setprop((void *)dtb_pa, child, "no-map", NULL, 0); > > + } > > +}; > > + > > void __init sbi_init(void) > > { > > int ret; > > @@ -596,8 +632,12 @@ void __init sbi_init(void) > > sbi_major_version(), sbi_minor_version()); > > > > if (!sbi_spec_is_0_1()) { > > + sbi_firmware_id = sbi_get_firmware_id(); > > + sbi_firmware_version = sbi_get_firmware_version(); > > + > > pr_info("SBI implementation ID=0x%lx Version=0x%lx\n", > > - sbi_get_firmware_id(), sbi_get_firmware_version()); > > + sbi_firmware_id, sbi_firmware_version); > > + > > if (sbi_probe_extension(SBI_EXT_TIME)) { > > __sbi_set_timer = __sbi_set_timer_v02; > > pr_info("SBI TIME extension detected\n"); > > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c > > index 70fb31960b63..cb16bfdeacdb 100644 > > --- a/arch/riscv/mm/init.c > > +++ b/arch/riscv/mm/init.c > > @@ -29,6 +29,7 @@ > > #include <asm/tlbflush.h> > > #include <asm/sections.h> > > #include <asm/soc.h> > > +#include <asm/sbi.h> > > #include <asm/io.h> > > #include <asm/ptdump.h> > > #include <asm/numa.h> > > @@ -253,6 +254,8 @@ static void __init setup_bootmem(void) > > * in the device tree, otherwise the allocation could end up in a > > * reserved region. > > */ > > + > > + sbi_apply_reserved_mem_erratum(dtb_early_va); > > early_init_fdt_scan_reserved_mem(); > > > > /* > > -- > > 2.40.1 > >
On Tue, Aug 8, 2023 at 6:39 AM Conor Dooley <conor@kernel.org> wrote: > > On Tue, Aug 08, 2023 at 12:54:11AM -0700, Atish Kumar Patra wrote: > > On Wed, Aug 2, 2023 at 4:14 AM Conor Dooley <conor.dooley@microchip.com> wrote: > > > > > > Add an erratum for versions [v0.8 to v1.3) of OpenSBI which fail to add > > > the "no-map" property to the reserved memory nodes for the regions it > > > has protected using PMPs. > > > > > > Our existing fix sweeping hibernation under the carpet by marking it > > > NONPORTABLE is insufficient as there are other ways to generate > > > accesses to these reserved memory regions, as Petr discovered [1] > > > while testing crash kernels & kdump. > > > > > > Intercede during the boot process when the afflicted versions of OpenSBI > > > are present & set the "no-map" property in all "mmode_resv" nodes before > > > the kernel does its reserved memory region initialisation. > > > > > > > We have different mechanisms of DT being passed to the kernel. > > > > 1. A prior stage(e.g U-Boot SPL) to M-mode runtime firmware (e.g. > > OpenSBI, rustSBI) passes the DT to M-mode runtime firmware and it > > passes to the next stage. > > In this case, M-mode runtime firmware gets a chance to update the > > no-map property in DT that the kernel can parse. > > > > 2. User loads the DT from the boot loader (e.g EDK2, U-Boot proper). > > Any DT patching done by the M-mode firmware is useless. If these DTBs > > don't have the no-map > > property, hibernation or EFI booting will have issues as well. > > > > > We are trying to solve only one part of problem #1 in this patch. > > Correct. > > If someone's second stage is also providing an incorrect devicetree > then, yeah, this approach would fall apart - but it's the firmware > provided devicetree being incorrect that I am trying to account for > here. If a person incorrectly constructed one, I am not really sure what > we can do for them, they incorrect described their hardware /shrug > My patch should of course help in some of the scenarios you mention above > if the name of the reserved memory region from OpenSBI is propagated by > the second-stage bootloader, but that is just an extension of case 1, > not case 2. > > > I > > don't think any other M-mode runtime firmware patches DT with no-map > > property as well. > > Please let me know if I am wrong about that. The problem is not > > restricted to [v0.8 to v1.3) of OpenSBI. > > It comes down to Alex's question - do we want to fix this kind of > firmware issue in the kernel? Ultimately this is a policy decision that > "somebody" has to make. Maybe the list of firmwares that need this IMO, we shouldn't as this is a slippery slope. Kernel can't fix every firmware bug by having erratas. I agree with your point below about firmware in shipping products. I am not aware of any official products shipping anything other than OpenSBI either. However, I have seen users using other firmwares in their dev environment. IMHO, this approach sets a bad precedent for the future especially when it only solves one part of the problem. We shouldn't hide firmware bugs in the kernel when an upgraded firmware is already available. This bug is well documented in various threads and fixed in the latest version of OpenSBI. I am assuming other firmwares will follow it as well. Anybody facing hibernation or efi related booting issues should just upgrade to the latest version of firmware (e.g OpenSBI v1.3) Latest version of Qemu will support(if not happened already) the latest version of OpenSBI. This issue will only manifest in kernels 6.4 or higher. Any user facing these with the latest kernel can also upgrade the firmware. Do you see any issue with that ? > grows, but hopefully this bug will be fixed everywhere going forward, > since, as you say, we now have a document. > The reason I've only targeted OpenSBI here is that it's what is being > shipped on people's SBCs etc by default & is also what is in the > versions of QEMU that people seem to be using. Unless there's some > example that I am missing, other firmwares are either not affected or > not actually being shipped on to consumers? > > > The booting doc now says that no-map property must be set and any > > usage of DTBs without that (via #1 or #2) will have unintended > > consequences. > > Which is great. It unfortunately does not help one iota with stuff > that's already in the wild. > > Thanks, > Conor. > > > > Reported-by: Song Shuai <suagrfillet@gmail.com> > > > Link: https://lore.kernel.org/all/CAAYs2=gQvkhTeioMmqRDVGjdtNF_vhB+vm_1dHJxPNi75YDQ_Q@mail.gmail.com/ > > > Reported-by: JeeHeng Sia <jeeheng.sia@starfivetech.com> > > > Link: https://groups.google.com/a/groups.riscv.org/g/sw-dev/c/ITXwaKfA6z8 > > > Reported-by: Petr Tesarik <petrtesarik@huaweicloud.com> > > > Closes: https://lore.kernel.org/linux-riscv/76ff0f51-d6c1-580d-f943-061e93073306@huaweicloud.com/ [1] > > > CC: stable@vger.kernel.org > > > Signed-off-by: Conor Dooley <conor.dooley@microchip.com> > > > --- > > > arch/riscv/include/asm/sbi.h | 5 +++++ > > > arch/riscv/kernel/sbi.c | 42 +++++++++++++++++++++++++++++++++++- > > > arch/riscv/mm/init.c | 3 +++ > > > 3 files changed, 49 insertions(+), 1 deletion(-) > > > > > > diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h > > > index 5b4a1bf5f439..5360f3476278 100644 > > > --- a/arch/riscv/include/asm/sbi.h > > > +++ b/arch/riscv/include/asm/sbi.h > > > @@ -252,6 +252,9 @@ enum sbi_pmu_ctr_type { > > > #define SBI_ERR_ALREADY_STARTED -7 > > > #define SBI_ERR_ALREADY_STOPPED -8 > > > > > > +/* SBI implementation IDs */ > > > +#define SBI_IMP_OPENSBI 1 > > > + > > > extern unsigned long sbi_spec_version; > > > struct sbiret { > > > long error; > > > @@ -259,6 +262,8 @@ struct sbiret { > > > }; > > > > > > void sbi_init(void); > > > +void sbi_apply_reserved_mem_erratum(void *dtb_va); > > > + > > > struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, > > > unsigned long arg1, unsigned long arg2, > > > unsigned long arg3, unsigned long arg4, > > > diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c > > > index c672c8ba9a2a..aeb27263fa53 100644 > > > --- a/arch/riscv/kernel/sbi.c > > > +++ b/arch/riscv/kernel/sbi.c > > > @@ -5,8 +5,10 @@ > > > * Copyright (c) 2020 Western Digital Corporation or its affiliates. > > > */ > > > > > > +#include <linux/acpi.h> > > > #include <linux/bits.h> > > > #include <linux/init.h> > > > +#include <linux/libfdt.h> > > > #include <linux/pm.h> > > > #include <linux/reboot.h> > > > #include <asm/sbi.h> > > > @@ -583,6 +585,40 @@ long sbi_get_mimpid(void) > > > } > > > EXPORT_SYMBOL_GPL(sbi_get_mimpid); > > > > > > +static long sbi_firmware_id; > > > +static long sbi_firmware_version; > > > + > > > +/* > > > + * For devicetrees patched by OpenSBI a "mmode_resv" node is added to cover > > > + * the region OpenSBI has protected by means of a PMP. Some versions of OpenSBI, > > > + * [v0.8 to v1.3), omitted the "no-map" property, but this trips up hibernation > > > + * among other things. > > > + */ > > > +void __init sbi_apply_reserved_mem_erratum(void *dtb_pa) > > > +{ > > > + int child, reserved_mem; > > > + > > > + if (sbi_firmware_id != SBI_IMP_OPENSBI) > > > + return; > > > + > > > + if (!acpi_disabled) > > > + return; > > > + > > > + if (sbi_firmware_version >= 0x10003 || sbi_firmware_version < 0x8) > > > + return; > > > + > > > + reserved_mem = fdt_path_offset((void *)dtb_pa, "/reserved-memory"); > > > + if (reserved_mem < 0) > > > + return; > > > + > > > + fdt_for_each_subnode(child, (void *)dtb_pa, reserved_mem) { > > > + const char *name = fdt_get_name((void *)dtb_pa, child, NULL); > > > + > > > + if (!strncmp(name, "mmode_resv", 10)) > > > + fdt_setprop((void *)dtb_pa, child, "no-map", NULL, 0); > > > + } > > > +}; > > > + > > > void __init sbi_init(void) > > > { > > > int ret; > > > @@ -596,8 +632,12 @@ void __init sbi_init(void) > > > sbi_major_version(), sbi_minor_version()); > > > > > > if (!sbi_spec_is_0_1()) { > > > + sbi_firmware_id = sbi_get_firmware_id(); > > > + sbi_firmware_version = sbi_get_firmware_version(); > > > + > > > pr_info("SBI implementation ID=0x%lx Version=0x%lx\n", > > > - sbi_get_firmware_id(), sbi_get_firmware_version()); > > > + sbi_firmware_id, sbi_firmware_version); > > > + > > > if (sbi_probe_extension(SBI_EXT_TIME)) { > > > __sbi_set_timer = __sbi_set_timer_v02; > > > pr_info("SBI TIME extension detected\n"); > > > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c > > > index 70fb31960b63..cb16bfdeacdb 100644 > > > --- a/arch/riscv/mm/init.c > > > +++ b/arch/riscv/mm/init.c > > > @@ -29,6 +29,7 @@ > > > #include <asm/tlbflush.h> > > > #include <asm/sections.h> > > > #include <asm/soc.h> > > > +#include <asm/sbi.h> > > > #include <asm/io.h> > > > #include <asm/ptdump.h> > > > #include <asm/numa.h> > > > @@ -253,6 +254,8 @@ static void __init setup_bootmem(void) > > > * in the device tree, otherwise the allocation could end up in a > > > * reserved region. > > > */ > > > + > > > + sbi_apply_reserved_mem_erratum(dtb_early_va); > > > early_init_fdt_scan_reserved_mem(); > > > > > > /* > > > -- > > > 2.40.1 > > >
> -----Original Message----- > From: Conor Dooley <conor@kernel.org> > Sent: Tuesday, August 8, 2023 9:13 PM > To: JeeHeng Sia <jeeheng.sia@starfivetech.com> > Cc: Conor Dooley <conor.dooley@microchip.com>; palmer@dabbelt.com; Paul Walmsley <paul.walmsley@sifive.com>; Atish Patra > <atishp@rivosinc.com>; Anup Patel <apatel@ventanamicro.com>; Alexandre Ghiti <alexghiti@rivosinc.com>; Björn Töpel > <bjorn@rivosinc.com>; Song Shuai <suagrfillet@gmail.com>; Petr Tesarik <petrtesarik@huaweicloud.com>; linux- > riscv@lists.infradead.org; linux-kernel@vger.kernel.org; stable@vger.kernel.org > Subject: Re: [RFT 1/2] RISC-V: handle missing "no-map" properties for OpenSBI's PMP protected regions > > On Mon, Aug 07, 2023 at 12:44:07AM +0000, JeeHeng Sia wrote: > > > > +/* SBI implementation IDs */ > > > +#define SBI_IMP_OPENSBI 1 > > I would suggest to create an enum struct for the SBI Imp ID in > > the sbi.h file. What do you think? > > I'm not really sure what the advantage of doing so is. The macro SBI_IMP_OPENSBI seems weird (I would read it as "SBI Implementation OpenSBI"). However, if we implement an enum struct for SBI_IMP_ID (There are numerous IDs available), the macro can be abbreviated to OpenSBI. By doing this, the conditional checking of the implementation ID would be more readable, as shown below: if (sbi_firmware_id != OPENSBI)
On Wed, Aug 09, 2023 at 10:24:57AM +0000, JeeHeng Sia wrote: > > > > -----Original Message----- > > From: Conor Dooley <conor@kernel.org> > > Sent: Tuesday, August 8, 2023 9:13 PM > > To: JeeHeng Sia <jeeheng.sia@starfivetech.com> > > Cc: Conor Dooley <conor.dooley@microchip.com>; palmer@dabbelt.com; Paul Walmsley <paul.walmsley@sifive.com>; Atish Patra > > <atishp@rivosinc.com>; Anup Patel <apatel@ventanamicro.com>; Alexandre Ghiti <alexghiti@rivosinc.com>; Björn Töpel > > <bjorn@rivosinc.com>; Song Shuai <suagrfillet@gmail.com>; Petr Tesarik <petrtesarik@huaweicloud.com>; linux- > > riscv@lists.infradead.org; linux-kernel@vger.kernel.org; stable@vger.kernel.org > > Subject: Re: [RFT 1/2] RISC-V: handle missing "no-map" properties for OpenSBI's PMP protected regions > > > > On Mon, Aug 07, 2023 at 12:44:07AM +0000, JeeHeng Sia wrote: > > > > > > +/* SBI implementation IDs */ > > > > +#define SBI_IMP_OPENSBI 1 > > > I would suggest to create an enum struct for the SBI Imp ID in > > > the sbi.h file. What do you think? > > > > I'm not really sure what the advantage of doing so is. > The macro SBI_IMP_OPENSBI seems weird (I would read it as "SBI Implementation OpenSBI"). That is what it is though, so I don't see what's weird about that. > However, if we implement an enum struct for SBI_IMP_ID > (There are numerous IDs available), Ohh I know, but I didn't see the point adding those when I was only focusing on a single implementation. > the macro can be abbreviated to OpenSBI. By doing this, the conditional > checking of the implementation ID would be more readable, as shown below: > if (sbi_firmware_id != OPENSBI) I don't see that it can become that simple, it'd still need to be prefixed with SBI_ to be consistent with any other SBI related enum, and at that point adding the extra IMP_ makes little odds.
On Wed, Aug 09, 2023 at 02:01:07AM -0700, Atish Kumar Patra wrote: > On Tue, Aug 8, 2023 at 6:39 AM Conor Dooley <conor@kernel.org> wrote: > > > > On Tue, Aug 08, 2023 at 12:54:11AM -0700, Atish Kumar Patra wrote: > > > On Wed, Aug 2, 2023 at 4:14 AM Conor Dooley <conor.dooley@microchip.com> wrote: > > > > > > > > Add an erratum for versions [v0.8 to v1.3) of OpenSBI which fail to add > > > > the "no-map" property to the reserved memory nodes for the regions it > > > > has protected using PMPs. > > > > > > > > Our existing fix sweeping hibernation under the carpet by marking it > > > > NONPORTABLE is insufficient as there are other ways to generate > > > > accesses to these reserved memory regions, as Petr discovered [1] > > > > while testing crash kernels & kdump. > > > > > > > > Intercede during the boot process when the afflicted versions of OpenSBI > > > > are present & set the "no-map" property in all "mmode_resv" nodes before > > > > the kernel does its reserved memory region initialisation. > > > > > > > > > > We have different mechanisms of DT being passed to the kernel. > > > > > > 1. A prior stage(e.g U-Boot SPL) to M-mode runtime firmware (e.g. > > > OpenSBI, rustSBI) passes the DT to M-mode runtime firmware and it > > > passes to the next stage. > > > In this case, M-mode runtime firmware gets a chance to update the > > > no-map property in DT that the kernel can parse. > > > > > > 2. User loads the DT from the boot loader (e.g EDK2, U-Boot proper). > > > Any DT patching done by the M-mode firmware is useless. If these DTBs > > > don't have the no-map > > > property, hibernation or EFI booting will have issues as well. > > > > > > > > We are trying to solve only one part of problem #1 in this patch. > > > > Correct. > > > > If someone's second stage is also providing an incorrect devicetree > > then, yeah, this approach would fall apart - but it's the firmware > > provided devicetree being incorrect that I am trying to account for > > here. If a person incorrectly constructed one, I am not really sure what > > we can do for them, they incorrect described their hardware /shrug > > My patch should of course help in some of the scenarios you mention above > > if the name of the reserved memory region from OpenSBI is propagated by > > the second-stage bootloader, but that is just an extension of case 1, > > not case 2. > > > > > I > > > don't think any other M-mode runtime firmware patches DT with no-map > > > property as well. > > > Please let me know if I am wrong about that. The problem is not > > > restricted to [v0.8 to v1.3) of OpenSBI. > > > > It comes down to Alex's question - do we want to fix this kind of > > firmware issue in the kernel? Ultimately this is a policy decision that > > "somebody" has to make. Maybe the list of firmwares that need this > > IMO, we shouldn't as this is a slippery slope. Kernel can't fix every > firmware bug by having erratas. > I agree with your point below about firmware in shipping products. I > am not aware of any official products shipping anything other than > OpenSBI either. > However, I have seen users using other firmwares in their dev > environment. If someone's already changed their boards firmware, I have less sympathy for them, as they should be able to make further changes. Punters buying SBCs to install Fedora or Debian w/o having to consider their firmware are who I am more interested in helping. > IMHO, this approach sets a bad precedent for the future especially > when it only solves one part of the problem. Yeah, I'm certainly wary of setting an unwise precedent here. Inevitably we will need to have firmware-related errata and it'd be good to have a policy for what is (or more importantly what isn't acceptable). Certainly we have said that known-broken version of OpenSBI that T-Head puts in their SDK is not supported by the mainline kernel. On the latter part, I'm perfectly happy to expand the erratum to cover all affected firmwares, but I wasn't even sure if my fix worked properly, hence the request for testing from those who encountered the problem. > We shouldn't hide firmware bugs in the kernel when an upgraded > firmware is already available. Just to note, availability of an updated firmware upstream does not necessarily mean that corresponding update is possible for affected hardware. > This bug is well documented in various threads and fixed in the latest > version of OpenSBI. > I am assuming other firmwares will follow it as well. > > Anybody facing hibernation or efi related booting issues should just > upgrade to the latest version of firmware (e.g OpenSBI v1.3) > Latest version of Qemu will support(if not happened already) the > latest version of OpenSBI. > > This issue will only manifest in kernels 6.4 or higher. Any user > facing these with the latest kernel can also upgrade the firmware. > Do you see any issue with that ? I don't think it is fair to compare the ease of upgrading the kernel to that required to upgrade a boards firmware, with the latter being far, far more inconvenient on pretty much all of the boards that I have. I'm perfectly happy to drop this series though, if people generally are of the opinion that this sort of firmware workaround is ill-advised. We are unaffected by it, so I certainly have no pressure to have something working here. It's my desire not to be user-hostile that motivated this patch.
On Thu, 10 Aug 2023 02:07:10 PDT (-0700), Conor Dooley wrote: > On Wed, Aug 09, 2023 at 02:01:07AM -0700, Atish Kumar Patra wrote: >> On Tue, Aug 8, 2023 at 6:39 AM Conor Dooley <conor@kernel.org> wrote: >> > >> > On Tue, Aug 08, 2023 at 12:54:11AM -0700, Atish Kumar Patra wrote: >> > > On Wed, Aug 2, 2023 at 4:14 AM Conor Dooley <conor.dooley@microchip.com> wrote: >> > > > >> > > > Add an erratum for versions [v0.8 to v1.3) of OpenSBI which fail to add >> > > > the "no-map" property to the reserved memory nodes for the regions it >> > > > has protected using PMPs. >> > > > >> > > > Our existing fix sweeping hibernation under the carpet by marking it >> > > > NONPORTABLE is insufficient as there are other ways to generate >> > > > accesses to these reserved memory regions, as Petr discovered [1] >> > > > while testing crash kernels & kdump. >> > > > >> > > > Intercede during the boot process when the afflicted versions of OpenSBI >> > > > are present & set the "no-map" property in all "mmode_resv" nodes before >> > > > the kernel does its reserved memory region initialisation. >> > > > >> > > >> > > We have different mechanisms of DT being passed to the kernel. >> > > >> > > 1. A prior stage(e.g U-Boot SPL) to M-mode runtime firmware (e.g. >> > > OpenSBI, rustSBI) passes the DT to M-mode runtime firmware and it >> > > passes to the next stage. >> > > In this case, M-mode runtime firmware gets a chance to update the >> > > no-map property in DT that the kernel can parse. >> > > >> > > 2. User loads the DT from the boot loader (e.g EDK2, U-Boot proper). >> > > Any DT patching done by the M-mode firmware is useless. If these DTBs >> > > don't have the no-map >> > > property, hibernation or EFI booting will have issues as well. >> > > >> > >> > > We are trying to solve only one part of problem #1 in this patch. >> > >> > Correct. >> > >> > If someone's second stage is also providing an incorrect devicetree >> > then, yeah, this approach would fall apart - but it's the firmware >> > provided devicetree being incorrect that I am trying to account for >> > here. If a person incorrectly constructed one, I am not really sure what >> > we can do for them, they incorrect described their hardware /shrug >> > My patch should of course help in some of the scenarios you mention above >> > if the name of the reserved memory region from OpenSBI is propagated by >> > the second-stage bootloader, but that is just an extension of case 1, >> > not case 2. >> > >> > > I >> > > don't think any other M-mode runtime firmware patches DT with no-map >> > > property as well. >> > > Please let me know if I am wrong about that. The problem is not >> > > restricted to [v0.8 to v1.3) of OpenSBI. >> > >> > It comes down to Alex's question - do we want to fix this kind of >> > firmware issue in the kernel? Ultimately this is a policy decision that >> > "somebody" has to make. Maybe the list of firmwares that need this >> >> IMO, we shouldn't as this is a slippery slope. Kernel can't fix every >> firmware bug by having erratas. >> I agree with your point below about firmware in shipping products. I >> am not aware of any official products shipping anything other than >> OpenSBI either. > >> However, I have seen users using other firmwares in their dev >> environment. > > If someone's already changed their boards firmware, I have less sympathy > for them, as they should be able to make further changes. Punters buying > SBCs to install Fedora or Debian w/o having to consider their firmware > are who I am more interested in helping. > >> IMHO, this approach sets a bad precedent for the future especially >> when it only solves one part of the problem. > > Yeah, I'm certainly wary of setting an unwise precedent here. > Inevitably we will need to have firmware-related errata and it'd be good > to have a policy for what is (or more importantly what isn't > acceptable). Certainly we have said that known-broken version of OpenSBI > that T-Head puts in their SDK is not supported by the mainline kernel. > On the latter part, I'm perfectly happy to expand the erratum to cover > all affected firmwares, but I wasn't even sure if my fix worked > properly, hence the request for testing from those who encountered the > problem. > >> We shouldn't hide firmware bugs in the kernel when an upgraded >> firmware is already available. > > Just to note, availability of an updated firmware upstream does not > necessarily mean that corresponding update is possible for affected > hardware. Yep. I think we're been in a very hobbist-centric world in RISC-V land, but in general trying to get people to update firmware is hard. Part of the whole "kernel updates don't break users" thing is what's underneath the kernel, it's not just a uABI thing. >> This bug is well documented in various threads and fixed in the latest >> version of OpenSBI. >> I am assuming other firmwares will follow it as well. >> >> Anybody facing hibernation or efi related booting issues should just >> upgrade to the latest version of firmware (e.g OpenSBI v1.3) >> Latest version of Qemu will support(if not happened already) the >> latest version of OpenSBI. >> >> This issue will only manifest in kernels 6.4 or higher. Any user >> facing these with the latest kernel can also upgrade the firmware. >> Do you see any issue with that ? > > I don't think it is fair to compare the ease of upgrading the kernel > to that required to upgrade a boards firmware, with the latter being > far, far more inconvenient on pretty much all of the boards that I have. IMO we're in the same spot as every other port here, and generally they work around firmware bugs when they've rolled out into production somewhere that firmware updates aren't likely to happen quickly. I'm not sure if there's any sort of exact rules written down anywhere, but IMO if the bug is going to impact users then we should deal with it. That applies for hardware bugs, but also firmware bugs (at a certain point we won't be able to tell the difference). We're sort of doing this with the misaligned access handling, for example. > I'm perfectly happy to drop this series though, if people generally are > of the opinion that this sort of firmware workaround is ill-advised. > We are unaffected by it, so I certainly have no pressure to have > something working here. It's my desire not to be user-hostile that > motivated this patch. IIUC you guys and Reneas are the only ones who have hardware that might be in a spot where users aren't able to update the firmware (ie, it's out in production somewhere). So I'm adding Geert, though he probably saw this months ago... On that note: It's been ~4 months and it look like nobody's tested anything (and the comments aren't really things that would preculde testing). So maybe we just pick that second patch up into for-next and see what happens? IIUC that will result in broken systems for users who haven't updated their firmware. I agree that's a user-hostile way to do things, which is generally a bad way to go, but if it's really true that there's no users then we're safe. Probably also worth calling it out on sw-dev just to be safe.
On Wed, Dec 06, 2023 at 04:52:11AM -0800, Palmer Dabbelt wrote: > On Thu, 10 Aug 2023 02:07:10 PDT (-0700), Conor Dooley wrote: > > On Wed, Aug 09, 2023 at 02:01:07AM -0700, Atish Kumar Patra wrote: > > > On Tue, Aug 8, 2023 at 6:39 AM Conor Dooley <conor@kernel.org> wrote: > > > > > > > > On Tue, Aug 08, 2023 at 12:54:11AM -0700, Atish Kumar Patra wrote: > > > > > On Wed, Aug 2, 2023 at 4:14 AM Conor Dooley <conor.dooley@microchip.com> wrote: > > > > > > > > > > > > Add an erratum for versions [v0.8 to v1.3) of OpenSBI which fail to add > > > > > > the "no-map" property to the reserved memory nodes for the regions it > > > > > > has protected using PMPs. > > > > > > > > > > > > Our existing fix sweeping hibernation under the carpet by marking it > > > > > > NONPORTABLE is insufficient as there are other ways to generate > > > > > > accesses to these reserved memory regions, as Petr discovered [1] > > > > > > while testing crash kernels & kdump. > > > > > > > > > > > > Intercede during the boot process when the afflicted versions of OpenSBI > > > > > > are present & set the "no-map" property in all "mmode_resv" nodes before > > > > > > the kernel does its reserved memory region initialisation. > > > > > > > > > > > > > > > > We have different mechanisms of DT being passed to the kernel. > > > > > > > > > > 1. A prior stage(e.g U-Boot SPL) to M-mode runtime firmware (e.g. > > > > > OpenSBI, rustSBI) passes the DT to M-mode runtime firmware and it > > > > > passes to the next stage. > > > > > In this case, M-mode runtime firmware gets a chance to update the > > > > > no-map property in DT that the kernel can parse. > > > > > > > > > > 2. User loads the DT from the boot loader (e.g EDK2, U-Boot proper). > > > > > Any DT patching done by the M-mode firmware is useless. If these DTBs > > > > > don't have the no-map > > > > > property, hibernation or EFI booting will have issues as well. > > > > > > > > > > > > > > We are trying to solve only one part of problem #1 in this patch. > > > > > > > > Correct. > > > > > > > > If someone's second stage is also providing an incorrect devicetree > > > > then, yeah, this approach would fall apart - but it's the firmware > > > > provided devicetree being incorrect that I am trying to account for > > > > here. If a person incorrectly constructed one, I am not really sure what > > > > we can do for them, they incorrect described their hardware /shrug > > > > My patch should of course help in some of the scenarios you mention above > > > > if the name of the reserved memory region from OpenSBI is propagated by > > > > the second-stage bootloader, but that is just an extension of case 1, > > > > not case 2. > > > > > > > > > I > > > > > don't think any other M-mode runtime firmware patches DT with no-map > > > > > property as well. > > > > > Please let me know if I am wrong about that. The problem is not > > > > > restricted to [v0.8 to v1.3) of OpenSBI. > > > > > > > > It comes down to Alex's question - do we want to fix this kind of > > > > firmware issue in the kernel? Ultimately this is a policy decision that > > > > "somebody" has to make. Maybe the list of firmwares that need this > > > > > > IMO, we shouldn't as this is a slippery slope. Kernel can't fix every > > > firmware bug by having erratas. > > > I agree with your point below about firmware in shipping products. I > > > am not aware of any official products shipping anything other than > > > OpenSBI either. > > > > > However, I have seen users using other firmwares in their dev > > > environment. > > > > If someone's already changed their boards firmware, I have less sympathy > > for them, as they should be able to make further changes. Punters buying > > SBCs to install Fedora or Debian w/o having to consider their firmware > > are who I am more interested in helping. > > > > > IMHO, this approach sets a bad precedent for the future especially > > > when it only solves one part of the problem. > > > > Yeah, I'm certainly wary of setting an unwise precedent here. > > Inevitably we will need to have firmware-related errata and it'd be good > > to have a policy for what is (or more importantly what isn't > > acceptable). Certainly we have said that known-broken version of OpenSBI > > that T-Head puts in their SDK is not supported by the mainline kernel. > > On the latter part, I'm perfectly happy to expand the erratum to cover > > all affected firmwares, but I wasn't even sure if my fix worked > > properly, hence the request for testing from those who encountered the > > problem. > > > > > We shouldn't hide firmware bugs in the kernel when an upgraded > > > firmware is already available. > > > > Just to note, availability of an updated firmware upstream does not > > necessarily mean that corresponding update is possible for affected > > hardware. > > Yep. I think we're been in a very hobbist-centric world in RISC-V land, but > in general trying to get people to update firmware is hard. Part of the > whole "kernel updates don't break users" thing is what's underneath the > kernel, it's not just a uABI thing. Yeah, there's certainly an attitude that I think needs to go away, that updating firmware etc is something we can expect to be carried out on a universal basis. Or that fixing things in the upstream version of OpenSBI means it'll actually propagate down to system integrators. > > > > This bug is well documented in various threads and fixed in the latest > > > version of OpenSBI. > > > I am assuming other firmwares will follow it as well. > > > > > > Anybody facing hibernation or efi related booting issues should just > > > upgrade to the latest version of firmware (e.g OpenSBI v1.3) > > > Latest version of Qemu will support(if not happened already) the > > > latest version of OpenSBI. > > > > > > This issue will only manifest in kernels 6.4 or higher. Any user > > > facing these with the latest kernel can also upgrade the firmware. > > > Do you see any issue with that ? > > > > I don't think it is fair to compare the ease of upgrading the kernel > > to that required to upgrade a boards firmware, with the latter being > > far, far more inconvenient on pretty much all of the boards that I have. > > IMO we're in the same spot as every other port here, and generally they work > around firmware bugs when they've rolled out into production somewhere that > firmware updates aren't likely to happen quickly. I'm not sure if there's > any sort of exact rules written down anywhere, but IMO if the bug is going > to impact users then we should deal with it. > > That applies for hardware bugs, but also firmware bugs (at a certain point > we won't be able to tell the difference). We're sort of doing this with the > misaligned access handling, for example. > > > I'm perfectly happy to drop this series though, if people generally are > > of the opinion that this sort of firmware workaround is ill-advised. > > We are unaffected by it, so I certainly have no pressure to have > > something working here. It's my desire not to be user-hostile that > > motivated this patch. > > IIUC you guys and Reneas are the only ones who have hardware that might be > in a spot where users aren't able to update the firmware (ie, it's out in > production somewhere). I dunno if we can really keep thinking like that though. In terms of people who have devicetrees in the kernel and stuff available in western catalog distribution, sure. I don't think we can assume that that covers all users though, certainly the syntacore folks pop up every now and then, and I sure hope that Andes etc have larger customer bases than the in-kernel users would suggest. > So I'm adding Geert, though he probably saw this > months ago... Prabhakar might be a good call on that front. I'm not sure if the Renesas stuff works on affected versions of OpenSBI though, guess it depends on the sequencing of the support for the non-coherent stuff and when this bug was fixed. > On that note: It's been ~4 months and it look like nobody's tested anything > (and the comments aren't really things that would preculde testing). Yeah, nobody seems to really have given a crap. I was hoping the StarFive guys that actually added the support for this would be interested in it, but alas they were not. I don't really care all that much - the platform I support is not affected by the problem and I just don't enable the option elsewhere. > So > maybe we just pick that second patch up into for-next and see what happens? > IIUC that will result in broken systems for users who haven't updated their > firmware. > > I agree that's a user-hostile way to do things, which is generally a bad way > to go, but if it's really true that there's no users then we're safe. > Probably also worth calling it out on sw-dev just to be safe. And if there are users, the fix is actually relatively straight-forward, just apply patch #1.
On Wed, Dec 6, 2023 at 2:26 PM Conor Dooley <conor@kernel.org> wrote: > > On Wed, Dec 06, 2023 at 04:52:11AM -0800, Palmer Dabbelt wrote: > > On Thu, 10 Aug 2023 02:07:10 PDT (-0700), Conor Dooley wrote: > > > On Wed, Aug 09, 2023 at 02:01:07AM -0700, Atish Kumar Patra wrote: > > > > On Tue, Aug 8, 2023 at 6:39 AM Conor Dooley <conor@kernel.org> wrote: > > > > > > > > > > On Tue, Aug 08, 2023 at 12:54:11AM -0700, Atish Kumar Patra wrote: > > > > > > On Wed, Aug 2, 2023 at 4:14 AM Conor Dooley <conor.dooley@microchip.com> wrote: > > > > > > > > > > > > > > Add an erratum for versions [v0.8 to v1.3) of OpenSBI which fail to add > > > > > > > the "no-map" property to the reserved memory nodes for the regions it > > > > > > > has protected using PMPs. > > > > > > > > > > > > > > Our existing fix sweeping hibernation under the carpet by marking it > > > > > > > NONPORTABLE is insufficient as there are other ways to generate > > > > > > > accesses to these reserved memory regions, as Petr discovered [1] > > > > > > > while testing crash kernels & kdump. > > > > > > > > > > > > > > Intercede during the boot process when the afflicted versions of OpenSBI > > > > > > > are present & set the "no-map" property in all "mmode_resv" nodes before > > > > > > > the kernel does its reserved memory region initialisation. > > > > > > > > > > > > > > > > > > > We have different mechanisms of DT being passed to the kernel. > > > > > > > > > > > > 1. A prior stage(e.g U-Boot SPL) to M-mode runtime firmware (e.g. > > > > > > OpenSBI, rustSBI) passes the DT to M-mode runtime firmware and it > > > > > > passes to the next stage. > > > > > > In this case, M-mode runtime firmware gets a chance to update the > > > > > > no-map property in DT that the kernel can parse. > > > > > > > > > > > > 2. User loads the DT from the boot loader (e.g EDK2, U-Boot proper). > > > > > > Any DT patching done by the M-mode firmware is useless. If these DTBs > > > > > > don't have the no-map > > > > > > property, hibernation or EFI booting will have issues as well. > > > > > > > > > > > > > > > > > We are trying to solve only one part of problem #1 in this patch. > > > > > > > > > > Correct. > > > > > > > > > > If someone's second stage is also providing an incorrect devicetree > > > > > then, yeah, this approach would fall apart - but it's the firmware > > > > > provided devicetree being incorrect that I am trying to account for > > > > > here. If a person incorrectly constructed one, I am not really sure what > > > > > we can do for them, they incorrect described their hardware /shrug > > > > > My patch should of course help in some of the scenarios you mention above > > > > > if the name of the reserved memory region from OpenSBI is propagated by > > > > > the second-stage bootloader, but that is just an extension of case 1, > > > > > not case 2. > > > > > > > > > > > I > > > > > > don't think any other M-mode runtime firmware patches DT with no-map > > > > > > property as well. > > > > > > Please let me know if I am wrong about that. The problem is not > > > > > > restricted to [v0.8 to v1.3) of OpenSBI. > > > > > > > > > > It comes down to Alex's question - do we want to fix this kind of > > > > > firmware issue in the kernel? Ultimately this is a policy decision that > > > > > "somebody" has to make. Maybe the list of firmwares that need this > > > > > > > > IMO, we shouldn't as this is a slippery slope. Kernel can't fix every > > > > firmware bug by having erratas. > > > > I agree with your point below about firmware in shipping products. I > > > > am not aware of any official products shipping anything other than > > > > OpenSBI either. > > > > > > > However, I have seen users using other firmwares in their dev > > > > environment. > > > > > > If someone's already changed their boards firmware, I have less sympathy > > > for them, as they should be able to make further changes. Punters buying > > > SBCs to install Fedora or Debian w/o having to consider their firmware > > > are who I am more interested in helping. > > > > > > > IMHO, this approach sets a bad precedent for the future especially > > > > when it only solves one part of the problem. > > > > > > Yeah, I'm certainly wary of setting an unwise precedent here. > > > Inevitably we will need to have firmware-related errata and it'd be good > > > to have a policy for what is (or more importantly what isn't > > > acceptable). Certainly we have said that known-broken version of OpenSBI > > > that T-Head puts in their SDK is not supported by the mainline kernel. > > > On the latter part, I'm perfectly happy to expand the erratum to cover > > > all affected firmwares, but I wasn't even sure if my fix worked > > > properly, hence the request for testing from those who encountered the > > > problem. > > > > > > > We shouldn't hide firmware bugs in the kernel when an upgraded > > > > firmware is already available. > > > > > > Just to note, availability of an updated firmware upstream does not > > > necessarily mean that corresponding update is possible for affected > > > hardware. > > > > Yep. I think we're been in a very hobbist-centric world in RISC-V land, but > > in general trying to get people to update firmware is hard. Part of the > > whole "kernel updates don't break users" thing is what's underneath the > > kernel, it's not just a uABI thing. > > Yeah, there's certainly an attitude that I think needs to go away, that > updating firmware etc is something we can expect to be carried out on a > universal basis. Or that fixing things in the upstream version of > OpenSBI means it'll actually propagate down to system integrators. > > > > > > > This bug is well documented in various threads and fixed in the latest > > > > version of OpenSBI. > > > > I am assuming other firmwares will follow it as well. > > > > > > > > Anybody facing hibernation or efi related booting issues should just > > > > upgrade to the latest version of firmware (e.g OpenSBI v1.3) > > > > Latest version of Qemu will support(if not happened already) the > > > > latest version of OpenSBI. > > > > > > > > This issue will only manifest in kernels 6.4 or higher. Any user > > > > facing these with the latest kernel can also upgrade the firmware. > > > > Do you see any issue with that ? > > > > > > I don't think it is fair to compare the ease of upgrading the kernel > > > to that required to upgrade a boards firmware, with the latter being > > > far, far more inconvenient on pretty much all of the boards that I have. > > > > IMO we're in the same spot as every other port here, and generally they work > > around firmware bugs when they've rolled out into production somewhere that > > firmware updates aren't likely to happen quickly. I'm not sure if there's > > any sort of exact rules written down anywhere, but IMO if the bug is going > > to impact users then we should deal with it. > > > > That applies for hardware bugs, but also firmware bugs (at a certain point > > we won't be able to tell the difference). We're sort of doing this with the > > misaligned access handling, for example. > > > > > I'm perfectly happy to drop this series though, if people generally are > > > of the opinion that this sort of firmware workaround is ill-advised. > > > We are unaffected by it, so I certainly have no pressure to have > > > something working here. It's my desire not to be user-hostile that > > > motivated this patch. > > > > IIUC you guys and Reneas are the only ones who have hardware that might be > > in a spot where users aren't able to update the firmware (ie, it's out in > > production somewhere). > > I dunno if we can really keep thinking like that though. In terms of > people who have devicetrees in the kernel and stuff available in western > catalog distribution, sure. > I don't think we can assume that that covers all users though, certainly > the syntacore folks pop up every now and then, and I sure hope that > Andes etc have larger customer bases than the in-kernel users would > suggest. > > > So I'm adding Geert, though he probably saw this > > months ago... > > Prabhakar might be a good call on that front. I'm not sure if the > Renesas stuff works on affected versions of OpenSBI though, guess it > depends on the sequencing of the support for the non-coherent stuff and > when this bug was fixed. > ATM, I dont think there are any users who are using the upstream kernel + OpenSBI (apart from me and Geert!). Currently the customers are using the BSP releases. Cheers, Prabhakar > > On that note: It's been ~4 months and it look like nobody's tested anything > > (and the comments aren't really things that would preculde testing). > > Yeah, nobody seems to really have given a crap. I was hoping the > StarFive guys that actually added the support for this would be > interested in it, but alas they were not. > I don't really care all that much - the platform I support is not > affected by the problem and I just don't enable the option elsewhere. > > > So > > maybe we just pick that second patch up into for-next and see what happens? > > IIUC that will result in broken systems for users who haven't updated their > > firmware. > > > > I agree that's a user-hostile way to do things, which is generally a bad way > > to go, but if it's really true that there's no users then we're safe. > > Probably also worth calling it out on sw-dev just to be safe. > > And if there are users, the fix is actually relatively straight-forward, > just apply patch #1. > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv
On Thu, Dec 07, 2023 at 01:02:00PM +0000, Lad, Prabhakar wrote: > On Wed, Dec 6, 2023 at 2:26 PM Conor Dooley <conor@kernel.org> wrote: > > On Wed, Dec 06, 2023 at 04:52:11AM -0800, Palmer Dabbelt wrote: > > > On Thu, 10 Aug 2023 02:07:10 PDT (-0700), Conor Dooley wrote: > > > > I'm perfectly happy to drop this series though, if people generally are > > > > of the opinion that this sort of firmware workaround is ill-advised. > > > > We are unaffected by it, so I certainly have no pressure to have > > > > something working here. It's my desire not to be user-hostile that > > > > motivated this patch. > > > > > > IIUC you guys and Reneas are the only ones who have hardware that might be > > > in a spot where users aren't able to update the firmware (ie, it's out in > > > production somewhere). > > > > I dunno if we can really keep thinking like that though. In terms of > > people who have devicetrees in the kernel and stuff available in western > > catalog distribution, sure. > > I don't think we can assume that that covers all users though, certainly > > the syntacore folks pop up every now and then, and I sure hope that > > Andes etc have larger customer bases than the in-kernel users would > > suggest. > > > > > So I'm adding Geert, though he probably saw this > > > months ago... > > > > Prabhakar might be a good call on that front. I'm not sure if the > > Renesas stuff works on affected versions of OpenSBI though, guess it > > depends on the sequencing of the support for the non-coherent stuff and > > when this bug was fixed. > > > ATM, I dont think there are any users who are using the upstream > kernel + OpenSBI (apart from me and Geert!). Currently the customers > are using the BSP releases. That doesn't really answer whether or not you (and your customers) are using an affected version of the vendor OpenSBI? The affected range for OpenSBI itself is [v0.8 to v1.3).
Hey, On Thu, Dec 07, 2023 at 01:11:23PM +0000, Conor Dooley wrote: > On Thu, Dec 07, 2023 at 01:02:00PM +0000, Lad, Prabhakar wrote: > > On Wed, Dec 6, 2023 at 2:26 PM Conor Dooley <conor@kernel.org> wrote: > > > On Wed, Dec 06, 2023 at 04:52:11AM -0800, Palmer Dabbelt wrote: > > > > On Thu, 10 Aug 2023 02:07:10 PDT (-0700), Conor Dooley wrote: > > > > > > I'm perfectly happy to drop this series though, if people generally are > > > > > of the opinion that this sort of firmware workaround is ill-advised. > > > > > We are unaffected by it, so I certainly have no pressure to have > > > > > something working here. It's my desire not to be user-hostile that > > > > > motivated this patch. > > > > > > > > IIUC you guys and Reneas are the only ones who have hardware that might be > > > > in a spot where users aren't able to update the firmware (ie, it's out in > > > > production somewhere). > > > > > > I dunno if we can really keep thinking like that though. In terms of > > > people who have devicetrees in the kernel and stuff available in western > > > catalog distribution, sure. > > > I don't think we can assume that that covers all users though, certainly > > > the syntacore folks pop up every now and then, and I sure hope that > > > Andes etc have larger customer bases than the in-kernel users would > > > suggest. > > > > > > > So I'm adding Geert, though he probably saw this > > > > months ago... > > > > > > Prabhakar might be a good call on that front. I'm not sure if the > > > Renesas stuff works on affected versions of OpenSBI though, guess it > > > depends on the sequencing of the support for the non-coherent stuff and > > > when this bug was fixed. > > > > > ATM, I dont think there are any users who are using the upstream > > kernel + OpenSBI (apart from me and Geert!). Currently the customers > > are using the BSP releases. > > That doesn't really answer whether or not you (and your customers) are > using an affected version of the vendor OpenSBI? > The affected range for OpenSBI itself is [v0.8 to v1.3). Did you perhaps miss this mail Prabhakar? Cheers, Conor.
Hi Conor, > -----Original Message----- > From: Conor Dooley <conor@kernel.org> > Sent: Tuesday, December 19, 2023 5:18 PM > To: Conor Dooley <conor.dooley@microchip.com> > Cc: Lad, Prabhakar <prabhakar.csengg@gmail.com>; Palmer Dabbelt <palmer@dabbelt.com>; > geert+renesas@glider.be; Atish Patra <atishp@rivosinc.com>; Paul Walmsley <paul.walmsley@sifive.com>; > apatel@ventanamicro.com; alexghiti@rivosinc.com; Bjorn Topel <bjorn@rivosinc.com>; > suagrfillet@gmail.com; jeeheng.sia@starfivetech.com; petrtesarik@huaweicloud.com; linux- > riscv@lists.infradead.org; linux-kernel@vger.kernel.org; stable@vger.kernel.org; Prabhakar Mahadev Lad > <prabhakar.mahadev-lad.rj@bp.renesas.com> > Subject: Re: [RFT 1/2] RISC-V: handle missing "no-map" properties for OpenSBI's PMP protected regions > > Hey, > > On Thu, Dec 07, 2023 at 01:11:23PM +0000, Conor Dooley wrote: > > On Thu, Dec 07, 2023 at 01:02:00PM +0000, Lad, Prabhakar wrote: > > > On Wed, Dec 6, 2023 at 2:26 PM Conor Dooley <conor@kernel.org> wrote: > > > > On Wed, Dec 06, 2023 at 04:52:11AM -0800, Palmer Dabbelt wrote: > > > > > On Thu, 10 Aug 2023 02:07:10 PDT (-0700), Conor Dooley wrote: > > > > > > > > I'm perfectly happy to drop this series though, if people > > > > > > generally are of the opinion that this sort of firmware workaround is ill-advised. > > > > > > We are unaffected by it, so I certainly have no pressure to > > > > > > have something working here. It's my desire not to be > > > > > > user-hostile that motivated this patch. > > > > > > > > > > IIUC you guys and Reneas are the only ones who have hardware > > > > > that might be in a spot where users aren't able to update the > > > > > firmware (ie, it's out in production somewhere). > > > > > > > > I dunno if we can really keep thinking like that though. In terms > > > > of people who have devicetrees in the kernel and stuff available > > > > in western catalog distribution, sure. > > > > I don't think we can assume that that covers all users though, > > > > certainly the syntacore folks pop up every now and then, and I > > > > sure hope that Andes etc have larger customer bases than the > > > > in-kernel users would suggest. > > > > > > > > > So I'm adding Geert, though he probably saw this months ago... > > > > > > > > Prabhakar might be a good call on that front. I'm not sure if the > > > > Renesas stuff works on affected versions of OpenSBI though, guess > > > > it depends on the sequencing of the support for the non-coherent > > > > stuff and when this bug was fixed. > > > > > > > ATM, I dont think there are any users who are using the upstream > > > kernel + OpenSBI (apart from me and Geert!). Currently the customers > > > are using the BSP releases. > > > > That doesn't really answer whether or not you (and your customers) are > > using an affected version of the vendor OpenSBI? > > The affected range for OpenSBI itself is [v0.8 to v1.3). > > Did you perhaps miss this mail Prabhakar? > Oops sorry for that. I can confirm the BSP version used by the customers is v1.0 [0]. [0] https://github.com/renesas-rz/rz_opensbi/commits/work/OpenSBI-PMA/ Cheers, Prabhakar
On Tue, 19 Dec 2023 09:27:42 PST (-0800), prabhakar.mahadev-lad.rj@bp.renesas.com wrote: > Hi Conor, > >> -----Original Message----- >> From: Conor Dooley <conor@kernel.org> >> Sent: Tuesday, December 19, 2023 5:18 PM >> To: Conor Dooley <conor.dooley@microchip.com> >> Cc: Lad, Prabhakar <prabhakar.csengg@gmail.com>; Palmer Dabbelt <palmer@dabbelt.com>; >> geert+renesas@glider.be; Atish Patra <atishp@rivosinc.com>; Paul Walmsley <paul.walmsley@sifive.com>; >> apatel@ventanamicro.com; alexghiti@rivosinc.com; Bjorn Topel <bjorn@rivosinc.com>; >> suagrfillet@gmail.com; jeeheng.sia@starfivetech.com; petrtesarik@huaweicloud.com; linux- >> riscv@lists.infradead.org; linux-kernel@vger.kernel.org; stable@vger.kernel.org; Prabhakar Mahadev Lad >> <prabhakar.mahadev-lad.rj@bp.renesas.com> >> Subject: Re: [RFT 1/2] RISC-V: handle missing "no-map" properties for OpenSBI's PMP protected regions >> >> Hey, >> >> On Thu, Dec 07, 2023 at 01:11:23PM +0000, Conor Dooley wrote: >> > On Thu, Dec 07, 2023 at 01:02:00PM +0000, Lad, Prabhakar wrote: >> > > On Wed, Dec 6, 2023 at 2:26 PM Conor Dooley <conor@kernel.org> wrote: >> > > > On Wed, Dec 06, 2023 at 04:52:11AM -0800, Palmer Dabbelt wrote: >> > > > > On Thu, 10 Aug 2023 02:07:10 PDT (-0700), Conor Dooley wrote: >> > >> > > > > > I'm perfectly happy to drop this series though, if people >> > > > > > generally are of the opinion that this sort of firmware workaround is ill-advised. >> > > > > > We are unaffected by it, so I certainly have no pressure to >> > > > > > have something working here. It's my desire not to be >> > > > > > user-hostile that motivated this patch. >> > > > > >> > > > > IIUC you guys and Reneas are the only ones who have hardware >> > > > > that might be in a spot where users aren't able to update the >> > > > > firmware (ie, it's out in production somewhere). >> > > > >> > > > I dunno if we can really keep thinking like that though. In terms >> > > > of people who have devicetrees in the kernel and stuff available >> > > > in western catalog distribution, sure. >> > > > I don't think we can assume that that covers all users though, >> > > > certainly the syntacore folks pop up every now and then, and I >> > > > sure hope that Andes etc have larger customer bases than the >> > > > in-kernel users would suggest. >> > > > >> > > > > So I'm adding Geert, though he probably saw this months ago... >> > > > >> > > > Prabhakar might be a good call on that front. I'm not sure if the >> > > > Renesas stuff works on affected versions of OpenSBI though, guess >> > > > it depends on the sequencing of the support for the non-coherent >> > > > stuff and when this bug was fixed. >> > > > >> > > ATM, I dont think there are any users who are using the upstream >> > > kernel + OpenSBI (apart from me and Geert!). Currently the customers >> > > are using the BSP releases. >> > >> > That doesn't really answer whether or not you (and your customers) are >> > using an affected version of the vendor OpenSBI? >> > The affected range for OpenSBI itself is [v0.8 to v1.3). >> >> Did you perhaps miss this mail Prabhakar? >> > Oops sorry for that. > > I can confirm the BSP version used by the customers is v1.0 [0]. > > [0] https://github.com/renesas-rz/rz_opensbi/commits/work/OpenSBI-PMA/ OK, so sounds like would end up with broken systems from this bug, then? IIRC we still have the Renesas systems as NONPORTABLE due to that DMA pool Kconfig conflict. So if it's really only these Renesas systems that have the bug, maybe we can still remove hibernation from NONPORTABLE and just add in some sort of Kconfig to disable the Renesas+hibernation combinations that would break? > > Cheers, > Prabhakar
Hi Palmer, > -----Original Message----- > From: Palmer Dabbelt <palmer@dabbelt.com> > Sent: Tuesday, December 19, 2023 6:07 PM > To: Prabhakar Mahadev Lad <prabhakar.mahadev-lad.rj@bp.renesas.com> > Cc: Conor Dooley <conor@kernel.org>; Conor Dooley <conor.dooley@microchip.com>; > prabhakar.csengg@gmail.com; geert+renesas@glider.be; Atish Patra <atishp@rivosinc.com>; Paul Walmsley > <paul.walmsley@sifive.com>; apatel@ventanamicro.com; alexghiti@rivosinc.com; Bjorn Topel > <bjorn@rivosinc.com>; suagrfillet@gmail.com; jeeheng.sia@starfivetech.com; > petrtesarik@huaweicloud.com; linux-riscv@lists.infradead.org; linux-kernel@vger.kernel.org; > stable@vger.kernel.org > Subject: RE: [RFT 1/2] RISC-V: handle missing "no-map" properties for OpenSBI's PMP protected regions > > On Tue, 19 Dec 2023 09:27:42 PST (-0800), prabhakar.mahadev-lad.rj@bp.renesas.com wrote: > > Hi Conor, > > > >> -----Original Message----- > >> From: Conor Dooley <conor@kernel.org> > >> Sent: Tuesday, December 19, 2023 5:18 PM > >> To: Conor Dooley <conor.dooley@microchip.com> > >> Cc: Lad, Prabhakar <prabhakar.csengg@gmail.com>; Palmer Dabbelt > >> <palmer@dabbelt.com>; > >> geert+renesas@glider.be; Atish Patra <atishp@rivosinc.com>; Paul > >> geert+Walmsley <paul.walmsley@sifive.com>; > >> apatel@ventanamicro.com; alexghiti@rivosinc.com; Bjorn Topel > >> <bjorn@rivosinc.com>; suagrfillet@gmail.com; > >> jeeheng.sia@starfivetech.com; petrtesarik@huaweicloud.com; linux- > >> riscv@lists.infradead.org; linux-kernel@vger.kernel.org; > >> stable@vger.kernel.org; Prabhakar Mahadev Lad > >> <prabhakar.mahadev-lad.rj@bp.renesas.com> > >> Subject: Re: [RFT 1/2] RISC-V: handle missing "no-map" properties for > >> OpenSBI's PMP protected regions > >> > >> Hey, > >> > >> On Thu, Dec 07, 2023 at 01:11:23PM +0000, Conor Dooley wrote: > >> > On Thu, Dec 07, 2023 at 01:02:00PM +0000, Lad, Prabhakar wrote: > >> > > On Wed, Dec 6, 2023 at 2:26 PM Conor Dooley <conor@kernel.org> wrote: > >> > > > On Wed, Dec 06, 2023 at 04:52:11AM -0800, Palmer Dabbelt wrote: > >> > > > > On Thu, 10 Aug 2023 02:07:10 PDT (-0700), Conor Dooley wrote: > >> > > >> > > > > > I'm perfectly happy to drop this series though, if people > >> > > > > > generally are of the opinion that this sort of firmware workaround is ill-advised. > >> > > > > > We are unaffected by it, so I certainly have no pressure to > >> > > > > > have something working here. It's my desire not to be > >> > > > > > user-hostile that motivated this patch. > >> > > > > > >> > > > > IIUC you guys and Reneas are the only ones who have hardware > >> > > > > that might be in a spot where users aren't able to update the > >> > > > > firmware (ie, it's out in production somewhere). > >> > > > > >> > > > I dunno if we can really keep thinking like that though. In > >> > > > terms of people who have devicetrees in the kernel and stuff > >> > > > available in western catalog distribution, sure. > >> > > > I don't think we can assume that that covers all users though, > >> > > > certainly the syntacore folks pop up every now and then, and I > >> > > > sure hope that Andes etc have larger customer bases than the > >> > > > in-kernel users would suggest. > >> > > > > >> > > > > So I'm adding Geert, though he probably saw this months ago... > >> > > > > >> > > > Prabhakar might be a good call on that front. I'm not sure if > >> > > > the Renesas stuff works on affected versions of OpenSBI though, > >> > > > guess it depends on the sequencing of the support for the > >> > > > non-coherent stuff and when this bug was fixed. > >> > > > > >> > > ATM, I dont think there are any users who are using the upstream > >> > > kernel + OpenSBI (apart from me and Geert!). Currently the > >> > > customers are using the BSP releases. > >> > > >> > That doesn't really answer whether or not you (and your customers) > >> > are using an affected version of the vendor OpenSBI? > >> > The affected range for OpenSBI itself is [v0.8 to v1.3). > >> > >> Did you perhaps miss this mail Prabhakar? > >> > > Oops sorry for that. > > > > I can confirm the BSP version used by the customers is v1.0 [0]. > > > > [0] > > https://jpn01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgith > > ub.com%2Frenesas-rz%2Frz_opensbi%2Fcommits%2Fwork%2FOpenSBI-PMA%2F&dat > > a=05%7C02%7Cprabhakar.mahadev-lad.rj%40bp.renesas.com%7C014cf4ddfd1e48 > > 1ff5bc08dc00bd467f%7C53d82571da1947e49cb4625a166a4a2a%7C0%7C0%7C638386 > > 060130410731%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luM > > zIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=a0tcsXY4EQlODi > > I34QygXS9QpJnVBqL8bNkxE8N5J2g%3D&reserved=0 > > OK, so sounds like would end up with broken systems from this bug, then? > > IIRC we still have the Renesas systems as NONPORTABLE due to that DMA pool Kconfig conflict. So if > it's really only these Renesas systems that have the bug, maybe we can still remove hibernation from > NONPORTABLE and just add in some sort of Kconfig to disable the > Renesas+hibernation combinations that would break? > Well customers using BSP uses v1.0 for OpenSBI and kernel 5.10-cip, and people wanting to run upstream kernel will have to only use the upstream OpenSBI as the OpenSBI used in BSP is not compatible with upstream kernel(Linux doesn’t bootup). ATM I can say that its only me and Geert using upstream OpenBSI and upstream kernel. With that in mind would we still require that change? Cheers, Prabhakar
On 19 December 2023 18:38:30 GMT, Prabhakar Mahadev Lad <prabhakar.mahadev-lad.rj@bp.renesas.com> wrote: >Hi Palmer, > >> -----Original Message----- >> From: Palmer Dabbelt <palmer@dabbelt.com> >> Sent: Tuesday, December 19, 2023 6:07 PM >> To: Prabhakar Mahadev Lad <prabhakar.mahadev-lad.rj@bp.renesas.com> >> Cc: Conor Dooley <conor@kernel.org>; Conor Dooley <conor.dooley@microchip.com>; >> prabhakar.csengg@gmail.com; geert+renesas@glider.be; Atish Patra <atishp@rivosinc.com>; Paul Walmsley >> <paul.walmsley@sifive.com>; apatel@ventanamicro.com; alexghiti@rivosinc.com; Bjorn Topel >> <bjorn@rivosinc.com>; suagrfillet@gmail.com; jeeheng.sia@starfivetech.com; >> petrtesarik@huaweicloud.com; linux-riscv@lists.infradead.org; linux-kernel@vger.kernel.org; >> stable@vger.kernel.org >> Subject: RE: [RFT 1/2] RISC-V: handle missing "no-map" properties for OpenSBI's PMP protected regions >> >> On Tue, 19 Dec 2023 09:27:42 PST (-0800), prabhakar.mahadev-lad.rj@bp.renesas.com wrote: >> > Hi Conor, >> > >> >> -----Original Message----- >> >> From: Conor Dooley <conor@kernel.org> >> >> Sent: Tuesday, December 19, 2023 5:18 PM >> >> To: Conor Dooley <conor.dooley@microchip.com> >> >> Cc: Lad, Prabhakar <prabhakar.csengg@gmail.com>; Palmer Dabbelt >> >> <palmer@dabbelt.com>; >> >> geert+renesas@glider.be; Atish Patra <atishp@rivosinc.com>; Paul >> >> geert+Walmsley <paul.walmsley@sifive.com>; >> >> apatel@ventanamicro.com; alexghiti@rivosinc.com; Bjorn Topel >> >> <bjorn@rivosinc.com>; suagrfillet@gmail.com; >> >> jeeheng.sia@starfivetech.com; petrtesarik@huaweicloud.com; linux- >> >> riscv@lists.infradead.org; linux-kernel@vger.kernel.org; >> >> stable@vger.kernel.org; Prabhakar Mahadev Lad >> >> <prabhakar.mahadev-lad.rj@bp.renesas.com> >> >> Subject: Re: [RFT 1/2] RISC-V: handle missing "no-map" properties for >> >> OpenSBI's PMP protected regions >> >> >> >> Hey, >> >> >> >> On Thu, Dec 07, 2023 at 01:11:23PM +0000, Conor Dooley wrote: >> >> > On Thu, Dec 07, 2023 at 01:02:00PM +0000, Lad, Prabhakar wrote: >> >> > > On Wed, Dec 6, 2023 at 2:26 PM Conor Dooley <conor@kernel.org> wrote: >> >> > > > On Wed, Dec 06, 2023 at 04:52:11AM -0800, Palmer Dabbelt wrote: >> >> > > > > On Thu, 10 Aug 2023 02:07:10 PDT (-0700), Conor Dooley wrote: >> >> > >> >> > > > > > I'm perfectly happy to drop this series though, if people >> >> > > > > > generally are of the opinion that this sort of firmware workaround is ill-advised. >> >> > > > > > We are unaffected by it, so I certainly have no pressure to >> >> > > > > > have something working here. It's my desire not to be >> >> > > > > > user-hostile that motivated this patch. >> >> > > > > >> >> > > > > IIUC you guys and Reneas are the only ones who have hardware >> >> > > > > that might be in a spot where users aren't able to update the >> >> > > > > firmware (ie, it's out in production somewhere). >> >> > > > >> >> > > > I dunno if we can really keep thinking like that though. In >> >> > > > terms of people who have devicetrees in the kernel and stuff >> >> > > > available in western catalog distribution, sure. >> >> > > > I don't think we can assume that that covers all users though, >> >> > > > certainly the syntacore folks pop up every now and then, and I >> >> > > > sure hope that Andes etc have larger customer bases than the >> >> > > > in-kernel users would suggest. >> >> > > > >> >> > > > > So I'm adding Geert, though he probably saw this months ago... >> >> > > > >> >> > > > Prabhakar might be a good call on that front. I'm not sure if >> >> > > > the Renesas stuff works on affected versions of OpenSBI though, >> >> > > > guess it depends on the sequencing of the support for the >> >> > > > non-coherent stuff and when this bug was fixed. >> >> > > > >> >> > > ATM, I dont think there are any users who are using the upstream >> >> > > kernel + OpenSBI (apart from me and Geert!). Currently the >> >> > > customers are using the BSP releases. >> >> > >> >> > That doesn't really answer whether or not you (and your customers) >> >> > are using an affected version of the vendor OpenSBI? >> >> > The affected range for OpenSBI itself is [v0.8 to v1.3). >> >> >> >> Did you perhaps miss this mail Prabhakar? >> >> >> > Oops sorry for that. >> > >> > I can confirm the BSP version used by the customers is v1.0 [0]. >> > >> > [0] >> > https://jpn01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgith >> > ub.com%2Frenesas-rz%2Frz_opensbi%2Fcommits%2Fwork%2FOpenSBI-PMA%2F&dat >> > a=05%7C02%7Cprabhakar.mahadev-lad.rj%40bp.renesas.com%7C014cf4ddfd1e48 >> > 1ff5bc08dc00bd467f%7C53d82571da1947e49cb4625a166a4a2a%7C0%7C0%7C638386 >> > 060130410731%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luM >> > zIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=a0tcsXY4EQlODi >> > I34QygXS9QpJnVBqL8bNkxE8N5J2g%3D&reserved=0 >> >> OK, so sounds like would end up with broken systems from this bug, then? >> >> IIRC we still have the Renesas systems as NONPORTABLE due to that DMA pool Kconfig conflict. So if >> it's really only these Renesas systems that have the bug, maybe we can still remove hibernation from >> NONPORTABLE and just add in some sort of Kconfig to disable the >> Renesas+hibernation combinations that would break? >> >Well customers using BSP uses v1.0 for OpenSBI and kernel 5.10-cip, and people wanting to run upstream kernel will have to only use the upstream OpenSBI as the OpenSBI used in BSP is not compatible with upstream kernel(Linux doesn’t bootup). > >ATM I can say that its only me and Geert using upstream OpenBSI and upstream kernel. > >With that in mind would we still require that change? 5.10 doesn't have hibernation support in it, although I'm not sure why anyone would really want to use a kernel that old with a RISC-V system. The upstream versions of opensbi that support the renesas stuff have the no-map fix, right? If that's the case, then nothing special config wise is likely required. I'm still wary of other systems though, we are acting as if it is only Microchip and Renesas are the "real" users.
> -----Original Message----- > From: Conor Dooley <conor@kernel.org> > Sent: Tuesday, December 19, 2023 6:58 PM > To: Prabhakar Mahadev Lad <prabhakar.mahadev-lad.rj@bp.renesas.com>; Palmer Dabbelt > <palmer@dabbelt.com> > Cc: Conor Dooley <conor.dooley@microchip.com>; prabhakar.csengg@gmail.com; geert+renesas@glider.be; > Atish Patra <atishp@rivosinc.com>; Paul Walmsley <paul.walmsley@sifive.com>; apatel@ventanamicro.com; > alexghiti@rivosinc.com; Bjorn Topel <bjorn@rivosinc.com>; suagrfillet@gmail.com; > jeeheng.sia@starfivetech.com; petrtesarik@huaweicloud.com; linux-riscv@lists.infradead.org; linux- > kernel@vger.kernel.org; stable@vger.kernel.org > Subject: RE: [RFT 1/2] RISC-V: handle missing "no-map" properties for OpenSBI's PMP protected regions > > > > On 19 December 2023 18:38:30 GMT, Prabhakar Mahadev Lad <prabhakar.mahadev-lad.rj@bp.renesas.com> > wrote: > >Hi Palmer, > > > >> -----Original Message----- > >> From: Palmer Dabbelt <palmer@dabbelt.com> > >> Sent: Tuesday, December 19, 2023 6:07 PM > >> To: Prabhakar Mahadev Lad <prabhakar.mahadev-lad.rj@bp.renesas.com> > >> Cc: Conor Dooley <conor@kernel.org>; Conor Dooley > >> <conor.dooley@microchip.com>; prabhakar.csengg@gmail.com; > >> geert+renesas@glider.be; Atish Patra <atishp@rivosinc.com>; Paul > >> Walmsley <paul.walmsley@sifive.com>; apatel@ventanamicro.com; > >> alexghiti@rivosinc.com; Bjorn Topel <bjorn@rivosinc.com>; > >> suagrfillet@gmail.com; jeeheng.sia@starfivetech.com; > >> petrtesarik@huaweicloud.com; linux-riscv@lists.infradead.org; > >> linux-kernel@vger.kernel.org; stable@vger.kernel.org > >> Subject: RE: [RFT 1/2] RISC-V: handle missing "no-map" properties for > >> OpenSBI's PMP protected regions > >> > >> On Tue, 19 Dec 2023 09:27:42 PST (-0800), prabhakar.mahadev-lad.rj@bp.renesas.com wrote: > >> > Hi Conor, > >> > > >> >> -----Original Message----- > >> >> From: Conor Dooley <conor@kernel.org> > >> >> Sent: Tuesday, December 19, 2023 5:18 PM > >> >> To: Conor Dooley <conor.dooley@microchip.com> > >> >> Cc: Lad, Prabhakar <prabhakar.csengg@gmail.com>; Palmer Dabbelt > >> >> <palmer@dabbelt.com>; > >> >> geert+renesas@glider.be; Atish Patra <atishp@rivosinc.com>; Paul > >> >> geert+Walmsley <paul.walmsley@sifive.com>; > >> >> apatel@ventanamicro.com; alexghiti@rivosinc.com; Bjorn Topel > >> >> <bjorn@rivosinc.com>; suagrfillet@gmail.com; > >> >> jeeheng.sia@starfivetech.com; petrtesarik@huaweicloud.com; linux- > >> >> riscv@lists.infradead.org; linux-kernel@vger.kernel.org; > >> >> stable@vger.kernel.org; Prabhakar Mahadev Lad > >> >> <prabhakar.mahadev-lad.rj@bp.renesas.com> > >> >> Subject: Re: [RFT 1/2] RISC-V: handle missing "no-map" properties > >> >> for OpenSBI's PMP protected regions > >> >> > >> >> Hey, > >> >> > >> >> On Thu, Dec 07, 2023 at 01:11:23PM +0000, Conor Dooley wrote: > >> >> > On Thu, Dec 07, 2023 at 01:02:00PM +0000, Lad, Prabhakar wrote: > >> >> > > On Wed, Dec 6, 2023 at 2:26 PM Conor Dooley <conor@kernel.org> wrote: > >> >> > > > On Wed, Dec 06, 2023 at 04:52:11AM -0800, Palmer Dabbelt wrote: > >> >> > > > > On Thu, 10 Aug 2023 02:07:10 PDT (-0700), Conor Dooley wrote: > >> >> > > >> >> > > > > > I'm perfectly happy to drop this series though, if > >> >> > > > > > people generally are of the opinion that this sort of firmware workaround is ill- > advised. > >> >> > > > > > We are unaffected by it, so I certainly have no pressure > >> >> > > > > > to have something working here. It's my desire not to be > >> >> > > > > > user-hostile that motivated this patch. > >> >> > > > > > >> >> > > > > IIUC you guys and Reneas are the only ones who have > >> >> > > > > hardware that might be in a spot where users aren't able > >> >> > > > > to update the firmware (ie, it's out in production somewhere). > >> >> > > > > >> >> > > > I dunno if we can really keep thinking like that though. In > >> >> > > > terms of people who have devicetrees in the kernel and stuff > >> >> > > > available in western catalog distribution, sure. > >> >> > > > I don't think we can assume that that covers all users > >> >> > > > though, certainly the syntacore folks pop up every now and > >> >> > > > then, and I sure hope that Andes etc have larger customer > >> >> > > > bases than the in-kernel users would suggest. > >> >> > > > > >> >> > > > > So I'm adding Geert, though he probably saw this months ago... > >> >> > > > > >> >> > > > Prabhakar might be a good call on that front. I'm not sure > >> >> > > > if the Renesas stuff works on affected versions of OpenSBI > >> >> > > > though, guess it depends on the sequencing of the support > >> >> > > > for the non-coherent stuff and when this bug was fixed. > >> >> > > > > >> >> > > ATM, I dont think there are any users who are using the > >> >> > > upstream kernel + OpenSBI (apart from me and Geert!). > >> >> > > Currently the customers are using the BSP releases. > >> >> > > >> >> > That doesn't really answer whether or not you (and your > >> >> > customers) are using an affected version of the vendor OpenSBI? > >> >> > The affected range for OpenSBI itself is [v0.8 to v1.3). > >> >> > >> >> Did you perhaps miss this mail Prabhakar? > >> >> > >> > Oops sorry for that. > >> > > >> > I can confirm the BSP version used by the customers is v1.0 [0]. > >> > > >> > [0] > >> > https://jpn01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fg > >> > ith%2F&data=05%7C02%7Cprabhakar.mahadev-lad.rj%40bp.renesas.com%7C6 > >> > 3259d3bbda343ccde3e08dc00c46054%7C53d82571da1947e49cb4625a166a4a2a% > >> > 7C0%7C0%7C638386090629864750%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLj > >> > AwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7 > >> > C&sdata=xebpQgqY9W03HSDdoL0Si2taJ2RkOgTiR8H6koSKNq8%3D&reserved=0 > >> > ub.com%2Frenesas-rz%2Frz_opensbi%2Fcommits%2Fwork%2FOpenSBI-PMA%2F& > >> > dat > >> > a=05%7C02%7Cprabhakar.mahadev-lad.rj%40bp.renesas.com%7C014cf4ddfd1 > >> > e48 > >> > 1ff5bc08dc00bd467f%7C53d82571da1947e49cb4625a166a4a2a%7C0%7C0%7C638 > >> > 386 > >> > 060130410731%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2 > >> > luM > >> > zIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=a0tcsXY4EQl > >> > ODi > >> > I34QygXS9QpJnVBqL8bNkxE8N5J2g%3D&reserved=0 > >> > >> OK, so sounds like would end up with broken systems from this bug, then? > >> > >> IIRC we still have the Renesas systems as NONPORTABLE due to that DMA > >> pool Kconfig conflict. So if it's really only these Renesas systems > >> that have the bug, maybe we can still remove hibernation from > >> NONPORTABLE and just add in some sort of Kconfig to disable the > >> Renesas+hibernation combinations that would break? > >> > >Well customers using BSP uses v1.0 for OpenSBI and kernel 5.10-cip, and people wanting to run > upstream kernel will have to only use the upstream OpenSBI as the OpenSBI used in BSP is not > compatible with upstream kernel(Linux doesn’t bootup). > > > >ATM I can say that its only me and Geert using upstream OpenBSI and upstream kernel. > > > >With that in mind would we still require that change? > > 5.10 doesn't have hibernation support in it, although I'm not sure why anyone would really want to use > a kernel that old with a RISC-V system. > At Renesas we have the BSPs based on the CIP kernel. Currently the BSPs are based on 5.10-cip [0] (we plan to upgrade it to 6.1-cip). > The upstream versions of opensbi that support the renesas stuff have the no-map fix, right? > If that's the case, then nothing special config wise is likely required. > Yes it does. > I'm still wary of other systems though, we are acting as if it is only Microchip and Renesas are the > "real" users. [0] https://gitlab.com/cip-project/cip-kernel/linux-cip/-/tree/linux-5.10.y-cip?ref_type=heads Cheers, Prabhakar
diff --git a/arch/riscv/include/asm/sbi.h b/arch/riscv/include/asm/sbi.h index 5b4a1bf5f439..5360f3476278 100644 --- a/arch/riscv/include/asm/sbi.h +++ b/arch/riscv/include/asm/sbi.h @@ -252,6 +252,9 @@ enum sbi_pmu_ctr_type { #define SBI_ERR_ALREADY_STARTED -7 #define SBI_ERR_ALREADY_STOPPED -8 +/* SBI implementation IDs */ +#define SBI_IMP_OPENSBI 1 + extern unsigned long sbi_spec_version; struct sbiret { long error; @@ -259,6 +262,8 @@ struct sbiret { }; void sbi_init(void); +void sbi_apply_reserved_mem_erratum(void *dtb_va); + struct sbiret sbi_ecall(int ext, int fid, unsigned long arg0, unsigned long arg1, unsigned long arg2, unsigned long arg3, unsigned long arg4, diff --git a/arch/riscv/kernel/sbi.c b/arch/riscv/kernel/sbi.c index c672c8ba9a2a..aeb27263fa53 100644 --- a/arch/riscv/kernel/sbi.c +++ b/arch/riscv/kernel/sbi.c @@ -5,8 +5,10 @@ * Copyright (c) 2020 Western Digital Corporation or its affiliates. */ +#include <linux/acpi.h> #include <linux/bits.h> #include <linux/init.h> +#include <linux/libfdt.h> #include <linux/pm.h> #include <linux/reboot.h> #include <asm/sbi.h> @@ -583,6 +585,40 @@ long sbi_get_mimpid(void) } EXPORT_SYMBOL_GPL(sbi_get_mimpid); +static long sbi_firmware_id; +static long sbi_firmware_version; + +/* + * For devicetrees patched by OpenSBI a "mmode_resv" node is added to cover + * the region OpenSBI has protected by means of a PMP. Some versions of OpenSBI, + * [v0.8 to v1.3), omitted the "no-map" property, but this trips up hibernation + * among other things. + */ +void __init sbi_apply_reserved_mem_erratum(void *dtb_pa) +{ + int child, reserved_mem; + + if (sbi_firmware_id != SBI_IMP_OPENSBI) + return; + + if (!acpi_disabled) + return; + + if (sbi_firmware_version >= 0x10003 || sbi_firmware_version < 0x8) + return; + + reserved_mem = fdt_path_offset((void *)dtb_pa, "/reserved-memory"); + if (reserved_mem < 0) + return; + + fdt_for_each_subnode(child, (void *)dtb_pa, reserved_mem) { + const char *name = fdt_get_name((void *)dtb_pa, child, NULL); + + if (!strncmp(name, "mmode_resv", 10)) + fdt_setprop((void *)dtb_pa, child, "no-map", NULL, 0); + } +}; + void __init sbi_init(void) { int ret; @@ -596,8 +632,12 @@ void __init sbi_init(void) sbi_major_version(), sbi_minor_version()); if (!sbi_spec_is_0_1()) { + sbi_firmware_id = sbi_get_firmware_id(); + sbi_firmware_version = sbi_get_firmware_version(); + pr_info("SBI implementation ID=0x%lx Version=0x%lx\n", - sbi_get_firmware_id(), sbi_get_firmware_version()); + sbi_firmware_id, sbi_firmware_version); + if (sbi_probe_extension(SBI_EXT_TIME)) { __sbi_set_timer = __sbi_set_timer_v02; pr_info("SBI TIME extension detected\n"); diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 70fb31960b63..cb16bfdeacdb 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -29,6 +29,7 @@ #include <asm/tlbflush.h> #include <asm/sections.h> #include <asm/soc.h> +#include <asm/sbi.h> #include <asm/io.h> #include <asm/ptdump.h> #include <asm/numa.h> @@ -253,6 +254,8 @@ static void __init setup_bootmem(void) * in the device tree, otherwise the allocation could end up in a * reserved region. */ + + sbi_apply_reserved_mem_erratum(dtb_early_va); early_init_fdt_scan_reserved_mem(); /*