[v4] RISC-V: Show accurate per-hart isa in /proc/cpuinfo

Message ID 20230711201831.2695097-1-evan@rivosinc.com
State New
Headers
Series [v4] RISC-V: Show accurate per-hart isa in /proc/cpuinfo |

Commit Message

Evan Green July 11, 2023, 8:18 p.m. UTC
  In /proc/cpuinfo, most of the information we show for each processor is
specific to that hart: marchid, mvendorid, mimpid, processor, hart,
compatible, and the mmu size. But the ISA string gets filtered through a
lowest common denominator mask, so that if one CPU is missing an ISA
extension, no CPUs will show it.

Now that we track the ISA extensions for each hart, let's report ISA
extension info accurately per-hart in /proc/cpuinfo. We cannot change
the "isa:" line, as usermode may be relying on that line to show only
the common set of extensions supported across all harts. Add a new "hart
isa" line instead, which reports the true set of extensions for that
hart. This matches what is returned in riscv_hwprobe() when querying a
given hart.

Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>

---

Changes in v4:
 - Documentation: Made the underline match the text line (Conor)
 - Documentation: hanged "in question" to "being described" (Andrew)

Changes in v3:
 - Add some documentation (Conor)

Changes in v2:
 - Added new "hart isa" line rather than altering behavior of existing
   "isa" line (Conor, Palmer)


I based this series on top of Conor's riscv-extensions-strings branch
from July 3rd, since otherwise this change gets hopelessly entangled
with that series.

---
 Documentation/riscv/uabi.rst | 10 ++++++++++
 arch/riscv/kernel/cpu.c      | 22 ++++++++++++++++++----
 2 files changed, 28 insertions(+), 4 deletions(-)
  

Comments

Conor Dooley Aug. 28, 2023, 5:13 p.m. UTC | #1
On Mon, Aug 28, 2023 at 09:44:55AM -0700, Evan Green wrote:
> On Sat, Aug 26, 2023 at 1:01 AM Andrew Jones <ajones@ventanamicro.com> wrote:
> >
> > On Fri, Aug 25, 2023 at 03:51:28PM -0700, Evan Green wrote:
> > > On Fri, Aug 25, 2023 at 1:16 AM Andrew Jones <ajones@ventanamicro.com> wrote:
> > > >
> > > > On Thu, Aug 24, 2023 at 03:06:53PM -0700, Evan Green wrote:
> > > > > On Thu, Aug 24, 2023 at 10:29 AM Conor Dooley <conor@kernel.org> wrote:
> > > > ...
> > > > > > Do you want to have this new thing in cpuinfo tell the user "this hart
> > > > > > has xyz extensions that are supported by a kernel, but maybe not this
> > > > > > kernel" or to tell the user "this hart has xyz extensions that are
> > > > > > supported by this kernel"? Your text above says "understood by the
> > > > > > kernel", but I think that's a poor definition that needs to be improved
> > > > > > to spell out exactly what you mean. IOW does "understood" mean the
> > > > > > kernel will parse them into a structure, or does it mean "yes you can
> > > > > > use this extension on this particular hart".
> > > > >
> > > > > I'm imagining /proc/cpuinfo being closer to "the CPU has it and the
> > > > > kernel at least vaguely understands it, but may not have full support
> > > > > for it enabled". I'm assuming /proc/cpuinfo is mostly used by 1)
> > > > > humans wanting to know if they have hardware support for a feature,
> > > > > and 2) administrators collecting telemetry to manage fleets (ie do I
> > > > > have any hardware deployed that supports X).
> > > >
> > > > Is (2) a special case of (1)? (I want to make sure I understand all the
> > > > cases.)
> > >
> > > More or less, yes. In bucket two are also folks wondering things like
> > > "are all these crash reports I'm getting specific to machines with X".
> > >
> > > >
> > > > > Programmers looking to
> > > > > see "is the kernel support for this feature ready right now" would
> > > > > ideally not be parsing /proc/cpuinfo text, as more direct mechanisms
> > > > > like specific hwprobe bits for "am I fully ready to go" would be
> > > > > easier to work with. Feel free to yell at me if this overall vision
> > > > > seems flawed.
> > > > >
> > > > > I tried to look to see if there was consensus among the other
> > > > > architectures. Aarch64 seems to go with "supported and fully enabled",
> > > > > as their cpu_has_feature() directly tests elf_hwcap, and elements in
> > > > > arm64_elf_hwcaps[] are Kconfig gated. X86 is complicated, but IIRC is
> > > > > more along the lines of "hardware has it". They have two macros,
> > > > > cpu_has() for "raw capability" and cpu_feature_enabled() for "kernel
> > > > > can do it too", and they use cpu_has() for /proc/cpuinfo flags.
> > > > >
> > > >
> > > > I'd lean more towards the way AArch64 goes, because, unless /proc/cpuinfo
> > > > is just a blind regurgitation of an isa string from DT / ACPI, then the
> > > > kernel must at least know something about it. Advertising a feature which
> > > > is known, but also known not to work, seems odd to me. So my vote is that
> > > > only features which are present and enabled in the kernel or present and
> > > > not necessary to be enabled in the kernel in order for userspace or
> > > > virtual machines to use be advertised in /proc/cpuinfo.
> > > >
> > > > We still have SMBIOS (dmidecode) to blindly dump what the hardware
> > > > supports for cases (1) and (2) above.
> > >
> > > Yeah, there's an argument to be made for that. My worry is it's a
> > > difficult line to hold. The bar you're really trying to describe (or
> > > at least what people might take away from it) is "if it's listed here
> > > then it's fully ready to be used in userspace". But then things get
> > > squishy when there are additional ways to control the use of the
> > > feature (prctls() in init to turn it on, usermode policy to turn it
> > > off, security doodads that disable it, etc). I'm assuming nobody wants
> > > a version of /proc/cpuinfo that changes depending on which process is
> > > asking. So then the line would have to be more carefully described as
> > > "well, the hardware can do it, and the kernel COULD do it under some
> > > circumstances, but YMMV", which ends up falling somewhat short of the
> > > original goal. In my mind keeping /proc/cpuinfo as close to "here's
> > > what the hardware can do" seems like a more defensible position.
> > > -Evan
> >
> > I agree with that. I was actually even trying to say the same thing,
> > but only by bringing up virtual machines. Once we decide we'll expose
> > extensions to VMs, whether or not the host kernel enables them, then
> > none of the other host kernel configurations matter with respect to
> > advertising the feature, since the guest kernel may have a completely
> > different set of configurations.
> 
> My head spins a little when I try to picture a feature which 1)
> requires kernel support to use, 2) has that kernel support turned off
> in the host kernel, but 3) is passed down into guest kernels.

Mine did too, but apparently these already exist for kvm guests. I can't
find the exact email, but either Drew or Anup told me that Svpbmt can be
used by a guest even if support for it is not present in the host
kernel.

Thanks,
Conor.

> Generally though, I agree that trying to tie the guarantees of
> features in /proc/cpuinfo too much to software gets confusing when
> viewed through the double lens of virtualization.
> 
> >
> > So I think we should only be filtering out extensions that are disabled
> > because they're broken (have a detected erratum), have been "hidden"
> > (have a kernel command line allowing them to be treated as if not
> > present), or cannot be used at all due to missing accompanying hardware
> > descriptions (such as block size info for CBO extensions). In all cases,
> > I presume we'd wire checks up in riscv_isa_extension_check() and no
> > checks would be gated on Kconfigs or anything else. And, since
> > /proc/cpuinfo gets its list from the bitmap that's already filtered by
> > riscv_isa_extension_check(), then, long story short, we're good to go :-)
> >
> > But maybe we can try to spell that policy out a bit more in
> > Documentation/riscv/uabi.rst.
> 
> Right, that sounds reasonable to me, and is consistent with the
> behavior we already have. With this documentation change I was only
> trying to define the lower bound, rather than the complete definition
> for every case. In other words, seeing a feature in cpuinfo guarantees
> only that the hardware (or virtualized hardware) supports the feature,
> but that's all the language says. So for instance NOT seeing a feature
> in cpuinfo doesn't necessarily mean the hardware doesn't support it.
> Software turning it off for the reasons you describe IMO doesn't
> contradict what's written here. I was planning to leave that tacit,
> but if you have suggestions on how to spell that out I'd take them.
> 
> -Evan
  

Patch

diff --git a/Documentation/riscv/uabi.rst b/Documentation/riscv/uabi.rst
index 8960fac42c40..afdda580e5a2 100644
--- a/Documentation/riscv/uabi.rst
+++ b/Documentation/riscv/uabi.rst
@@ -42,6 +42,16 @@  An example string following the order is::
 
    rv64imadc_zifoo_zigoo_zafoo_sbar_scar_zxmbaz_xqux_xrux
 
+"isa" vs "hart isa" lines in /proc/cpuinfo
+------------------------------------------
+
+The "isa" line in /proc/cpuinfo describes the lowest common denominator of
+RISC-V ISA extensions understood by the kernel and implemented on all harts. The
+"hart isa" line, in contrast, describes the set of extensions understood by the
+kernel on the particular hart being described, even if those extensions may not
+be present on all harts in the system. The "hart isa" line is consistent with
+what's returned by __riscv_hwprobe() when querying for that specific CPU.
+
 Misaligned accesses
 -------------------
 
diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
index 1acf3679600d..6264b7b94945 100644
--- a/arch/riscv/kernel/cpu.c
+++ b/arch/riscv/kernel/cpu.c
@@ -197,9 +197,8 @@  arch_initcall(riscv_cpuinfo_init);
 
 #ifdef CONFIG_PROC_FS
 
-static void print_isa(struct seq_file *f)
+static void print_isa(struct seq_file *f, const unsigned long *isa_bitmap)
 {
-	seq_puts(f, "isa\t\t: ");
 
 	if (IS_ENABLED(CONFIG_32BIT))
 		seq_write(f, "rv32", 4);
@@ -207,7 +206,7 @@  static void print_isa(struct seq_file *f)
 		seq_write(f, "rv64", 4);
 
 	for (int i = 0; i < riscv_isa_ext_count; i++) {
-		if (!__riscv_isa_extension_available(NULL, riscv_isa_ext[i].id))
+		if (!__riscv_isa_extension_available(isa_bitmap, riscv_isa_ext[i].id))
 			continue;
 
 		/* Only multi-letter extensions are split by underscores */
@@ -271,7 +270,15 @@  static int c_show(struct seq_file *m, void *v)
 
 	seq_printf(m, "processor\t: %lu\n", cpu_id);
 	seq_printf(m, "hart\t\t: %lu\n", cpuid_to_hartid_map(cpu_id));
-	print_isa(m);
+
+	/*
+	 * For historical raisins, the isa: line is limited to the lowest common
+	 * denominator of extensions supported across all harts. A true list of
+	 * extensions supported on this hart is printed later in the hart_isa:
+	 * line.
+	 */
+	seq_puts(m, "isa\t\t: ");
+	print_isa(m, NULL);
 	print_mmu(m);
 
 	if (acpi_disabled) {
@@ -287,6 +294,13 @@  static int c_show(struct seq_file *m, void *v)
 	seq_printf(m, "mvendorid\t: 0x%lx\n", ci->mvendorid);
 	seq_printf(m, "marchid\t\t: 0x%lx\n", ci->marchid);
 	seq_printf(m, "mimpid\t\t: 0x%lx\n", ci->mimpid);
+
+	/*
+	 * Print the ISA extensions specific to this hart, which may show
+	 * additional extensions not present across all harts.
+	 */
+	seq_puts(m, "hart isa\t: ");
+	print_isa(m, hart_isa[cpu_id].isa);
 	seq_puts(m, "\n");
 
 	return 0;