[v2] arm64/fpsimd: Only provide the length to cpufeature for xCR registers

Message ID 20230731-arm64-sme-fa64-hotplug-v2-1-7714c00dd902@kernel.org
State New
Headers
Series [v2] arm64/fpsimd: Only provide the length to cpufeature for xCR registers |

Commit Message

Mark Brown July 31, 2023, 1:58 p.m. UTC
  For both SVE and SME we abuse the generic register field comparison
support in the cpufeature code as part of our detection of unsupported
variations in the vector lengths available to PEs, reporting the maximum
vector lengths via ZCR_EL1.LEN and SMCR_EL1.LEN.  Since these are
configuration registers rather than identification registers the
assumptions the cpufeature code makes about how unknown bitfields behave
are invalid, leading to warnings when SME features like FA64 are enabled
and we hotplug a CPU:

  CPU features: SANITY CHECK: Unexpected variation in SYS_SMCR_EL1. Boot CPU: 0x0000000000000f, CPU3: 0x0000008000000f
  CPU features: Unsupported CPU feature variation detected.

SVE has no controls other than the vector length so is not yet impacted
but the same issue will apply there if any are defined.

Since the only field we are interested in having the cpufeature code
handle is the length field and we use a custom read function to obtain
the value we can avoid these warnings by filtering out all other bits
when we return the register value, if we're doing that we don't need to
bother reading the register at all and can simply use the RDVL/RDSVL
value we were filling in instead.

Fixes: 2e0f2478ea37eb ("arm64/sve: Probe SVE capabilities and usable vector lengths")
FixeS: b42990d3bf77cc ("arm64/sme: Identify supported SME vector lengths at boot")
Signed-off-by: Mark Brown <broonie@kernel.org>
---
Changes in v2:
- Skip the munging into the register step and just return the value we
  would munge in.
- Link to v1: https://lore.kernel.org/r/20230727-arm64-sme-fa64-hotplug-v1-1-34ae93afc05b@kernel.org
---
 arch/arm64/kernel/fpsimd.c | 22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)


---
base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4
change-id: 20230727-arm64-sme-fa64-hotplug-1e6896746f97

Best regards,
  

Comments

Catalin Marinas Aug. 3, 2023, 4:39 p.m. UTC | #1
On Mon, Jul 31, 2023 at 02:58:48PM +0100, Mark Brown wrote:
> For both SVE and SME we abuse the generic register field comparison
> support in the cpufeature code as part of our detection of unsupported
> variations in the vector lengths available to PEs, reporting the maximum
> vector lengths via ZCR_EL1.LEN and SMCR_EL1.LEN.  Since these are
> configuration registers rather than identification registers the
> assumptions the cpufeature code makes about how unknown bitfields behave
> are invalid, leading to warnings when SME features like FA64 are enabled
> and we hotplug a CPU:
> 
>   CPU features: SANITY CHECK: Unexpected variation in SYS_SMCR_EL1. Boot CPU: 0x0000000000000f, CPU3: 0x0000008000000f
>   CPU features: Unsupported CPU feature variation detected.
> 
> SVE has no controls other than the vector length so is not yet impacted
> but the same issue will apply there if any are defined.
> 
> Since the only field we are interested in having the cpufeature code
> handle is the length field and we use a custom read function to obtain
> the value we can avoid these warnings by filtering out all other bits
> when we return the register value, if we're doing that we don't need to
> bother reading the register at all and can simply use the RDVL/RDSVL
> value we were filling in instead.

Maybe that's the simplest fix, especially if you want it in stable, but
I wonder why we even bother with with treating ZCR_EL1 and SMCR_EL1 as
feature registers. We already have verify_sme_features() to check for
the mismatch. BTW, is vec_verify_vq_map() sufficient so that we can skip
the maximum vector length check?
  
Mark Brown Aug. 3, 2023, 5:44 p.m. UTC | #2
On Thu, Aug 03, 2023 at 05:39:38PM +0100, Catalin Marinas wrote:
> On Mon, Jul 31, 2023 at 02:58:48PM +0100, Mark Brown wrote:

> > Since the only field we are interested in having the cpufeature code
> > handle is the length field and we use a custom read function to obtain
> > the value we can avoid these warnings by filtering out all other bits
> > when we return the register value, if we're doing that we don't need to
> > bother reading the register at all and can simply use the RDVL/RDSVL
> > value we were filling in instead.

> Maybe that's the simplest fix, especially if you want it in stable, but

Yeah, it's definitely the sort of change we want as a fix - anything
more invasive would be inappropriate.

> I wonder why we even bother with with treating ZCR_EL1 and SMCR_EL1 as
> feature registers. We already have verify_sme_features() to check for
> the mismatch. BTW, is vec_verify_vq_map() sufficient so that we can skip
> the maximum vector length check?

Both enumeration mechanisms were added in the initial series supporting
SVE for reasons that are not entirely obvious to me.  The changelogs
explain what we're doing with the pseudo ID register stuff but do not
comment on why.  There is a cross check between the answers the two give
which appears to be geared towards detecting systems with asymmetric
maximum VLs for some reason but I'm not sure why that's done given that
we can't cope if *any* VL in the committed set is missing, not just the
maximum.  It's not KVM because that has even stricter requirements and
can't cope with any extra supported VLs.  I'm not sure if the inclusion
in the cpufeature stuff is supposed to expose the registers via
emulation of mrs, though a brief test appears to indicate that this
doesn't actually work if it was ever intended to.

The whole thing is very suspect but given that we don't currently have
any ability to emulate systems with asymmetric vector lengths I'm a bit
reluctant to poke at it.
  
Catalin Marinas Aug. 4, 2023, 4:20 p.m. UTC | #3
On Thu, Aug 03, 2023 at 06:44:24PM +0100, Mark Brown wrote:
> On Thu, Aug 03, 2023 at 05:39:38PM +0100, Catalin Marinas wrote:
> > On Mon, Jul 31, 2023 at 02:58:48PM +0100, Mark Brown wrote:
> > > Since the only field we are interested in having the cpufeature code
> > > handle is the length field and we use a custom read function to obtain
> > > the value we can avoid these warnings by filtering out all other bits
> > > when we return the register value, if we're doing that we don't need to
> > > bother reading the register at all and can simply use the RDVL/RDSVL
> > > value we were filling in instead.
> 
> > Maybe that's the simplest fix, especially if you want it in stable, but
> 
> Yeah, it's definitely the sort of change we want as a fix - anything
> more invasive would be inappropriate.

I'd say it's still ok if we can just rip come code out safely (the fake
ID reg).

> > I wonder why we even bother with with treating ZCR_EL1 and SMCR_EL1 as
> > feature registers. We already have verify_sme_features() to check for
> > the mismatch. BTW, is vec_verify_vq_map() sufficient so that we can skip
> > the maximum vector length check?
> 
> Both enumeration mechanisms were added in the initial series supporting
> SVE for reasons that are not entirely obvious to me.  The changelogs
> explain what we're doing with the pseudo ID register stuff but do not
> comment on why.  There is a cross check between the answers the two give
> which appears to be geared towards detecting systems with asymmetric
> maximum VLs for some reason but I'm not sure why that's done given that
> we can't cope if *any* VL in the committed set is missing, not just the
> maximum.

We can cope with different VLs if the committed map is built during boot
(early secondary CPU bring-up). For any late/hotplugged CPUs, if they
don't fit the map, they'll be rejected. Not sure where the actual
maximum length matters in this process though (or later for user space).
I assume the user will only be allowed to set the common VLs across all
the early CPUs.

> The whole thing is very suspect but given that we don't currently have
> any ability to emulate systems with asymmetric vector lengths I'm a bit
> reluctant to poke at it.

The Arm fast models should allow such configuration, though I haven't
tried.
  
Mark Brown Aug. 4, 2023, 4:37 p.m. UTC | #4
On Fri, Aug 04, 2023 at 05:20:21PM +0100, Catalin Marinas wrote:
> On Thu, Aug 03, 2023 at 06:44:24PM +0100, Mark Brown wrote:
> > On Thu, Aug 03, 2023 at 05:39:38PM +0100, Catalin Marinas wrote:

> > > Maybe that's the simplest fix, especially if you want it in stable, but

> > Yeah, it's definitely the sort of change we want as a fix - anything
> > more invasive would be inappropriate.

> I'd say it's still ok if we can just rip come code out safely (the fake
> ID reg).

It's the safely bit that concerns me here - it feels like a great way to
discover why the code was there, possibly including a use that was there
in the past but has subsequently been removed so bites a stable version.

> > Both enumeration mechanisms were added in the initial series supporting
> > SVE for reasons that are not entirely obvious to me.  The changelogs
> > explain what we're doing with the pseudo ID register stuff but do not
> > comment on why.  There is a cross check between the answers the two give
> > which appears to be geared towards detecting systems with asymmetric
> > maximum VLs for some reason but I'm not sure why that's done given that
> > we can't cope if *any* VL in the committed set is missing, not just the
> > maximum.

> We can cope with different VLs if the committed map is built during boot
> (early secondary CPU bring-up). For any late/hotplugged CPUs, if they
> don't fit the map, they'll be rejected. Not sure where the actual
> maximum length matters in this process though (or later for user space).
> I assume the user will only be allowed to set the common VLs across all
> the early CPUs.

Indeed, since we need to check each VL in the set we expose to userspace
individually that will include the maximum VL which should mean that
having a separate check for the maximum VL is redundant.  That's always
been the case though which makes me worried about changing it for a fix
rather than a cleanup.

For KVM we need the stricter requirement that no additional VLs are
supported in the subset that KVM exports to clients since guests can
directly enumerate VLs from the hardware and we don't want the answer
changing depending on what physical CPU we schedule a vCPU on.  That
should similarly not need a distinct check for the maximum VL.

> > The whole thing is very suspect but given that we don't currently have
> > any ability to emulate systems with asymmetric vector lengths I'm a bit
> > reluctant to poke at it.

> The Arm fast models should allow such configuration, though I haven't
> tried.

They don't, SVE and SME are provided as a plugin and all their
configuration is done at the plugin level so there's no per PE or per
cluster options like there are for features implemented in the model
itself.
  
Catalin Marinas Aug. 9, 2023, 4:11 p.m. UTC | #5
On Fri, Aug 04, 2023 at 05:37:25PM +0100, Mark Brown wrote:
> On Fri, Aug 04, 2023 at 05:20:21PM +0100, Catalin Marinas wrote:
> > On Thu, Aug 03, 2023 at 06:44:24PM +0100, Mark Brown wrote:
> > > On Thu, Aug 03, 2023 at 05:39:38PM +0100, Catalin Marinas wrote:
> > > > Maybe that's the simplest fix, especially if you want it in stable, but
> > > 
> > > Yeah, it's definitely the sort of change we want as a fix - anything
> > > more invasive would be inappropriate.
> > 
> > I'd say it's still ok if we can just rip come code out safely (the fake
> > ID reg).
> 
> It's the safely bit that concerns me here - it feels like a great way to
> discover why the code was there, possibly including a use that was there
> in the past but has subsequently been removed so bites a stable version.

As discussed earlier, let's queue this patch as is (for 6.6 I'd say)
with cc stable and post a new patch on top that removes the fake CPUID
register going forward, in case we missed anything.

For this patch:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
  
Will Deacon Aug. 11, 2023, 11:44 a.m. UTC | #6
On Mon, 31 Jul 2023 14:58:48 +0100, Mark Brown wrote:
> For both SVE and SME we abuse the generic register field comparison
> support in the cpufeature code as part of our detection of unsupported
> variations in the vector lengths available to PEs, reporting the maximum
> vector lengths via ZCR_EL1.LEN and SMCR_EL1.LEN.  Since these are
> configuration registers rather than identification registers the
> assumptions the cpufeature code makes about how unknown bitfields behave
> are invalid, leading to warnings when SME features like FA64 are enabled
> and we hotplug a CPU:
> 
> [...]

Applied to arm64 (for-next/cpufeature), thanks!

[1/1] arm64/fpsimd: Only provide the length to cpufeature for xCR registers
      https://git.kernel.org/arm64/c/01948b09edc3

Cheers,
  

Patch

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 520b681a07bb..aa4194c646b2 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1178,9 +1178,6 @@  void sve_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p)
  */
 u64 read_zcr_features(void)
 {
-	u64 zcr;
-	unsigned int vq_max;
-
 	/*
 	 * Set the maximum possible VL, and write zeroes to all other
 	 * bits to see if they stick.
@@ -1188,12 +1185,8 @@  u64 read_zcr_features(void)
 	sve_kernel_enable(NULL);
 	write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);
 
-	zcr = read_sysreg_s(SYS_ZCR_EL1);
-	zcr &= ~(u64)ZCR_ELx_LEN_MASK; /* find sticky 1s outside LEN field */
-	vq_max = sve_vq_from_vl(sve_get_vl());
-	zcr |= vq_max - 1; /* set LEN field to maximum effective value */
-
-	return zcr;
+	/* Return LEN value that would be written to get the maximum VL */
+	return sve_vq_from_vl(sve_get_vl()) - 1;
 }
 
 void __init sve_setup(void)
@@ -1348,9 +1341,6 @@  void fa64_kernel_enable(const struct arm64_cpu_capabilities *__always_unused p)
  */
 u64 read_smcr_features(void)
 {
-	u64 smcr;
-	unsigned int vq_max;
-
 	sme_kernel_enable(NULL);
 
 	/*
@@ -1359,12 +1349,8 @@  u64 read_smcr_features(void)
 	write_sysreg_s(read_sysreg_s(SYS_SMCR_EL1) | SMCR_ELx_LEN_MASK,
 		       SYS_SMCR_EL1);
 
-	smcr = read_sysreg_s(SYS_SMCR_EL1);
-	smcr &= ~(u64)SMCR_ELx_LEN_MASK; /* Only the LEN field */
-	vq_max = sve_vq_from_vl(sme_get_vl());
-	smcr |= vq_max - 1; /* set LEN field to maximum effective value */
-
-	return smcr;
+	/* Return LEN value that would be written to get the maximum VL */
+	return sve_vq_from_vl(sme_get_vl()) - 1;
 }
 
 void __init sme_setup(void)