[v2] arm64/sve: Lower the maximum allocation for the SVE ptrace regset

Message ID 20240213-arm64-sve-ptrace-regset-size-v2-1-c7600ca74b9b@kernel.org
State New
Headers
Series [v2] arm64/sve: Lower the maximum allocation for the SVE ptrace regset |

Commit Message

Mark Brown Feb. 13, 2024, 6:24 p.m. UTC
  Doug Anderson observed that ChromeOS crashes are being reported which
include failing allocations of order 7 during core dumps due to ptrace
allocating storage for regsets:

  chrome: page allocation failure: order:7,
          mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO),
          nodemask=(null),cpuset=urgent,mems_allowed=0
   ...
  regset_get_alloc+0x1c/0x28
  elf_core_dump+0x3d8/0xd8c
  do_coredump+0xeb8/0x1378

with further investigation showing that this is:

   [   66.957385] DOUG: Allocating 279584 bytes

which is the maximum size of the SVE regset. As Doug observes it is not
entirely surprising that such a large allocation of contiguous memory might
fail on a long running system.

The SVE regset is currently sized to hold SVE registers with a VQ of
SVE_VQ_MAX which is 512, substantially more than the architectural maximum
of 16 which we might see even in a system emulating the limits of the
architecture. Since we don't expose the size we tell the regset core
externally let's define ARCH_SVE_VQ_MAX with the actual architectural
maximum and use that for the regset, we'll still overallocate most of the
time but much less so which will be helpful even if the core is fixed to
not require contiguous allocations.

Specify ARCH_SVE_VQ_MAX in terms of the maximum value that can be written
into ZCR_ELx.LEN (where this is set in the hardware). For consistency
update the maximum SME vector length to be specified in the same style
while we are at it.

We could also teach the ptrace core about runtime discoverable regset sizes
but that would be a more invasive change and this is being observed in
practical systems.

Reported-by: Doug Anderson <dianders@chromium.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
---
We should probably also use the actual architectural limit for the
bitmasks we use in the VL enumeration code, though that's both a little
bit more involved and less immediately a problem.
---
Changes in v2:
- Specify the value using the size of the bitfield it goes into.
- Link to v1: https://lore.kernel.org/r/20240203-arm64-sve-ptrace-regset-size-v1-1-2c3ba1386b9e@kernel.org
---
 arch/arm64/include/asm/fpsimd.h | 12 ++++++------
 arch/arm64/kernel/ptrace.c      |  3 ++-
 2 files changed, 8 insertions(+), 7 deletions(-)


---
base-commit: 41bccc98fb7931d63d03f326a746ac4d429c1dd3
change-id: 20240202-arm64-sve-ptrace-regset-size-21b0928969e1

Best regards,
  

Comments

Doug Anderson Feb. 13, 2024, 9:04 p.m. UTC | #1
Hi,

On Tue, Feb 13, 2024 at 10:24 AM Mark Brown <broonie@kernel.org> wrote:
>
> Doug Anderson observed that ChromeOS crashes are being reported which
> include failing allocations of order 7 during core dumps due to ptrace
> allocating storage for regsets:
>
>   chrome: page allocation failure: order:7,
>           mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO),
>           nodemask=(null),cpuset=urgent,mems_allowed=0
>    ...
>   regset_get_alloc+0x1c/0x28
>   elf_core_dump+0x3d8/0xd8c
>   do_coredump+0xeb8/0x1378
>
> with further investigation showing that this is:
>
>    [   66.957385] DOUG: Allocating 279584 bytes
>
> which is the maximum size of the SVE regset. As Doug observes it is not
> entirely surprising that such a large allocation of contiguous memory might
> fail on a long running system.
>
> The SVE regset is currently sized to hold SVE registers with a VQ of
> SVE_VQ_MAX which is 512, substantially more than the architectural maximum
> of 16 which we might see even in a system emulating the limits of the
> architecture. Since we don't expose the size we tell the regset core
> externally let's define ARCH_SVE_VQ_MAX with the actual architectural
> maximum and use that for the regset, we'll still overallocate most of the
> time but much less so which will be helpful even if the core is fixed to
> not require contiguous allocations.
>
> Specify ARCH_SVE_VQ_MAX in terms of the maximum value that can be written
> into ZCR_ELx.LEN (where this is set in the hardware). For consistency
> update the maximum SME vector length to be specified in the same style
> while we are at it.
>
> We could also teach the ptrace core about runtime discoverable regset sizes
> but that would be a more invasive change and this is being observed in
> practical systems.
>
> Reported-by: Doug Anderson <dianders@chromium.org>
> Signed-off-by: Mark Brown <broonie@kernel.org>
> ---
> We should probably also use the actual architectural limit for the
> bitmasks we use in the VL enumeration code, though that's both a little
> bit more involved and less immediately a problem.
> ---
> Changes in v2:
> - Specify the value using the size of the bitfield it goes into.
> - Link to v1: https://lore.kernel.org/r/20240203-arm64-sve-ptrace-regset-size-v1-1-2c3ba1386b9e@kernel.org
> ---
>  arch/arm64/include/asm/fpsimd.h | 12 ++++++------
>  arch/arm64/kernel/ptrace.c      |  3 ++-
>  2 files changed, 8 insertions(+), 7 deletions(-)

FWIW, v2 still works for me from a black box point of view. :-P I ran
the same test I did from v1. Repeating here:

Confirmed that when I send a "quit" signal to Chrome now that the
allocation I see for "core_note_type" NT_ARM_SVE goes down from
279,584 bytes (n=17474) to just 8,768 bytes (n=548).

Tested-by: Douglas Anderson <dianders@chromium.org>
  
Will Deacon Feb. 15, 2024, 12:44 p.m. UTC | #2
On Tue, 13 Feb 2024 18:24:38 +0000, Mark Brown wrote:
> Doug Anderson observed that ChromeOS crashes are being reported which
> include failing allocations of order 7 during core dumps due to ptrace
> allocating storage for regsets:
> 
>   chrome: page allocation failure: order:7,
>           mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO),
>           nodemask=(null),cpuset=urgent,mems_allowed=0
>    ...
>   regset_get_alloc+0x1c/0x28
>   elf_core_dump+0x3d8/0xd8c
>   do_coredump+0xeb8/0x1378
> 
> [...]

Applied to arm64 (for-next/fixes), thanks!

[1/1] arm64/sve: Lower the maximum allocation for the SVE ptrace regset
      https://git.kernel.org/arm64/c/2813926261e4

Cheers,
  
Doug Anderson Feb. 15, 2024, 5:38 p.m. UTC | #3
Hi,

On Thu, Feb 15, 2024 at 4:44 AM Will Deacon <will@kernel.org> wrote:
>
> On Tue, 13 Feb 2024 18:24:38 +0000, Mark Brown wrote:
> > Doug Anderson observed that ChromeOS crashes are being reported which
> > include failing allocations of order 7 during core dumps due to ptrace
> > allocating storage for regsets:
> >
> >   chrome: page allocation failure: order:7,
> >           mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO),
> >           nodemask=(null),cpuset=urgent,mems_allowed=0
> >    ...
> >   regset_get_alloc+0x1c/0x28
> >   elf_core_dump+0x3d8/0xd8c
> >   do_coredump+0xeb8/0x1378
> >
> > [...]
>
> Applied to arm64 (for-next/fixes), thanks!
>
> [1/1] arm64/sve: Lower the maximum allocation for the SVE ptrace regset
>       https://git.kernel.org/arm64/c/2813926261e4

Thanks!

A side note to anyone trying to backport this is that there's a hidden
dependency of commit f171f9e4097d ("arm64/fp: Make SVE and SME length
register definition match architecture"). Without that dependency then
backporting ${SUBJECT} patch is just a no-op. ...backporting the
hidden dependency ends up giving merge conflicts, so for the purposes
of picking this to our 5.15 or earlier branches I went back to just
making the #define be 16 again like it was in v1.

-Doug
  

Patch

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 50e5f25d3024..481d94416d69 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -62,13 +62,13 @@  static inline void cpacr_restore(unsigned long cpacr)
  * When we defined the maximum SVE vector length we defined the ABI so
  * that the maximum vector length included all the reserved for future
  * expansion bits in ZCR rather than those just currently defined by
- * the architecture. While SME follows a similar pattern the fact that
- * it includes a square matrix means that any allocations that attempt
- * to cover the maximum potential vector length (such as happen with
- * the regset used for ptrace) end up being extremely large. Define
- * the much lower actual limit for use in such situations.
+ * the architecture.  Using this length to allocate worst size buffers
+ * results in excessively large allocations, and this effect is even
+ * more pronounced for SME due to ZA.  Define more suitable VLs for
+ * these situations.
  */
-#define SME_VQ_MAX	16
+#define ARCH_SVE_VQ_MAX ((ZCR_ELx_LEN_MASK >> ZCR_ELx_LEN_SHIFT) + 1)
+#define SME_VQ_MAX	((SMCR_ELx_LEN_MASK >> SMCR_ELx_LEN_SHIFT) + 1)
 
 struct task_struct;
 
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index dc6cf0e37194..e3bef38fc2e2 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1500,7 +1500,8 @@  static const struct user_regset aarch64_regsets[] = {
 #ifdef CONFIG_ARM64_SVE
 	[REGSET_SVE] = { /* Scalable Vector Extension */
 		.core_note_type = NT_ARM_SVE,
-		.n = DIV_ROUND_UP(SVE_PT_SIZE(SVE_VQ_MAX, SVE_PT_REGS_SVE),
+		.n = DIV_ROUND_UP(SVE_PT_SIZE(ARCH_SVE_VQ_MAX,
+					      SVE_PT_REGS_SVE),
 				  SVE_VQ_BYTES),
 		.size = SVE_VQ_BYTES,
 		.align = SVE_VQ_BYTES,