ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512

Message ID 794a1211-630b-3ee5-55a3-c06f10df1490@linux.com
State New
Headers
Series ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512 |

Commit Message

Christoph Lameter (Ampere) Dec. 15, 2023, 12:05 a.m. UTC
  Ampere Computing develops high end ARM processor that support an ever
increasing number of processors. The default 256 processors are
not enough for our newer products. The default is used by
distros and therefore our customers cannot use distro kernels because
the number of processors is not supported.

One of the objections against earlier patches to increase the limit
was that the memory use becomes too high. There is a feature called
CPUMASK_OFFSTACK that configures the cpumasks in the kernel to be
dynamically allocated. This was used in the X86 architecture in the
past to enable support for larger CPU configurations up to 8k cpus.

With that is becomes possible to dynamically size the allocation of
the cpu bitmaps depending on the quantity of processors detected on
bootup.

This patch enables that logic if more than 256 processors
are configured and increases the default to 512 processors.

Further increases may be needed if ARM processor vendors start
supporting more processors. Given the current inflationary trends
in core counts from multiple processor manufacturers this may occur.

Signed-off-by: Christoph Lameter (Ampere) <cl@linux.com>

---

This was discussed earlier and this updated patch was proposed at 
https://lore.kernel.org/linux-arm-kernel/6a854175-5f89-c754-17b8-deda18447f1f@gentwo.org/T/
  

Comments

Russell King (Oracle) Jan. 15, 2024, 3:39 p.m. UTC | #1
On Thu, Dec 14, 2023 at 04:05:56PM -0800, Christoph Lameter (Ampere) wrote:
> Index: linux/arch/arm64/Kconfig
> ===================================================================
> --- linux.orig/arch/arm64/Kconfig
> +++ linux/arch/arm64/Kconfig
> @@ -1407,7 +1407,21 @@ config SCHED_SMT
>   config NR_CPUS
>   	int "Maximum number of CPUs (2-4096)"
>   	range 2 4096

I think your mailer got to your patch and messed up the white space.
There are two spaces before each of these lines rather than the usual
one.

> -	default "256"
> +	default 512
> +
> +#
> +# Determines the placement of cpumasks.
> +#
> +# With CPUMASK_OFFSTACK the cpumasks are dynamically allocated.
> +# Useful for machines with lots of core because it avoids increasing
> +# the size of many of the data structures in the kernel.
> +#
> +# If this is off then the cpumasks have a static sizes and are
> +# embedded within data structures.
> +#
> +config CPUMASK_OFFSTACK
> +	def_bool y
> +	depends on NR_CPUS > 256

Should that be ">= 256" ?

> 
>   config HOTPLUG_CPU
>   	bool "Support for hot-pluggable CPUs"

Same here.
  
Eric Mackay Jan. 15, 2024, 11:59 p.m. UTC | #2
Whitespace issues aside, I have applied the patch on top of kernel 6.1.55 and tested on both a dual-socket Ampere Altra machine with < 256 CPUs, and a dual-socket AmpereOne machine with > 256 CPUs. Works as expected, with all CPUs visible and functional.

>   config NR_CPUS
>   	int "Maximum number of CPUs (2-4096)"
>   	range 2 4096
> - 	default "256"
> + 	default 512

Nit: the new default value should be in quotation marks, if we want to be pedantic

Tested-by: Eric Mackay <eric.mackay@oracle.com>
  
Kefeng Wang Jan. 16, 2024, 7:10 a.m. UTC | #3
On 2024/1/15 23:39, Russell King (Oracle) wrote:
> On Thu, Dec 14, 2023 at 04:05:56PM -0800, Christoph Lameter (Ampere) wrote:
>> Index: linux/arch/arm64/Kconfig
>> ===================================================================
>> --- linux.orig/arch/arm64/Kconfig
>> +++ linux/arch/arm64/Kconfig
>> @@ -1407,7 +1407,21 @@ config SCHED_SMT
>>    config NR_CPUS
>>    	int "Maximum number of CPUs (2-4096)"
>>    	range 2 4096
> 
> I think your mailer got to your patch and messed up the white space.
> There are two spaces before each of these lines rather than the usual
> one.
> 
>> -	default "256"
>> +	default 512
>> +
>> +#
>> +# Determines the placement of cpumasks.
>> +#
>> +# With CPUMASK_OFFSTACK the cpumasks are dynamically allocated.
>> +# Useful for machines with lots of core because it avoids increasing
>> +# the size of many of the data structures in the kernel.
>> +#
>> +# If this is off then the cpumasks have a static sizes and are
>> +# embedded within data structures.
>> +#
>> +config CPUMASK_OFFSTACK
>> +	def_bool y
>> +	depends on NR_CPUS > 256
> 
> Should that be ">= 256" ?

Maybe just select CPUMASK_OFFSTACK if NR_CPUS >= 256,


But could we just make CPUMASK_OFFSTACK configurable and let user/distro
to enable it?

diff --git a/lib/Kconfig b/lib/Kconfig
index 5ddda7c2ed9b..4254be5aa843 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -535,7 +535,9 @@ config CHECK_SIGNATURE
         bool

  config CPUMASK_OFFSTACK
-       bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
+       bool "Force CPU masks off stack"
+       depends on SMP
+       default n
         help
           Use dynamic allocation for cpumask_var_t, instead of putting
           them on the stack.  This is a bit more expensive, but avoids


> 
>>
>>    config HOTPLUG_CPU
>>    	bool "Support for hot-pluggable CPUs"
> 
> Same here.
>
  
Russell King (Oracle) Jan. 16, 2024, 9:28 a.m. UTC | #4
On Tue, Jan 16, 2024 at 03:10:26PM +0800, Kefeng Wang wrote:
> On 2024/1/15 23:39, Russell King (Oracle) wrote:
> > On Thu, Dec 14, 2023 at 04:05:56PM -0800, Christoph Lameter (Ampere) wrote:
> > > Index: linux/arch/arm64/Kconfig
> > > ===================================================================
> > > --- linux.orig/arch/arm64/Kconfig
> > > +++ linux/arch/arm64/Kconfig
> > > @@ -1407,7 +1407,21 @@ config SCHED_SMT
> > >    config NR_CPUS
> > >    	int "Maximum number of CPUs (2-4096)"
> > >    	range 2 4096
> > 
> > I think your mailer got to your patch and messed up the white space.
> > There are two spaces before each of these lines rather than the usual
> > one.
> > 
> > > -	default "256"
> > > +	default 512
> > > +
> > > +#
> > > +# Determines the placement of cpumasks.
> > > +#
> > > +# With CPUMASK_OFFSTACK the cpumasks are dynamically allocated.
> > > +# Useful for machines with lots of core because it avoids increasing
> > > +# the size of many of the data structures in the kernel.
> > > +#
> > > +# If this is off then the cpumasks have a static sizes and are
> > > +# embedded within data structures.
> > > +#
> > > +config CPUMASK_OFFSTACK
> > > +	def_bool y
> > > +	depends on NR_CPUS > 256
> > 
> > Should that be ">= 256" ?
> 
> Maybe just select CPUMASK_OFFSTACK if NR_CPUS >= 256,
> 
> 
> But could we just make CPUMASK_OFFSTACK configurable and let user/distro
> to enable it?
> 
> diff --git a/lib/Kconfig b/lib/Kconfig
> index 5ddda7c2ed9b..4254be5aa843 100644
> --- a/lib/Kconfig
> +++ b/lib/Kconfig
> @@ -535,7 +535,9 @@ config CHECK_SIGNATURE
>         bool
> 
>  config CPUMASK_OFFSTACK
> -       bool "Force CPU masks off stack" if DEBUG_PER_CPU_MAPS
> +       bool "Force CPU masks off stack"
> +       depends on SMP
> +       default n

Please. No.

There is no point in defining a default of n. The default default is n.
Therefore, specifying a default of n is utterly redundant as the option
will still default to n and just adds clutter to Kconfig files.
  
Russell King (Oracle) Jan. 16, 2024, 11:24 a.m. UTC | #5
On Mon, Jan 15, 2024 at 03:59:11PM -0800, Eric Mackay wrote:
> Whitespace issues aside, I have applied the patch on top of kernel 6.1.55 and tested on both a dual-socket Ampere Altra machine with < 256 CPUs, and a dual-socket AmpereOne machine with > 256 CPUs. Works as expected, with all CPUs visible and functional.
> 
> >   config NR_CPUS
> >   	int "Maximum number of CPUs (2-4096)"
> >   	range 2 4096
> > - 	default "256"
> > + 	default 512
> 
> Nit: the new default value should be in quotation marks, if we want to be pedantic

I can't find anything that requires the quotes - and as "range" doesn't
for consistency it seems that default shouldn't either. There's nothing
in the documentation that indicates quotes should be used, and looking
at the code, it's just treated as a string. The only thing that quotes
seem to do is to ensure that whitespace will be included.
  
Mark Rutland Jan. 16, 2024, 1:08 p.m. UTC | #6
On Mon, Jan 15, 2024 at 03:39:00PM +0000, Russell King (Oracle) wrote:
> On Thu, Dec 14, 2023 at 04:05:56PM -0800, Christoph Lameter (Ampere) wrote:
> > +# Determines the placement of cpumasks.
> > +#
> > +# With CPUMASK_OFFSTACK the cpumasks are dynamically allocated.
> > +# Useful for machines with lots of core because it avoids increasing
> > +# the size of many of the data structures in the kernel.
> > +#
> > +# If this is off then the cpumasks have a static sizes and are
> > +# embedded within data structures.
> > +#
> > +config CPUMASK_OFFSTACK
> > +	def_bool y
> > +	depends on NR_CPUS > 256
> 
> Should that be ">= 256" ?

I don't think that ">= 256" makes sense. Note that since the cpumasks are
arrays of unsigned long, they're chunked into groups of 64 bits:

    2 to  64 cpus:  1 x unsigned long =>  8 bytes
   65 to 128 cpus:  2 x unsigned long => 16 bytes
  129 to 192 cpus:  3 x unsigned long => 24 bytes
  193 to 256 cpus:  4 x unsigned long => 32 bytes
  257 to 320 cpus:  5 x unsigned long => 40 bytes

.. and so if a mask for 256 CPUs is too big to go in the stack, so is any mask
for 193+ CPUs, and so ">= 256" should be clamped down to ">= 193" or "> 192".
The boundary should be just after a multiple of 64.

How did we choose 256 specifically? I note that x86-64 allows 512 CPUs before
requiring CPUMASK_OFFSTACK, and I see that powerpc selects CPUMASK_OFFSTACK
when NR_CPUS >= 8192.

Mark.
  
Eric Mackay Jan. 16, 2024, 9:06 p.m. UTC | #7
On Tue, Jan 16, 2024 1:08:20PM +0000, Mark Rutland wrote:
> On Mon, Jan 15, 2024 at 03:39:00PM +0000, Russell King (Oracle) wrote:
> > On Thu, Dec 14, 2023 at 04:05:56PM -0800, Christoph Lameter (Ampere) wrote:
> > > +# Determines the placement of cpumasks.
> > > +#
> > > +# With CPUMASK_OFFSTACK the cpumasks are dynamically allocated.
> > > +# Useful for machines with lots of core because it avoids increasing
> > > +# the size of many of the data structures in the kernel.
> > > +#
> > > +# If this is off then the cpumasks have a static sizes and are
> > > +# embedded within data structures.
> > > +#
> > > +config CPUMASK_OFFSTACK
> > > +	def_bool y
> > > +	depends on NR_CPUS > 256
> > 
> > Should that be ">= 256" ?
> 
> I don't think that ">= 256" makes sense. Note that since the cpumasks are
> arrays of unsigned long, they're chunked into groups of 64 bits:
> 
>     2 to  64 cpus:  1 x unsigned long =>  8 bytes
>    65 to 128 cpus:  2 x unsigned long => 16 bytes
>   129 to 192 cpus:  3 x unsigned long => 24 bytes
>   193 to 256 cpus:  4 x unsigned long => 32 bytes
>   257 to 320 cpus:  5 x unsigned long => 40 bytes
> 
> ... and so if a mask for 256 CPUs is too big to go in the stack, so is any mask
> for 193+ CPUs, and so ">= 256" should be clamped down to ">= 193" or "> 192".
> The boundary should be just after a multiple of 64.
> 
> How did we choose 256 specifically? I note that x86-64 allows 512 CPUs before
> requiring CPUMASK_OFFSTACK, and I see that powerpc selects CPUMASK_OFFSTACK
> when NR_CPUS >= 8192.
> 
> Mark.

The suggestion for >= 256 may have been a zero-index/one-index mixup.

It seems > 256 was chosen as the cutoff simply because it preserves existing behavior.
The patch description seems to imply there was pushback from distro maintainers on just increasing
the default NR_CPUS.

The existing default value of 256 is probably already a strain for smaller ARM systems, to which
x86-64 isn't a reasonable comparison. I'm not sure what the reaction to increasing from 64 to 256
in 2019 was like, but picking a pivot point for CPUMASK_OFFSTACK beyond 256 may skew the balance
even less in favor of smaller ARM systems.
  
Eric Mackay Jan. 16, 2024, 9:06 p.m. UTC | #8
On Tue, Jan 16, 2024 11:24:18PM +0000, Russell King (Oracle) wrote:
> On Mon, Jan 15, 2024 at 03:59:11PM -0800, Eric Mackay wrote:
> > Whitespace issues aside, I have applied the patch on top of kernel 6.1.55 and tested on both a dual-socket Ampere Altra machine with < 256 CPUs, and a dual-socket AmpereOne machine with > 256 CPUs. Works as expected, with all CPUs visible and functional.
> > 
> > >   config NR_CPUS
> > >   	int "Maximum number of CPUs (2-4096)"
> > >   	range 2 4096
> > > - 	default "256"
> > > + 	default 512
> > 
> > Nit: the new default value should be in quotation marks, if we want to be pedantic
> 
> I can't find anything that requires the quotes - and as "range" doesn't
> for consistency it seems that default shouldn't either. There's nothing
> in the documentation that indicates quotes should be used, and looking
> at the code, it's just treated as a string. The only thing that quotes
> seem to do is to ensure that whitespace will be included.
> 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

Ack. I withdraw my nit.
  

Patch

Index: linux/arch/arm64/Kconfig
===================================================================
--- linux.orig/arch/arm64/Kconfig
+++ linux/arch/arm64/Kconfig
@@ -1407,7 +1407,21 @@  config SCHED_SMT
   config NR_CPUS
   	int "Maximum number of CPUs (2-4096)"
   	range 2 4096
-	default "256"
+	default 512
+
+#
+# Determines the placement of cpumasks.
+#
+# With CPUMASK_OFFSTACK the cpumasks are dynamically allocated.
+# Useful for machines with lots of core because it avoids increasing
+# the size of many of the data structures in the kernel.
+#
+# If this is off then the cpumasks have a static sizes and are
+# embedded within data structures.
+#
+config CPUMASK_OFFSTACK
+	def_bool y
+	depends on NR_CPUS > 256

   config HOTPLUG_CPU
   	bool "Support for hot-pluggable CPUs"