Message ID: 20230919123319.23785-1-yangyicong@huawei.com
State: New
From: Yicong Yang <yangyicong@huawei.com>
To: catalin.marinas@arm.com, will@kernel.org, sudeep.holla@arm.com, linux-arm-kernel@lists.infradead.org
Cc: gregkh@linuxfoundation.org, rafael@kernel.org, jonathan.cameron@huawei.com, prime.zeng@hisilicon.com, linuxarm@huawei.com, yangyicong@hisilicon.com, linux-kernel@vger.kernel.org
Subject: [PATCH] arch_topology: Support SMT control on arm64
Date: Tue, 19 Sep 2023 20:33:19 +0800
Message-ID: <20230919123319.23785-1-yangyicong@huawei.com>
Series: arch_topology: Support SMT control on arm64
Commit Message
Yicong Yang
Sept. 19, 2023, 12:33 p.m. UTC
From: Yicong Yang <yangyicong@hisilicon.com>

The core CPU control framework supports runtime SMT control, which is
not yet supported on arm64. Besides the general vulnerability concerns,
we want this runtime control on our arm64 servers for:

- better single-CPU performance in some cases
- saving overall power consumption

This patch implements it in the following aspects:

- implement the callbacks of the core
- update the SMT status after the topology is enumerated on arm64
- select HOTPLUG_SMT for arm64

For disabling SMT we'll offline all the secondary threads and leave
only the primary thread. Since we have no restriction on primary thread
selection, the first thread is chosen as the primary thread in this
implementation.

Tests have been done on our ACPI based arm64 server and on ACPI/OF
based QEMU VMs.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 arch/arm64/Kconfig            |  1 +
 drivers/base/arch_topology.c  | 63 +++++++++++++++++++++++++++++++++++
 include/linux/arch_topology.h | 11 ++++++
 3 files changed, 75 insertions(+)
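The primary-thread rule described above (with no architectural restriction, the first thread of each SMT core is the primary) can be sketched in userspace. This is a minimal model, not the kernel code, and the SMT2 topology below (CPUs 2k and 2k+1 sharing a core) is invented for illustration:

```python
# Hypothetical SMT2 topology: CPUs 2k and 2k+1 share a core.
core_id = {cpu: cpu // 2 for cpu in range(8)}

def sibling_cpus(cpu):
    """Analogue of topology_sibling_cpumask(): CPUs sharing cpu's core."""
    return sorted(c for c in core_id if core_id[c] == core_id[cpu])

def is_primary_thread(cpu):
    """Analogue of topology_is_primary_thread(): first sibling wins."""
    return cpu == sibling_cpus(cpu)[0]

primaries = [cpu for cpu in core_id if is_primary_thread(cpu)]
print(primaries)  # [0, 2, 4, 6] -- the even-numbered CPUs here
```

With this rule, disabling SMT keeps exactly one thread per physical core online.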
Comments
Hi Yicong,
kernel test robot noticed the following build errors:
[auto build test ERROR on arm64/for-next/core]
[also build test ERROR on driver-core/driver-core-testing driver-core/driver-core-next driver-core/driver-core-linus linus/master v6.6-rc2 next-20230919]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Yicong-Yang/arch_topology-Support-SMT-control-on-arm64/20230919-223458
base: https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/core
patch link: https://lore.kernel.org/r/20230919123319.23785-1-yangyicong%40huawei.com
patch subject: [PATCH] arch_topology: Support SMT control on arm64
config: arm64-allnoconfig (https://download.01.org/0day-ci/archive/20230920/202309200727.CtYl75aH-lkp@intel.com/config)
compiler: aarch64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20230920/202309200727.CtYl75aH-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202309200727.CtYl75aH-lkp@intel.com/
All errors (new ones prefixed by >>):
kernel/cpu.c: In function 'cpuhp_get_primary_thread_mask':
kernel/cpu.c:660:16: error: 'cpu_primary_thread_mask' undeclared (first use in this function); did you mean 'cpuhp_get_primary_thread_mask'?
660 | return cpu_primary_thread_mask;
| ^~~~~~~~~~~~~~~~~~~~~~~
| cpuhp_get_primary_thread_mask
kernel/cpu.c:660:16: note: each undeclared identifier is reported only once for each function it appears in
kernel/cpu.c: In function 'cpuhp_smt_disable':
>> kernel/cpu.c:2629:23: error: implicit declaration of function 'cpu_down_maps_locked' [-Werror=implicit-function-declaration]
2629 | ret = cpu_down_maps_locked(cpu, CPUHP_OFFLINE);
| ^~~~~~~~~~~~~~~~~~~~
cc1: some warnings being treated as errors
vim +/cpu_down_maps_locked +2629 kernel/cpu.c
dc8d37ed304eee Arnd Bergmann 2019-12-10 2620
dc8d37ed304eee Arnd Bergmann 2019-12-10 2621 int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
dc8d37ed304eee Arnd Bergmann 2019-12-10 2622 {
dc8d37ed304eee Arnd Bergmann 2019-12-10 2623 int cpu, ret = 0;
dc8d37ed304eee Arnd Bergmann 2019-12-10 2624
dc8d37ed304eee Arnd Bergmann 2019-12-10 2625 cpu_maps_update_begin();
dc8d37ed304eee Arnd Bergmann 2019-12-10 2626 for_each_online_cpu(cpu) {
dc8d37ed304eee Arnd Bergmann 2019-12-10 2627 if (topology_is_primary_thread(cpu))
dc8d37ed304eee Arnd Bergmann 2019-12-10 2628 continue;
dc8d37ed304eee Arnd Bergmann 2019-12-10 @2629 ret = cpu_down_maps_locked(cpu, CPUHP_OFFLINE);
dc8d37ed304eee Arnd Bergmann 2019-12-10 2630 if (ret)
dc8d37ed304eee Arnd Bergmann 2019-12-10 2631 break;
dc8d37ed304eee Arnd Bergmann 2019-12-10 2632 /*
dc8d37ed304eee Arnd Bergmann 2019-12-10 2633 * As this needs to hold the cpu maps lock it's impossible
dc8d37ed304eee Arnd Bergmann 2019-12-10 2634 * to call device_offline() because that ends up calling
dc8d37ed304eee Arnd Bergmann 2019-12-10 2635 * cpu_down() which takes cpu maps lock. cpu maps lock
dc8d37ed304eee Arnd Bergmann 2019-12-10 2636 * needs to be held as this might race against in kernel
dc8d37ed304eee Arnd Bergmann 2019-12-10 2637 * abusers of the hotplug machinery (thermal management).
dc8d37ed304eee Arnd Bergmann 2019-12-10 2638 *
dc8d37ed304eee Arnd Bergmann 2019-12-10 2639 * So nothing would update device:offline state. That would
dc8d37ed304eee Arnd Bergmann 2019-12-10 2640 * leave the sysfs entry stale and prevent onlining after
dc8d37ed304eee Arnd Bergmann 2019-12-10 2641 * smt control has been changed to 'off' again. This is
dc8d37ed304eee Arnd Bergmann 2019-12-10 2642 * called under the sysfs hotplug lock, so it is properly
dc8d37ed304eee Arnd Bergmann 2019-12-10 2643 * serialized against the regular offline usage.
dc8d37ed304eee Arnd Bergmann 2019-12-10 2644 */
dc8d37ed304eee Arnd Bergmann 2019-12-10 2645 cpuhp_offline_cpu_device(cpu);
dc8d37ed304eee Arnd Bergmann 2019-12-10 2646 }
dc8d37ed304eee Arnd Bergmann 2019-12-10 2647 if (!ret)
dc8d37ed304eee Arnd Bergmann 2019-12-10 2648 cpu_smt_control = ctrlval;
dc8d37ed304eee Arnd Bergmann 2019-12-10 2649 cpu_maps_update_done();
dc8d37ed304eee Arnd Bergmann 2019-12-10 2650 return ret;
dc8d37ed304eee Arnd Bergmann 2019-12-10 2651 }
dc8d37ed304eee Arnd Bergmann 2019-12-10 2652
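The cpuhp_smt_disable() loop in the listing above can be modeled outside the kernel as follows. This is a sketch only: the SMT2 topology is hypothetical, and removing a CPU from a set stands in for cpu_down_maps_locked():

```python
# Hypothetical SMT2 topology: CPUs 2k and 2k+1 share a core.
core_id = {cpu: cpu // 2 for cpu in range(8)}
online = set(core_id)  # all CPUs start online

def is_primary_thread(cpu):
    """First thread of a core plays the primary role."""
    siblings = [c for c in core_id if core_id[c] == core_id[cpu]]
    return cpu == min(siblings)

def smt_disable():
    """Offline every online CPU that is not its core's primary thread."""
    for cpu in sorted(online):       # snapshot, like for_each_online_cpu()
        if is_primary_thread(cpu):
            continue
        online.discard(cpu)          # stands in for cpu_down_maps_locked()

smt_disable()
print(sorted(online))  # [0, 2, 4, 6] -- only primary threads remain
```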
On 2023/9/20 7:30, kernel test robot wrote:
> Hi Yicong,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on arm64/for-next/core]
> [also build test ERROR on driver-core/driver-core-testing driver-core/driver-core-next driver-core/driver-core-linus linus/master v6.6-rc2 next-20230919]
>
> [...]
>
> All errors (new ones prefixed by >>):
>
>    kernel/cpu.c: In function 'cpuhp_get_primary_thread_mask':
>    kernel/cpu.c:660:16: error: 'cpu_primary_thread_mask' undeclared (first use in this function); did you mean 'cpuhp_get_primary_thread_mask'?
>      660 |         return cpu_primary_thread_mask;
>          |                ^~~~~~~~~~~~~~~~~~~~~~~
>          |                cpuhp_get_primary_thread_mask

cpu_primary_thread_mask is used for CONFIG_HOTPLUG_PARALLEL, which is
not enabled on arm64. Previously it had a dependency on
CONFIG_HOTPLUG_SMT, but that was decoupled by 7a4dcb4a5de1
("cpu/hotplug: Remove dependancy against cpu_primary_thread_mask"),
which was merged after v6.6-rc1. I checked that the arm64 branch doesn't
contain this commit for now. The patch builds fine after v6.6-rc1.

>    kernel/cpu.c:660:16: note: each undeclared identifier is reported only once for each function it appears in
>    kernel/cpu.c: In function 'cpuhp_smt_disable':
> >> kernel/cpu.c:2629:23: error: implicit declaration of function 'cpu_down_maps_locked' [-Werror=implicit-function-declaration]
>     2629 |                 ret = cpu_down_maps_locked(cpu, CPUHP_OFFLINE);
>          |                       ^~~~~~~~~~~~~~~~~~~~
>    cc1: some warnings being treated as errors
>
> [...]
On 19/09/2023 14:33, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> The core CPU control framework supports runtime SMT control, which is
> not yet supported on arm64. Besides the general vulnerability concerns,
> we want this runtime control on our arm64 servers for:
>
> - better single-CPU performance in some cases
> - saving overall power consumption
>
> This patch implements it in the following aspects:
>
> - implement the callbacks of the core

I see only 1 function here: topology_is_primary_thread() ?

> - update the SMT status after the topology is enumerated on arm64

That's the call init_cpu_topology()
                  topology_smt_set_num_threads()
                    cpu_smt_set_num_threads()

> - select HOTPLUG_SMT for arm64
>
> For disabling SMT we'll offline all the secondary threads and

`disabling SMT` means here setting cpu_smt_control=CPU_SMT_DISABLED ?

> leave only the primary thread. Since we have no restriction on primary
> thread selection, the first thread is chosen as the primary thread in
> this implementation.
>
> Tests have been done on our ACPI based arm64 server and on ACPI/OF
> based QEMU VMs.
>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  arch/arm64/Kconfig            |  1 +
>  drivers/base/arch_topology.c  | 63 +++++++++++++++++++++++++++++++++++
>  include/linux/arch_topology.h | 11 ++++++
>  3 files changed, 75 insertions(+)
>
[...]
> @@ -841,6 +898,12 @@ void __init init_cpu_topology(void)
>  		reset_cpu_topology();
>  	}
>
> +	/*
> +	 * By this stage we get to know whether we support SMT or not, update
> +	 * the information for the core.
> +	 */
> +	topology_smt_set_num_threads();
> +

So this would be the diff between x86 and arm64:

start_kernel()                     [init/main.c]
  arch_cpu_finalize_init()         [arch/x86/kernel/cpu/common.c] <- x86
    identify_boot_cpu()            [arch/x86/kernel/cpu/common.c]
      detect_ht()                  [arch/x86/kernel/cpu/common.c]
        detect_ht_early()          [arch/x86/kernel/cpu/common.c]
          cpu_smt_set_num_threads(smp_num_siblings, smp_num_siblings) <- (1)

  arch_call_rest_init()            [init/main.c] <- arm64
    rest_init()                    [init/main.c]
      kernel_init()                [init/main.c]
        kernel_init_freeable()     [init/main.c]
          smp_prepare_cpus()       [arch/arm64/kernel/smp.c]
            init_cpu_topology()    [drivers/base/arch_topology.c]
              topology_smt_set_num_threads()
                cpu_smt_set_num_threads(topology_smt_num_threads,
                                        topology_smt_num_threads) <- (1)

[...]

Did some rough testing with your patch on an SMT4 Arm64 server with 256
CPUs:

(1) CPU hp out all secondaries from the thread_siblings masks

    for i in {32..255}; do echo 0 > /sys/devices/system/cpu/cpu$i/online; done

(2) Check thread_siblings cpumasks

    cat /sys/devices/system/cpu/cpu*/topology/thread_siblings
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000002
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000400
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000800
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00001000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00002000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00004000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00008000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00010000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00020000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00040000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00080000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000004
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00100000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00200000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00400000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00800000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,01000000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,02000000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,04000000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,08000000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,10000000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,20000000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000008
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,40000000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,80000000
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000010
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000020
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000040
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000080
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000100
    00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000200

(3) CPU hp out and in CPU31

    echo 0 > /sys/devices/system/cpu/cpu31/online
    echo 1 > /sys/devices/system/cpu/cpu31/online

cpu_smt_control is still CPU_SMT_ENABLED in cpu_smt_allowed() so
topology_is_primary_thread() isn't called?
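The thread-counting logic quoted in the review above (topology_smt_set_num_threads()) can be modeled in userspace. This sketch mirrors the patch's O(n^2) core-id comparison, which deliberately avoids the sibling masks because those shrink when CPUs are offlined; the mixed SMT4/non-SMT topology is invented for illustration:

```python
# thread_id == -1 marks a CPU that is not part of an SMT core,
# mirroring cpu_topology[cpu].thread_id in the patch.
topo = {
    0: {"core_id": 0, "thread_id": 0},
    1: {"core_id": 0, "thread_id": 1},
    2: {"core_id": 0, "thread_id": 2},
    3: {"core_id": 0, "thread_id": 3},   # one SMT4 core ...
    4: {"core_id": 1, "thread_id": -1},  # ... plus non-SMT cores
    5: {"core_id": 2, "thread_id": -1},
}

def smt_num_threads(topo):
    """Largest thread count over all SMT cores; 1 if there is no SMT."""
    best = 1
    for cpu, t in topo.items():
        if t["thread_id"] == -1:
            continue                      # not in an SMT core
        threads = 1
        for sib, s in topo.items():       # compare core ids, not masks
            if sib == cpu or s["thread_id"] == -1:
                continue
            if s["core_id"] == t["core_id"]:
                threads += 1
        best = max(best, threads)
    return best

print(smt_num_threads(topo))  # 4
```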
On 2023/9/21 1:08, Dietmar Eggemann wrote:
> On 19/09/2023 14:33, Yicong Yang wrote:
>> This patch implements it in the following aspects:
>>
>> - implement the callbacks of the core
>
> I see only 1 function here: topology_is_primary_thread() ?

Yes. Before 6.6-rc1 one more function, topology_smt_supported(), was
also required; I forgot to update the comment after rebasing on 6.6-rc1.

>> - update the SMT status after the topology is enumerated on arm64
>
> That's the call init_cpu_topology()
>                   topology_smt_set_num_threads()
>                     cpu_smt_set_num_threads()

Yes.

>> - select HOTPLUG_SMT for arm64
>>
>> For disabling SMT we'll offline all the secondary threads and
>
> `disabling SMT` means here setting cpu_smt_control=CPU_SMT_DISABLED ?

Yes. The SMT control provides user interfaces such as
`/sys/devices/system/cpu/smt/control` and the cmdline option
`nosmt=[force]`, which update cpu_smt_control to CPU_SMT_DISABLED.

[...]

> So this would be the diff between x86 and arm64:
>
> [...]

Yes, we have some differences. On arm64 the SMT information is
retrieved from firmware (ACPI/OF, parse_{acpi,dt}_topology()), so we
need to pass the SMT information to the core after parsing the ACPI/OF
tables in init_cpu_topology().

> Did some rough testing with your patch on an SMT4 Arm64 server with 256
> CPUs:
>
> [...]
>
> (3) CPU hp out and in CPU31
>
>     echo 0 > /sys/devices/system/cpu/cpu31/online
>     echo 1 > /sys/devices/system/cpu/cpu31/online
>
> cpu_smt_control is still CPU_SMT_ENABLED in cpu_smt_allowed() so
> topology_is_primary_thread() isn't called?

If you manually disable SMT by offlining each CPU, cpu_smt_control will
not be updated. It is updated when using an interface like
`/sys/devices/system/cpu/smt/control` or the cmdline. By these means
the framework uses topology_is_primary_thread() to decide which CPU in
the SMT core stays online:

    // e.g. echo off > /sys/devices/system/cpu/smt/control
    [ kernel/cpu.c ]
    control_store()
      __store_smt_control()
        cpuhp_smt_disable()
          for_each_online_cpu(cpu)
            if (topology_is_primary_thread(cpu))
              continue;    <---------- will skip the primary thread
          [...]
          cpu_smt_control = CPU_SMT_DISABLED;

The topology_is_primary_thread() check only applies to SMT control, not
to regular CPU offline.

Thanks,
Yicong
On Tue, Sep 19, 2023 at 08:33:19PM +0800, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
>
> The core CPU control framework supports runtime SMT control which
> is not yet supported on arm64. Besides the general vulnerabilities
> concerns we want this runtime control on our arm64 server for:
>
>  - better single CPU performance in some cases
>  - saving overall power consumption
>
> This patch implements it in the following aspects:
>
>  - implement the callbacks of the core
>  - update the SMT status after the topology enumerated on arm64
>  - select HOTPLUG_SMT for arm64
>
> For disabling SMT we'll offline all the secondary threads and
> only leave the primary thread. Since we don't have restriction
> for primary thread selection, the first thread is chosen as the
> primary thread in this implementation.
>
> Tests has been done on our ACPI based arm64 server and on
> ACPI/OF based QEMU VMs.
>
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  arch/arm64/Kconfig            |  1 +
>  drivers/base/arch_topology.c  | 63 +++++++++++++++++++++++++++++++++++
>  include/linux/arch_topology.h | 11 ++++++
>  3 files changed, 75 insertions(+)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index b10515c0200b..531a71c7f499 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -233,6 +233,7 @@ config ARM64
>  	select HAVE_KRETPROBES
>  	select HAVE_GENERIC_VDSO
>  	select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
> +	select HOTPLUG_SMT if SMP
>  	select IRQ_DOMAIN
>  	select IRQ_FORCED_THREADING
>  	select KASAN_VMALLOC if KASAN
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index b741b5ba82bd..75a693834fff 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -729,6 +729,63 @@ const struct cpumask *cpu_clustergroup_mask(int cpu)
>  	return &cpu_topology[cpu].cluster_sibling;
>  }
>
> +#ifdef CONFIG_HOTPLUG_SMT
> +static int topology_smt_num_threads = 1;
> +
> +void __init topology_smt_set_num_threads(void)
> +{
> +	int cpu, sibling, threads;
> +
> +	/*
> +	 * Walk all the CPUs to find the largest thread number, in case we're
> +	 * on a heterogeneous platform with only part of the CPU cores support
> +	 * SMT.
> +	 *
> +	 * Get the thread number by checking the CPUs with same core id
> +	 * rather than checking the topology_sibling_cpumask(), since the
> +	 * sibling mask will not cover all the CPUs if there's CPU offline.
> +	 */
> +	for_each_possible_cpu(cpu) {
> +		threads = 1;
> +
> +		/* Invalid thread id, this CPU is not in a SMT core */
> +		if (cpu_topology[cpu].thread_id == -1)
> +			continue;
> +
> +		for_each_possible_cpu(sibling) {

I would really like to avoid parsing all the CPUs here (O(cpus^2)).

Another random thought (just looking at DT parsing): we could count
threads while parsing itself if we need the info early, before the
topology cpumasks are set up. I need to look at ACPI parsing and how to
make that generic, but thought of checking the idea here first.

[...]

> @@ -841,6 +898,12 @@ void __init init_cpu_topology(void)
>  		reset_cpu_topology();
>  	}
>
> +	/*
> +	 * By this stage we get to know whether we support SMT or not, update
> +	 * the information for the core.
> +	 */
> +	topology_smt_set_num_threads();
> +

Does this have to be done this early? I was wondering if we can defer
until all the cpumasks are set and you can rely on the thread_sibling
mask? You can just get the weight of that mask on all CPUs and choose the
max value.
On 2023/9/21 23:03, Sudeep Holla wrote:
> On Tue, Sep 19, 2023 at 08:33:19PM +0800, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> The core CPU control framework supports runtime SMT control which
>> is not yet supported on arm64.
>
> [...]
>
>> +	for_each_possible_cpu(cpu) {
>> +		threads = 1;
>> +
>> +		/* Invalid thread id, this CPU is not in a SMT core */
>> +		if (cpu_topology[cpu].thread_id == -1)
>> +			continue;
>> +
>> +		for_each_possible_cpu(sibling) {
>
> I would really like to avoid parsing all the CPUs here (O(cpus^2)).

What if we use a temp cpumask to record each visited CPU and make it
O(cpus)? I didn't use one because I don't want to risk a memory allocation
failure for the cpumask. In the current implementation there's no way to
fail.

> Another random thought (just looking at DT parsing): we could count
> threads while parsing itself if we need the info early, before the
> topology cpumasks are set up. Need to look at ACPI parsing and how to
> make that generic, but thought of checking the idea here first.

The ACPI parsing is in each arch's code (currently arm64 and loongarch).
The code would then be scattered across DT, arm64 ACPI parsing, loongarch
ACPI parsing and future ACPI based archs, which I think would be redundant
and hard to maintain. So I prefer to unify the processing since the logic
is common: just check the finally built cpu_topology array regardless of
whether we're on an ACPI or OF based system.

> [...]
>
>> @@ -841,6 +898,12 @@ void __init init_cpu_topology(void)
>>  		reset_cpu_topology();
>>  	}
>>
>> +	/*
>> +	 * By this stage we get to know whether we support SMT or not, update
>> +	 * the information for the core.
>> +	 */
>> +	topology_smt_set_num_threads();
>> +
>
> Does this have to be done this early? I was wondering if we can defer
> until all the cpumasks are set and you can rely on the thread_sibling
> mask? You can just get the weight of that mask on all CPUs and choose
> the max value.

We do this at this stage because we've already gotten the necessary
information. As commented in topology_smt_set_num_threads(), I didn't use
the sibling masks because they will not contain all the information: they
are only updated when CPUs go online in
secondary_start_kernel()->store_cpu_topology(). Consider booting with
maxcpus=1: the SMT status won't be collected completely if we use the
sibling masks. For example, the sibling mask of the boot CPU will be
correct, but not those of its sibling cores.

Thanks.
On 21/09/2023 10:56, Yicong Yang wrote:
> On 2023/9/21 1:08, Dietmar Eggemann wrote:
>> On 19/09/2023 14:33, Yicong Yang wrote:
>>> From: Yicong Yang <yangyicong@hisilicon.com>

[...]

> If you manually disable SMT by offlining each CPU, cpu_smt_control will
> not be updated. It's only updated when using an interface like
> `/sys/devices/system/cpu/smt/control` or the cmdline. By these means,
> the framework will use topology_is_primary_thread() to decide which CPU
> in the SMT core will stay online:
>
> // e.g. echo off > /sys/devices/system/cpu/smt/control
> [ kernel/cpu.c ]
> control_store()
>   __store_smt_control()
>     cpuhp_smt_disable()
>       for_each_online_cpu(cpu)
>         if (topology_is_primary_thread(cpu))
>           continue;  <---------- will skip the primary thread
>         [...]
>       cpu_smt_control = CPU_SMT_DISABLED;
>
> The topology_is_primary_thread() check only applies to SMT control, not
> to plain CPU offlining.

I see, makes sense. Retested on my SMT4 Arm64 server with 256 CPUs.
Hi Sudeep,

Any further comments? Did my replies answer your concerns? More
suggestions are welcome, if there are any, to make progress on this.

Thanks.

On 2023/9/22 17:46, Yicong Yang wrote:
> On 2023/9/21 23:03, Sudeep Holla wrote:
>> On Tue, Sep 19, 2023 at 08:33:19PM +0800, Yicong Yang wrote:
>>> From: Yicong Yang <yangyicong@hisilicon.com>
>>>
>>> The core CPU control framework supports runtime SMT control which
>>> is not yet supported on arm64. Besides the general vulnerabilities
>>> concerns we want this runtime control on our arm64 server for:
>>>
>>>  - better single CPU performance in some cases
>>>  - saving overall power consumption
>>>
>>> This patch implements it in the following aspects:
>>>
>>>  - implement the callbacks of the core
>>>  - update the SMT status after the topology enumerated on arm64
>>>  - select HOTPLUG_SMT for arm64
>>>
>>> For disabling SMT we'll offline all the secondary threads and
>>> only leave the primary thread. Since we don't have restriction
>>> for primary thread selection, the first thread is chosen as the
>>> primary thread in this implementation.
>>>
>>> Tests has been done on our ACPI based arm64 server and on
>>> ACPI/OF based QEMU VMs.
[...]
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b10515c0200b..531a71c7f499 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -233,6 +233,7 @@ config ARM64
 	select HAVE_KRETPROBES
 	select HAVE_GENERIC_VDSO
 	select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
+	select HOTPLUG_SMT if SMP
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
 	select KASAN_VMALLOC if KASAN
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index b741b5ba82bd..75a693834fff 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -729,6 +729,63 @@ const struct cpumask *cpu_clustergroup_mask(int cpu)
 	return &cpu_topology[cpu].cluster_sibling;
 }
 
+#ifdef CONFIG_HOTPLUG_SMT
+static int topology_smt_num_threads = 1;
+
+void __init topology_smt_set_num_threads(void)
+{
+	int cpu, sibling, threads;
+
+	/*
+	 * Walk all the CPUs to find the largest thread number, in case we're
+	 * on a heterogeneous platform with only part of the CPU cores
+	 * supporting SMT.
+	 *
+	 * Get the thread number by checking the CPUs with the same core id
+	 * rather than checking topology_sibling_cpumask(), since the
+	 * sibling mask will not cover all the CPUs if any CPU is offline.
+	 */
+	for_each_possible_cpu(cpu) {
+		threads = 1;
+
+		/* Invalid thread id, this CPU is not in an SMT core */
+		if (cpu_topology[cpu].thread_id == -1)
+			continue;
+
+		for_each_possible_cpu(sibling) {
+			if (sibling == cpu || cpu_topology[sibling].thread_id == -1)
+				continue;
+
+			if (cpu_topology[cpu].core_id == cpu_topology[sibling].core_id)
+				threads++;
+		}
+
+		if (threads > topology_smt_num_threads)
+			topology_smt_num_threads = threads;
+	}
+
+	/*
+	 * We don't support CONFIG_SMT_NUM_THREADS_DYNAMIC so make
+	 * max_threads == num_threads.
+	 */
+	cpu_smt_set_num_threads(topology_smt_num_threads, topology_smt_num_threads);
+}
+
+/*
+ * On SMT hotplug the primary thread of the SMT core won't be disabled.
+ * On x86 the primary thread seems to serve a special purpose. For other
+ * architectures like arm64 there's no such restriction for a primary
+ * thread, so make the first thread in the SMT core the primary thread.
+ */
+bool topology_is_primary_thread(unsigned int cpu)
+{
+	if (cpu == cpumask_first(topology_sibling_cpumask(cpu)))
+		return true;
+
+	return false;
+}
+#endif
+
 void update_siblings_masks(unsigned int cpuid)
 {
 	struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid];
@@ -841,6 +898,12 @@ void __init init_cpu_topology(void)
 		reset_cpu_topology();
 	}
 
+	/*
+	 * By this stage we know whether we support SMT or not; update
+	 * the information for the core code.
+	 */
+	topology_smt_set_num_threads();
+
 	for_each_possible_cpu(cpu) {
 		ret = fetch_cache_info(cpu);
 		if (!ret)
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index a07b510e7dc5..cf605a576e7b 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -92,6 +92,17 @@ void update_siblings_masks(unsigned int cpu);
 void remove_cpu_topology(unsigned int cpuid);
 void reset_cpu_topology(void);
 int parse_acpi_topology(void);
+
+#ifdef CONFIG_HOTPLUG_SMT
+bool topology_smt_supported(void);
+bool topology_is_primary_thread(unsigned int cpu);
+void topology_smt_set_num_threads(void);
+#else
+static inline bool topology_smt_supported(void) { return false; }
+static inline bool topology_is_primary_thread(unsigned int cpu) { return false; }
+static inline void topology_smt_set_num_threads(void) { }
+#endif
+
 #endif
 #endif /* _LINUX_ARCH_TOPOLOGY_H_ */