[v2,1/2] cpu/hotplug: introduce 'num_dying_cpus' to get dying CPUs count

Message ID 20230406015629.1804722-2-yebin@huaweicloud.com
State New
Series fix dying cpu compare race

Commit Message

Ye Bin April 6, 2023, 1:56 a.m. UTC
  From: Ye Bin <yebin10@huawei.com>

Introduce '__num_dying_cpus' variable to cache the number of dying CPUs
in the core and just return the cached variable.

Signed-off-by: Ye Bin <yebin10@huawei.com>
---
 include/linux/cpumask.h | 20 ++++++++++++++++----
 kernel/cpu.c            |  2 ++
 2 files changed, 18 insertions(+), 4 deletions(-)
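
The counting scheme described in the commit message can be modeled in
userspace roughly as follows. This is only an illustrative sketch, not the
kernel code: dying_mask, num_dying, set_dying() and num_dying_cpus_model()
are hypothetical stand-ins for __cpu_dying_mask, __num_dying_cpus,
set_cpu_dying() and num_dying_cpus(), and C11 atomics stand in for the
kernel's cpumask and atomic_t helpers. It assumes cpu is smaller than the
bit width of unsigned long.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for __cpu_dying_mask and __num_dying_cpus. */
static _Atomic unsigned long dying_mask;
static atomic_int num_dying;

/* Model of set_cpu_dying(): adjust the counter only when the bit changes. */
static void set_dying(unsigned int cpu, bool dying)
{
	unsigned long bit = 1UL << cpu;

	if (dying) {
		/* Count only a 0 -> 1 transition of the bit. */
		if (!(atomic_fetch_or(&dying_mask, bit) & bit))
			atomic_fetch_add(&num_dying, 1);
	} else {
		/* Count only a 1 -> 0 transition of the bit. */
		if (atomic_fetch_and(&dying_mask, ~bit) & bit)
			atomic_fetch_sub(&num_dying, 1);
	}
}

/* Model of num_dying_cpus(): return the cached count, no mask walk. */
static unsigned int num_dying_cpus_model(void)
{
	return (unsigned int)atomic_load(&num_dying);
}
```

In this model, setting an already-set bit or clearing an already-clear bit
leaves the counter unchanged. The mask update and the counter update are
still two separate atomic operations, so a concurrent reader may briefly
see a mask and a counter that disagree.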
  

Comments

Yury Norov April 10, 2023, 5:42 p.m. UTC | #1
On Thu, Apr 06, 2023 at 09:56:28AM +0800, Ye Bin wrote:
> From: Ye Bin <yebin10@huawei.com>
> 
> Introduce '__num_dying_cpus' variable to cache the number of dying CPUs
> in the core and just return the cached variable.
> 
> Signed-off-by: Ye Bin <yebin10@huawei.com>

It looks like you didn't address any comments for v1. Can you please
do that? Otherwise, NAK.

Thanks,
Yury
  
Thomas Gleixner April 10, 2023, 8:12 p.m. UTC | #2
On Thu, Apr 06 2023 at 09:56, Ye Bin wrote:
> From: Ye Bin <yebin10@huawei.com>
>
> Introduce '__num_dying_cpus' variable to cache the number of dying CPUs
> in the core and just return the cached variable.

Why?

That atomic counter is racy too if it is read and acted upon without
holding the CPU hotplug read lock (cpus_read_lock()).

All it does is make the race window smaller than with the
cpumask_weight() based implementation. It's still racy and incorrect.

So no, this is not going to happen.

Thanks,

        tglx
  

Patch

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index 2a61ddcf8321..8127fd598f51 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -135,6 +135,8 @@  extern struct cpumask __cpu_dying_mask;
 
 extern atomic_t __num_online_cpus;
 
+extern atomic_t __num_dying_cpus;
+
 extern cpumask_t cpus_booted_once_mask;
 
 static __always_inline void cpu_max_bits_warn(unsigned int cpu, unsigned int bits)
@@ -1018,10 +1020,14 @@  set_cpu_active(unsigned int cpu, bool active)
 static __always_inline void
 set_cpu_dying(unsigned int cpu, bool dying)
 {
-	if (dying)
-		cpumask_set_cpu(cpu, &__cpu_dying_mask);
-	else
-		cpumask_clear_cpu(cpu, &__cpu_dying_mask);
+	if (dying) {
+		if (!cpumask_test_and_set_cpu(cpu, &__cpu_dying_mask))
+			atomic_inc(&__num_dying_cpus);
+	}
+	else {
+		if (cpumask_test_and_clear_cpu(cpu, &__cpu_dying_mask))
+			atomic_dec(&__num_dying_cpus);
+	}
 }
 
 /**
@@ -1073,6 +1079,11 @@  static __always_inline unsigned int num_online_cpus(void)
 {
 	return arch_atomic_read(&__num_online_cpus);
 }
+
+static __always_inline unsigned int num_dying_cpus(void)
+{
+	return arch_atomic_read(&__num_dying_cpus);
+}
 #define num_possible_cpus()	cpumask_weight(cpu_possible_mask)
 #define num_present_cpus()	cpumask_weight(cpu_present_mask)
 #define num_active_cpus()	cpumask_weight(cpu_active_mask)
@@ -1108,6 +1119,7 @@  static __always_inline bool cpu_dying(unsigned int cpu)
 #define num_possible_cpus()	1U
 #define num_present_cpus()	1U
 #define num_active_cpus()	1U
+#define num_dying_cpus()	0U
 
 static __always_inline bool cpu_online(unsigned int cpu)
 {
diff --git a/kernel/cpu.c b/kernel/cpu.c
index f4a2c5845bcb..1c96c04cb259 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2662,6 +2662,8 @@  EXPORT_SYMBOL(__cpu_dying_mask);
 atomic_t __num_online_cpus __read_mostly;
 EXPORT_SYMBOL(__num_online_cpus);
 
+atomic_t __num_dying_cpus __read_mostly;
+
 void init_cpu_present(const struct cpumask *src)
 {
 	cpumask_copy(&__cpu_present_mask, src);