[v3,3/7] lib/group_cpus: relax atomicity requirement in grp_spread_init_one()

Message ID 20231212042108.682072-4-yury.norov@gmail.com
State New
Series lib/group_cpus: rework grp_spread_init_one() and make it O(1)

Commit Message

Yury Norov Dec. 12, 2023, 4:21 a.m. UTC
  Because nmsk and irqmsk are stable, extra atomicity is not required.

Signed-off-by: Yury Norov <yury.norov@gmail.com>
---
 lib/group_cpus.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
  

Comments

Ming Lei Dec. 12, 2023, 9:50 a.m. UTC | #1
On Mon, Dec 11, 2023 at 08:21:03PM -0800, Yury Norov wrote:
> Because nmsk and irqmsk are stable, extra atomicity is not required.
> 
> Signed-off-by: Yury Norov <yury.norov@gmail.com>
> ---
>  lib/group_cpus.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/group_cpus.c b/lib/group_cpus.c
> index 10dead3ab0e0..7ac94664230f 100644
> --- a/lib/group_cpus.c
> +++ b/lib/group_cpus.c
> @@ -24,8 +24,8 @@ static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
>  		if (cpu >= nr_cpu_ids)
>  			return;
>  
> -		cpumask_clear_cpu(cpu, nmsk);
> -		cpumask_set_cpu(cpu, irqmsk);
> +		__cpumask_clear_cpu(cpu, nmsk);
> +		__cpumask_set_cpu(cpu, irqmsk);
>  		cpus_per_grp--;
>  
>  		/* If the cpu has siblings, use them first */
> @@ -33,8 +33,8 @@ static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
>  		sibl = cpu + 1;
>  
>  		for_each_cpu_and_from(sibl, siblmsk, nmsk) {
> -			cpumask_clear_cpu(sibl, nmsk);
> -			cpumask_set_cpu(sibl, irqmsk);
> +			__cpumask_clear_cpu(sibl, nmsk);
> +			__cpumask_set_cpu(sibl, irqmsk);

I think this kind of change should be avoided; here the code is
clearly in a slow path, and we care about code cleanness and readability
much more than the cycles saved from non-atomicity.


Thanks,
Ming
  
Yury Norov Dec. 12, 2023, 4:52 p.m. UTC | #2
On Tue, Dec 12, 2023 at 05:50:04PM +0800, Ming Lei wrote:
> On Mon, Dec 11, 2023 at 08:21:03PM -0800, Yury Norov wrote:
> > Because nmsk and irqmsk are stable, extra atomicity is not required.
> > 
> > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > ---
> >  lib/group_cpus.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/lib/group_cpus.c b/lib/group_cpus.c
> > index 10dead3ab0e0..7ac94664230f 100644
> > --- a/lib/group_cpus.c
> > +++ b/lib/group_cpus.c
> > @@ -24,8 +24,8 @@ static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
> >  		if (cpu >= nr_cpu_ids)
> >  			return;
> >  
> > -		cpumask_clear_cpu(cpu, nmsk);
> > -		cpumask_set_cpu(cpu, irqmsk);
> > +		__cpumask_clear_cpu(cpu, nmsk);
> > +		__cpumask_set_cpu(cpu, irqmsk);
> >  		cpus_per_grp--;
> >  
> >  		/* If the cpu has siblings, use them first */
> > @@ -33,8 +33,8 @@ static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
> >  		sibl = cpu + 1;
> >  
> >  		for_each_cpu_and_from(sibl, siblmsk, nmsk) {
> > -			cpumask_clear_cpu(sibl, nmsk);
> > -			cpumask_set_cpu(sibl, irqmsk);
> > +			__cpumask_clear_cpu(sibl, nmsk);
> > +			__cpumask_set_cpu(sibl, irqmsk);
> 
> I think this kind of change should be avoided; here the code is
> clearly in a slow path, and we care about code cleanness and readability
> much more than the cycles saved from non-atomicity.

Atomic ops have special meaning and special function. This 'atomic' way
of moving a bit from one bitmap to another looks completely non-trivial
and puzzling to me.

A sequence of atomic ops is not atomic itself. Normally it's a sign of
a bug. But in this case, both masks are stable, and we don't need
atomicity at all.

It's not about performance, it's about readability.

Thanks,
Yury
  
Ming Lei Dec. 13, 2023, 12:14 a.m. UTC | #3
On Tue, Dec 12, 2023 at 08:52:14AM -0800, Yury Norov wrote:
> On Tue, Dec 12, 2023 at 05:50:04PM +0800, Ming Lei wrote:
> > On Mon, Dec 11, 2023 at 08:21:03PM -0800, Yury Norov wrote:
> > > Because nmsk and irqmsk are stable, extra atomicity is not required.
> > > 
> > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > ---
> > >  lib/group_cpus.c | 8 ++++----
> > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/lib/group_cpus.c b/lib/group_cpus.c
> > > index 10dead3ab0e0..7ac94664230f 100644
> > > --- a/lib/group_cpus.c
> > > +++ b/lib/group_cpus.c
> > > @@ -24,8 +24,8 @@ static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
> > >  		if (cpu >= nr_cpu_ids)
> > >  			return;
> > >  
> > > -		cpumask_clear_cpu(cpu, nmsk);
> > > -		cpumask_set_cpu(cpu, irqmsk);
> > > +		__cpumask_clear_cpu(cpu, nmsk);
> > > +		__cpumask_set_cpu(cpu, irqmsk);
> > >  		cpus_per_grp--;
> > >  
> > >  		/* If the cpu has siblings, use them first */
> > > @@ -33,8 +33,8 @@ static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
> > >  		sibl = cpu + 1;
> > >  
> > >  		for_each_cpu_and_from(sibl, siblmsk, nmsk) {
> > > -			cpumask_clear_cpu(sibl, nmsk);
> > > -			cpumask_set_cpu(sibl, irqmsk);
> > > +			__cpumask_clear_cpu(sibl, nmsk);
> > > +			__cpumask_set_cpu(sibl, irqmsk);
> > 
> > I think this kind of change should be avoided; here the code is
> > clearly in a slow path, and we care about code cleanness and readability
> > much more than the cycles saved from non-atomicity.
> 
> Atomic ops have special meaning and special function. This 'atomic' way
> of moving a bit from one bitmap to another looks completely non-trivial
> and puzzling to me.
> 
> A sequence of atomic ops is not atomic itself. Normally it's a sign of
> a bug. But in this case, both masks are stable, and we don't need
> atomicity at all.

Here we don't care about atomicity.

> 
> It's not about performance, it's about readability.

__cpumask_clear_cpu() and __cpumask_set_cpu() are more like private
helpers, and are harder to follow.

[@linux]$ git grep -n -w -E "cpumask_clear_cpu|cpumask_set_cpu" ./ | wc
    674    2055   53954
[@linux]$ git grep -n -w -E "__cpumask_clear_cpu|__cpumask_set_cpu" ./ | wc
     21      74    1580

I don't object to commenting the current usage, but NAK for this change.


Thanks,
Ming
  
Yury Norov Dec. 13, 2023, 5:03 p.m. UTC | #4
On Wed, Dec 13, 2023 at 08:14:45AM +0800, Ming Lei wrote:
> On Tue, Dec 12, 2023 at 08:52:14AM -0800, Yury Norov wrote:
> > On Tue, Dec 12, 2023 at 05:50:04PM +0800, Ming Lei wrote:
> > > On Mon, Dec 11, 2023 at 08:21:03PM -0800, Yury Norov wrote:
> > > > Because nmsk and irqmsk are stable, extra atomicity is not required.
> > > > 
> > > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > > ---
> > > >  lib/group_cpus.c | 8 ++++----
> > > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/lib/group_cpus.c b/lib/group_cpus.c
> > > > index 10dead3ab0e0..7ac94664230f 100644
> > > > --- a/lib/group_cpus.c
> > > > +++ b/lib/group_cpus.c
> > > > @@ -24,8 +24,8 @@ static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
> > > >  		if (cpu >= nr_cpu_ids)
> > > >  			return;
> > > >  
> > > > -		cpumask_clear_cpu(cpu, nmsk);
> > > > -		cpumask_set_cpu(cpu, irqmsk);
> > > > +		__cpumask_clear_cpu(cpu, nmsk);
> > > > +		__cpumask_set_cpu(cpu, irqmsk);
> > > >  		cpus_per_grp--;
> > > >  
> > > >  		/* If the cpu has siblings, use them first */
> > > > @@ -33,8 +33,8 @@ static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
> > > >  		sibl = cpu + 1;
> > > >  
> > > >  		for_each_cpu_and_from(sibl, siblmsk, nmsk) {
> > > > -			cpumask_clear_cpu(sibl, nmsk);
> > > > -			cpumask_set_cpu(sibl, irqmsk);
> > > > +			__cpumask_clear_cpu(sibl, nmsk);
> > > > +			__cpumask_set_cpu(sibl, irqmsk);
> > > 
> > > I think this kind of change should be avoided; here the code is
> > > clearly in a slow path, and we care about code cleanness and readability
> > > much more than the cycles saved from non-atomicity.
> > 
> > Atomic ops have special meaning and special function. This 'atomic' way
> > of moving a bit from one bitmap to another looks completely non-trivial
> > and puzzling to me.
> > 
> > A sequence of atomic ops is not atomic itself. Normally it's a sign of
> > a bug. But in this case, both masks are stable, and we don't need
> > atomicity at all.
> 
> Here we don't care about atomicity.
> 
> > 
> > It's not about performance, it's about readability.
> 
> __cpumask_clear_cpu() and __cpumask_set_cpu() are more like private
> helpers, and are harder to follow.

No, that's not true. The non-atomic version of the function is not a
private helper, of course.
 
> [@linux]$ git grep -n -w -E "cpumask_clear_cpu|cpumask_set_cpu" ./ | wc
>     674    2055   53954
> [@linux]$ git grep -n -w -E "__cpumask_clear_cpu|__cpumask_set_cpu" ./ | wc
>      21      74    1580
> 
> I don't object to commenting the current usage, but NAK for this change.

No problem, I'll add your NAK.
  
Ming Lei Dec. 14, 2023, 12:43 a.m. UTC | #5
On Wed, Dec 13, 2023 at 09:03:17AM -0800, Yury Norov wrote:
> On Wed, Dec 13, 2023 at 08:14:45AM +0800, Ming Lei wrote:
> > On Tue, Dec 12, 2023 at 08:52:14AM -0800, Yury Norov wrote:
> > > On Tue, Dec 12, 2023 at 05:50:04PM +0800, Ming Lei wrote:
> > > > On Mon, Dec 11, 2023 at 08:21:03PM -0800, Yury Norov wrote:
> > > > > Because nmsk and irqmsk are stable, extra atomicity is not required.
> > > > > 
> > > > > Signed-off-by: Yury Norov <yury.norov@gmail.com>
> > > > > ---
> > > > >  lib/group_cpus.c | 8 ++++----
> > > > >  1 file changed, 4 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/lib/group_cpus.c b/lib/group_cpus.c
> > > > > index 10dead3ab0e0..7ac94664230f 100644
> > > > > --- a/lib/group_cpus.c
> > > > > +++ b/lib/group_cpus.c
> > > > > @@ -24,8 +24,8 @@ static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
> > > > >  		if (cpu >= nr_cpu_ids)
> > > > >  			return;
> > > > >  
> > > > > -		cpumask_clear_cpu(cpu, nmsk);
> > > > > -		cpumask_set_cpu(cpu, irqmsk);
> > > > > +		__cpumask_clear_cpu(cpu, nmsk);
> > > > > +		__cpumask_set_cpu(cpu, irqmsk);
> > > > >  		cpus_per_grp--;
> > > > >  
> > > > >  		/* If the cpu has siblings, use them first */
> > > > > @@ -33,8 +33,8 @@ static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
> > > > >  		sibl = cpu + 1;
> > > > >  
> > > > >  		for_each_cpu_and_from(sibl, siblmsk, nmsk) {
> > > > > -			cpumask_clear_cpu(sibl, nmsk);
> > > > > -			cpumask_set_cpu(sibl, irqmsk);
> > > > > +			__cpumask_clear_cpu(sibl, nmsk);
> > > > > +			__cpumask_set_cpu(sibl, irqmsk);
> > > > 
> > > > I think this kind of change should be avoided; here the code is
> > > > clearly in a slow path, and we care about code cleanness and readability
> > > > much more than the cycles saved from non-atomicity.
> > > 
> > > Atomic ops have special meaning and special function. This 'atomic' way
> > > of moving a bit from one bitmap to another looks completely non-trivial
> > > and puzzling to me.
> > > 
> > > A sequence of atomic ops is not atomic itself. Normally it's a sign of
> > > a bug. But in this case, both masks are stable, and we don't need
> > > atomicity at all.
> > 
> > Here we don't care about atomicity.
> > 
> > > 
> > > It's not about performance, it's about readability.
> > 
> > __cpumask_clear_cpu() and __cpumask_set_cpu() are more like private
> > helpers, and are harder to follow.
> 
> No, that's not true. The non-atomic version of the function is not a
> private helper, of course.
>  
> > [@linux]$ git grep -n -w -E "cpumask_clear_cpu|cpumask_set_cpu" ./ | wc
> >     674    2055   53954
> > [@linux]$ git grep -n -w -E "__cpumask_clear_cpu|__cpumask_set_cpu" ./ | wc
> >      21      74    1580
> > 
> > I don't object to commenting the current usage, but NAK for this change.
> 
> No problem, I'll add your NAK.

You can add the following words in the meantime:

__cpumask_clear_cpu() and __cpumask_set_cpu() were added in commit 6c8557bdb28d
("smp, cpumask: Use non-atomic cpumask_{set,clear}_cpu()") for a fast code path
(smp_call_function_many()).

We have ~670 users of cpumask_clear_cpu() & cpumask_set_cpu(); many of them
fall into the same category as group_cpus.c (atomicity is not needed, not a fast
code path) and needn't be changed to __cpumask_clear_cpu() and __cpumask_set_cpu().
Otherwise, this change may encourage updating the others to the __cpumask_*
versions as well.


Thanks, 
Ming
  

Patch

diff --git a/lib/group_cpus.c b/lib/group_cpus.c
index 10dead3ab0e0..7ac94664230f 100644
--- a/lib/group_cpus.c
+++ b/lib/group_cpus.c
@@ -24,8 +24,8 @@  static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
 		if (cpu >= nr_cpu_ids)
 			return;
 
-		cpumask_clear_cpu(cpu, nmsk);
-		cpumask_set_cpu(cpu, irqmsk);
+		__cpumask_clear_cpu(cpu, nmsk);
+		__cpumask_set_cpu(cpu, irqmsk);
 		cpus_per_grp--;
 
 		/* If the cpu has siblings, use them first */
@@ -33,8 +33,8 @@  static void grp_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
 		sibl = cpu + 1;
 
 		for_each_cpu_and_from(sibl, siblmsk, nmsk) {
-			cpumask_clear_cpu(sibl, nmsk);
-			cpumask_set_cpu(sibl, irqmsk);
+			__cpumask_clear_cpu(sibl, nmsk);
+			__cpumask_set_cpu(sibl, irqmsk);
 			if (cpus_per_grp-- == 0)
 				return;
 		}