[RESEND,v2,2/5] sbitmap: remove redundant check in __sbitmap_queue_get_batch

Message ID 20221222143353.598042-3-shikemeng@huaweicloud.com
State New
Headers
Series A few bugfix and cleanup patches for sbitmap |

Commit Message

Kemeng Shi Dec. 22, 2022, 2:33 p.m. UTC
  Commit fbb564a557809 ("lib/sbitmap: Fix invalid loop in
__sbitmap_queue_get_batch()") mentioned that "Checking free bits when
setting the target bits. Otherwise, it may reuse the busying bits."
This commit add check to make sure all masked bits in word before
cmpxchg is zero. Then the existing check after cmpxchg to check any
zero bit is existing in masked bits in word is redundant.

Actually, old value of word before cmpxchg is stored in val and we
will filter out busy bits in val by "(get_mask & ~val)" after cmpxchg.
So we will not reuse busy bits methioned in commit fbb564a557809
("lib/sbitmap: Fix invalid loop in __sbitmap_queue_get_batch()"). Revert
new-added check to remove redundant check.

Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>
---
 lib/sbitmap.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)
  

Comments

Jan Kara Dec. 22, 2022, 11:23 a.m. UTC | #1
On Thu 22-12-22 22:33:50, Kemeng Shi wrote:
> Commit fbb564a557809 ("lib/sbitmap: Fix invalid loop in
> __sbitmap_queue_get_batch()") mentioned that "Checking free bits when
> setting the target bits. Otherwise, it may reuse the busying bits."
> This commit add check to make sure all masked bits in word before
> cmpxchg is zero. Then the existing check after cmpxchg to check any
> zero bit is existing in masked bits in word is redundant.
> 
> Actually, old value of word before cmpxchg is stored in val and we
> will filter out busy bits in val by "(get_mask & ~val)" after cmpxchg.
> So we will not reuse busy bits methioned in commit fbb564a557809
> ("lib/sbitmap: Fix invalid loop in __sbitmap_queue_get_batch()"). Revert
> new-added check to remove redundant check.
> 
> Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com>

...

> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> index cb5e03a2d65b..11e75f4040fb 100644
> --- a/lib/sbitmap.c
> +++ b/lib/sbitmap.c
> @@ -518,11 +518,9 @@ unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags,
>  
>  			get_mask = ((1UL << nr_tags) - 1) << nr;
>  			val = READ_ONCE(map->word);
> -			do {
> -				if ((val & ~get_mask) != val)
> -					goto next;
> -			} while (!atomic_long_try_cmpxchg(ptr, &val,
> -							  get_mask | val));
> +			while (!atomic_long_try_cmpxchg(ptr, &val,
> +							  get_mask | val))
> +				;
>  			get_mask = (get_mask & ~val) >> nr;
>  			if (get_mask) {
>  				*offset = nr + (index << sb->shift);

So I agree this will result in correct behavior but it can change
performance. In the original code, we end up doing
atomic_long_try_cmpxchg() only for words where we have a chance of getting
all tags allocated. Now we just accept any word where we could allocate at
least one bit. Frankly the original code looks rather restrictive and also
the fact that we look only from the first zero bit in the word looks
unnecessarily restrictive so maybe I miss some details about what's
expected from __sbitmap_queue_get_batch(). So all in all I wanted to point
out this needs more scrutiny from someone understanding better expectations
from __sbitmap_queue_get_batch().

								Honza
  
Kemeng Shi Dec. 22, 2022, 11:49 a.m. UTC | #2
Hi Jan, thanks for review.
on 12/22/2022 7:23 PM, Jan Kara wrote:
>> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
>> index cb5e03a2d65b..11e75f4040fb 100644
>> --- a/lib/sbitmap.c
>> +++ b/lib/sbitmap.c
>> @@ -518,11 +518,9 @@ unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags,
>>  
>>  			get_mask = ((1UL << nr_tags) - 1) << nr;
>>  			val = READ_ONCE(map->word);
>> -			do {
>> -				if ((val & ~get_mask) != val)
>> -					goto next;
>> -			} while (!atomic_long_try_cmpxchg(ptr, &val,
>> -							  get_mask | val));
>> +			while (!atomic_long_try_cmpxchg(ptr, &val,
>> +							  get_mask | val))
>> +				;
>>  			get_mask = (get_mask & ~val) >> nr;
>>  			if (get_mask) {
>>  				*offset = nr + (index << sb->shift);
> 
> So I agree this will result in correct behavior but it can change
> performance. In the original code, we end up doing
> atomic_long_try_cmpxchg() only for words where we have a chance of getting
> all tags allocated. Now we just accept any word where we could allocate at
> least one bit. Frankly the original code looks rather restrictive and also
> the fact that we look only from the first zero bit in the word looks
> unnecessarily restrictive so maybe I miss some details about what's
> expected from __sbitmap_queue_get_batch(). So all in all I wanted to point
> out this needs more scrutiny from someone understanding better expectations
> from __sbitmap_queue_get_batch().
In the very beginning, __sbitmap_queue_get_batch will return if we only
get partial tags allocated. Recent commit fbb564a557809 ("lib/sbitmap: Fix
invalid loop in __sbitmap_queue_get_batch()") thought we may reuse busying
bits in old codes and change behavior of __sbitmap_queue_get_batch() to get
all tags. However we will not reuse busying bits in old codes actually. So
I try to revert this wrong fix and keep the behavior of
__sbitmap_queue_get_batch() as it designed to be at beginning.

Besides, if we keep to get all tags,the check below is redundant.
	get_mask = (get_mask & ~ret) >> nr;
	if (get_mask) {
		...
	}
As we only reach here if we get all tags and the check above will always
pass. So this check in old codes should be removed.
  
Jan Kara Dec. 22, 2022, 12:16 p.m. UTC | #3
[Added to CC original author of the problematic commit and reviewer]

On Thu 22-12-22 19:49:12, Kemeng Shi wrote:
> Hi Jan, thanks for review.
> on 12/22/2022 7:23 PM, Jan Kara wrote:
> >> diff --git a/lib/sbitmap.c b/lib/sbitmap.c
> >> index cb5e03a2d65b..11e75f4040fb 100644
> >> --- a/lib/sbitmap.c
> >> +++ b/lib/sbitmap.c
> >> @@ -518,11 +518,9 @@ unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags,
> >>  
> >>  			get_mask = ((1UL << nr_tags) - 1) << nr;
> >>  			val = READ_ONCE(map->word);
> >> -			do {
> >> -				if ((val & ~get_mask) != val)
> >> -					goto next;
> >> -			} while (!atomic_long_try_cmpxchg(ptr, &val,
> >> -							  get_mask | val));
> >> +			while (!atomic_long_try_cmpxchg(ptr, &val,
> >> +							  get_mask | val))
> >> +				;
> >>  			get_mask = (get_mask & ~val) >> nr;
> >>  			if (get_mask) {
> >>  				*offset = nr + (index << sb->shift);
> > 
> > So I agree this will result in correct behavior but it can change
> > performance. In the original code, we end up doing
> > atomic_long_try_cmpxchg() only for words where we have a chance of getting
> > all tags allocated. Now we just accept any word where we could allocate at
> > least one bit. Frankly the original code looks rather restrictive and also
> > the fact that we look only from the first zero bit in the word looks
> > unnecessarily restrictive so maybe I miss some details about what's
> > expected from __sbitmap_queue_get_batch(). So all in all I wanted to point
> > out this needs more scrutiny from someone understanding better expectations
> > from __sbitmap_queue_get_batch().
> In the very beginning, __sbitmap_queue_get_batch will return if we only
> get partial tags allocated. Recent commit fbb564a557809 ("lib/sbitmap: Fix
> invalid loop in __sbitmap_queue_get_batch()") thought we may reuse busying
> bits in old codes and change behavior of __sbitmap_queue_get_batch() to get
> all tags. However we will not reuse busying bits in old codes actually. So
> I try to revert this wrong fix and keep the behavior of
> __sbitmap_queue_get_batch() as it designed to be at beginning.

I see and now I agree. Please add tag:

Fixes: fbb564a557809 ("lib/sbitmap: Fix invalid loop in __sbitmap_queue_get_batch()") 

to your commit. Also feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>
 
> Besides, if we keep to get all tags,the check below is redundant.
> 	get_mask = (get_mask & ~ret) >> nr;
> 	if (get_mask) {
> 		...
> 	}
> As we only reach here if we get all tags and the check above will always
> pass. So this check in old codes should be removed.

Yeah, I agree.

								Honza
  

Patch

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index cb5e03a2d65b..11e75f4040fb 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -518,11 +518,9 @@  unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags,
 
 			get_mask = ((1UL << nr_tags) - 1) << nr;
 			val = READ_ONCE(map->word);
-			do {
-				if ((val & ~get_mask) != val)
-					goto next;
-			} while (!atomic_long_try_cmpxchg(ptr, &val,
-							  get_mask | val));
+			while (!atomic_long_try_cmpxchg(ptr, &val,
+							  get_mask | val))
+				;
 			get_mask = (get_mask & ~val) >> nr;
 			if (get_mask) {
 				*offset = nr + (index << sb->shift);