[v6,1/5] lib/bitmap: add bitmap_{read,write}()

Message ID 20231006134529.2816540-2-glider@google.com
State: New
Series: Implement MTE tag compression for swapped pages

Commit Message

Alexander Potapenko Oct. 6, 2023, 1:45 p.m. UTC
  From: Syed Nayyar Waris <syednwaris@gmail.com>

The two new functions allow reading/writing values of length up to
BITS_PER_LONG bits at an arbitrary position in the bitmap.

The code was taken from "bitops: Introduce the for_each_set_clump macro"
by Syed Nayyar Waris with a number of changes and simplifications:
 - instead of using roundup(), which adds an unnecessary dependency
   on <linux/math.h>, we calculate space as BITS_PER_LONG-offset;
 - indentation is reduced by not using else-clauses (suggested by
   checkpatch for bitmap_get_value());
 - bitmap_get_value()/bitmap_set_value() are renamed to bitmap_read()
   and bitmap_write();
 - some redundant computations are omitted.

Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Syed Nayyar Waris <syednwaris@gmail.com>
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/
Suggested-by: Yury Norov <yury.norov@gmail.com>
Co-developed-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Alexander Potapenko <glider@google.com>

---
This patch was previously called "lib/bitmap: add
bitmap_{set,get}_value()"
(https://lore.kernel.org/lkml/20230720173956.3674987-2-glider@google.com/)

v6:
 - As suggested by Yury Norov, do not require bitmap_read(..., 0) to
   return 0.

v5:
 - Address comments by Yury Norov:
   - updated code comments and patch title/description
   - replace GENMASK(nbits - 1, 0) with BITMAP_LAST_WORD_MASK(nbits)
   - more compact bitmap_write() implementation

v4:
 - Address comments by Andy Shevchenko and Yury Norov:
   - prevent passing values >= 64 to GENMASK()
   - fix commit authorship
   - change comments
   - check for unlikely(nbits==0)
   - drop unnecessary const declarations
   - fix kernel-doc comments
   - rename bitmap_{get,set}_value() to bitmap_{read,write}()
---
 include/linux/bitmap.h | 68 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)
  

Comments

Andy Shevchenko Oct. 6, 2023, 2:47 p.m. UTC | #1
On Fri, Oct 06, 2023 at 03:45:25PM +0200, Alexander Potapenko wrote:
> From: Syed Nayyar Waris <syednwaris@gmail.com>
> 
> The two new functions allow reading/writing values of length up to
> BITS_PER_LONG bits at arbitrary position in the bitmap.
> 
> The code was taken from "bitops: Introduce the for_each_set_clump macro"
> by Syed Nayyar Waris with a number of changes and simplifications:
>  - instead of using roundup(), which adds an unnecessary dependency
>    on <linux/math.h>, we calculate space as BITS_PER_LONG-offset;
>  - indentation is reduced by not using else-clauses (suggested by
>    checkpatch for bitmap_get_value());
>  - bitmap_get_value()/bitmap_set_value() are renamed to bitmap_read()
>    and bitmap_write();
>  - some redundant computations are omitted.

...

> v6:
>  - As suggested by Yury Norov, do not require bitmap_read(..., 0) to
>    return 0.

Hmm... See below.

...

>   *  bitmap_to_arr32(buf, src, nbits)            Copy nbits from buf to u32[] dst
>   *  bitmap_to_arr64(buf, src, nbits)            Copy nbits from buf to u64[] dst

With the grouping as below I would add a blank line here. But was the intention
to add the _arrXX() helpers to these groups?

>   *  bitmap_get_value8(map, start)               Get 8bit value from map at start
> + *  bitmap_read(map, start, nbits)              Read an nbits-sized value from
> + *                                              map at start
>   *  bitmap_set_value8(map, value, start)        Set 8bit value to map at start
> + *  bitmap_write(map, value, start, nbits)      Write an nbits-sized value to
> + *                                              map at start

...

> +static inline unsigned long bitmap_read(const unsigned long *map,
> +					unsigned long start,
> +					unsigned long nbits)
> +{
> +	size_t index = BIT_WORD(start);
> +	unsigned long offset = start % BITS_PER_LONG;
> +	unsigned long space = BITS_PER_LONG - offset;
> +	unsigned long value_low, value_high;

> +	if (unlikely(!nbits))
> +		return 0;

Hmm... I didn't get it: was the comment to add or to remove these checks?

> +	if (space >= nbits)
> +		return (map[index] >> offset) & GENMASK(nbits - 1, 0);

And don't you want to replace this GENMASK() as well?

> +	value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
> +	value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
> +	return (value_low >> offset) | (value_high << space);
> +}
  
Yury Norov Oct. 6, 2023, 4:53 p.m. UTC | #2
On Fri, Oct 06, 2023 at 05:47:49PM +0300, Andy Shevchenko wrote:
> On Fri, Oct 06, 2023 at 03:45:25PM +0200, Alexander Potapenko wrote:
> > From: Syed Nayyar Waris <syednwaris@gmail.com>
> > 
> > The two new functions allow reading/writing values of length up to
> > BITS_PER_LONG bits at arbitrary position in the bitmap.
> > 
> > The code was taken from "bitops: Introduce the for_each_set_clump macro"
> > by Syed Nayyar Waris with a number of changes and simplifications:
> >  - instead of using roundup(), which adds an unnecessary dependency
> >    on <linux/math.h>, we calculate space as BITS_PER_LONG-offset;
> >  - indentation is reduced by not using else-clauses (suggested by
> >    checkpatch for bitmap_get_value());
> >  - bitmap_get_value()/bitmap_set_value() are renamed to bitmap_read()
> >    and bitmap_write();
> >  - some redundant computations are omitted.
> 
> ...
> 
> > v6:
> >  - As suggested by Yury Norov, do not require bitmap_read(..., 0) to
> >    return 0.
> 
> Hmm... See below.

 [...]
 
> > +static inline unsigned long bitmap_read(const unsigned long *map,
> > +					unsigned long start,
> > +					unsigned long nbits)
> > +{
> > +	size_t index = BIT_WORD(start);
> > +	unsigned long offset = start % BITS_PER_LONG;
> > +	unsigned long space = BITS_PER_LONG - offset;
> > +	unsigned long value_low, value_high;
> 
> > +	if (unlikely(!nbits))
> > +		return 0;
> 
> Hmm... I didn't get was the comment to add or to remove these checks?

The sentence relates to the test, and the comment that confused you
should go to the 2nd patch.

I.e., bitmap_read(..., 0) is not required to return 0; it's purely
an implementation detail.

> 
> > +	if (space >= nbits)
> > +		return (map[index] >> offset) & GENMASK(nbits - 1, 0);
> 
> And don't you want to replace this GENMASK() as well?

+1

> > +	value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
> > +	value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
> > +	return (value_low >> offset) | (value_high << space);
> > +}
> 
> -- 
> With Best Regards,
> Andy Shevchenko
>
  
Yury Norov Oct. 6, 2023, 10:35 p.m. UTC | #3
On Fri, Oct 06, 2023 at 03:45:25PM +0200, Alexander Potapenko wrote:
> From: Syed Nayyar Waris <syednwaris@gmail.com>
> 
> The two new functions allow reading/writing values of length up to
> BITS_PER_LONG bits at arbitrary position in the bitmap.
> 
> The code was taken from "bitops: Introduce the for_each_set_clump macro"
> by Syed Nayyar Waris with a number of changes and simplifications:
>  - instead of using roundup(), which adds an unnecessary dependency
>    on <linux/math.h>, we calculate space as BITS_PER_LONG-offset;
>  - indentation is reduced by not using else-clauses (suggested by
>    checkpatch for bitmap_get_value());
>  - bitmap_get_value()/bitmap_set_value() are renamed to bitmap_read()
>    and bitmap_write();
>  - some redundant computations are omitted.
> 
> Cc: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Syed Nayyar Waris <syednwaris@gmail.com>
> Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
> Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/
> Suggested-by: Yury Norov <yury.norov@gmail.com>
> Co-developed-by: Alexander Potapenko <glider@google.com>
> Signed-off-by: Alexander Potapenko <glider@google.com>
> 
> ---
> This patch was previously called "lib/bitmap: add
> bitmap_{set,get}_value()"
> (https://lore.kernel.org/lkml/20230720173956.3674987-2-glider@google.com/)
> 
> v6:
>  - As suggested by Yury Norov, do not require bitmap_read(..., 0) to
>    return 0.
> 
> v5:
>  - Address comments by Yury Norov:
>    - updated code comments and patch title/description
>    - replace GENMASK(nbits - 1, 0) with BITMAP_LAST_WORD_MASK(nbits)
>    - more compact bitmap_write() implementation
> 
> v4:
>  - Address comments by Andy Shevchenko and Yury Norov:
>    - prevent passing values >= 64 to GENMASK()
>    - fix commit authorship
>    - change comments
>    - check for unlikely(nbits==0)
>    - drop unnecessary const declarations
>    - fix kernel-doc comments
>    - rename bitmap_{get,set}_value() to bitmap_{read,write}()
> ---
>  include/linux/bitmap.h | 68 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 68 insertions(+)
> 
> diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> index 03644237e1efb..e72c054d21d48 100644
> --- a/include/linux/bitmap.h
> +++ b/include/linux/bitmap.h
> @@ -76,7 +76,11 @@ struct device;
>   *  bitmap_to_arr32(buf, src, nbits)            Copy nbits from buf to u32[] dst
>   *  bitmap_to_arr64(buf, src, nbits)            Copy nbits from buf to u64[] dst
>   *  bitmap_get_value8(map, start)               Get 8bit value from map at start
> + *  bitmap_read(map, start, nbits)              Read an nbits-sized value from
> + *                                              map at start
>   *  bitmap_set_value8(map, value, start)        Set 8bit value to map at start
> + *  bitmap_write(map, value, start, nbits)      Write an nbits-sized value to
> + *                                              map at start
>   *
>   * Note, bitmap_zero() and bitmap_fill() operate over the region of
>   * unsigned longs, that is, bits behind bitmap till the unsigned long
> @@ -583,6 +587,33 @@ static inline unsigned long bitmap_get_value8(const unsigned long *map,
>  	return (map[index] >> offset) & 0xFF;
>  }
>  
> +/**
> + * bitmap_read - read a value of n-bits from the memory region
> + * @map: address to the bitmap memory region
> + * @start: bit offset of the n-bit value
> + * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG
> + *
> + * Returns: value of nbits located at the @start bit offset within the @map
> + * memory region.
> + */
> +static inline unsigned long bitmap_read(const unsigned long *map,
> +					unsigned long start,
> +					unsigned long nbits)
> +{
> +	size_t index = BIT_WORD(start);
> +	unsigned long offset = start % BITS_PER_LONG;
> +	unsigned long space = BITS_PER_LONG - offset;
> +	unsigned long value_low, value_high;
> +
> +	if (unlikely(!nbits))
> +		return 0;
> +	if (space >= nbits)
> +		return (map[index] >> offset) & GENMASK(nbits - 1, 0);
> +	value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
> +	value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
> +	return (value_low >> offset) | (value_high << space);
> +}
> +
>  /**
>   * bitmap_set_value8 - set an 8-bit value within a memory region
>   * @map: address to the bitmap memory region
> @@ -599,6 +630,43 @@ static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
>  	map[index] |= value << offset;
>  }
>  
> +/**
> + * bitmap_write - write n-bit value within a memory region
> + * @map: address to the bitmap memory region
> + * @value: value to write, clamped to nbits
> + * @start: bit offset of the n-bit value
> + * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG.
> + *
> + * bitmap_write() behaves similarly to @nbits calls of assign_bit(), i.e. bits
> + * beyond @nbits are ignored:
> + *
> + *   for (bit = 0; bit < nbits; bit++)
> + *           assign_bit(start + bit, bitmap, val & BIT(bit));

__assign_bit()

> + */

'behaves similarly' sounds like an understatement. I think it behaves
much faster because it can assign up to 64 bits at once, not to mention
the reduced pressure on cache-line traffic.

How much faster - that's a good question. I'd be really pleased if you add
a performance test for bitmap_write/read. Or I can do it myself later.
You can find examples in the same lib/test_bitmap.c.

> +static inline void bitmap_write(unsigned long *map,
> +				unsigned long value,
> +				unsigned long start, unsigned long nbits)
> +{
> +	size_t index = BIT_WORD(start);
> +	unsigned long offset = start % BITS_PER_LONG;
> +	unsigned long space = BITS_PER_LONG - offset;
> +	unsigned long mask;
> +
> +	if (unlikely(!nbits))
> +		return;

can you please add more empty lines to separate blocks visually?

> +	mask = BITMAP_LAST_WORD_MASK(nbits);
> +	value &= mask;
> +	if (space >= nbits) {
> +		map[index] &= ~(mask << offset);
> +		map[index] |= value << offset;
> +		return;
> +	}
> +	map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
> +	map[index] |= value << offset;
> +	map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> +	map[index + 1] |= (value >> space);
> +}

I compiled the below fix on a sparc64 BE machine:

--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -608,7 +608,7 @@ static inline unsigned long bitmap_read(const unsigned long *map,
        if (unlikely(!nbits))
                return 0;
        if (space >= nbits)
-               return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+               return (map[index] >> offset) & BITMAP_LAST_WORD_MASK(nbits);
        value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
        value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
        return (value_low >> offset) | (value_high << space);
@@ -661,9 +661,9 @@ static inline void bitmap_write(unsigned long *map,
                map[index] |= value << offset;
                return;
        }
-       map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
+       map[index] &= BITMAP_LAST_WORD_MASK(start);
        map[index] |= value << offset;
-       map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
+       map[index + 1] &= BITMAP_FIRST_WORD_MASK(start + nbits);
        map[index + 1] |= (value >> space);
 }

All the tests pass just as before, and there's no difference
reported by bloat-o-meter. Can you please use the non-negated versions as
they are more straightforward?

> +
>  #endif /* __ASSEMBLY__ */
>  
>  #endif /* __LINUX_BITMAP_H */
> -- 
> 2.42.0.609.gbb76f46606-goog
  
Alexander Potapenko Oct. 10, 2023, 8:16 a.m. UTC | #4
On Fri, Oct 6, 2023 at 4:48 PM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Fri, Oct 06, 2023 at 03:45:25PM +0200, Alexander Potapenko wrote:
> > From: Syed Nayyar Waris <syednwaris@gmail.com>
> >
> > The two new functions allow reading/writing values of length up to
> > BITS_PER_LONG bits at arbitrary position in the bitmap.
> >
> > The code was taken from "bitops: Introduce the for_each_set_clump macro"
> > by Syed Nayyar Waris with a number of changes and simplifications:
> >  - instead of using roundup(), which adds an unnecessary dependency
> >    on <linux/math.h>, we calculate space as BITS_PER_LONG-offset;
> >  - indentation is reduced by not using else-clauses (suggested by
> >    checkpatch for bitmap_get_value());
> >  - bitmap_get_value()/bitmap_set_value() are renamed to bitmap_read()
> >    and bitmap_write();
> >  - some redundant computations are omitted.
>
> ...
>
> > v6:
> >  - As suggested by Yury Norov, do not require bitmap_read(..., 0) to
> >    return 0.
>
> Hmm... See below.
>
> ...
>
> >   *  bitmap_to_arr32(buf, src, nbits)            Copy nbits from buf to u32[] dst
> >   *  bitmap_to_arr64(buf, src, nbits)            Copy nbits from buf to u64[] dst
>
> With the grouping as below I would add a blank line here. But was the intention
> to group _arrXX() to these groups?

Note that there isn't a single blank line in this long list.
What if I swap bitmap_read() with bitmap_set_value8() - would the
grouping become more logical?

I.e.
*  bitmap_get_value8(map, start)               Get 8bit value from map at start
*  bitmap_set_value8(map, value, start)        Set 8bit value to map at start
*  bitmap_read(map, start, nbits)              Read an nbits-sized value from
*                                              map at start
*  bitmap_write(map, value, start, nbits)      Write an nbits-sized value to
*                                              map at start

> > +     if (unlikely(!nbits))
> > +             return 0;
>
> Hmm... I didn't get was the comment to add or to remove these checks?

As Yury said, we should not require the return value to be 0, so I
added "nonzero" to the descriptions of the @nbits parameter.
The check stays in place, but relying on it is now considered a mistake.

>
> > +     if (space >= nbits)
> > +             return (map[index] >> offset) & GENMASK(nbits - 1, 0);
>
> And don't you want to replace this GENMASK() as well?

See my next reply to Yury, tl;dr this is a stale code version :(
  
Alexander Potapenko Oct. 10, 2023, 9:17 a.m. UTC | #5
> > + *
> > + *   for (bit = 0; bit < nbits; bit++)
> > + *           assign_bit(start + bit, bitmap, val & BIT(bit));
>
> __assign_bit()

Ack

>
> > + */
>
> 'behaves similarly' sounds like an understatement. I think, it behaves
> much faster because it can assign up to 64 bits at once, not mentioning
> the pressure on cache lines traffic.

My intent was to describe the visible behavior; of course the
generated code is better and the number of memory accesses is lower.

How about the following description:

 * The result of bitmap_write() is similar to @nbits calls of assign_bit(), i.e.
 * bits beyond @nbits are ignored:
 *
 *   for (bit = 0; bit < nbits; bit++)
 *           assign_bit(start + bit, bitmap, val & BIT(bit));

?

>
> How faster - that's a good question. I'd be really pleased if you add
> a performance test for bitmap_write/read. Or I can do it myself later.
> You can find examples in the same lib/test_bitmap.c.

I can add two separate functions doing some bitmap_read and
bitmap_write calls in a loop to measure their performance
independently - along the lines of what you did here:
https://lore.kernel.org/lkml/ZL9X0TZb%2FQhCbEiC@yury-ThinkPad/


> > +     if (unlikely(!nbits))
> > +             return;
>
> can you please add more empty lines to separate blocks visually?

Sure, will do.

>
> > +     mask = BITMAP_LAST_WORD_MASK(nbits);
> > +     value &= mask;
> > +     if (space >= nbits) {
> > +             map[index] &= ~(mask << offset);
> > +             map[index] |= value << offset;
> > +             return;
> > +     }
> > +     map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
> > +     map[index] |= value << offset;
> > +     map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > +     map[index + 1] |= (value >> space);
> > +}
>
> I compiled the below fix on a sparc64 BE machine:
>
> --- a/include/linux/bitmap.h
> +++ b/include/linux/bitmap.h
> @@ -608,7 +608,7 @@ static inline unsigned long bitmap_read(const unsigned long *map,
>         if (unlikely(!nbits))
>                 return 0;
>         if (space >= nbits)
> -               return (map[index] >> offset) & GENMASK(nbits - 1, 0);
> +               return (map[index] >> offset) & BITMAP_LAST_WORD_MASK(nbits);
>         value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
>         value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
>         return (value_low >> offset) | (value_high << space);
> @@ -661,9 +661,9 @@ static inline void bitmap_write(unsigned long *map,
>                 map[index] |= value << offset;
>                 return;
>         }
> -       map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
> +       map[index] &= BITMAP_LAST_WORD_MASK(start);
>         map[index] |= value << offset;
> -       map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> +       map[index + 1] &= BITMAP_FIRST_WORD_MASK(start + nbits);
>         map[index + 1] |= (value >> space);
>  }
>
> All the tests pass just as before, and there's no difference
> reported by bloat-o-meter. Can you please use the non-negated versions as
> they are more straightforward?

I am terribly sorry for this, but this code version is completely
different from what we've discussed here:
https://lore.kernel.org/lkml/CAG_fn=VrPJj6YowHNki5RGAAs8qvwZpUVN4K9qw=cf4aW7Qw9A@mail.gmail.com/

The correct version roughly looks as follows:

void bitmap_write(unsigned long *map, unsigned long value,
                      unsigned long start, unsigned long nbits)
{
        unsigned long offset, space, mask;
        size_t index;
        bool fit;

        if (unlikely(!nbits))
                return;

        mask = BITMAP_LAST_WORD_MASK(nbits);
        value &= mask;
        offset = start % BITS_PER_LONG;
        space = BITS_PER_LONG - offset;
        index = BIT_WORD(start);
        fit = space >= nbits;

        map[index] &= (fit ? ~(mask << offset) :
                             ~BITMAP_FIRST_WORD_MASK(start));
        map[index] |= value << offset;
        if (fit)
                return;

        map[index + 1] &= BITMAP_FIRST_WORD_MASK(start + nbits);
        map[index + 1] |= (value >> space);
}

Note that here replacing ~BITMAP_FIRST_WORD_MASK() with
BITMAP_LAST_WORD_MASK() costs us 25 bytes (116 bytes with
~BITMAP_FIRST_WORD_MASK() vs. 141 bytes with BITMAP_LAST_WORD_MASK()).
  
Alexander Potapenko Oct. 10, 2023, 9:43 a.m. UTC | #6
> >
> > > +     if (space >= nbits)
> > > +             return (map[index] >> offset) & GENMASK(nbits - 1, 0);
> >
> > And don't you want to replace this GENMASK() as well?
>
> See my next reply to Yury, tl;dr this is a stale code version :(

Actually, no, we haven't discussed improvements to bitmap_read(), and
this is the correct version.
I'll fix this one.
  
Rasmus Villemoes Oct. 10, 2023, 11:03 a.m. UTC | #7
On 10/10/2023 11.17, Alexander Potapenko wrote:

>> 'behaves similarly' sounds like an understatement. I think, it behaves
>> much faster because it can assign up to 64 bits at once, not mentioning
>> the pressure on cache lines traffic.
> 
> My intent was to describe the visible behavior, of course the
> generated code is better, and the number of memory accesses lower.
> 
> How about the following description:
> 
>  * The result of bitmap_write() is similar to @nbits calls of assign_bit(), i.e.
>  * bits beyond @nbits are ignored:
>  *
>  *   for (bit = 0; bit < nbits; bit++)
>  *           assign_bit(start + bit, bitmap, val & BIT(bit));
> 
> ?

C programmers should know the meaning of the term "as-if". Perhaps use
that. Smth like "bitmap_write() behaves as-if implemented as @nbits
calls of __assign_bit()".

Rasmus
  
Alexander Potapenko Oct. 10, 2023, 12:14 p.m. UTC | #8
On Tue, Oct 10, 2023 at 1:03 PM Rasmus Villemoes
<linux@rasmusvillemoes.dk> wrote:
>
> On 10/10/2023 11.17, Alexander Potapenko wrote:
>
> >> 'behaves similarly' sounds like an understatement. I think, it behaves
> >> much faster because it can assign up to 64 bits at once, not mentioning
> >> the pressure on cache lines traffic.
> >
> > My intent was to describe the visible behavior, of course the
> > generated code is better, and the number of memory accesses lower.
> >
> > How about the following description:
> >
> >  * The result of bitmap_write() is similar to @nbits calls of assign_bit(), i.e.
> >  * bits beyond @nbits are ignored:
> >  *
> >  *   for (bit = 0; bit < nbits; bit++)
> >  *           assign_bit(start + bit, bitmap, val & BIT(bit));
> >
> > ?
>
> C programmers should know the meaning of the term "as-if". Perhaps use
> that. Smth like "bitmap_write() behaves as-if implemented as @nbits
> calls of __assign_bit()".

Good idea, I'll go with this one.
Thank you!
  

Patch

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 03644237e1efb..e72c054d21d48 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -76,7 +76,11 @@  struct device;
  *  bitmap_to_arr32(buf, src, nbits)            Copy nbits from buf to u32[] dst
  *  bitmap_to_arr64(buf, src, nbits)            Copy nbits from buf to u64[] dst
  *  bitmap_get_value8(map, start)               Get 8bit value from map at start
+ *  bitmap_read(map, start, nbits)              Read an nbits-sized value from
+ *                                              map at start
  *  bitmap_set_value8(map, value, start)        Set 8bit value to map at start
+ *  bitmap_write(map, value, start, nbits)      Write an nbits-sized value to
+ *                                              map at start
  *
  * Note, bitmap_zero() and bitmap_fill() operate over the region of
  * unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -583,6 +587,33 @@  static inline unsigned long bitmap_get_value8(const unsigned long *map,
 	return (map[index] >> offset) & 0xFF;
 }
 
+/**
+ * bitmap_read - read a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG
+ *
+ * Returns: value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+static inline unsigned long bitmap_read(const unsigned long *map,
+					unsigned long start,
+					unsigned long nbits)
+{
+	size_t index = BIT_WORD(start);
+	unsigned long offset = start % BITS_PER_LONG;
+	unsigned long space = BITS_PER_LONG - offset;
+	unsigned long value_low, value_high;
+
+	if (unlikely(!nbits))
+		return 0;
+	if (space >= nbits)
+		return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+	value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+	value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
+	return (value_low >> offset) | (value_high << space);
+}
+
 /**
  * bitmap_set_value8 - set an 8-bit value within a memory region
  * @map: address to the bitmap memory region
@@ -599,6 +630,43 @@  static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
 	map[index] |= value << offset;
 }
 
+/**
+ * bitmap_write - write n-bit value within a memory region
+ * @map: address to the bitmap memory region
+ * @value: value to write, clamped to nbits
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG.
+ *
+ * bitmap_write() behaves similarly to @nbits calls of assign_bit(), i.e. bits
+ * beyond @nbits are ignored:
+ *
+ *   for (bit = 0; bit < nbits; bit++)
+ *           assign_bit(start + bit, bitmap, val & BIT(bit));
+ */
+static inline void bitmap_write(unsigned long *map,
+				unsigned long value,
+				unsigned long start, unsigned long nbits)
+{
+	size_t index = BIT_WORD(start);
+	unsigned long offset = start % BITS_PER_LONG;
+	unsigned long space = BITS_PER_LONG - offset;
+	unsigned long mask;
+
+	if (unlikely(!nbits))
+		return;
+	mask = BITMAP_LAST_WORD_MASK(nbits);
+	value &= mask;
+	if (space >= nbits) {
+		map[index] &= ~(mask << offset);
+		map[index] |= value << offset;
+		return;
+	}
+	map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
+	map[index] |= value << offset;
+	map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
+	map[index + 1] |= (value >> space);
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __LINUX_BITMAP_H */