[v3,1/5] lib/bitmap: add bitmap_{set,get}_value()

Message ID 20230717113709.328671-2-glider@google.com
State New
Series Implement MTE tag compression for swapped pages

Commit Message

Alexander Potapenko July 17, 2023, 11:37 a.m. UTC
  The two new functions allow setting/getting values of length up to
BITS_PER_LONG bits at arbitrary position in the bitmap.

The code was taken from "bitops: Introduce the for_each_set_clump macro"
by Syed Nayyar Waris with a couple of minor changes:
 - instead of using roundup(), which adds an unnecessary dependency
   on <linux/math.h>, we calculate space as BITS_PER_LONG-offset;
 - indentation is reduced by not using else-clauses (suggested by
   checkpatch for bitmap_get_value())

Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Syed Nayyar Waris <syednwaris@gmail.com>
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/
Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Potapenko <glider@google.com>
---
 include/linux/bitmap.h | 57 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)
  

Comments

Andy Shevchenko July 17, 2023, 1:01 p.m. UTC | #1
On Mon, Jul 17, 2023 at 01:37:04PM +0200, Alexander Potapenko wrote:
> The two new functions allow setting/getting values of length up to
> BITS_PER_LONG bits at arbitrary position in the bitmap.
> 
> The code was taken from "bitops: Introduce the for_each_set_clump macro"
> by Syed Nayyar Waris with a couple of minor changes:

Since the changes are minor, please make sure that the authorship is kept
untouched.

>  - instead of using roundup(), which adds an unnecessary dependency
>    on <linux/math.h>, we calculate space as BITS_PER_LONG-offset;
>  - indentation is reduced by not using else-clauses (suggested by
>    checkpatch for bitmap_get_value())

> Cc: Arnd Bergmann <arnd@arndb.de>

You can use --cc to `git send-email` instead of polluting the commit message.

> Signed-off-by: Syed Nayyar Waris <syednwaris@gmail.com>
> Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
> Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/
> Suggested-by: Yury Norov <yury.norov@gmail.com>
> Signed-off-by: Alexander Potapenko <glider@google.com>

With the above, I think you can also add Co-developed-by (as the changes
were made).

...

> +static inline void bitmap_set_value(unsigned long *map,
> +				    unsigned long value,
> +				    unsigned long start, unsigned long nbits)
> +{
> +	const size_t index = BIT_WORD(start);
> +	const unsigned long offset = start % BITS_PER_LONG;
> +	const unsigned long space = BITS_PER_LONG - offset;
> +
> +	value &= GENMASK(nbits - 1, 0);
> +
> +	if (space >= nbits) {

> +		map[index] &= ~(GENMASK(nbits + offset - 1, offset));

I remember that this construction may bring horrible code on some architectures
with some version(s) of the compiler (*).

To fix that I found an easy refactoring:

		map[index] &= ~(GENMASK(nbits, 0) << offset));


(*) don't remember the actual versions, though, but anyway...

> +		map[index] |= value << offset;
> +		return;
> +	}
> +	map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
> +	map[index] |= value << offset;
> +	map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> +	map[index + 1] |= (value >> space);
> +}
  
Alexander Potapenko July 17, 2023, 2:14 p.m. UTC | #2
>
> > Cc: Arnd Bergmann <arnd@arndb.de>
>
> You can use --cc to `git send-email` instead of polluting the commit message.

Right. But as far as I can tell, certain kernel devs prefer to be CCed
on the whole series, whereas others do not want to see anything but
the actual patch they were interested in.
I am not sure about Arnd's preferences, so I just decided to keep the
tag from the original patch by Syed Nayyar Waris (which I also
consider to be an indication of the fact "that potentially interested
parties have been included in the discussion" per
https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#when-to-use-acked-by-cc-and-co-developed-by)

> > Signed-off-by: Syed Nayyar Waris <syednwaris@gmail.com>
> > Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
> > Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/
> > Suggested-by: Yury Norov <yury.norov@gmail.com>
> > Signed-off-by: Alexander Potapenko <glider@google.com>
>
> With above, I think you can also add Co-developed-by (as the changes were
> made).

Ok, will do.

> ...
>
> > +static inline void bitmap_set_value(unsigned long *map,
> > +                                 unsigned long value,
> > +                                 unsigned long start, unsigned long nbits)
> > +{
> > +     const size_t index = BIT_WORD(start);
> > +     const unsigned long offset = start % BITS_PER_LONG;
> > +     const unsigned long space = BITS_PER_LONG - offset;
> > +
> > +     value &= GENMASK(nbits - 1, 0);
> > +
> > +     if (space >= nbits) {
>
> > +             map[index] &= ~(GENMASK(nbits + offset - 1, offset));
>
> I remember that this construction may bring horrible code on some architectures
> with some version(s) of the compiler (*).

Wow, even the trunk Clang and GCC seem to generate better code for
your version of this line: https://godbolt.org/z/36Kqxhe6j

> To fix that I found an easy refactoring:
>
>                 map[index] &= ~(GENMASK(nbits, 0) << offset));
>

I'll take this one.

> (*) don't remember the actual versions, though, but anyway...
>
> > +             map[index] |= value << offset;
> > +             return;
> > +     }
> > +     map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
> > +     map[index] |= value << offset;
> > +     map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > +     map[index + 1] |= (value >> space);
> > +}
>
> --
> With Best Regards,
> Andy Shevchenko
>
>
  
Andy Shevchenko July 17, 2023, 2:29 p.m. UTC | #3
On Mon, Jul 17, 2023 at 04:14:57PM +0200, Alexander Potapenko wrote:

+Cc: Nathan (on code generation question below)

...

> > > Cc: Arnd Bergmann <arnd@arndb.de>
> >
> > You can use --cc to `git send-email` instead of polluting the commit message.
> 
> Right. But as far as I can tell, certain kernel devs prefer to be CCed
> on the whole series, whereas others do not want to see anything but
> the actual patch they were interested in.
> I am not sure about Arnd's preferences, so I just decided to keep the
> tag from the original patch by Syed Nayyar Waris (which I also
> consider to be an indication of the fact "that potentially interested
> parties have been included in the discussion" per
> https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#when-to-use-acked-by-cc-and-co-developed-by)

My personal statistics from the field show that more than 90% of maintainers
would like to receive 100% of the series. Of course it depends on the series
(if it's treewide, I will agree with you). Another point in favor of my
suggestion is that Arnd is an SoC tree maintainer, where ARM is one of the
biggest players, so I think he would like to see arm*-prefixed patches anyway.

...

> > > +             map[index] &= ~(GENMASK(nbits + offset - 1, offset));
> >
> > I remember that this construction may bring horrible code on some architectures
> > with some version(s) of the compiler (*).
> 
> Wow, even the trunk Clang and GCC seem to generate better code for
> your version of this line: https://godbolt.org/z/36Kqxhe6j

Wow, indeed! Perhaps it's time to report this to the Clang and GCC people. I
believe the root cause is that in the original version the compiler can't
prove that 'l' is constant for GENMASK().

> > To fix that I found an easy refactoring:
> >
> >                 map[index] &= ~(GENMASK(nbits, 0) << offset));
> 
> I'll take this one.
> 
> > (*) don't remember the actual versions, though, but anyway...
  
Andy Shevchenko July 17, 2023, 2:31 p.m. UTC | #4
On Mon, Jul 17, 2023 at 05:29:12PM +0300, Andy Shevchenko wrote:
> On Mon, Jul 17, 2023 at 04:14:57PM +0200, Alexander Potapenko wrote:

...

> > > > +             map[index] &= ~(GENMASK(nbits + offset - 1, offset));
> > >
> > > I remember that this construction may bring horrible code on some architectures
> > > with some version(s) of the compiler (*).
> > 
> > Wow, even the trunk Clang and GCC seem to generate better code for
> > your version of this line: https://godbolt.org/z/36Kqxhe6j
> 
> Wow, indeed! Perhaps time to report to clang and GCC people. I believe the root
> cause is that in the original version compiler can't prove that l is constant
> for GENMASK().
> 
> > > To fix that I found an easy refactoring:
> > >
> > >                 map[index] &= ~(GENMASK(nbits, 0) << offset));

nbits - 1 it should be, btw. In any case it seems the code is still better.

> > I'll take this one.
> > 
> > > (*) don't remember the actual versions, though, but anyway...
  
Alexander Potapenko July 17, 2023, 2:53 p.m. UTC | #5
> >
> > I remember that this construction may bring horrible code on some architectures
> > with some version(s) of the compiler (*).
>
> Wow, even the trunk Clang and GCC seem to generate better code for
> your version of this line: https://godbolt.org/z/36Kqxhe6j

Oh wait.
First, my Godbolt reproducer is incorrect, it is using sizeof(unsigned
long) instead of 8 * sizeof(unsigned long) for BITS_PER_LONG.

>
> > To fix that I found an easy refactoring:
> >
> >                 map[index] &= ~(GENMASK(nbits, 0) << offset));
> >

Second, the line above should probably be:
  map[index] &= ~(GENMASK(nbits - 1, 0) << offset));

, right?
  
Andy Shevchenko July 17, 2023, 3:03 p.m. UTC | #6
On Mon, Jul 17, 2023 at 04:53:51PM +0200, Alexander Potapenko wrote:

...

> > > I remember that this construction may bring horrible code on some architectures
> > > with some version(s) of the compiler (*).
> >
> > Wow, even the trunk Clang and GCC seem to generate better code for
> > your version of this line: https://godbolt.org/z/36Kqxhe6j
> 
> Oh wait.
> First, my Godbolt reproducer is incorrect, it is using sizeof(unsigned
> long) instead of 8 * sizeof(unsigned long) for BITS_PER_LONG.

Still slightly better. And note that the same GENMASK() is used at the
beginning of the function, so the compiler might actually cache it.

> > > To fix that I found an easy refactoring:
> > >
> > >                 map[index] &= ~(GENMASK(nbits, 0) << offset));
> > >
> 
> Second, the line above should probably be:
>   map[index] &= ~(GENMASK(nbits - 1, 0) << offset));
> 
> , right?

Yes.
  
Yury Norov July 17, 2023, 3:50 p.m. UTC | #7
On Mon, Jul 17, 2023 at 01:37:04PM +0200, Alexander Potapenko wrote:
> The two new functions allow setting/getting values of length up to
> BITS_PER_LONG bits at arbitrary position in the bitmap.
> 
> The code was taken from "bitops: Introduce the for_each_set_clump macro"
> by Syed Nayyar Waris with a couple of minor changes:
>  - instead of using roundup(), which adds an unnecessary dependency
>    on <linux/math.h>, we calculate space as BITS_PER_LONG-offset;
>  - indentation is reduced by not using else-clauses (suggested by
>    checkpatch for bitmap_get_value())

Please preserve Syed's authorship ('From' field in git log).

> Cc: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Syed Nayyar Waris <syednwaris@gmail.com>
> Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
> Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/
> Suggested-by: Yury Norov <yury.norov@gmail.com>
> Signed-off-by: Alexander Potapenko <glider@google.com>
> ---
>  include/linux/bitmap.h | 57 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 57 insertions(+)
> 
> diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> index 03644237e1efb..4559366084988 100644
> --- a/include/linux/bitmap.h
> +++ b/include/linux/bitmap.h
> @@ -76,7 +76,11 @@ struct device;
>   *  bitmap_to_arr32(buf, src, nbits)            Copy nbits from buf to u32[] dst
>   *  bitmap_to_arr64(buf, src, nbits)            Copy nbits from buf to u64[] dst
>   *  bitmap_get_value8(map, start)               Get 8bit value from map at start
> + *  bitmap_get_value(map, start, nbits)         Get bit value of size 'nbits'
> + *                                              from map at start
>   *  bitmap_set_value8(map, value, start)        Set 8bit value to map at start
> + *  bitmap_set_value(map, value, start, nbits)  Set bit value of size 'nbits'
> + *                                              of map at start

The 'bit value of size' wording sounds more confusing than it should. The size
of a bit is actually a bit... Can you rephrase? Moreover, 'set bits' has the
meaning of actually setting them, i.e. switching them to '1'. Maybe:
"Copy 'nbits' to bitmap starting at 'start'"?

>   *
>   * Note, bitmap_zero() and bitmap_fill() operate over the region of
>   * unsigned longs, that is, bits behind bitmap till the unsigned long
> @@ -583,6 +587,31 @@ static inline unsigned long bitmap_get_value8(const unsigned long *map,
>  	return (map[index] >> offset) & 0xFF;
>  }
>  
> +/**
> + * bitmap_get_value - get a value of n-bits from the memory region
> + * @map: address to the bitmap memory region
> + * @start: bit offset of the n-bit value
> + * @nbits: size of value in bits


 * @nbits: size of value in bits, up to BITS_PER_LONG

> + *
> + * Returns value of nbits located at the @start bit offset within the @map
> + * memory region.
> + */
> +static inline unsigned long bitmap_get_value(const unsigned long *map,
> +					     unsigned long start,
> +					     unsigned long nbits)
> +{
> +	const size_t index = BIT_WORD(start);
> +	const unsigned long offset = start % BITS_PER_LONG;
> +	const unsigned long space = BITS_PER_LONG - offset;
> +	unsigned long value_low, value_high;
> +
> +	if (space >= nbits)
> +		return (map[index] >> offset) & GENMASK(nbits - 1, 0);
> +	value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
> +	value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
> +	return (value_low >> offset) | (value_high << space);
> +}

When nbits == 0, copy-like functions shouldn't touch any memory. See how
other bitmap and find_bit functions hold it.

Thanks,
Yury
  
Yury Norov July 17, 2023, 4:15 p.m. UTC | #8
On Mon, Jul 17, 2023 at 05:31:44PM +0300, Andy Shevchenko wrote:
> On Mon, Jul 17, 2023 at 05:29:12PM +0300, Andy Shevchenko wrote:
> > On Mon, Jul 17, 2023 at 04:14:57PM +0200, Alexander Potapenko wrote:
> 
> ...
> 
> > > > > +             map[index] &= ~(GENMASK(nbits + offset - 1, offset));
> > > >
> > > > I remember that this construction may bring horrible code on some architectures
> > > > with some version(s) of the compiler (*).
> > > 
> > > Wow, even the trunk Clang and GCC seem to generate better code for
> > > your version of this line: https://godbolt.org/z/36Kqxhe6j
> > 
> > Wow, indeed! Perhaps time to report to clang and GCC people. I believe the root
> > cause is that in the original version compiler can't prove that l is constant
> > for GENMASK().
> > 
> > > > To fix that I found an easy refactoring:
> > > >
> > > >                 map[index] &= ~(GENMASK(nbits, 0) << offset));
> 
> nbits - 1 it should be, btw. In any case it seems the code is still better.
 
Yep.

Alexander, for the next round, can you please show the disassembly of these
functions for the cases of compile-time and run-time defined start and nbits?

Thanks,
Yury
  
Alexander Potapenko July 17, 2023, 4:29 p.m. UTC | #9
On Mon, Jul 17, 2023 at 5:03 PM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Mon, Jul 17, 2023 at 04:53:51PM +0200, Alexander Potapenko wrote:
>
> ...
>
> > > > I remember that this construction may bring horrible code on some architectures
> > > > with some version(s) of the compiler (*).
> > >
> > > Wow, even the trunk Clang and GCC seem to generate better code for
> > > your version of this line: https://godbolt.org/z/36Kqxhe6j
> >
> > Oh wait.
> > First, my Godbolt reproducer is incorrect, it is using sizeof(unsigned
> > long) instead of 8 * sizeof(unsigned long) for BITS_PER_LONG.
>
> Still slightly better. And note, that the same GENMASK() is used at the
> beginning of the function. Compiler actually might cache it.

I think that the compiler might consider this transformation invalid,
because "GENMASK(nbits - 1, 0) << offset" and "GENMASK(nbits + offset
- 1, offset)" have different values in the case where "nbits + offset"
exceeds 64.
Restricting the domain of bitmap_set_value() might result in better
code, but it is easier to stick to your version, which prevents the
overflow.

> > > > To fix that I found an easy refactoring:
> > > >
> > > >                 map[index] &= ~(GENMASK(nbits, 0) << offset));
> > > >
> >
> > Second, the line above should probably be:
> >   map[index] &= ~(GENMASK(nbits - 1, 0) << offset));
> >
> > , right?
>
> Yes.

This "nbits vs. nbits - 1" issue was not detected by test_set_get_value(),
so I'll add an extra test case for it.
  
Alexander Potapenko July 18, 2023, 9:30 a.m. UTC | #10
On Mon, Jul 17, 2023 at 5:51 PM Yury Norov <yury.norov@gmail.com> wrote:
>
> On Mon, Jul 17, 2023 at 01:37:04PM +0200, Alexander Potapenko wrote:
> > The two new functions allow setting/getting values of length up to
> > BITS_PER_LONG bits at arbitrary position in the bitmap.
> >
> > The code was taken from "bitops: Introduce the for_each_set_clump macro"
> > by Syed Nayyar Waris with a couple of minor changes:
> >  - instead of using roundup(), which adds an unnecessary dependency
> >    on <linux/math.h>, we calculate space as BITS_PER_LONG-offset;
> >  - indentation is reduced by not using else-clauses (suggested by
> >    checkpatch for bitmap_get_value())
>
> Please preserve Syed's authorship ('From' field in git log).
Done

> >   *  bitmap_set_value8(map, value, start)        Set 8bit value to map at start
> > + *  bitmap_set_value(map, value, start, nbits)  Set bit value of size 'nbits'
> > + *                                              of map at start
>
> The 'bit value of size' sounds more confusing than it should. The size
> of bit is actually a bit... Can you rephrase?
How about "Get an nbits-sized value from map at start" and "Set an
nbits-sized value to map at start"?

> Moreover, 'set bits' has
> a meaning of actually setting them, i.e. switching to '1'. Maybe:
> "Copy 'nbits' to bitmap starting at 'start'"?

Right now it is in line with the comment for bitmap_set_value8 (and
the names of the functions also have _set_ in them).
Shall I also change that comment?
WDYT about "Put an nbits-sized value to map at start"?


> > +/**
> > + * bitmap_get_value - get a value of n-bits from the memory region
> > + * @map: address to the bitmap memory region
> > + * @start: bit offset of the n-bit value
> > + * @nbits: size of value in bits
>
>
>  * @nbits: size of value in bits, up to BITS_PER_LONG
Ok

> > + *
> > + * Returns value of nbits located at the @start bit offset within the @map
> > + * memory region.
> > + */
> > +static inline unsigned long bitmap_get_value(const unsigned long *map,
> > +                                          unsigned long start,
> > +                                          unsigned long nbits)
> > +{
> > +     const size_t index = BIT_WORD(start);
> > +     const unsigned long offset = start % BITS_PER_LONG;
> > +     const unsigned long space = BITS_PER_LONG - offset;
> > +     unsigned long value_low, value_high;
> > +
> > +     if (space >= nbits)
> > +             return (map[index] >> offset) & GENMASK(nbits - 1, 0);
> > +     value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
> > +     value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
> > +     return (value_low >> offset) | (value_high << space);
> > +}
>
> When nbits == 0, copy-like functions shouldn't touch any memory. See how
> other bitmap and find_bit functions hold it.

I think this is different from what other bitmap functions do, but it
should be enough to bail out on !nbits, i.e.:

    if (!nbits)
        return 0;

You probably meant adding a __builtin_constant_p() (which is used all
over the place in bitmap.h), but:
 - the compiler won't have problem optimizing away the code for a
constant nbits=0;
 - we anyway need a dynamic check for the case nbits is not constant
(for both bitmap_get_value() and bitmap_set_value(), I assume).

What do you think?
  
Andy Shevchenko July 18, 2023, 2:01 p.m. UTC | #11
On Tue, Jul 18, 2023 at 11:30:00AM +0200, Alexander Potapenko wrote:
> On Mon, Jul 17, 2023 at 5:51 PM Yury Norov <yury.norov@gmail.com> wrote:
> > On Mon, Jul 17, 2023 at 01:37:04PM +0200, Alexander Potapenko wrote:

...

> > When nbits == 0, copy-like functions shouldn't touch any memory. See how
> > other bitmap and find_bit functions hold it.
> 
> I think this is different from what other bitmap functions do, but it
> should be enough to bail out on !nbits, i.e.:
> 
>     if (!nbits)
>         return 0;
> 
> You probably meant adding a __builtin_constant_p() (which is used all
> over the place in bitmap.h), but:
>  - the compiler won't have problem optimizing away the code for a
> constant nbits=0;
>  - we anyway need a dynamic check for the case nbits is not constant
> (for both bitmap_get_value() and bitmap_set_value(), I assume).
> 
> What do you think?

The idea behind it is to eliminate the check completely for the cases where
nbits != 0 is known at compile time. In your case the dynamic check will be
there. That's what we want to avoid.
  
Yury Norov July 18, 2023, 5:03 p.m. UTC | #12
On Tue, Jul 18, 2023 at 05:01:28PM +0300, Andy Shevchenko wrote:
> On Tue, Jul 18, 2023 at 11:30:00AM +0200, Alexander Potapenko wrote:
> > On Mon, Jul 17, 2023 at 5:51 PM Yury Norov <yury.norov@gmail.com> wrote:
> > > On Mon, Jul 17, 2023 at 01:37:04PM +0200, Alexander Potapenko wrote:
> 
> ...
> 
> > > When nbits == 0, copy-like functions shouldn't touch any memory. See how
> > > other bitmap and find_bit functions hold it.
> > 
> > I think this is different from what other bitmap functions do, but it
> > should be enough to bail out on !nbits, i.e.:
> > 
> >     if (!nbits)
> >         return 0;
> > 
> > You probably meant adding a __builtin_constant_p() (which is used all
> > over the place in bitmap.h), but:

No, I didn't mean that.

> >  - the compiler won't have problem optimizing away the code for a
> > constant nbits=0;

Look at your code, having nbits == 0 in mind:

       const size_t index = BIT_WORD(start);
       const unsigned long offset = start % BITS_PER_LONG;
       const unsigned long space = BITS_PER_LONG - offset;
       unsigned long value_low, value_high;

       if (space >= nbits) // This is always the case
               return (map[index] >> offset) & GENMASK(nbits - 1, 0);
       ...             ^^                      ^^
                       Unconditional fetch     Wshift-count-overflow

Thanks to the GENMASK() implementation, you'll be warned by GENMASK_INPUT_CHECK()
if nbits is a compile-time constant. In the runtime case it's pure undefined
behavior, not to mention the useless, expensive and dangerous fetch.

> >  - we anyway need a dynamic check for the case nbits is not constant
> > (for both bitmap_get_value() and bitmap_set_value(), I assume).
> > 
> > What do you think?

I think that instead of speculations, it's better to cover nbits == 0
with the explicit tests for run- and compile-time. That way you're
always on a safe side.

bitmap_get_val(NULL, 0, 0) shouldn't crash the kernel.
 
> The idea behind is to eliminate the code completely for the cases nbits != 0.
> In your case the dynamic check will be there. That's what we want to avoid.

Alexander is right - we can't avoid testing against 0 if we need to
test for 0... In case of other functions we have inline and outline
implementations, controlled by small_const_nbits().

As you can see, the small_const_nbits() tests against 0 explicitly,
although it's free at compile time. But if nbits == 0, we pick
outline version of a function regardless.

On their turn, outline versions again do their test against nbits == 0,
but most of the time implicitly.

In case of bitmap_set_val, we are touching at max 2 words, and there's
no reason for outline version, so we have to test nbits against 0
inside inline code. 

Having all that in mind, and because nbits == 0 is most likely an
error we'd follow the following rules:
 - no memory must be touched as we're potentially in error condition,
   and pointer may be corrupted;
 - the cost of the check must be as minimal as possible.

So I suggest:

        if (unlikely(nbits == 0))
                return;

For readers that would literally mean: we don't expect that, and we find
it suspicious, but we'll handle that as correct as we can.

By the way, Alexander, please drop those 'const' things. Those are for
pointers or some global variables, not for inline functions with 4
lines of code. (If you think it helps the code to be safe, then no - it's
unsafe even with consts.)

Thanks,
Yury
  
Andy Shevchenko July 18, 2023, 5:20 p.m. UTC | #13
On Tue, Jul 18, 2023 at 10:03:25AM -0700, Yury Norov wrote:
> On Tue, Jul 18, 2023 at 05:01:28PM +0300, Andy Shevchenko wrote:
> > On Tue, Jul 18, 2023 at 11:30:00AM +0200, Alexander Potapenko wrote:

...

> > The idea behind is to eliminate the code completely for the cases nbits != 0.
> > In your case the dynamic check will be there. That's what we want to avoid.
> 
> Alexander is right - we can't avoid testing against 0 if we need to
> test for 0... In case of other functions we have inline and outline
> implementations, controlled by small_const_nbits().
> 
> As you can see, the small_const_nbits() tests against 0 explicitly,
> although it's free at compile time. But if nbits == 0, we pick
> outline version of a function regardless.
> 
> On their turn, outline versions again do their test against nbits == 0,
> but most of the time implicitly.
> 
> In case of bitmap_set_val, we are touching at max 2 words, and there's
> no reason for outline version, so we have to test nbits against 0
> inside inline code. 
> 
> Having all that in mind, and because nbits == 0 is most likely an
> error we'd follow the following rules:
>  - no memory must be touched as we're potentially in error condition,
>    and pointer may be corrupted;
>  - the cost of the check must be as minimal as possible.
> 
> So I suggest:
> 
>         if (unlikely(nbits == 0))
>                 return;
> 
> For readers that would literally mean: we don't expect that, and we find
> it suspicious, but we'll handle that as correct as we can.

Okay, thank you for elaborated answer.
  
Alexander Potapenko July 19, 2023, 9 a.m. UTC | #14
>
> Thanks to GENMASK() implementation, you'll be warned by GENMASK_INPUT_CHECK()
> if nbits is a compile-time variable. In case of runtime, it's a pure undef,
> not mentioning useless, expensive and dangerous fetch.
>
> > >  - we anyway need a dynamic check for the case nbits is not constant
> > > (for both bitmap_get_value() and bitmap_set_value(), I assume).
> > >
> > > What do you think?
>
> I think that instead of speculations, it's better to cover nbits == 0
> with the explicit tests for run- and compile-time. That way you're
> always on a safe side.

You are right. I added tests for these cases.

> bitmap_get_val(NULL, 0, 0) shouldn't crash the kernel.

Haha, the compiler is smart enough to not crash the kernel in this case.
But passing zero via a volatile variable did the trick.

>
> > The idea behind is to eliminate the code completely for the cases nbits != 0.
> > In your case the dynamic check will be there. That's what we want to avoid.
>
> Alexander is right - we can't avoid testing against 0 if we need to
> test for 0... In case of other functions we have inline and outline
> implementations, controlled by small_const_nbits().
>
> As you can see, the small_const_nbits() tests against 0 explicitly,
> although it's free at compile time. But if nbits == 0, we pick
> outline version of a function regardless.
>
> On their turn, outline versions again do their test against nbits == 0,
> but most of the time implicitly.
>
> In case of bitmap_set_val, we are touching at max 2 words, and there's
> no reason for outline version, so we have to test nbits against 0
> inside inline code.
>
> Having all that in mind, and because nbits == 0 is most likely an
> error we'd follow the following rules:
>  - no memory must be touched as we're potentially in error condition,
>    and pointer may be corrupted;
>  - the cost of the check must be as minimal as possible.
>
> So I suggest:
>
>         if (unlikely(nbits == 0))
>                 return;

Sounds good, I'll add unlikely() around the check.
Thanks for the explanation!

>
> For readers that would literally mean: we don't expect that, and we find
> it suspicious, but we'll handle that as correct as we can.
>
> By the way, Alexander, please drop that 'const' things. Those are for
> pointers or some global variables, not for inline functions with 4
> lines of code. (If you think it helps the code to be safe than no - it's
> unsafe even with consts.)

These consts are from Syed's original patch and were probably
added for consistency with bitmap_{set,get}_value8().
But, okay, I'll remove them.




--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Liana Sebastian
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
  

Patch

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 03644237e1efb..4559366084988 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -76,7 +76,11 @@  struct device;
  *  bitmap_to_arr32(buf, src, nbits)            Copy nbits from buf to u32[] dst
  *  bitmap_to_arr64(buf, src, nbits)            Copy nbits from buf to u64[] dst
  *  bitmap_get_value8(map, start)               Get 8bit value from map at start
+ *  bitmap_get_value(map, start, nbits)         Get bit value of size 'nbits'
+ *                                              from map at start
  *  bitmap_set_value8(map, value, start)        Set 8bit value to map at start
+ *  bitmap_set_value(map, value, start, nbits)  Set bit value of size 'nbits'
+ *                                              of map at start
  *
  * Note, bitmap_zero() and bitmap_fill() operate over the region of
  * unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -583,6 +587,31 @@  static inline unsigned long bitmap_get_value8(const unsigned long *map,
 	return (map[index] >> offset) & 0xFF;
 }
 
+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+static inline unsigned long bitmap_get_value(const unsigned long *map,
+					     unsigned long start,
+					     unsigned long nbits)
+{
+	const size_t index = BIT_WORD(start);
+	const unsigned long offset = start % BITS_PER_LONG;
+	const unsigned long space = BITS_PER_LONG - offset;
+	unsigned long value_low, value_high;
+
+	if (space >= nbits)
+		return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+	value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+	value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
+	return (value_low >> offset) | (value_high << space);
+}
+
 /**
  * bitmap_set_value8 - set an 8-bit value within a memory region
  * @map: address to the bitmap memory region
@@ -599,6 +628,34 @@  static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
 	map[index] |= value << offset;
 }
 
+/**
+ * bitmap_set_value - set n-bit value within a memory region
+ * @map: address to the bitmap memory region
+ * @value: value of nbits
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits
+ */
+static inline void bitmap_set_value(unsigned long *map,
+				    unsigned long value,
+				    unsigned long start, unsigned long nbits)
+{
+	const size_t index = BIT_WORD(start);
+	const unsigned long offset = start % BITS_PER_LONG;
+	const unsigned long space = BITS_PER_LONG - offset;
+
+	value &= GENMASK(nbits - 1, 0);
+
+	if (space >= nbits) {
+		map[index] &= ~(GENMASK(nbits + offset - 1, offset));
+		map[index] |= value << offset;
+		return;
+	}
+	map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
+	map[index] |= value << offset;
+	map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
+	map[index + 1] |= (value >> space);
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __LINUX_BITMAP_H */