[v1,2/2] x86/asm/bitops: Use __builtin_clz*() to evaluate constant expressions

Message ID 20221106095106.849154-3-mailhol.vincent@wanadoo.fr
State New
Headers
Series x86/asm/bitops: optimize fls functions for constant expressions |

Commit Message

Vincent Mailhol Nov. 6, 2022, 9:51 a.m. UTC
  GCC and clang offers the __builtin_clz(x) and __builtin_clzll(x)
functions which return the number of leading zero bits in
x. c.f. [1]. By a simple subtraction, we can derive below equivalences:

* For fls:
  Aside of the x = 0 special case, fls(x) is equivalent to
  BITS_PER_TYPE(x) - __builtin_clz(x).

* For fls64:
  Aside of the x = 0 special case, fls64(x) is equivalent to
  BITS_PER_TYPE(x) - __builtin_clzll(x). __builtin_clzll() takes an
  unsigned long long as argument. We choose this version because
  BITS_PER_LONG_LONG is defined as 64 bits for all architecture making
  this flavor the most portable one. A BUILD_BUG_ON() safety net is
  added.

When used on constant expressions, the compiler is only able to fold
the builtin version (c.f. [2]). However, for non constant expressions,
the kernel inline assembly results in better code for both GCC and
clang.

Use __builtin_constant_p() to select between the kernel's
fls()/fls64() __builtin_clz()/__builtin_clzll() depending on whether
the argument is constant or not.

While renaming fls64() to variable_fls64(), change the argument type
from __64 to u64 because we are not in an uapi header.

[1] Built-in Functions Provided by GCC:
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html

[2] commit 146034fed6ee ("x86/asm/bitops: Use __builtin_ffs() to evaluate constant expressions")

CC: Borislav Petkov <bp@suse.de>
CC: Nick Desaulniers <ndesaulniers@google.com>
CC: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
---
 arch/x86/include/asm/bitops.h | 57 ++++++++++++++++++++++++-----------
 1 file changed, 40 insertions(+), 17 deletions(-)
  

Comments

Nick Desaulniers Nov. 10, 2022, 7:01 p.m. UTC | #1
On Sun, Nov 6, 2022 at 1:51 AM Vincent Mailhol
<mailhol.vincent@wanadoo.fr> wrote:
>
>  #ifdef CONFIG_X86_64
> -static __always_inline int fls64(__u64 x)
> +static __always_inline int constant_fls64(u64 x)
> +{
> +       BUILD_BUG_ON(sizeof(unsigned long long) != sizeof(x));

Thanks for the patches! They LGTM; but why do we need this BUILD_BUG_ON here?

> +
> +       if (!x)
> +               return 0;
> +
> +       return BITS_PER_TYPE(x) - __builtin_clzll(x);
> +}
  
Vincent Mailhol Nov. 11, 2022, 1:57 a.m. UTC | #2
On Fri. 11 Nov. 2022 at 04:01, Nick Desaulniers <ndesaulniers@google.com> wrote:
> On Sun, Nov 6, 2022 at 1:51 AM Vincent Mailhol
> <mailhol.vincent@wanadoo.fr> wrote:
> >
> >  #ifdef CONFIG_X86_64
> > -static __always_inline int fls64(__u64 x)
> > +static __always_inline int constant_fls64(u64 x)
> > +{
> > +       BUILD_BUG_ON(sizeof(unsigned long long) != sizeof(x));
>
> Thanks for the patches! They LGTM; but why do we need this BUILD_BUG_ON here?

There is no absolute need for sure.

Call this a paranoiac check and you will be correct. My reasoning for still
using it is that:

  1/ It is a compile time check, so no runtime penalty.
  2/ Strictly speaking, the C standard doesn't guarantee 'u64' and
     'unsigned long long int' to be the same (and you can argue that in clang
     and gcc long long is always 64 bits on all platforms and one more time
     you will be correct).
  3/ It serves as a documentation to say: "hey I am using the clz long long
     version on a u64 and I know what I am doing."

If you want me to remove it, OK for me. Let me know.

> > +
> > +       if (!x)
> > +               return 0;
> > +
> > +       return BITS_PER_TYPE(x) - __builtin_clzll(x);
> > +}
  
Yury Norov Nov. 11, 2022, 3:36 a.m. UTC | #3
On Fri, Nov 11, 2022 at 10:57:17AM +0900, Vincent MAILHOL wrote:
> On Fri. 11 Nov. 2022 at 04:01, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > On Sun, Nov 6, 2022 at 1:51 AM Vincent Mailhol
> > <mailhol.vincent@wanadoo.fr> wrote:
> > >
> > >  #ifdef CONFIG_X86_64
> > > -static __always_inline int fls64(__u64 x)
> > > +static __always_inline int constant_fls64(u64 x)
> > > +{
> > > +       BUILD_BUG_ON(sizeof(unsigned long long) != sizeof(x));
> >
> > Thanks for the patches! They LGTM; but why do we need this BUILD_BUG_ON here?
> 
> There is no absolute need for sure.
> 
> Call this a paranoiac check and you will be correct. My reasoning for still
> using it is that:
> 
>   1/ It is a compile time check, so no runtime penalty.
>   2/ Strictly speaking, the C standard doesn't guarantee 'u64' and
>      'unsigned long long int' to be the same (and you can argue that in clang
>      and gcc long long is always 64 bits on all platforms and one more time
>      you will be correct).
>   3/ It serves as a documentation to say: "hey I am using the clz long long
>      version on a u64 and I know what I am doing."
> 
> If you want me to remove it, OK for me. Let me know.

In fact, compiler's typecheck would be more strict than your BUG().
For example, your check allows pointers, but compiler will complain.
  
Vincent Mailhol Nov. 11, 2022, 7:02 a.m. UTC | #4
On Fri. 11 nov. 2022 à 12:36, Yury Norov <yury.norov@gmail.com> a écrit :
> On Fri, Nov 11, 2022 at 10:57:17AM +0900, Vincent MAILHOL wrote:
> > On Fri. 11 Nov. 2022 at 04:01, Nick Desaulniers <ndesaulniers@google.com> wrote:
> > > On Sun, Nov 6, 2022 at 1:51 AM Vincent Mailhol
> > > <mailhol.vincent@wanadoo.fr> wrote:
> > > >
> > > >  #ifdef CONFIG_X86_64
> > > > -static __always_inline int fls64(__u64 x)
> > > > +static __always_inline int constant_fls64(u64 x)
> > > > +{
> > > > +       BUILD_BUG_ON(sizeof(unsigned long long) != sizeof(x));
> > >
> > > Thanks for the patches! They LGTM; but why do we need this BUILD_BUG_ON here?
> >
> > There is no absolute need for sure.
> >
> > Call this a paranoiac check and you will be correct. My reasoning for still
> > using it is that:
> >
> >   1/ It is a compile time check, so no runtime penalty.
> >   2/ Strictly speaking, the C standard doesn't guarantee 'u64' and
> >      'unsigned long long int' to be the same (and you can argue that in clang
> >      and gcc long long is always 64 bits on all platforms and one more time
> >      you will be correct).
> >   3/ It serves as a documentation to say: "hey I am using the clz long long
> >      version on a u64 and I know what I am doing."
> >
> > If you want me to remove it, OK for me. Let me know.
>
> In fact, compiler's typecheck would be more strict than your BUG().
> For example, your check allows pointers, but compiler will complain.

Here, x is a scalar, so in that specific case, it is equivalent. But
the compiler type check is more explicit and more natural because it
would work if ported to other contexts as you pointed. So it is a good
idea.

The check would become:
        static_assert(__builtin_types_compatible_p(typeof(x),
                                                   unsigned long long int));

Alternatively, we could use the assert_arg_type() macro from
https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/irq_stack.h#L126
which does exactly the same thing.

But in such a case, it would be better to first move assert_arg_type()
to a more appropriate header (maybe linux/build_bug.h ?)


Yours sincerely,
Vincent Mailhol
  

Patch

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index a31453d7686d..58fb2fc49760 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -333,18 +333,15 @@  static __always_inline int variable_ffs(int x)
  */
 #define ffs(x) (__builtin_constant_p(x) ? __builtin_ffs(x) : variable_ffs(x))
 
-/**
- * fls - find last set bit in word
- * @x: the word to search
- *
- * This is defined in a similar way as the libc and compiler builtin
- * ffs, but returns the position of the most significant set bit.
- *
- * fls(value) returns 0 if value is 0 or the position of the last
- * set bit if value is nonzero. The last (most significant) bit is
- * at position 32.
- */
-static __always_inline int fls(unsigned int x)
+static __always_inline int constant_fls(unsigned int x)
+{
+	if (!x)
+		return 0;
+
+	return BITS_PER_TYPE(x) - __builtin_clz(x);
+}
+
+static __always_inline int variable_fls(unsigned int x)
 {
 	int r;
 
@@ -375,18 +372,30 @@  static __always_inline int fls(unsigned int x)
 }
 
 /**
- * fls64 - find last set bit in a 64-bit word
+ * fls - find last set bit in word
  * @x: the word to search
  *
  * This is defined in a similar way as the libc and compiler builtin
- * ffsll, but returns the position of the most significant set bit.
+ * ffs, but returns the position of the most significant set bit.
  *
- * fls64(value) returns 0 if value is 0 or the position of the last
+ * fls(value) returns 0 if value is 0 or the position of the last
  * set bit if value is nonzero. The last (most significant) bit is
- * at position 64.
+ * at position 32.
  */
+#define fls(x) (__builtin_constant_p(x) ? constant_fls(x) : variable_fls(x))
+
 #ifdef CONFIG_X86_64
-static __always_inline int fls64(__u64 x)
+static __always_inline int constant_fls64(u64 x)
+{
+	BUILD_BUG_ON(sizeof(unsigned long long) != sizeof(x));
+
+	if (!x)
+		return 0;
+
+	return BITS_PER_TYPE(x) - __builtin_clzll(x);
+}
+
+static __always_inline int variable_fls64(u64 x)
 {
 	int bitpos = -1;
 	/*
@@ -399,6 +408,20 @@  static __always_inline int fls64(__u64 x)
 	    : "rm" (x));
 	return bitpos + 1;
 }
+
+/**
+ * fls64 - find last set bit in a 64-bit word
+ * @x: the word to search
+ *
+ * This is defined in a similar way as the libc and compiler builtin
+ * ffsll, but returns the position of the most significant set bit.
+ *
+ * fls64(value) returns 0 if value is 0 or the position of the last
+ * set bit if value is nonzero. The last (most significant) bit is
+ * at position 64.
+ */
+#define fls64(x) \
+	(__builtin_constant_p(x) ? constant_fls64(x) : variable_fls64(x))
 #else
 #include <asm-generic/bitops/fls64.h>
 #endif