Message ID | 20230717113709.328671-2-glider@google.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c923:0:b0:3e4:2afc:c1 with SMTP id j3csp1055365vqt; Mon, 17 Jul 2023 04:46:25 -0700 (PDT) X-Google-Smtp-Source: APBJJlEmgXh++ri6FXmsLxf/7u4xfAsE+Tu0LKcUWzpZuTmqJLhCn8QnzrMPP+OqhTu+T/FuasS4 X-Received: by 2002:ac2:5b10:0:b0:4fb:7888:7e6d with SMTP id v16-20020ac25b10000000b004fb78887e6dmr6720933lfn.46.1689594384981; Mon, 17 Jul 2023 04:46:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689594384; cv=none; d=google.com; s=arc-20160816; b=VCywN3J3lUjKj8bsSIkbmRlMRMX2OIWtn38dIfakb2SD2hUQXVZXTCnpPha7aFXdVY jpVAmOsGV7YSvdQa9WXJUaKfkmCbBDalEZrOUaKxAQPqE43TPmEbUrVxiVk79dbTlk2E xDCbVCqzfQnsFZ5UrZWhQnUAtXo7Ma6QPxeciyaYeH0TNPffdHHfA/WF9ztgXl3dExfh cU/7rcQlv77amWbAkaJS/9xG1ciUHnUgc0Gfh+X6wV3KteiKBe7tUYEG7eoap4c1EPFn nNHyHxS/RoRvAVYAEtR6q7kcddkV+pfrYTQnNCc+q9+QxfLmT/fVnv8ZuYZwFGyKOEPT tZLA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=FGuCnRoGjtuxCyYZwSXlSd4VQfcsRWWQdyiV1Wnie4Q=; fh=JgIgyeVfHwvOXq2SUW9xixt7IvWppQhIeupjE3HP1BQ=; b=k2KBbX8ZrOF7BInPIbBIU3oxmaEUoAEo+BwKvgvcUSoDGVCsVT+sD3oDZLvutybuk1 bVRCzvOeW4krwp9UKA/CDQHE4CrRgb6P/oPnq+LUzPlsQ0m3hJxAMJb6hC6U265fELVo ZXzdd2OKZKhK6qRVEgitWPSR5s0Wzd86hhDZ2MWiy4L7PB3A35F397KGxgX2cYjrZphB YBBdLC9+LjOi4CRI41MegofJholDkPsLy3pc4otL4NrkhPYF0mOUkLKycuSl7eUcCPaB pQCLoG5qJVpD7ybcA8bnsPtLGHIhqGe4SibzVix3n38yGI8DSzdKKApvCBJZkG18hmHg 6qSw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=lOGtagPB; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id n17-20020aa7c451000000b0051df73ddeffsi13727460edr.358.2023.07.17.04.46.00; Mon, 17 Jul 2023 04:46:24 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=lOGtagPB; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229669AbjGQLhY (ORCPT <rfc822;hadasmailinglist@gmail.com> + 99 others); Mon, 17 Jul 2023 07:37:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46734 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230169AbjGQLhT (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 17 Jul 2023 07:37:19 -0400 Received: from mail-ed1-x549.google.com (mail-ed1-x549.google.com [IPv6:2a00:1450:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B8BBBBF for <linux-kernel@vger.kernel.org>; Mon, 17 Jul 2023 04:37:17 -0700 (PDT) Received: by mail-ed1-x549.google.com with SMTP id 4fb4d7f45d1cf-51e278a3a2dso2963097a12.3 for <linux-kernel@vger.kernel.org>; Mon, 17 Jul 2023 04:37:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1689593836; x=1692185836; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=FGuCnRoGjtuxCyYZwSXlSd4VQfcsRWWQdyiV1Wnie4Q=; b=lOGtagPBwApVjfpfXneYZBu8MWYD1OliVo2uXXUvPFvV6+APoIbAw6+/k3YWQj3eId ++zdOrcTldnSRQnleASyL9AkS3sHtgDNjCJqdxPyeYFLbcNbm7H47WMPCrL025NnVhlV G1Qol0TNu5t6DU2yoyQVV2/M6YY2bbB1S6KNkkNOFhBw/Gi3exXLENY44qDhbiI+qFGh XTOGJBYScj/gFqOpJMAG1/JvF8wnIW5CuArTM2y25y2u8rr7pO8KA2XD2gxTxG4FbLph Yo+NTPG4cWlDz+nePnG/bgJtDwcLrzw/IHDxEArbQPROIOahGhUPuqygOW0gEksJ4t/k Sb+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689593836; x=1692185836; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=FGuCnRoGjtuxCyYZwSXlSd4VQfcsRWWQdyiV1Wnie4Q=; b=NHyjQM2ejPJiIBl/l4pJ9XcuK23hDk2cGV/X0flQGqtHXdrJm5cV22wHQ9c4i7fKjX r9PmCCME+SMGXAvmS1y39FEQU47z7uy2u3c4ZVLfy5q36bOs20dls1iPinJ6NntKrhIl juR8OA5LMY0Gxgeg8Wlj07duxZID41YBeuuM9jMwpGNCZe7F6npq4azMwtaK2XwXDEEF pbSwUyPKTXCCxb9RjgytKDNsjUUvu1zWaMKahw9DMn5G5ZSpbS+dmWY4a2s6HmKs1hhX Xp/RaGTkOPf8q6Us6gKNGyoWCXFvl5FLRpfDTon18yUhHz7VtwBX0thc/YjSo8jU1Z++ WekQ== X-Gm-Message-State: ABy/qLbhiupnYWHk834VLrIaTHUT/96cV8LvnrwJoU7LBRZQDD+3wU6t 7jPoxaJ86vUp6cf6nHOcblk4PZTPnYc= X-Received: from glider.muc.corp.google.com ([2a00:79e0:9c:201:3b6e:8102:6db8:d83f]) (user=glider job=sendgmr) by 2002:a50:c310:0:b0:51b:ee9e:aad2 with SMTP id a16-20020a50c310000000b0051bee9eaad2mr66529edb.0.1689593835900; Mon, 17 Jul 2023 04:37:15 -0700 (PDT) Date: Mon, 17 Jul 2023 13:37:04 +0200 In-Reply-To: <20230717113709.328671-1-glider@google.com> Mime-Version: 1.0 References: <20230717113709.328671-1-glider@google.com> X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog Message-ID: <20230717113709.328671-2-glider@google.com> Subject: [PATCH v3 1/5] lib/bitmap: add bitmap_{set,get}_value() From: Alexander Potapenko <glider@google.com> To: glider@google.com, catalin.marinas@arm.com, will@kernel.org, pcc@google.com, andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com, linux@rasmusvillemoes.dk, yury.norov@gmail.com Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org, Arnd Bergmann <arnd@arndb.de> Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771668122144434414 X-GMAIL-MSGID: 1771668122144434414 |
Series |
Implement MTE tag compression for swapped pages
|
|
Commit Message
Alexander Potapenko
July 17, 2023, 11:37 a.m. UTC
The two new functions allow setting/getting values of length up to BITS_PER_LONG bits at arbitrary position in the bitmap. The code was taken from "bitops: Introduce the for_each_set_clump macro" by Syed Nayyar Waris with a couple of minor changes: - instead of using roundup(), which adds an unnecessary dependency on <linux/math.h>, we calculate space as BITS_PER_LONG-offset; - indentation is reduced by not using else-clauses (suggested by checkpatch for bitmap_get_value()) Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Syed Nayyar Waris <syednwaris@gmail.com> Signed-off-by: William Breathitt Gray <william.gray@linaro.org> Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/ Suggested-by: Yury Norov <yury.norov@gmail.com> Signed-off-by: Alexander Potapenko <glider@google.com> --- include/linux/bitmap.h | 57 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 57 insertions(+)
Comments
On Mon, Jul 17, 2023 at 01:37:04PM +0200, Alexander Potapenko wrote: > The two new functions allow setting/getting values of length up to > BITS_PER_LONG bits at arbitrary position in the bitmap. > > The code was taken from "bitops: Introduce the for_each_set_clump macro" > by Syed Nayyar Waris with a couple of minor changes: Since changes are minor, please make sure that the authorship is kept untouched. > - instead of using roundup(), which adds an unnecessary dependency > on <linux/math.h>, we calculate space as BITS_PER_LONG-offset; > - indentation is reduced by not using else-clauses (suggested by > checkpatch for bitmap_get_value()) > Cc: Arnd Bergmann <arnd@arndb.de> You can use --cc to `git send-email` instead of polluting the commit message. > Signed-off-by: Syed Nayyar Waris <syednwaris@gmail.com> > Signed-off-by: William Breathitt Gray <william.gray@linaro.org> > Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/ > Suggested-by: Yury Norov <yury.norov@gmail.com> > Signed-off-by: Alexander Potapenko <glider@google.com> With above, I think you can also add Co-developed-by (as the changes were made). ... > +static inline void bitmap_set_value(unsigned long *map, > + unsigned long value, > + unsigned long start, unsigned long nbits) > +{ > + const size_t index = BIT_WORD(start); > + const unsigned long offset = start % BITS_PER_LONG; > + const unsigned long space = BITS_PER_LONG - offset; > + > + value &= GENMASK(nbits - 1, 0); > + > + if (space >= nbits) { > + map[index] &= ~(GENMASK(nbits + offset - 1, offset)); I remember that this construction may bring horrible code on some architectures with some version(s) of the compiler (*). To fix that I found an easy refactoring: map[index] &= ~(GENMASK(nbits, 0) << offset)); (*) don't remember the actual versions, though, but anyway... > + map[index] |= value << offset; > + return; > + } > + map[index] &= ~BITMAP_FIRST_WORD_MASK(start); > + map[index] |= value << offset; > + map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits); > + map[index + 1] |= (value >> space); > +}
> > > Cc: Arnd Bergmann <arnd@arndb.de> > > You can use --cc to `git send-email` instead of polluting the commit message. Right. But as far as I can tell, certain kernel devs prefer to be CCed on the whole series, whereas others do not want to see anything but the actual patch they were interested in. I am not sure about Arnd's preferences, so I just decided to keep the tag from the original patch by Syed Nayyar Waris (which I also consider to be an indication of the fact "that potentially interested parties have been included in the discussion" per https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#when-to-use-acked-by-cc-and-co-developed-by) > > Signed-off-by: Syed Nayyar Waris <syednwaris@gmail.com> > > Signed-off-by: William Breathitt Gray <william.gray@linaro.org> > > Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/ > > Suggested-by: Yury Norov <yury.norov@gmail.com> > > Signed-off-by: Alexander Potapenko <glider@google.com> > > With above, I think you can also add Co-developed-by (as the changes were > made). Ok, will do. > ... > > > +static inline void bitmap_set_value(unsigned long *map, > > + unsigned long value, > > + unsigned long start, unsigned long nbits) > > +{ > > + const size_t index = BIT_WORD(start); > > + const unsigned long offset = start % BITS_PER_LONG; > > + const unsigned long space = BITS_PER_LONG - offset; > > + > > + value &= GENMASK(nbits - 1, 0); > > + > > + if (space >= nbits) { > > > + map[index] &= ~(GENMASK(nbits + offset - 1, offset)); > > I remember that this construction may bring horrible code on some architectures > with some version(s) of the compiler (*). Wow, even the trunk Clang and GCC seem to generate better code for your version of this line: https://godbolt.org/z/36Kqxhe6j > To fix that I found an easy refactoring: > > map[index] &= ~(GENMASK(nbits, 0) << offset)); > I'll take this one. > (*) don't remember the actual versions, though, but anyway... > > > + map[index] |= value << offset; > > + return; > > + } > > + map[index] &= ~BITMAP_FIRST_WORD_MASK(start); > > + map[index] |= value << offset; > > + map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits); > > + map[index + 1] |= (value >> space); > > +} > > -- > With Best Regards, > Andy Shevchenko > >
On Mon, Jul 17, 2023 at 04:14:57PM +0200, Alexander Potapenko wrote: +Cc: Nathan (on code generation question below) ... > > > Cc: Arnd Bergmann <arnd@arndb.de> > > > > You can use --cc to `git send-email` instead of polluting the commit message. > > Right. But as far as I can tell, certain kernel devs prefer to be CCed > on the whole series, whereas others do not want to see anything but > the actual patch they were interested in. > I am not sure about Arnd's preferences, so I just decided to keep the > tag from the original patch by Syed Nayyar Waris (which I also > consider to be an indication of the fact "that potentially interested > parties have been included in the discussion" per > https://www.kernel.org/doc/html/v4.17/process/submitting-patches.html#when-to-use-acked-by-cc-and-co-developed-by) My personal statistics from the field that more than 90% of maintainers would like to receive 100% of the series. Of course it depends on the series (if it's treewide, I will agree with you). Here another point to my suggestion is that Arnd is SoC tree maintainer, where ARM is one of the biggest player, so I think he would like to see arm*: prefixed patches anyway. ... > > > + map[index] &= ~(GENMASK(nbits + offset - 1, offset)); > > > > I remember that this construction may bring horrible code on some architectures > > with some version(s) of the compiler (*). > > Wow, even the trunk Clang and GCC seem to generate better code for > your version of this line: https://godbolt.org/z/36Kqxhe6j Wow, indeed! Perhaps time to report to clang and GCC people. I believe the root cause is that in the original version compiler can't prove that l is constant for GENMASK(). > > To fix that I found an easy refactoring: > > > > map[index] &= ~(GENMASK(nbits, 0) << offset)); > > I'll take this one. > > > (*) don't remember the actual versions, though, but anyway...
On Mon, Jul 17, 2023 at 05:29:12PM +0300, Andy Shevchenko wrote: > On Mon, Jul 17, 2023 at 04:14:57PM +0200, Alexander Potapenko wrote: ... > > > > + map[index] &= ~(GENMASK(nbits + offset - 1, offset)); > > > > > > I remember that this construction may bring horrible code on some architectures > > > with some version(s) of the compiler (*). > > > > Wow, even the trunk Clang and GCC seem to generate better code for > > your version of this line: https://godbolt.org/z/36Kqxhe6j > > Wow, indeed! Perhaps time to report to clang and GCC people. I believe the root > cause is that in the original version compiler can't prove that l is constant > for GENMASK(). > > > > To fix that I found an easy refactoring: > > > > > > map[index] &= ~(GENMASK(nbits, 0) << offset)); nbits - 1 it should be, btw. In any case it seems the code is still better. > > I'll take this one. > > > > > (*) don't remember the actual versions, though, but anyway...
> > > > I remember that this construction may bring horrible code on some architectures > > with some version(s) of the compiler (*). > > Wow, even the trunk Clang and GCC seem to generate better code for > your version of this line: https://godbolt.org/z/36Kqxhe6j Oh wait. First, my Godbolt reproducer is incorrect, it is using sizeof(unsigned long) instead of 8 * sizeof(unsigned long) for BITS_PER_LONG. > > > To fix that I found an easy refactoring: > > > > map[index] &= ~(GENMASK(nbits, 0) << offset)); > > Second, the line above should probably be: map[index] &= ~(GENMASK(nbits - 1, 0) << offset)); , right?
On Mon, Jul 17, 2023 at 04:53:51PM +0200, Alexander Potapenko wrote: ... > > > I remember that this construction may bring horrible code on some architectures > > > with some version(s) of the compiler (*). > > > > Wow, even the trunk Clang and GCC seem to generate better code for > > your version of this line: https://godbolt.org/z/36Kqxhe6j > > Oh wait. > First, my Godbolt reproducer is incorrect, it is using sizeof(unsigned > long) instead of 8 * sizeof(unsigned long) for BITS_PER_LONG. Still slightly better. And note, that the same GENMASK() is used at the beginning of the function. Compiler actually might cache it. > > > To fix that I found an easy refactoring: > > > > > > map[index] &= ~(GENMASK(nbits, 0) << offset)); > > > > > Second, the line above should probably be: > map[index] &= ~(GENMASK(nbits - 1, 0) << offset)); > > , right? Yes.
On Mon, Jul 17, 2023 at 01:37:04PM +0200, Alexander Potapenko wrote: > The two new functions allow setting/getting values of length up to > BITS_PER_LONG bits at arbitrary position in the bitmap. > > The code was taken from "bitops: Introduce the for_each_set_clump macro" > by Syed Nayyar Waris with a couple of minor changes: > - instead of using roundup(), which adds an unnecessary dependency > on <linux/math.h>, we calculate space as BITS_PER_LONG-offset; > - indentation is reduced by not using else-clauses (suggested by > checkpatch for bitmap_get_value()) Please preserve Syed's authorship ('From' field in git log). > Cc: Arnd Bergmann <arnd@arndb.de> > Signed-off-by: Syed Nayyar Waris <syednwaris@gmail.com> > Signed-off-by: William Breathitt Gray <william.gray@linaro.org> > Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/ > Suggested-by: Yury Norov <yury.norov@gmail.com> > Signed-off-by: Alexander Potapenko <glider@google.com> > --- > include/linux/bitmap.h | 57 ++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 57 insertions(+) > > diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h > index 03644237e1efb..4559366084988 100644 > --- a/include/linux/bitmap.h > +++ b/include/linux/bitmap.h > @@ -76,7 +76,11 @@ struct device; > * bitmap_to_arr32(buf, src, nbits) Copy nbits from buf to u32[] dst > * bitmap_to_arr64(buf, src, nbits) Copy nbits from buf to u64[] dst > * bitmap_get_value8(map, start) Get 8bit value from map at start > + * bitmap_get_value(map, start, nbits) Get bit value of size 'nbits' > + * from map at start > * bitmap_set_value8(map, value, start) Set 8bit value to map at start > + * bitmap_set_value(map, value, start, nbits) Set bit value of size 'nbits' > + * of map at start The 'bit value of size' sounds more confusing than it should. The size of bit is actually a bit... Can you rephrase? Moreover, 'set bits' has a meaning of actually setting them, i.e. switching to '1'. Maybe: "Copy 'nbits' to bitmap starting at 'start'"? > * > * Note, bitmap_zero() and bitmap_fill() operate over the region of > * unsigned longs, that is, bits behind bitmap till the unsigned long > @@ -583,6 +587,31 @@ static inline unsigned long bitmap_get_value8(const unsigned long *map, > return (map[index] >> offset) & 0xFF; > } > > +/** > + * bitmap_get_value - get a value of n-bits from the memory region > + * @map: address to the bitmap memory region > + * @start: bit offset of the n-bit value > + * @nbits: size of value in bits * @nbits: size of value in bits, up to BITS_PER_LONG > + * > + * Returns value of nbits located at the @start bit offset within the @map > + * memory region. > + */ > +static inline unsigned long bitmap_get_value(const unsigned long *map, > + unsigned long start, > + unsigned long nbits) > +{ > + const size_t index = BIT_WORD(start); > + const unsigned long offset = start % BITS_PER_LONG; > + const unsigned long space = BITS_PER_LONG - offset; > + unsigned long value_low, value_high; > + > + if (space >= nbits) > + return (map[index] >> offset) & GENMASK(nbits - 1, 0); > + value_low = map[index] & BITMAP_FIRST_WORD_MASK(start); > + value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits); > + return (value_low >> offset) | (value_high << space); > +} When nbits == 0, copy-like functions shouldn't touch any memory. See how other bitmap and find_bit functions hold it. Thanks, Yury
On Mon, Jul 17, 2023 at 05:31:44PM +0300, Andy Shevchenko wrote: > On Mon, Jul 17, 2023 at 05:29:12PM +0300, Andy Shevchenko wrote: > > On Mon, Jul 17, 2023 at 04:14:57PM +0200, Alexander Potapenko wrote: > > ... > > > > > > + map[index] &= ~(GENMASK(nbits + offset - 1, offset)); > > > > > > > > I remember that this construction may bring horrible code on some architectures > > > > with some version(s) of the compiler (*). > > > > > > Wow, even the trunk Clang and GCC seem to generate better code for > > > your version of this line: https://godbolt.org/z/36Kqxhe6j > > > > Wow, indeed! Perhaps time to report to clang and GCC people. I believe the root > > cause is that in the original version compiler can't prove that l is constant > > for GENMASK(). > > > > > > To fix that I found an easy refactoring: > > > > > > > > map[index] &= ~(GENMASK(nbits, 0) << offset)); > > nbits - 1 it should be, btw. In any case it seems the code is still better. Yep. Alexander, for the next round can you please show disassembly for the functions in case of compile-time and run-time defined start and nbits? Thanks, Yury
On Mon, Jul 17, 2023 at 5:03 PM Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote: > > On Mon, Jul 17, 2023 at 04:53:51PM +0200, Alexander Potapenko wrote: > > ... > > > > > I remember that this construction may bring horrible code on some architectures > > > > with some version(s) of the compiler (*). > > > > > > Wow, even the trunk Clang and GCC seem to generate better code for > > > your version of this line: https://godbolt.org/z/36Kqxhe6j > > > > Oh wait. > > First, my Godbolt reproducer is incorrect, it is using sizeof(unsigned > > long) instead of 8 * sizeof(unsigned long) for BITS_PER_LONG. > > Still slightly better. And note, that the same GENMASK() is used at the > beginning of the function. Compiler actually might cache it. I think that the compiler might consider this transformation invalid because "GENMASK(nbits - 1, 0) << offset" and "GENMASK(nbits + offset - 1, offset)" have different values in the case "nbits + offset" exceed 64. Restricting the domain of bitmap_set_value() might result in better code, but it is easier to stick to your version, which prevents the overflow. > > > > To fix that I found an easy refactoring: > > > > > > > > map[index] &= ~(GENMASK(nbits, 0) << offset)); > > > > > > > > Second, the line above should probably be: > > map[index] &= ~(GENMASK(nbits - 1, 0) << offset)); > > > > , right? > > Yes. This "nbits vs. nbits - 1" was not detected by test_set_get_value(), so I'll add an extra test case for it.
On Mon, Jul 17, 2023 at 5:51 PM Yury Norov <yury.norov@gmail.com> wrote: > > On Mon, Jul 17, 2023 at 01:37:04PM +0200, Alexander Potapenko wrote: > > The two new functions allow setting/getting values of length up to > > BITS_PER_LONG bits at arbitrary position in the bitmap. > > > > The code was taken from "bitops: Introduce the for_each_set_clump macro" > > by Syed Nayyar Waris with a couple of minor changes: > > - instead of using roundup(), which adds an unnecessary dependency > > on <linux/math.h>, we calculate space as BITS_PER_LONG-offset; > > - indentation is reduced by not using else-clauses (suggested by > > checkpatch for bitmap_get_value()) > > Please preserve Syed's authorship ('From' field in git log). Done > > * bitmap_set_value8(map, value, start) Set 8bit value to map at start > > + * bitmap_set_value(map, value, start, nbits) Set bit value of size 'nbits' > > + * of map at start > > The 'bit value of size' sounds more confusing than it should. The size > of bit is actually a bit... Can you rephrase? How about "Get an nbits-sized value from map at start" and "Set an nbits-sized value to map at start"? > Moreover, 'set bits' has > a meaning of actually setting them, i.e. switching to '1'. Maybe: > "Copy 'nbits' to bitmap starting at 'start'"? Right now it is in line with the comment for bitmap_set_value8 (and the names of the functions also have _set_ in them). Shall I also change that comment? WDYT about "Put an nbits-sized value to map at start"? > > +/** > > + * bitmap_get_value - get a value of n-bits from the memory region > > + * @map: address to the bitmap memory region > > + * @start: bit offset of the n-bit value > > + * @nbits: size of value in bits > > > * @nbits: size of value in bits, up to BITS_PER_LONG Ok > > + * > > + * Returns value of nbits located at the @start bit offset within the @map > > + * memory region. > > + */ > > +static inline unsigned long bitmap_get_value(const unsigned long *map, > > + unsigned long start, > > + unsigned long nbits) > > +{ > > + const size_t index = BIT_WORD(start); > > + const unsigned long offset = start % BITS_PER_LONG; > > + const unsigned long space = BITS_PER_LONG - offset; > > + unsigned long value_low, value_high; > > + > > + if (space >= nbits) > > + return (map[index] >> offset) & GENMASK(nbits - 1, 0); > > + value_low = map[index] & BITMAP_FIRST_WORD_MASK(start); > > + value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits); > > + return (value_low >> offset) | (value_high << space); > > +} > > When nbits == 0, copy-like functions shouldn't touch any memory. See how > other bitmap and find_bit functions hold it. I think this is different from what other bitmap functions do, but it should be enough to bail out on !nbits, i.e.: if (!nbits) return 0; You probably meant adding a __builtin_constant_p() (which is used all over the place in bitmap.h), but: - the compiler won't have problem optimizing away the code for a constant nbits=0; - we anyway need a dynamic check for the case nbits is not constant (for both bitmap_get_value() and bitmap_set_value(), I assume). What do you think?
On Tue, Jul 18, 2023 at 11:30:00AM +0200, Alexander Potapenko wrote: > On Mon, Jul 17, 2023 at 5:51 PM Yury Norov <yury.norov@gmail.com> wrote: > > On Mon, Jul 17, 2023 at 01:37:04PM +0200, Alexander Potapenko wrote: ... > > When nbits == 0, copy-like functions shouldn't touch any memory. See how > > other bitmap and find_bit functions hold it. > > I think this is different from what other bitmap functions do, but it > should be enough to bail out on !nbits, i.e.: > > if (!nbits) > return 0; > > You probably meant adding a __builtin_constant_p() (which is used all > over the place in bitmap.h), but: > - the compiler won't have problem optimizing away the code for a > constant nbits=0; > - we anyway need a dynamic check for the case nbits is not constant > (for both bitmap_get_value() and bitmap_set_value(), I assume). > > What do you think? The idea behind is to eliminate the code completely for the cases nbits != 0. In your case the dynamic check will be there. That's what we want to avoid.
On Tue, Jul 18, 2023 at 05:01:28PM +0300, Andy Shevchenko wrote: > On Tue, Jul 18, 2023 at 11:30:00AM +0200, Alexander Potapenko wrote: > > On Mon, Jul 17, 2023 at 5:51 PM Yury Norov <yury.norov@gmail.com> wrote: > > > On Mon, Jul 17, 2023 at 01:37:04PM +0200, Alexander Potapenko wrote: > > ... > > > > When nbits == 0, copy-like functions shouldn't touch any memory. See how > > > other bitmap and find_bit functions hold it. > > > > I think this is different from what other bitmap functions do, but it > > should be enough to bail out on !nbits, i.e.: > > > > if (!nbits) > > return 0; > > > > You probably meant adding a __builtin_constant_p() (which is used all > > over the place in bitmap.h), but: No, I didn't mean that. > > - the compiler won't have problem optimizing away the code for a > > constant nbits=0; Look at your code, having nbits == 0 in mind: const size_t index = BIT_WORD(start); const unsigned long offset = start % BITS_PER_LONG; const unsigned long space = BITS_PER_LONG - offset; unsigned long value_low, value_high; if (space >= nbits) // This is always the case return (map[index] >> offset) & GENMASK(nbits - 1, 0); ... ^^ ^^ Unconditional fetch Wshift-count-overflow Thanks to GENMASK() implementation, you'll be warned by GENMASK_INPUT_CHECK() if nbits is a compile-time variable. In case of runtime, it's a pure undef, not mentioning useless, expensive and dangerous fetch. > > - we anyway need a dynamic check for the case nbits is not constant > > (for both bitmap_get_value() and bitmap_set_value(), I assume). > > > > What do you think? I think that instead of speculations, it's better to cover nbits == 0 with the explicit tests for run- and compile-time. That way you're always on a safe side. bitmap_get_val(NULL, 0, 0) shouldn't crash the kernel. > The idea behind is to eliminate the code completely for the cases nbits != 0. > In your case the dynamic check will be there. That's what we want to avoid. Alexander is right - we can't avoid testing against 0 if we need to test for 0... In case of other functions we have inline and outline implementations, controlled by small_const_nbits(). As you can see, the small_const_nbits() tests against 0 explicitly, although it's free at compile time. But if nbits == 0, we pick outline version of a function regardless. On their turn, outline versions again do their test against nbits == 0, but most of the time implicitly. In case of bitmap_set_val, we are touching at max 2 words, and there's no reason for outline version, so we have to test nbits against 0 inside inline code. Having all that in mind, and because nbits == 0 is most likely an error we'd follow the following rules: - no memory must be touched as we're potentially in error condition, and pointer may be corrupted; - the cost of the check must be as minimal as possible. So I suggest: if (unlikely(nbits == 0)) return; For readers that would literally mean: we don't expect that, and we find it suspicious, but we'll handle that as correct as we can. By the way, Alexander, please drop that 'const' things. Those are for pointers or some global variables, not for inline functions with 4 lines of code. (If you think it helps the code to be safe than no - it's unsafe even with consts.) Thanks, Yury
On Tue, Jul 18, 2023 at 10:03:25AM -0700, Yury Norov wrote: > On Tue, Jul 18, 2023 at 05:01:28PM +0300, Andy Shevchenko wrote: > > On Tue, Jul 18, 2023 at 11:30:00AM +0200, Alexander Potapenko wrote: ... > > The idea behind is to eliminate the code completely for the cases nbits != 0. > > In your case the dynamic check will be there. That's what we want to avoid. > > Alexander is right - we can't avoid testing against 0 if we need to > test for 0... In case of other functions we have inline and outline > implementations, controlled by small_const_nbits(). > > As you can see, the small_const_nbits() tests against 0 explicitly, > although it's free at compile time. But if nbits == 0, we pick > outline version of a function regardless. > > On their turn, outline versions again do their test against nbits == 0, > but most of the time implicitly. > > In case of bitmap_set_val, we are touching at max 2 words, and there's > no reason for outline version, so we have to test nbits against 0 > inside inline code. > > Having all that in mind, and because nbits == 0 is most likely an > error we'd follow the following rules: > - no memory must be touched as we're potentially in error condition, > and pointer may be corrupted; > - the cost of the check must be as minimal as possible. > > So I suggest: > > if (unlikely(nbits == 0)) > return; > > For readers that would literally mean: we don't expect that, and we find > it suspicious, but we'll handle that as correct as we can. Okay, thank you for elaborated answer.
> > Thanks to GENMASK() implementation, you'll be warned by GENMASK_INPUT_CHECK() > if nbits is a compile-time variable. In case of runtime, it's a pure undef, > not mentioning useless, expensive and dangerous fetch. > > > > - we anyway need a dynamic check for the case nbits is not constant > > > (for both bitmap_get_value() and bitmap_set_value(), I assume). > > > > > > What do you think? > > I think that instead of speculations, it's better to cover nbits == 0 > with the explicit tests for run- and compile-time. That way you're > always on a safe side. You are right. I added tests for these cases. > bitmap_get_val(NULL, 0, 0) shouldn't crash the kernel. Haha, the compiler is smart enough to not crash the kernel in this case. But passing zero via a volatile variable did the trick. > > > The idea behind is to eliminate the code completely for the cases nbits != 0. > > In your case the dynamic check will be there. That's what we want to avoid. > > Alexander is right - we can't avoid testing against 0 if we need to > test for 0... In case of other functions we have inline and outline > implementations, controlled by small_const_nbits(). > > As you can see, the small_const_nbits() tests against 0 explicitly, > although it's free at compile time. But if nbits == 0, we pick > outline version of a function regardless. > > On their turn, outline versions again do their test against nbits == 0, > but most of the time implicitly. > > In case of bitmap_set_val, we are touching at max 2 words, and there's > no reason for outline version, so we have to test nbits against 0 > inside inline code. > > Having all that in mind, and because nbits == 0 is most likely an > error we'd follow the following rules: > - no memory must be touched as we're potentially in error condition, > and pointer may be corrupted; > - the cost of the check must be as minimal as possible. > > So I suggest: > > if (unlikely(nbits == 0)) > return; Sounds good, I'll add unlikely() around the check. Thanks for the explanation! > > For readers that would literally mean: we don't expect that, and we find > it suspicious, but we'll handle that as correct as we can. > > By the way, Alexander, please drop that 'const' things. Those are for > pointers or some global variables, not for inline functions with 4 > lines of code. (If you think it helps the code to be safe than no - it's > unsafe even with consts.) These consts are from the original Syed's patch and were probably added for consistency with bitmap_{set,get}_value8(). But, okay, I'll remove them. -- Alexander Potapenko Software Engineer Google Germany GmbH Erika-Mann-Straße, 33 80636 München Geschäftsführer: Paul Manicle, Liana Sebastian Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h index 03644237e1efb..4559366084988 100644 --- a/include/linux/bitmap.h +++ b/include/linux/bitmap.h @@ -76,7 +76,11 @@ struct device; * bitmap_to_arr32(buf, src, nbits) Copy nbits from buf to u32[] dst * bitmap_to_arr64(buf, src, nbits) Copy nbits from buf to u64[] dst * bitmap_get_value8(map, start) Get 8bit value from map at start + * bitmap_get_value(map, start, nbits) Get bit value of size 'nbits' + * from map at start * bitmap_set_value8(map, value, start) Set 8bit value to map at start + * bitmap_set_value(map, value, start, nbits) Set bit value of size 'nbits' + * of map at start * * Note, bitmap_zero() and bitmap_fill() operate over the region of * unsigned longs, that is, bits behind bitmap till the unsigned long @@ -583,6 +587,31 @@ static inline unsigned long bitmap_get_value8(const unsigned long *map, return (map[index] >> offset) & 0xFF; } +/** + * bitmap_get_value - get a value of n-bits from the memory region + * @map: address to the bitmap memory region + * @start: bit offset of the n-bit value + * @nbits: size of value in bits + * + * Returns value of nbits located at the @start bit offset within the @map + * memory region. + */ +static inline unsigned long bitmap_get_value(const unsigned long *map, + unsigned long start, + unsigned long nbits) +{ + const size_t index = BIT_WORD(start); + const unsigned long offset = start % BITS_PER_LONG; + const unsigned long space = BITS_PER_LONG - offset; + unsigned long value_low, value_high; + + if (space >= nbits) + return (map[index] >> offset) & GENMASK(nbits - 1, 0); + value_low = map[index] & BITMAP_FIRST_WORD_MASK(start); + value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits); + return (value_low >> offset) | (value_high << space); +} + /** * bitmap_set_value8 - set an 8-bit value within a memory region * @map: address to the bitmap memory region @@ -599,6 +628,34 @@ static inline void bitmap_set_value8(unsigned long *map, unsigned long value, map[index] |= value << offset; } +/** + * bitmap_set_value - set n-bit value within a memory region + * @map: address to the bitmap memory region + * @value: value of nbits + * @start: bit offset of the n-bit value + * @nbits: size of value in bits + */ +static inline void bitmap_set_value(unsigned long *map, + unsigned long value, + unsigned long start, unsigned long nbits) +{ + const size_t index = BIT_WORD(start); + const unsigned long offset = start % BITS_PER_LONG; + const unsigned long space = BITS_PER_LONG - offset; + + value &= GENMASK(nbits - 1, 0); + + if (space >= nbits) { + map[index] &= ~(GENMASK(nbits + offset - 1, offset)); + map[index] |= value << offset; + return; + } + map[index] &= ~BITMAP_FIRST_WORD_MASK(start); + map[index] |= value << offset; + map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits); + map[index + 1] |= (value >> space); +} + #endif /* __ASSEMBLY__ */ #endif /* __LINUX_BITMAP_H */