Message ID | 20231006134529.2816540-2-glider@google.com |
---|---|
State | New |
Date | Fri, 6 Oct 2023 15:45:25 +0200 |
Subject | [PATCH v6 1/5] lib/bitmap: add bitmap_{read,write}() |
From | Alexander Potapenko <glider@google.com> |
To | glider@google.com, catalin.marinas@arm.com, will@kernel.org, pcc@google.com, andreyknvl@gmail.com, andriy.shevchenko@linux.intel.com, aleksander.lobakin@intel.com, linux@rasmusvillemoes.dk, yury.norov@gmail.com |
Cc | linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, eugenis@google.com, syednwaris@gmail.com, william.gray@linaro.org, Arnd Bergmann <arnd@arndb.de> |
Series | Implement MTE tag compression for swapped pages |
Commit Message
Alexander Potapenko
Oct. 6, 2023, 1:45 p.m. UTC
From: Syed Nayyar Waris <syednwaris@gmail.com>

The two new functions allow reading/writing values of length up to
BITS_PER_LONG bits at an arbitrary position in the bitmap.

The code was taken from "bitops: Introduce the for_each_set_clump macro"
by Syed Nayyar Waris, with a number of changes and simplifications:
 - instead of using roundup(), which adds an unnecessary dependency
   on <linux/math.h>, we calculate space as BITS_PER_LONG - offset;
 - indentation is reduced by not using else-clauses (suggested by
   checkpatch for bitmap_get_value());
 - bitmap_get_value()/bitmap_set_value() are renamed to bitmap_read()
   and bitmap_write();
 - some redundant computations are omitted.

Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Syed Nayyar Waris <syednwaris@gmail.com>
Signed-off-by: William Breathitt Gray <william.gray@linaro.org>
Link: https://lore.kernel.org/lkml/fe12eedf3666f4af5138de0e70b67a07c7f40338.1592224129.git.syednwaris@gmail.com/
Suggested-by: Yury Norov <yury.norov@gmail.com>
Co-developed-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Alexander Potapenko <glider@google.com>

---
This patch was previously called "lib/bitmap: add bitmap_{set,get}_value()"
(https://lore.kernel.org/lkml/20230720173956.3674987-2-glider@google.com/)

v6:
 - As suggested by Yury Norov, do not require bitmap_read(..., 0) to
   return 0.

v5:
 - Address comments by Yury Norov:
   - updated code comments and patch title/description
   - replace GENMASK(nbits - 1, 0) with BITMAP_LAST_WORD_MASK(nbits)
   - more compact bitmap_write() implementation

v4:
 - Address comments by Andy Shevchenko and Yury Norov:
   - prevent passing values >= 64 to GENMASK()
   - fix commit authorship
   - change comments
   - check for unlikely(nbits == 0)
   - drop unnecessary const declarations
   - fix kernel-doc comments
   - rename bitmap_{get,set}_value() to bitmap_{read,write}()
---
 include/linux/bitmap.h | 68 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)
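As a quick orientation before the review thread, here is a minimal usage
sketch of the two new helpers (illustrative only; not part of the patch):

    #include <linux/bitmap.h>
    #include <linux/printk.h>

    /* Illustrative only: pack and unpack a 5-bit field at bit offset 3. */
    static void bitmap_rw_example(void)
    {
            DECLARE_BITMAP(map, 64);
            unsigned long v;

            bitmap_zero(map, 64);
            bitmap_write(map, 0x15, 3, 5);  /* write 0b10101 into bits [3, 8) */
            v = bitmap_read(map, 3, 5);     /* v == 0x15 */
            pr_info("read back: %#lx\n", v);
    }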
Comments
On Fri, Oct 06, 2023 at 03:45:25PM +0200, Alexander Potapenko wrote:
> From: Syed Nayyar Waris <syednwaris@gmail.com>
>
> The two new functions allow reading/writing values of length up to
> BITS_PER_LONG bits at an arbitrary position in the bitmap.
[...]
> v6:
>  - As suggested by Yury Norov, do not require bitmap_read(..., 0) to
>    return 0.

Hmm... See below.

...

> * bitmap_to_arr32(buf, src, nbits)            Copy nbits from buf to u32[] dst
> * bitmap_to_arr64(buf, src, nbits)            Copy nbits from buf to u64[] dst

With the grouping as below I would add a blank line here. But was the
intention to group the _arrXX() helpers into these groups?

> * bitmap_get_value8(map, start)               Get 8bit value from map at start
> + * bitmap_read(map, start, nbits)              Read an nbits-sized value from
> + *                                             map at start
> * bitmap_set_value8(map, value, start)        Set 8bit value to map at start
> + * bitmap_write(map, value, start, nbits)      Write an nbits-sized value to
> + *                                             map at start

...

> +static inline unsigned long bitmap_read(const unsigned long *map,
> +					unsigned long start,
> +					unsigned long nbits)
> +{
> +	size_t index = BIT_WORD(start);
> +	unsigned long offset = start % BITS_PER_LONG;
> +	unsigned long space = BITS_PER_LONG - offset;
> +	unsigned long value_low, value_high;
> +
> +	if (unlikely(!nbits))
> +		return 0;

Hmm... I didn't get: was the comment to add or to remove these checks?

> +	if (space >= nbits)
> +		return (map[index] >> offset) & GENMASK(nbits - 1, 0);

And don't you want to replace this GENMASK() as well?

> +	value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
> +	value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
> +	return (value_low >> offset) | (value_high << space);
> +}
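An aside on the GENMASK() question above: over the range bitmap_read()
accepts, the two masks are interchangeable. A sketch verifying this,
assuming the macro definitions from include/linux/bits.h and
include/linux/bitmap.h (the check function itself is hypothetical):

    #include <linux/bitmap.h>
    #include <linux/bits.h>
    #include <linux/bug.h>
    #include <linux/init.h>

    /* BITMAP_LAST_WORD_MASK(nbits) expands to
     * ~0UL >> (-(nbits) & (BITS_PER_LONG - 1)), so for
     * nbits == BITS_PER_LONG the shift is 0 and the mask is ~0UL,
     * matching GENMASK(BITS_PER_LONG - 1, 0). */
    static void __init mask_equivalence_check(void)
    {
            unsigned long nbits;

            for (nbits = 1; nbits <= BITS_PER_LONG; nbits++)
                    WARN_ON(GENMASK(nbits - 1, 0) !=
                            BITMAP_LAST_WORD_MASK(nbits));
    }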
On Fri, Oct 06, 2023 at 05:47:49PM +0300, Andy Shevchenko wrote:
> On Fri, Oct 06, 2023 at 03:45:25PM +0200, Alexander Potapenko wrote:
> > From: Syed Nayyar Waris <syednwaris@gmail.com>
> >
> > The two new functions allow reading/writing values of length up to
> > BITS_PER_LONG bits at an arbitrary position in the bitmap.
[...]
> > v6:
> >  - As suggested by Yury Norov, do not require bitmap_read(..., 0) to
> >    return 0.
>
> Hmm... See below.

[...]

> > +	if (unlikely(!nbits))
> > +		return 0;
>
> Hmm... I didn't get: was the comment to add or to remove these checks?

The sentence relates to the test, and the comment that confused you should
go to the 2nd patch. I.e., bitmap_read(..., 0) is not required to return 0;
it's purely an implementation detail.

> > +	if (space >= nbits)
> > +		return (map[index] >> offset) & GENMASK(nbits - 1, 0);
>
> And don't you want to replace this GENMASK() as well?

+1

> > +	value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
> > +	value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
> > +	return (value_low >> offset) | (value_high << space);
> > +}
>
> --
> With Best Regards,
> Andy Shevchenko
On Fri, Oct 06, 2023 at 03:45:25PM +0200, Alexander Potapenko wrote:
> From: Syed Nayyar Waris <syednwaris@gmail.com>
>
> The two new functions allow reading/writing values of length up to
> BITS_PER_LONG bits at an arbitrary position in the bitmap.
[...]
> +/**
> + * bitmap_write - write n-bit value within a memory region
> + * @map: address to the bitmap memory region
> + * @value: value to write, clamped to nbits
> + * @start: bit offset of the n-bit value
> + * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG.
> + *
> + * bitmap_write() behaves similarly to @nbits calls of assign_bit(), i.e. bits
> + * beyond @nbits are ignored:
> + *
> + *   for (bit = 0; bit < nbits; bit++)
> + *           assign_bit(start + bit, bitmap, val & BIT(bit));

__assign_bit()

> + */

'behaves similarly' sounds like an understatement. I think it behaves much
faster, because it can assign up to 64 bits at once, not to mention the
reduced pressure on cache-line traffic.

How much faster - that's a good question. I'd be really pleased if you add
a performance test for bitmap_write/read. Or I can do it myself later. You
can find examples in the same lib/test_bitmap.c.

> +static inline void bitmap_write(unsigned long *map,
> +				unsigned long value,
> +				unsigned long start, unsigned long nbits)
> +{
> +	size_t index = BIT_WORD(start);
> +	unsigned long offset = start % BITS_PER_LONG;
> +	unsigned long space = BITS_PER_LONG - offset;
> +	unsigned long mask;
> +
> +	if (unlikely(!nbits))
> +		return;

Can you please add more empty lines to separate blocks visually?

> +	mask = BITMAP_LAST_WORD_MASK(nbits);
> +	value &= mask;
> +	if (space >= nbits) {
> +		map[index] &= ~(mask << offset);
> +		map[index] |= value << offset;
> +		return;
> +	}
> +	map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
> +	map[index] |= value << offset;
> +	map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> +	map[index + 1] |= (value >> space);
> +}

I compiled the below fix on a sparc64 BE machine:

--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -608,7 +608,7 @@ static inline unsigned long bitmap_read(const unsigned long *map,
 	if (unlikely(!nbits))
 		return 0;
 	if (space >= nbits)
-		return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+		return (map[index] >> offset) & BITMAP_LAST_WORD_MASK(nbits);
 	value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
 	value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
 	return (value_low >> offset) | (value_high << space);
@@ -661,9 +661,9 @@ static inline void bitmap_write(unsigned long *map,
 		map[index] |= value << offset;
 		return;
 	}
-	map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
+	map[index] &= BITMAP_LAST_WORD_MASK(start);
 	map[index] |= value << offset;
-	map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
+	map[index + 1] &= BITMAP_FIRST_WORD_MASK(start + nbits);
 	map[index + 1] |= (value >> space);
 }

All the tests pass just as before, and there's no difference reported by
bloat-o-meter. Can you please use the non-negated versions, as they are
more straightforward?

> +
>  #endif /* __ASSEMBLY__ */
>
>  #endif /* __LINUX_BITMAP_H */
> --
> 2.42.0.609.gbb76f46606-goog
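An aside on why the mask swap above preserves behavior: the thread does not
spell this out, so the following is our analysis, assuming
BITS_PER_LONG == 64 and the macro definitions from include/linux/bitmap.h:

    /*
     * With offset = start % 64:
     *
     *   ~BITMAP_FIRST_WORD_MASK(start) == ~(~0UL << offset)          -> bits [0, offset)
     *    BITMAP_LAST_WORD_MASK(start)  == ~0UL >> (-(start) & 63)    -> bits [0, offset)
     *
     * The two differ only at offset == 0 (0 vs. ~0UL). But the cross-word
     * path is reached only when space = 64 - offset < nbits <= 64, which
     * forces offset != 0. Likewise, (start + nbits) % 64 == nbits - space,
     * which is at least 1 in this path, so the second pair of masks also
     * agrees - which is why all tests still pass after the swap.
     */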
On Fri, Oct 6, 2023 at 4:48 PM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Fri, Oct 06, 2023 at 03:45:25PM +0200, Alexander Potapenko wrote:
> > From: Syed Nayyar Waris <syednwaris@gmail.com>
[...]
> > * bitmap_to_arr32(buf, src, nbits)            Copy nbits from buf to u32[] dst
> > * bitmap_to_arr64(buf, src, nbits)            Copy nbits from buf to u64[] dst
>
> With the grouping as below I would add a blank line here. But was the
> intention to group the _arrXX() helpers into these groups?

Note that there's not a single blank line in this long list. What if I
swapped bitmap_read() with bitmap_set_value8() - would the grouping become
more logical? I.e.:

 * bitmap_get_value8(map, start)               Get 8bit value from map at start
 * bitmap_set_value8(map, value, start)        Set 8bit value to map at start
 * bitmap_read(map, start, nbits)              Read an nbits-sized value from
 *                                             map at start
 * bitmap_write(map, value, start, nbits)      Write an nbits-sized value to
 *                                             map at start

> > +	if (unlikely(!nbits))
> > +		return 0;
>
> Hmm... I didn't get: was the comment to add or to remove these checks?

As Yury said, we should not require the return value to be 0, so I added
"nonzero" to the description of the @nbits parameter. The check stays in
place, but relying on it is now considered a mistake.

> > +	if (space >= nbits)
> > +		return (map[index] >> offset) & GENMASK(nbits - 1, 0);
>
> And don't you want to replace this GENMASK() as well?

See my next reply to Yury; tl;dr this is a stale code version :(
> > + *
> > + * for (bit = 0; bit < nbits; bit++)
> > + *         assign_bit(start + bit, bitmap, val & BIT(bit));
>
> __assign_bit()

Ack

> > + */
>
> 'behaves similarly' sounds like an understatement. I think it behaves
> much faster, because it can assign up to 64 bits at once, not to mention
> the reduced pressure on cache-line traffic.

My intent was to describe the visible behavior; of course the generated
code is better, and the number of memory accesses is lower.

How about the following description:

 * The result of bitmap_write() is similar to @nbits calls of assign_bit(),
 * i.e. bits beyond @nbits are ignored:
 *
 *   for (bit = 0; bit < nbits; bit++)
 *           assign_bit(start + bit, bitmap, val & BIT(bit));

?

> How much faster - that's a good question. I'd be really pleased if you
> add a performance test for bitmap_write/read. Or I can do it myself
> later. You can find examples in the same lib/test_bitmap.c.

I can add two separate functions doing some bitmap_read and bitmap_write
calls in a loop to measure their performance independently - along the
lines of what you did here:
https://lore.kernel.org/lkml/ZL9X0TZb%2FQhCbEiC@yury-ThinkPad/

> > +	if (unlikely(!nbits))
> > +		return;
>
> Can you please add more empty lines to separate blocks visually?

Sure, will do.

> > +	mask = BITMAP_LAST_WORD_MASK(nbits);
> > +	value &= mask;
> > +	if (space >= nbits) {
> > +		map[index] &= ~(mask << offset);
> > +		map[index] |= value << offset;
> > +		return;
> > +	}
> > +	map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
> > +	map[index] |= value << offset;
> > +	map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > +	map[index + 1] |= (value >> space);
> > +}
>
> I compiled the below fix on a sparc64 BE machine:
[...]
> All the tests pass just as before, and there's no difference reported
> by bloat-o-meter. Can you please use the non-negated versions, as they
> are more straightforward?

I am terribly sorry for this, but this code version is completely different
from what we've discussed here:
https://lore.kernel.org/lkml/CAG_fn=VrPJj6YowHNki5RGAAs8qvwZpUVN4K9qw=cf4aW7Qw9A@mail.gmail.com/

The correct version roughly looks as follows:

void bitmap_write(unsigned long *map, unsigned long value,
		  unsigned long start, unsigned long nbits)
{
	unsigned long offset, space, mask;
	size_t index;
	bool fit;

	if (unlikely(!nbits))
		return;

	mask = BITMAP_LAST_WORD_MASK(nbits);
	value &= mask;
	offset = start % BITS_PER_LONG;
	space = BITS_PER_LONG - offset;
	index = BIT_WORD(start);
	fit = space >= nbits;

	map[index] &= (fit ? (~(mask << offset)) : ~BITMAP_FIRST_WORD_MASK(start));
	map[index] |= value << offset;
	if (fit)
		return;

	map[index + 1] &= BITMAP_FIRST_WORD_MASK(start + nbits);
	map[index + 1] |= (value >> space);
}

Note that here replacing ~BITMAP_FIRST_WORD_MASK() with
BITMAP_LAST_WORD_MASK() costs us 25 bytes (116 bytes with
~BITMAP_FIRST_WORD_MASK() vs. 141 bytes without).
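For reference, the measurement Yury asks for could look roughly like the
existing timing loops in lib/test_bitmap.c. A hypothetical sketch - the
function name, bitmap size, and loop counts below are illustrative, not
from the thread:

    #include <linux/bitmap.h>
    #include <linux/init.h>
    #include <linux/ktime.h>
    #include <linux/printk.h>

    #define PERF_BITMAP_BITS 1024  /* illustrative size */

    static DECLARE_BITMAP(perf_bitmap, PERF_BITMAP_BITS);

    static void __init test_bitmap_read_perf(void)
    {
            unsigned long checksum = 0;
            unsigned int start, nbits;
            ktime_t time;
            int i;

            time = ktime_get();
            for (i = 0; i < 1000; i++)
                    for (start = 0; start < PERF_BITMAP_BITS - BITS_PER_LONG; start++)
                            for (nbits = 1; nbits <= BITS_PER_LONG; nbits++)
                                    checksum ^= bitmap_read(perf_bitmap, start, nbits);
            time = ktime_get() - time;

            /* The checksum keeps the compiler from optimizing the loop away. */
            pr_err("read_perf: %llu ns, checksum %lx\n",
                   (unsigned long long)time, checksum);
    }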
> > > +	if (space >= nbits)
> > > +		return (map[index] >> offset) & GENMASK(nbits - 1, 0);
> >
> > And don't you want to replace this GENMASK() as well?
>
> See my next reply to Yury; tl;dr this is a stale code version :(

Actually, no - we haven't discussed improvements to bitmap_read(), and
this is the correct version. I'll fix this one.
On 10/10/2023 11.17, Alexander Potapenko wrote:
>> 'behaves similarly' sounds like an understatement. I think it behaves
>> much faster, because it can assign up to 64 bits at once, not to mention
>> the reduced pressure on cache-line traffic.
>
> My intent was to describe the visible behavior; of course the generated
> code is better, and the number of memory accesses is lower.
>
> How about the following description:
>
> * The result of bitmap_write() is similar to @nbits calls of assign_bit(),
> * i.e. bits beyond @nbits are ignored:
> *
> *   for (bit = 0; bit < nbits; bit++)
> *           assign_bit(start + bit, bitmap, val & BIT(bit));
>
> ?

C programmers should know the meaning of the term "as-if". Perhaps use
that. Something like: "bitmap_write() behaves as-if implemented as @nbits
calls of __assign_bit()".

Rasmus
On Tue, Oct 10, 2023 at 1:03 PM Rasmus Villemoes
<linux@rasmusvillemoes.dk> wrote:
>
> On 10/10/2023 11.17, Alexander Potapenko wrote:
[...]
> C programmers should know the meaning of the term "as-if". Perhaps use
> that. Something like: "bitmap_write() behaves as-if implemented as @nbits
> calls of __assign_bit()".

Good idea, I'll go with this one. Thank you!
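For completeness, the kernel-doc implied by this subthread would read
roughly as follows - a sketch of the agreed wording, not the final
committed text:

    /**
     * bitmap_write - write n-bit value within a memory region
     * @map: address to the bitmap memory region
     * @value: value to write, clamped to nbits
     * @start: bit offset of the n-bit value
     * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG
     *
     * bitmap_write() behaves as-if implemented as @nbits calls of
     * __assign_bit(), i.e. bits beyond @nbits are ignored:
     *
     *   for (bit = 0; bit < nbits; bit++)
     *           __assign_bit(start + bit, bitmap, val & BIT(bit));
     */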
Patch

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 03644237e1efb..e72c054d21d48 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -76,7 +76,11 @@ struct device;
  * bitmap_to_arr32(buf, src, nbits)            Copy nbits from buf to u32[] dst
  * bitmap_to_arr64(buf, src, nbits)            Copy nbits from buf to u64[] dst
  * bitmap_get_value8(map, start)               Get 8bit value from map at start
+ * bitmap_read(map, start, nbits)              Read an nbits-sized value from
+ *                                             map at start
  * bitmap_set_value8(map, value, start)        Set 8bit value to map at start
+ * bitmap_write(map, value, start, nbits)      Write an nbits-sized value to
+ *                                             map at start
  *
  * Note, bitmap_zero() and bitmap_fill() operate over the region of
  * unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -583,6 +587,33 @@ static inline unsigned long bitmap_get_value8(const unsigned long *map,
 	return (map[index] >> offset) & 0xFF;
 }
 
+/**
+ * bitmap_read - read a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG
+ *
+ * Returns: value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+static inline unsigned long bitmap_read(const unsigned long *map,
+					unsigned long start,
+					unsigned long nbits)
+{
+	size_t index = BIT_WORD(start);
+	unsigned long offset = start % BITS_PER_LONG;
+	unsigned long space = BITS_PER_LONG - offset;
+	unsigned long value_low, value_high;
+
+	if (unlikely(!nbits))
+		return 0;
+	if (space >= nbits)
+		return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+	value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+	value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
+	return (value_low >> offset) | (value_high << space);
+}
+
 /**
  * bitmap_set_value8 - set an 8-bit value within a memory region
  * @map: address to the bitmap memory region
@@ -599,6 +630,43 @@ static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
 	map[index] |= value << offset;
 }
 
+/**
+ * bitmap_write - write n-bit value within a memory region
+ * @map: address to the bitmap memory region
+ * @value: value to write, clamped to nbits
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits, nonzero, up to BITS_PER_LONG.
+ *
+ * bitmap_write() behaves similarly to @nbits calls of assign_bit(), i.e. bits
+ * beyond @nbits are ignored:
+ *
+ *   for (bit = 0; bit < nbits; bit++)
+ *           assign_bit(start + bit, bitmap, val & BIT(bit));
+ */
+static inline void bitmap_write(unsigned long *map,
+				unsigned long value,
+				unsigned long start, unsigned long nbits)
+{
+	size_t index = BIT_WORD(start);
+	unsigned long offset = start % BITS_PER_LONG;
+	unsigned long space = BITS_PER_LONG - offset;
+	unsigned long mask;
+
+	if (unlikely(!nbits))
+		return;
+	mask = BITMAP_LAST_WORD_MASK(nbits);
+	value &= mask;
+	if (space >= nbits) {
+		map[index] &= ~(mask << offset);
+		map[index] |= value << offset;
+		return;
+	}
+	map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
+	map[index] |= value << offset;
+	map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
+	map[index + 1] |= (value >> space);
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __LINUX_BITMAP_H */
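To make the clamping and cross-word semantics concrete, an illustrative
snippet (not part of the patch; the map size, offsets, and values are ours):

    #include <linux/bitmap.h>
    #include <linux/bug.h>

    static void bitmap_rw_demo(void)
    {
            DECLARE_BITMAP(map, 128);

            bitmap_zero(map, 128);

            /* An 11-bit value at offset 60 spans map[0] and map[1] on 64-bit. */
            bitmap_write(map, 0x5A5UL, 60, 11);
            WARN_ON(bitmap_read(map, 60, 11) != 0x5A5UL);

            /* Bits of @value beyond @nbits are ignored: 0x1FFF clamps to 0xFFF. */
            bitmap_write(map, 0x1FFFUL, 60, 12);
            WARN_ON(bitmap_read(map, 60, 12) != 0xFFFUL);
    }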