Message ID | 20230926052007.3917389-6-andriy.shevchenko@linux.intel.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:cae8:0:b0:403:3b70:6f57 with SMTP id r8csp1996872vqu; Tue, 26 Sep 2023 08:22:37 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEjd6tQGyIC4/T2h9zAmwugVpGVDZbFJpih7ezPPfHF/RoaXsMSR5loseWGJW2i37b8tRCR X-Received: by 2002:a17:90b:1287:b0:274:6d4e:e0d9 with SMTP id fw7-20020a17090b128700b002746d4ee0d9mr9264141pjb.45.1695741757119; Tue, 26 Sep 2023 08:22:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695741757; cv=none; d=google.com; s=arc-20160816; b=IdyScCcEeL61qw0NucsY8vK4WBlQzZHD7q6QMH1ErFarPlu1VbyggeVI2wfGgzIzwo buqD0oNKVOC6sq2gFR13PH8rXHACawIFrIdM4JsD5/ooK9a7KB5XnaczE7sKHvYlaK4Q 3IeawwPZp49gSryu0IE2la2psvPVOD5iguf1vXhEp2/4O6YZoHiWjq+0xBkHsb6z/0Sm DzMQ+8Wq1Tju1qk+84/g+uqRQH1bJcyBdap3mEFcCJsH/sH60FXpvK0HuOtTJryCiQPY VC0pjRVmQSy0Fpu84lkDf7Z6qOl9BNYcanKFWLDSJdEinVEVFpkkaZjuI2OvgXVew98c GtsQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=5A548O0tLyQT1QoOeHI3rcVy8RDU676utBgf7/p9rTA=; fh=vW1oNT0lvWbI8AMX2u/BZGKXD0nTZmrv2xM94BgUfog=; b=q/7hiY5HKnAkbxPPRnWWUVIT8GaQOpPka8C6xAdbHOAeHEY+voARkKWbCPRPFUFnCI oDEEBsuWo13Kz5CdGxRzt4aZVmrwTI5OsXGMtGcpXWZY644ZcbyfXGWXbcyxfEhglhGt 8wCuifE5XpZtiZjhgake6jQc433wrFk/XarmZOxG4x7Gku6nnzjZlJVAdlSbSDMU0wqV CnXv66i92SQtpK2ywzS4lizxD5TNMt04puUMPloDOO9vNOPOCYjJChtkqVkFlqEB+7FY xPONa1kj3n2XZ5R9PBForKK9oMiONjdE+H+NeiA0sfQP+MpbwVJ9RB2bF1dQyUjBTtLh 7P8Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=bqkT8gDq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.31 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from morse.vger.email (morse.vger.email. [23.128.96.31]) by mx.google.com with ESMTPS id lx4-20020a17090b4b0400b002766354b7aesi16781491pjb.166.2023.09.26.08.22.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 26 Sep 2023 08:22:37 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.31 as permitted sender) client-ip=23.128.96.31; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=bqkT8gDq; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.31 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by morse.vger.email (Postfix) with ESMTP id EB4A1826E325; Mon, 25 Sep 2023 22:21:19 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at morse.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233695AbjIZFU6 (ORCPT <rfc822;ruipengqi7@gmail.com> + 26 others); Tue, 26 Sep 2023 01:20:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45820 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232239AbjIZFUg (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 26 Sep 2023 01:20:36 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.136]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 279EEF2; Mon, 25 Sep 2023 22:20:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695705629; x=1727241629; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2HOZ+oim+zNWe8lFVHeNEs80ZdHaUduFZXUMLCdW3No=; b=bqkT8gDq5l9vXOhCHEgLdaTSreExf3hLjWetfKuF5mDvlLkMjzhjBDvg AIMYx2wJJMbAk2JE/8R2pey70Yl3Fgtq90qDT8Su8ov8f/CaGJuSVgxPT 810FWwAgcTLuQnr6P0MBwIHQKfhqfO8EFKuHPSz5cEfKLDCub1TbrMsti eg6EknMbrt5BE4zHo745ZXI+t0du5YUZBREMECiDeNlOOssrCM28jd4xp +3b7F+2RRrXUiV4Kkr+7EucffQXyrewKB2POEPrOnn9gQvK9mg3Croqz6 hbom/b3fvi+0NgoR+Dh0grVl2zfLYNnRmikv5uw9gNB2kSRZygzYvMp0x g==; X-IronPort-AV: E=McAfee;i="6600,9927,10843"; a="360865705" X-IronPort-AV: E=Sophos;i="6.03,177,1694761200"; d="scan'208";a="360865705" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Sep 2023 22:20:28 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10843"; a="748689685" X-IronPort-AV: E=Sophos;i="6.03,177,1694761200"; d="scan'208";a="748689685" Received: from black.fi.intel.com ([10.237.72.28]) by orsmga002.jf.intel.com with ESMTP; 25 Sep 2023 22:20:24 -0700 Received: by black.fi.intel.com (Postfix, from userid 1003) id 162DB133D; Tue, 26 Sep 2023 08:20:19 +0300 (EEST) From: Andy Shevchenko <andriy.shevchenko@linux.intel.com> To: Andy Shevchenko <andriy.shevchenko@linux.intel.com>, Linus Walleij <linus.walleij@linaro.org>, Bartosz Golaszewski <bartosz.golaszewski@linaro.org>, Yury Norov <yury.norov@gmail.com>, linux-gpio@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Cc: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com>, Srinivas Neeli <srinivas.neeli@amd.com>, Michal Simek <michal.simek@amd.com>, Bartosz Golaszewski <brgl@bgdev.pl>, Andy Shevchenko <andy@kernel.org>, Rasmus Villemoes <linux@rasmusvillemoes.dk>, =?utf-8?q?Marek_Beh=C3=BAn?= <kabel@kernel.org> Subject: [PATCH v1 5/5] gpiolib: cdev: Utilize more bitmap APIs Date: Tue, 26 Sep 2023 08:20:07 +0300 Message-Id: <20230926052007.3917389-6-andriy.shevchenko@linux.intel.com> X-Mailer: git-send-email 2.40.0.1.gaa8946217a0b In-Reply-To: <20230926052007.3917389-1-andriy.shevchenko@linux.intel.com> References: <20230926052007.3917389-1-andriy.shevchenko@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on morse.vger.email Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (morse.vger.email [0.0.0.0]); Mon, 25 Sep 2023 22:21:19 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1778114108975291142 X-GMAIL-MSGID: 1778114108975291142 |
Series |
bitmap: get rid of bitmap_remap() and bitmap_biremap() uses
|
|
Commit Message
Andy Shevchenko
Sept. 26, 2023, 5:20 a.m. UTC
Currently we have a few bitmap calls that are open coded in the library
module. Let's convert them to use generic bitmap APIs instead.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
drivers/gpio/gpiolib-cdev.c | 79 +++++++++++++++++--------------------
1 file changed, 36 insertions(+), 43 deletions(-)
Comments
On Tue, Sep 26, 2023 at 08:20:07AM +0300, Andy Shevchenko wrote: > Currently we have a few bitmap calls that are open coded in the library > module. Let's convert them to use generic bitmap APIs instead. > > Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> > --- > drivers/gpio/gpiolib-cdev.c | 79 +++++++++++++++++-------------------- > 1 file changed, 36 insertions(+), 43 deletions(-) > > diff --git a/drivers/gpio/gpiolib-cdev.c b/drivers/gpio/gpiolib-cdev.c > index e39d344feb28..a5bbbd44531f 100644 > --- a/drivers/gpio/gpiolib-cdev.c > +++ b/drivers/gpio/gpiolib-cdev.c > @@ -1263,35 +1263,32 @@ static long linereq_get_values(struct linereq *lr, void __user *ip) > { > struct gpio_v2_line_values lv; > DECLARE_BITMAP(vals, GPIO_V2_LINES_MAX); > + DECLARE_BITMAP(mask, GPIO_V2_LINES_MAX); > + DECLARE_BITMAP(bits, GPIO_V2_LINES_MAX); > struct gpio_desc **descs; > unsigned int i, didx, num_get; > - bool val; > int ret; > > /* NOTE: It's ok to read values of output lines. */ > if (copy_from_user(&lv, ip, sizeof(lv))) > return -EFAULT; > > - for (num_get = 0, i = 0; i < lr->num_lines; i++) { > - if (lv.mask & BIT_ULL(i)) { > - num_get++; > - descs = &lr->lines[i].desc; > - } > - } > + bitmap_from_arr64(mask, &lv.mask, GPIO_V2_LINES_MAX); > > + num_get = bitmap_weight(mask, lr->num_lines); > if (num_get == 0) > return -EINVAL; > > - if (num_get != 1) { > + if (num_get == 1) { > + descs = &lr->lines[find_first_bit(mask, lr->num_lines)].desc; > + } else { > descs = kmalloc_array(num_get, sizeof(*descs), GFP_KERNEL); > if (!descs) > return -ENOMEM; > - for (didx = 0, i = 0; i < lr->num_lines; i++) { > - if (lv.mask & BIT_ULL(i)) { > - descs[didx] = lr->lines[i].desc; > - didx++; > - } > - } > + > + didx = 0; > + for_each_set_bit(i, mask, lr->num_lines) > + descs[didx++] = lr->lines[i].desc; > } > ret = gpiod_get_array_value_complex(false, true, num_get, > descs, NULL, vals); > @@ -1301,19 +1298,15 @@ static long linereq_get_values(struct linereq *lr, void __user *ip) > if (ret) > return ret; > > - lv.bits = 0; > - for (didx = 0, i = 0; i < lr->num_lines; i++) { > - if (lv.mask & BIT_ULL(i)) { > - if (lr->lines[i].sw_debounced) > - val = debounced_value(&lr->lines[i]); > - else > - val = test_bit(didx, vals); > - if (val) > - lv.bits |= BIT_ULL(i); > - didx++; > - } > + bitmap_scatter(bits, vals, mask, lr->num_lines); > + > + for_each_set_bit(i, mask, lr->num_lines) { > + if (lr->lines[i].sw_debounced) > + __assign_bit(i, bits, debounced_value(&lr->lines[i])); > } > > + bitmap_to_arr64(&lv.bits, bits, GPIO_V2_LINES_MAX); > + > if (copy_to_user(ip, &lv, sizeof(lv))) > return -EFAULT; > > @@ -1324,35 +1317,35 @@ static long linereq_set_values_unlocked(struct linereq *lr, > struct gpio_v2_line_values *lv) > { > DECLARE_BITMAP(vals, GPIO_V2_LINES_MAX); > + DECLARE_BITMAP(mask, GPIO_V2_LINES_MAX); > + DECLARE_BITMAP(bits, GPIO_V2_LINES_MAX); > struct gpio_desc **descs; > unsigned int i, didx, num_set; > int ret; > > - bitmap_zero(vals, GPIO_V2_LINES_MAX); > - for (num_set = 0, i = 0; i < lr->num_lines; i++) { > - if (lv->mask & BIT_ULL(i)) { > - if (!test_bit(FLAG_IS_OUT, &lr->lines[i].desc->flags)) > - return -EPERM; > - if (lv->bits & BIT_ULL(i)) > - __set_bit(num_set, vals); > - num_set++; > - descs = &lr->lines[i].desc; > - } > - } > + bitmap_from_arr64(mask, &lv->mask, GPIO_V2_LINES_MAX); > + bitmap_from_arr64(bits, &lv->bits, GPIO_V2_LINES_MAX); > + > + num_set = bitmap_gather(vals, bits, mask, lr->num_lines); It looks like GPIO_V2_LINES_MAX is always 64, and so I wonder: is my understanding correct that all bits in ->mask and ->bits beyond lr->num_lines are clear? If so, you can seemingly pass the GPIO_V2_LINES_MAX instead of lr->num_lines, and that way it will be small_cons_nbits()-optimized. > if (num_set == 0) > return -EINVAL; > > - if (num_set != 1) { > + for_each_set_bit(i, mask, lr->num_lines) { > + if (!test_bit(FLAG_IS_OUT, &lr->lines[i].desc->flags)) > + return -EPERM; > + } > + > + if (num_set == 1) { > + descs = &lr->lines[find_first_bit(mask, lr->num_lines)].desc; > + } else { > /* build compacted desc array and values */ > descs = kmalloc_array(num_set, sizeof(*descs), GFP_KERNEL); > if (!descs) > return -ENOMEM; > - for (didx = 0, i = 0; i < lr->num_lines; i++) { > - if (lv->mask & BIT_ULL(i)) { > - descs[didx] = lr->lines[i].desc; > - didx++; > - } > - } > + > + didx = 0; > + for_each_set_bit(i, mask, lr->num_lines) > + descs[didx++] = lr->lines[i].desc; > } > ret = gpiod_set_array_value_complex(false, true, num_set, > descs, NULL, vals); > -- > 2.40.0.1.gaa8946217a0b
On Tue, Sep 26, 2023 at 08:20:07AM +0300, Andy Shevchenko wrote: > Currently we have a few bitmap calls that are open coded in the library > module. Let's convert them to use generic bitmap APIs instead. > Firstly, I didn't consider using the bitmap module here as, in my mind at least, that is intended for bitmaps wider than 64 bits, or with variable width. In this case the bitmap is fixed at 64 bits, so bitops seemed more appropriate. And I would argue that they aren't "open coded" - they are parallelized to reduce the number of passes over the bitmap. This change serialises them, e.g. the get used to require 2 passes over the bitmap, it now requires 3 or 4. The set used to require 1 and now requires 2. And there are additional copies that the original doesn't require. So your change looks less efficient to me - unless there is direct hardware support for bitmap ops?? Wrt the argument that the serialized form is clearer and more maintainable, optimised code is frequently more cryptic - as noted in bitmap.c itself, and this code has remained unchanged since it was merged 3 years ago, so the only maintenance it has required is to be more maintainable?? Ok then. Your patch is functionally equivalent and pass my uAPI tests, so Tested-by: Kent Gibson <warthog618@gmail.com> but my preference is to leave it as is. Cheers, Kent. > Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> > --- > drivers/gpio/gpiolib-cdev.c | 79 +++++++++++++++++-------------------- > 1 file changed, 36 insertions(+), 43 deletions(-) > > diff --git a/drivers/gpio/gpiolib-cdev.c b/drivers/gpio/gpiolib-cdev.c > index e39d344feb28..a5bbbd44531f 100644 > --- a/drivers/gpio/gpiolib-cdev.c > +++ b/drivers/gpio/gpiolib-cdev.c > @@ -1263,35 +1263,32 @@ static long linereq_get_values(struct linereq *lr, void __user *ip) > { > struct gpio_v2_line_values lv; > DECLARE_BITMAP(vals, GPIO_V2_LINES_MAX); > + DECLARE_BITMAP(mask, GPIO_V2_LINES_MAX); > + DECLARE_BITMAP(bits, GPIO_V2_LINES_MAX); > struct gpio_desc **descs; > unsigned int i, didx, num_get; > - bool val; > int ret; > > /* NOTE: It's ok to read values of output lines. */ > if (copy_from_user(&lv, ip, sizeof(lv))) > return -EFAULT; > > - for (num_get = 0, i = 0; i < lr->num_lines; i++) { > - if (lv.mask & BIT_ULL(i)) { > - num_get++; > - descs = &lr->lines[i].desc; > - } > - } > + bitmap_from_arr64(mask, &lv.mask, GPIO_V2_LINES_MAX); > > + num_get = bitmap_weight(mask, lr->num_lines); > if (num_get == 0) > return -EINVAL; > > - if (num_get != 1) { > + if (num_get == 1) { > + descs = &lr->lines[find_first_bit(mask, lr->num_lines)].desc; > + } else { > descs = kmalloc_array(num_get, sizeof(*descs), GFP_KERNEL); > if (!descs) > return -ENOMEM; > - for (didx = 0, i = 0; i < lr->num_lines; i++) { > - if (lv.mask & BIT_ULL(i)) { > - descs[didx] = lr->lines[i].desc; > - didx++; > - } > - } > + > + didx = 0; > + for_each_set_bit(i, mask, lr->num_lines) > + descs[didx++] = lr->lines[i].desc; > } > ret = gpiod_get_array_value_complex(false, true, num_get, > descs, NULL, vals); > @@ -1301,19 +1298,15 @@ static long linereq_get_values(struct linereq *lr, void __user *ip) > if (ret) > return ret; > > - lv.bits = 0; > - for (didx = 0, i = 0; i < lr->num_lines; i++) { > - if (lv.mask & BIT_ULL(i)) { > - if (lr->lines[i].sw_debounced) > - val = debounced_value(&lr->lines[i]); > - else > - val = test_bit(didx, vals); > - if (val) > - lv.bits |= BIT_ULL(i); > - didx++; > - } > + bitmap_scatter(bits, vals, mask, lr->num_lines); > + > + for_each_set_bit(i, mask, lr->num_lines) { > + if (lr->lines[i].sw_debounced) > + __assign_bit(i, bits, debounced_value(&lr->lines[i])); > } > > + bitmap_to_arr64(&lv.bits, bits, GPIO_V2_LINES_MAX); > + > if (copy_to_user(ip, &lv, sizeof(lv))) > return -EFAULT; > > @@ -1324,35 +1317,35 @@ static long linereq_set_values_unlocked(struct linereq *lr, > struct gpio_v2_line_values *lv) > { > DECLARE_BITMAP(vals, GPIO_V2_LINES_MAX); > + DECLARE_BITMAP(mask, GPIO_V2_LINES_MAX); > + DECLARE_BITMAP(bits, GPIO_V2_LINES_MAX); > struct gpio_desc **descs; > unsigned int i, didx, num_set; > int ret; > > - bitmap_zero(vals, GPIO_V2_LINES_MAX); > - for (num_set = 0, i = 0; i < lr->num_lines; i++) { > - if (lv->mask & BIT_ULL(i)) { > - if (!test_bit(FLAG_IS_OUT, &lr->lines[i].desc->flags)) > - return -EPERM; > - if (lv->bits & BIT_ULL(i)) > - __set_bit(num_set, vals); > - num_set++; > - descs = &lr->lines[i].desc; > - } > - } > + bitmap_from_arr64(mask, &lv->mask, GPIO_V2_LINES_MAX); > + bitmap_from_arr64(bits, &lv->bits, GPIO_V2_LINES_MAX); > + > + num_set = bitmap_gather(vals, bits, mask, lr->num_lines); > if (num_set == 0) > return -EINVAL; > > - if (num_set != 1) { > + for_each_set_bit(i, mask, lr->num_lines) { > + if (!test_bit(FLAG_IS_OUT, &lr->lines[i].desc->flags)) > + return -EPERM; > + } > + > + if (num_set == 1) { > + descs = &lr->lines[find_first_bit(mask, lr->num_lines)].desc; > + } else { > /* build compacted desc array and values */ > descs = kmalloc_array(num_set, sizeof(*descs), GFP_KERNEL); > if (!descs) > return -ENOMEM; > - for (didx = 0, i = 0; i < lr->num_lines; i++) { > - if (lv->mask & BIT_ULL(i)) { > - descs[didx] = lr->lines[i].desc; > - didx++; > - } > - } > + > + didx = 0; > + for_each_set_bit(i, mask, lr->num_lines) > + descs[didx++] = lr->lines[i].desc; > } > ret = gpiod_set_array_value_complex(false, true, num_set, > descs, NULL, vals); > -- > 2.40.0.1.gaa8946217a0b >
On Tue, Sep 26, 2023 at 05:46:07PM -0700, Yury Norov wrote: > On Tue, Sep 26, 2023 at 08:20:07AM +0300, Andy Shevchenko wrote: > > Currently we have a few bitmap calls that are open coded in the library > > module. Let's convert them to use generic bitmap APIs instead. > > > > Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> > > + bitmap_from_arr64(mask, &lv->mask, GPIO_V2_LINES_MAX); > > + bitmap_from_arr64(bits, &lv->bits, GPIO_V2_LINES_MAX); > > + > > + num_set = bitmap_gather(vals, bits, mask, lr->num_lines); > > It looks like GPIO_V2_LINES_MAX is always 64, and so I wonder: is > my understanding correct that all bits in ->mask and ->bits beyond > lr->num_lines are clear? > The lv fields come from userspace and so cannot be guaranteed to be zeroed beyond lr->num_lines. Any set bits beyond that must be ignored, one way or another. > If so, you can seemingly pass the GPIO_V2_LINES_MAX instead of > lr->num_lines, and that way it will be small_cons_nbits()-optimized. > But that would be decidedly non-optimal for the most common case where lr->num_lines == 1. Cheers, Kent.
On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote: > On Tue, Sep 26, 2023 at 08:20:07AM +0300, Andy Shevchenko wrote: > > Currently we have a few bitmap calls that are open coded in the library > > module. Let's convert them to use generic bitmap APIs instead. > > Firstly, I didn't consider using the bitmap module here as, in my mind at > least, that is intended for bitmaps wider than 64 bits, or with variable > width. In this case the bitmap is fixed at 64 bits, so bitops seemed more > appropriate. > > And I would argue that they aren't "open coded" - they are parallelized > to reduce the number of passes over the bitmap. > This change serialises them, e.g. the get used to require 2 passes over > the bitmap, it now requires 3 or 4. The set used to require 1 and now > requires 2. > And there are additional copies that the original doesn't require. > So your change looks less efficient to me - unless there is direct > hardware support for bitmap ops?? > > Wrt the argument that the serialized form is clearer and more > maintainable, optimised code is frequently more cryptic - as noted in > bitmap.c itself, and this code has remained unchanged since it was merged > 3 years ago, so the only maintenance it has required is to be more > maintainable?? Ok then. > > Your patch is functionally equivalent and pass my uAPI tests, so > > Tested-by: Kent Gibson <warthog618@gmail.com> Thanks for testing! > but my preference is to leave it as is. As Yury mentioned we need to look at bitmap APIs and make them possible to have a compile-time optimizations. With that in mind, I would prefer bitmap APIs over open-coded stuff which is hardly to be understood (yes, I still point out that it takes a few hours to me, maybe because I'm stupid enough, to get what's the heck is going one there, esp. for the == 1 case). Yet, it opens a way to scale this in case we might have v3 ABI that let's say allows to work with 512 GPIOs at a time. With your code it will be much harder to achieve and see what you wrote about maintenance (in that case).
On Wed, Sep 27, 2023 at 03:17:06PM +0300, Andy Shevchenko wrote: > On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote: > > On Tue, Sep 26, 2023 at 08:20:07AM +0300, Andy Shevchenko wrote: > > > Currently we have a few bitmap calls that are open coded in the library > > > module. Let's convert them to use generic bitmap APIs instead. > > > > Firstly, I didn't consider using the bitmap module here as, in my mind at > > least, that is intended for bitmaps wider than 64 bits, or with variable > > width. In this case the bitmap is fixed at 64 bits, so bitops seemed more > > appropriate. > > > > And I would argue that they aren't "open coded" - they are parallelized > > to reduce the number of passes over the bitmap. > > This change serialises them, e.g. the get used to require 2 passes over > > the bitmap, it now requires 3 or 4. The set used to require 1 and now > > requires 2. > > And there are additional copies that the original doesn't require. > > So your change looks less efficient to me - unless there is direct > > hardware support for bitmap ops?? > > > > Wrt the argument that the serialized form is clearer and more > > maintainable, optimised code is frequently more cryptic - as noted in > > bitmap.c itself, and this code has remained unchanged since it was merged > > 3 years ago, so the only maintenance it has required is to be more > > maintainable?? Ok then. > > > > Your patch is functionally equivalent and pass my uAPI tests, so > > > > Tested-by: Kent Gibson <warthog618@gmail.com> > > Thanks for testing! > Not a problem - that is what test suites are for. > > but my preference is to leave it as is. > > As Yury mentioned we need to look at bitmap APIs and make them possible to have > a compile-time optimizations. With that in mind, I would prefer bitmap APIs > over open-coded stuff which is hardly to be understood (yes, I still point > out that it takes a few hours to me, maybe because I'm stupid enough, to > get what's the heck is going one there, esp. for the == 1 case). > Really? With all the bits out in the open it seems pretty clear to me. Clearer than scatter/gather in fact. Sure, if there is suitable hardware support then bitmaps COULD be faster than bitops. But without that, and that is the general case, it will be slower. Do you have ANY cases where your implementation is currently faster? Then you would have a stronger case. And if you find the existing implementation unclear then the appropriate solution is to better document it, as bitmaps itself does, not replace it with something simpler and slower. > Yet, it opens a way to scale this in case we might have v3 ABI that let's say > allows to work with 512 GPIOs at a time. With your code it will be much harder > to achieve and see what you wrote about maintenance (in that case). > v3 ABI?? libgpiod v2 is barely out the door! Do you have any cases where 64 lines per request is limiting? If that sort of speculation isn't premature optimisation then I don't know what is. Cheers, Kent.
On Wed, Sep 27, 2023 at 09:49:35PM +0800, Kent Gibson wrote: > On Wed, Sep 27, 2023 at 03:17:06PM +0300, Andy Shevchenko wrote: > > On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote: > > > On Tue, Sep 26, 2023 at 08:20:07AM +0300, Andy Shevchenko wrote: > > > > Currently we have a few bitmap calls that are open coded in the library > > > > module. Let's convert them to use generic bitmap APIs instead. > > > > > > Firstly, I didn't consider using the bitmap module here as, in my mind at > > > least, that is intended for bitmaps wider than 64 bits, or with variable > > > width. In this case the bitmap is fixed at 64 bits, so bitops seemed more > > > appropriate. > > > > > > And I would argue that they aren't "open coded" - they are parallelized > > > to reduce the number of passes over the bitmap. > > > This change serialises them, e.g. the get used to require 2 passes over > > > the bitmap, it now requires 3 or 4. The set used to require 1 and now > > > requires 2. > > > And there are additional copies that the original doesn't require. > > > So your change looks less efficient to me - unless there is direct > > > hardware support for bitmap ops?? > > > > > > Wrt the argument that the serialized form is clearer and more > > > maintainable, optimised code is frequently more cryptic - as noted in > > > bitmap.c itself, and this code has remained unchanged since it was merged > > > 3 years ago, so the only maintenance it has required is to be more > > > maintainable?? Ok then. > > > > > > Your patch is functionally equivalent and pass my uAPI tests, so > > > > > > Tested-by: Kent Gibson <warthog618@gmail.com> > > > > Thanks for testing! > > Not a problem - that is what test suites are for. > > > > but my preference is to leave it as is. > > > > As Yury mentioned we need to look at bitmap APIs and make them possible to have > > a compile-time optimizations. With that in mind, I would prefer bitmap APIs > > over open-coded stuff which is hardly to be understood (yes, I still point > > out that it takes a few hours to me, maybe because I'm stupid enough, to > > get what's the heck is going one there, esp. for the == 1 case). > > Really? With all the bits out in the open it seems pretty clear to me. > Clearer than scatter/gather in fact. Yes, you are biased. :-) Ask some stranger about this code and I am pretty sure there will be double-figures percentage of people who can tell that the current code is a bit voodoo. > Sure, if there is suitable hardware support then bitmaps COULD be faster > than bitops. But without that, and that is the general case, it will be > slower. Do you have ANY cases where your implementation is currently > faster? Then you would have a stronger case. Why do we care here about performance? But if we do, I would check this on the 32-bit platform where 64-bit operations somewhat problematic / slow. If Yury gives an idea about performance tests I can consider to add this piece to compare with and we might see the difference. > And if you find the existing implementation unclear then the appropriate > solution is to better document it, as bitmaps itself does, not replace it > with something simpler and slower. Documentation will be needed either way. In general statistics it will be 50/50 who (mis)understands this or new code. Pity that the original author of the code hadn't though about documenting this... > > Yet, it opens a way to scale this in case we might have v3 ABI that let's say > > allows to work with 512 GPIOs at a time. With your code it will be much harder > > to achieve and see what you wrote about maintenance (in that case). > > v3 ABI?? libgpiod v2 is barely out the door! > Do you have any cases where 64 lines per request is limiting? IIRC it was SO question where the OP asks exactly about breaking the 64 lines limitation in the current ABI. > If that sort of speculation isn't premature optimisation then I don't know > what is. No, based on the real question / discussion, just have no link at hand. But it's quite a niche, I can agree.
On Wed, Sep 27, 2023 at 04:59:34PM +0300, Andy Shevchenko wrote: > On Wed, Sep 27, 2023 at 09:49:35PM +0800, Kent Gibson wrote: > > On Wed, Sep 27, 2023 at 03:17:06PM +0300, Andy Shevchenko wrote: > > > On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote: > > > > On Tue, Sep 26, 2023 at 08:20:07AM +0300, Andy Shevchenko wrote: > > > > > Currently we have a few bitmap calls that are open coded in the library > > > > > module. Let's convert them to use generic bitmap APIs instead. > > > > > > > > Firstly, I didn't consider using the bitmap module here as, in my mind at > > > > least, that is intended for bitmaps wider than 64 bits, or with variable > > > > width. In this case the bitmap is fixed at 64 bits, so bitops seemed more > > > > appropriate. > > > > > > > > And I would argue that they aren't "open coded" - they are parallelized > > > > to reduce the number of passes over the bitmap. > > > > This change serialises them, e.g. the get used to require 2 passes over > > > > the bitmap, it now requires 3 or 4. The set used to require 1 and now > > > > requires 2. > > > > And there are additional copies that the original doesn't require. > > > > So your change looks less efficient to me - unless there is direct > > > > hardware support for bitmap ops?? > > > > > > > > Wrt the argument that the serialized form is clearer and more > > > > maintainable, optimised code is frequently more cryptic - as noted in > > > > bitmap.c itself, and this code has remained unchanged since it was merged > > > > 3 years ago, so the only maintenance it has required is to be more > > > > maintainable?? Ok then. > > > > > > > > Your patch is functionally equivalent and pass my uAPI tests, so > > > > > > > > Tested-by: Kent Gibson <warthog618@gmail.com> > > > > > > Thanks for testing! > > > > Not a problem - that is what test suites are for. > > > > > > but my preference is to leave it as is. > > > > > > As Yury mentioned we need to look at bitmap APIs and make them possible to have > > > a compile-time optimizations. With that in mind, I would prefer bitmap APIs > > > over open-coded stuff which is hardly to be understood (yes, I still point > > > out that it takes a few hours to me, maybe because I'm stupid enough, to > > > get what's the heck is going one there, esp. for the == 1 case). > > > > Really? With all the bits out in the open it seems pretty clear to me. > > Clearer than scatter/gather in fact. > > Yes, you are biased. :-) Ask some stranger about this code and I am pretty sure > there will be double-figures percentage of people who can tell that the current > code is a bit voodoo. > It is the same as yours - just inside out. i.e. it performs the ops per selected line, not each op on the whole bitmap of lines. > > Sure, if there is suitable hardware support then bitmaps COULD be faster > > than bitops. But without that, and that is the general case, it will be > > slower. Do you have ANY cases where your implementation is currently > > faster? Then you would have a stronger case. > > Why do we care here about performance? But if we do, I would check this on > the 32-bit platform where 64-bit operations somewhat problematic / slow. > Yet you argue that bitmaps could be more performant?? Pick a side! > If Yury gives an idea about performance tests I can consider to add this > piece to compare with and we might see the difference. > > > And if you find the existing implementation unclear then the appropriate > > solution is to better document it, as bitmaps itself does, not replace it > > with something simpler and slower. > > Documentation will be needed either way. In general statistics it will be 50/50 > who (mis)understands this or new code. Pity that the original author of the code > hadn't though about documenting this... > And who was the original author? I forget. What you mean to say is it is a pity the reviewers at the time were satisfied with the code as it stands, right? Cos there is a process here. As I recall reviewers were more often than not complaining about pointless comments, not the lack of comments, so the natural bias as the author is towards under-documenting... > > > Yet, it opens a way to scale this in case we might have v3 ABI that let's say > > > allows to work with 512 GPIOs at a time. With your code it will be much harder > > > to achieve and see what you wrote about maintenance (in that case). > > > > v3 ABI?? libgpiod v2 is barely out the door! > > Do you have any cases where 64 lines per request is limiting? > > IIRC it was SO question where the OP asks exactly about breaking the 64 lines > limitation in the current ABI. > > > If that sort of speculation isn't premature optimisation then I don't know > > what is. > > No, based on the real question / discussion, just have no link at hand. > But it's quite a niche, I can agree. > Let me know if you find a ref to that discussion - I'm curious. Cheers, Kent.
On Wed, Sep 27, 2023 at 10:23:12PM +0800, Kent Gibson wrote: > On Wed, Sep 27, 2023 at 04:59:34PM +0300, Andy Shevchenko wrote: > > On Wed, Sep 27, 2023 at 09:49:35PM +0800, Kent Gibson wrote: > > > On Wed, Sep 27, 2023 at 03:17:06PM +0300, Andy Shevchenko wrote: > > > > On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote: ... > > > > Yet, it opens a way to scale this in case we might have v3 ABI that let's say > > > > allows to work with 512 GPIOs at a time. With your code it will be much harder > > > > to achieve and see what you wrote about maintenance (in that case). > > > > > > v3 ABI?? libgpiod v2 is barely out the door! > > > Do you have any cases where 64 lines per request is limiting? > > > > IIRC it was SO question where the OP asks exactly about breaking the 64 lines > > limitation in the current ABI. > > > > > If that sort of speculation isn't premature optimisation then I don't know > > > what is. > > > > No, based on the real question / discussion, just have no link at hand. > > But it's quite a niche, I can agree. > > Let me know if you find a ref to that discussion - I'm curious. Here it is (read comments as well): https://stackoverflow.com/questions/76307370/control-gpio-from-linux-userspace-with-linux-gpio-h
On Mon, Oct 02, 2023 at 12:05:11PM +0300, Andy Shevchenko wrote: > On Wed, Sep 27, 2023 at 10:23:12PM +0800, Kent Gibson wrote: > > On Wed, Sep 27, 2023 at 04:59:34PM +0300, Andy Shevchenko wrote: > > > On Wed, Sep 27, 2023 at 09:49:35PM +0800, Kent Gibson wrote: > > > > On Wed, Sep 27, 2023 at 03:17:06PM +0300, Andy Shevchenko wrote: > > > > > On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote: > > ... > > > > > > Yet, it opens a way to scale this in case we might have v3 ABI that let's say > > > > > allows to work with 512 GPIOs at a time. With your code it will be much harder > > > > > to achieve and see what you wrote about maintenance (in that case). > > > > > > > > v3 ABI?? libgpiod v2 is barely out the door! > > > > Do you have any cases where 64 lines per request is limiting? > > > > > > IIRC it was SO question where the OP asks exactly about breaking the 64 lines > > > limitation in the current ABI. > > > > > > > If that sort of speculation isn't premature optimisation then I don't know > > > > what is. > > > > > > No, based on the real question / discussion, just have no link at hand. > > > But it's quite a niche, I can agree. > > > > Let me know if you find a ref to that discussion - I'm curious. > > Here it is (read comments as well): > https://stackoverflow.com/questions/76307370/control-gpio-from-linux-userspace-with-linux-gpio-h > That question looks to me to be confusing how many GPIOs can be requested per request (64) and in total (effectively unlimited) - thinking they are the same. That could be due to their desire to use the gpiod_chip_get_all_lines() convenience function with a chip with more than 64 lines, rather than because they have an actual need for the lines to be managed in a single request. So that doesn't look like a genuine use case to me - just a "what if I want to do X" question. Certainly not something that would warrant a v3 ABI. Cheers, Kent.
On Mon, Oct 02, 2023 at 05:25:05PM +0800, Kent Gibson wrote: > On Mon, Oct 02, 2023 at 12:05:11PM +0300, Andy Shevchenko wrote: > > On Wed, Sep 27, 2023 at 10:23:12PM +0800, Kent Gibson wrote: > > > On Wed, Sep 27, 2023 at 04:59:34PM +0300, Andy Shevchenko wrote: > > > > On Wed, Sep 27, 2023 at 09:49:35PM +0800, Kent Gibson wrote: > > > > > On Wed, Sep 27, 2023 at 03:17:06PM +0300, Andy Shevchenko wrote: > > > > > > On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote: ... > > > > > > Yet, it opens a way to scale this in case we might have v3 ABI that let's say > > > > > > allows to work with 512 GPIOs at a time. With your code it will be much harder > > > > > > to achieve and see what you wrote about maintenance (in that case). > > > > > > > > > > v3 ABI?? libgpiod v2 is barely out the door! > > > > > Do you have any cases where 64 lines per request is limiting? > > > > > > > > IIRC it was SO question where the OP asks exactly about breaking the 64 lines > > > > limitation in the current ABI. > > > > > > > > > If that sort of speculation isn't premature optimisation then I don't know > > > > > what is. > > > > > > > > No, based on the real question / discussion, just have no link at hand. > > > > But it's quite a niche, I can agree. > > > > > > Let me know if you find a ref to that discussion - I'm curious. > > > > Here it is (read comments as well): > > https://stackoverflow.com/questions/76307370/control-gpio-from-linux-userspace-with-linux-gpio-h > > > > That question looks to me to be confusing how many GPIOs can be > requested per request (64) and in total (effectively unlimited) - thinking > they are the same. > That could be due to their desire to use the gpiod_chip_get_all_lines() > convenience function with a chip with more than 64 lines, rather than > because they have an actual need for the lines to be managed in a single > request. > > So that doesn't look like a genuine use case to me - just a "what if I > want to do X" question. Certainly not something that would warrant a v3 > ABI. Sure, and I'm not talking about v3 ABI to go for, see the word "might" in my reply in the first paragraph of this message.
On Mon, Oct 02, 2023 at 12:32:22PM +0300, Andy Shevchenko wrote: > On Mon, Oct 02, 2023 at 05:25:05PM +0800, Kent Gibson wrote: > > On Mon, Oct 02, 2023 at 12:05:11PM +0300, Andy Shevchenko wrote: > > > On Wed, Sep 27, 2023 at 10:23:12PM +0800, Kent Gibson wrote: > > > > On Wed, Sep 27, 2023 at 04:59:34PM +0300, Andy Shevchenko wrote: > > > > > On Wed, Sep 27, 2023 at 09:49:35PM +0800, Kent Gibson wrote: > > > > > > On Wed, Sep 27, 2023 at 03:17:06PM +0300, Andy Shevchenko wrote: > > > > > > > On Wed, Sep 27, 2023 at 09:32:11AM +0800, Kent Gibson wrote: > > ... > > > > > > > > Yet, it opens a way to scale this in case we might have v3 ABI that let's say > > > > > > > allows to work with 512 GPIOs at a time. With your code it will be much harder > > > > > > > to achieve and see what you wrote about maintenance (in that case). > > > > > > > > > > > > v3 ABI?? libgpiod v2 is barely out the door! > > > > > > Do you have any cases where 64 lines per request is limiting? > > > > > > > > > > IIRC it was SO question where the OP asks exactly about breaking the 64 lines > > > > > limitation in the current ABI. > > > > > > > > > > > If that sort of speculation isn't premature optimisation then I don't know > > > > > > what is. > > > > > > > > > > No, based on the real question / discussion, just have no link at hand. > > > > > But it's quite a niche, I can agree. > > > > > > > > Let me know if you find a ref to that discussion - I'm curious. > > > > > > Here it is (read comments as well): > > > https://stackoverflow.com/questions/76307370/control-gpio-from-linux-userspace-with-linux-gpio-h > > > > > > > That question looks to me to be confusing how many GPIOs can be > > requested per request (64) and in total (effectively unlimited) - thinking > > they are the same. > > That could be due to their desire to use the gpiod_chip_get_all_lines() > > convenience function with a chip with more than 64 lines, rather than > > because they have an actual need for the lines to be managed in a single > > request. > > > > So that doesn't look like a genuine use case to me - just a "what if I > > want to do X" question. Certainly not something that would warrant a v3 > > ABI. > > Sure, and I'm not talking about v3 ABI to go for, see the word "might" in my > reply in the first paragraph of this message. > Ok, so your original point was pure speculation. Cheers, Kent.
diff --git a/drivers/gpio/gpiolib-cdev.c b/drivers/gpio/gpiolib-cdev.c index e39d344feb28..a5bbbd44531f 100644 --- a/drivers/gpio/gpiolib-cdev.c +++ b/drivers/gpio/gpiolib-cdev.c @@ -1263,35 +1263,32 @@ static long linereq_get_values(struct linereq *lr, void __user *ip) { struct gpio_v2_line_values lv; DECLARE_BITMAP(vals, GPIO_V2_LINES_MAX); + DECLARE_BITMAP(mask, GPIO_V2_LINES_MAX); + DECLARE_BITMAP(bits, GPIO_V2_LINES_MAX); struct gpio_desc **descs; unsigned int i, didx, num_get; - bool val; int ret; /* NOTE: It's ok to read values of output lines. */ if (copy_from_user(&lv, ip, sizeof(lv))) return -EFAULT; - for (num_get = 0, i = 0; i < lr->num_lines; i++) { - if (lv.mask & BIT_ULL(i)) { - num_get++; - descs = &lr->lines[i].desc; - } - } + bitmap_from_arr64(mask, &lv.mask, GPIO_V2_LINES_MAX); + num_get = bitmap_weight(mask, lr->num_lines); if (num_get == 0) return -EINVAL; - if (num_get != 1) { + if (num_get == 1) { + descs = &lr->lines[find_first_bit(mask, lr->num_lines)].desc; + } else { descs = kmalloc_array(num_get, sizeof(*descs), GFP_KERNEL); if (!descs) return -ENOMEM; - for (didx = 0, i = 0; i < lr->num_lines; i++) { - if (lv.mask & BIT_ULL(i)) { - descs[didx] = lr->lines[i].desc; - didx++; - } - } + + didx = 0; + for_each_set_bit(i, mask, lr->num_lines) + descs[didx++] = lr->lines[i].desc; } ret = gpiod_get_array_value_complex(false, true, num_get, descs, NULL, vals); @@ -1301,19 +1298,15 @@ static long linereq_get_values(struct linereq *lr, void __user *ip) if (ret) return ret; - lv.bits = 0; - for (didx = 0, i = 0; i < lr->num_lines; i++) { - if (lv.mask & BIT_ULL(i)) { - if (lr->lines[i].sw_debounced) - val = debounced_value(&lr->lines[i]); - else - val = test_bit(didx, vals); - if (val) - lv.bits |= BIT_ULL(i); - didx++; - } + bitmap_scatter(bits, vals, mask, lr->num_lines); + + for_each_set_bit(i, mask, lr->num_lines) { + if (lr->lines[i].sw_debounced) + __assign_bit(i, bits, debounced_value(&lr->lines[i])); } + bitmap_to_arr64(&lv.bits, bits, GPIO_V2_LINES_MAX); + if (copy_to_user(ip, &lv, sizeof(lv))) return -EFAULT; @@ -1324,35 +1317,35 @@ static long linereq_set_values_unlocked(struct linereq *lr, struct gpio_v2_line_values *lv) { DECLARE_BITMAP(vals, GPIO_V2_LINES_MAX); + DECLARE_BITMAP(mask, GPIO_V2_LINES_MAX); + DECLARE_BITMAP(bits, GPIO_V2_LINES_MAX); struct gpio_desc **descs; unsigned int i, didx, num_set; int ret; - bitmap_zero(vals, GPIO_V2_LINES_MAX); - for (num_set = 0, i = 0; i < lr->num_lines; i++) { - if (lv->mask & BIT_ULL(i)) { - if (!test_bit(FLAG_IS_OUT, &lr->lines[i].desc->flags)) - return -EPERM; - if (lv->bits & BIT_ULL(i)) - __set_bit(num_set, vals); - num_set++; - descs = &lr->lines[i].desc; - } - } + bitmap_from_arr64(mask, &lv->mask, GPIO_V2_LINES_MAX); + bitmap_from_arr64(bits, &lv->bits, GPIO_V2_LINES_MAX); + + num_set = bitmap_gather(vals, bits, mask, lr->num_lines); if (num_set == 0) return -EINVAL; - if (num_set != 1) { + for_each_set_bit(i, mask, lr->num_lines) { + if (!test_bit(FLAG_IS_OUT, &lr->lines[i].desc->flags)) + return -EPERM; + } + + if (num_set == 1) { + descs = &lr->lines[find_first_bit(mask, lr->num_lines)].desc; + } else { /* build compacted desc array and values */ descs = kmalloc_array(num_set, sizeof(*descs), GFP_KERNEL); if (!descs) return -ENOMEM; - for (didx = 0, i = 0; i < lr->num_lines; i++) { - if (lv->mask & BIT_ULL(i)) { - descs[didx] = lr->lines[i].desc; - didx++; - } - } + + didx = 0; + for_each_set_bit(i, mask, lr->num_lines) + descs[didx++] = lr->lines[i].desc; } ret = gpiod_set_array_value_complex(false, true, num_set, descs, NULL, vals);