Message ID | 20221203150539.11483-1-brgl@bgdev.pl |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp1417705wrr; Sat, 3 Dec 2022 07:31:33 -0800 (PST) X-Google-Smtp-Source: AA0mqf7RSKCZrpP2ymFvm8iaBUw1RU0+s8DB+Rqp1KQP9N0jK3HAdUHmgFhf4ZKZvFIZd7gTxKzF X-Received: by 2002:a17:906:64e:b0:7b7:c1b9:8cb4 with SMTP id t14-20020a170906064e00b007b7c1b98cb4mr48234248ejb.96.1670081493117; Sat, 03 Dec 2022 07:31:33 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1670081493; cv=none; d=google.com; s=arc-20160816; b=PZZcSreX0gJ9Bk3Q8t9vuSXZ7Q+9H9A4rPFeko6VQSzEMQtAvop2KWICwLONGMcS5V afw0E0hRHaD2Eub3q5M1QG8wwsyr20iJ1d1alMLm+9i4hyDa9IrH9kvsolI0bQcMvLM4 vRK5/P7Cml96p5IP27vT7RdgoQswLBm32SU+Vcd51VgrzAqEM0M6YPuge4YLp85IDM4s pUYv32ig1ibML2zcWSnAP1koAEZI8Ktj2IFtf9iB7Tm9ZeQL04yjV/M0mhM45NdY1Dl2 Fjw9VZ3jjAFc9ct63QFZ7zEICAyHhhc5QynX0OXPoHIYhAeNVigRzAkn1Zd9StU0S+mi mX6A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=z5m1n5zwZkdaEw92jbiAEE+AtzglWpi155V/SqU625A=; b=FPXcU+XCpcyGmuKP+g+InvFrDXHNaO4O2Oc8OCIbY6zSa4uWWA/ftU7uzAMfyTsxrp /JMUaovJzJ0WxAOxpFbeHYB9cs0kWArS/RQQZHrnQXLfGAQUHbE50Os8Y0/8V3aSbR8z KR/QkbuvDTo3UGGkFSLVP0poy7VtpnWPSIhrZIKqeEXUqoffRYMfjVWH3w5QgW688FqN WUDbm4NAjNIfFuNV89fr4Gxh/jIlwIFyS+oXVo/oc9MIpuNbID55YJOBteid9dg95RSx TSUelCBNWGi+o1Eq+HZSbMZbaDpV90QrXrppkWFl3D+hMulztUM4KM8VYI/9X7q7dnm7 nNzw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bgdev-pl.20210112.gappssmtp.com header.s=20210112 header.b=79UI45Hw; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id ne42-20020a1709077baa00b007807e1f3d9dsi8121449ejc.842.2022.12.03.07.31.05; Sat, 03 Dec 2022 07:31:33 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@bgdev-pl.20210112.gappssmtp.com header.s=20210112 header.b=79UI45Hw; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229665AbiLCPFr (ORCPT <rfc822;lhua1029@gmail.com> + 99 others); Sat, 3 Dec 2022 10:05:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38252 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229619AbiLCPFq (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Sat, 3 Dec 2022 10:05:46 -0500 Received: from mail-wr1-x436.google.com (mail-wr1-x436.google.com [IPv6:2a00:1450:4864:20::436]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0E1F62181D for <linux-kernel@vger.kernel.org>; Sat, 3 Dec 2022 07:05:44 -0800 (PST) Received: by mail-wr1-x436.google.com with SMTP id y16so12203389wrm.2 for <linux-kernel@vger.kernel.org>; Sat, 03 Dec 2022 07:05:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bgdev-pl.20210112.gappssmtp.com; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=z5m1n5zwZkdaEw92jbiAEE+AtzglWpi155V/SqU625A=; b=79UI45HwVX+xpL7pA/erF8ClVSl6Fzkmfn8ZdtGP4sC7E83KTU6sgdSwvlLGa8bSMw Srxnln2mtZfXc00pgCjg6nHRJgASYUzgp/GlRXjGHSXe8UB1hKjTOV6sN6RCfBflLy0k bAEdi5aAmTVAVMDCLVsSryO8f1S6uMKqcZNeV9KoviHXj0eyQA5AhdS9YtHaWfz5Bzrh nsGxd+2GxqjPojjs1zaftz9h8g4zVc+aYw5nUYSk8lrkET4cT/cCcf6+Ca819ImGAi4w 93koZb9nVvXAC5ochVLwoMwn0fd3LyxtILulrsREdtweGBJegb1y/SiBWPBo2E8coB3I DGWw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=z5m1n5zwZkdaEw92jbiAEE+AtzglWpi155V/SqU625A=; b=EAkvQyYrW/xHfA/YV+gncRonMJMTgzkaWSeX3sS9vOxw8mqkby4q0KU6xgTHIzT3Ia 59X8w9BkInaig0jGoum2+78+kzZoIlj3fueeMMXatCfs2TnIi/PjTuJYBRlOcMgdHIBd AidSCJErIoA/WMnR6YYFcPg3k/6ALgagv7sSF6AoXioDh0uQqJYqSj0ffDktq4z6S92y iO+5B9cNUhDJvQ6KW6ZxCvRqlXmUr9s2aalqej6OUW6IN1Qiq6Q2JpIJuQe+4VVEC5V7 KLQtvkL8vjRkJeDoEOkmxjad0Dv4Mb9IK9HjjLr5Nt+P+u411vwBCMyvpOHfNo9ox5Nu 5+FA== X-Gm-Message-State: ANoB5pkpZdJujbI/RKJDiNQzPBnc7lZxXKkMEAbV1aQPu8QYaxhRQaGm ZWqF5EfUI6/54ONTi0xftmTXvQ== X-Received: by 2002:adf:d847:0:b0:242:e13:44c2 with SMTP id k7-20020adfd847000000b002420e1344c2mr20222540wrl.529.1670079942459; Sat, 03 Dec 2022 07:05:42 -0800 (PST) Received: from brgl-uxlite.home ([2a01:cb1d:334:ac00:873f:3a1a:a321:43a]) by smtp.gmail.com with ESMTPSA id w6-20020adfec46000000b0022efc4322a9sm10047581wrn.10.2022.12.03.07.05.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 03 Dec 2022 07:05:42 -0800 (PST) From: Bartosz Golaszewski <brgl@bgdev.pl> To: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>, Linus Walleij <linus.walleij@linaro.org>, linux-gpio@vger.kernel.org, linux-kernel@vger.kernel.org, Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Subject: [GIT PULL] gpio: fixes for v6.1-rc8 - take 2 Date: Sat, 3 Dec 2022 16:05:39 +0100 Message-Id: <20221203150539.11483-1-brgl@bgdev.pl> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1751207371815690142?= X-GMAIL-MSGID: =?utf-8?q?1751207371815690142?= |
Series |
[GIT,PULL] gpio: fixes for v6.1-rc8 - take 2
|
|
Pull-request
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux.git gpio-fixes-for-v6.1-rc8-take-2Message
Bartosz Golaszewski
Dec. 3, 2022, 3:05 p.m. UTC
From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Linus,
Here's a fixed PR from the GPIO subsystem for the next rc.
Here's the last round of fixes for the upcoming release. The two resource leak
fixes are self-explanatory. The two character device commits need some more
backstory:
I recently listened to Laurent Pinchart's talk from this year's LPC[1] where he
discussed an issue with many subsystems that export device nodes to user-space
where one can open the device file, unbind the underlying device from the
driver, then call any of the relevant system calls and observe the kernel crash
with a NULL-pointer derefence.
I verified that the problem exists with the GPIO subsystem as well. The reason
for that is: when a GPIO chip is removed, we drop the chip's data from the
GPIO device and set the relevant pointer to NULL but don't check it in syscall
callbacks in the character device's code. That's fixed by the first patch.
However that fix alone leaves the character device vulnerable to a race
condition in which the GPIO chip is removed when a syscall is in progress.
To avoid that, we have a second fix that uses a lock to protect the device's
structure from being disabled before all syscall callbacks returned.
Laurent blamed the issue on devres but I believe the problem comes from the
fact that many driver developers are simply unaware that resources exported
to user-space need to survive the driver unbind and only be released when the
device's reference count goes down to 0. Devres' docs are pretty clear on that:
the resources get freed on driver unbind. Resources that should survive it,
must not be managed. This is however a topic for a separate discussion which
I intend to start soon.
Please pull,
Bartosz Golaszewski
[1] https://www.youtube.com/watch?v=kW8LHWlJPTU
The following changes since commit b7b275e60bcd5f89771e865a8239325f86d9927d:
Linux 6.1-rc7 (2022-11-27 13:31:48 -0800)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux.git gpio-fixes-for-v6.1-rc8-take-2
for you to fetch changes up to 450571883735e9a7c3b38691225531d54773e9a2:
gpiolib: protect the GPIO device against being dropped while in use by user-space (2022-12-02 17:01:37 +0100)
----------------------------------------------------------------
gpio fixes for v6.1-rc8
- fix a memory leak in gpiochip_setup_dev() in core gpiolib
- fix a reference leak in gpio-amd8111
- fix a problem with user-space being able to trigger a NULL-pointer
dereference ovet the GPIO character device
----------------------------------------------------------------
Bartosz Golaszewski (2):
gpiolib: cdev: fix NULL-pointer dereferences
gpiolib: protect the GPIO device against being dropped while in use by user-space
Xiongfeng Wang (1):
gpio: amd8111: Fix PCI device reference count leak
Zeng Heng (1):
gpiolib: fix memory leak in gpiochip_setup_dev()
drivers/gpio/gpio-amd8111.c | 4 +
drivers/gpio/gpiolib-cdev.c | 207 ++++++++++++++++++++++++++++++++++++++------
drivers/gpio/gpiolib.c | 46 ++++++----
drivers/gpio/gpiolib.h | 5 ++
4 files changed, 221 insertions(+), 41 deletions(-)
Comments
On Sat, Dec 3, 2022 at 7:05 AM Bartosz Golaszewski <brgl@bgdev.pl> wrote: > > Here's a fixed PR from the GPIO subsystem for the next rc. No, this cannot be right. That last commit seems *very* dubious, and in particular all those if (!down_read_trylock(&gdev->sem)) return EPOLLHUP | EPOLLERR; are a sign that something is very very wrong there. Either the lock is necessary or it isn't, and "trylock" isn't the way to deal with it, with random failures if you cannot take the lock. If you are using "trylock" because the data structure might go away from under you, you have already lost, and the code is buggy. And if the data structure cannot go away from under you, you should do an unconditional lock, and then check "gdev->chip" for being NULL once you have gotten the lock (the same way you did in open()). But a "trylock and return error if it failed" just means that now you are randomly returning errors to user space, which is entirely undebuggable and makes no sense. Or, alternatively, the trylock succeeds - because it hits fully *after* gpiochip_remove() has finished, and now ->chip is NULL anyway, which is what you claim to protect against. End result: "trylock" can never be right in this kind of context. That "call_locked() helper might make sense more along the lines of ret = -ENODEV; down_read(&gdev->sem)) // Does the device still exist? if (gdev->chip) ret = func(file, cmd, arg); up_read(&gdev->sem); return ret; or similar. Not with that odd "try to lock, and if that fails, assume error". And again - if the trylock is there because 'gdev' itself might go away at any time and you can't afford to wait on the lock, then it's broken regardless (and the above suggestion won't help either) Anyway: the end result of this all is that I think this is a fundamental bug in the gpio layer, and rc7 (soon to be rc8) is too late to try these kinds of unfinished games. Fix it properly for 6.2, and make it back-portable, because I'm not pulling something like this right now. Linus
On Sun, Dec 4, 2022 at 12:17 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > And again - if the trylock is there because 'gdev' itself might go > away at any time and you can't afford to wait on the lock, then it's > broken regardless (and the above suggestion won't help either) .. another reason I can see is that you are holding other locks, and the trylock is either for deadlock avoidance or because the other locks are spinning locks and you cannot sleep. But if that's the case, then the trylock is basically a hacky workaround for broken locking, together with a "I know the only reason it fails is because we've already entered the shutdown phase". Again, that kind of hacky thing is not for this late in the rc game. It might be an acceptable workaround for backporting if it has *huge* comments about why it's done that way, but it's not acceptable as some kind of fix without that kind of documentation for why it's done that hacky way rather than with proper locking. Linus
On Sun, Dec 4, 2022 at 9:17 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Sat, Dec 3, 2022 at 7:05 AM Bartosz Golaszewski <brgl@bgdev.pl> wrote: > > > > Here's a fixed PR from the GPIO subsystem for the next rc. > > No, this cannot be right. > > That last commit seems *very* dubious, and in particular all those > > if (!down_read_trylock(&gdev->sem)) > return EPOLLHUP | EPOLLERR; > > are a sign that something is very very wrong there. > > Either the lock is necessary or it isn't, and "trylock" isn't the way > to deal with it, with random failures if you cannot take the lock. > > If you are using "trylock" because the data structure might go away > from under you, you have already lost, and the code is buggy. > > And if the data structure cannot go away from under you, you should > do an unconditional lock, and then check "gdev->chip" for being NULL > once you have gotten the lock (the same way you did in open()). > No, the data can't be removed with these locks in place. It's just to avoid going into the callbacks if gpiochip_remove() is already in progress (the only reason why trylock would fail). > But a "trylock and return error if it failed" just means that now you > are randomly returning errors to user space, which is entirely > undebuggable and makes no sense. > Technically these are the same errors we'll return later if gdev->chip is NULL but I get your point. > Or, alternatively, the trylock succeeds - because it hits fully > *after* gpiochip_remove() has finished, and now ->chip is NULL anyway, > which is what you claim to protect against. > > End result: "trylock" can never be right in this kind of context. > > That "call_locked() helper might make sense more along the lines of > > ret = -ENODEV; > > down_read(&gdev->sem)) > // Does the device still exist? > if (gdev->chip) > ret = func(file, cmd, arg); > up_read(&gdev->sem); > > return ret; > This is a good suggestion, thanks. And with it, the two patches can get squashed into one for easier backporting. > or similar. Not with that odd "try to lock, and if that fails, assume error". > > And again - if the trylock is there because 'gdev' itself might go > away at any time and you can't afford to wait on the lock, then it's > broken regardless (and the above suggestion won't help either) > > Anyway: the end result of this all is that I think this is a > fundamental bug in the gpio layer, and rc7 (soon to be rc8) is too > late to try these kinds of unfinished games. > > Fix it properly for 6.2, and make it back-portable, because I'm not > pulling something like this right now. > > Linus Will do. I will still resend the PR with only the resource leak fixes. Bartosz
On Sun, Dec 4, 2022 at 12:47 PM Bartosz Golaszewski <brgl@bgdev.pl> wrote: > > No, the data can't be removed with these locks in place. It's just to > avoid going into the callbacks if gpiochip_remove() is already in > progress (the only reason why trylock would fail). That "the only reason why trylock would fail" may be true in practice, but it's really just an implementation detail. Other issues *could* make a trylock fail. For example, assuming the trylock is just implemented as a non-looping "cmpxchg" (which may or may not be the case), even another reader coming in and racing with a trylock could make the cmpxchg fail. That "do one single cmpxchg" is what the spinlock trylock code does, for example. Now, that's not actually what we do for the down_read_trylock() case - we will actually loop over it - but locking is complicated and you absolutely should not make assumptions about the exact implementation details. And even with the loop, if you ever have *any* other reason to get the write-lock (or even just do a "try_write", suddenly the details of when the trylock fails changes entirely. So that's why I really don't want some random driver layer to depend on some semantics for trylock that aren't actually guaranteed in the bigger picture. The only thing "trylock" says is "I tried to get the lock". It really could easily fail for random reasons that aren't about actual writers. For example, even aside from any "do a single cmpxchg" issue: a sleeping lock could be implemented using a spinlock to protect the "real" locking data. That kind of implementation is particularly likely for debugging. And then, in order to be usable in a recursive environment, a "trylock" would quite likely be implemented without spinning on that spinlock, even before it gets to the actual lock internal implementation. So just contention on the spinlock - by other readers, not writers - could make the trylock fail. See? Linus