Message ID | 20230224-rust-iopt-rtkit-v1-2-49ced3391295@asahilina.net |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp836572wrd; Fri, 24 Feb 2023 03:01:09 -0800 (PST) X-Google-Smtp-Source: AK7set9hfHNsuT8GC0WZJNSIOeMCXoD7wxJH/4nSj1cqnjhgbE188BxUVOjeqJd7/xo6byBm4ZbE X-Received: by 2002:a05:6402:20c:b0:4ab:1d33:69ba with SMTP id t12-20020a056402020c00b004ab1d3369bamr13608604edv.16.1677236468885; Fri, 24 Feb 2023 03:01:08 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677236468; cv=none; d=google.com; s=arc-20160816; b=my1s3TChG+QKqhofrTSJIkxTgwyMYxDdv27gQwA291ilZ32l9MtUHYAizFmibP1c4v QR9gJE+IWoequymq+ZNN7ch5Z9QShx9fk3s5Ip7/3YrU0ahqdmN/dGiZ1tFMIreVcf18 uiVLobYV/2ydtA8TtvZEQfezvnKSJbGSupV2sXjDc+DVF0m0CdVQnfD4KLEKoZR5epjR k1ObfipLtf0nLQhA6H/E0qcMWB4gfzWKRKq+BfSNOZ8Le6J+Q+En/AZiqRaXOFauaN1U Mb+bnynlTXBA8igVG/i9wYHuNXLnOgAFiElzvLxGoxLu7XLMzlkkh1QUiCHfsX8FAdmc HTDw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:in-reply-to:references:message-id :content-transfer-encoding:mime-version:subject:date:from :dkim-signature; bh=E45BpH984Exa0fMcJL/obVfS9/NpUGLW4abDz1Bq+Hg=; b=dvMrkHYGNkjrJx0lR/M5sKaJt97KirG+mbHWwtBFSIEDDDG06zJgK+1W4iwmdQ3Jpw jxOzE9ZPRApeVJZGdZ0sdBghja7+92/J3RSqkjSCNnrWtZD79PDWIjheqv2djwyqOC/0 UE3hmbpC/UfVhR4abyqsWKH/0G5V9pa0V6nWAQI6K6d0uESqZC7tR6HGyLUFEJViz6r0 N5fBHimKTwSK1jLXTMUj6X3zua0HWT81Rp82IHLdXBiLO3rhhSZwPgLAERD7rbsVpUOd GFXiiiz0m72Phd46DXOt5FUAE2pN+WBaKNJ5y+1oQSCBD8bkmZNFr8UmbDZt0sha6TEz efuQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@asahilina.net header.s=default header.b=S5GfRRLt; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=asahilina.net Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y25-20020a50e619000000b004aef1b3a863si15019470edm.518.2023.02.24.03.00.44; Fri, 24 Feb 2023 03:01:08 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@asahilina.net header.s=default header.b=S5GfRRLt; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=asahilina.net Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230148AbjBXKyE (ORCPT <rfc822;jeff.pang.chn@gmail.com> + 99 others); Fri, 24 Feb 2023 05:54:04 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54928 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230073AbjBXKyA (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Fri, 24 Feb 2023 05:54:00 -0500 Received: from mail.marcansoft.com (marcansoft.com [212.63.210.85]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3E7DD149A5; Fri, 24 Feb 2023 02:53:51 -0800 (PST) Received: from [127.0.0.1] (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: linasend@asahilina.net) by mail.marcansoft.com (Postfix) with ESMTPSA id C3BF64248B; Fri, 24 Feb 2023 10:53:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=asahilina.net; s=default; t=1677236029; bh=XcpgOxW9xbW5RI+qtCoIK9qZZvy3Hmqang8OrPgwtSc=; h=From:Date:Subject:References:In-Reply-To:To:Cc; b=S5GfRRLtjTEa1GDauhqm4LwpVNGK9hwlbYZMDG1ymiXWS62XPbSGKGYfVdiESMhsi oUWZbjIsVtLtdpjnPlNLc77tt5cg+DOJTJpoB3SVHHaCuQoH8yutbwrBHCwUAni2W1 mqtw2lyg9Mt87jqcgmsl1HYvkyr9NBfRbfvZgjxMbTcr393YYR5lLtfwsxo3q3n7pq ++U2vR8szdn9AXLuHQV+Y5Rvt+zp2PfwNt4/xhZjI/W2vjqvDSw+Fg+khrXomc4NsQ 0TGF7QjLUZgLXGpMl/IEJlXXz2yC5DOQ17LEM+bGey75KDq1KQM6dPemGRxock3Wkv rB9HwMDHJCxtg== From: Asahi Lina <lina@asahilina.net> Date: Fri, 24 Feb 2023 19:53:14 +0900 Subject: [PATCH 2/5] rust: device: Add a minimal RawDevice trait MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Message-Id: <20230224-rust-iopt-rtkit-v1-2-49ced3391295@asahilina.net> References: <20230224-rust-iopt-rtkit-v1-0-49ced3391295@asahilina.net> In-Reply-To: <20230224-rust-iopt-rtkit-v1-0-49ced3391295@asahilina.net> To: Miguel Ojeda <ojeda@kernel.org>, Alex Gaynor <alex.gaynor@gmail.com>, Wedson Almeida Filho <wedsonaf@gmail.com>, Boqun Feng <boqun.feng@gmail.com>, Gary Guo <gary@garyguo.net>, =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= <bjorn3_gh@protonmail.com>, Will Deacon <will@kernel.org>, Robin Murphy <robin.murphy@arm.com>, Joerg Roedel <joro@8bytes.org>, Hector Martin <marcan@marcan.st>, Sven Peter <sven@svenpeter.dev>, Arnd Bergmann <arnd@arndb.de>, Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Rafael J. Wysocki" <rafael@kernel.org>, Alyssa Rosenzweig <alyssa@rosenzweig.io>, Neal Gompa <neal@gompa.dev>, rust-for-linux@vger.kernel.org, linux-kernel@vger.kernel.org, asahi@lists.linux.dev, Asahi Lina <lina@asahilina.net> X-Mailer: b4 0.12.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1677236013; l=2542; i=lina@asahilina.net; s=20230221; h=from:subject:message-id; bh=soR466tRTXjasiXIM/nVS1quyj2/l4EZ0DnmL76tJBs=; b=lgy1PZ0uyKdd/tM/G8noK7EaD5kUIHfT1uSUmIXZ1IA3ri0x3dNuPzq/TelJO3hm34TrgUqE9 bNEhHLGfh08CnZPIzu2dvrfFa3L9h2u7tHdI6Yy4lBJWvKiNRcLpm1x X-Developer-Key: i=lina@asahilina.net; a=ed25519; pk=Qn8jZuOtR1m5GaiDfTrAoQ4NE1XoYVZ/wmt5YtXWFC4= X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1758709907756889089?= X-GMAIL-MSGID: =?utf-8?q?1758709907756889089?= |
Series | rust: Add io_pgtable and RTKit abstractions | |
Commit Message
Asahi Lina
Feb. 24, 2023, 10:53 a.m. UTC
From: Wedson Almeida Filho <wedsonaf@gmail.com> Add a RawDevice trait which can be implemented by any type representing a device class (such as a PlatformDevice). This is the minimum amount of Device support code required to unblock abstractions that need to take device pointers. Lina: Rewrote commit message, and dropped everything except RawDevice. Co-developed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com> Signed-off-by: Asahi Lina <lina@asahilina.net> --- rust/bindings/bindings_helper.h | 1 + rust/kernel/device.rs | 23 +++++++++++++++++++++++ rust/kernel/lib.rs | 1 + 3 files changed, 25 insertions(+)
Comments
On Fri, Feb 24, 2023 at 07:53:14PM +0900, Asahi Lina wrote: > From: Wedson Almeida Filho <wedsonaf@gmail.com> > > Add a RawDevice trait which can be implemented by any type representing > a device class (such as a PlatformDevice). This is the minimum amount of > Device support code required to unblock abstractions that need to take > device pointers. > > Lina: Rewrote commit message, and dropped everything except RawDevice. > > Co-developed-by: Miguel Ojeda <ojeda@kernel.org> > Signed-off-by: Miguel Ojeda <ojeda@kernel.org> > Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com> > Signed-off-by: Asahi Lina <lina@asahilina.net> > --- > rust/bindings/bindings_helper.h | 1 + > rust/kernel/device.rs | 23 +++++++++++++++++++++++ > rust/kernel/lib.rs | 1 + > 3 files changed, 25 insertions(+) > > diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h > index 75d85bd6c592..3632a39a28a6 100644 > --- a/rust/bindings/bindings_helper.h > +++ b/rust/bindings/bindings_helper.h > @@ -6,6 +6,7 @@ > * Sorted alphabetically. > */ > > +#include <linux/device.h> > #include <linux/slab.h> > #include <linux/refcount.h> > > diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs > new file mode 100644 > index 000000000000..9be021e393ca > --- /dev/null > +++ b/rust/kernel/device.rs > @@ -0,0 +1,23 @@ > +// SPDX-License-Identifier: GPL-2.0 > + > +//! Generic devices that are part of the kernel's driver model. > +//! > +//! C header: [`include/linux/device.h`](../../../../include/linux/device.h) > + > +use crate::bindings; > + > +/// A raw device. > +/// > +/// # Safety > +/// > +/// Implementers must ensure that the `*mut device` returned by [`RawDevice::raw_device`] is > +/// related to `self`, that is, actions on it will affect `self`. For example, if one calls > +/// `get_device`, then the refcount on the device represented by `self` will be incremented. What is a "implementer" here? Rust code? C code? Who is calling get_device() here, rust code or the driver code or something else? How are you matching up the reference logic of this structure with the fact that the driver core actually owns the reference, not any rust code? > +/// > +/// Additionally, implementers must ensure that the device is never renamed. Commit a5462516aa99 > +/// ("driver-core: document restrictions on device_rename()") has details on why `device_rename` > +/// should not be used. As much as I would have liked that, that commit was from 2010 and device_rename() is still being used by some pretty large subsystems (networking and IB) and I don't see that going away any year soon. So yes, your device will be renamed, so don't mess with the device name like I mentioned in review of path 5/5 in this series. > +pub unsafe trait RawDevice { > + /// Returns the raw `struct device` related to `self`. > + fn raw_device(&self) -> *mut bindings::device; Again, what prevents this device from going away? I don't see a call to get_device() here :( And again, why are bindings needed for a "raw" struct device at all? Shouldn't the bus-specific wrappings work better? thanks, greg k-h
On 24/02/2023 20.23, Greg Kroah-Hartman wrote: > On Fri, Feb 24, 2023 at 07:53:14PM +0900, Asahi Lina wrote: >> From: Wedson Almeida Filho <wedsonaf@gmail.com> >> >> Add a RawDevice trait which can be implemented by any type representing >> a device class (such as a PlatformDevice). This is the minimum amount of >> Device support code required to unblock abstractions that need to take >> device pointers. >> >> Lina: Rewrote commit message, and dropped everything except RawDevice. >> >> Co-developed-by: Miguel Ojeda <ojeda@kernel.org> >> Signed-off-by: Miguel Ojeda <ojeda@kernel.org> >> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com> >> Signed-off-by: Asahi Lina <lina@asahilina.net> >> --- >> rust/bindings/bindings_helper.h | 1 + >> rust/kernel/device.rs | 23 +++++++++++++++++++++++ >> rust/kernel/lib.rs | 1 + >> 3 files changed, 25 insertions(+) >> >> diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs >> new file mode 100644 >> index 000000000000..9be021e393ca >> --- /dev/null >> +++ b/rust/kernel/device.rs >> @@ -0,0 +1,23 @@ >> +// SPDX-License-Identifier: GPL-2.0 >> + >> +//! Generic devices that are part of the kernel's driver model. >> +//! >> +//! C header: [`include/linux/device.h`](../../../../include/linux/device.h) >> + >> +use crate::bindings; >> + >> +/// A raw device. >> +/// >> +/// # Safety >> +/// >> +/// Implementers must ensure that the `*mut device` returned by [`RawDevice::raw_device`] is >> +/// related to `self`, that is, actions on it will affect `self`. For example, if one calls >> +/// `get_device`, then the refcount on the device represented by `self` will be incremented. > > What is a "implementer" here? Rust code? C code? Who is calling > get_device() here, rust code or the driver code or something else? How > are you matching up the reference logic of this structure with the fact > that the driver core actually owns the reference, not any rust code? This is a Rust trait, so that would be Rust code. In practice that means other abstractions for different device buses, and the generic one which I sent as patch 5. It's not a structure, so it doesn't have any reference logic itself. That would go into the implementing struct (like `Device`). What the comment is saying is that other parts of the Rust kernel crate may make assumptions about the relationship between the lifetime of the struct implementing RawDevice and the lifetime of the underlying `struct device`. So for example, it's not legal to return a borrowed reference to a `struct device` that might go away before `self` does. We should probably make this `Sealed` too, to make sure no driver can even try to implement it (which probably wouldn't make sense)... this should only ever be implemented by stuff in the kernel crate. The reference is owned by whoever owns the reference, no? If you call get_device(), you own a new reference to the device (which means you can keep the pointer around until you call put_device()). Normally in Rust you would have logic where whatever Rust structure that owns a reference (which in practice is most things that implement RawDevice) would also have a `Clone` trait impl that calls `get_device()` and duplicates itself, and a `Drop` trait impl that calls `put_device()`. In Rust terms, the Rust structure "owns" the device reference, and whatever code creates that structure needs to either be allowed to steal the reference from its caller or explicitly call `get_device()`. You can also elide that in some cases though, like when you're just passing a device reference down from a callback into Rust code. Then the caller guarantees it has a device reference, which will outlive the execution of the callback. So the wrapper that calls user Rust code can materialize a RawDevice implementation without calling `get_device()`, and then pass a Rust reference to it (which means the downstream code can't steal it), and then make sure it gets destroyed without a `put_device()` before the callback returns. In Rust terms, that's a code path that passes a borrowed reference to the device down into user code. This kind of ties back to why Rust rather than C... C does not encode concepts like "borrowing a reference" or "taking over a reference" in the type system, so it's really easy to mess up the refcounting and end up with dangling references or leaks. Rust does, so once you wrap a C API with the matching Rust semantics, there's no way to mess up and have those issues any more. > >> +/// >> +/// Additionally, implementers must ensure that the device is never renamed. Commit a5462516aa99 >> +/// ("driver-core: document restrictions on device_rename()") has details on why `device_rename` >> +/// should not be used. > > As much as I would have liked that, that commit was from 2010 and > device_rename() is still being used by some pretty large subsystems > (networking and IB) and I don't see that going away any year soon. > > So yes, your device will be renamed, so don't mess with the device name > like I mentioned in review of path 5/5 in this series. I think we can just drop that part (and the name stuff) then. Wedson, is that okay? > >> +pub unsafe trait RawDevice { >> + /// Returns the raw `struct device` related to `self`. >> + fn raw_device(&self) -> *mut bindings::device; > > Again, what prevents this device from going away? I don't see a call to > get_device() here :( That would be up to the caller, if it needs to keep the pointer around. Think of `raw_device()` as just "please return the `struct device *` pointer for yourself, thanks". If you need to store that pointer for later, you need to call `get_device()`. But most of the time it will be used transiently, and then you don't need to take any references since the safety requirement above guarantees that `self` owns a reference. The caller then just needs to make sure not to throw away `self` before it finishes using the returned pointer. Keep in mind that this is an internal abstraction, it's there to be implemented and used by the kernel crate itself. User code can't actually do anything with the returned pointer without using `unsafe` (so it's safe to expose this) and can't implement the trait without `unsafe` either. But it can take dynamic trait arguments of type RawDevice and pass them along, which is perfectly fine. Rust guarantees that the right `get_device()` and `put_device()` are called when references get cloned or dropped, for owned objects, and for borrowed references there is no need to touch the refcount. > And again, why are bindings needed for a "raw" struct device at all? > Shouldn't the bus-specific wrappings work better? Because lots of kernel subsystems need to be able to accept "any" device and don't care about the bus! That's what this is for. All the bus wrappers would implement this so they can be used as an argument for all those subsystems (plus a generic one when you just need to pass around an actual owned generic reference and no longer need bus-specific operations - you can materialize that out of a RawDevice impl, which is when get_device() would be called). That's why I'm introducing this now, because both io_pgtable and rtkit need to take `struct device` pointers on the C side so we need some "generic struct device" view on the Rust side. ~~ Lina
Thanks for the detailed rust explainations, I'd like to just highlight one thing: On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote: > On 24/02/2023 20.23, Greg Kroah-Hartman wrote: > > And again, why are bindings needed for a "raw" struct device at all? > > Shouldn't the bus-specific wrappings work better? > > Because lots of kernel subsystems need to be able to accept "any" device > and don't care about the bus! That's what this is for. That's great, but: > All the bus > wrappers would implement this so they can be used as an argument for all > those subsystems (plus a generic one when you just need to pass around > an actual owned generic reference and no longer need bus-specific > operations - you can materialize that out of a RawDevice impl, which is > when get_device() would be called). That's why I'm introducing this now, > because both io_pgtable and rtkit need to take `struct device` pointers > on the C side so we need some "generic struct device" view on the Rust side. In looking at both ftkit and io_pgtable, those seem to be good examples of how "not to use a struct device", so trying to make safe bindings from Rust to these frameworks is very ironic :) rtkit takes a struct device pointer and then never increments it, despite saving it off, which is unsafe. It then only uses it to print out messages if things go wrong (or right in some cases), which is odd. So it can get away from using a device pointer entirely, except for the devm_apple_rtkit_init() call, which I doubt you want to call from rust code, right? for io_pgtable, that's a bit messier, you want to pass in a device that io_pgtable treats as a "device" but again, it is NEVER properly reference counted, AND, it is only needed to try to figure out the bus operations that dma memory should be allocated from for this device. So what would be better to save off there would be a pointer to the bus, which is constant and soon will be read-only so there are no lifetime rules needed at all (see the major struct bus_type changes going into 6.3-rc1 that will enable that to happen). So the two subsystems you want to call from rust code don't properly handle the reference count of the object you are going to pass into it, and only need it for debugging and iommu stuff, which is really only the bus that the device is on, not good examples to start out with :) Yeah, this is yack-shaving, sorry, but it's how we clean up core subsystems for apis and implementations that are not really correct and were not noticed at the time. Can we see some users of this code posted so I can see how struct device is going to work in a rust driver? That's the thing I worry most about the rust/C interaction here as we have two different ways of thinking about reference counts from the two worlds and putting them together is going to be "interesting", as can be seen here already. thanks, greg k-h
On Fri, Feb 24, 2023 at 03:11:05PM +0100, Greg Kroah-Hartman wrote: > Thanks for the detailed rust explainations, I'd like to just highlight > one thing: > > On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote: > > On 24/02/2023 20.23, Greg Kroah-Hartman wrote: > > > And again, why are bindings needed for a "raw" struct device at all? > > > Shouldn't the bus-specific wrappings work better? > > > > Because lots of kernel subsystems need to be able to accept "any" device > > and don't care about the bus! That's what this is for. > > That's great, but: > > > All the bus > > wrappers would implement this so they can be used as an argument for all > > those subsystems (plus a generic one when you just need to pass around > > an actual owned generic reference and no longer need bus-specific > > operations - you can materialize that out of a RawDevice impl, which is > > when get_device() would be called). That's why I'm introducing this now, > > because both io_pgtable and rtkit need to take `struct device` pointers > > on the C side so we need some "generic struct device" view on the Rust side. > > In looking at both ftkit and io_pgtable, those seem to be good examples > of how "not to use a struct device", so trying to make safe bindings > from Rust to these frameworks is very ironic :) > > rtkit takes a struct device pointer and then never increments it, > despite saving it off, which is unsafe. It then only uses it to print > out messages if things go wrong (or right in some cases), which is odd. > So it can get away from using a device pointer entirely, except for the > devm_apple_rtkit_init() call, which I doubt you want to call from rust > code, right? > > for io_pgtable, that's a bit messier, you want to pass in a device that > io_pgtable treats as a "device" but again, it is NEVER properly > reference counted, AND, it is only needed to try to figure out the bus > operations that dma memory should be allocated from for this device. So > what would be better to save off there would be a pointer to the bus, > which is constant and soon will be read-only so there are no lifetime > rules needed at all (see the major struct bus_type changes going into > 6.3-rc1 that will enable that to happen). > > So the two subsystems you want to call from rust code don't properly > handle the reference count of the object you are going to pass into it, > and only need it for debugging and iommu stuff, which is really only the > bus that the device is on, not good examples to start out with :) > > Yeah, this is yack-shaving, sorry, but it's how we clean up core > subsystems for apis and implementations that are not really correct and > were not noticed at the time. > > Can we see some users of this code posted so I can see how struct device > is going to work in a rust driver? That's the thing I worry most about > the rust/C interaction here as we have two different ways of thinking > about reference counts from the two worlds and putting them together is > going to be "interesting", as can be seen here already. Also, where are you getting your 'struct device' from in the first place? What bus is createing it and giving it to your rust driver? thanks, greg k-h
On 2023-02-24 14:11, Greg Kroah-Hartman wrote: > Thanks for the detailed rust explainations, I'd like to just highlight > one thing: > > On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote: >> On 24/02/2023 20.23, Greg Kroah-Hartman wrote: >>> And again, why are bindings needed for a "raw" struct device at all? >>> Shouldn't the bus-specific wrappings work better? >> >> Because lots of kernel subsystems need to be able to accept "any" device >> and don't care about the bus! That's what this is for. > > That's great, but: > >> All the bus >> wrappers would implement this so they can be used as an argument for all >> those subsystems (plus a generic one when you just need to pass around >> an actual owned generic reference and no longer need bus-specific >> operations - you can materialize that out of a RawDevice impl, which is >> when get_device() would be called). That's why I'm introducing this now, >> because both io_pgtable and rtkit need to take `struct device` pointers >> on the C side so we need some "generic struct device" view on the Rust side. > > In looking at both ftkit and io_pgtable, those seem to be good examples > of how "not to use a struct device", so trying to make safe bindings > from Rust to these frameworks is very ironic :) > > rtkit takes a struct device pointer and then never increments it, > despite saving it off, which is unsafe. It then only uses it to print > out messages if things go wrong (or right in some cases), which is odd. > So it can get away from using a device pointer entirely, except for the > devm_apple_rtkit_init() call, which I doubt you want to call from rust > code, right? > > for io_pgtable, that's a bit messier, you want to pass in a device that > io_pgtable treats as a "device" but again, it is NEVER properly > reference counted, AND, it is only needed to try to figure out the bus > operations that dma memory should be allocated from for this device. So > what would be better to save off there would be a pointer to the bus, > which is constant and soon will be read-only so there are no lifetime > rules needed at all (see the major struct bus_type changes going into > 6.3-rc1 that will enable that to happen). FWIW the DMA API *has* to know which specific device it's operating with, since the relevant properties can and do vary even between different devices within a single bus_type (e.g. DMA masks). In the case of io-pgtable at least, there's no explicit refcounting since the struct device must be the one representing the physical platform/PCI/etc. device consuming the pagetable, so if that were to disappear from underneath its driver while the pagetable is still in use, things would already have gone very very wrong indeed :) Cheers, Robin. > So the two subsystems you want to call from rust code don't properly > handle the reference count of the object you are going to pass into it, > and only need it for debugging and iommu stuff, which is really only the > bus that the device is on, not good examples to start out with :) > > Yeah, this is yack-shaving, sorry, but it's how we clean up core > subsystems for apis and implementations that are not really correct and > were not noticed at the time. > > Can we see some users of this code posted so I can see how struct device > is going to work in a rust driver? That's the thing I worry most about > the rust/C interaction here as we have two different ways of thinking > about reference counts from the two worlds and putting them together is > going to be "interesting", as can be seen here already. > > thanks, > > greg k-h
On 2023/02/24 23:11, Greg Kroah-Hartman wrote: > Thanks for the detailed rust explainations, I'd like to just highlight > one thing: > > On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote: >> On 24/02/2023 20.23, Greg Kroah-Hartman wrote: >>> And again, why are bindings needed for a "raw" struct device at all? >>> Shouldn't the bus-specific wrappings work better? >> >> Because lots of kernel subsystems need to be able to accept "any" device >> and don't care about the bus! That's what this is for. > > That's great, but: > >> All the bus >> wrappers would implement this so they can be used as an argument for all >> those subsystems (plus a generic one when you just need to pass around >> an actual owned generic reference and no longer need bus-specific >> operations - you can materialize that out of a RawDevice impl, which is >> when get_device() would be called). That's why I'm introducing this now, >> because both io_pgtable and rtkit need to take `struct device` pointers >> on the C side so we need some "generic struct device" view on the Rust side. > > In looking at both ftkit and io_pgtable, those seem to be good examples > of how "not to use a struct device", so trying to make safe bindings > from Rust to these frameworks is very ironic :) And this is why I want to use Rust, and why writing the abstractions for C code is so difficult... Rust encodes all these rules in the type system, but C doesn't, and so many kernel APIs don't document any of this or what the requirements are... > rtkit takes a struct device pointer and then never increments it, > despite saving it off, which is unsafe. It then only uses it to print > out messages if things go wrong (or right in some cases), which is odd. > So it can get away from using a device pointer entirely, except for the > devm_apple_rtkit_init() call, which I doubt you want to call from rust > code, right? That sounds like we need to fix the C side to grab a reference ^^ We do need to pass the device to the init function though (apple_rtkit_init(), this is in the SoC tree which I mentioned as a prequisite and already on the way to 6.3-rc1), since at the very least it has to pick up the mailbox and all that to initialize. Alternatively we could say that the C API contract is that the user of rtkit has to own a reference, and then the Rust abstraction would have to take that reference to make a safe abstraction, but that doesn't sound like the better option. What do you recommend for things that want to print device-associated messages, if not holding a reference to the device? Or did I misunderstand what you meant? Just pr_foo() isn't great because we have a lot of instances of rtkit and then you wouldn't know which device the messages are about... > for io_pgtable, that's a bit messier, you want to pass in a device that > io_pgtable treats as a "device" but again, it is NEVER properly > reference counted, AND, it is only needed to try to figure out the bus > operations that dma memory should be allocated from for this device. So > what would be better to save off there would be a pointer to the bus, > which is constant and soon will be read-only so there are no lifetime > rules needed at all (see the major struct bus_type changes going into > 6.3-rc1 that will enable that to happen). > > So the two subsystems you want to call from rust code don't properly > handle the reference count of the object you are going to pass into it, > and only need it for debugging and iommu stuff, which is really only the > bus that the device is on, not good examples to start out with :) Well, they're two examples that are dependencies for the driver I wrote, and I don't think you want me picking easy examples with zero known upcoming users... ^^;; > Yeah, this is yack-shaving, sorry, but it's how we clean up core > subsystems for apis and implementations that are not really correct and > were not noticed at the time. I'm fine with helping fix all this, and I don't expect all the underlying C code to be perfect already either! I already fixed one locking bug in DRM and spent a lot of time trying to figure out lifetime rules there, but I didn't dig into rtkit/io_pgtable and didn't realize they don't take references properly... > Can we see some users of this code posted so I can see how struct device > is going to work in a rust driver? That's the thing I worry most about > the rust/C interaction here as we have two different ways of thinking > about reference counts from the two worlds and putting them together is > going to be "interesting", as can be seen here already. I linked a tree with everything in the cover letter ([4]), look in drivers/gpu/drm/asahi for the actual driver. But there are a lot of other dependencies that have to go in before that will compile (everything else in that branch...) I know it's hard to review without examples, but I also can't just post the driver and everything else as one series now, there's still a lot to be improved and fixed and I'm working with the Rust folks on figuring out a roadmap for that... and waiting until "everything" is ready and perfect would mean we don't get anything done in the meantime and fall into a pit of endless rebasing and coordinating downstream trees, which also isn't good... ~~ Lina
On 2023/02/24 23:19, Greg Kroah-Hartman wrote:>> Can we see some users of this code posted so I can see how struct device >> is going to work in a rust driver? That's the thing I worry most about >> the rust/C interaction here as we have two different ways of thinking >> about reference counts from the two worlds and putting them together is >> going to be "interesting", as can be seen here already. > > Also, where are you getting your 'struct device' from in the first > place? What bus is createing it and giving it to your rust driver? That would be platform for my GPU driver, matched via OF compatible. ~~ Lina
On 2023/02/24 23:32, Robin Murphy wrote: > On 2023-02-24 14:11, Greg Kroah-Hartman wrote: >> Thanks for the detailed rust explainations, I'd like to just highlight >> one thing: >> >> On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote: >>> On 24/02/2023 20.23, Greg Kroah-Hartman wrote: >>>> And again, why are bindings needed for a "raw" struct device at all? >>>> Shouldn't the bus-specific wrappings work better? >>> >>> Because lots of kernel subsystems need to be able to accept "any" device >>> and don't care about the bus! That's what this is for. >> >> That's great, but: >> >>> All the bus >>> wrappers would implement this so they can be used as an argument for all >>> those subsystems (plus a generic one when you just need to pass around >>> an actual owned generic reference and no longer need bus-specific >>> operations - you can materialize that out of a RawDevice impl, which is >>> when get_device() would be called). That's why I'm introducing this now, >>> because both io_pgtable and rtkit need to take `struct device` pointers >>> on the C side so we need some "generic struct device" view on the >>> Rust side. >> >> In looking at both ftkit and io_pgtable, those seem to be good examples >> of how "not to use a struct device", so trying to make safe bindings >> from Rust to these frameworks is very ironic :) >> >> rtkit takes a struct device pointer and then never increments it, >> despite saving it off, which is unsafe. It then only uses it to print >> out messages if things go wrong (or right in some cases), which is odd. >> So it can get away from using a device pointer entirely, except for the >> devm_apple_rtkit_init() call, which I doubt you want to call from rust >> code, right? >> >> for io_pgtable, that's a bit messier, you want to pass in a device that >> io_pgtable treats as a "device" but again, it is NEVER properly >> reference counted, AND, it is only needed to try to figure out the bus >> operations that dma memory should be allocated from for this device. So >> what would be better to save off there would be a pointer to the bus, >> which is constant and soon will be read-only so there are no lifetime >> rules needed at all (see the major struct bus_type changes going into >> 6.3-rc1 that will enable that to happen). > > FWIW the DMA API *has* to know which specific device it's operating > with, since the relevant properties can and do vary even between > different devices within a single bus_type (e.g. DMA masks). > > In the case of io-pgtable at least, there's no explicit refcounting > since the struct device must be the one representing the physical > platform/PCI/etc. device consuming the pagetable, so if that were to > disappear from underneath its driver while the pagetable is still in > use, things would already have gone very very wrong indeed :) There's no terribly good way to encode this relationship in safe Rust as far as I know. So although it might be "obvious" (and I think my driver can never violate it as it is currently designed), this means the Rust abstraction will have to take the device reference if the C side does not, because safe rust abstractions have to actually make these bugs impossible and nothing stops a Rust driver from, say, stashing an io_pgtable reference into a global and letting the device go away. ~~ Lina
On 2023-02-24 14:48, Asahi Lina wrote: > > > On 2023/02/24 23:32, Robin Murphy wrote: >> On 2023-02-24 14:11, Greg Kroah-Hartman wrote: >>> Thanks for the detailed rust explainations, I'd like to just highlight >>> one thing: >>> >>> On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote: >>>> On 24/02/2023 20.23, Greg Kroah-Hartman wrote: >>>>> And again, why are bindings needed for a "raw" struct device at all? >>>>> Shouldn't the bus-specific wrappings work better? >>>> >>>> Because lots of kernel subsystems need to be able to accept "any" device >>>> and don't care about the bus! That's what this is for. >>> >>> That's great, but: >>> >>>> All the bus >>>> wrappers would implement this so they can be used as an argument for all >>>> those subsystems (plus a generic one when you just need to pass around >>>> an actual owned generic reference and no longer need bus-specific >>>> operations - you can materialize that out of a RawDevice impl, which is >>>> when get_device() would be called). That's why I'm introducing this now, >>>> because both io_pgtable and rtkit need to take `struct device` pointers >>>> on the C side so we need some "generic struct device" view on the >>>> Rust side. >>> >>> In looking at both ftkit and io_pgtable, those seem to be good examples >>> of how "not to use a struct device", so trying to make safe bindings >>> from Rust to these frameworks is very ironic :) >>> >>> rtkit takes a struct device pointer and then never increments it, >>> despite saving it off, which is unsafe. It then only uses it to print >>> out messages if things go wrong (or right in some cases), which is odd. >>> So it can get away from using a device pointer entirely, except for the >>> devm_apple_rtkit_init() call, which I doubt you want to call from rust >>> code, right? >>> >>> for io_pgtable, that's a bit messier, you want to pass in a device that >>> io_pgtable treats as a "device" but again, it is NEVER properly >>> reference counted, AND, it is only needed to try to figure out the bus >>> operations that dma memory should be allocated from for this device. So >>> what would be better to save off there would be a pointer to the bus, >>> which is constant and soon will be read-only so there are no lifetime >>> rules needed at all (see the major struct bus_type changes going into >>> 6.3-rc1 that will enable that to happen). >> >> FWIW the DMA API *has* to know which specific device it's operating >> with, since the relevant properties can and do vary even between >> different devices within a single bus_type (e.g. DMA masks). >> >> In the case of io-pgtable at least, there's no explicit refcounting >> since the struct device must be the one representing the physical >> platform/PCI/etc. device consuming the pagetable, so if that were to >> disappear from underneath its driver while the pagetable is still in >> use, things would already have gone very very wrong indeed :) > > There's no terribly good way to encode this relationship in safe Rust as > far as I know. So although it might be "obvious" (and I think my driver > can never violate it as it is currently designed), this means the Rust > abstraction will have to take the device reference if the C side does > not, because safe rust abstractions have to actually make these bugs > impossible and nothing stops a Rust driver from, say, stashing an > io_pgtable reference into a global and letting the device go away. If someone did that, then simply holding a struct device reference wouldn't guarantee much, since it only prevents the pointer itself from becoming invalid - it still doesn't say any of the data *in* the structure is still valid and "safe" for what a DMA API call might do with it. At the very least you'd probably have to somehow also guarantee that the device has a driver bound (which is the closest thing to a general indication of valid DMA ops across all architectures) and block it from unbinding for the lifetime of the reference, but that would then mean a simple driver which expects to tear down its io-pgtable from its .remove callback could never be unbound due to the circular dependency :/ Cheers, Robin.
On Fri, Feb 24, 2023 at 11:43:54PM +0900, Asahi Lina wrote: > > > On 2023/02/24 23:11, Greg Kroah-Hartman wrote: > > Thanks for the detailed rust explainations, I'd like to just highlight > > one thing: > > > > On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote: > >> On 24/02/2023 20.23, Greg Kroah-Hartman wrote: > >>> And again, why are bindings needed for a "raw" struct device at all? > >>> Shouldn't the bus-specific wrappings work better? > >> > >> Because lots of kernel subsystems need to be able to accept "any" device > >> and don't care about the bus! That's what this is for. > > > > That's great, but: > > > >> All the bus > >> wrappers would implement this so they can be used as an argument for all > >> those subsystems (plus a generic one when you just need to pass around > >> an actual owned generic reference and no longer need bus-specific > >> operations - you can materialize that out of a RawDevice impl, which is > >> when get_device() would be called). That's why I'm introducing this now, > >> because both io_pgtable and rtkit need to take `struct device` pointers > >> on the C side so we need some "generic struct device" view on the Rust side. > > > > In looking at both ftkit and io_pgtable, those seem to be good examples > > of how "not to use a struct device", so trying to make safe bindings > > from Rust to these frameworks is very ironic :) > > And this is why I want to use Rust, and why writing the abstractions for > C code is so difficult... Rust encodes all these rules in the type > system, but C doesn't, and so many kernel APIs don't document any of > this or what the requirements are... I totally agree, and is why I'm wanting to review this stuff. > > rtkit takes a struct device pointer and then never increments it, > > despite saving it off, which is unsafe. It then only uses it to print > > out messages if things go wrong (or right in some cases), which is odd. > > So it can get away from using a device pointer entirely, except for the > > devm_apple_rtkit_init() call, which I doubt you want to call from rust > > code, right? > > That sounds like we need to fix the C side to grab a reference ^^ I agree. Or remove the reference entirely. > We do need to pass the device to the init function though > (apple_rtkit_init(), this is in the SoC tree which I mentioned as a > prequisite and already on the way to 6.3-rc1), since at the very least > it has to pick up the mailbox and all that to initialize. As I said, the rtkit code does not actually use the device pointer at all except for some (I would say pointless) log messages. So we could change the rtkit code either way, I would go for the removal. > Alternatively we could say that the C API contract is that the user of > rtkit has to own a reference, and then the Rust abstraction would have > to take that reference to make a safe abstraction, but that doesn't > sound like the better option. Yeah, that feels odd. > What do you recommend for things that want to print device-associated > messages, if not holding a reference to the device? If you aren't holding a reference to the device, that means you aren't associated to it at all, so you better not be printing out anything related to any device as that pointer could be invalid at any point in time. > Or did I > misunderstand what you meant? Just pr_foo() isn't great because we have > a lot of instances of rtkit and then you wouldn't know which device the > messages are about... Then the rtkit code needs to be changed to properly grab the reference and actually use it for something other than just a log message. If it only wants it for a log message, then let's just drop it and have the rtkit code go quiet, as when kernel code is working properly, it should be quiet. If something goes wrong, the code that called into rtkit can print out a message based on the error return values. I have no idea what "rtkit" is, if it's an interface to hardware, why doesn't it have its own struct device that it creates and manages and uses instead? In my quick glance, that feels like the real solution here instead of just "I hope this pointer is going to be valid" like it lives with today. Odds are you can't remove a rtkit device at runtime, so no one has noticed this yet... > > for io_pgtable, that's a bit messier, you want to pass in a device that > > io_pgtable treats as a "device" but again, it is NEVER properly > > reference counted, AND, it is only needed to try to figure out the bus > > operations that dma memory should be allocated from for this device. So > > what would be better to save off there would be a pointer to the bus, > > which is constant and soon will be read-only so there are no lifetime > > rules needed at all (see the major struct bus_type changes going into > > 6.3-rc1 that will enable that to happen). > > > > So the two subsystems you want to call from rust code don't properly > > handle the reference count of the object you are going to pass into it, > > and only need it for debugging and iommu stuff, which is really only the > > bus that the device is on, not good examples to start out with :) > > Well, they're two examples that are dependencies for the driver I wrote, > and I don't think you want me picking easy examples with zero known > upcoming users... ^^;; Hey, this is good, I like it, no complaints! > > Yeah, this is yack-shaving, sorry, but it's how we clean up core > > subsystems for apis and implementations that are not really correct and > > were not noticed at the time. > > I'm fine with helping fix all this, and I don't expect all the > underlying C code to be perfect already either! I already fixed one > locking bug in DRM and spent a lot of time trying to figure out lifetime > rules there, but I didn't dig into rtkit/io_pgtable and didn't realize > they don't take references properly... The iomem code is under heavy-flux right now so let's see what 6.3-rc1 looks like to determine if this still needs to be resolved or not. I hope the latest set of fixes/reworks for that subsystem land. Either way, I'll look into what needs to be done there as obviously, the current code is not correct. > > Can we see some users of this code posted so I can see how struct device > > is going to work in a rust driver? That's the thing I worry most about > > the rust/C interaction here as we have two different ways of thinking > > about reference counts from the two worlds and putting them together is > > going to be "interesting", as can be seen here already. > > I linked a tree with everything in the cover letter ([4]), look in > drivers/gpu/drm/asahi for the actual driver. But there are a lot of > other dependencies that have to go in before that will compile > (everything else in that branch...) Sorry, it's hard to notice random github repos and branches, especially when traveling, that's why we use email for review. > I know it's hard to review without examples, but I also can't just post > the driver and everything else as one series now, there's still a lot to > be improved and fixed and I'm working with the Rust folks on figuring > out a roadmap for that... and waiting until "everything" is ready and > perfect would mean we don't get anything done in the meantime and fall > into a pit of endless rebasing and coordinating downstream trees, which > also isn't good... Yeah, it's a chicken and egg issue right now, no worries, I understand. This is going to take some cycles to get right. thanks, greg k-h
On Fri, Feb 24, 2023 at 11:44:59PM +0900, Asahi Lina wrote: > On 2023/02/24 23:19, Greg Kroah-Hartman wrote:>> Can we see some users > of this code posted so I can see how struct device > >> is going to work in a rust driver? That's the thing I worry most about > >> the rust/C interaction here as we have two different ways of thinking > >> about reference counts from the two worlds and putting them together is > >> going to be "interesting", as can be seen here already. > > > > Also, where are you getting your 'struct device' from in the first > > place? What bus is createing it and giving it to your rust driver? > > That would be platform for my GPU driver, matched via OF compatible. Ick, a platform device? The GPU isn't on the PCI bus? Wow, that's horrid...
On Fri, Feb 24, 2023 at 02:32:47PM +0000, Robin Murphy wrote: > On 2023-02-24 14:11, Greg Kroah-Hartman wrote: > > Thanks for the detailed rust explainations, I'd like to just highlight > > one thing: > > > > On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote: > > > On 24/02/2023 20.23, Greg Kroah-Hartman wrote: > > > > And again, why are bindings needed for a "raw" struct device at all? > > > > Shouldn't the bus-specific wrappings work better? > > > > > > Because lots of kernel subsystems need to be able to accept "any" device > > > and don't care about the bus! That's what this is for. > > > > That's great, but: > > > > > All the bus > > > wrappers would implement this so they can be used as an argument for all > > > those subsystems (plus a generic one when you just need to pass around > > > an actual owned generic reference and no longer need bus-specific > > > operations - you can materialize that out of a RawDevice impl, which is > > > when get_device() would be called). That's why I'm introducing this now, > > > because both io_pgtable and rtkit need to take `struct device` pointers > > > on the C side so we need some "generic struct device" view on the Rust side. > > > > In looking at both ftkit and io_pgtable, those seem to be good examples > > of how "not to use a struct device", so trying to make safe bindings > > from Rust to these frameworks is very ironic :) > > > > rtkit takes a struct device pointer and then never increments it, > > despite saving it off, which is unsafe. It then only uses it to print > > out messages if things go wrong (or right in some cases), which is odd. > > So it can get away from using a device pointer entirely, except for the > > devm_apple_rtkit_init() call, which I doubt you want to call from rust > > code, right? > > > > for io_pgtable, that's a bit messier, you want to pass in a device that > > io_pgtable treats as a "device" but again, it is NEVER properly > > reference counted, AND, it is only needed to try to figure out the bus > > operations that dma memory should be allocated from for this device. So > > what would be better to save off there would be a pointer to the bus, > > which is constant and soon will be read-only so there are no lifetime > > rules needed at all (see the major struct bus_type changes going into > > 6.3-rc1 that will enable that to happen). > > FWIW the DMA API *has* to know which specific device it's operating with, > since the relevant properties can and do vary even between different devices > within a single bus_type (e.g. DMA masks). What bus_type has different DMA masks depending on the device on that bus today? And the iommu ops are on the bus, not the device, but there is a iommu_group on the device, is that what you are referring to? Am I getting iommu and dma stuff confused here? A bus also has dma callbacks, but yet the device itself has dma_ops? > In the case of io-pgtable at least, there's no explicit refcounting since > the struct device must be the one representing the physical > platform/PCI/etc. device consuming the pagetable, so if that were to > disappear from underneath its driver while the pagetable is still in use, > things would already have gone very very wrong indeed :) But that could happen at any point in time, the device can be removed from the system with no warning, how do you guarantee that io-pgtable is being initialized with a device that can NOT be removed? Think of drawers containing CPUs and PCI devices and memory, Linux has supported hot-removal of those for decades. (well, not for memory, we could only hot-add that...) Again, passing in a pointer to a struct device, and saving it off WITHOUT incrementing the reference count is not ok, that's not how reference counts work... But again, let's see about disconnecting the iommu ops entirely from the device and just relying on the bus, that should work better, rigth? thanks, greg k-h
On 25/02/2023 00.24, Greg Kroah-Hartman wrote: >> What do you recommend for things that want to print device-associated >> messages, if not holding a reference to the device? > > If you aren't holding a reference to the device, that means you aren't > associated to it at all, so you better not be printing out anything > related to any device as that pointer could be invalid at any point in > time. The RTKit code talks to the firmware coprocessor that is part of the device, so it definitely is associated to it... among other things it prints out firmware logs from the device and crash logs, manages memory buffers (which have default implementations but can be overridden by the user driver), and more. It's essentially library code shared by all device drivers that interact with devices with these coprocessors. >> Or did I >> misunderstand what you meant? Just pr_foo() isn't great because we have >> a lot of instances of rtkit and then you wouldn't know which device the >> messages are about... > > Then the rtkit code needs to be changed to properly grab the reference > and actually use it for something other than just a log message. If it > only wants it for a log message, then let's just drop it and have the > rtkit code go quiet, as when kernel code is working properly, it should > be quiet. If something goes wrong, the code that called into rtkit can > print out a message based on the error return values. Keep in mind rtkit does things like print out crash logs, and you wouldn't want the caller to be responsible for that (and we definitely want those in dmesg since these coprocessors are non-recoverable, it's almost as bad as a kernel panic: you will have to reboot to be able to use the machine properly again). I find those crash logs very useful to figure out what went wrong with the GPU (especially if combined with a memory dump which we don't expose to regular users right now, but which I have ideas for... but even without that, just assert messages from the coprocessor or fault instruction pointers that I can correlate with the firmware are very useful on their own). Right now rtkit also prints out syslogs from the coprocessors. That's noisy for some but I think very useful, since we're dealing with reverse engineered drivers. We'll probably want to silence those for some noisy coprocessors at some point, but I don't think we want to do that until things are all upstream, stable, and with a larger user base... until then I think we'd much rather be spammy and have a better chance of debugging rare issues, which often happen with these coprocessors running big firmware blobs... there are a lot of subtleties in getting the interfaces right, never mind cache coherence issues! > I have no idea what "rtkit" is, if it's an interface to hardware, why > doesn't it have its own struct device that it creates and manages and > uses instead? In my quick glance, that feels like the real solution > here instead of just "I hope this pointer is going to be valid" like it > lives with today. Odds are you can't remove a rtkit device at runtime, > so no one has noticed this yet... Well, they're all embedded into the SoC, yes. RTKit is Apple's firmware RTOS, and also the name for the semi-standardized mailbox/shared-memory interface shared by different firmwares using it. The Linux code to drive it doesn't create its own "struct device" because the rtkit code is just library code that is extended by the downstream drivers (like mine). How each driver interacts with rtkit varies widely... NVMe almost doesn't at all other than for power management, there is actually a downstream "rtkit-helper" driver that is a proper standalone device wrapper for one case (MTP) where it really doesn't need to interact at all... in my case with the GPU, almost everything is shared memory and doorbells over the mailbox. Other drivers like DCP actually send pointers over multiple mailbox endpoints, or do most of their data exchange directly over messages like that (SMC). So in a way, if we consider it driver library code, it's not unreasonable for RTKit to require that the device you pass it outlives it. Certainly, if the device is getting unbound from your driver, you'd need to tear down RTKit as part of that in any reasonable situation. >> I know it's hard to review without examples, but I also can't just post >> the driver and everything else as one series now, there's still a lot to >> be improved and fixed and I'm working with the Rust folks on figuring >> out a roadmap for that... and waiting until "everything" is ready and >> perfect would mean we don't get anything done in the meantime and fall >> into a pit of endless rebasing and coordinating downstream trees, which >> also isn't good... > > Yeah, it's a chicken and egg issue right now, no worries, I understand. > This is going to take some cycles to get right. > Thank you ^^ ~~ Lina
On 25/02/2023 00.25, Greg Kroah-Hartman wrote: > On Fri, Feb 24, 2023 at 11:44:59PM +0900, Asahi Lina wrote: >> On 2023/02/24 23:19, Greg Kroah-Hartman wrote:>> Can we see some users >> of this code posted so I can see how struct device >>>> is going to work in a rust driver? That's the thing I worry most about >>>> the rust/C interaction here as we have two different ways of thinking >>>> about reference counts from the two worlds and putting them together is >>>> going to be "interesting", as can be seen here already. >>> >>> Also, where are you getting your 'struct device' from in the first >>> place? What bus is createing it and giving it to your rust driver? >> >> That would be platform for my GPU driver, matched via OF compatible. > > Ick, a platform device? The GPU isn't on the PCI bus? Wow, that's > horrid... The internal NVMe controller also isn't on the PCI bus (which is worse since that actually has standard PCI bindings)... nothing is on PCI buses other than external PCI/Thunderbolt devices. There are no internal fake-PCI devices on Apple Silicon like there are on Intel. That's how it is on most SoCs (though Apple is the only vendor who has dared to go as far as integrating NVMe as far as I know)... ~~ Lina
On 25/02/2023 00.14, Robin Murphy wrote: > On 2023-02-24 14:48, Asahi Lina wrote: >> >> >> On 2023/02/24 23:32, Robin Murphy wrote: >>> FWIW the DMA API *has* to know which specific device it's operating >>> with, since the relevant properties can and do vary even between >>> different devices within a single bus_type (e.g. DMA masks). >>> >>> In the case of io-pgtable at least, there's no explicit refcounting >>> since the struct device must be the one representing the physical >>> platform/PCI/etc. device consuming the pagetable, so if that were to >>> disappear from underneath its driver while the pagetable is still in >>> use, things would already have gone very very wrong indeed :) >> >> There's no terribly good way to encode this relationship in safe Rust as >> far as I know. So although it might be "obvious" (and I think my driver >> can never violate it as it is currently designed), this means the Rust >> abstraction will have to take the device reference if the C side does >> not, because safe rust abstractions have to actually make these bugs >> impossible and nothing stops a Rust driver from, say, stashing an >> io_pgtable reference into a global and letting the device go away. > > If someone did that, then simply holding a struct device reference > wouldn't guarantee much, since it only prevents the pointer itself from > becoming invalid - it still doesn't say any of the data *in* the > structure is still valid and "safe" for what a DMA API call might do > with it. > > At the very least you'd probably have to somehow also guarantee that the > device has a driver bound (which is the closest thing to a general > indication of valid DMA ops across all architectures) and block it from > unbinding for the lifetime of the reference, but that would then mean a > simple driver which expects to tear down its io-pgtable from its .remove > callback could never be unbound due to the circular dependency :/ I'd like to hear from the other Rust folks about ideas on how to solve this ^^. I think it might fit into the Revocable model for device resources, but it's going to be a bit of an issue for my driver because for me, the io_pgtable objects are deep in the Vm object that has its own refcounting, and those are created and destroyed dynamically at runtime, not just during device probe... But it's also possible that we have to resign ourselves to a somewhat leaky/unsound abstraction under the understanding that the C API has requirements that are difficult to encode in the Rust type system... There's actually a much longer story to tell here though... I've done a lot of thinking about what "safe" means in the context of the kernel, what the useful safety boundaries are, and things like that. Especially within complex drivers, the answer is a lot greyer than what you tend to get with Rust userspace code. But at the end of the day, Rust lets you make things much safer and reduce the chances of bugs creeping in dramatically, even if you have some slightly unsound abstractions and other issues. Maybe I should write a long form blog post about this, would people find that useful? For whatever this anecdote is worth, my GPU driver is about 16000 lines of Rust and has ~100 unsafe blocks, and has been in a few downstream distro repos since December with actual users and nobody has ever managed to oops it as far as I know, including other Mesa developers who have been feeding it all kinds of broken command buffers and things like that. It has survived OOMs leading to untested error paths, UAPI fuzzing, and endless GPU faults/timeouts... and I definitely am not a genius who writes bug-free code! The firmware itself is also very easy to crash (raw unsafe pointers everywhere in shared memory), and crashes are unrecoverable (you need to reboot), but I try to use Rust's type system and ownership model to extend safety to that interface as much as possible and I think people other than me have only managed to get the GPU firmware to crash twice (both cache coherency related, I *think* I've finally eliminated that issue at the root now). The uptime on my M2 streaming machine running a decent GPU workload for 20 hours or so weekly is 44 days and counting... I think that old kernel even has some minor memory leaks that I've since fixed (you *can* get those in Rust with circular references), but it's so minor it's going to be at least a few more months before it becomes a problem. So I guess what I'm saying is that at the end of the day, if we can't get an interface to be 100% safe and sound and usable, that's probably okay. We're still getting a lot of safe mileage out of the other 99%! ^^ ~~ Lina
On 2023-02-24 15:32, Greg Kroah-Hartman wrote: > On Fri, Feb 24, 2023 at 02:32:47PM +0000, Robin Murphy wrote: >> On 2023-02-24 14:11, Greg Kroah-Hartman wrote: >>> Thanks for the detailed rust explainations, I'd like to just highlight >>> one thing: >>> >>> On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote: >>>> On 24/02/2023 20.23, Greg Kroah-Hartman wrote: >>>>> And again, why are bindings needed for a "raw" struct device at all? >>>>> Shouldn't the bus-specific wrappings work better? >>>> >>>> Because lots of kernel subsystems need to be able to accept "any" device >>>> and don't care about the bus! That's what this is for. >>> >>> That's great, but: >>> >>>> All the bus >>>> wrappers would implement this so they can be used as an argument for all >>>> those subsystems (plus a generic one when you just need to pass around >>>> an actual owned generic reference and no longer need bus-specific >>>> operations - you can materialize that out of a RawDevice impl, which is >>>> when get_device() would be called). That's why I'm introducing this now, >>>> because both io_pgtable and rtkit need to take `struct device` pointers >>>> on the C side so we need some "generic struct device" view on the Rust side. >>> >>> In looking at both ftkit and io_pgtable, those seem to be good examples >>> of how "not to use a struct device", so trying to make safe bindings >>> from Rust to these frameworks is very ironic :) >>> >>> rtkit takes a struct device pointer and then never increments it, >>> despite saving it off, which is unsafe. It then only uses it to print >>> out messages if things go wrong (or right in some cases), which is odd. >>> So it can get away from using a device pointer entirely, except for the >>> devm_apple_rtkit_init() call, which I doubt you want to call from rust >>> code, right? >>> >>> for io_pgtable, that's a bit messier, you want to pass in a device that >>> io_pgtable treats as a "device" but again, it is NEVER properly >>> reference counted, AND, it is only needed to try to figure out the bus >>> operations that dma memory should be allocated from for this device. So >>> what would be better to save off there would be a pointer to the bus, >>> which is constant and soon will be read-only so there are no lifetime >>> rules needed at all (see the major struct bus_type changes going into >>> 6.3-rc1 that will enable that to happen). >> >> FWIW the DMA API *has* to know which specific device it's operating with, >> since the relevant properties can and do vary even between different devices >> within a single bus_type (e.g. DMA masks). > > What bus_type has different DMA masks depending on the device on that > bus today? Certainly PCI and platform, which are about 99% of what matters in terms of DMA. You may say "ewww" again, but even on something as "normal" as a PCIe GPU chip, the audio codec function may well have a different DMA mask from the GPU function, let alone distinct devices across cards/slots/segments/etc. differing - e.g. some PCIe USB controllers are still limited to 32 bits, where your more-capable GPU/NVMe/NIC etc. would definitely not be cool with a lowest-common-denominator constraint. > And the iommu ops are on the bus, not the device, but there is a > iommu_group on the device, is that what you are referring to? > > Am I getting iommu and dma stuff confused here? A bus also has dma > callbacks, but yet the device itself has dma_ops? Confusion is excusable here - the bus "dma_configure" callbacks are mostly actually IOMMU configuration, which is a bit of historical legacy that still needs more cleaning up (because it makes things happen in the wrong order from what the IOMMU API really wants). I'm hoping to get to that soon once we land my current series finishing the bus->iommu_ops removal. (FWIW what I want to do is flip things around so the buses just provide a method for the IOMMU API to to retrieve whatever bus-defined IDs it needs to associate with a given device, rather than having the bus code call into the IOMMU API itself as currently.) >> In the case of io-pgtable at least, there's no explicit refcounting since >> the struct device must be the one representing the physical >> platform/PCI/etc. device consuming the pagetable, so if that were to >> disappear from underneath its driver while the pagetable is still in use, >> things would already have gone very very wrong indeed :) > > But that could happen at any point in time, the device can be removed > from the system with no warning, how do you guarantee that io-pgtable is > being initialized with a device that can NOT be removed? > > Think of drawers containing CPUs and PCI devices and memory, Linux > has supported hot-removal of those for decades. (well, not for memory, > we could only hot-add that...) > > Again, passing in a pointer to a struct device, and saving it off > WITHOUT incrementing the reference count is not ok, that's not how > reference counts work... io-pgtable is just a utility library which that device's driver is calling into. It never does anything with that pointer by itself, and it's implicitly expected that the ops are only going to be called by whoever allocated them. If the driver needs to cope with a surprise removal event while bound, that's on the driver, but I'd assume the bus code is still expected to at least give it a chance to clean up gracefully before the struct device completely disappears. I don't think I've ever seen a driver explicitly take a reference on its own device for "normal" operation, so either everything's wrong, or that guarantee must already come from deeper in the driver core. Yes, *technically* someone could invoke io-pgtable ops from some context other than the driver which allocated them, but it would be wrong and the answer would be "don't do that". We've got a huge quantity of APIs all over Linux where the expected usage model is intuitive, but impractical to formally assert, so it's just assumed that you do the right thing and expect fireworks if you don't. I do fully appreciate that "don't do that" is not an easy thing for Rust to nail down. There's nothing inherently special about saving a pointer in a structure either - if someone's failing to respect the expected conventions, then the lifetime of a pointer *anywhere* matters, and even something as innocuous as dev_name() should strictly take a reference on its argument in case it's called concurrently with the device being freed. In fact, consider if get_device() is pre-empted right on entry, and by the time it comes back to dereference the kobject, someone else has racily dropped the last reference and already freed it. Unless we can draw a line of reasonable expectations somewhere, we've no pretence of being able to guarantee *anything*. > But again, let's see about disconnecting the iommu ops entirely from the > device and just relying on the bus, that should work better, rigth? Um, other way - iommu_ops have almost finished moving *off* the bus to be per-device. You acked the patch the other week and seemed pleased with it ;) But either way that's orthogonal to this case, since the device that io-pgtable deals with cannot have iommu_ops of its own (it *is* the IOMMU, or at least a GPU MMU acting as one). Cheers, Robin.
On Fri, Feb 24, 2023 at 5:23 PM Asahi Lina <lina@asahilina.net> wrote: > > So I guess what I'm saying is that at the end of the day, if we can't > get an interface to be 100% safe and sound and usable, that's probably > okay. We're still getting a lot of safe mileage out of the other 99%! ^^ We talked a few times about what approach to take for things where a fully safe API is not feasible. There have been differing opinions in the past. One approach would be requiring a "global `unsafe`" so to speak, once, somewhere in the API -- it does not improve things much, but at least it would make the user acknowledge the pitfalls of that particular API/feature/subsystem/hardware/... e.g. for module unloading, one could ask for an `unsafe` inside the `module!` macro invocation (like `unsafe unloadable: true,`). This would allow for `// SAFETY: ...` comments. Another approach would be declaring some of those "external entities" outside the scope of Rust's safety guarantees, like it is done for e.g. `/proc/self/mem` in userspace Rust [1]. They would be documented wherever relevant, and perhaps we could have an "acknowledged soundness holes" list. Having said that, of course, what we definitely don't want to allow is for subsystems to provide unsound safe APIs for no reason. [1] https://doc.rust-lang.org/stable/std/os/unix/io/index.html#procselfmem-and-similar-os-features Cheers, Miguel
On Fri, Feb 24, 2023 at 06:52:04PM +0000, Robin Murphy wrote: > > But again, let's see about disconnecting the iommu ops entirely from the > > device and just relying on the bus, that should work better, rigth? > > Um, other way - iommu_ops have almost finished moving *off* the bus to be > per-device. You acked the patch the other week and seemed pleased with it ;) Oops, yes, sorry, I meant the other way around in my comment, see, it's confusing :) And yes, I want to see your patch series merged, it will make my "clean up the driver core lifetime rules" work a lot easier, is it still planned for 6.3-rc1? > But either way that's orthogonal to this case, since the device that > io-pgtable deals with cannot have iommu_ops of its own (it *is* the IOMMU, > or at least a GPU MMU acting as one). Agreed, and thanks for the detailed explaination, it makes more sense now. But I still think the pointer reference needs to be incremented when saved off to be at least a bit more sane. thanks, greg k-h
February 24, 2023 10:25 AM, "Greg Kroah-Hartman" <gregkh@linuxfoundation.org> wrote: > On Fri, Feb 24, 2023 at 11:44:59PM +0900, Asahi Lina wrote: > >> On 2023/02/24 23:19, Greg Kroah-Hartman wrote:>> Can we see some users >> of this code posted so I can see how struct device >> is going to work in a rust driver? That's the thing I worry most about >> the rust/C interaction here as we have two different ways of thinking >> about reference counts from the two worlds and putting them together is >> going to be "interesting", as can be seen here already. >> >> Also, where are you getting your 'struct device' from in the first >> place? What bus is createing it and giving it to your rust driver? >> >> That would be platform for my GPU driver, matched via OF compatible. > > Ick, a platform device? The GPU isn't on the PCI bus? Wow, that's > horrid... This is bog standard for Arm SoCs... As far as I know, it's all platform devices in the Arm GPU world: Mali, Adreno, Tegra, VideoCore, and yes, Imaginapple. not really sure what good PCI would do for integrated GPUs.
On Fri, 24 Feb 2023 at 13:23, Asahi Lina <lina@asahilina.net> wrote: > > On 25/02/2023 00.14, Robin Murphy wrote: > > On 2023-02-24 14:48, Asahi Lina wrote: > >> > >> > >> On 2023/02/24 23:32, Robin Murphy wrote: > >>> FWIW the DMA API *has* to know which specific device it's operating > >>> with, since the relevant properties can and do vary even between > >>> different devices within a single bus_type (e.g. DMA masks). > >>> > >>> In the case of io-pgtable at least, there's no explicit refcounting > >>> since the struct device must be the one representing the physical > >>> platform/PCI/etc. device consuming the pagetable, so if that were to > >>> disappear from underneath its driver while the pagetable is still in > >>> use, things would already have gone very very wrong indeed :) > >> > >> There's no terribly good way to encode this relationship in safe Rust as > >> far as I know. So although it might be "obvious" (and I think my driver > >> can never violate it as it is currently designed), this means the Rust > >> abstraction will have to take the device reference if the C side does > >> not, because safe rust abstractions have to actually make these bugs > >> impossible and nothing stops a Rust driver from, say, stashing an > >> io_pgtable reference into a global and letting the device go away. > > > > If someone did that, then simply holding a struct device reference > > wouldn't guarantee much, since it only prevents the pointer itself from > > becoming invalid - it still doesn't say any of the data *in* the > > structure is still valid and "safe" for what a DMA API call might do > > with it. > > > > At the very least you'd probably have to somehow also guarantee that the > > device has a driver bound (which is the closest thing to a general > > indication of valid DMA ops across all architectures) and block it from > > unbinding for the lifetime of the reference, but that would then mean a > > simple driver which expects to tear down its io-pgtable from its .remove > > callback could never be unbound due to the circular dependency :/ > > I'd like to hear from the other Rust folks about ideas on how to solve > this ^^. I think it might fit into the Revocable model for device > resources, but it's going to be a bit of an issue for my driver because > for me, the io_pgtable objects are deep in the Vm object that has its > own refcounting, and those are created and destroyed dynamically at > runtime, not just during device probe... Our bus abstractions all require that driver data stored in devices implement the `DeviceRemoval` trait, which has a single function: device_remove. The idea is that this function will do something to relinquish access to resources that shouldn't be accessed after the device is unbound (device_remove is called, for whatever reason). This mechanism is generic and allows us to break circular dependencies. One way to implement the `DeviceRemoval` trait is to hold all resources in "revocable" wrappers (of which we have several that have different constraints/requirements), and then revoke access when `device_remove` is called (this results in blocking while concurrent users are running, then running "destructors" when they're all done and before returning). The part that isn't great about this approach is that we introduce error paths everywhere these resources are used (because they may have been revoked).
diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h index 75d85bd6c592..3632a39a28a6 100644 --- a/rust/bindings/bindings_helper.h +++ b/rust/bindings/bindings_helper.h @@ -6,6 +6,7 @@ * Sorted alphabetically. */ +#include <linux/device.h> #include <linux/slab.h> #include <linux/refcount.h> diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs new file mode 100644 index 000000000000..9be021e393ca --- /dev/null +++ b/rust/kernel/device.rs @@ -0,0 +1,23 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Generic devices that are part of the kernel's driver model. +//! +//! C header: [`include/linux/device.h`](../../../../include/linux/device.h) + +use crate::bindings; + +/// A raw device. +/// +/// # Safety +/// +/// Implementers must ensure that the `*mut device` returned by [`RawDevice::raw_device`] is +/// related to `self`, that is, actions on it will affect `self`. For example, if one calls +/// `get_device`, then the refcount on the device represented by `self` will be incremented. +/// +/// Additionally, implementers must ensure that the device is never renamed. Commit a5462516aa99 +/// ("driver-core: document restrictions on device_rename()") has details on why `device_rename` +/// should not be used. +pub unsafe trait RawDevice { + /// Returns the raw `struct device` related to `self`. + fn raw_device(&self) -> *mut bindings::device; +} diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs index 82dff6f4cf60..de44092718f8 100644 --- a/rust/kernel/lib.rs +++ b/rust/kernel/lib.rs @@ -29,6 +29,7 @@ compile_error!("Missing kernel configuration for conditional compilation"); #[cfg(not(testlib))] mod allocator; mod build_assert; +pub mod device; pub mod error; pub mod prelude; pub mod print;