[2/5] rust: device: Add a minimal RawDevice trait

Message ID 20230224-rust-iopt-rtkit-v1-2-49ced3391295@asahilina.net
State New
Headers
Series rust: Add io_pgtable and RTKit abstractions |

Commit Message

Asahi Lina Feb. 24, 2023, 10:53 a.m. UTC
  From: Wedson Almeida Filho <wedsonaf@gmail.com>

Add a RawDevice trait which can be implemented by any type representing
a device class (such as a PlatformDevice). This is the minimum amount of
Device support code required to unblock abstractions that need to take
device pointers.

Lina: Rewrote commit message, and dropped everything except RawDevice.

Co-developed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
Signed-off-by: Asahi Lina <lina@asahilina.net>
---
 rust/bindings/bindings_helper.h |  1 +
 rust/kernel/device.rs           | 23 +++++++++++++++++++++++
 rust/kernel/lib.rs              |  1 +
 3 files changed, 25 insertions(+)
  

Comments

Greg KH Feb. 24, 2023, 11:23 a.m. UTC | #1
On Fri, Feb 24, 2023 at 07:53:14PM +0900, Asahi Lina wrote:
> From: Wedson Almeida Filho <wedsonaf@gmail.com>
> 
> Add a RawDevice trait which can be implemented by any type representing
> a device class (such as a PlatformDevice). This is the minimum amount of
> Device support code required to unblock abstractions that need to take
> device pointers.
> 
> Lina: Rewrote commit message, and dropped everything except RawDevice.
> 
> Co-developed-by: Miguel Ojeda <ojeda@kernel.org>
> Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
> Signed-off-by: Asahi Lina <lina@asahilina.net>
> ---
>  rust/bindings/bindings_helper.h |  1 +
>  rust/kernel/device.rs           | 23 +++++++++++++++++++++++
>  rust/kernel/lib.rs              |  1 +
>  3 files changed, 25 insertions(+)
> 
> diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
> index 75d85bd6c592..3632a39a28a6 100644
> --- a/rust/bindings/bindings_helper.h
> +++ b/rust/bindings/bindings_helper.h
> @@ -6,6 +6,7 @@
>   * Sorted alphabetically.
>   */
>  
> +#include <linux/device.h>
>  #include <linux/slab.h>
>  #include <linux/refcount.h>
>  
> diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs
> new file mode 100644
> index 000000000000..9be021e393ca
> --- /dev/null
> +++ b/rust/kernel/device.rs
> @@ -0,0 +1,23 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Generic devices that are part of the kernel's driver model.
> +//!
> +//! C header: [`include/linux/device.h`](../../../../include/linux/device.h)
> +
> +use crate::bindings;
> +
> +/// A raw device.
> +///
> +/// # Safety
> +///
> +/// Implementers must ensure that the `*mut device` returned by [`RawDevice::raw_device`] is
> +/// related to `self`, that is, actions on it will affect `self`. For example, if one calls
> +/// `get_device`, then the refcount on the device represented by `self` will be incremented.

What is a "implementer" here?  Rust code?  C code?  Who is calling
get_device() here, rust code or the driver code or something else?  How
are you matching up the reference logic of this structure with the fact
that the driver core actually owns the reference, not any rust code?


> +///
> +/// Additionally, implementers must ensure that the device is never renamed. Commit a5462516aa99
> +/// ("driver-core: document restrictions on device_rename()") has details on why `device_rename`
> +/// should not be used.

As much as I would have liked that, that commit was from 2010 and
device_rename() is still being used by some pretty large subsystems
(networking and IB) and I don't see that going away any year soon.

So yes, your device will be renamed, so don't mess with the device name
like I mentioned in review of path 5/5 in this series.

> +pub unsafe trait RawDevice {
> +    /// Returns the raw `struct device` related to `self`.
> +    fn raw_device(&self) -> *mut bindings::device;

Again, what prevents this device from going away?  I don't see a call to
get_device() here :(

And again, why are bindings needed for a "raw" struct device at all?
Shouldn't the bus-specific wrappings work better?

thanks,

greg k-h
  
Asahi Lina Feb. 24, 2023, 1:15 p.m. UTC | #2
On 24/02/2023 20.23, Greg Kroah-Hartman wrote:
> On Fri, Feb 24, 2023 at 07:53:14PM +0900, Asahi Lina wrote:
>> From: Wedson Almeida Filho <wedsonaf@gmail.com>
>>
>> Add a RawDevice trait which can be implemented by any type representing
>> a device class (such as a PlatformDevice). This is the minimum amount of
>> Device support code required to unblock abstractions that need to take
>> device pointers.
>>
>> Lina: Rewrote commit message, and dropped everything except RawDevice.
>>
>> Co-developed-by: Miguel Ojeda <ojeda@kernel.org>
>> Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
>> Signed-off-by: Wedson Almeida Filho <wedsonaf@gmail.com>
>> Signed-off-by: Asahi Lina <lina@asahilina.net>
>> ---
>>  rust/bindings/bindings_helper.h |  1 +
>>  rust/kernel/device.rs           | 23 +++++++++++++++++++++++
>>  rust/kernel/lib.rs              |  1 +
>>  3 files changed, 25 insertions(+)
>>
>> diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs
>> new file mode 100644
>> index 000000000000..9be021e393ca
>> --- /dev/null
>> +++ b/rust/kernel/device.rs
>> @@ -0,0 +1,23 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +
>> +//! Generic devices that are part of the kernel's driver model.
>> +//!
>> +//! C header: [`include/linux/device.h`](../../../../include/linux/device.h)
>> +
>> +use crate::bindings;
>> +
>> +/// A raw device.
>> +///
>> +/// # Safety
>> +///
>> +/// Implementers must ensure that the `*mut device` returned by [`RawDevice::raw_device`] is
>> +/// related to `self`, that is, actions on it will affect `self`. For example, if one calls
>> +/// `get_device`, then the refcount on the device represented by `self` will be incremented.
> 
> What is a "implementer" here?  Rust code?  C code?  Who is calling
> get_device() here, rust code or the driver code or something else?  How
> are you matching up the reference logic of this structure with the fact
> that the driver core actually owns the reference, not any rust code?

This is a Rust trait, so that would be Rust code. In practice that means
other abstractions for different device buses, and the generic one which
I sent as patch 5. It's not a structure, so it doesn't have any
reference logic itself. That would go into the implementing struct (like
`Device`). What the comment is saying is that other parts of the Rust
kernel crate may make assumptions about the relationship between the
lifetime of the struct implementing RawDevice and the lifetime of the
underlying `struct device`. So for example, it's not legal to return a
borrowed reference to a `struct device` that might go away before `self`
does.

We should probably make this `Sealed` too, to make sure no driver can
even try to implement it (which probably wouldn't make sense)... this
should only ever be implemented by stuff in the kernel crate.

The reference is owned by whoever owns the reference, no? If you call
get_device(), you own a new reference to the device (which means you can
keep the pointer around until you call put_device()). Normally in Rust
you would have logic where whatever Rust structure that owns a reference
(which in practice is most things that implement RawDevice) would also
have a `Clone` trait impl that calls `get_device()` and duplicates
itself, and a `Drop` trait impl that calls `put_device()`. In Rust
terms, the Rust structure "owns" the device reference, and whatever code
creates that structure needs to either be allowed to steal the reference
from its caller or explicitly call `get_device()`.

You can also elide that in some cases though, like when you're just
passing a device reference down from a callback into Rust code. Then the
caller guarantees it has a device reference, which will outlive the
execution of the callback. So the wrapper that calls user Rust code can
materialize a RawDevice implementation without calling `get_device()`,
and then pass a Rust reference to it (which means the downstream code
can't steal it), and then make sure it gets destroyed without a
`put_device()` before the callback returns. In Rust terms, that's a code
path that passes a borrowed reference to the device down into user code.

This kind of ties back to why Rust rather than C... C does not encode
concepts like "borrowing a reference" or "taking over a reference" in
the type system, so it's really easy to mess up the refcounting and end
up with dangling references or leaks. Rust does, so once you wrap a C
API with the matching Rust semantics, there's no way to mess up and have
those issues any more.

> 
>> +///
>> +/// Additionally, implementers must ensure that the device is never renamed. Commit a5462516aa99
>> +/// ("driver-core: document restrictions on device_rename()") has details on why `device_rename`
>> +/// should not be used.
> 
> As much as I would have liked that, that commit was from 2010 and
> device_rename() is still being used by some pretty large subsystems
> (networking and IB) and I don't see that going away any year soon.
> 
> So yes, your device will be renamed, so don't mess with the device name
> like I mentioned in review of path 5/5 in this series.

I think we can just drop that part (and the name stuff) then. Wedson, is
that okay?

> 
>> +pub unsafe trait RawDevice {
>> +    /// Returns the raw `struct device` related to `self`.
>> +    fn raw_device(&self) -> *mut bindings::device;
> 
> Again, what prevents this device from going away?  I don't see a call to
> get_device() here :(

That would be up to the caller, if it needs to keep the pointer around.
Think of `raw_device()` as just "please return the `struct device *`
pointer for yourself, thanks". If you need to store that pointer for
later, you need to call `get_device()`. But most of the time it will be
used transiently, and then you don't need to take any references since
the safety requirement above guarantees that `self` owns a reference.
The caller then just needs to make sure not to throw away `self` before
it finishes using the returned pointer.

Keep in mind that this is an internal abstraction, it's there to be
implemented and used by the kernel crate itself. User code can't
actually do anything with the returned pointer without using `unsafe`
(so it's safe to expose this) and can't implement the trait without
`unsafe` either. But it can take dynamic trait arguments of type
RawDevice and pass them along, which is perfectly fine. Rust guarantees
that the right `get_device()` and `put_device()` are called when
references get cloned or dropped, for owned objects, and for borrowed
references there is no need to touch the refcount.

> And again, why are bindings needed for a "raw" struct device at all?
> Shouldn't the bus-specific wrappings work better?

Because lots of kernel subsystems need to be able to accept "any" device
and don't care about the bus! That's what this is for. All the bus
wrappers would implement this so they can be used as an argument for all
those subsystems (plus a generic one when you just need to pass around
an actual owned generic reference and no longer need bus-specific
operations - you can materialize that out of a RawDevice impl, which is
when get_device() would be called). That's why I'm introducing this now,
because both io_pgtable and rtkit need to take `struct device` pointers
on the C side so we need some "generic struct device" view on the Rust side.

~~ Lina
  
Greg KH Feb. 24, 2023, 2:11 p.m. UTC | #3
Thanks for the detailed rust explainations, I'd like to just highlight
one thing:

On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote:
> On 24/02/2023 20.23, Greg Kroah-Hartman wrote:
> > And again, why are bindings needed for a "raw" struct device at all?
> > Shouldn't the bus-specific wrappings work better?
> 
> Because lots of kernel subsystems need to be able to accept "any" device
> and don't care about the bus! That's what this is for.

That's great, but:

> All the bus
> wrappers would implement this so they can be used as an argument for all
> those subsystems (plus a generic one when you just need to pass around
> an actual owned generic reference and no longer need bus-specific
> operations - you can materialize that out of a RawDevice impl, which is
> when get_device() would be called). That's why I'm introducing this now,
> because both io_pgtable and rtkit need to take `struct device` pointers
> on the C side so we need some "generic struct device" view on the Rust side.

In looking at both ftkit and io_pgtable, those seem to be good examples
of how "not to use a struct device", so trying to make safe bindings
from Rust to these frameworks is very ironic :)

rtkit takes a struct device pointer and then never increments it,
despite saving it off, which is unsafe.  It then only uses it to print
out messages if things go wrong (or right in some cases), which is odd.
So it can get away from using a device pointer entirely, except for the
devm_apple_rtkit_init() call, which I doubt you want to call from rust
code, right?

for io_pgtable, that's a bit messier, you want to pass in a device that
io_pgtable treats as a "device" but again, it is NEVER properly
reference counted, AND, it is only needed to try to figure out the bus
operations that dma memory should be allocated from for this device.  So
what would be better to save off there would be a pointer to the bus,
which is constant and soon will be read-only so there are no lifetime
rules needed at all (see the major struct bus_type changes going into
6.3-rc1 that will enable that to happen).

So the two subsystems you want to call from rust code don't properly
handle the reference count of the object you are going to pass into it,
and only need it for debugging and iommu stuff, which is really only the
bus that the device is on, not good examples to start out with :)

Yeah, this is yack-shaving, sorry, but it's how we clean up core
subsystems for apis and implementations that are not really correct and
were not noticed at the time.

Can we see some users of this code posted so I can see how struct device
is going to work in a rust driver?  That's the thing I worry most about
the rust/C interaction here as we have two different ways of thinking
about reference counts from the two worlds and putting them together is
going to be "interesting", as can be seen here already.

thanks,

greg k-h
  
Greg KH Feb. 24, 2023, 2:19 p.m. UTC | #4
On Fri, Feb 24, 2023 at 03:11:05PM +0100, Greg Kroah-Hartman wrote:
> Thanks for the detailed rust explainations, I'd like to just highlight
> one thing:
> 
> On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote:
> > On 24/02/2023 20.23, Greg Kroah-Hartman wrote:
> > > And again, why are bindings needed for a "raw" struct device at all?
> > > Shouldn't the bus-specific wrappings work better?
> > 
> > Because lots of kernel subsystems need to be able to accept "any" device
> > and don't care about the bus! That's what this is for.
> 
> That's great, but:
> 
> > All the bus
> > wrappers would implement this so they can be used as an argument for all
> > those subsystems (plus a generic one when you just need to pass around
> > an actual owned generic reference and no longer need bus-specific
> > operations - you can materialize that out of a RawDevice impl, which is
> > when get_device() would be called). That's why I'm introducing this now,
> > because both io_pgtable and rtkit need to take `struct device` pointers
> > on the C side so we need some "generic struct device" view on the Rust side.
> 
> In looking at both ftkit and io_pgtable, those seem to be good examples
> of how "not to use a struct device", so trying to make safe bindings
> from Rust to these frameworks is very ironic :)
> 
> rtkit takes a struct device pointer and then never increments it,
> despite saving it off, which is unsafe.  It then only uses it to print
> out messages if things go wrong (or right in some cases), which is odd.
> So it can get away from using a device pointer entirely, except for the
> devm_apple_rtkit_init() call, which I doubt you want to call from rust
> code, right?
> 
> for io_pgtable, that's a bit messier, you want to pass in a device that
> io_pgtable treats as a "device" but again, it is NEVER properly
> reference counted, AND, it is only needed to try to figure out the bus
> operations that dma memory should be allocated from for this device.  So
> what would be better to save off there would be a pointer to the bus,
> which is constant and soon will be read-only so there are no lifetime
> rules needed at all (see the major struct bus_type changes going into
> 6.3-rc1 that will enable that to happen).
> 
> So the two subsystems you want to call from rust code don't properly
> handle the reference count of the object you are going to pass into it,
> and only need it for debugging and iommu stuff, which is really only the
> bus that the device is on, not good examples to start out with :)
> 
> Yeah, this is yack-shaving, sorry, but it's how we clean up core
> subsystems for apis and implementations that are not really correct and
> were not noticed at the time.
> 
> Can we see some users of this code posted so I can see how struct device
> is going to work in a rust driver?  That's the thing I worry most about
> the rust/C interaction here as we have two different ways of thinking
> about reference counts from the two worlds and putting them together is
> going to be "interesting", as can be seen here already.

Also, where are you getting your 'struct device' from in the first
place?  What bus is createing it and giving it to your rust driver?

thanks,

greg k-h
  
Robin Murphy Feb. 24, 2023, 2:32 p.m. UTC | #5
On 2023-02-24 14:11, Greg Kroah-Hartman wrote:
> Thanks for the detailed rust explainations, I'd like to just highlight
> one thing:
> 
> On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote:
>> On 24/02/2023 20.23, Greg Kroah-Hartman wrote:
>>> And again, why are bindings needed for a "raw" struct device at all?
>>> Shouldn't the bus-specific wrappings work better?
>>
>> Because lots of kernel subsystems need to be able to accept "any" device
>> and don't care about the bus! That's what this is for.
> 
> That's great, but:
> 
>> All the bus
>> wrappers would implement this so they can be used as an argument for all
>> those subsystems (plus a generic one when you just need to pass around
>> an actual owned generic reference and no longer need bus-specific
>> operations - you can materialize that out of a RawDevice impl, which is
>> when get_device() would be called). That's why I'm introducing this now,
>> because both io_pgtable and rtkit need to take `struct device` pointers
>> on the C side so we need some "generic struct device" view on the Rust side.
> 
> In looking at both ftkit and io_pgtable, those seem to be good examples
> of how "not to use a struct device", so trying to make safe bindings
> from Rust to these frameworks is very ironic :)
> 
> rtkit takes a struct device pointer and then never increments it,
> despite saving it off, which is unsafe.  It then only uses it to print
> out messages if things go wrong (or right in some cases), which is odd.
> So it can get away from using a device pointer entirely, except for the
> devm_apple_rtkit_init() call, which I doubt you want to call from rust
> code, right?
> 
> for io_pgtable, that's a bit messier, you want to pass in a device that
> io_pgtable treats as a "device" but again, it is NEVER properly
> reference counted, AND, it is only needed to try to figure out the bus
> operations that dma memory should be allocated from for this device.  So
> what would be better to save off there would be a pointer to the bus,
> which is constant and soon will be read-only so there are no lifetime
> rules needed at all (see the major struct bus_type changes going into
> 6.3-rc1 that will enable that to happen).

FWIW the DMA API *has* to know which specific device it's operating 
with, since the relevant properties can and do vary even between 
different devices within a single bus_type (e.g. DMA masks).

In the case of io-pgtable at least, there's no explicit refcounting 
since the struct device must be the one representing the physical 
platform/PCI/etc. device consuming the pagetable, so if that were to 
disappear from underneath its driver while the pagetable is still in 
use, things would already have gone very very wrong indeed :)

Cheers,
Robin.

> So the two subsystems you want to call from rust code don't properly
> handle the reference count of the object you are going to pass into it,
> and only need it for debugging and iommu stuff, which is really only the
> bus that the device is on, not good examples to start out with :)
> 
> Yeah, this is yack-shaving, sorry, but it's how we clean up core
> subsystems for apis and implementations that are not really correct and
> were not noticed at the time.
> 
> Can we see some users of this code posted so I can see how struct device
> is going to work in a rust driver?  That's the thing I worry most about
> the rust/C interaction here as we have two different ways of thinking
> about reference counts from the two worlds and putting them together is
> going to be "interesting", as can be seen here already.
> 
> thanks,
> 
> greg k-h
  
Asahi Lina Feb. 24, 2023, 2:43 p.m. UTC | #6
On 2023/02/24 23:11, Greg Kroah-Hartman wrote:
> Thanks for the detailed rust explainations, I'd like to just highlight
> one thing:
> 
> On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote:
>> On 24/02/2023 20.23, Greg Kroah-Hartman wrote:
>>> And again, why are bindings needed for a "raw" struct device at all?
>>> Shouldn't the bus-specific wrappings work better?
>>
>> Because lots of kernel subsystems need to be able to accept "any" device
>> and don't care about the bus! That's what this is for.
> 
> That's great, but:
> 
>> All the bus
>> wrappers would implement this so they can be used as an argument for all
>> those subsystems (plus a generic one when you just need to pass around
>> an actual owned generic reference and no longer need bus-specific
>> operations - you can materialize that out of a RawDevice impl, which is
>> when get_device() would be called). That's why I'm introducing this now,
>> because both io_pgtable and rtkit need to take `struct device` pointers
>> on the C side so we need some "generic struct device" view on the Rust side.
> 
> In looking at both ftkit and io_pgtable, those seem to be good examples
> of how "not to use a struct device", so trying to make safe bindings
> from Rust to these frameworks is very ironic :)

And this is why I want to use Rust, and why writing the abstractions for
C code is so difficult... Rust encodes all these rules in the type
system, but C doesn't, and so many kernel APIs don't document any of
this or what the requirements are...

> rtkit takes a struct device pointer and then never increments it,
> despite saving it off, which is unsafe.  It then only uses it to print
> out messages if things go wrong (or right in some cases), which is odd.
> So it can get away from using a device pointer entirely, except for the
> devm_apple_rtkit_init() call, which I doubt you want to call from rust
> code, right?

That sounds like we need to fix the C side to grab a reference ^^

We do need to pass the device to the init function though
(apple_rtkit_init(), this is in the SoC tree which I mentioned as a
prequisite and already on the way to 6.3-rc1), since at the very least
it has to pick up the mailbox and all that to initialize.

Alternatively we could say that the C API contract is that the user of
rtkit has to own a reference, and then the Rust abstraction would have
to take that reference to make a safe abstraction, but that doesn't
sound like the better option.

What do you recommend for things that want to print device-associated
messages, if not holding a reference to the device? Or did I
misunderstand what you meant? Just pr_foo() isn't great because we have
a lot of instances of rtkit and then you wouldn't know which device the
messages are about...

> for io_pgtable, that's a bit messier, you want to pass in a device that
> io_pgtable treats as a "device" but again, it is NEVER properly
> reference counted, AND, it is only needed to try to figure out the bus
> operations that dma memory should be allocated from for this device.  So
> what would be better to save off there would be a pointer to the bus,
> which is constant and soon will be read-only so there are no lifetime
> rules needed at all (see the major struct bus_type changes going into
> 6.3-rc1 that will enable that to happen).
> 
> So the two subsystems you want to call from rust code don't properly
> handle the reference count of the object you are going to pass into it,
> and only need it for debugging and iommu stuff, which is really only the
> bus that the device is on, not good examples to start out with :)

Well, they're two examples that are dependencies for the driver I wrote,
and I don't think you want me picking easy examples with zero known
upcoming users... ^^;;

> Yeah, this is yack-shaving, sorry, but it's how we clean up core
> subsystems for apis and implementations that are not really correct and
> were not noticed at the time.

I'm fine with helping fix all this, and I don't expect all the
underlying C code to be perfect already either! I already fixed one
locking bug in DRM and spent a lot of time trying to figure out lifetime
rules there, but I didn't dig into rtkit/io_pgtable and didn't realize
they don't take references properly...

> Can we see some users of this code posted so I can see how struct device
> is going to work in a rust driver?  That's the thing I worry most about
> the rust/C interaction here as we have two different ways of thinking
> about reference counts from the two worlds and putting them together is
> going to be "interesting", as can be seen here already.

I linked a tree with everything in the cover letter ([4]), look in
drivers/gpu/drm/asahi for the actual driver. But there are a lot of
other dependencies that have to go in before that will compile
(everything else in that branch...)

I know it's hard to review without examples, but I also can't just post
the driver and everything else as one series now, there's still a lot to
be improved and fixed and I'm working with the Rust folks on figuring
out a roadmap for that... and waiting until "everything" is ready and
perfect would mean we don't get anything done in the meantime and fall
into a pit of endless rebasing and coordinating downstream trees, which
also isn't good...

~~ Lina
  
Asahi Lina Feb. 24, 2023, 2:44 p.m. UTC | #7
On 2023/02/24 23:19, Greg Kroah-Hartman wrote:>> Can we see some users
of this code posted so I can see how struct device
>> is going to work in a rust driver?  That's the thing I worry most about
>> the rust/C interaction here as we have two different ways of thinking
>> about reference counts from the two worlds and putting them together is
>> going to be "interesting", as can be seen here already.
> 
> Also, where are you getting your 'struct device' from in the first
> place?  What bus is createing it and giving it to your rust driver?

That would be platform for my GPU driver, matched via OF compatible.

~~ Lina
  
Asahi Lina Feb. 24, 2023, 2:48 p.m. UTC | #8
On 2023/02/24 23:32, Robin Murphy wrote:
> On 2023-02-24 14:11, Greg Kroah-Hartman wrote:
>> Thanks for the detailed rust explainations, I'd like to just highlight
>> one thing:
>>
>> On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote:
>>> On 24/02/2023 20.23, Greg Kroah-Hartman wrote:
>>>> And again, why are bindings needed for a "raw" struct device at all?
>>>> Shouldn't the bus-specific wrappings work better?
>>>
>>> Because lots of kernel subsystems need to be able to accept "any" device
>>> and don't care about the bus! That's what this is for.
>>
>> That's great, but:
>>
>>> All the bus
>>> wrappers would implement this so they can be used as an argument for all
>>> those subsystems (plus a generic one when you just need to pass around
>>> an actual owned generic reference and no longer need bus-specific
>>> operations - you can materialize that out of a RawDevice impl, which is
>>> when get_device() would be called). That's why I'm introducing this now,
>>> because both io_pgtable and rtkit need to take `struct device` pointers
>>> on the C side so we need some "generic struct device" view on the
>>> Rust side.
>>
>> In looking at both ftkit and io_pgtable, those seem to be good examples
>> of how "not to use a struct device", so trying to make safe bindings
>> from Rust to these frameworks is very ironic :)
>>
>> rtkit takes a struct device pointer and then never increments it,
>> despite saving it off, which is unsafe.  It then only uses it to print
>> out messages if things go wrong (or right in some cases), which is odd.
>> So it can get away from using a device pointer entirely, except for the
>> devm_apple_rtkit_init() call, which I doubt you want to call from rust
>> code, right?
>>
>> for io_pgtable, that's a bit messier, you want to pass in a device that
>> io_pgtable treats as a "device" but again, it is NEVER properly
>> reference counted, AND, it is only needed to try to figure out the bus
>> operations that dma memory should be allocated from for this device.  So
>> what would be better to save off there would be a pointer to the bus,
>> which is constant and soon will be read-only so there are no lifetime
>> rules needed at all (see the major struct bus_type changes going into
>> 6.3-rc1 that will enable that to happen).
> 
> FWIW the DMA API *has* to know which specific device it's operating
> with, since the relevant properties can and do vary even between
> different devices within a single bus_type (e.g. DMA masks).
> 
> In the case of io-pgtable at least, there's no explicit refcounting
> since the struct device must be the one representing the physical
> platform/PCI/etc. device consuming the pagetable, so if that were to
> disappear from underneath its driver while the pagetable is still in
> use, things would already have gone very very wrong indeed :)

There's no terribly good way to encode this relationship in safe Rust as
far as I know. So although it might be "obvious" (and I think my driver
can never violate it as it is currently designed), this means the Rust
abstraction will have to take the device reference if the C side does
not, because safe rust abstractions have to actually make these bugs
impossible and nothing stops a Rust driver from, say, stashing an
io_pgtable reference into a global and letting the device go away.

~~ Lina
  
Robin Murphy Feb. 24, 2023, 3:14 p.m. UTC | #9
On 2023-02-24 14:48, Asahi Lina wrote:
> 
> 
> On 2023/02/24 23:32, Robin Murphy wrote:
>> On 2023-02-24 14:11, Greg Kroah-Hartman wrote:
>>> Thanks for the detailed rust explainations, I'd like to just highlight
>>> one thing:
>>>
>>> On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote:
>>>> On 24/02/2023 20.23, Greg Kroah-Hartman wrote:
>>>>> And again, why are bindings needed for a "raw" struct device at all?
>>>>> Shouldn't the bus-specific wrappings work better?
>>>>
>>>> Because lots of kernel subsystems need to be able to accept "any" device
>>>> and don't care about the bus! That's what this is for.
>>>
>>> That's great, but:
>>>
>>>> All the bus
>>>> wrappers would implement this so they can be used as an argument for all
>>>> those subsystems (plus a generic one when you just need to pass around
>>>> an actual owned generic reference and no longer need bus-specific
>>>> operations - you can materialize that out of a RawDevice impl, which is
>>>> when get_device() would be called). That's why I'm introducing this now,
>>>> because both io_pgtable and rtkit need to take `struct device` pointers
>>>> on the C side so we need some "generic struct device" view on the
>>>> Rust side.
>>>
>>> In looking at both ftkit and io_pgtable, those seem to be good examples
>>> of how "not to use a struct device", so trying to make safe bindings
>>> from Rust to these frameworks is very ironic :)
>>>
>>> rtkit takes a struct device pointer and then never increments it,
>>> despite saving it off, which is unsafe.  It then only uses it to print
>>> out messages if things go wrong (or right in some cases), which is odd.
>>> So it can get away from using a device pointer entirely, except for the
>>> devm_apple_rtkit_init() call, which I doubt you want to call from rust
>>> code, right?
>>>
>>> for io_pgtable, that's a bit messier, you want to pass in a device that
>>> io_pgtable treats as a "device" but again, it is NEVER properly
>>> reference counted, AND, it is only needed to try to figure out the bus
>>> operations that dma memory should be allocated from for this device.  So
>>> what would be better to save off there would be a pointer to the bus,
>>> which is constant and soon will be read-only so there are no lifetime
>>> rules needed at all (see the major struct bus_type changes going into
>>> 6.3-rc1 that will enable that to happen).
>>
>> FWIW the DMA API *has* to know which specific device it's operating
>> with, since the relevant properties can and do vary even between
>> different devices within a single bus_type (e.g. DMA masks).
>>
>> In the case of io-pgtable at least, there's no explicit refcounting
>> since the struct device must be the one representing the physical
>> platform/PCI/etc. device consuming the pagetable, so if that were to
>> disappear from underneath its driver while the pagetable is still in
>> use, things would already have gone very very wrong indeed :)
> 
> There's no terribly good way to encode this relationship in safe Rust as
> far as I know. So although it might be "obvious" (and I think my driver
> can never violate it as it is currently designed), this means the Rust
> abstraction will have to take the device reference if the C side does
> not, because safe rust abstractions have to actually make these bugs
> impossible and nothing stops a Rust driver from, say, stashing an
> io_pgtable reference into a global and letting the device go away.

If someone did that, then simply holding a struct device reference 
wouldn't guarantee much, since it only prevents the pointer itself from 
becoming invalid - it still doesn't say any of the data *in* the 
structure is still valid and "safe" for what a DMA API call might do 
with it.

At the very least you'd probably have to somehow also guarantee that the 
device has a driver bound (which is the closest thing to a general 
indication of valid DMA ops across all architectures) and block it from 
unbinding for the lifetime of the reference, but that would then mean a 
simple driver which expects to tear down its io-pgtable from its .remove 
callback could never be unbound due to the circular dependency :/

Cheers,
Robin.
  
Greg KH Feb. 24, 2023, 3:24 p.m. UTC | #10
On Fri, Feb 24, 2023 at 11:43:54PM +0900, Asahi Lina wrote:
> 
> 
> On 2023/02/24 23:11, Greg Kroah-Hartman wrote:
> > Thanks for the detailed rust explainations, I'd like to just highlight
> > one thing:
> > 
> > On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote:
> >> On 24/02/2023 20.23, Greg Kroah-Hartman wrote:
> >>> And again, why are bindings needed for a "raw" struct device at all?
> >>> Shouldn't the bus-specific wrappings work better?
> >>
> >> Because lots of kernel subsystems need to be able to accept "any" device
> >> and don't care about the bus! That's what this is for.
> > 
> > That's great, but:
> > 
> >> All the bus
> >> wrappers would implement this so they can be used as an argument for all
> >> those subsystems (plus a generic one when you just need to pass around
> >> an actual owned generic reference and no longer need bus-specific
> >> operations - you can materialize that out of a RawDevice impl, which is
> >> when get_device() would be called). That's why I'm introducing this now,
> >> because both io_pgtable and rtkit need to take `struct device` pointers
> >> on the C side so we need some "generic struct device" view on the Rust side.
> > 
> > In looking at both ftkit and io_pgtable, those seem to be good examples
> > of how "not to use a struct device", so trying to make safe bindings
> > from Rust to these frameworks is very ironic :)
> 
> And this is why I want to use Rust, and why writing the abstractions for
> C code is so difficult... Rust encodes all these rules in the type
> system, but C doesn't, and so many kernel APIs don't document any of
> this or what the requirements are...

I totally agree, and is why I'm wanting to review this stuff.

> > rtkit takes a struct device pointer and then never increments it,
> > despite saving it off, which is unsafe.  It then only uses it to print
> > out messages if things go wrong (or right in some cases), which is odd.
> > So it can get away from using a device pointer entirely, except for the
> > devm_apple_rtkit_init() call, which I doubt you want to call from rust
> > code, right?
> 
> That sounds like we need to fix the C side to grab a reference ^^

I agree.  Or remove the reference entirely.

> We do need to pass the device to the init function though
> (apple_rtkit_init(), this is in the SoC tree which I mentioned as a
> prequisite and already on the way to 6.3-rc1), since at the very least
> it has to pick up the mailbox and all that to initialize.

As I said, the rtkit code does not actually use the device pointer at
all except for some (I would say pointless) log messages.  So we could
change the rtkit code either way, I would go for the removal.

> Alternatively we could say that the C API contract is that the user of
> rtkit has to own a reference, and then the Rust abstraction would have
> to take that reference to make a safe abstraction, but that doesn't
> sound like the better option.

Yeah, that feels odd.

> What do you recommend for things that want to print device-associated
> messages, if not holding a reference to the device?

If you aren't holding a reference to the device, that means you aren't
associated to it at all, so you better not be printing out anything
related to any device as that pointer could be invalid at any point in
time.

> Or did I
> misunderstand what you meant? Just pr_foo() isn't great because we have
> a lot of instances of rtkit and then you wouldn't know which device the
> messages are about...

Then the rtkit code needs to be changed to properly grab the reference
and actually use it for something other than just a log message.  If it
only wants it for a log message, then let's just drop it and have the
rtkit code go quiet, as when kernel code is working properly, it should
be quiet.  If something goes wrong, the code that called into rtkit can
print out a message based on the error return values.

I have no idea what "rtkit" is, if it's an interface to hardware, why
doesn't it have its own struct device that it creates and manages and
uses instead?  In my quick glance, that feels like the real solution
here instead of just "I hope this pointer is going to be valid" like it
lives with today.  Odds are you can't remove a rtkit device at runtime,
so no one has noticed this yet...

> > for io_pgtable, that's a bit messier, you want to pass in a device that
> > io_pgtable treats as a "device" but again, it is NEVER properly
> > reference counted, AND, it is only needed to try to figure out the bus
> > operations that dma memory should be allocated from for this device.  So
> > what would be better to save off there would be a pointer to the bus,
> > which is constant and soon will be read-only so there are no lifetime
> > rules needed at all (see the major struct bus_type changes going into
> > 6.3-rc1 that will enable that to happen).
> > 
> > So the two subsystems you want to call from rust code don't properly
> > handle the reference count of the object you are going to pass into it,
> > and only need it for debugging and iommu stuff, which is really only the
> > bus that the device is on, not good examples to start out with :)
> 
> Well, they're two examples that are dependencies for the driver I wrote,
> and I don't think you want me picking easy examples with zero known
> upcoming users... ^^;;

Hey, this is good, I like it, no complaints!

> > Yeah, this is yack-shaving, sorry, but it's how we clean up core
> > subsystems for apis and implementations that are not really correct and
> > were not noticed at the time.
> 
> I'm fine with helping fix all this, and I don't expect all the
> underlying C code to be perfect already either! I already fixed one
> locking bug in DRM and spent a lot of time trying to figure out lifetime
> rules there, but I didn't dig into rtkit/io_pgtable and didn't realize
> they don't take references properly...

The iomem code is under heavy-flux right now so let's see what 6.3-rc1
looks like to determine if this still needs to be resolved or not.  I
hope the latest set of fixes/reworks for that subsystem land.  Either
way, I'll look into what needs to be done there as obviously, the
current code is not correct.

> > Can we see some users of this code posted so I can see how struct device
> > is going to work in a rust driver?  That's the thing I worry most about
> > the rust/C interaction here as we have two different ways of thinking
> > about reference counts from the two worlds and putting them together is
> > going to be "interesting", as can be seen here already.
> 
> I linked a tree with everything in the cover letter ([4]), look in
> drivers/gpu/drm/asahi for the actual driver. But there are a lot of
> other dependencies that have to go in before that will compile
> (everything else in that branch...)

Sorry, it's hard to notice random github repos and branches, especially
when traveling, that's why we use email for review.

> I know it's hard to review without examples, but I also can't just post
> the driver and everything else as one series now, there's still a lot to
> be improved and fixed and I'm working with the Rust folks on figuring
> out a roadmap for that... and waiting until "everything" is ready and
> perfect would mean we don't get anything done in the meantime and fall
> into a pit of endless rebasing and coordinating downstream trees, which
> also isn't good...

Yeah, it's a chicken and egg issue right now, no worries, I understand.
This is going to take some cycles to get right.

thanks,

greg k-h
  
Greg KH Feb. 24, 2023, 3:25 p.m. UTC | #11
On Fri, Feb 24, 2023 at 11:44:59PM +0900, Asahi Lina wrote:
> On 2023/02/24 23:19, Greg Kroah-Hartman wrote:>> Can we see some users
> of this code posted so I can see how struct device
> >> is going to work in a rust driver?  That's the thing I worry most about
> >> the rust/C interaction here as we have two different ways of thinking
> >> about reference counts from the two worlds and putting them together is
> >> going to be "interesting", as can be seen here already.
> > 
> > Also, where are you getting your 'struct device' from in the first
> > place?  What bus is createing it and giving it to your rust driver?
> 
> That would be platform for my GPU driver, matched via OF compatible.

Ick, a platform device?  The GPU isn't on the PCI bus?  Wow, that's
horrid...
  
Greg KH Feb. 24, 2023, 3:32 p.m. UTC | #12
On Fri, Feb 24, 2023 at 02:32:47PM +0000, Robin Murphy wrote:
> On 2023-02-24 14:11, Greg Kroah-Hartman wrote:
> > Thanks for the detailed rust explainations, I'd like to just highlight
> > one thing:
> > 
> > On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote:
> > > On 24/02/2023 20.23, Greg Kroah-Hartman wrote:
> > > > And again, why are bindings needed for a "raw" struct device at all?
> > > > Shouldn't the bus-specific wrappings work better?
> > > 
> > > Because lots of kernel subsystems need to be able to accept "any" device
> > > and don't care about the bus! That's what this is for.
> > 
> > That's great, but:
> > 
> > > All the bus
> > > wrappers would implement this so they can be used as an argument for all
> > > those subsystems (plus a generic one when you just need to pass around
> > > an actual owned generic reference and no longer need bus-specific
> > > operations - you can materialize that out of a RawDevice impl, which is
> > > when get_device() would be called). That's why I'm introducing this now,
> > > because both io_pgtable and rtkit need to take `struct device` pointers
> > > on the C side so we need some "generic struct device" view on the Rust side.
> > 
> > In looking at both ftkit and io_pgtable, those seem to be good examples
> > of how "not to use a struct device", so trying to make safe bindings
> > from Rust to these frameworks is very ironic :)
> > 
> > rtkit takes a struct device pointer and then never increments it,
> > despite saving it off, which is unsafe.  It then only uses it to print
> > out messages if things go wrong (or right in some cases), which is odd.
> > So it can get away from using a device pointer entirely, except for the
> > devm_apple_rtkit_init() call, which I doubt you want to call from rust
> > code, right?
> > 
> > for io_pgtable, that's a bit messier, you want to pass in a device that
> > io_pgtable treats as a "device" but again, it is NEVER properly
> > reference counted, AND, it is only needed to try to figure out the bus
> > operations that dma memory should be allocated from for this device.  So
> > what would be better to save off there would be a pointer to the bus,
> > which is constant and soon will be read-only so there are no lifetime
> > rules needed at all (see the major struct bus_type changes going into
> > 6.3-rc1 that will enable that to happen).
> 
> FWIW the DMA API *has* to know which specific device it's operating with,
> since the relevant properties can and do vary even between different devices
> within a single bus_type (e.g. DMA masks).

What bus_type has different DMA masks depending on the device on that
bus today?

And the iommu ops are on the bus, not the device, but there is a
iommu_group on the device, is that what you are referring to?

Am I getting iommu and dma stuff confused here?  A bus also has dma
callbacks, but yet the device itself has dma_ops?

> In the case of io-pgtable at least, there's no explicit refcounting since
> the struct device must be the one representing the physical
> platform/PCI/etc. device consuming the pagetable, so if that were to
> disappear from underneath its driver while the pagetable is still in use,
> things would already have gone very very wrong indeed :)

But that could happen at any point in time, the device can be removed
from the system with no warning, how do you guarantee that io-pgtable is
being initialized with a device that can NOT be removed?

Think of drawers containing CPUs and PCI devices and memory, Linux
has supported hot-removal of those for decades.  (well, not for memory,
we could only hot-add that...)

Again, passing in a pointer to a struct device, and saving it off
WITHOUT incrementing the reference count is not ok, that's not how
reference counts work...

But again, let's see about disconnecting the iommu ops entirely from the
device and just relying on the bus, that should work better, rigth?

thanks,

greg k-h
  
Asahi Lina Feb. 24, 2023, 3:42 p.m. UTC | #13
On 25/02/2023 00.24, Greg Kroah-Hartman wrote:
>> What do you recommend for things that want to print device-associated
>> messages, if not holding a reference to the device?
> 
> If you aren't holding a reference to the device, that means you aren't
> associated to it at all, so you better not be printing out anything
> related to any device as that pointer could be invalid at any point in
> time.

The RTKit code talks to the firmware coprocessor that is part of the
device, so it definitely is associated to it... among other things it
prints out firmware logs from the device and crash logs, manages memory
buffers (which have default implementations but can be overridden by the
user driver), and more. It's essentially library code shared by all
device drivers that interact with devices with these coprocessors.

>> Or did I
>> misunderstand what you meant? Just pr_foo() isn't great because we have
>> a lot of instances of rtkit and then you wouldn't know which device the
>> messages are about...
> 
> Then the rtkit code needs to be changed to properly grab the reference
> and actually use it for something other than just a log message.  If it
> only wants it for a log message, then let's just drop it and have the
> rtkit code go quiet, as when kernel code is working properly, it should
> be quiet.  If something goes wrong, the code that called into rtkit can
> print out a message based on the error return values.

Keep in mind rtkit does things like print out crash logs, and you
wouldn't want the caller to be responsible for that (and we definitely
want those in dmesg since these coprocessors are non-recoverable, it's
almost as bad as a kernel panic: you will have to reboot to be able to
use the machine properly again). I find those crash logs very useful to
figure out what went wrong with the GPU (especially if combined with a
memory dump which we don't expose to regular users right now, but which
I have ideas for... but even without that, just assert messages from the
coprocessor or fault instruction pointers that I can correlate with the
firmware are very useful on their own).

Right now rtkit also prints out syslogs from the coprocessors. That's
noisy for some but I think very useful, since we're dealing with reverse
engineered drivers. We'll probably want to silence those for some noisy
coprocessors at some point, but I don't think we want to do that until
things are all upstream, stable, and with a larger user base... until
then I think we'd much rather be spammy and have a better chance of
debugging rare issues, which often happen with these coprocessors
running big firmware blobs... there are a lot of subtleties in getting
the interfaces right, never mind cache coherence issues!

> I have no idea what "rtkit" is, if it's an interface to hardware, why
> doesn't it have its own struct device that it creates and manages and
> uses instead?  In my quick glance, that feels like the real solution
> here instead of just "I hope this pointer is going to be valid" like it
> lives with today.  Odds are you can't remove a rtkit device at runtime,
> so no one has noticed this yet...

Well, they're all embedded into the SoC, yes.

RTKit is Apple's firmware RTOS, and also the name for the
semi-standardized mailbox/shared-memory interface shared by different
firmwares using it. The Linux code to drive it doesn't create its own
"struct device" because the rtkit code is just library code that is
extended by the downstream drivers (like mine). How each driver
interacts with rtkit varies widely... NVMe almost doesn't at all other
than for power management, there is actually a downstream "rtkit-helper"
driver that is a proper standalone device wrapper for one case (MTP)
where it really doesn't need to interact at all... in my case with the
GPU, almost everything is shared memory and doorbells over the mailbox.
Other drivers like DCP actually send pointers over multiple mailbox
endpoints, or do most of their data exchange directly over messages like
that (SMC).

So in a way, if we consider it driver library code, it's not
unreasonable for RTKit to require that the device you pass it outlives
it. Certainly, if the device is getting unbound from your driver, you'd
need to tear down RTKit as part of that in any reasonable situation.

>> I know it's hard to review without examples, but I also can't just post
>> the driver and everything else as one series now, there's still a lot to
>> be improved and fixed and I'm working with the Rust folks on figuring
>> out a roadmap for that... and waiting until "everything" is ready and
>> perfect would mean we don't get anything done in the meantime and fall
>> into a pit of endless rebasing and coordinating downstream trees, which
>> also isn't good...
> 
> Yeah, it's a chicken and egg issue right now, no worries, I understand.
> This is going to take some cycles to get right.
> 

Thank you ^^

~~ Lina
  
Asahi Lina Feb. 24, 2023, 3:45 p.m. UTC | #14
On 25/02/2023 00.25, Greg Kroah-Hartman wrote:
> On Fri, Feb 24, 2023 at 11:44:59PM +0900, Asahi Lina wrote:
>> On 2023/02/24 23:19, Greg Kroah-Hartman wrote:>> Can we see some users
>> of this code posted so I can see how struct device
>>>> is going to work in a rust driver?  That's the thing I worry most about
>>>> the rust/C interaction here as we have two different ways of thinking
>>>> about reference counts from the two worlds and putting them together is
>>>> going to be "interesting", as can be seen here already.
>>>
>>> Also, where are you getting your 'struct device' from in the first
>>> place?  What bus is createing it and giving it to your rust driver?
>>
>> That would be platform for my GPU driver, matched via OF compatible.
> 
> Ick, a platform device?  The GPU isn't on the PCI bus?  Wow, that's
> horrid...

The internal NVMe controller also isn't on the PCI bus (which is worse
since that actually has standard PCI bindings)... nothing is on PCI
buses other than external PCI/Thunderbolt devices. There are no internal
fake-PCI devices on Apple Silicon like there are on Intel. That's how it
is on most SoCs (though Apple is the only vendor who has dared to go as
far as integrating NVMe as far as I know)...

~~ Lina
  
Asahi Lina Feb. 24, 2023, 4:23 p.m. UTC | #15
On 25/02/2023 00.14, Robin Murphy wrote:
> On 2023-02-24 14:48, Asahi Lina wrote:
>>
>>
>> On 2023/02/24 23:32, Robin Murphy wrote:
>>> FWIW the DMA API *has* to know which specific device it's operating
>>> with, since the relevant properties can and do vary even between
>>> different devices within a single bus_type (e.g. DMA masks).
>>>
>>> In the case of io-pgtable at least, there's no explicit refcounting
>>> since the struct device must be the one representing the physical
>>> platform/PCI/etc. device consuming the pagetable, so if that were to
>>> disappear from underneath its driver while the pagetable is still in
>>> use, things would already have gone very very wrong indeed :)
>>
>> There's no terribly good way to encode this relationship in safe Rust as
>> far as I know. So although it might be "obvious" (and I think my driver
>> can never violate it as it is currently designed), this means the Rust
>> abstraction will have to take the device reference if the C side does
>> not, because safe rust abstractions have to actually make these bugs
>> impossible and nothing stops a Rust driver from, say, stashing an
>> io_pgtable reference into a global and letting the device go away.
> 
> If someone did that, then simply holding a struct device reference 
> wouldn't guarantee much, since it only prevents the pointer itself from 
> becoming invalid - it still doesn't say any of the data *in* the 
> structure is still valid and "safe" for what a DMA API call might do 
> with it.
> 
> At the very least you'd probably have to somehow also guarantee that the 
> device has a driver bound (which is the closest thing to a general 
> indication of valid DMA ops across all architectures) and block it from 
> unbinding for the lifetime of the reference, but that would then mean a 
> simple driver which expects to tear down its io-pgtable from its .remove 
> callback could never be unbound due to the circular dependency :/

I'd like to hear from the other Rust folks about ideas on how to solve
this ^^. I think it might fit into the Revocable model for device
resources, but it's going to be a bit of an issue for my driver because
for me, the io_pgtable objects are deep in the Vm object that has its
own refcounting, and those are created and destroyed dynamically at
runtime, not just during device probe...

But it's also possible that we have to resign ourselves to a somewhat
leaky/unsound abstraction under the understanding that the C API has
requirements that are difficult to encode in the Rust type system...

There's actually a much longer story to tell here though... I've done a
lot of thinking about what "safe" means in the context of the kernel,
what the useful safety boundaries are, and things like that. Especially
within complex drivers, the answer is a lot greyer than what you tend to
get with Rust userspace code. But at the end of the day, Rust lets you
make things much safer and reduce the chances of bugs creeping in
dramatically, even if you have some slightly unsound abstractions and
other issues. Maybe I should write a long form blog post about this,
would people find that useful?

For whatever this anecdote is worth, my GPU driver is about 16000 lines
of Rust and has ~100 unsafe blocks, and has been in a few downstream
distro repos since December with actual users and nobody has ever
managed to oops it as far as I know, including other Mesa developers who
have been feeding it all kinds of broken command buffers and things like
that. It has survived OOMs leading to untested error paths, UAPI
fuzzing, and endless GPU faults/timeouts... and I definitely am not a
genius who writes bug-free code! The firmware itself is also very easy
to crash (raw unsafe pointers everywhere in shared memory), and crashes
are unrecoverable (you need to reboot), but I try to use Rust's type
system and ownership model to extend safety to that interface as much as
possible and I think people other than me have only managed to get the
GPU firmware to crash twice (both cache coherency related, I *think*
I've finally eliminated that issue at the root now). The uptime on my M2
streaming machine running a decent GPU workload for 20 hours or so
weekly is 44 days and counting... I think that old kernel even has some
minor memory leaks that I've since fixed (you *can* get those in Rust
with circular references), but it's so minor it's going to be at least a
few more months before it becomes a problem.

So I guess what I'm saying is that at the end of the day, if we can't
get an interface to be 100% safe and sound and usable, that's probably
okay. We're still getting a lot of safe mileage out of the other 99%! ^^

~~ Lina
  
Robin Murphy Feb. 24, 2023, 6:52 p.m. UTC | #16
On 2023-02-24 15:32, Greg Kroah-Hartman wrote:
> On Fri, Feb 24, 2023 at 02:32:47PM +0000, Robin Murphy wrote:
>> On 2023-02-24 14:11, Greg Kroah-Hartman wrote:
>>> Thanks for the detailed rust explainations, I'd like to just highlight
>>> one thing:
>>>
>>> On Fri, Feb 24, 2023 at 10:15:12PM +0900, Asahi Lina wrote:
>>>> On 24/02/2023 20.23, Greg Kroah-Hartman wrote:
>>>>> And again, why are bindings needed for a "raw" struct device at all?
>>>>> Shouldn't the bus-specific wrappings work better?
>>>>
>>>> Because lots of kernel subsystems need to be able to accept "any" device
>>>> and don't care about the bus! That's what this is for.
>>>
>>> That's great, but:
>>>
>>>> All the bus
>>>> wrappers would implement this so they can be used as an argument for all
>>>> those subsystems (plus a generic one when you just need to pass around
>>>> an actual owned generic reference and no longer need bus-specific
>>>> operations - you can materialize that out of a RawDevice impl, which is
>>>> when get_device() would be called). That's why I'm introducing this now,
>>>> because both io_pgtable and rtkit need to take `struct device` pointers
>>>> on the C side so we need some "generic struct device" view on the Rust side.
>>>
>>> In looking at both ftkit and io_pgtable, those seem to be good examples
>>> of how "not to use a struct device", so trying to make safe bindings
>>> from Rust to these frameworks is very ironic :)
>>>
>>> rtkit takes a struct device pointer and then never increments it,
>>> despite saving it off, which is unsafe.  It then only uses it to print
>>> out messages if things go wrong (or right in some cases), which is odd.
>>> So it can get away from using a device pointer entirely, except for the
>>> devm_apple_rtkit_init() call, which I doubt you want to call from rust
>>> code, right?
>>>
>>> for io_pgtable, that's a bit messier, you want to pass in a device that
>>> io_pgtable treats as a "device" but again, it is NEVER properly
>>> reference counted, AND, it is only needed to try to figure out the bus
>>> operations that dma memory should be allocated from for this device.  So
>>> what would be better to save off there would be a pointer to the bus,
>>> which is constant and soon will be read-only so there are no lifetime
>>> rules needed at all (see the major struct bus_type changes going into
>>> 6.3-rc1 that will enable that to happen).
>>
>> FWIW the DMA API *has* to know which specific device it's operating with,
>> since the relevant properties can and do vary even between different devices
>> within a single bus_type (e.g. DMA masks).
> 
> What bus_type has different DMA masks depending on the device on that
> bus today?

Certainly PCI and platform, which are about 99% of what matters in terms 
of DMA. You may say "ewww" again, but even on something as "normal" as a 
PCIe GPU chip, the audio codec function may well have a different DMA 
mask from the GPU function, let alone distinct devices across 
cards/slots/segments/etc. differing - e.g. some PCIe USB controllers are 
still limited to 32 bits, where your more-capable GPU/NVMe/NIC etc. 
would definitely not be cool with a lowest-common-denominator constraint.

> And the iommu ops are on the bus, not the device, but there is a
> iommu_group on the device, is that what you are referring to?
> 
> Am I getting iommu and dma stuff confused here?  A bus also has dma
> callbacks, but yet the device itself has dma_ops?

Confusion is excusable here - the bus "dma_configure" callbacks are 
mostly actually IOMMU configuration, which is a bit of historical legacy 
that still needs more cleaning up (because it makes things happen in the 
wrong order from what the IOMMU API really wants). I'm hoping to get to 
that soon once we land my current series finishing the bus->iommu_ops 
removal.

(FWIW what I want to do is flip things around so the buses just provide 
a method for the IOMMU API to to retrieve whatever bus-defined IDs it 
needs to associate with a given device, rather than having the bus code 
call into the IOMMU API itself as currently.)

>> In the case of io-pgtable at least, there's no explicit refcounting since
>> the struct device must be the one representing the physical
>> platform/PCI/etc. device consuming the pagetable, so if that were to
>> disappear from underneath its driver while the pagetable is still in use,
>> things would already have gone very very wrong indeed :)
> 
> But that could happen at any point in time, the device can be removed
> from the system with no warning, how do you guarantee that io-pgtable is
> being initialized with a device that can NOT be removed?
> 
> Think of drawers containing CPUs and PCI devices and memory, Linux
> has supported hot-removal of those for decades.  (well, not for memory,
> we could only hot-add that...)
> 
> Again, passing in a pointer to a struct device, and saving it off
> WITHOUT incrementing the reference count is not ok, that's not how
> reference counts work...

io-pgtable is just a utility library which that device's driver is 
calling into. It never does anything with that pointer by itself, and 
it's implicitly expected that the ops are only going to be called by 
whoever allocated them. If the driver needs to cope with a surprise 
removal event while bound, that's on the driver, but I'd assume the bus 
code is still expected to at least give it a chance to clean up 
gracefully before the struct device completely disappears. I don't think 
I've ever seen a driver explicitly take a reference on its own device 
for "normal" operation, so either everything's wrong, or that guarantee 
must already come from deeper in the driver core.

Yes, *technically* someone could invoke io-pgtable ops from some context 
other than the driver which allocated them, but it would be wrong and 
the answer would be "don't do that". We've got a huge quantity of APIs 
all over Linux where the expected usage model is intuitive, but 
impractical to formally assert, so it's just assumed that you do the 
right thing and expect fireworks if you don't. I do fully appreciate 
that "don't do that" is not an easy thing for Rust to nail down.

There's nothing inherently special about saving a pointer in a structure 
either - if someone's failing to respect the expected conventions, then 
the lifetime of a pointer *anywhere* matters, and even something as 
innocuous as dev_name() should strictly take a reference on its argument 
in case it's called concurrently with the device being freed. In fact, 
consider if get_device() is pre-empted right on entry, and by the time 
it comes back to dereference the kobject, someone else has racily 
dropped the last reference and already freed it. Unless we can draw a 
line of reasonable expectations somewhere, we've no pretence of being 
able to guarantee *anything*.

> But again, let's see about disconnecting the iommu ops entirely from the
> device and just relying on the bus, that should work better, rigth?

Um, other way - iommu_ops have almost finished moving *off* the bus to 
be per-device. You acked the patch the other week and seemed pleased 
with it ;)

But either way that's orthogonal to this case, since the device that 
io-pgtable deals with cannot have iommu_ops of its own (it *is* the 
IOMMU, or at least a GPU MMU acting as one).

Cheers,
Robin.
  
Miguel Ojeda Feb. 24, 2023, 7:22 p.m. UTC | #17
On Fri, Feb 24, 2023 at 5:23 PM Asahi Lina <lina@asahilina.net> wrote:
>
> So I guess what I'm saying is that at the end of the day, if we can't
> get an interface to be 100% safe and sound and usable, that's probably
> okay. We're still getting a lot of safe mileage out of the other 99%! ^^

We talked a few times about what approach to take for things where a
fully safe API is not feasible. There have been differing opinions in
the past.

One approach would be requiring a "global `unsafe`" so to speak, once,
somewhere in the API -- it does not improve things much, but at least
it would make the user acknowledge the pitfalls of that particular
API/feature/subsystem/hardware/... e.g. for module unloading, one
could ask for an `unsafe` inside the `module!` macro invocation (like
`unsafe unloadable: true,`). This would allow for `// SAFETY: ...`
comments.

Another approach would be declaring some of those "external entities"
outside the scope of Rust's safety guarantees, like it is done for
e.g. `/proc/self/mem` in userspace Rust [1]. They would be documented
wherever relevant, and perhaps we could have an "acknowledged
soundness holes" list.

Having said that, of course, what we definitely don't want to allow is
for subsystems to provide unsound safe APIs for no reason.

[1] https://doc.rust-lang.org/stable/std/os/unix/io/index.html#procselfmem-and-similar-os-features

Cheers,
Miguel
  
Greg KH Feb. 25, 2023, 7 a.m. UTC | #18
On Fri, Feb 24, 2023 at 06:52:04PM +0000, Robin Murphy wrote:
> > But again, let's see about disconnecting the iommu ops entirely from the
> > device and just relying on the bus, that should work better, rigth?
> 
> Um, other way - iommu_ops have almost finished moving *off* the bus to be
> per-device. You acked the patch the other week and seemed pleased with it ;)

Oops, yes, sorry, I meant the other way around in my comment, see, it's
confusing :)

And yes, I want to see your patch series merged, it will make my "clean
up the driver core lifetime rules" work a lot easier, is it still
planned for 6.3-rc1?

> But either way that's orthogonal to this case, since the device that
> io-pgtable deals with cannot have iommu_ops of its own (it *is* the IOMMU,
> or at least a GPU MMU acting as one).

Agreed, and thanks for the detailed explaination, it makes more sense
now.  But I still think the pointer reference needs to be incremented
when saved off to be at least a bit more sane.

thanks,

greg k-h
  
Alyssa Rosenzweig Feb. 25, 2023, 5:07 p.m. UTC | #19
February 24, 2023 10:25 AM, "Greg Kroah-Hartman" <gregkh@linuxfoundation.org> wrote:

> On Fri, Feb 24, 2023 at 11:44:59PM +0900, Asahi Lina wrote:
> 
>> On 2023/02/24 23:19, Greg Kroah-Hartman wrote:>> Can we see some users
>> of this code posted so I can see how struct device
>> is going to work in a rust driver? That's the thing I worry most about
>> the rust/C interaction here as we have two different ways of thinking
>> about reference counts from the two worlds and putting them together is
>> going to be "interesting", as can be seen here already.
>> 
>> Also, where are you getting your 'struct device' from in the first
>> place? What bus is createing it and giving it to your rust driver?
>> 
>> That would be platform for my GPU driver, matched via OF compatible.
> 
> Ick, a platform device? The GPU isn't on the PCI bus? Wow, that's
> horrid...

This is bog standard for Arm SoCs... As far as I know, it's all platform devices in the Arm GPU world: Mali, Adreno, Tegra, VideoCore, and yes, Imaginapple. not really sure what good PCI would do for integrated GPUs.
  
Wedson Almeida Filho March 5, 2023, 6:52 a.m. UTC | #20
On Fri, 24 Feb 2023 at 13:23, Asahi Lina <lina@asahilina.net> wrote:
>
> On 25/02/2023 00.14, Robin Murphy wrote:
> > On 2023-02-24 14:48, Asahi Lina wrote:
> >>
> >>
> >> On 2023/02/24 23:32, Robin Murphy wrote:
> >>> FWIW the DMA API *has* to know which specific device it's operating
> >>> with, since the relevant properties can and do vary even between
> >>> different devices within a single bus_type (e.g. DMA masks).
> >>>
> >>> In the case of io-pgtable at least, there's no explicit refcounting
> >>> since the struct device must be the one representing the physical
> >>> platform/PCI/etc. device consuming the pagetable, so if that were to
> >>> disappear from underneath its driver while the pagetable is still in
> >>> use, things would already have gone very very wrong indeed :)
> >>
> >> There's no terribly good way to encode this relationship in safe Rust as
> >> far as I know. So although it might be "obvious" (and I think my driver
> >> can never violate it as it is currently designed), this means the Rust
> >> abstraction will have to take the device reference if the C side does
> >> not, because safe rust abstractions have to actually make these bugs
> >> impossible and nothing stops a Rust driver from, say, stashing an
> >> io_pgtable reference into a global and letting the device go away.
> >
> > If someone did that, then simply holding a struct device reference
> > wouldn't guarantee much, since it only prevents the pointer itself from
> > becoming invalid - it still doesn't say any of the data *in* the
> > structure is still valid and "safe" for what a DMA API call might do
> > with it.
> >
> > At the very least you'd probably have to somehow also guarantee that the
> > device has a driver bound (which is the closest thing to a general
> > indication of valid DMA ops across all architectures) and block it from
> > unbinding for the lifetime of the reference, but that would then mean a
> > simple driver which expects to tear down its io-pgtable from its .remove
> > callback could never be unbound due to the circular dependency :/
>
> I'd like to hear from the other Rust folks about ideas on how to solve
> this ^^. I think it might fit into the Revocable model for device
> resources, but it's going to be a bit of an issue for my driver because
> for me, the io_pgtable objects are deep in the Vm object that has its
> own refcounting, and those are created and destroyed dynamically at
> runtime, not just during device probe...

Our bus abstractions all require that driver data stored in devices
implement the `DeviceRemoval` trait, which has a single function:
device_remove. The idea is that this function will do something to
relinquish access to resources that shouldn't be accessed after the
device is unbound (device_remove is called, for whatever reason). This
mechanism is generic and allows us to break circular dependencies.

One way to implement the `DeviceRemoval` trait is to hold all
resources in "revocable" wrappers (of which we have several that have
different constraints/requirements), and then revoke access when
`device_remove` is called (this results in blocking while concurrent
users are running, then running "destructors" when they're all done
and before returning). The part that isn't great about this approach
is that we introduce error paths everywhere these resources are used
(because they may have been revoked).
  

Patch

diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h
index 75d85bd6c592..3632a39a28a6 100644
--- a/rust/bindings/bindings_helper.h
+++ b/rust/bindings/bindings_helper.h
@@ -6,6 +6,7 @@ 
  * Sorted alphabetically.
  */
 
+#include <linux/device.h>
 #include <linux/slab.h>
 #include <linux/refcount.h>
 
diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs
new file mode 100644
index 000000000000..9be021e393ca
--- /dev/null
+++ b/rust/kernel/device.rs
@@ -0,0 +1,23 @@ 
+// SPDX-License-Identifier: GPL-2.0
+
+//! Generic devices that are part of the kernel's driver model.
+//!
+//! C header: [`include/linux/device.h`](../../../../include/linux/device.h)
+
+use crate::bindings;
+
+/// A raw device.
+///
+/// # Safety
+///
+/// Implementers must ensure that the `*mut device` returned by [`RawDevice::raw_device`] is
+/// related to `self`, that is, actions on it will affect `self`. For example, if one calls
+/// `get_device`, then the refcount on the device represented by `self` will be incremented.
+///
+/// Additionally, implementers must ensure that the device is never renamed. Commit a5462516aa99
+/// ("driver-core: document restrictions on device_rename()") has details on why `device_rename`
+/// should not be used.
+pub unsafe trait RawDevice {
+    /// Returns the raw `struct device` related to `self`.
+    fn raw_device(&self) -> *mut bindings::device;
+}
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index 82dff6f4cf60..de44092718f8 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -29,6 +29,7 @@  compile_error!("Missing kernel configuration for conditional compilation");
 #[cfg(not(testlib))]
 mod allocator;
 mod build_assert;
+pub mod device;
 pub mod error;
 pub mod prelude;
 pub mod print;