[v2,3/4] rust: sync: add `CondVar::wait_timeout`

Message ID 20231216-rb-new-condvar-methods-v2-3-b05ab61e6d5b@google.com
State New
Headers
Series Additional CondVar methods needed by Rust Binder |

Commit Message

Alice Ryhl Dec. 16, 2023, 3:31 p.m. UTC
  Sleep on a condition variable with a timeout.

This is used by Rust Binder for process freezing. There, we want to
sleep until the freeze operation completes, but we want to be able to
abort the process freezing if it doesn't complete within some timeout.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/kernel/sync/condvar.rs | 59 ++++++++++++++++++++++++++++++++++++++++-----
 rust/kernel/sync/lock.rs    |  4 +--
 2 files changed, 55 insertions(+), 8 deletions(-)
  

Comments

Martin Rodriguez Reboredo Dec. 16, 2023, 4:37 p.m. UTC | #1
On 12/16/23 12:31, Alice Ryhl wrote:
> Sleep on a condition variable with a timeout.
> 
> This is used by Rust Binder for process freezing. There, we want to
> sleep until the freeze operation completes, but we want to be able to
> abort the process freezing if it doesn't complete within some timeout.
> 
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
> [...]

Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
  
Tiago Lam Dec. 18, 2023, 8:32 a.m. UTC | #2
On 16/12/2023 15:31, Alice Ryhl wrote:
> Sleep on a condition variable with a timeout.
> 
> This is used by Rust Binder for process freezing. There, we want to
> sleep until the freeze operation completes, but we want to be able to
> abort the process freezing if it doesn't complete within some timeout.
> 
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>

Reviewed-by: Tiago Lam <tiagolam@gmail.com>
  
Boqun Feng Dec. 18, 2023, 9:15 p.m. UTC | #3
On Sat, Dec 16, 2023 at 03:31:41PM +0000, Alice Ryhl wrote:
> Sleep on a condition variable with a timeout.
> 
> This is used by Rust Binder for process freezing. There, we want to
> sleep until the freeze operation completes, but we want to be able to
> abort the process freezing if it doesn't complete within some timeout.
> 
> Signed-off-by: Alice Ryhl <aliceryhl@google.com>
> ---
>  rust/kernel/sync/condvar.rs | 59 ++++++++++++++++++++++++++++++++++++++++-----
>  rust/kernel/sync/lock.rs    |  4 +--
>  2 files changed, 55 insertions(+), 8 deletions(-)
> 
> diff --git a/rust/kernel/sync/condvar.rs b/rust/kernel/sync/condvar.rs
> index 9331eb606738..0176cdfced6c 100644
> --- a/rust/kernel/sync/condvar.rs
> +++ b/rust/kernel/sync/condvar.rs
> @@ -6,7 +6,8 @@
>  //! variable.
>  
>  use super::{lock::Backend, lock::Guard, LockClassKey};
> -use crate::{bindings, init::PinInit, pin_init, str::CStr, types::Opaque};
> +use crate::{bindings, init::PinInit, pin_init, str::CStr, time::Jiffies, types::Opaque};
> +use core::ffi::c_long;
>  use core::marker::PhantomPinned;
>  use macros::pin_data;
>  
> @@ -18,6 +19,8 @@ macro_rules! new_condvar {
>      };
>  }
>  
> +const MAX_SCHEDULE_TIMEOUT: c_long = c_long::MAX;
> +

I'd like to put this in rust/kernel/time.rs or rust/kernel/task.rs, but
it's not a blocker.

>  /// A conditional variable.
>  ///
>  /// Exposes the kernel's [`struct wait_queue_head`] as a condition variable. It allows the caller to
> @@ -102,7 +105,12 @@ pub fn new(name: &'static CStr, key: &'static LockClassKey) -> impl PinInit<Self
>          })
>      }
>  
> -    fn wait_internal<T: ?Sized, B: Backend>(&self, wait_state: u32, guard: &mut Guard<'_, T, B>) {
> +    fn wait_internal<T: ?Sized, B: Backend>(
> +        &self,
> +        wait_state: u32,
> +        guard: &mut Guard<'_, T, B>,
> +        timeout: c_long,

Nit: maybe `timeout_in_jiffies` instead of `timeout`? Or we have another
data type:

	pub type DeltaJiffies = c_long;

or 

	pub type JiffyDelta = c_long;

because a "c_long timeout" really hurts the readability.

Regards,
Boqun

> +    ) -> c_long {
>          let wait = Opaque::<bindings::wait_queue_entry>::uninit();
>  
>          // SAFETY: `wait` points to valid memory.
> @@ -113,11 +121,13 @@ fn wait_internal<T: ?Sized, B: Backend>(&self, wait_state: u32, guard: &mut Guar
>              bindings::prepare_to_wait_exclusive(self.wait_list.get(), wait.get(), wait_state as _)
>          };
>  
[...]
  
Benno Lossin Dec. 20, 2023, 11:31 a.m. UTC | #4
On 12/16/23 16:31, Alice Ryhl wrote:
> @@ -102,7 +105,12 @@ pub fn new(name: &'static CStr, key: &'static LockClassKey) -> impl PinInit<Self
>          })
>      }
> 
> -    fn wait_internal<T: ?Sized, B: Backend>(&self, wait_state: u32, guard: &mut Guard<'_, T, B>) {
> +    fn wait_internal<T: ?Sized, B: Backend>(
> +        &self,
> +        wait_state: u32,
> +        guard: &mut Guard<'_, T, B>,
> +        timeout: c_long,
> +    ) -> c_long {
>          let wait = Opaque::<bindings::wait_queue_entry>::uninit();
> 
>          // SAFETY: `wait` points to valid memory.
> @@ -113,11 +121,13 @@ fn wait_internal<T: ?Sized, B: Backend>(&self, wait_state: u32, guard: &mut Guar
>              bindings::prepare_to_wait_exclusive(self.wait_list.get(), wait.get(), wait_state as _)
>          };
> 
> -        // SAFETY: No arguments, switches to another thread.
> -        guard.do_unlocked(|| unsafe { bindings::schedule() });
> +        // SAFETY: Switches to another thread. The timeout can be any number.
> +        let ret = guard.do_unlocked(|| unsafe { bindings::schedule_timeout(timeout) });

I am not sure what exactly the safety requirements of `schedule_timeout`
are. I looked at the function and saw that the timout should not be
negative. But aside from that only the the context switching should be
relevant. What things are not allowed to do when calling `schedule`
(aside from the stuff that klint catches)?
Because if there are none, then I would put the "switches to another
thread" part into a normal comment.
  
Boqun Feng Dec. 21, 2023, 4:54 p.m. UTC | #5
On Wed, Dec 20, 2023 at 11:31:05AM +0000, Benno Lossin wrote:
> On 12/16/23 16:31, Alice Ryhl wrote:
> > @@ -102,7 +105,12 @@ pub fn new(name: &'static CStr, key: &'static LockClassKey) -> impl PinInit<Self
> >          })
> >      }
> > 
> > -    fn wait_internal<T: ?Sized, B: Backend>(&self, wait_state: u32, guard: &mut Guard<'_, T, B>) {
> > +    fn wait_internal<T: ?Sized, B: Backend>(
> > +        &self,
> > +        wait_state: u32,
> > +        guard: &mut Guard<'_, T, B>,
> > +        timeout: c_long,
> > +    ) -> c_long {
> >          let wait = Opaque::<bindings::wait_queue_entry>::uninit();
> > 
> >          // SAFETY: `wait` points to valid memory.
> > @@ -113,11 +121,13 @@ fn wait_internal<T: ?Sized, B: Backend>(&self, wait_state: u32, guard: &mut Guar
> >              bindings::prepare_to_wait_exclusive(self.wait_list.get(), wait.get(), wait_state as _)
> >          };
> > 
> > -        // SAFETY: No arguments, switches to another thread.
> > -        guard.do_unlocked(|| unsafe { bindings::schedule() });
> > +        // SAFETY: Switches to another thread. The timeout can be any number.
> > +        let ret = guard.do_unlocked(|| unsafe { bindings::schedule_timeout(timeout) });
> 
> I am not sure what exactly the safety requirements of `schedule_timeout`
> are. I looked at the function and saw that the timout should not be
> negative. But aside from that only the the context switching should be
> relevant. What things are not allowed to do when calling `schedule`
> (aside from the stuff that klint catches)?

One thing is that you probably don't want to call `schedule` with task
state being TASK_DEAD, if so the `schedule` would be counted as
`ARef<Task>::drop()`, see __schedule() -> context_switch() ->
finish_context_switch(), and the task may be freed after that, which
free the stack of the task, and anything that references a object on the
stack would be a UAF. On the other hand, if the task state is not
TASK_DEAD, `schedule*()` should be a no-op regarding memory safety.

> Because if there are none, then I would put the "switches to another
> thread" part into a normal comment.
> 

I think it's possible to make schedule_timeout() a safe function: we can
define setting task state TASK_DEAD as an unsafe operation, whose safety
requirement is something like: "Must ensure that if some code can
reference a memory object that belongs to the task (e.g. a stack
variable) after the task calls a followed `schedule()`, the code must
also hold an additional reference count to the task."

Yes, it might be out of the scope of this patchset though.

Regards,
Boqun

> -- 
> Cheers,
> Benno
> 
> > 
> >          // SAFETY: Both `wait` and `wait_list` point to valid memory.
> >          unsafe { bindings::finish_wait(self.wait_list.get(), wait.get()) };
> > +
> > +        ret
> >      }
>
  
Alice Ryhl Jan. 4, 2024, 1:48 p.m. UTC | #6
On Mon, Dec 18, 2023 at 10:15 PM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> On Sat, Dec 16, 2023 at 03:31:41PM +0000, Alice Ryhl wrote:
> > +const MAX_SCHEDULE_TIMEOUT: c_long = c_long::MAX;
> > +
>
> I'd like to put this in rust/kernel/time.rs or rust/kernel/task.rs, but
> it's not a blocker.

I'll move it to task.rs.

> > -    fn wait_internal<T: ?Sized, B: Backend>(&self, wait_state: u32, guard: &mut Guard<'_, T, B>) {
> > +    fn wait_internal<T: ?Sized, B: Backend>(
> > +        &self,
> > +        wait_state: u32,
> > +        guard: &mut Guard<'_, T, B>,
> > +        timeout: c_long,
>
> Nit: maybe `timeout_in_jiffies` instead of `timeout`? Or we have another
> data type:
>
>         pub type DeltaJiffies = c_long;
>
> or
>
>         pub type JiffyDelta = c_long;
>
> because a "c_long timeout" really hurts the readability.

I will rename this to timeout_in_jiffies.

Alice
  
Alice Ryhl Jan. 4, 2024, 1:49 p.m. UTC | #7
On Thu, Dec 21, 2023 at 5:54 PM Boqun Feng <boqun.feng@gmail.com> wrote:
>
> On Wed, Dec 20, 2023 at 11:31:05AM +0000, Benno Lossin wrote:
> > On 12/16/23 16:31, Alice Ryhl wrote:
> > > @@ -102,7 +105,12 @@ pub fn new(name: &'static CStr, key: &'static LockClassKey) -> impl PinInit<Self
> > >          })
> > >      }
> > >
> > > -    fn wait_internal<T: ?Sized, B: Backend>(&self, wait_state: u32, guard: &mut Guard<'_, T, B>) {
> > > +    fn wait_internal<T: ?Sized, B: Backend>(
> > > +        &self,
> > > +        wait_state: u32,
> > > +        guard: &mut Guard<'_, T, B>,
> > > +        timeout: c_long,
> > > +    ) -> c_long {
> > >          let wait = Opaque::<bindings::wait_queue_entry>::uninit();
> > >
> > >          // SAFETY: `wait` points to valid memory.
> > > @@ -113,11 +121,13 @@ fn wait_internal<T: ?Sized, B: Backend>(&self, wait_state: u32, guard: &mut Guar
> > >              bindings::prepare_to_wait_exclusive(self.wait_list.get(), wait.get(), wait_state as _)
> > >          };
> > >
> > > -        // SAFETY: No arguments, switches to another thread.
> > > -        guard.do_unlocked(|| unsafe { bindings::schedule() });
> > > +        // SAFETY: Switches to another thread. The timeout can be any number.
> > > +        let ret = guard.do_unlocked(|| unsafe { bindings::schedule_timeout(timeout) });
> >
> > I am not sure what exactly the safety requirements of `schedule_timeout`
> > are. I looked at the function and saw that the timout should not be
> > negative. But aside from that only the the context switching should be
> > relevant. What things are not allowed to do when calling `schedule`
> > (aside from the stuff that klint catches)?
>
> One thing is that you probably don't want to call `schedule` with task
> state being TASK_DEAD, if so the `schedule` would be counted as
> `ARef<Task>::drop()`, see __schedule() -> context_switch() ->
> finish_context_switch(), and the task may be freed after that, which
> free the stack of the task, and anything that references a object on the
> stack would be a UAF. On the other hand, if the task state is not
> TASK_DEAD, `schedule*()` should be a no-op regarding memory safety.
>
> > Because if there are none, then I would put the "switches to another
> > thread" part into a normal comment.
> >
>
> I think it's possible to make schedule_timeout() a safe function: we can
> define setting task state TASK_DEAD as an unsafe operation, whose safety
> requirement is something like: "Must ensure that if some code can
> reference a memory object that belongs to the task (e.g. a stack
> variable) after the task calls a followed `schedule()`, the code must
> also hold an additional reference count to the task."
>
> Yes, it might be out of the scope of this patchset though.

These things sound like they are out of scope of this patchset.
Changing it from schedule to schedule_timeout doesn't change whether
this is ok or not.

Alice
  

Patch

diff --git a/rust/kernel/sync/condvar.rs b/rust/kernel/sync/condvar.rs
index 9331eb606738..0176cdfced6c 100644
--- a/rust/kernel/sync/condvar.rs
+++ b/rust/kernel/sync/condvar.rs
@@ -6,7 +6,8 @@ 
 //! variable.
 
 use super::{lock::Backend, lock::Guard, LockClassKey};
-use crate::{bindings, init::PinInit, pin_init, str::CStr, types::Opaque};
+use crate::{bindings, init::PinInit, pin_init, str::CStr, time::Jiffies, types::Opaque};
+use core::ffi::c_long;
 use core::marker::PhantomPinned;
 use macros::pin_data;
 
@@ -18,6 +19,8 @@  macro_rules! new_condvar {
     };
 }
 
+const MAX_SCHEDULE_TIMEOUT: c_long = c_long::MAX;
+
 /// A conditional variable.
 ///
 /// Exposes the kernel's [`struct wait_queue_head`] as a condition variable. It allows the caller to
@@ -102,7 +105,12 @@  pub fn new(name: &'static CStr, key: &'static LockClassKey) -> impl PinInit<Self
         })
     }
 
-    fn wait_internal<T: ?Sized, B: Backend>(&self, wait_state: u32, guard: &mut Guard<'_, T, B>) {
+    fn wait_internal<T: ?Sized, B: Backend>(
+        &self,
+        wait_state: u32,
+        guard: &mut Guard<'_, T, B>,
+        timeout: c_long,
+    ) -> c_long {
         let wait = Opaque::<bindings::wait_queue_entry>::uninit();
 
         // SAFETY: `wait` points to valid memory.
@@ -113,11 +121,13 @@  fn wait_internal<T: ?Sized, B: Backend>(&self, wait_state: u32, guard: &mut Guar
             bindings::prepare_to_wait_exclusive(self.wait_list.get(), wait.get(), wait_state as _)
         };
 
-        // SAFETY: No arguments, switches to another thread.
-        guard.do_unlocked(|| unsafe { bindings::schedule() });
+        // SAFETY: Switches to another thread. The timeout can be any number.
+        let ret = guard.do_unlocked(|| unsafe { bindings::schedule_timeout(timeout) });
 
         // SAFETY: Both `wait` and `wait_list` point to valid memory.
         unsafe { bindings::finish_wait(self.wait_list.get(), wait.get()) };
+
+        ret
     }
 
     /// Releases the lock and waits for a notification in uninterruptible mode.
@@ -127,7 +137,7 @@  fn wait_internal<T: ?Sized, B: Backend>(&self, wait_state: u32, guard: &mut Guar
     /// [`CondVar::notify_one`] or [`CondVar::notify_all`]. Note that it may also wake up
     /// spuriously.
     pub fn wait<T: ?Sized, B: Backend>(&self, guard: &mut Guard<'_, T, B>) {
-        self.wait_internal(bindings::TASK_UNINTERRUPTIBLE, guard);
+        self.wait_internal(bindings::TASK_UNINTERRUPTIBLE, guard, MAX_SCHEDULE_TIMEOUT);
     }
 
     /// Releases the lock and waits for a notification in interruptible mode.
@@ -138,10 +148,31 @@  pub fn wait<T: ?Sized, B: Backend>(&self, guard: &mut Guard<'_, T, B>) {
     /// Returns whether there is a signal pending.
     #[must_use = "wait_interruptible returns if a signal is pending, so the caller must check the return value"]
     pub fn wait_interruptible<T: ?Sized, B: Backend>(&self, guard: &mut Guard<'_, T, B>) -> bool {
-        self.wait_internal(bindings::TASK_INTERRUPTIBLE, guard);
+        self.wait_internal(bindings::TASK_INTERRUPTIBLE, guard, MAX_SCHEDULE_TIMEOUT);
         crate::current!().signal_pending()
     }
 
+    /// Releases the lock and waits for a notification in interruptible mode.
+    ///
+    /// Atomically releases the given lock (whose ownership is proven by the guard) and puts the
+    /// thread to sleep. It wakes up when notified by [`CondVar::notify_one`] or
+    /// [`CondVar::notify_all`], or when a timeout occurs, or when the thread receives a signal.
+    #[must_use = "wait_interruptible_timeout returns if a signal is pending, so the caller must check the return value"]
+    pub fn wait_interruptible_timeout<T: ?Sized, B: Backend>(
+        &self,
+        guard: &mut Guard<'_, T, B>,
+        jiffies: Jiffies,
+    ) -> CondVarTimeoutResult {
+        let jiffies = jiffies.try_into().unwrap_or(MAX_SCHEDULE_TIMEOUT);
+        let res = self.wait_internal(bindings::TASK_INTERRUPTIBLE, guard, jiffies);
+
+        match (res as Jiffies, crate::current!().signal_pending()) {
+            (jiffies, true) => CondVarTimeoutResult::Signal { jiffies },
+            (0, false) => CondVarTimeoutResult::Timeout,
+            (jiffies, false) => CondVarTimeoutResult::Woken { jiffies },
+        }
+    }
+
     /// Calls the kernel function to notify the appropriate number of threads with the given flags.
     fn notify(&self, count: i32, flags: u32) {
         // SAFETY: `wait_list` points to valid memory.
@@ -177,3 +208,19 @@  pub fn notify_all(&self) {
         self.notify(0, 0);
     }
 }
+
+/// The return type of `wait_timeout`.
+pub enum CondVarTimeoutResult {
+    /// The timeout was reached.
+    Timeout,
+    /// Somebody woke us up.
+    Woken {
+        /// Remaining sleep duration.
+        jiffies: Jiffies,
+    },
+    /// A signal occurred.
+    Signal {
+        /// Remaining sleep duration.
+        jiffies: Jiffies,
+    },
+}
diff --git a/rust/kernel/sync/lock.rs b/rust/kernel/sync/lock.rs
index f12a684bc957..149a5259d431 100644
--- a/rust/kernel/sync/lock.rs
+++ b/rust/kernel/sync/lock.rs
@@ -139,7 +139,7 @@  pub struct Guard<'a, T: ?Sized, B: Backend> {
 unsafe impl<T: Sync + ?Sized, B: Backend> Sync for Guard<'_, T, B> {}
 
 impl<T: ?Sized, B: Backend> Guard<'_, T, B> {
-    pub(crate) fn do_unlocked(&mut self, cb: impl FnOnce()) {
+    pub(crate) fn do_unlocked<U>(&mut self, cb: impl FnOnce() -> U) -> U {
         // SAFETY: The caller owns the lock, so it is safe to unlock it.
         unsafe { B::unlock(self.lock.state.get(), &self.state) };
 
@@ -147,7 +147,7 @@  pub(crate) fn do_unlocked(&mut self, cb: impl FnOnce()) {
         let _relock =
             ScopeGuard::new(|| unsafe { B::relock(self.lock.state.get(), &mut self.state) });
 
-        cb();
+        cb()
     }
 }