rust: str: add conversion from `CStr` to `CString`

Message ID 20230502125306.358283-1-aliceryhl@google.com
State New

Commit Message

Alice Ryhl May 2, 2023, 12:53 p.m. UTC
These methods can be used to copy the data in a temporary C string into
a separate allocation, so that it can be accessed later even if the
original is deallocated.

The API in this file mirrors the standard library API for the `&str` and
`String` types. The `ToOwned` trait is not implemented because it
assumes that allocations are infallible.

Signed-off-by: Alice Ryhl <aliceryhl@google.com>
---
 rust/kernel/str.rs | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)


base-commit: ea76e08f4d901a450619831a255e9e0a4c0ed162
  

Comments

Martin Rodriguez Reboredo May 2, 2023, 1:55 p.m. UTC | #1
Reviewed-by: Martin Rodriguez Reboredo <yakoyoku@gmail.com>
  
Wedson Almeida Filho May 2, 2023, 4:59 p.m. UTC | #2
On Tue, 2 May 2023 at 09:53, Alice Ryhl <aliceryhl@google.com> wrote:
> +impl<'a> TryFrom<&'a CStr> for CString {
> +    type Error = TryReserveError;

Wouldn't `AllocError` make more sense? Or even `Error` (with the
`ENOMEM` value)?

`TryReserveError` is documented as "The error type for try_reserve
methods." The fact that we use a `Vec` is an implementation detail, and
I feel it's better not to leak it through the public API.

> +
> +    fn try_from(cstr: &'a CStr) -> Result<CString, TryReserveError> {
> +        let len = cstr.len_with_nul();
> +        let mut buf = Vec::try_with_capacity(len)?;
> +        buf.try_extend_from_slice(cstr.as_bytes_with_nul())?;
> +
> +        // INVARIANT: The CStr and CString types have the same invariants for
> +        // the string data, and we copied it over without changes.
> +        Ok(CString { buf })
> +    }
> +}
  
Benno Lossin May 2, 2023, 6:02 p.m. UTC | #3
On 02.05.23 18:59, Wedson Almeida Filho wrote:
> On Tue, 2 May 2023 at 09:53, Alice Ryhl <aliceryhl@google.com> wrote:
>> +impl<'a> TryFrom<&'a CStr> for CString {
>> +    type Error = TryReserveError;
> 
> Wouldn't `AllocError` make more sense? Or even `Error` (with the
> `ENOMEM` value)?
> 
> `TryReserveError` is documented as "The error type for try_reserve
> methods." The fact that we use a `Vec` is an implementation detail, and
> I feel it's better not to leak it through the public API.

I agree, it should be `AllocError`. There is a `From<AllocError> for Error`
with `ENOMEM` as the value, so `AllocError` is the most compatible, since it
simply converts to `Error` via `?`.
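
As a rough user-space illustration of that `?` conversion: the sketch
below uses local stand-ins only. `AllocError`, `Error`, `ENOMEM`,
`try_alloc` and `setup` are hypothetical definitions written for this
example, not the kernel crate's actual types, and the errno is stored
as a plain positive integer for simplicity.

```rust
/// Stand-in for an allocation-failure marker type (assumption, not the
/// kernel's `AllocError`).
#[derive(Debug, PartialEq, Eq)]
pub struct AllocError;

/// Stand-in for an errno-style error type (assumption). It simply wraps
/// a positive errno value.
#[derive(Debug, PartialEq, Eq)]
pub struct Error(pub i32);

/// Linux errno value for "out of memory".
pub const ENOMEM: Error = Error(12);

impl From<AllocError> for Error {
    fn from(_: AllocError) -> Error {
        ENOMEM
    }
}

/// Hypothetical allocation step. The failure is simulated through the
/// `fail` flag so that both paths can be exercised.
pub fn try_alloc(len: usize, fail: bool) -> Result<Vec<u8>, AllocError> {
    if fail {
        Err(AllocError)
    } else {
        Ok(vec![0u8; len])
    }
}

/// A caller whose error type is `Error`: the `?` operator goes through
/// `From<AllocError> for Error`, so no explicit `map_err` is needed.
pub fn setup(len: usize, fail: bool) -> Result<usize, Error> {
    let buf = try_alloc(len, fail)?;
    Ok(buf.len())
}
```

With these stand-ins, `setup(4, true)` yields `Err(ENOMEM)` while
`setup(4, false)` yields `Ok(4)`.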

Technically, `TryReserveError` represents two different kinds of errors:
- CapacityOverflow -- the requested capacity exceeds `isize::MAX` bytes
- AllocError -- the memory allocation itself failed

I think it is fine to coalesce these into `AllocError`, since allocating
`isize::MAX` might as well be considered an OOM error.
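
One way the coalescing could look, sketched in user space with the
stable standard library rather than the kernel's `alloc` fork (so
`Vec::try_reserve_exact` stands in for `Vec::try_with_capacity` and
`try_extend_from_slice`). `AllocError` and `OwnedCStr` here are
hypothetical stand-ins, not the kernel's `AllocError` or `CString`, and
this is not the actual v2 of the patch.

```rust
use std::collections::TryReserveError;
use std::convert::TryFrom;
use std::ffi::CStr;

/// Stand-in for an allocation-failure marker type (assumption).
#[derive(Debug, PartialEq, Eq)]
pub struct AllocError;

impl From<TryReserveError> for AllocError {
    /// Both kinds of `TryReserveError` collapse into one `AllocError`.
    fn from(_: TryReserveError) -> AllocError {
        AllocError
    }
}

/// Hypothetical owned, NUL-terminated byte string.
pub struct OwnedCStr {
    buf: Vec<u8>,
}

impl OwnedCStr {
    /// The stored bytes, including the trailing NUL.
    pub fn as_bytes_with_nul(&self) -> &[u8] {
        &self.buf
    }
}

impl<'a> TryFrom<&'a CStr> for OwnedCStr {
    type Error = AllocError;

    fn try_from(cstr: &'a CStr) -> Result<OwnedCStr, AllocError> {
        let bytes = cstr.to_bytes_with_nul();
        let mut buf = Vec::new();
        // Fallible reservation; `?` maps `TryReserveError` to `AllocError`.
        buf.try_reserve_exact(bytes.len())?;
        // Capacity is already reserved, so this does not reallocate.
        buf.extend_from_slice(bytes);
        Ok(OwnedCStr { buf })
    }
}
```

The public signature then mentions only `AllocError`; that a `Vec` is
used underneath stays an implementation detail.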

With that fixed:
Reviewed-by: Benno Lossin <benno.lossin@proton.me>

>> +
>> +    fn try_from(cstr: &'a CStr) -> Result<CString, TryReserveError> {
>> +        let len = cstr.len_with_nul();
>> +        let mut buf = Vec::try_with_capacity(len)?;
>> +        buf.try_extend_from_slice(cstr.as_bytes_with_nul())?;
>> +
>> +        // INVARIANT: The CStr and CString types have the same invariants for
>> +        // the string data, and we copied it over without changes.
>> +        Ok(CString { buf })
>> +    }
>> +}
  
Alice Ryhl May 2, 2023, 6:17 p.m. UTC | #4
On 5/2/23 20:02, Benno Lossin wrote:
> On 02.05.23 18:59, Wedson Almeida Filho wrote:
>> On Tue, 2 May 2023 at 09:53, Alice Ryhl <aliceryhl@google.com> wrote:
>>>
>>> +impl<'a> TryFrom<&'a CStr> for CString {
>>> +    type Error = TryReserveError;
>>
>> Wouldn't `AllocError` make more sense? Or even `Error` (with the
>> `ENOMEM` value)?
>>
>> `TryReserveError` is documented as "The error type for try_reserve
>> methods." The fact that we use a `Vec` is an implementation detail, and
>> I feel it's better not to leak it through the public API.
> 
> I agree, it should be `AllocError`. There is a `From<AllocError> for Error`
> with `ENOMEM` as the value, so `AllocError` is the most compatible, since it
> simply converts to `Error` via `?`.

Sounds good to me.

> Technically, `TryReserveError` represents two different kinds of errors:
> - CapacityOverflow -- the requested capacity exceeds `isize::MAX` bytes
> - AllocError -- the memory allocation itself failed
> 
> I think it is fine to coalesce these into `AllocError`, since allocating
> `isize::MAX` might as well be considered an OOM error.

In fact, the `isize::MAX` case is unreachable since that would require
you to already have a `&CStr` of that size, which Rust does not allow.
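
To make the capacity-overflow case concrete, here is a small user-space
sketch using the stable standard library `Vec::try_reserve_exact`
(again a stand-in for the kernel API). `reserve_bytes` is a
hypothetical helper written for this example.

```rust
use std::collections::TryReserveError;

/// Reserve room for `len` bytes, reporting failure instead of aborting.
pub fn reserve_bytes(len: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut buf = Vec::new();
    buf.try_reserve_exact(len)?;
    Ok(buf)
}
```

A request such as `reserve_bytes(usize::MAX)` is rejected because it
exceeds `isize::MAX` bytes. A length taken from an existing byte slice
is already within that bound, so only a genuine allocation failure
remains possible on that path.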

> With that fixed:
> Reviewed-by: Benno Lossin <benno.lossin@proton.me>

Thanks both of you. I'll submit a v2 tomorrow.

Alice
  

Patch

diff --git a/rust/kernel/str.rs b/rust/kernel/str.rs
index b771310fa4a4..54935ff3a610 100644
--- a/rust/kernel/str.rs
+++ b/rust/kernel/str.rs
@@ -2,6 +2,7 @@ 
 
 //! String representations.
 
+use alloc::collections::TryReserveError;
 use alloc::vec::Vec;
 use core::fmt::{self, Write};
 use core::ops::{self, Deref, Index};
@@ -199,6 +200,12 @@ impl CStr {
     pub unsafe fn as_str_unchecked(&self) -> &str {
         unsafe { core::str::from_utf8_unchecked(self.as_bytes()) }
     }
+
+    /// Convert this [`CStr`] into a [`CString`] by allocating memory and
+    /// copying over the string data.
+    pub fn to_cstring(&self) -> Result<CString, TryReserveError> {
+        CString::try_from(self)
+    }
 }
 
 impl fmt::Display for CStr {
@@ -584,6 +591,20 @@ impl Deref for CString {
     }
 }
 
+impl<'a> TryFrom<&'a CStr> for CString {
+    type Error = TryReserveError;
+
+    fn try_from(cstr: &'a CStr) -> Result<CString, TryReserveError> {
+        let len = cstr.len_with_nul();
+        let mut buf = Vec::try_with_capacity(len)?;
+        buf.try_extend_from_slice(cstr.as_bytes_with_nul())?;
+
+        // INVARIANT: The CStr and CString types have the same invariants for
+        // the string data, and we copied it over without changes.
+        Ok(CString { buf })
+    }
+}
+
 /// A convenience alias for [`core::format_args`].
 #[macro_export]
 macro_rules! fmt {