Message ID | 20240214172505.5044-1-dakr@redhat.com |
---|---|
State | New |
Headers | From: Danilo Krummrich <dakr@redhat.com> To: ojeda@kernel.org, alex.gaynor@gmail.com, wedsonaf@gmail.com, boqun.feng@gmail.com, gary@garyguo.net, bjorn3_gh@protonmail.com, benno.lossin@proton.me, a.hindborg@samsung.com, aliceryhl@google.com Cc: rust-for-linux@vger.kernel.org, linux-kernel@vger.kernel.org, Danilo Krummrich <dakr@redhat.com> Subject: [PATCH v3] rust: str: add {make,to}_{upper,lower}case() to CString Date: Wed, 14 Feb 2024 18:24:10 +0100 Message-ID: <20240214172505.5044-1-dakr@redhat.com> X-Mailer: git-send-email 2.43.0 List-Id: <linux-kernel.vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit |
Series | [v3] rust: str: add {make,to}_{upper,lower}case() to CString |
Commit Message
Danilo Krummrich
Feb. 14, 2024, 5:24 p.m. UTC
Add functions to convert a CString to upper- / lowercase, either
in-place or by creating a copy of the original CString.
Naming follows the one from the Rust stdlib, where functions starting
with 'to' create a copy and functions starting with 'make' perform an
in-place conversion.
This is required by the Nova project (GSP only Rust successor of
Nouveau) to convert stringified enum values (representing different GPU
chipsets) to strings in order to generate the corresponding firmware
paths. See also [1].
[1] https://rust-for-linux.zulipchat.com/#narrow/stream/288089-General/topic/String.20manipulation.20in.20kernel.20Rust
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
---
Changes in V3:
- add an `impl DerefMut for CString`, such that these functions can be defined
for `CStr` as `&mut self` and still be called on a `CString`
Changes in V2:
- expand commit message mentioning the use case
- expand function doc comments to match the ones from Rust's stdlib
- rename to_* to make_* and add the actual to_* implementations
---
rust/kernel/str.rs | 81 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 80 insertions(+), 1 deletion(-)
base-commit: 7e90b5c295ec1e47c8ad865429f046970c549a66
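The `to_*` / `make_*` naming convention that the commit message borrows from the Rust stdlib can be illustrated in userspace Rust. This is only a sketch using `String` from the standard library, not the kernel's `CString`; the helper names `to_upper_copy` and `make_upper_in_place` are made up for the example.

```rust
/// `to_*` style: returns a converted copy and leaves the input untouched.
pub fn to_upper_copy(s: &str) -> String {
    // `str::to_ascii_uppercase` allocates a new `String`.
    s.to_ascii_uppercase()
}

/// `make_*` style: converts the existing buffer in place.
pub fn make_upper_in_place(s: &mut String) {
    // `str::make_ascii_uppercase` rewrites the bytes of the existing buffer.
    s.make_ascii_uppercase();
}
```

The copying variant needs an allocation, which is why the kernel versions proposed in this patch return a `Result<CString, AllocError>`, while the in-place variant cannot fail.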
Comments
On Wed, Feb 14, 2024 at 06:24:10PM +0100, Danilo Krummrich wrote:
> diff --git a/rust/kernel/str.rs b/rust/kernel/str.rs
> index 7d848b83add4..02d6e510b852 100644
> --- a/rust/kernel/str.rs
> +++ b/rust/kernel/str.rs
> @@ -5,7 +5,7 @@
>  use alloc::alloc::AllocError;
>  use alloc::vec::Vec;
>  use core::fmt::{self, Write};
> -use core::ops::{self, Deref, Index};
> +use core::ops::{self, Deref, DerefMut, Index};
>
>  use crate::{
>      bindings,
> @@ -143,6 +143,19 @@ pub const fn from_bytes_with_nul(bytes: &[u8]) -> Result<&Self, CStrConvertError
>          unsafe { core::mem::transmute(bytes) }
>      }
>
> +    /// Creates a mutable [`CStr`] from a `[u8]` without performing any
> +    /// additional checks.
> +    ///
> +    /// # Safety
> +    ///
> +    /// `bytes` *must* end with a `NUL` byte, and should only have a single
> +    /// `NUL` byte (or the string will be truncated).
> +    #[inline]
> +    pub const unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut CStr {
> +        // SAFETY: Properties of `bytes` guaranteed by the safety precondition.
> +        unsafe { &mut *(bytes as *mut [u8] as *mut CStr) }

First, `.cast::<[u8]>().cast::<CStr>()` is preferred over `as`. Besides,
I think the dereference (or reborrow) is only safe if `CStr` is
`#[repr(transparent)]`, i.e.

    #[repr(transparent)]
    pub struct CStr([u8]);

With that, you can implement the function as follows (you can still use
the `cast()` implementation, but I sometimes find `transmute` simpler):

    pub const unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut CStr {
        // SAFETY: `CStr` is transparent to `[u8]`, so the transmute is
        // safe to do, and per the function safety requirement, `bytes`
        // is a valid `CStr`.
        unsafe { core::mem::transmute(bytes) }
    }

But this is just my thought; better to wait for others' feedback as well.

Regards,
Boqun
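The `#[repr(transparent)]` newtype plus `transmute` approach suggested above can be tried outside the kernel. The following is a sketch under assumptions: `NulTerminated` is a hypothetical stand-in for the kernel's `CStr`, and `as_bytes_with_nul` exists only so the result can be inspected.

```rust
/// Hypothetical stand-in for the kernel's `CStr`: a transparent newtype
/// over `[u8]`.
#[repr(transparent)]
pub struct NulTerminated([u8]);

impl NulTerminated {
    /// Reinterprets `bytes` as a mutable `NulTerminated` without checking it.
    ///
    /// # Safety
    ///
    /// `bytes` must end with a NUL byte and contain no other NUL bytes.
    pub unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut NulTerminated {
        // SAFETY: `NulTerminated` is `#[repr(transparent)]` over `[u8]`, so
        // both reference types describe the same memory. Upholding the NUL
        // requirements is the caller's obligation.
        unsafe { core::mem::transmute::<&mut [u8], &mut NulTerminated>(bytes) }
    }

    /// Returns the underlying bytes, including the trailing NUL.
    pub fn as_bytes_with_nul(&self) -> &[u8] {
        &self.0
    }
}
```

Spelling out both types in the `transmute` turbofish is a stylistic choice that keeps the conversion from silently changing if the signature is edited later.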
On 2/14/24 20:27, Boqun Feng wrote:
> On Wed, Feb 14, 2024 at 06:24:10PM +0100, Danilo Krummrich wrote:
>> +    pub const unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut CStr {
>> +        // SAFETY: Properties of `bytes` guaranteed by the safety precondition.
>> +        unsafe { &mut *(bytes as *mut [u8] as *mut CStr) }
>
> First `.cast::<[u8]>().cast::<CStr>()` is preferred than `as`. Besides,
> I think the dereference (or reborrow) is only safe if `CStr` is
> `#[repr(transparent)]`. I.e.
>
>     #[repr(transparent)]
>     pub struct CStr([u8]);
>
> with that you can implement the function as (you can still use `cast()`
> implementation, but I sometimes find `transmute` is more simple).
>
>     pub const unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut CStr {
>         // SAFETY: `CStr` is transparent to `[u8]`, so the transmute is
>         // safe to do, and per the function safety requirement, `bytes`
>         // is a valid `CStr`.
>         unsafe { core::mem::transmute(bytes) }
>     }
>
> but this is just my thought, better wait for others' feedback as well.

Transmuting references is generally frowned upon. It's better to use a
pointer cast.

As for .cast() vs the `as` operator, I'm not sure you can use .cast() in
this case since the pointers are unsized. So you might have to use `as`
instead.

Alice
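The point about `.cast()` and unsized pointers can be checked with a small userspace sketch. The raw-pointer method `cast::<U>()` has an implicit `U: Sized` bound, so it can produce a thin pointer such as `*mut u8` but not a `*mut [u8]` or a pointer to a slice-tailed wrapper; the `as` operator has no such restriction. `Wrapper` and the helper functions are hypothetical names for this example.

```rust
/// Hypothetical transparent wrapper over an unsized byte slice.
#[repr(transparent)]
pub struct Wrapper([u8]);

impl Wrapper {
    /// Number of bytes in the wrapped slice.
    pub fn len(&self) -> usize {
        self.0.len()
    }

    /// Overwrites the first byte, if there is one.
    pub fn set_first(&mut self, value: u8) {
        if let Some(b) = self.0.first_mut() {
            *b = value;
        }
    }
}

/// Converts `&mut [u8]` to `&mut Wrapper` with an `as` pointer cast.
pub fn wrap_mut(bytes: &mut [u8]) -> &mut Wrapper {
    let fat: *mut [u8] = bytes;
    // `fat.cast::<Wrapper>()` would not compile here: the target type of
    // `cast` must be `Sized`, and `Wrapper` is not.
    let wrapped = fat as *mut Wrapper;
    // SAFETY: `Wrapper` is `#[repr(transparent)]` over `[u8]` and the pointer
    // comes from a valid, exclusive reference.
    unsafe { &mut *wrapped }
}

/// `cast` does work when the target is sized: this yields a thin pointer to
/// the first byte and drops the length metadata.
pub fn first_byte_ptr(bytes: &mut [u8]) -> *mut u8 {
    let fat: *mut [u8] = bytes;
    fat.cast::<u8>()
}
```

The `as` cast between the two unsized pointer types carries the slice length over unchanged, which is why `len()` on the wrapper reports the original length.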
On Wed, Feb 14, 2024 at 08:59:06PM +0100, Alice Ryhl wrote:
> Transmuting references is generally frowned upon. It's better to use a
> pointer cast.

Ok, but honestly, I don't think the pointer casting is better ;-) What
wants to be done here is simply converting a `&mut [u8]` to `&mut CStr`,
adding two levels of pointer casting is kinda noise. (Also
`from_bytes_with_nul` uses `transmute` as well).

> As for .cast() vs the `as` operator, I'm not sure you can use .cast() in
> this case since the pointers are unsized. So you might have to use `as`
> instead.

You're right, that's a bit unfortunate..

Regards,
Boqun
On Thu, Feb 15, 2024 at 2:18 AM Boqun Feng <boqun.feng@gmail.com> wrote:
> On Wed, Feb 14, 2024 at 08:59:06PM +0100, Alice Ryhl wrote:
> > Transmuting references is generally frowned upon. It's better to use a
> > pointer cast.
>
> Ok, but honestly, I don't think the pointer casting is better ;-) What
> wants to be done here is simply converting a `&mut [u8]` to `&mut CStr`,
> adding two levels of pointer casting is kinda noise. (Also
> `from_bytes_with_nul` uses `transmute` as well).

Here's my logic for preferring pointer casts: Transmute raises
questions about the layout of fat pointers, whereas pointer casts are
obviously okay.

Alice
On Thu, Feb 15, 2024 at 10:38:07AM +0100, Alice Ryhl wrote:
> Here's my logic for preferring pointer casts: Transmute raises
> questions about the layout of fat pointers, whereas pointer casts are
> obviously okay.

But in this case, eventually you need to worry about fat pointer layout
when you dereference the `*mut CStr`, right? In other words, the
dereference is only safe if `*mut [u8]` has the same fat pointer layout
as `*mut CStr`. I prefer to transmute here because it's a newtype
paradigm, and transmute kinda makes that clear.

Regards,
Boqun
On Thu, Feb 15, 2024 at 5:51 PM Boqun Feng <boqun.feng@gmail.com> wrote:
> But in this case, eventually you need to worry about fat pointer layout
> when you dereference the `*mut CStr`, right? In other words, the
> dereference is only safe if `*mut [u8]` has the same fat pointer layout
> as `*mut CStr`. I prefer to transmute here because it's a newtype
> paradigm, and transmute kinda makes that clear.

No, if the `*mut CStr` and `*mut [u8]` types disagree on whether the
data or vtable pointer is first in the layout, then an as cast should
swap them. The question of whether their vtables (well I guess it's
just a length in this case) are compatible is separate.

Alice
> +    pub fn make_ascii_lowercase(&mut self) {
> +        self.0.make_ascii_lowercase();
> +    }

It's important to note here that this doesn't remove or introduce NUL
bytes.

    pub fn make_ascii_lowercase(&mut self) {
        // INVARIANT: This doesn't introduce or remove NUL bytes in the c
        // string.
        self.0.make_ascii_lowercase();
    }

Ditto for make_ascii_uppercase. (But not the to_* methods.)

> +    /// Returns a copy of this [`CString`] where each character is mapped to its
> +    /// ASCII lower case equivalent.
> +    ///
> +    /// ASCII letters 'A' to 'Z' are mapped to 'a' to 'z',
> +    /// but non-ASCII letters are unchanged.
> +    ///
> +    /// To lowercase the value in-place, use [`make_ascii_lowercase`].
> +    ///
> +    /// [`make_ascii_lowercase`]: str::make_ascii_lowercase
> +    pub fn to_ascii_lowercase(&self) -> Result<CString, AllocError> {
> +        let mut s = (*self).to_cstring()?;
> +
> +        s.make_ascii_lowercase();
> +
> +        return Ok(s);
> +    }
> +
> +    /// Returns a copy of this [`CString`] where each character is mapped to its
> +    /// ASCII upper case equivalent.
> +    ///
> +    /// ASCII letters 'a' to 'z' are mapped to 'A' to 'Z',
> +    /// but non-ASCII letters are unchanged.
> +    ///
> +    /// To uppercase the value in-place, use [`make_ascii_uppercase`].
> +    ///
> +    /// [`make_ascii_uppercase`]: str::make_ascii_uppercase
> +    pub fn to_ascii_uppercase(&self) -> Result<CString, AllocError> {
> +        let mut s = (*self).to_cstring()?;
> +
> +        s.make_ascii_uppercase();
> +
> +        return Ok(s);
> +    }

Please move these to `CStr` as well.

> +impl DerefMut for CString {
> +    fn deref_mut(&mut self) -> &mut Self::Target {
> +        unsafe { CStr::from_bytes_with_nul_unchecked_mut(&mut *self.buf) }
> +    }
> +}

Needs a safety comment.

    impl DerefMut for CString {
        fn deref_mut(&mut self) -> &mut Self::Target {
            // SAFETY: A `CString` is always NUL-terminated and contains no
            // other NUL bytes.
            unsafe { CStr::from_bytes_with_nul_unchecked_mut(&mut *self.buf) }
        }
    }

Alice
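The invariant requested in the review above (ASCII case conversion neither introduces nor removes NUL bytes) can be checked exhaustively in userspace, since there are only 256 byte values. This is a sketch using plain `[u8]` rather than the kernel's `CStr`; the function names are invented for the example.

```rust
/// Checks every byte value: ASCII case mapping sends NUL to NUL and never
/// maps a non-NUL byte to NUL.
pub fn ascii_case_mapping_preserves_nul() -> bool {
    (0..=u8::MAX).all(|b| {
        (b.to_ascii_lowercase() == 0) == (b == 0)
            && (b.to_ascii_uppercase() == 0) == (b == 0)
    })
}

/// Returns the indices of all NUL bytes in `bytes`.
pub fn nul_positions(bytes: &[u8]) -> Vec<usize> {
    bytes
        .iter()
        .enumerate()
        .filter(|(_, b)| **b == 0)
        .map(|(i, _)| i)
        .collect()
}

/// Lowercases `bytes` in place and reports whether the NUL positions are the
/// same before and after the conversion.
pub fn lowercase_keeps_nuls(bytes: &mut [u8]) -> bool {
    let before = nul_positions(bytes);
    bytes.make_ascii_lowercase();
    before == nul_positions(bytes)
}
```

Because only the bytes `A` to `Z` and `a` to `z` are ever rewritten, a buffer that was a valid NUL-terminated string before the in-place conversion remains one afterwards.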
On 2/16/24 17:53, Alice Ryhl wrote:
>> +    pub fn make_ascii_lowercase(&mut self) {
>> +        self.0.make_ascii_lowercase();
>> +    }
>
> It's important to note here that this doesn't remove or introduce NUL
> bytes.
>
>     pub fn make_ascii_lowercase(&mut self) {
>         // INVARIANT: This doesn't introduce or remove NUL bytes in the c
>         // string.
>         self.0.make_ascii_lowercase();
>     }
>
> Ditto for make_ascii_uppercase. (But not the to_* methods.)
>
>> +    /// Returns a copy of this [`CString`] where each character is mapped to its
>> +    /// ASCII lower case equivalent.
>> +    ///
>> +    /// ASCII letters 'A' to 'Z' are mapped to 'a' to 'z',
>> +    /// but non-ASCII letters are unchanged.
>> +    ///
>> +    /// To lowercase the value in-place, use [`make_ascii_lowercase`].
>> +    ///
>> +    /// [`make_ascii_lowercase`]: str::make_ascii_lowercase
>> +    pub fn to_ascii_lowercase(&self) -> Result<CString, AllocError> {
>> +        let mut s = (*self).to_cstring()?;
>> +
>> +        s.make_ascii_lowercase();
>> +
>> +        return Ok(s);
>> +    }
>> +
>> +    /// Returns a copy of this [`CString`] where each character is mapped to its
>> +    /// ASCII upper case equivalent.
>> +    ///
>> +    /// ASCII letters 'a' to 'z' are mapped to 'A' to 'Z',
>> +    /// but non-ASCII letters are unchanged.
>> +    ///
>> +    /// To uppercase the value in-place, use [`make_ascii_uppercase`].
>> +    ///
>> +    /// [`make_ascii_uppercase`]: str::make_ascii_uppercase
>> +    pub fn to_ascii_uppercase(&self) -> Result<CString, AllocError> {
>> +        let mut s = (*self).to_cstring()?;
>> +
>> +        s.make_ascii_uppercase();
>> +
>> +        return Ok(s);
>> +    }
>
> Please move these to `CStr` as well.

That would result in two copies if I actually want a CString, wouldn't it?

Also, what would be the use case? And even if someone wants to have a CStr
again, couldn't we just deref the resulting CString?

- Danilo

>
>> +impl DerefMut for CString {
>> +    fn deref_mut(&mut self) -> &mut Self::Target {
>> +        unsafe { CStr::from_bytes_with_nul_unchecked_mut(&mut *self.buf) }
>> +    }
>> +}
>
> Needs a safety comment.
>
>     impl DerefMut for CString {
>         fn deref_mut(&mut self) -> &mut Self::Target {
>             // SAFETY: A `CString` is always NUL-terminated and contains no
>             // other NUL bytes.
>             unsafe { CStr::from_bytes_with_nul_unchecked_mut(&mut *self.buf) }
>         }
>     }
>
> Alice
>
On 2/16/24 18:11, Danilo Krummrich wrote:
> On 2/16/24 17:53, Alice Ryhl wrote:
>>> +    /// Returns a copy of this [`CString`] where each character is mapped to its
>>> +    /// ASCII upper case equivalent.
>>> +    ///
>>> +    /// ASCII letters 'a' to 'z' are mapped to 'A' to 'Z',
>>> +    /// but non-ASCII letters are unchanged.
>>> +    ///
>>> +    /// To uppercase the value in-place, use [`make_ascii_uppercase`].
>>> +    ///
>>> +    /// [`make_ascii_uppercase`]: str::make_ascii_uppercase
>>> +    pub fn to_ascii_uppercase(&self) -> Result<CString, AllocError> {
>>> +        let mut s = (*self).to_cstring()?;
>>> +
>>> +        s.make_ascii_uppercase();
>>> +
>>> +        return Ok(s);
>>> +    }
>>
>> Please move these to `CStr` as well.
>
> That would result in two copies if I actually want a CString, wouldn't
> it?
>
> Also, what would be the use case? And even if someone wants to have a CStr
> again, couldn't we just deref the resulting CString?

To clarify, I want you to move it to the `impl CStr` block. That changes
the type of the `self` argument. I don't want you to change the return
type - that should still be `CString`.

Currently, if I have a `&CStr` and I want an uppercase `CString`, I can't
do that with this method.

Alice
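A rough userspace sketch of the shape being asked for, under the assumption of locally defined `CStr`/`CString` stand-ins: `to_ascii_uppercase` lives in `impl CStr`, takes `&self` as a `&CStr`, and still returns an owned `CString`. Fallible allocation is modelled here with `Vec::try_reserve_exact` and `TryReserveError`, which stands in for the kernel's `AllocError`; the checked `from_bytes_with_nul` returning `Option` is a simplification for the sketch.

```rust
use std::collections::TryReserveError;
use std::ops::{Deref, DerefMut};

/// Local stand-in for the kernel's `CStr` (assumption for this sketch).
#[repr(transparent)]
pub struct CStr([u8]);

/// Local stand-in for the kernel's `CString` (assumption for this sketch).
pub struct CString {
    buf: Vec<u8>,
}

impl CStr {
    /// Checked constructor: exactly one NUL byte, at the very end.
    pub fn from_bytes_with_nul(bytes: &[u8]) -> Option<&CStr> {
        match bytes.iter().position(|&b| b == 0) {
            // SAFETY: `CStr` is a transparent wrapper around `[u8]`, and the
            // NUL requirements were just checked.
            Some(i) if i + 1 == bytes.len() => {
                Some(unsafe { &*(bytes as *const [u8] as *const CStr) })
            }
            _ => None,
        }
    }

    pub fn as_bytes_with_nul(&self) -> &[u8] {
        &self.0
    }

    /// Copies the string into an owned `CString`; this is the only copy made.
    pub fn to_cstring(&self) -> Result<CString, TryReserveError> {
        let mut buf: Vec<u8> = Vec::new();
        buf.try_reserve_exact(self.0.len())?;
        buf.extend_from_slice(&self.0);
        Ok(CString { buf })
    }

    pub fn make_ascii_uppercase(&mut self) {
        // INVARIANT: ASCII case mapping never introduces or removes NUL bytes.
        self.0.make_ascii_uppercase();
    }

    /// Defined on `CStr`, so it works for `&CStr` and, via `Deref`, `CString`.
    pub fn to_ascii_uppercase(&self) -> Result<CString, TryReserveError> {
        let mut s = self.to_cstring()?;
        s.make_ascii_uppercase();
        Ok(s)
    }
}

impl Deref for CString {
    type Target = CStr;

    fn deref(&self) -> &CStr {
        // SAFETY: `buf` is always NUL-terminated and has no other NUL bytes.
        unsafe { &*(self.buf.as_slice() as *const [u8] as *const CStr) }
    }
}

impl DerefMut for CString {
    fn deref_mut(&mut self) -> &mut CStr {
        // SAFETY: `buf` is always NUL-terminated and has no other NUL bytes.
        unsafe { &mut *(self.buf.as_mut_slice() as *mut [u8] as *mut CStr) }
    }
}
```

In this model the two-copies concern does not arise: the method allocates once in `to_cstring` and mutates that buffer in place, whether the receiver is a borrowed `&CStr` or a `CString` reached through `Deref`.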
diff --git a/rust/kernel/str.rs b/rust/kernel/str.rs
index 7d848b83add4..02d6e510b852 100644
--- a/rust/kernel/str.rs
+++ b/rust/kernel/str.rs
@@ -5,7 +5,7 @@
 use alloc::alloc::AllocError;
 use alloc::vec::Vec;
 use core::fmt::{self, Write};
-use core::ops::{self, Deref, Index};
+use core::ops::{self, Deref, DerefMut, Index};
 
 use crate::{
     bindings,
@@ -143,6 +143,19 @@ pub const fn from_bytes_with_nul(bytes: &[u8]) -> Result<&Self, CStrConvertError
         unsafe { core::mem::transmute(bytes) }
     }
 
+    /// Creates a mutable [`CStr`] from a `[u8]` without performing any
+    /// additional checks.
+    ///
+    /// # Safety
+    ///
+    /// `bytes` *must* end with a `NUL` byte, and should only have a single
+    /// `NUL` byte (or the string will be truncated).
+    #[inline]
+    pub const unsafe fn from_bytes_with_nul_unchecked_mut(bytes: &mut [u8]) -> &mut CStr {
+        // SAFETY: Properties of `bytes` guaranteed by the safety precondition.
+        unsafe { &mut *(bytes as *mut [u8] as *mut CStr) }
+    }
+
     /// Returns a C pointer to the string.
     #[inline]
     pub const fn as_char_ptr(&self) -> *const core::ffi::c_char {
@@ -206,6 +219,32 @@ pub unsafe fn as_str_unchecked(&self) -> &str {
     pub fn to_cstring(&self) -> Result<CString, AllocError> {
         CString::try_from(self)
     }
+
+    /// Converts this [`CStr`] to its ASCII lower case equivalent in-place.
+    ///
+    /// ASCII letters 'A' to 'Z' are mapped to 'a' to 'z',
+    /// but non-ASCII letters are unchanged.
+    ///
+    /// To return a new lowercased value without modifying the existing one, use
+    /// [`to_ascii_lowercase()`].
+    ///
+    /// [`to_ascii_lowercase()`]: #method.to_ascii_lowercase
+    pub fn make_ascii_lowercase(&mut self) {
+        self.0.make_ascii_lowercase();
+    }
+
+    /// Converts this [`CStr`] to its ASCII upper case equivalent in-place.
+    ///
+    /// ASCII letters 'a' to 'z' are mapped to 'A' to 'Z',
+    /// but non-ASCII letters are unchanged.
+    ///
+    /// To return a new uppercased value without modifying the existing one, use
+    /// [`to_ascii_uppercase()`].
+    ///
+    /// [`to_ascii_uppercase()`]: #method.to_ascii_uppercase
+    pub fn make_ascii_uppercase(&mut self) {
+        self.0.make_ascii_uppercase();
+    }
 }
 
 impl fmt::Display for CStr {
@@ -581,6 +620,40 @@ pub fn try_from_fmt(args: fmt::Arguments<'_>) -> Result<Self, Error> {
         // exist in the buffer.
         Ok(Self { buf })
     }
+
+    /// Returns a copy of this [`CString`] where each character is mapped to its
+    /// ASCII lower case equivalent.
+    ///
+    /// ASCII letters 'A' to 'Z' are mapped to 'a' to 'z',
+    /// but non-ASCII letters are unchanged.
+    ///
+    /// To lowercase the value in-place, use [`make_ascii_lowercase`].
+    ///
+    /// [`make_ascii_lowercase`]: str::make_ascii_lowercase
+    pub fn to_ascii_lowercase(&self) -> Result<CString, AllocError> {
+        let mut s = (*self).to_cstring()?;
+
+        s.make_ascii_lowercase();
+
+        return Ok(s);
+    }
+
+    /// Returns a copy of this [`CString`] where each character is mapped to its
+    /// ASCII upper case equivalent.
+    ///
+    /// ASCII letters 'a' to 'z' are mapped to 'A' to 'Z',
+    /// but non-ASCII letters are unchanged.
+    ///
+    /// To uppercase the value in-place, use [`make_ascii_uppercase`].
+    ///
+    /// [`make_ascii_uppercase`]: str::make_ascii_uppercase
+    pub fn to_ascii_uppercase(&self) -> Result<CString, AllocError> {
+        let mut s = (*self).to_cstring()?;
+
+        s.make_ascii_uppercase();
+
+        return Ok(s);
+    }
 }
 
 impl Deref for CString {
@@ -593,6 +666,12 @@ fn deref(&self) -> &Self::Target {
     }
 }
 
+impl DerefMut for CString {
+    fn deref_mut(&mut self) -> &mut Self::Target {
+        unsafe { CStr::from_bytes_with_nul_unchecked_mut(&mut *self.buf) }
+    }
+}
+
 impl<'a> TryFrom<&'a CStr> for CString {
     type Error = AllocError;