Message ID | 20240129141516.198636-4-rodrigo@sdfg.com.ar |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-42870-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7301:2087:b0:106:209c:c626 with SMTP id gs7csp594288dyb; Mon, 29 Jan 2024 06:16:52 -0800 (PST) X-Google-Smtp-Source: AGHT+IEmGeWDhddoPrbxR4eT2cHME1SRKJFPowUBTBbBJ2CE8p31Fz1p+absyGZBW3W1r+9H/gt9 X-Received: by 2002:a05:6102:343:b0:46b:403c:e9b9 with SMTP id e3-20020a056102034300b0046b403ce9b9mr1091034vsa.3.1706537812415; Mon, 29 Jan 2024 06:16:52 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1706537812; cv=pass; d=google.com; s=arc-20160816; b=y0+5Rx580v7rluYA/k1Jcvno2JbZYa2kSKhPZniRmFccris7iBl01GErSbKKo10mIR S841YpSp1rjC9HZN35JEBItz7JX+wkBwsfQzYpbHIXnsp28jd37d82zfNgp6suaHeIgt 48B0jiEwE7KTFHqbr+9nq/Ds5QXT1Z/R0skno2z6hDpzbSwC5pfdyQswXBS75YglfV9n l3XXwOOnTuXAweiV2GSDUEtK8tyM6KyWgzsYSd7VPKG3Y1u3ArP7lcn38FEHIlf1SGMl YY5qGHUL+spCk8tqQL2uRJBEDm/tTztrf8MeWlJ9rIX5oTs1CeXDfKw6o2n9P6bdHzwp Fg3A== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from; bh=QFMdsPJoQ2X/PkrIym4NHWnjqdqIwU3qklc0sweEXf4=; fh=1pO/fnqlbA1DlSshLp2Kp23nBXobkeZ558bQEuNU3Z8=; b=PhT+N9uhAWT+v82vHSsTKg2YaXVRilC5dv4uzIqAulMN8GCSC0ze2gFriwE46JrOEE BNbGqKdLODAvSGr9gZ4tFGmVvtPn4G8CG6HLdC/hKWhtRYDx81mI4hFZoMTpQYsjJgZo WpLA3FOE0B4oh/YuEdwLu84pYqXOgx0MJJUA/3cTWW3AoGBl6sjNwBcOjaqArtGBV2n6 3M7W2+ByWEihahJEAZWOTzQmwFwAgYAmxZBylMAUl8t8O6cSu2xVTb7f54EQoqEOML+W E3K5VFJc4OwseahvKQ5kl8kTt9JRkJwP9dZbrR+erp8+keeisoVcpgeYx1csPFAKMOZP gejA== ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=sdfg.com.ar); spf=pass (google.com: domain of linux-kernel+bounces-42870-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-42870-ouuuleilei=gmail.com@vger.kernel.org" Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id ib12-20020a0561022b8c00b0046b2d6199fdsi810482vsb.660.2024.01.29.06.16.52 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 29 Jan 2024 06:16:52 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-42870-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=sdfg.com.ar); spf=pass (google.com: domain of linux-kernel+bounces-42870-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-42870-ouuuleilei=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 35BC31C22727 for <ouuuleilei@gmail.com>; Mon, 29 Jan 2024 14:16:52 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id B7E1E67724; Mon, 29 Jan 2024 14:15:48 +0000 (UTC) Received: from alerce.blitiri.com.ar (alerce.blitiri.com.ar [49.12.208.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A819965BA4 for <linux-kernel@vger.kernel.org>; Mon, 29 Jan 2024 14:15:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=49.12.208.134 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706537747; cv=none; b=sP6oWoAtIgGVBQveYJuTRiIQq4y/l9d5M71CLVcBYtBiOJfNB8/B1EG93UAUUsxMtaCXTzhSueemVNvzFBbys4M3ZfSkUr4aotKDYd+/nBf2EIvY5OLz7U/e3vIXXHuEA7hvb7UbrMEdT4H+tnEyhNnzNIoHzNx5fWxulto5444= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706537747; c=relaxed/simple; bh=jJmfIz4w556Ptcd31JHp/E5FbsAW+xG8mxPMPAJNtj8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=u2/7FTuF6kEBELnYLRb+cmnwQY/vNEh2uMOqd4U+1QpYyAgSvaDZIodx+/zobhe+IZA9alRz9SFtBA6AoRkbIyF3Wstms/pPOpEqVsLus7PX7r3Wb6dpCcUqLo6zj3b5b9iu44/LdTdmqyHDFcr3cqFlr8ou2badJr+hKe6zUy0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=sdfg.com.ar; spf=pass smtp.mailfrom=sdfg.com.ar; arc=none smtp.client-ip=49.12.208.134 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=sdfg.com.ar Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=sdfg.com.ar Received: from localhost.localdomain by sdfg.com.ar (chasquid) with ESMTPSA tls TLS_AES_128_GCM_SHA256 (over submission, TLS-1.3, envelope from "rodrigo@sdfg.com.ar") ; Mon, 29 Jan 2024 14:15:36 +0000 From: Rodrigo Campos <rodrigo@sdfg.com.ar> To: Willy Tarreau <w@1wt.eu>, =?utf-8?q?Thomas_Wei=C3=9Fschuh?= <linux@weissschuh.net> Cc: linux-kernel@vger.kernel.org, Rodrigo Campos <rodrigo@sdfg.com.ar> Subject: [PATCH 3/4] tools/nolibc: Fix strlcpy() return code and size usage Date: Mon, 29 Jan 2024 15:15:15 +0100 Message-ID: <20240129141516.198636-4-rodrigo@sdfg.com.ar> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240129141516.198636-1-rodrigo@sdfg.com.ar> References: <20240129141516.198636-1-rodrigo@sdfg.com.ar> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1789434592883579293 X-GMAIL-MSGID: 1789434592883579293 |
Series |
tools/nolibc: Misc fixes for strlcpy() and strlcat()
|
|
Commit Message
Rodrigo Campos
Jan. 29, 2024, 2:15 p.m. UTC
The return code should always be strlen(src), and we should copy at most
size-1 bytes.
While we are there, make sure to null-terminate the dst buffer.
Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
---
tools/include/nolibc/string.h | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
Comments
Hi Rodrigo, On Mon, Jan 29, 2024 at 03:15:15PM +0100, Rodrigo Campos wrote: > The return code should always be strlen(src), and we should copy at most > size-1 bytes. > > While we are there, make sure to null-terminate the dst buffer. > > Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar> > --- > tools/include/nolibc/string.h | 9 +++++---- > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h > index b2149e1342a8..e4bc0302967d 100644 > --- a/tools/include/nolibc/string.h > +++ b/tools/include/nolibc/string.h > @@ -212,15 +212,16 @@ size_t strlcpy(char *dst, const char *src, size_t size) > size_t len; > char c; > > - for (len = 0;;) { > + for (len = 0; len < size; len++) { > c = src[len]; > - if (len < size) > + if (len < size - 1) > dst[len] = c; > + if (len == size - 1) > + dst[len] = '\0'; > if (!c) > break; > - len++; > } > - return len; > + return strlen(src); > } It's good, but for the same reason as the previous one, I'm getting smaller code by doing less in the loop. Also calling strlen() here looks expensive, I'm seeing that the compiler inlined it nevertheless and did it in a dep-optimized way due to the asm statement. That results in 67 bytes total while a simpler version gives 47. If I explicitly mark strlen() __attribute__((noinline)) that prevents it from doing so starting with gcc-10, where it correctly places a jump from strlcpy() to strlen() and ends up with 50 bytes (vs 44 for the alt one). The other one I can propose is directly derived from the other strlcat() variant, which first performs the copy and starts to count: size_t strlcpy(char *dst, const char *src, size_t size) { size_t len; for (len = 0; len < size; len++) { if (!(dst[len] = src[len])) return len; } /* end of src not found before size */ if (size) dst[size - 1] = '\0'; while (src[len]) len++; return len; } Just let me know what you think. And I think we should explicitly mark strlen() and the few other ones we're marking weak as noinline so that the compiler perfers a call there to inlining. Well, registers clobbering might not always be worth, but given that strlen() alone provides some savings I think it's still interesting. Thanks, Willy
On Sun, Feb 11, 2024 at 12:08:14PM +0100, Willy Tarreau wrote: > And I think we should explicitly mark > strlen() and the few other ones we're marking weak as noinline so that > the compiler perfers a call there to inlining. So actually the weak argument always prevents inlining from happening so this is not needed (I didn't have it in my previous test). Willy
On 2/11/24 12:08, Willy Tarreau wrote: > Hi Rodrigo, > > It's good, but for the same reason as the previous one, I'm getting > smaller code by doing less in the loop. Also calling strlen() here > looks expensive, I'm seeing that the compiler inlined it nevertheless > and did it in a dep-optimized way due to the asm statement. That > results in 67 bytes total while a simpler version gives 47. > > If I explicitly mark strlen() __attribute__((noinline)) that prevents > it from doing so starting with gcc-10, where it correctly places a jump > from strlcpy() to strlen() and ends up with 50 bytes (vs 44 for the alt > one). The other one I can propose is directly derived from the other > strlcat() variant, which first performs the copy and starts to count: > > size_t strlcpy(char *dst, const char *src, size_t size) > { > size_t len; > > for (len = 0; len < size; len++) { > if (!(dst[len] = src[len])) > return len; > } > > /* end of src not found before size */ > if (size) > dst[size - 1] = '\0'; > > while (src[len]) > len++; > > return len; > } > > Just let me know what you think. This is one is very nice, thanks! Sorry I didn't think about the size at all when writing the functions :) We can change the loop to be: for (len = 0; len < size; len++) { dst[len] = src[len]; if (!dst[len]) break; } That IMHO it is slightly more readable and makes it only 2 bytes longer here. What do you think? I'm fine with both, of course. If I resend, shall I add a suggested-by or directly you as the author? Best, Rodrigo
On Wed, Feb 14, 2024 at 12:50:53PM -0300, Rodrigo Campos wrote: > On 2/11/24 12:08, Willy Tarreau wrote: > > Hi Rodrigo, > > > > It's good, but for the same reason as the previous one, I'm getting > > smaller code by doing less in the loop. Also calling strlen() here > > looks expensive, I'm seeing that the compiler inlined it nevertheless > > and did it in a dep-optimized way due to the asm statement. That > > results in 67 bytes total while a simpler version gives 47. > > > > If I explicitly mark strlen() __attribute__((noinline)) that prevents > > it from doing so starting with gcc-10, where it correctly places a jump > > from strlcpy() to strlen() and ends up with 50 bytes (vs 44 for the alt > > one). The other one I can propose is directly derived from the other > > strlcat() variant, which first performs the copy and starts to count: > > > > size_t strlcpy(char *dst, const char *src, size_t size) > > { > > size_t len; > > > > for (len = 0; len < size; len++) { > > if (!(dst[len] = src[len])) > > return len; > > } > > > > /* end of src not found before size */ > > if (size) > > dst[size - 1] = '\0'; > > > > while (src[len]) > > len++; > > > > return len; > > } > > > > Just let me know what you think. > > This is one is very nice, thanks! > > Sorry I didn't think about the size at all when writing the functions :) Never be sorry, low-level user code like this is never trivial and that's the goal of the nolibc-test in the first place ;-) > We can change the loop to be: > > for (len = 0; len < size; len++) { > dst[len] = src[len]; > if (!dst[len]) > break; > } > > That IMHO it is slightly more readable and makes it only 2 bytes longer > here. It's not exactly the same, it will always write a zero at dst[size-1] due to the break statement. As much as I hate returns in the middle, this one made sense for this case. A goto to the final return statement is fine as well. > What do you think? I'm fine with both, of course. I'm fine with the more readable part (I also prefer it) but not the use of break here. > If I resend, shall I add a suggested-by or directly you as the author? No need for either, it's your work, my part was just a review and an addictive temptation to look at asm code ;-) Cheers, Willy
diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h index b2149e1342a8..e4bc0302967d 100644 --- a/tools/include/nolibc/string.h +++ b/tools/include/nolibc/string.h @@ -212,15 +212,16 @@ size_t strlcpy(char *dst, const char *src, size_t size) size_t len; char c; - for (len = 0;;) { + for (len = 0; len < size; len++) { c = src[len]; - if (len < size) + if (len < size - 1) dst[len] = c; + if (len == size - 1) + dst[len] = '\0'; if (!c) break; - len++; } - return len; + return strlen(src); } static __attribute__((unused))