Message ID | 20240129141516.198636-3-rodrigo@sdfg.com.ar |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-42867-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7301:2087:b0:106:209c:c626 with SMTP id gs7csp594016dyb; Mon, 29 Jan 2024 06:16:30 -0800 (PST) X-Google-Smtp-Source: AGHT+IH1Eq2r9z25uYhYIQn3KPDCG7lFGJFpWtXyBq22is4jbQjxEc81kbxziSmUDb1CBYNBQIzB X-Received: by 2002:a50:8a97:0:b0:55f:2078:32bd with SMTP id j23-20020a508a97000000b0055f207832bdmr715712edj.23.1706537790567; Mon, 29 Jan 2024 06:16:30 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1706537790; cv=pass; d=google.com; s=arc-20160816; b=qHN+MWPGSp7v0U7pIEYmC806exSXqYbdal8pwckBZzdg2PDpYKs4zNjIfw7TRXtVsF OzmobRXxk5g/GaGgQgxV3bZAKyJmVLJxWCfF7QHLbLQ1HkhxGW8uKmcSqV9dZSFXSxMX yERe0W1I63uzG/gBuarvpc+2azGb+hLq+AOg++73rsgQ4T9FW4P0Up0sOelsw5l9MnpJ HwZndHSLADFLYJyXelVJAVcjj9teRMKXFkdGk3csTq2H0Z2eoHGZVud16SBB7Z/vOMYN ksKgkgXaFPF4BV2upIzKCpeDox/NU8jwz4+HMnVE5BgJctmOl+wNyltH3gZWigmWosRn L5aQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from; bh=5e/dGTJ7z+T+XCFJMU5fRg8FRn3HUWe9uaelPfJoLdM=; fh=1pO/fnqlbA1DlSshLp2Kp23nBXobkeZ558bQEuNU3Z8=; b=ZBNcncfhlCXAQm+KGFSRwRSL1R+qtc5w9Nlrp61INnpx77i6zO4yHfabVtFhfnPOBa P+n/9UPbzGf4jdElt5yw98xvXo7e4EE9JE1ZfJkS+CRrPRvucDHO78h3JE36fQ0FADsE KujKqiR+W2DTldwnA2+bgG+j1Jr1rmvq0nfZc4y1JTv80v+A07nLVNl6ZvTb2LLdrbDG fhv0jQkt2USxF0y8Ru+MFGI7DZOuyALp7OB+emVp6QezFnLvy75dGh15ysM4O/siKucd ZVOw4hSZ4hO2QX3eYLmW4Ij+kUNKRx/fsnDKWz/K72lmC7fSOpyLyKGqT6Ht/VV2bjON 8FnA== ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=sdfg.com.ar); spf=pass (google.com: domain of linux-kernel+bounces-42867-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-42867-ouuuleilei=gmail.com@vger.kernel.org" Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [147.75.80.249]) by mx.google.com with ESMTPS id c67-20020a509fc9000000b0055f2e703b53si9233edf.464.2024.01.29.06.16.30 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 29 Jan 2024 06:16:30 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-42867-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) client-ip=147.75.80.249; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=sdfg.com.ar); spf=pass (google.com: domain of linux-kernel+bounces-42867-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-42867-ouuuleilei=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 2F74D1F216AF for <ouuuleilei@gmail.com>; Mon, 29 Jan 2024 14:16:30 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id BDBFF66B29; Mon, 29 Jan 2024 14:15:47 +0000 (UTC) Received: from alerce.blitiri.com.ar (alerce.blitiri.com.ar [49.12.208.134]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A820D65BA7 for <linux-kernel@vger.kernel.org>; Mon, 29 Jan 2024 14:15:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=49.12.208.134 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706537746; cv=none; b=pq5DRz2rjN3jJQbJIVtGb55oIK/k/qoNVQ4WCsJRja+DTT2rBe8SklvSa/ymrBrjBDE+TJORCUylPkna54BylaeJO+cW9n3KKr7ptjDyyuhlGOoPeVJWgBpCShBTQxmVYGisjEr5f5LSBw77Txkhgtz4mQQUaMBHIeSZPQ7wPu0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706537746; c=relaxed/simple; bh=t2kPYJ7pezKsYGsHLmpMN2up6sNljv6a0crp2CwK1j4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ny/Cha+7dUvuLSLt0ZY79w8O6gX9FQxl7vWdMKfpUH/aj0hxUxSqunVDnnq1s5JSbP7tG9nEVUjjmh4BOfy6d7xTu099Wv26Mqgh8MYksuFxH3/hW2VPR06leWWEjicw4wToeUmMtXC/s9KJiOqqLY2CdHjqFozXoJ35t+uwAO0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=sdfg.com.ar; spf=pass smtp.mailfrom=sdfg.com.ar; arc=none smtp.client-ip=49.12.208.134 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=sdfg.com.ar Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=sdfg.com.ar Received: from localhost.localdomain by sdfg.com.ar (chasquid) with ESMTPSA tls TLS_AES_128_GCM_SHA256 (over submission, TLS-1.3, envelope from "rodrigo@sdfg.com.ar") ; Mon, 29 Jan 2024 14:15:35 +0000 From: Rodrigo Campos <rodrigo@sdfg.com.ar> To: Willy Tarreau <w@1wt.eu>, =?utf-8?q?Thomas_Wei=C3=9Fschuh?= <linux@weissschuh.net> Cc: linux-kernel@vger.kernel.org, Rodrigo Campos <rodrigo@sdfg.com.ar> Subject: [PATCH 2/4] tools/nolibc: Fix strlcat() return code and size usage Date: Mon, 29 Jan 2024 15:15:14 +0100 Message-ID: <20240129141516.198636-3-rodrigo@sdfg.com.ar> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240129141516.198636-1-rodrigo@sdfg.com.ar> References: <20240129141516.198636-1-rodrigo@sdfg.com.ar> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1789434570278994019 X-GMAIL-MSGID: 1789434570278994019 |
Series |
tools/nolibc: Misc fixes for strlcpy() and strlcat()
|
|
Commit Message
Rodrigo Campos
Jan. 29, 2024, 2:15 p.m. UTC
The return code should always be strlen(src) + strlen(dst), but dst is
considered shorter if size is less than strlen(dst).
While we are there, make sure to copy at most size-1 bytes and
null-terminate the dst buffer.
Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar>
---
tools/include/nolibc/string.h | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
Comments
Hi Rodrigo, first, thanks for the series! On Mon, Jan 29, 2024 at 03:15:14PM +0100, Rodrigo Campos wrote: > The return code should always be strlen(src) + strlen(dst), but dst is > considered shorter if size is less than strlen(dst). > > While we are there, make sure to copy at most size-1 bytes and > null-terminate the dst buffer. > > Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar> > --- > tools/include/nolibc/string.h | 14 +++++++------- > 1 file changed, 7 insertions(+), 7 deletions(-) > > diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h > index ed15c22b1b2a..b2149e1342a8 100644 > --- a/tools/include/nolibc/string.h > +++ b/tools/include/nolibc/string.h > @@ -187,23 +187,23 @@ char *strndup(const char *str, size_t maxlen) > static __attribute__((unused)) > size_t strlcat(char *dst, const char *src, size_t size) > { > - size_t len; > char c; > + size_t len = strlen(dst); > + size_t ret = strlen(src) + (size < len? size: len); From what I'm reading in the man page, ret should *always* be the sum of the two string lengths. I guess it helps for reallocation. It's even explicitly mentioned: "While this may seem somewhat confusing, it was done to make truncation detection simple." Above ret will be bound to the existing size so a realloc wouldn't work. Thus I think the correct solution is: size_t ret = strlen(src) + len; > - for (len = 0; dst[len]; len++) > - ; > - > - for (;;) { > + for (;len < size;) { > c = *src; > - if (len < size) > + if (len < size - 1) > dst[len] = c; > + if (len == size - 1) > + dst[len] = '\0'; > if (!c) > break; > len++; > src++; > } > > - return len; > + return ret; > } The test inside the loop is going to make this not very efficient. Same for the fact that we're measuring the length of src twice (once via strlen, a second time through the loop). I've just had a look and it compiles to 77 bytes at -Os. A simpler variant would consist in trying to copy what fits in <size> and once reached, go on just for trailing zero and the size measurement: size_t strlcat(char *dst, const char *src, size_t size) { size_t len = strlen(dst); while (len < size) { if (!(dst[len] = *src)) return len; len++; src++; } /* end of src not found before size */ if (size) dst[size - 1] = '\0'; while (*src++) len++; return len; } For me it's 58 bytes, or 19 less / 25% smaller, and at first glance it should do the right thing as well. What do you think ? Thanks! Willy
On 2/11/24 11:48, Willy Tarreau wrote: > Hi Rodrigo, > > first, thanks for the series! Thank you, for your time and review! :) > On Mon, Jan 29, 2024 at 03:15:14PM +0100, Rodrigo Campos wrote: >> The return code should always be strlen(src) + strlen(dst), but dst is >> considered shorter if size is less than strlen(dst). >> >> While we are there, make sure to copy at most size-1 bytes and >> null-terminate the dst buffer. >> >> Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar> >> --- >> tools/include/nolibc/string.h | 14 +++++++------- >> 1 file changed, 7 insertions(+), 7 deletions(-) >> >> diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h >> index ed15c22b1b2a..b2149e1342a8 100644 >> --- a/tools/include/nolibc/string.h >> +++ b/tools/include/nolibc/string.h >> @@ -187,23 +187,23 @@ char *strndup(const char *str, size_t maxlen) >> static __attribute__((unused)) >> size_t strlcat(char *dst, const char *src, size_t size) >> { >> - size_t len; >> char c; >> + size_t len = strlen(dst); >> + size_t ret = strlen(src) + (size < len? size: len); > > From what I'm reading in the man page, ret should *always* be the sum > of the two string lengths. I guess it helps for reallocation. It's even > explicitly mentioned: > > "While this may seem somewhat confusing, it was done to make truncation > detection simple." Yes, that was my *first* understanding of the manpage too. But it doesn't seem to be the correct interpretation. Also, this is exactly what I tried to say in the cover letter, with: I thought the manpage was clear, but when checking against that, I noted a few twists (like the manpage says the return code of strlcat is strlen(src) + strlen(dst), but it was not clear it is not that if size < strlen(dst). When looking at the libbsd implementation and re-reading the manpage, I understood what it really meant). Sorry if I wasn't clear. Let me try again. My first interpretation of the manpage was also that, and I think it would make sense to be that one. However, it is wrong, they also say this, that is what made me add the ternary operator: Note, however, that if strlcat() traverses size characters without finding a NUL, the length of the string is considered to be size and the destination string will not be NUL terminated (since there was no space for the NUL) So, my interpretation is that it is the sum of both, except when we can't find the NUL in that size, in which case we conside "size" to be the len of dst. If you compare it with the output of libbsd, the return code seems to be exactly that. I was surprised too, as the manpage seem so clear... :-/ > Above ret will be bound to the existing size so a realloc wouldn't work. > Thus I think the correct solution is: Note that this implementation fails the tests added on patch 4. I've tested them (output and return code) to match the libbsd implementation. > The test inside the loop is going to make this not very efficient. Same > for the fact that we're measuring the length of src twice (once via > strlen, a second time through the loop). I've just had a look and it > compiles to 77 bytes at -Os. A simpler variant would consist in trying How are you measuring the size? I've added noinline to strlcat to the version I sent, so now it is shown in nm, but I have this as output: $ nm --size -t x test.o 0000000000000004 V errno 0000000000000006 t strlcat.constprop.0 0000000000000008 V _auxv 0000000000000008 V environ 000000000000000e W strlen 000000000000000f W _start 0000000000000018 W raise 000000000000001b W abort 000000000000004c T main 000000000000005a t u64toa_r.isra.0 0000000000000095 W _start_c 00000000000002a8 t printf How are you measuring it there? Sorry, I'm not familiar with this :) > to copy what fits in <size> and once reached, go on just for trailing > zero and the size measurement: > > size_t strlcat(char *dst, const char *src, size_t size) > { > size_t len = strlen(dst); The thing is, we need to return always at least strlen(src). Maybe plus something else. Even if size==0 and we don't copy anything. Maybe we can special case that, so we simplify the loop, and it's smaller? I've been playing, but as I can't measure the size, I'm not sure what is really better. I'd like to play a little once I know how to measure it :) Best, Rodrigo
On 2/13/24 00:16, Rodrigo Campos wrote: > On 2/11/24 11:48, Willy Tarreau wrote: >> The test inside the loop is going to make this not very efficient. Same >> for the fact that we're measuring the length of src twice (once via >> strlen, a second time through the loop). I've just had a look and it >> compiles to 77 bytes at -Os. A simpler variant would consist in trying What code are you compiling that uses strlcat? I've tried dropping almost all the flags to leave only: -Os -nostdlib -lgcc And I still don't see the 77 bytes. > $ nm --size -t x test.o > 0000000000000004 V errno > 0000000000000006 t strlcat.constprop.0 > 0000000000000008 V _auxv > 0000000000000008 V environ > 000000000000000e W strlen > 000000000000000f W _start > 0000000000000018 W raise > 000000000000001b W abort > 000000000000004c T main > 000000000000005a t u64toa_r.isra.0 > 0000000000000095 W _start_c > 00000000000002a8 t printf > > How are you measuring it there? btw, sorry for the late reply. I was on a flight that was rebooked, it took me some days to arrive. But I finally arrived and have internet now :) The constprop seems to be some gcc optimization to simplify things. I see what happens. The example I was compiling was left with size=0 (a leftover from when I was testing the code). That is very easy for gcc to optimize it all out. That is why it can make it so small. If I compile the example from my original email, though, I still don't see the 77 bytes you mention. I see 31 in the nm output, that in decimal is 49 bytes. Maybe tomorrow with some sleep after the long flight, I see what else is happening :) Best, Rodrigo
On 2/11/24 11:48, Willy Tarreau wrote: > For me it's 58 bytes, or 19 less / 25% smaller, and at first glance it > should do the right thing as well. Oh, here I get almost the same (57 bytes), so it must be the example we are compiling what is different. I'll check this after some sleep :)
On 2/13/24 00:16, Rodrigo Campos wrote: > On 2/11/24 11:48, Willy Tarreau wrote: >> Hi Rodrigo, >> >> first, thanks for the series! > > Thank you, for your time and review! :) > >> On Mon, Jan 29, 2024 at 03:15:14PM +0100, Rodrigo Campos wrote: >>> The return code should always be strlen(src) + strlen(dst), but dst is >>> considered shorter if size is less than strlen(dst). >>> >>> While we are there, make sure to copy at most size-1 bytes and >>> null-terminate the dst buffer. >>> >>> Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar> >>> --- >>> tools/include/nolibc/string.h | 14 +++++++------- >>> 1 file changed, 7 insertions(+), 7 deletions(-) >>> >>> diff --git a/tools/include/nolibc/string.h >>> b/tools/include/nolibc/string.h >>> index ed15c22b1b2a..b2149e1342a8 100644 >>> --- a/tools/include/nolibc/string.h >>> +++ b/tools/include/nolibc/string.h >>> @@ -187,23 +187,23 @@ char *strndup(const char *str, size_t maxlen) >>> static __attribute__((unused)) >>> size_t strlcat(char *dst, const char *src, size_t size) >>> { >>> - size_t len; >>> char c; >>> + size_t len = strlen(dst); >>> + size_t ret = strlen(src) + (size < len? size: len); >> >> From what I'm reading in the man page, ret should *always* be the sum >> of the two string lengths. I guess it helps for reallocation. It's even >> explicitly mentioned: >> >> "While this may seem somewhat confusing, it was done to make >> truncation >> detection simple." > > Yes, that was my *first* understanding of the manpage too. But it > doesn't seem to be the correct interpretation. > > Also, this is exactly what I tried to say in the cover letter, with: > > I thought the manpage was clear, but when checking against that, > I noted a few twists (like the manpage says the return code of > strlcat is strlen(src) + strlen(dst), but it was not clear it is > not that if size < strlen(dst). When looking at the libbsd > implementation and re-reading the manpage, I understood what it > really meant). > > > Sorry if I wasn't clear. Let me try again. > > My first interpretation of the manpage was also that, and I think it > would make sense to be that one. However, it is wrong, they also say > this, that is what made me add the ternary operator: > > Note, however, that if strlcat() traverses size characters > without finding a NUL, the length of the string is considered > to be size and the destination string will not be NUL > terminated (since there was no space for the NUL) > > So, my interpretation is that it is the sum of both, except when we > can't find the NUL in that size, in which case we conside "size" to be > the len of dst. > > If you compare it with the output of libbsd, the return code seems to be > exactly that. I was surprised too, as the manpage seem so clear... :-/ I'm not convinced now if that is the right interpretation. I'll check the libbsd code, I don't remember it now, it's been a few days already. My memory is that it was not something so sane.
Hi Rodrigo, On Tue, Feb 13, 2024 at 12:16:06AM +0100, Rodrigo Campos wrote: > On 2/11/24 11:48, Willy Tarreau wrote: > > Hi Rodrigo, > > > > first, thanks for the series! > > Thank you, for your time and review! :) You're welcome. I'm sorry I couldn't respond earlier. > > On Mon, Jan 29, 2024 at 03:15:14PM +0100, Rodrigo Campos wrote: > > > The return code should always be strlen(src) + strlen(dst), but dst is > > > considered shorter if size is less than strlen(dst). > > > > > > While we are there, make sure to copy at most size-1 bytes and > > > null-terminate the dst buffer. > > > > > > Signed-off-by: Rodrigo Campos <rodrigo@sdfg.com.ar> > > > --- > > > tools/include/nolibc/string.h | 14 +++++++------- > > > 1 file changed, 7 insertions(+), 7 deletions(-) > > > > > > diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h > > > index ed15c22b1b2a..b2149e1342a8 100644 > > > --- a/tools/include/nolibc/string.h > > > +++ b/tools/include/nolibc/string.h > > > @@ -187,23 +187,23 @@ char *strndup(const char *str, size_t maxlen) > > > static __attribute__((unused)) > > > size_t strlcat(char *dst, const char *src, size_t size) > > > { > > > - size_t len; > > > char c; > > > + size_t len = strlen(dst); > > > + size_t ret = strlen(src) + (size < len? size: len); > > > > From what I'm reading in the man page, ret should *always* be the sum > > of the two string lengths. I guess it helps for reallocation. It's even > > explicitly mentioned: > > > > "While this may seem somewhat confusing, it was done to make truncation > > detection simple." > > Yes, that was my *first* understanding of the manpage too. But it doesn't > seem to be the correct interpretation. > > Also, this is exactly what I tried to say in the cover letter, with: > > I thought the manpage was clear, but when checking against that, > I noted a few twists (like the manpage says the return code of > strlcat is strlen(src) + strlen(dst), but it was not clear it is > not that if size < strlen(dst). When looking at the libbsd > implementation and re-reading the manpage, I understood what it > really meant). > > > Sorry if I wasn't clear. Let me try again. > > My first interpretation of the manpage was also that, and I think it would > make sense to be that one. However, it is wrong, they also say this, that is > what made me add the ternary operator: > > Note, however, that if strlcat() traverses size characters > without finding a NUL, the length of the string is considered > to be size and the destination string will not be NUL > terminated (since there was no space for the NUL) > > So, my interpretation is that it is the sum of both, except when we can't > find the NUL in that size, in which case we conside "size" to be the len of > dst. I've read it as well and I don't interpret it the same way. I'm reading it as "if dst doesn't contain a NUL char before size, its length is considered to be size", and the reason is explicitly explained just after: This keeps strlcat() from running off the end of a string. In practice this should not happen (as it means that either size is incorrect or that dst is not a proper ``C'' string). So this explicitly means that supporting this specific use case is by definition incompatible with the use of strlen(). Thus it's a matter of choice for us, either we explicitly want to support invalid strings in the destination and we need the check but we're not allowed to use strlen() on dst, or we're not interested in such bogus cases and we can stick to strlen() and the test is not needed. But the test combined with strlen() is not logical. > If you compare it with the output of libbsd, the return code seems to be > exactly that. I was surprised too, as the manpage seem so clear... :-/ I trust you since I have not tried. I understand the rationale, i.e. still being able to realloc() the new string, except that the caller must be extremely prudent here since there's no way for it to know that it faces a non-terminated string and that it must resort to a special code path that relies on two strlcpy() instead of strlcat(). > > Above ret will be bound to the existing size so a realloc wouldn't work. > > Thus I think the correct solution is: > > Note that this implementation fails the tests added on patch 4. I've tested > them (output and return code) to match the libbsd implementation. OK. > > The test inside the loop is going to make this not very efficient. Same > > for the fact that we're measuring the length of src twice (once via > > strlen, a second time through the loop). I've just had a look and it > > compiles to 77 bytes at -Os. A simpler variant would consist in trying > > How are you measuring the size? > > I've added noinline to strlcat to the version I sent, so now it is shown in > nm, but I have this as output: > > $ nm --size -t x test.o > 0000000000000004 V errno > 0000000000000006 t strlcat.constprop.0 > 0000000000000008 V _auxv > 0000000000000008 V environ > 000000000000000e W strlen > 000000000000000f W _start > 0000000000000018 W raise > 000000000000001b W abort > 000000000000004c T main > 000000000000005a t u64toa_r.isra.0 > 0000000000000095 W _start_c > 00000000000002a8 t printf > > How are you measuring it there? > > Sorry, I'm not familiar with this :) Oh, sorry for not providing the detalis! I just remove "static" in front of the function. The reason for this is that if the function is used only once in a program, and it is inlined, you never know in practice how it will be inlined, depending on the opportunities the compiler will face in the calling function. However, when it compiles it into its own function, you get a better picture at the emitted code. A second option, that's sometimes more convenient when you're hacking directly in the include files themselves, is to write wrappers around these functions and look at them. For example: $ cd tools/testing/selftests/nolibc/ $ printf "size_t test_strlcat(char *dst, const char *src, size_t size) { return strlcat(dst, src, size); }\n" | gcc-9.5 -xc -c -Os -I sysroot/x86/include -include nolibc.h - -o test.o $ nm --size test.o 0000000000000004 V errno 0000000000000008 V _auxv 0000000000000008 V environ 000000000000000f W _start 0000000000000018 W raise 000000000000001b W abort 0000000000000025 T test_strlcat 000000000000008d W _start_c $ objdump --disassemble=test_strlcat test.o ... I can't say I'm having a preference, it depends how I'm proceeding. When comparing multiple variants of a same function, I generally like to just copy them into a new file under different names so that I can compare them all at once. > > to copy what fits in <size> and once reached, go on just for trailing > > zero and the size measurement: > > > > size_t strlcat(char *dst, const char *src, size_t size) > > { > > size_t len = strlen(dst); > > The thing is, we need to return always at least strlen(src). Maybe plus > something else. Even if size==0 and we don't copy anything. Yes, but that's exactly what the function does, look at the end: while (*src++) len++; return len; > Maybe we can special case that, so we simplify the loop, and it's smaller? As a general rule, to keep code small it's best to avoid special cases and to make them part of the general case as much as possible. And if some special cases need to be made, as much as possible we need to arrange them around existing jump or return points, i.e. preferably just before a return statement that uses a value that was already computed, or sometimes around a break or continue in a loop, but conditions inside loops should be avoided as much as possible because compilers often maintain multiple indexes inside loops (e.g. both target pointer and a counter), while conditions also tend to consume registers that force the compiler to perform some permutations to keep all of its variables available, the worst being calling functions from loops since the ABI indicates that a number of registers are clobbered and need to be saved and restored. But these are essentially hints and need to be verified on a case-by-case basis. Similarly, some architectures provide convenient addressing mechanisms such as a=b[c+d] that we have on x86 and not necessarily elsewhere, and which favors the usage of offsets instead of pointers. But again it depends on a lot of variables. > I've been playing, but as I can't measure the size, I'm not sure what is > really better. I'd like to play a little once I know how to measure it :) Be careful not to spend too much time on it, nm --size and objdump are quickly addictive ;-) Willy
On 2/13/24 08:02, Willy Tarreau wrote: > Hi Rodrigo, > > On Tue, Feb 13, 2024 at 12:16:06AM +0100, Rodrigo Campos wrote: >> On 2/11/24 11:48, Willy Tarreau wrote: >>> Hi Rodrigo, > I've read it as well and I don't interpret it the same way. I'm reading it > as "if dst doesn't contain a NUL char before size, its length is considered > to be size", and the reason is explicitly explained just after: Very true, I mixed dst and source. Thanks! >> How are you measuring it there? >> >> Sorry, I'm not familiar with this :) > > Oh, sorry for not providing the detalis! I just remove "static" in front > of the function. Thanks for the examples and heuristics to keep code small :) >> I've been playing, but as I can't measure the size, I'm not sure what is >> really better. I'd like to play a little once I know how to measure it :) > > Be careful not to spend too much time on it, nm --size and objdump are > quickly addictive ;-) Haha, it is definitely addictive. I'll share some new versions in a minute.
On 2/11/24 11:48, Willy Tarreau wrote: > On Mon, Jan 29, 2024 at 03:15:14PM +0100, Rodrigo Campos wrote: > The test inside the loop is going to make this not very efficient. Same > for the fact that we're measuring the length of src twice (once via > strlen, a second time through the loop). I've just had a look and it > compiles to 77 bytes at -Os. A simpler variant would consist in trying The sizes here don't match that, even using gcc 9.5.0 from debian sid (your example in the other email was calling gcc 9.5). Here I see bigger sizes, even using the same line you share to compile. > For me it's 58 bytes, or 19 less / 25% smaller, and at first glance it > should do the right thing as well. I see 69 bytes for that func (nm --size says 45, that is in hex). The function I posted in the patchset I see it as 101 bytes, so that is here is 32 bytes less here. Here are two versions that are significantly shorter than the 101 bytes, that pass the tests (I added more to account for the src vs dst mismatch that was easy to pass tests when both buffers have the same size as they did before). size_t strlcat_rata(char *dst, const char *src, size_t size) { const char *orig_src = src; size_t len = 0; for (;len < size; len++) { if (dst[len] == '\0') break; } /* Let's copy len < n < size-1 times from src. * size is unsigned, so instead of size-1, that can wrap around, * let's use len + 1 */ while (len + 1 < size) { dst[len] = *src; if (*src == '\0') break; len++; src++; } if (src != orig_src) dst[len] = '\0'; while (*src++) len++; return len; } This one compiles to 81 bytes here using gcc 13.2.0 and to 83 using gcc 9.5.0. Compared to the one posted in the patchset, it is significantly smaller. One based in the version you posted (uses strlen(dst) instead), is this one: size_t strlcat_willy_fixed(char *dst, const char *src, size_t size) { const char *orig_src = src; size_t len = strlen(dst); if (size < len) len = size; for (;len + 1 < size; len++, src++) { dst[len] = *src; if (*src == '\0') break; } if (orig_src != src) dst[len] = '\0'; while (*src++) len++; return len; } Note the "if size < len, then len=size", I couldn't get rid of it because we need to account for the smaller size of dst if we don't get passed it for the return code. This one is 90 bytes here. What do you think? Can you make them shorter? If you like one of these, I can repost with the improved tests too.
On 2/14/24 16:34, Rodrigo Campos wrote: > size_t strlcat_rata(char *dst, const char *src, size_t size) > { > const char *orig_src = src; > size_t len = 0; > for (;len < size; len++) { > if (dst[len] == '\0') > break; > } If you think about it, this is strnlen() and what follows is strncat(). > size_t strlcat_willy_fixed(char *dst, const char *src, size_t size) > { > const char *orig_src = src; > size_t len = strlen(dst); > if (size < len) > len = size; Same here. We can simplify the code by using them, but the size doesn't seem to be smaller. Let me see what I can do.
On 2/14/24 23:03, Rodrigo Campos wrote: > On 2/14/24 16:34, Rodrigo Campos wrote: >> size_t strlcat_rata(char *dst, const char *src, size_t size) >> { >> const char *orig_src = src; >> size_t len = 0; >> for (;len < size; len++) { >> if (dst[len] == '\0') >> break; >> } > > If you think about it, this is strnlen() and what follows is strncat(). > >> size_t strlcat_willy_fixed(char *dst, const char *src, size_t size) >> { >> const char *orig_src = src; >> size_t len = strlen(dst); >> if (size < len) >> len = size; > > Same here. > > We can simplify the code by using them, but the size doesn't seem to be > smaller. Let me see what I can do. Yeap, third option would be this: size_t strlcat(char *dst, const char *src, size_t size) { size_t len = strnlen(dst, size); if(size > len) { strncpy(dst + len, src, size - len - 1); dst[size - 1] = '\0'; } return len + strlen(src); } I've tried to use strncpy return code to avoid doing the strlen(src), but that is only useful if we copy all of src. Otherwise, we still need to go till the end. And the function code ends-up being bigger because we can only do it inside the if (and return from there), so we have to add code there and keep the strlen(src) outside. I'm not sure about the size of this variant, as different programs give me different sizes. For example, if I use it in this program: int main(void) { char buf[10] = "test123456"; buf[4] = '\0'; char *src = "bar"; size_t ret = strlcat(buf, src, 9); printf("dst is: %s\n", buf); printf("ret is: %d\n", ret); printf("strlen(buf) is: %d\n", strlen(buf)); printf("strlen(src) is: %d\n", strlen(src)); return 0; } It is compiled to 74 bytes. This is smaller than the other two options I posted. But if I just use it in a wrapper function, as you suggested, like: size_t test_strlcat_nlen(char *dst, const char *src, size_t size) { return strlcat(dst, src, size); } And compile like this: gcc -xc -c -Os -fno-asynchronous-unwind-tables -fno-ident -s -Os -I sysroot/x86/include -include nolibc.h strlcat_sizes.c -o test.o or even like this, the result is the same: gcc -xc -c -Os -I sysroot/x86/include -include nolibc.h strlcat_sizes.c -o test.o The result is 86 bytes, which is similar to the previous versions (81 and 90 bytes they were). I haven't checked the assembler generated, I bet it is being smart and taking advantage of the set size param when it makes it so much smaller. I've tried calling it with more numbers than just "9", in lot of cases it is smaller than 86 bytes. I have the gut-feeling that for the types of programs that us nolibc, all params might be known (dst, src, size) and we won't need to keep a generic version of the function, the compiler will optimize it. Even calling it several times in the compiled program, forcing it out and inside the "if size > len)" case, the optimized version is still in the 76 bytes or so. So, the code is so simple, the size is almost the same and it seems the compiler has a better time optimizing it. I lean towards this version here. What do you think? Best, Rodrigo
Hi Rodrigo, On Wed, Feb 14, 2024 at 12:34:46PM -0300, Rodrigo Campos wrote: > Here are two versions that are significantly shorter than the 101 bytes, > that pass the tests (I added more to account for the src vs dst mismatch > that was easy to pass tests when both buffers have the same size as they did > before). > > size_t strlcat_rata(char *dst, const char *src, size_t size) > { > const char *orig_src = src; > size_t len = 0; > for (;len < size; len++) { > if (dst[len] == '\0') > break; > } > > /* Let's copy len < n < size-1 times from src. > * size is unsigned, so instead of size-1, that can wrap around, > * let's use len + 1 */ > while (len + 1 < size) { > dst[len] = *src; > if (*src == '\0') > break; > len++; > src++; > } > > if (src != orig_src) > dst[len] = '\0'; > > while (*src++) > len++; > > return len; > } > > This one compiles to 81 bytes here using gcc 13.2.0 and to 83 using gcc > 9.5.0. Compared to the one posted in the patchset, it is significantly > smaller. OK this looks good to me. I think your test on src != orig_src is not trivial and that testing instead if (len < size) would be better, and possibly even shorter. > One based in the version you posted (uses strlen(dst) instead), is this one: > > size_t strlcat_willy_fixed(char *dst, const char *src, size_t size) > { > const char *orig_src = src; > size_t len = strlen(dst); > if (size < len) > len = size; > > for (;len + 1 < size; len++, src++) { > dst[len] = *src; > if (*src == '\0') > break; > } > > if (orig_src != src) > dst[len] = '\0'; > > while (*src++) > len++; > > return len; > } > > > Note the "if size < len, then len=size", I couldn't get rid of it because we > need to account for the smaller size of dst if we don't get passed it for > the return code. Please no, again as I mentioned earlier, it's wrong to use strlen(dst) in this case: the only situation where we'd accept size < len is if dst is already not zero-terminated, which means that strlen() cannot be used, or you'd need strnlen() for that, I'm just seeing that we have it, I thought we didn't. > This one is 90 bytes here. See the point I was making? Sometimes the amount of fat added around a function call is important compared to just an increment and a comparison. Of course that's not always true but that's one such example. > What do you think? Can you make them shorter? I don't want to enter that activity again this week-end ;-) It's sufficient here. > If you like one of these, I can repost with the improved tests too. Yeah please check the few points above for your version, make sure to clean it up according to the kernel's coding style (empty line after variable declaration, */ on the next line for the multi-line comment etc) and that's fine. Thanks, Willy
On Wed, Feb 14, 2024 at 07:03:10PM -0300, Rodrigo Campos wrote: > On 2/14/24 16:34, Rodrigo Campos wrote: > > size_t strlcat_rata(char *dst, const char *src, size_t size) > > { > > const char *orig_src = src; > > size_t len = 0; > > for (;len < size; len++) { > > if (dst[len] == '\0') > > break; > > } > > If you think about it, this is strnlen() and what follows is strncat(). I agree, I just didn't remember we had strnlen() nor strncat(). Willy
On 2/18/24 07:20, Willy Tarreau wrote: > Hi Rodrigo, > > > OK this looks good to me. I'll use that version in the next send then, thanks! > I think your test on src != orig_src is not > trivial and that testing instead if (len < size) would be better, and > possibly even shorter. Fixed that, thanks. >> Note the "if size < len, then len=size", I couldn't get rid of it because we >> need to account for the smaller size of dst if we don't get passed it for >> the return code. > > Please no, again as I mentioned earlier, it's wrong to use strlen(dst) in > this case: the only situation where we'd accept size < len is if dst is > already not zero-terminated, which means that strlen() cannot be used, or > you'd need strnlen() for that, I'm just seeing that we have it, I thought > we didn't. Right, sorry. >> What do you think? Can you make them shorter? > > I don't want to enter that activity again this week-end ;-) It's sufficient > here. haha, fair :) >> If you like one of these, I can repost with the improved tests too. > > Yeah please check the few points above for your version, make sure to > clean it up according to the kernel's coding style (empty line after > variable declaration, */ on the next line for the multi-line comment > etc) and that's fine. Will do Best, Rodrigo
diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h index ed15c22b1b2a..b2149e1342a8 100644 --- a/tools/include/nolibc/string.h +++ b/tools/include/nolibc/string.h @@ -187,23 +187,23 @@ char *strndup(const char *str, size_t maxlen) static __attribute__((unused)) size_t strlcat(char *dst, const char *src, size_t size) { - size_t len; char c; + size_t len = strlen(dst); + size_t ret = strlen(src) + (size < len? size: len); - for (len = 0; dst[len]; len++) - ; - - for (;;) { + for (;len < size;) { c = *src; - if (len < size) + if (len < size - 1) dst[len] = c; + if (len == size - 1) + dst[len] = '\0'; if (!c) break; len++; src++; } - return len; + return ret; } static __attribute__((unused))