Message ID | 20231010234153.021826b1@imladris.surriel.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:2908:b0:403:3b70:6f57 with SMTP id ib8csp284314vqb; Tue, 10 Oct 2023 20:42:12 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFPJpWwrVMz59r3dbpEu+dijAsaQsNMls40sY6Sz5831ANshoHH9G0hppuUTgsO60Wb4pJq X-Received: by 2002:a05:6808:1495:b0:3a8:458d:e1cc with SMTP id e21-20020a056808149500b003a8458de1ccmr25128642oiw.1.1696995731885; Tue, 10 Oct 2023 20:42:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1696995731; cv=none; d=google.com; s=arc-20160816; b=kyZQW0WqgzHryTHwTld2cpT1/2a4swM8DgvqYyBMjViu+Xh5449ICvz9FKlODMHwRV vv+UBFU7XRDClhSR+MXRfzKWmX1ygGw09gaX1rwzwpYuUNN+m8HQ5rGBKfq6n0vgeruc qFm/bpPzrL91K1y2vGj+zapqmPJnz1OIuoRSE3yuK4rejp861AMsyHxDe0qAsilQ02m8 Z5CmQ6i0whbmOB3WfnbcnfxLiMAxqHz6DEUJawxglSWKAMjVH+PSUCZ3rVh/zziCxdh4 DvZS8/xC2BIlgf4ZAzhAy8OhIyhFv+TG7FnwoLAv9AVLc20T3llhUR0/LSLyOT6x33vo PCkg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :message-id:subject:cc:to:from:date; bh=bLyQO33tDw0iksj7Gaz/Kn20E/yoVolkW/flR1G7DUs=; fh=IaHIGecTgtBZWzHFdribiMaBLwiIQP47sp/WWUYgmOw=; b=RrB73wP/yk0NP0dByx9PiqEwob3bFU6AEgMP/S6+3oMuTc5v7SggyzQ2XPXefHNhTj 7H5gBYVWNUXvYlAIyopucAa/BBykZ+/jSYOn12j686ckz1KCARSIy0jVwq1mo241xPNU rsj3G4ePQWCQ38MVpPlH4/N47gvBFqdECWTDrkBAMRnOEJ7DMviYmhDo6+Ixx7AudZP4 B7SgW/0v9oZVaiwaGKxq7bUl1A6KRJ+I0Ms5CijIJWpluqrMsoNc+TdmHda5mH9aDXqz zj2SIHdI3IqZ/MMP5UAJF38cc4XdmtE6ACC/JUNCDj60SVQRaNCGvLKqbjg0+Fv9prnh LKUA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from snail.vger.email (snail.vger.email. [2620:137:e000::3:7]) by mx.google.com with ESMTPS id d19-20020a637353000000b0059b950975b9si3159458pgn.534.2023.10.10.20.42.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 10 Oct 2023 20:42:11 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 262BF80657C3; Tue, 10 Oct 2023 20:42:11 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229496AbjJKDmI (ORCPT <rfc822;rua109.linux@gmail.com> + 19 others); Tue, 10 Oct 2023 23:42:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55618 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229445AbjJKDmH (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 10 Oct 2023 23:42:07 -0400 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4C0AD92 for <linux-kernel@vger.kernel.org>; Tue, 10 Oct 2023 20:42:06 -0700 (PDT) Received: from [2601:18c:9101:a8b6:6e0b:84ff:fee2:98bb] (helo=imladris.surriel.com) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96.1) (envelope-from <riel@shelob.surriel.com>) id 1qqQ6f-0006gO-0x; Tue, 10 Oct 2023 23:41:57 -0400 Date: Tue, 10 Oct 2023 23:41:53 -0400 From: Rik van Riel <riel@surriel.com> To: Alejandro Colomar <alx@kernel.org> Cc: linux-man@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>, kernel-team@meta.com, Eric Biederman <ebiederm@xmission.com> Subject: [PATCH] execve.2: execve also returns E2BIG if a string is too long Message-ID: <20231010234153.021826b1@imladris.surriel.com> X-Mailer: Claws Mail 4.1.1 (GTK 3.24.38; x86_64-redhat-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: riel@surriel.com X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Tue, 10 Oct 2023 20:42:11 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1779428996456082463 X-GMAIL-MSGID: 1779428996456082463 |
Series |
execve.2: execve also returns E2BIG if a string is too long
|
|
Commit Message
Rik van Riel
Oct. 11, 2023, 3:41 a.m. UTC
Document that if a command line or environment string is too long (> MAX_ARG_STRLEN), execve will also return E2BIG.
Signed-off-by: Rik van Riel <riel@surriel.com>
---
man2/execve.2 | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
Hi Rik, On Tue, Oct 10, 2023 at 11:41:53PM -0400, Rik van Riel wrote: > Document that if a command line or environment string is too long (> MAX_ARG_STRLEN), execve will also return E2BIG. That's already implied by the current text: E2BIG The total number of bytes in the environment (envp) and argument list (argv) is too large. That means that size_t bytes; bytes = 0; for (char *e = envp; e != NULL; e++) bytes += strlen(e) + 1; // I have doubts about the +1 for (char *a = argv; a != NULL; a++) bytes += strlen(a) + 1; // Same doubts if (bytes > MAX_ARG_STRLEN) // Maybe >= ? return -E2BIG; > > Signed-off-by: Rik van Riel <riel@surriel.com> > --- > man2/execve.2 | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/man2/execve.2 b/man2/execve.2 > index 0d9582492ad1..c1a359d01872 100644 > --- a/man2/execve.2 > +++ b/man2/execve.2 > @@ -449,7 +449,7 @@ The total number of bytes in the environment > .RI ( envp ) > and argument list > .RI ( argv ) > -is too large. > +is too large, or an argument or environment string is too long. Please use semantic newlines: $ MANWIDTH=72 man man-pages | sed -n '/Use semantic newlines/,/^$/p' Use semantic newlines In the source of a manual page, new sentences should be started on new lines, long sentences should be split into lines at clause breaks (commas, semicolons, colons, and so on), and long clauses should be split at phrase boundaries. This convention, sometimes known as "semantic newlines", makes it easier to see the effect of patches, which often operate at the level of in‐ dividual sentences, clauses, or phrases. Thanks, Alex > .TP > .B EACCES > Search permission is denied on a component of the path prefix of > -- > 2.41.0 > >
On Wed, 2023-10-11 at 12:41 +0200, Alejandro Colomar wrote: > Hi Rik, > > On Tue, Oct 10, 2023 at 11:41:53PM -0400, Rik van Riel wrote: > > Document that if a command line or environment string is too long > > (> MAX_ARG_STRLEN), execve will also return E2BIG. > > That's already implied by the current text: > > E2BIG The total number of bytes in the environment (envp) and > argument > list (argv) is too large. > > That means that > > size_t bytes; > > bytes = 0; > for (char *e = envp; e != NULL; e++) > bytes += strlen(e) + 1; // I have doubts about the +1 > for (char *a = argv; a != NULL; a++) > bytes += strlen(a) + 1; // Same doubts > > if (bytes > MAX_ARG_STRLEN) // Maybe >= ? > return -E2BIG; The code in fs/exec.c enforces MAX_ARG_STRLEN against each individual string, not against the total. If any string, either argument or environment, is larger than 32 * PAGE_SIZE, the kernel will return -E2BIG. do_execveat_common() has this code, which uses copy_strings to copy both the strings from the environment, and from the command line arguments: retval = copy_strings(bprm->envc, envp, bprm); if (retval < 0) goto out_free; retval = copy_strings(bprm->argc, argv, bprm); if (retval < 0) goto out_free; Inside copy_strings() we have this code: while (argc-- > 0) { ... len = strnlen_user(str, MAX_ARG_STRLEN); if (!len) goto out; ret = -E2BIG; if (!valid_arg_len(bprm, len)) goto out; The valid_arg_len() function does not need explanation: static bool valid_arg_len(struct linux_binprm *bprm, long len) { return len <= MAX_ARG_STRLEN; } The current man page wording is very clear about the total length being enforced, but IMHO not as clear about the limit that gets enforced on each individual string. The total length limit of environment & commandline arguments is enforced by bprm_stack_limits(), and is checked against either 1/4 of the maximum stack size, or 3/4 of _STK_LIM, whichever is smaller. The MAX_ARG_STRLEN value does not come into play when enforcing the total.
On Wed, Oct 11, 2023 at 9:21 AM Rik van Riel <riel@surriel.com> wrote: > On Wed, 2023-10-11 at 12:41 +0200, Alejandro Colomar wrote: > > Hi Rik, > > > > On Tue, Oct 10, 2023 at 11:41:53PM -0400, Rik van Riel wrote: > > > Document that if a command line or environment string is too long > > > (> MAX_ARG_STRLEN), execve will also return E2BIG. > > > > That's already implied by the current text: > > > > E2BIG The total number of bytes in the environment (envp) and > > argument > > list (argv) is too large. > > > > That means that > > > > size_t bytes; > > > > bytes = 0; > > for (char *e = envp; e != NULL; e++) > > bytes += strlen(e) + 1; // I have doubts about the +1 > > for (char *a = argv; a != NULL; a++) > > bytes += strlen(a) + 1; // Same doubts > > > > if (bytes > MAX_ARG_STRLEN) // Maybe >= ? > > return -E2BIG; > > The code in fs/exec.c enforces MAX_ARG_STRLEN against > each individual string, not against the total. > > If any string, either argument or environment, is larger > than 32 * PAGE_SIZE, the kernel will return -E2BIG. > > do_execveat_common() has this code, which uses copy_strings > to copy both the strings from the environment, and from > the command line arguments: > > retval = copy_strings(bprm->envc, envp, bprm); > if (retval < 0) > goto out_free; > > retval = copy_strings(bprm->argc, argv, bprm); > if (retval < 0) > goto out_free; > > Inside copy_strings() we have this code: > > > while (argc-- > 0) { > ... > len = strnlen_user(str, MAX_ARG_STRLEN); > if (!len) > goto out; > > ret = -E2BIG; > if (!valid_arg_len(bprm, len)) > goto out; > > The valid_arg_len() function does not need explanation: > > static bool valid_arg_len(struct linux_binprm *bprm, long len) > { > return len <= MAX_ARG_STRLEN; > } > > > The current man page wording is very clear about the total > length being enforced, but IMHO not as clear about the limit > that gets enforced on each individual string. > > The total length limit of environment & commandline arguments > is enforced by bprm_stack_limits(), and is checked against > either 1/4 of the maximum stack size, or 3/4 of _STK_LIM, whichever > is smaller. The MAX_ARG_STRLEN value does not come into play when > enforcing the total. To expand on this, there are basically two separate byte limits in fs/exec.c, one for each individual argv/envp string, and another for all strings and all pointers to them as a whole. To put the whole thing in pseudocode, the checks work effectively like this, assuming I haven't made any errors: int argc, envc; unsigned long bytes, limit; /* assume that argv has already been adjusted to add an empty argv[0] */ argc = 0, envc = 0, bytes = 0; for (char **a = argv; *a != NULL; a++, argc++) { if (strlen(*a) >= MAX_ARG_STRLEN) return -E2BIG; bytes += strlen(*a) + 1; } for (char **e = envp; *e != NULL; e++, envc++) { if (strlen(*e) >= MAX_ARG_STRLEN) return -E2BIG; bytes += strlen(*e) + 1; } if (argc > MAX_ARG_STRINGS || envc > MAX_ARG_STRINGS) return -E2BIG; bytes += (argc + envc) * sizeof(void *); limit = max(min(_STK_LIM / 4 * 3, rlim_stack.rlim_cur / 4), ARG_MAX); if (bytes > limit) return -E2BIG; Thank you, Matthew House
On Tue, Oct 10, 2023 at 11:51 PM Rik van Riel <riel@surriel.com> wrote: > Document that if a command line or environment string is too long (> MAX_ARG_STRLEN), execve will also return E2BIG. > > Signed-off-by: Rik van Riel <riel@surriel.com> It might be worth mentioning that strlen(pathname) must also be strictly less than MAX_ARG_STRLEN, it being subject to the same restrictions as each of the argv/envp strings. Thank you, Matthew House
Hi Rik, On Wed, Oct 11, 2023 at 09:21:28AM -0400, Rik van Riel wrote: > On Wed, 2023-10-11 at 12:41 +0200, Alejandro Colomar wrote: > > Hi Rik, > > > > On Tue, Oct 10, 2023 at 11:41:53PM -0400, Rik van Riel wrote: > > > Document that if a command line or environment string is too long > > > (> MAX_ARG_STRLEN), execve will also return E2BIG. > > > > That's already implied by the current text: > > > > E2BIG The total number of bytes in the environment (envp) and > > argument > > list (argv) is too large. > > > > That means that > > > > size_t bytes; > > > > bytes = 0; > > for (char *e = envp; e != NULL; e++) > > bytes += strlen(e) + 1; // I have doubts about the +1 > > for (char *a = argv; a != NULL; a++) > > bytes += strlen(a) + 1; // Same doubts > > > > if (bytes > MAX_ARG_STRLEN) // Maybe >= ? > > return -E2BIG; > > The code in fs/exec.c enforces MAX_ARG_STRLEN against > each individual string, not against the total. > > If any string, either argument or environment, is larger > than 32 * PAGE_SIZE, the kernel will return -E2BIG. > > do_execveat_common() has this code, which uses copy_strings > to copy both the strings from the environment, and from > the command line arguments: > > retval = copy_strings(bprm->envc, envp, bprm); > if (retval < 0) > goto out_free; > > retval = copy_strings(bprm->argc, argv, bprm); > if (retval < 0) > goto out_free; > > Inside copy_strings() we have this code: > > > while (argc-- > 0) { > ... > len = strnlen_user(str, MAX_ARG_STRLEN); > if (!len) > goto out; > > ret = -E2BIG; > if (!valid_arg_len(bprm, len)) > goto out; > > The valid_arg_len() function does not need explanation: > > static bool valid_arg_len(struct linux_binprm *bprm, long len) > { > return len <= MAX_ARG_STRLEN; > } > > > The current man page wording is very clear about the total > length being enforced, but IMHO not as clear about the limit > that gets enforced on each individual string. > > The total length limit of environment & commandline arguments > is enforced by bprm_stack_limits(), and is checked against > either 1/4 of the maximum stack size, or 3/4 of _STK_LIM, whichever > is smaller. The MAX_ARG_STRLEN value does not come into play when > enforcing the total. Ahh, so the limit for each string is different than the limit for the total. That makes sense. Sorry for the noise. Cheers, Alex > > -- > All Rights Reversed.
Hi Matthew, On Wed, Oct 11, 2023 at 09:44:29AM -0400, Matthew House wrote: > > To expand on this, there are basically two separate byte limits in > fs/exec.c, one for each individual argv/envp string, and another for all > strings and all pointers to them as a whole. To put the whole thing in > pseudocode, the checks work effectively like this, assuming I haven't made > any errors: > > int argc, envc; > unsigned long bytes, limit; > > /* assume that argv has already been adjusted to add an empty argv[0] */ > argc = 0, envc = 0, bytes = 0; > for (char **a = argv; *a != NULL; a++, argc++) { > if (strlen(*a) >= MAX_ARG_STRLEN) > return -E2BIG; > bytes += strlen(*a) + 1; > } > for (char **e = envp; *e != NULL; e++, envc++) { > if (strlen(*e) >= MAX_ARG_STRLEN) > return -E2BIG; > bytes += strlen(*e) + 1; > } > > if (argc > MAX_ARG_STRINGS || envc > MAX_ARG_STRINGS) > return -E2BIG; > bytes += (argc + envc) * sizeof(void *); > > limit = max(min(_STK_LIM / 4 * 3, rlim_stack.rlim_cur / 4), ARG_MAX); > if (bytes > limit) > return -E2BIG; > > Thank you, > Matthew House Thanks! This thing would be useful in the commit message. An example program demonstrating it would be even better. Cheers, Alex
On Wed, Oct 11, 2023 at 09:44:29AM -0400, Matthew House wrote: > On Wed, Oct 11, 2023 at 9:21 AM Rik van Riel <riel@surriel.com> wrote: > > On Wed, 2023-10-11 at 12:41 +0200, Alejandro Colomar wrote: > > > Hi Rik, > > > > > > On Tue, Oct 10, 2023 at 11:41:53PM -0400, Rik van Riel wrote: > > > > Document that if a command line or environment string is too long > > > > (> MAX_ARG_STRLEN), execve will also return E2BIG. > > > > > > That's already implied by the current text: > > > > > > E2BIG The total number of bytes in the environment (envp) and > > > argument > > > list (argv) is too large. > > > > > > That means that > > > > > > size_t bytes; > > > > > > bytes = 0; > > > for (char *e = envp; e != NULL; e++) > > > bytes += strlen(e) + 1; // I have doubts about the +1 > > > for (char *a = argv; a != NULL; a++) > > > bytes += strlen(a) + 1; // Same doubts > > > > > > if (bytes > MAX_ARG_STRLEN) // Maybe >= ? > > > return -E2BIG; > > > > The code in fs/exec.c enforces MAX_ARG_STRLEN against > > each individual string, not against the total. > > > > If any string, either argument or environment, is larger > > than 32 * PAGE_SIZE, the kernel will return -E2BIG. > > > > do_execveat_common() has this code, which uses copy_strings > > to copy both the strings from the environment, and from > > the command line arguments: > > > > retval = copy_strings(bprm->envc, envp, bprm); > > if (retval < 0) > > goto out_free; > > > > retval = copy_strings(bprm->argc, argv, bprm); > > if (retval < 0) > > goto out_free; > > > > Inside copy_strings() we have this code: > > > > > > while (argc-- > 0) { > > ... > > len = strnlen_user(str, MAX_ARG_STRLEN); > > if (!len) > > goto out; > > > > ret = -E2BIG; > > if (!valid_arg_len(bprm, len)) > > goto out; > > > > The valid_arg_len() function does not need explanation: > > > > static bool valid_arg_len(struct linux_binprm *bprm, long len) > > { > > return len <= MAX_ARG_STRLEN; > > } > > > > > > The current man page wording is very clear about the total > > length being enforced, but IMHO not as clear about the limit > > that gets enforced on each individual string. > > > > The total length limit of environment & commandline arguments > > is enforced by bprm_stack_limits(), and is checked against > > either 1/4 of the maximum stack size, or 3/4 of _STK_LIM, whichever > > is smaller. The MAX_ARG_STRLEN value does not come into play when > > enforcing the total. > > To expand on this, there are basically two separate byte limits in > fs/exec.c, one for each individual argv/envp string, and another for all > strings and all pointers to them as a whole. To put the whole thing in > pseudocode, the checks work effectively like this, assuming I haven't made > any errors: > > int argc, envc; > unsigned long bytes, limit; > > /* assume that argv has already been adjusted to add an empty argv[0] */ > argc = 0, envc = 0, bytes = 0; > for (char **a = argv; *a != NULL; a++, argc++) { > if (strlen(*a) >= MAX_ARG_STRLEN) Are you sure this is >= and not > ? > return -E2BIG; > bytes += strlen(*a) + 1; > } > for (char **e = envp; *e != NULL; e++, envc++) { > if (strlen(*e) >= MAX_ARG_STRLEN) > return -E2BIG; > bytes += strlen(*e) + 1; > } > > if (argc > MAX_ARG_STRINGS || envc > MAX_ARG_STRINGS) > return -E2BIG; > bytes += (argc + envc) * sizeof(void *); > > limit = max(min(_STK_LIM / 4 * 3, rlim_stack.rlim_cur / 4), ARG_MAX); > if (bytes > limit) > return -E2BIG; > > Thank you, > Matthew House
On Wed, Oct 11, 2023 at 10:47 AM Alejandro Colomar <alx@kernel.org> wrote: > On Wed, Oct 11, 2023 at 09:44:29AM -0400, Matthew House wrote: > > To expand on this, there are basically two separate byte limits in > > fs/exec.c, one for each individual argv/envp string, and another for all > > strings and all pointers to them as a whole. To put the whole thing in > > pseudocode, the checks work effectively like this, assuming I haven't made > > any errors: > > > > int argc, envc; > > unsigned long bytes, limit; > > > > /* assume that argv has already been adjusted to add an empty argv[0] */ > > argc = 0, envc = 0, bytes = 0; > > for (char **a = argv; *a != NULL; a++, argc++) { > > if (strlen(*a) >= MAX_ARG_STRLEN) > > Are you sure this is >= and not > ? Yes. In general, the kernel's string limits tend to include the trailing null byte. There are two places where this limit is enforced, one for the pathname (or full pathname for execveat) and the other for the argv/envp strings. The pathname is handled by copy_string_kernel(): int len = strnlen(arg, MAX_ARG_STRLEN) + 1 /* terminating NUL */; if (len == 0) return -EFAULT; if (!valid_arg_len(bprm, len)) return -E2BIG; where valid_arg_len(bprm, len) is just (len <= MAX_ARG_STRLEN). Here, strnlen() has the same behavior as the ordinary libc strnlen(3), effectively returning min(strlen(arg), MAX_ARG_STRLEN). Thus, the check succeeds iff strlen(arg) + 1 <= MAX_ARG_STRLEN, or equivalently, iff strlen(arg) < MAX_ARG_STRLEN. Next, each of the environment and argument strings is handled by copy_strings(): len = strnlen_user(str, MAX_ARG_STRLEN); if (!len) goto out; ret = -E2BIG; if (!valid_arg_len(bprm, len)) goto out; The strnlen_user() function, per its documentation, is explicitly inclusive of the trailing null byte: * Returns the size of the string INCLUDING the terminating NUL. * If the string is too long, returns a number larger than @count. User * has to check the return value against "> count". * On exception (or invalid count), returns 0. Thus, the check succeeds iff the size including the null byte is <= MAX_ARG_STRLEN, i.e., iff strlen(arg) + 1 <= MAX_ARG_STRLEN, or strlen(arg) < MAX_ARG_STRLEN. Matthew House
Hi Matthew, On Wed, Oct 11, 2023 at 11:11:24AM -0400, Matthew House wrote: > On Wed, Oct 11, 2023 at 10:47 AM Alejandro Colomar <alx@kernel.org> wrote: > > On Wed, Oct 11, 2023 at 09:44:29AM -0400, Matthew House wrote: > > > To expand on this, there are basically two separate byte limits in > > > fs/exec.c, one for each individual argv/envp string, and another for all > > > strings and all pointers to them as a whole. To put the whole thing in > > > pseudocode, the checks work effectively like this, assuming I haven't made > > > any errors: > > > > > > int argc, envc; > > > unsigned long bytes, limit; > > > > > > /* assume that argv has already been adjusted to add an empty argv[0] */ > > > argc = 0, envc = 0, bytes = 0; > > > for (char **a = argv; *a != NULL; a++, argc++) { > > > if (strlen(*a) >= MAX_ARG_STRLEN) > > > > Are you sure this is >= and not > ? > > Yes. In general, the kernel's string limits tend to include the trailing > null byte. There are two places where this limit is enforced, one for the > pathname (or full pathname for execveat) and the other for the argv/envp > strings. The pathname is handled by copy_string_kernel(): > > int len = strnlen(arg, MAX_ARG_STRLEN) + 1 /* terminating NUL */; > > if (len == 0) > return -EFAULT; > if (!valid_arg_len(bprm, len)) > return -E2BIG; > > where valid_arg_len(bprm, len) is just (len <= MAX_ARG_STRLEN). Here, > strnlen() has the same behavior as the ordinary libc strnlen(3), > effectively returning min(strlen(arg), MAX_ARG_STRLEN). Thus, the check > succeeds iff strlen(arg) + 1 <= MAX_ARG_STRLEN, or equivalently, iff > strlen(arg) < MAX_ARG_STRLEN. > > Next, each of the environment and argument strings is handled by > copy_strings(): > > len = strnlen_user(str, MAX_ARG_STRLEN); > if (!len) > goto out; > > ret = -E2BIG; > if (!valid_arg_len(bprm, len)) > goto out; > > The strnlen_user() function, per its documentation, is explicitly inclusive > of the trailing null byte: > > * Returns the size of the string INCLUDING the terminating NUL. > * If the string is too long, returns a number larger than @count. User > * has to check the return value against "> count". > * On exception (or invalid count), returns 0. > > Thus, the check succeeds iff the size including the null byte is > <= MAX_ARG_STRLEN, i.e., iff strlen(arg) + 1 <= MAX_ARG_STRLEN, or > strlen(arg) < MAX_ARG_STRLEN. Thanks! It's a bit confusing to see the terms 'len' and '_STRLEN' meaning length+1, but it makes sense now. Cheers, Alex > > Matthew House
diff --git a/man2/execve.2 b/man2/execve.2 index 0d9582492ad1..c1a359d01872 100644 --- a/man2/execve.2 +++ b/man2/execve.2 @@ -449,7 +449,7 @@ The total number of bytes in the environment .RI ( envp ) and argument list .RI ( argv ) -is too large. +is too large, or an argument or environment string is too long. .TP .B EACCES Search permission is denied on a component of the path prefix of