From patchwork Sat Feb 10 17:26:00 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Burgess
X-Patchwork-Id: 199285
From: Andrew Burgess
To: gcc-patches@gcc.gnu.org
Cc: Andrew Burgess
Subject: [PATCHv2 1/2] libiberty/buildargv: POSIX behaviour for backslash handling
Date: Sat, 10 Feb 2024 17:26:00 +0000
X-Mailer: git-send-email 2.25.4
References: <24a8d878590403540bc9b579ba58805985a4d2f7.1701881419.git.aburgess@redhat.com>
List-Id: Gcc-patches mailing list

GDB makes use of the libiberty function buildargv for splitting the
inferior (program being debugged) argument string in the case where
the inferior is not being started under a shell.

I have recently been working to improve this area of GDB, and have
tracked down some of the unexpected behaviour to the libiberty
function buildargv, and how it handles backslash escapes.

For reference, I've been mostly reading:

  https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html

The issues that I would like to fix are:

  1. Backslashes within single quotes should not be treated as an
  escape, thus: '\a' should split to \a, retaining the backslash.

  2. Backslashes within double quotes should only act as an escape if
  they are immediately before one of the characters $ (dollar),
  ` (backtick), " (double quote), \ (backslash), or \n (newline). In
  all other cases a backslash should not be treated as an escape
  character. Thus: "\a" should split to \a, but "\$" should split to
  $.

  3. A backslash-newline sequence should be treated as a line
  continuation: both the backslash and the newline should be removed.

I've updated libiberty and also added some tests. All the existing
libiberty tests continue to pass, but I'm not sure if there is more
testing that should be done. buildargv is used within lto-wrapper.cc,
so maybe there's some testing folk can suggest that I run?
---
 libiberty/argv.c                      |  8 +++++--
 libiberty/testsuite/test-expandargv.c | 34 +++++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/libiberty/argv.c b/libiberty/argv.c
index 45f16854603..d9d32e59e72 100644
--- a/libiberty/argv.c
+++ b/libiberty/argv.c
@@ -224,9 +224,13 @@ char **buildargv (const char *input)
                   if (bsquote)
                     {
                       bsquote = 0;
-                      *arg++ = *input;
+                      if (*input != '\n')
+                        *arg++ = *input;
                     }
-                  else if (*input == '\\')
+                  else if (*input == '\\'
+                           && !squote
+                           && (!dquote
+                               || strchr ("$`\"\\\n", *(input + 1)) != NULL))
                     {
                       bsquote = 1;
                     }
diff --git a/libiberty/testsuite/test-expandargv.c b/libiberty/testsuite/test-expandargv.c
index 1e9cb0a0d5a..ea1aeb0eda2 100644
--- a/libiberty/testsuite/test-expandargv.c
+++ b/libiberty/testsuite/test-expandargv.c
@@ -142,6 +142,40 @@ const char *test_data[] = {
   "b",
   0,

+  /* Test 7 - No backslash removal within single quotes. */
+  "'a\\$VAR' '\\\"'", /* Test 7 data */
+  ARGV0,
+  "@test-expandargv-7.lst",
+  0,
+  ARGV0,
+  "a\\$VAR",
+  "\\\"",
+  0,
+
+  /* Test 8 - Remove backslash / newline pairs. */
+  "\"ab\\\ncd\" ef\\\ngh", /* Test 8 data */
+  ARGV0,
+  "@test-expandargv-8.lst",
+  0,
+  ARGV0,
+  "abcd",
+  "efgh",
+  0,
+
+  /* Test 9 - Backslash within double quotes. */
+  "\"\\$VAR\" \"\\`\" \"\\\"\" \"\\\\\" \"\\n\" \"\\t\"", /* Test 9 data */
+  ARGV0,
+  "@test-expandargv-9.lst",
+  0,
+  ARGV0,
+  "$VAR",
+  "`",
+  "\"",
+  "\\",
+  "\\n",
+  "\\t",
+  0,
+
   0 /* Test done marker, don't remove. */
 };


From patchwork Sat Feb 10 17:26:01 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Burgess
X-Patchwork-Id: 199284
From: Andrew Burgess
To: gcc-patches@gcc.gnu.org
Cc: Andrew Burgess
Subject: [PATCHv2 2/2] libiberty/buildargv: handle input consisting of only white space
Date: Sat, 10 Feb 2024 17:26:01 +0000
Message-Id: <37b3a30868139bab59155717f6bff1ed08dbca76.1707585836.git.aburgess@redhat.com>
X-Mailer: git-send-email 2.25.4
References: <24a8d878590403540bc9b579ba58805985a4d2f7.1701881419.git.aburgess@redhat.com>
List-Id: Gcc-patches mailing list

GDB makes use of the libiberty function buildargv for splitting the
inferior (program being debugged) argument string in the case where
the inferior is not being started under a shell.

I have recently been working to improve this area of GDB, and noticed
some unexpected behaviour in the libiberty function buildargv when
the input is a string consisting only of white space.

What I observe is that if the input to buildargv is a string
containing only white space, then buildargv will return an argv list
containing a single empty argument, e.g.:

  char **argv = buildargv (" ");
  assert (*argv[0] == '\0');
  assert (argv[1] == NULL);

We get the same output from buildargv if the input is a single space,
or multiple spaces. Other white space characters give the same
results.

This doesn't seem right to me, and in fact, there appears to be a
work around for this issue in expandargv where we have this code:

  /* If the file is empty or contains only whitespace, buildargv would
     return a single empty argument. In this context we want no arguments,
     instead. */
  if (only_whitespace (buffer))
    {
      file_argv = (char **) xmalloc (sizeof (char *));
      file_argv[0] = NULL;
    }
  else
    /* Parse the string. */
    file_argv = buildargv (buffer);

I think that the correct behaviour in this situation is to return an
empty argv array, e.g.:

  char **argv = buildargv (" ");
  assert (argv[0] == NULL);

And it turns out that this is a trivial change to buildargv. The diff
does look big, but this is because I've re-indented a block. Check
with 'git diff -b' to see the minimal changes. I've also removed the
work around from expandargv.

When testing this sort of thing I normally write the tests first, and
then fix the code.
In this case test-expandargv.c has sort-of been used as a
mechanism for testing the buildargv function (expandargv does call
buildargv most of the time). However, for this particular issue the
work around in expandargv (mentioned above) masked the buildargv bug.

I did consider adding a new test-buildargv.c file. However, this
would have basically been a copy & paste of test-expandargv.c (with
some minor changes to call buildargv). This would be fine now, but it
feels like we would eventually end up with one file not being updated
as much as the other, and so test coverage would suffer.

Instead, I have added some explicit buildargv testing to the
test-expandargv.c file. This reuses the test input that is already
defined for expandargv.

Of course, now that I have removed the work around from expandargv,
we always call buildargv from expandargv, and so the bug I'm fixing
would impact both expandargv and buildargv. So maybe the new testing
is redundant? I tend to think more testing is always better, so I've
left it in for now.
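
As a rough illustration of the behaviour described above (this is not
the libiberty code, and is not part of the patch), the sketch below
models only the outer do/while loop of buildargv. The function name
count_args and its guard parameter are made up for this example, and
quoting and escapes are deliberately ignored.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical model of the buildargv outer loop: return the number of
   arguments that would be produced for INPUT.  When GUARD is zero every
   loop iteration emits an argument (the old behaviour).  When GUARD is
   non-zero an argument is only emitted if some input remains after
   skipping leading white space (the behaviour after this patch).  */
static size_t
count_args (const char *input, int guard)
{
  size_t argc = 0;

  assert (input != NULL);
  do
    {
      /* Skip white space before the next argument.  */
      while (isspace ((unsigned char) *input))
        input++;

      if (!guard || *input != '\0')
        {
          /* Consume the characters of one argument.  */
          while (*input != '\0' && !isspace ((unsigned char) *input))
            input++;
          argc++;
        }

      /* Skip trailing white space so the loop test sees end of input.  */
      while (isspace ((unsigned char) *input))
        input++;
    }
  while (*input != '\0');

  return argc;
}
```

With guard set to 0 a white-space-only string yields one (empty)
argument; with guard set to 1 it yields none. Ordinary input such as
"a b" gives 2 arguments either way.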
---
 libiberty/argv.c                      | 108 ++++++++++----------
 libiberty/testsuite/test-expandargv.c | 136 ++++++++++++++++++++++----
 2 files changed, 166 insertions(+), 78 deletions(-)

diff --git a/libiberty/argv.c b/libiberty/argv.c
index d9d32e59e72..675336273f3 100644
--- a/libiberty/argv.c
+++ b/libiberty/argv.c
@@ -212,71 +212,74 @@ char **buildargv (const char *input)
               argv[argc] = NULL;
             }
           /* Begin scanning arg */
-          arg = copybuf;
-          while (*input != EOS)
+          if (*input != EOS)
             {
-              if (ISSPACE (*input) && !squote && !dquote && !bsquote)
+              arg = copybuf;
+              while (*input != EOS)
                 {
-                  break;
-                }
-              else
-                {
-                  if (bsquote)
-                    {
-                      bsquote = 0;
-                      if (*input != '\n')
-                        *arg++ = *input;
-                    }
-                  else if (*input == '\\'
-                           && !squote
-                           && (!dquote
-                               || strchr ("$`\"\\\n", *(input + 1)) != NULL))
+                  if (ISSPACE (*input) && !squote && !dquote && !bsquote)
                     {
-                      bsquote = 1;
-                    }
-                  else if (squote)
-                    {
-                      if (*input == '\'')
-                        {
-                          squote = 0;
-                        }
-                      else
-                        {
-                          *arg++ = *input;
-                        }
+                      break;
                     }
-                  else if (dquote)
+                  else
                     {
-                      if (*input == '"')
+                      if (bsquote)
                         {
-                          dquote = 0;
+                          bsquote = 0;
+                          if (*input != '\n')
+                            *arg++ = *input;
                         }
-                      else
+                      else if (*input == '\\'
+                               && !squote
+                               && (!dquote
+                                   || strchr ("$`\"\\\n", *(input + 1)) != NULL))
                         {
-                          *arg++ = *input;
+                          bsquote = 1;
                         }
-                    }
-                  else
-                    {
-                      if (*input == '\'')
+                      else if (squote)
                         {
-                          squote = 1;
+                          if (*input == '\'')
+                            {
+                              squote = 0;
+                            }
+                          else
+                            {
+                              *arg++ = *input;
+                            }
                         }
-                      else if (*input == '"')
+                      else if (dquote)
                         {
-                          dquote = 1;
+                          if (*input == '"')
+                            {
+                              dquote = 0;
+                            }
+                          else
+                            {
+                              *arg++ = *input;
+                            }
                         }
                       else
                         {
-                          *arg++ = *input;
+                          if (*input == '\'')
+                            {
+                              squote = 1;
+                            }
+                          else if (*input == '"')
+                            {
+                              dquote = 1;
+                            }
+                          else
+                            {
+                              *arg++ = *input;
+                            }
                         }
+                      input++;
                     }
-                  input++;
                 }
+              *arg = EOS;
+              argv[argc] = xstrdup (copybuf);
+              argc++;
             }
-          *arg = EOS;
-          argv[argc] = xstrdup (copybuf);
-          argc++;
           argv[argc] = NULL;

           consume_whitespace (&input);
@@ -439,17 +442,8 @@ expandargv (int *argcp, char ***argvp)
         }
       /* Add a NUL terminator. */
       buffer[len] = '\0';
-      /* If the file is empty or contains only whitespace, buildargv would
-         return a single empty argument. In this context we want no arguments,
-         instead. */
-      if (only_whitespace (buffer))
-        {
-          file_argv = (char **) xmalloc (sizeof (char *));
-          file_argv[0] = NULL;
-        }
-      else
-        /* Parse the string. */
-        file_argv = buildargv (buffer);
+      /* Parse the string. */
+      file_argv = buildargv (buffer);
       /* If *ARGVP is not already dynamically allocated, copy it. */
       if (*argvp == original_argv)
         *argvp = dupargv (*argvp);
diff --git a/libiberty/testsuite/test-expandargv.c b/libiberty/testsuite/test-expandargv.c
index ea1aeb0eda2..ca7031eaf68 100644
--- a/libiberty/testsuite/test-expandargv.c
+++ b/libiberty/testsuite/test-expandargv.c
@@ -176,6 +176,30 @@ const char *test_data[] = {
   "\\t",
   0,

+  /* Test 10 - Mixed white space characters. */
+  "\t \n \t ", /* Test 10 data */
+  ARGV0,
+  "@test-expandargv-10.lst",
+  0,
+  ARGV0,
+  0,
+
+  /* Test 11 - Single ' ' character. */
+  " ", /* Test 11 data */
+  ARGV0,
+  "@test-expandargv-11.lst",
+  0,
+  ARGV0,
+  0,
+
+  /* Test 12 - Multiple ' ' characters. */
+  "   ", /* Test 12 data */
+  ARGV0,
+  "@test-expandargv-12.lst",
+  0,
+  ARGV0,
+  0,
+
   0 /* Test done marker, don't remove. */
 };

@@ -265,6 +289,78 @@ erase_test (int test)
     fatal_error (__LINE__, "Failed to erase test file.", errno);
 }

+/* compare_argv:
+   TEST is the current test number, and NAME is a short string to identify
+   which libiberty function is being tested. ARGC_A and ARGV_A describe an
+   argument array, and this is compared to ARGC_B and ARGV_B, return 0 if
+   the two arrays match, otherwise return 1. */
+
+static int
+compare_argv (int test, const char *name, int argc_a, char *argv_a[],
+              int argc_b, char *argv_b[])
+{
+  int failed = 0, k;
+
+  if (argc_a != argc_b)
+    {
+      printf ("FAIL: test-%s-%d. Argument count didn't match\n", name, test);
+      failed = 1;
+    }
+  /* Compare each of the argv's ... */
+  else
+    for (k = 0; k < argc_a; k++)
+      if (strcmp (argv_a[k], argv_b[k]) != 0)
+        {
+          printf ("FAIL: test-%s-%d. Arguments don't match.\n", name, test);
+          failed = 1;
+          break;
+        }
+
+  if (!failed)
+    printf ("PASS: test-%s-%d.\n", name, test);
+
+  return failed;
+}
+
+/* test_buildargv
+   Test the buildargv function from libiberty. TEST is the current test
+   number and TEST_INPUT is the string to pass to buildargv (after calling
+   run_replaces on it). ARGC_AFTER and ARGV_AFTER are the expected
+   results. Return 0 if the test passes, otherwise return 1. */
+
+static int
+test_buildargv (int test, const char * test_input, int argc_after,
+                char *argv_after[])
+{
+  char * input, ** argv;
+  size_t len;
+  int argc, failed;
+
+  /* Generate RW copy of data for replaces */
+  len = strlen (test_input);
+  input = malloc (sizeof (char) * (len + 1));
+  if (input == NULL)
+    fatal_error (__LINE__, "Failed to malloc buildargv input buffer.", errno);
+
+  memcpy (input, test_input, sizeof (char) * (len + 1));
+  /* Run all possible replaces */
+  run_replaces (input);
+
+  /* Split INPUT into separate arguments. */
+  argv = buildargv (input);
+
+  /* Count the arguments we got back. */
+  argc = 0;
+  while (argv[argc])
+    ++argc;
+
+  failed = compare_argv (test, "buildargv", argc_after, argv_after, argc, argv);
+
+  free (input);
+  freeargv (argv);
+
+  return failed;
+}

 /* run_tests:
     Run expandargv
@@ -276,12 +372,16 @@ run_tests (const char **test_data)
 {
   int argc_after, argc_before;
   char ** argv_before, ** argv_after;
-  int i, j, k, fails, failed;
+  int i, j, k, fails;
+  const char * input_str;

   i = j = fails = 0;
   /* Loop over all the tests */
   while (test_data[j])
     {
+      /* Save original input in case we run a buildargv test. */
+      input_str = test_data[j];
+
       /* Write test data */
       writeout_test (i, test_data[j++]);
       /* Copy argv before */
@@ -305,29 +405,23 @@ run_tests (const char **test_data)
       for (k = 0; k < argc_after; k++)
         run_replaces (argv_after[k]);

+      /* If the test input is just a file to expand then we can also test
+         calling buildargv directly as the expected output is equivalent to
+         calling buildargv on the contents of the file.
+
+         The results of calling buildargv will not include the ARGV0 constant,
+         which is why we pass 'argc_after - 1' and 'argv_after + 1', this skips
+         over the ARGV0 in the expected results. */
+      if (argc_before == 2)
+        fails += test_buildargv (i, input_str, argc_after - 1, argv_after + 1);
+      else
+        printf ("SKIP: test-buildargv-%d. This test isn't for buildargv\n", i);
+
       /* Run test: Expand arguments */
       expandargv (&argc_before, &argv_before);

-      failed = 0;
-      /* Compare size first */
-      if (argc_before != argc_after)
-        {
-          printf ("FAIL: test-expandargv-%d. Number of arguments don't match.\n", i);
-          failed++;
-        }
-      /* Compare each of the argv's ... */
-      else
-        for (k = 0; k < argc_after; k++)
-          if (strcmp (argv_before[k], argv_after[k]) != 0)
-            {
-              printf ("FAIL: test-expandargv-%d. Arguments don't match.\n", i);
-              failed++;
-            }
-
-      if (!failed)
-        printf ("PASS: test-expandargv-%d.\n", i);
-      else
-        fails++;
+      fails += compare_argv (i, "expandargv", argc_before, argv_before,
+                             argc_after, argv_after);

       freeargv (argv_before);
       freeargv (argv_after);
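
For anyone who wants to experiment with the backslash rules from patch
1/2 without building libiberty, here is a minimal standalone sketch.
It is a model under stated assumptions rather than the real
implementation: unquote_arg is a hypothetical helper, it treats its
whole input as a single argument (no white space splitting), and it
uses plain malloc where libiberty would use xmalloc. The per-character
conditions are intended to mirror those in the patched loop.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libiberty: remove shell-style
   quoting from INPUT, treating all of INPUT as one argument.  Returns
   a malloc'd string that the caller must free, or NULL if the
   allocation fails.  */
static char *
unquote_arg (const char *input)
{
  int squote = 0, dquote = 0, bsquote = 0;
  char *result, *arg;

  assert (input != NULL);

  /* Removing quotes and escapes never makes the string longer.  */
  result = (char *) malloc (strlen (input) + 1);
  if (result == NULL)
    return NULL;

  for (arg = result; *input != '\0'; input++)
    {
      if (bsquote)
        {
          /* Previous character was an escaping backslash.  Keep this
             character unless it is a newline (line continuation).  */
          bsquote = 0;
          if (*input != '\n')
            *arg++ = *input;
        }
      else if (*input == '\\'
               && !squote
               && (!dquote
                   || strchr ("$`\"\\\n", input[1]) != NULL))
        bsquote = 1;
      else if (squote)
        {
          /* Inside single quotes everything except ' is literal.  */
          if (*input == '\'')
            squote = 0;
          else
            *arg++ = *input;
        }
      else if (dquote)
        {
          if (*input == '"')
            dquote = 0;
          else
            *arg++ = *input;
        }
      else if (*input == '\'')
        squote = 1;
      else if (*input == '"')
        dquote = 1;
      else
        *arg++ = *input;
    }

  *arg = '\0';
  return result;
}
```

Under these assumptions, '\a' and "\a" both come back as \a, "\$"
comes back as $, and ef\<newline>gh comes back as efgh, which is
consistent with the expectations in the new test data.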