From patchwork Thu Aug 24 13:58:27 2023
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 136827
Date: Thu, 24 Aug 2023 15:58:27 +0200
To: Jason Merrill
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] c++: Implement C++26 P2361R6 - Unevaluated strings [PR110342]
From: Jakub Jelinek

Hi!

The following patch implements C++26 unevaluated-string. As it seems to me
to be just extra pedantry, it is implemented only for -std=c++26 or
-std=gnu++26 and later, and only with -pedantic/-pedantic-errors.

Nothing is done for inline asm. While the spec changes those too, it changes
them to a balanced token sequence with implementation-defined rules on what
is and isn't allowed (so, pedantically, accepting asm ("" : "+m" (x)); was
accepts-invalid before C++26, but we didn't diagnose anything).

For the other spots mentioned in the paper (static_assert message, linkage
specification, deprecated/nodiscard attributes) it enforces the
requirements: no encoding prefixes, no udlit suffixes and no
octal/hexadecimal escapes (conditional escape sequences were already
rejected with -pedantic before).

For the deprecated operator "" identifier case I've kept things as is,
because everything seems to have been diagnosed already (a lot being implied
by the string having to be empty).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2023-08-24  Jakub Jelinek

	PR c++/110342
gcc/cp/
	* parser.cc: Implement C++26 P2361R6 - Unevaluated strings.
	(uneval_string_attr): New enumerator.
	(cp_parser_string_literal_common): Add UNEVAL argument.  If true,
	pass CPP_UNEVAL_STRING rather than CPP_STRING to
	cpp_interpret_string_notranslate.
	(cp_parser_string_literal, cp_parser_userdef_string_literal): Adjust
	callers of cp_parser_string_literal_common.
	(cp_parser_unevaluated_string_literal): New function.
	(cp_parser_parenthesized_expression_list): Handle uneval_string_attr.
	(cp_parser_linkage_specification): Use
	cp_parser_unevaluated_string_literal for C++26.
	(cp_parser_static_assert): Likewise.
	(cp_parser_std_attribute): Use uneval_string_attr for standard
	deprecated and nodiscard attributes.
gcc/testsuite/
	* g++.dg/cpp26/unevalstr1.C: New test.
	* g++.dg/cpp26/unevalstr2.C: New test.
	* g++.dg/cpp0x/udlit-error1.C (lol): Expect an error for C++26
	about user-defined literal in deprecated attribute.
libcpp/
	* include/cpplib.h (TTYPE_TABLE): Add CPP_UNEVAL_STRING literal
	entry.  Use C++11 instead of C++-0x in comments.
	* charset.cc (convert_escape): Add UNEVAL argument, if true,
	pedantically diagnose numeric escape sequences.
	(cpp_interpret_string_1): Formatting fix.  Adjust convert_escape
	caller.
	(cpp_interpret_string): Formatting fix.
	(cpp_interpret_string_notranslate): Pass type through to
	cpp_interpret_string if it is CPP_UNEVAL_STRING.

	Jakub

--- gcc/cp/parser.cc.jj 2023-08-23 11:22:28.006593913 +0200
+++ gcc/cp/parser.cc 2023-08-23 12:21:31.384232520 +0200
@@ -2267,7 +2267,8 @@ static vec *cp_parser_paren
   (cp_parser *, int, bool, bool, bool *, location_t * = NULL,
    bool = false);
 /* Values for the second parameter of cp_parser_parenthesized_expression_list.  */
-enum { non_attr = 0, normal_attr = 1, id_attr = 2, assume_attr = 3 };
+enum { non_attr = 0, normal_attr = 1, id_attr = 2, assume_attr = 3,
+       uneval_string_attr = 4 };
 static void cp_parser_pseudo_destructor_name
   (cp_parser *, tree, tree *, tree *);
 static cp_expr cp_parser_unary_expression
@@ -4409,7 +4410,8 @@ cp_parser_identifier (cp_parser* parser)
   return error_mark_node;
 }

-/* Worker for cp_parser_string_literal and cp_parser_userdef_string_literal.
+/* Worker for cp_parser_string_literal, cp_parser_userdef_string_literal
+   and cp_parser_unevaluated_string_literal.
    Do not call this directly; use either of the above.

    Parse a sequence of adjacent string constants.  Return a
@@ -4417,7 +4419,8 @@ cp_parser_identifier (cp_parser* parser)
    constant.  If TRANSLATE is true, translate the string to the
    execution character set.  If WIDE_OK is true, a wide string is
    valid here.  If UDL_OK is true, a string literal with user-defined
-   suffix can be used in this context.
+   suffix can be used in this context.  If UNEVAL is true, diagnose
+   numeric and conditional escape sequences in it if pedantic.

    C++98 [lex.string] says that if a narrow string literal token is
    adjacent to a wide string literal token, the behavior is undefined.
@@ -4431,7 +4434,7 @@ cp_parser_identifier (cp_parser* parser)
 static cp_expr
 cp_parser_string_literal_common (cp_parser *parser, bool translate,
                                  bool wide_ok, bool udl_ok,
-                                 bool lookup_udlit)
+                                 bool lookup_udlit, bool uneval)
 {
   tree value;
   size_t count;
@@ -4584,6 +4587,8 @@ cp_parser_string_literal_common (cp_pars
           cp_parser_error (parser, "a wide string is invalid in this context");
           type = CPP_STRING;
         }
+      if (uneval)
+        type = CPP_UNEVAL_STRING;

       if ((translate ? cpp_interpret_string : cpp_interpret_string_notranslate)
           (parse_in, strs, count, &istr, type))
@@ -4658,7 +4663,8 @@ cp_parser_string_literal (cp_parser *par
 {
   return cp_parser_string_literal_common (parser, translate, wide_ok,
                                           /*udl_ok=*/false,
-                                          /*lookup_udlit=*/false);
+                                          /*lookup_udlit=*/false,
+                                          /*uneval=*/false);
 }

 /* Parse a string literal or user defined string literal.
@@ -4673,7 +4679,21 @@ cp_parser_userdef_string_literal (cp_par
 {
   return cp_parser_string_literal_common (parser, /*translate=*/true,
                                           /*wide_ok=*/true, /*udl_ok=*/true,
-                                          lookup_udlit);
+                                          lookup_udlit, /*uneval=*/false);
+}
+
+/* Parse an unevaluated string literal.
+
+   unevaluated-string:
+     string-literal  */
+
+static inline cp_expr
+cp_parser_unevaluated_string_literal (cp_parser *parser)
+{
+  return cp_parser_string_literal_common (parser, /*translate=*/false,
+                                          /*wide_ok=*/false, /*udl_ok=*/false,
+                                          /*lookup_udlit=*/false,
+                                          /*uneval=*/true);
 }

 /* Look up a literal operator with the name and the exact arguments.  */
@@ -8578,6 +8598,8 @@ cp_parser_parenthesized_expression_list
               expr = cp_lexer_consume_token (parser->lexer)->u.value;
             else if (is_attribute_list == assume_attr)
               expr = cp_parser_conditional_expression (parser);
+            else if (is_attribute_list == uneval_string_attr)
+              expr = cp_parser_unevaluated_string_literal (parser);
             else
               expr
                 = cp_parser_parenthesized_expression_list_elt (parser, cast_p,
@@ -16319,8 +16341,12 @@ cp_parser_linkage_specification (cp_pars

   /* Look for the string-literal.  */
   cp_token *string_token = cp_lexer_peek_token (parser->lexer);
-  tree linkage = cp_parser_string_literal (parser, /*translate=*/false,
-                                           /*wide_ok=*/false);
+  tree linkage;
+  if (cxx_dialect >= cxx26)
+    linkage = cp_parser_unevaluated_string_literal (parser);
+  else
+    linkage = cp_parser_string_literal (parser, /*translate=*/false,
+                                        /*wide_ok=*/false);

   /* Transform the literal into an identifier.  If the literal is a
      wide-character string, or contains embedded NULs, then we can't
@@ -16449,8 +16475,11 @@ cp_parser_static_assert (cp_parser *pars
       cp_parser_require (parser, CPP_COMMA, RT_COMMA);

       /* Parse the string-literal message.  */
-      message = cp_parser_string_literal (parser, /*translate=*/false,
-                                          /*wide_ok=*/true);
+      if (cxx_dialect >= cxx26)
+        message = cp_parser_unevaluated_string_literal (parser);
+      else
+        message = cp_parser_string_literal (parser, /*translate=*/false,
+                                            /*wide_ok=*/true);

       /* A `)' completes the static assertion.  */
       if (!parens.require_close (parser))
@@ -29442,6 +29471,11 @@ cp_parser_std_attribute (cp_parser *pars
              && attribute_takes_identifier_p (attr_id))
       /* A GNU attribute that takes an identifier in parameter.  */
       attr_flag = id_attr;
+    else if (attr_ns == NULL_TREE
+             && cxx_dialect >= cxx26
+             && (is_attribute_p ("deprecated", attr_id)
+                 || is_attribute_p ("nodiscard", attr_id)))
+      attr_flag = uneval_string_attr;

     /* If this is a fake attribute created to handle -Wno-attributes,
        we must skip parsing the arguments.  */
--- gcc/testsuite/g++.dg/cpp26/unevalstr1.C.jj 2023-08-23 13:07:05.960665571 +0200
+++ gcc/testsuite/g++.dg/cpp26/unevalstr1.C 2023-08-23 13:09:59.782410316 +0200
@@ -0,0 +1,103 @@
+// C++26 P2361R6 - Unevaluated strings
+// { dg-do compile { target c++26 } }
+
+static_assert (true, "foo");
+static_assert (true, "foo" " " "bar");
+static_assert (true, "\u01FC\U000001FC\u{1FC}\N{LATIN CAPITAL LETTER AE WITH ACUTE}\\\'\"\?\a\b\f\n\r\t\v");
+static_assert (true, L"foo"); // { dg-error "a wide string is invalid in this context" }
+static_assert (true, u"foo"); // { dg-error "a wide string is invalid in this context" }
+static_assert (true, U"foo"); // { dg-error "a wide string is invalid in this context" }
+static_assert (true, u8"foo"); // { dg-error "a wide string is invalid in this context" }
+static_assert (true, L"fo" "o"); // { dg-error "a wide string is invalid in this context" }
+static_assert (true, u"fo" "o"); // { dg-error "a wide string is invalid in this context" }
+static_assert (true, U"fo" "o"); // { dg-error "a wide string is invalid in this context" }
+static_assert (true, u8"fo" "o"); // { dg-error "a wide string is invalid in this context" }
+static_assert (true, "fo" L"o"); // { dg-error "a wide string is invalid in this context" }
+static_assert (true, "fo" u"o"); // { dg-error "a wide string is invalid in this context" }
+static_assert (true, "fo" U"o"); // { dg-error "a wide string is invalid in this context" }
+static_assert (true, "fo" u8"o"); // { dg-error "a wide string is invalid in this context" }
+static_assert (true, "\0"); // { dg-error "numeric escape sequence in unevaluated string" }
+static_assert (true, "\17"); // { dg-error "numeric escape sequence in unevaluated string" }
+static_assert (true, "\x20"); // { dg-error "numeric escape sequence in unevaluated string" }
+static_assert (true, "\o{17}"); // { dg-error "numeric escape sequence in unevaluated string" }
+static_assert (true, "\x{20}"); // { dg-error "numeric escape sequence in unevaluated string" }
+static_assert (true, "\h"); // { dg-error "unknown escape sequence" }
+
+extern "C" "+" "+" int f0 ();
+extern "C" int f1 ();
+extern "C" { int f2 (); };
+extern L"C" int f3 (); // { dg-error "a wide string is invalid in this context" }
+extern L"C" { int f4 (); } // { dg-error "a wide string is invalid in this context" }
+extern u"C" int f5 (); // { dg-error "a wide string is invalid in this context" }
+extern u"C" { int f6 (); } // { dg-error "a wide string is invalid in this context" }
+extern U"C" int f7 (); // { dg-error "a wide string is invalid in this context" }
+extern U"C" { int f8 (); } // { dg-error "a wide string is invalid in this context" }
+extern u8"C" int f9 (); // { dg-error "a wide string is invalid in this context" }
+extern u8"C" { int f10 (); } // { dg-error "a wide string is invalid in this context" }
+extern "\x43" int f11 (); // { dg-error "numeric escape sequence in unevaluated string" }
+extern "\x{43}" { int f12 (); } // { dg-error "numeric escape sequence in unevaluated string" }
+extern "\103" int f13 (); // { dg-error "numeric escape sequence in unevaluated string" }
+extern "\o{0103}" { int f14 (); } // { dg-error "numeric escape sequence in unevaluated string" }
+
+[[deprecated ("foo")]] int g0 ();
+[[deprecated ("foo" " " "bar")]] int g1 ();
+[[deprecated ("\u01FC\U000001FC\u{1FC}\N{LATIN CAPITAL LETTER AE WITH ACUTE}\\\'\"\?\a\b\f\n\r\t\v")]] int g2 ();
+[[deprecated (L"foo")]] int g3 (); // { dg-error "a wide string is invalid in this context" }
+[[deprecated (u"foo")]] int g4 (); // { dg-error "a wide string is invalid in this context" }
+[[deprecated (U"foo")]] int g5 (); // { dg-error "a wide string is invalid in this context" }
+[[deprecated (u8"foo")]] int g6 (); // { dg-error "a wide string is invalid in this context" }
+[[deprecated (L"fo" "o")]] int g7 (); // { dg-error "a wide string is invalid in this context" }
+[[deprecated (u"fo" "o")]] int g8 (); // { dg-error "a wide string is invalid in this context" }
+[[deprecated (U"fo" "o")]] int g9 (); // { dg-error "a wide string is invalid in this context" }
+[[deprecated (u8"fo" "o")]] int g10 (); // { dg-error "a wide string is invalid in this context" }
+[[deprecated ("fo" L"o")]] int g11 (); // { dg-error "a wide string is invalid in this context" }
+[[deprecated ("fo" u"o")]] int g12 (); // { dg-error "a wide string is invalid in this context" }
+[[deprecated ("fo" U"o")]] int g13 (); // { dg-error "a wide string is invalid in this context" }
+[[deprecated ("fo" u8"o")]] int g14 (); // { dg-error "a wide string is invalid in this context" }
+[[deprecated ("\0")]] int g15 (); // { dg-error "numeric escape sequence in unevaluated string" }
+[[deprecated ("\17")]] int g16 (); // { dg-error "numeric escape sequence in unevaluated string" }
+[[deprecated ("\x20")]] int g17 (); // { dg-error "numeric escape sequence in unevaluated string" }
+[[deprecated ("\o{17}")]] int g18 (); // { dg-error "numeric escape sequence in unevaluated string" }
+[[deprecated ("\x{20}")]] int g19 (); // { dg-error "numeric escape sequence in unevaluated string" }
+[[deprecated ("\h")]] int g20 (); // { dg-error "unknown escape sequence" }
+
+[[nodiscard ("foo")]] int h0 ();
+[[nodiscard ("foo" " " "bar")]] int h1 ();
+[[nodiscard ("\u01FC\U000001FC\u{1FC}\N{LATIN CAPITAL LETTER AE WITH ACUTE}\\\'\"\?\a\b\f\n\r\t\v")]] int h2 ();
+[[nodiscard (L"foo")]] int h3 (); // { dg-error "a wide string is invalid in this context" }
+[[nodiscard (u"foo")]] int h4 (); // { dg-error "a wide string is invalid in this context" }
+[[nodiscard (U"foo")]] int h5 (); // { dg-error "a wide string is invalid in this context" }
+[[nodiscard (u8"foo")]] int h6 (); // { dg-error "a wide string is invalid in this context" }
+[[nodiscard (L"fo" "o")]] int h7 (); // { dg-error "a wide string is invalid in this context" }
+[[nodiscard (u"fo" "o")]] int h8 (); // { dg-error "a wide string is invalid in this context" }
+[[nodiscard (U"fo" "o")]] int h9 (); // { dg-error "a wide string is invalid in this context" }
+[[nodiscard (u8"fo" "o")]] int h10 (); // { dg-error "a wide string is invalid in this context" }
+[[nodiscard ("fo" L"o")]] int h11 (); // { dg-error "a wide string is invalid in this context" }
+[[nodiscard ("fo" u"o")]] int h12 (); // { dg-error "a wide string is invalid in this context" }
+[[nodiscard ("fo" U"o")]] int h13 (); // { dg-error "a wide string is invalid in this context" }
+[[nodiscard ("fo" u8"o")]] int h14 (); // { dg-error "a wide string is invalid in this context" }
+[[nodiscard ("\0")]] int h15 (); // { dg-error "numeric escape sequence in unevaluated string" }
+[[nodiscard ("\17")]] int h16 (); // { dg-error "numeric escape sequence in unevaluated string" }
+[[nodiscard ("\x20")]] int h17 (); // { dg-error "numeric escape sequence in unevaluated string" }
+[[nodiscard ("\o{17}")]] int h18 (); // { dg-error "numeric escape sequence in unevaluated string" }
+[[nodiscard ("\x{20}")]] int h19 (); // { dg-error "numeric escape sequence in unevaluated string" }
+[[nodiscard ("\h")]] int h20 (); // { dg-error "unknown escape sequence" }
+
+float operator "" _my0 (const char *);
+float operator "" "" _my1 (const char *);
+float operator L"" _my2 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator u"" _my3 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator U"" _my4 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator u8"" _my5 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator L"" "" _my6 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator u"" "" _my7 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator U"" "" _my8 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator u8"" "" _my9 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator "" L"" _my10 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator "" u"" _my11 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator "" U"" _my12 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator "" u8"" _my13 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator "\0" _my14 (const char *); // { dg-error "expected empty string after 'operator' keyword" }
+float operator "\x00" _my15 (const char *); // { dg-error "expected empty string after 'operator' keyword" }
+float operator "\h" _my16 (const char *); // { dg-error "expected empty string after 'operator' keyword" }
+                                          // { dg-error "unknown escape sequence" "" { target *-*-* } .-1 }
--- gcc/testsuite/g++.dg/cpp26/unevalstr2.C.jj 2023-08-23 13:10:17.120185018 +0200
+++ gcc/testsuite/g++.dg/cpp26/unevalstr2.C 2023-08-23 13:20:18.152371965 +0200
@@ -0,0 +1,110 @@
+// C++26 P2361R6 - Unevaluated strings
+// { dg-do compile { target { c++11 && c++23_down } } }
+// { dg-options "-pedantic" }
+
+static_assert (true, "foo");
+static_assert (true, "foo" " " "bar");
+static_assert (true, "\u01FC\U000001FC\u{1FC}\N{LATIN CAPITAL LETTER AE WITH ACUTE}\\\'\"\?\a\b\f\n\r\t\v");
+// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } .-1 }
+// { dg-warning "named universal character escapes are only valid in" "" { target c++20_down } .-2 }
+static_assert (true, L"foo");
+static_assert (true, u"foo");
+static_assert (true, U"foo");
+static_assert (true, u8"foo");
+static_assert (true, L"fo" "o");
+static_assert (true, u"fo" "o");
+static_assert (true, U"fo" "o");
+static_assert (true, u8"fo" "o");
+static_assert (true, "fo" L"o");
+static_assert (true, "fo" u"o");
+static_assert (true, "fo" U"o");
+static_assert (true, "fo" u8"o");
+static_assert (true, "\0");
+static_assert (true, "\17");
+static_assert (true, "\x20");
+static_assert (true, "\o{17}"); // { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+static_assert (true, "\x{20}"); // { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+static_assert (true, "\h"); // { dg-warning "unknown escape sequence" }
+
+extern "C" "+" "+" int f0 ();
+extern "C" int f1 ();
+extern "C" { int f2 (); };
+extern L"C" int f3 (); // { dg-error "a wide string is invalid in this context" }
+extern L"C" { int f4 (); } // { dg-error "a wide string is invalid in this context" }
+extern u"C" int f5 (); // { dg-error "a wide string is invalid in this context" }
+extern u"C" { int f6 (); } // { dg-error "a wide string is invalid in this context" }
+extern U"C" int f7 (); // { dg-error "a wide string is invalid in this context" }
+extern U"C" { int f8 (); } // { dg-error "a wide string is invalid in this context" }
+extern u8"C" int f9 (); // { dg-error "a wide string is invalid in this context" }
+extern u8"C" { int f10 (); } // { dg-error "a wide string is invalid in this context" }
+extern "\x43" int f11 ();
+extern "\x{43}" { int f12 (); } // { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+extern "\103" int f13 ();
+extern "\o{0103}" { int f14 (); } // { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+
+[[deprecated ("foo")]] int g0 ();
+[[deprecated ("foo" " " "bar")]] int g1 ();
+[[deprecated ("\u01FC\U000001FC\u{1FC}\N{LATIN CAPITAL LETTER AE WITH ACUTE}\\\'\"\?\a\b\f\n\r\t\v")]] int g2 ();
+// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } .-1 }
+// { dg-warning "named universal character escapes are only valid in" "" { target c++20_down } .-2 }
+[[deprecated (L"foo")]] int g3 ();
+[[deprecated (u"foo")]] int g4 ();
+[[deprecated (U"foo")]] int g5 ();
+[[deprecated (u8"foo")]] int g6 ();
+[[deprecated (L"fo" "o")]] int g7 ();
+[[deprecated (u"fo" "o")]] int g8 ();
+[[deprecated (U"fo" "o")]] int g9 ();
+[[deprecated (u8"fo" "o")]] int g10 ();
+[[deprecated ("fo" L"o")]] int g11 ();
+[[deprecated ("fo" u"o")]] int g12 ();
+[[deprecated ("fo" U"o")]] int g13 ();
+[[deprecated ("fo" u8"o")]] int g14 ();
+[[deprecated ("\0")]] int g15 ();
+[[deprecated ("\17")]] int g16 ();
+[[deprecated ("\x20")]] int g17 ();
+[[deprecated ("\o{17}")]] int g18 (); // { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+[[deprecated ("\x{20}")]] int g19 (); // { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+[[deprecated ("\h")]] int g20 (); // { dg-warning "unknown escape sequence" }
+
+[[nodiscard ("foo")]] int h0 ();
+[[nodiscard ("foo" " " "bar")]] int h1 ();
+[[nodiscard ("\u01FC\U000001FC\u{1FC}\N{LATIN CAPITAL LETTER AE WITH ACUTE}\\\'\"\?\a\b\f\n\r\t\v")]] int h2 ();
+// { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } .-1 }
+// { dg-warning "named universal character escapes are only valid in" "" { target c++20_down } .-2 }
+[[nodiscard (L"foo")]] int h3 ();
+[[nodiscard (u"foo")]] int h4 ();
+[[nodiscard (U"foo")]] int h5 ();
+[[nodiscard (u8"foo")]] int h6 ();
+[[nodiscard (L"fo" "o")]] int h7 ();
+[[nodiscard (u"fo" "o")]] int h8 ();
+[[nodiscard (U"fo" "o")]] int h9 ();
+[[nodiscard (u8"fo" "o")]] int h10 ();
+[[nodiscard ("fo" L"o")]] int h11 ();
+[[nodiscard ("fo" u"o")]] int h12 ();
+[[nodiscard ("fo" U"o")]] int h13 ();
+[[nodiscard ("fo" u8"o")]] int h14 ();
+[[nodiscard ("\0")]] int h15 ();
+[[nodiscard ("\17")]] int h16 ();
+[[nodiscard ("\x20")]] int h17 ();
+[[nodiscard ("\o{17}")]] int h18 (); // { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+[[nodiscard ("\x{20}")]] int h19 (); // { dg-warning "delimited escape sequences are only valid in" "" { target c++20_down } }
+[[nodiscard ("\h")]] int h20 (); // { dg-warning "unknown escape sequence" }
+
+float operator "" _my0 (const char *);
+float operator "" "" _my1 (const char *);
+float operator L"" _my2 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator u"" _my3 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator U"" _my4 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator u8"" _my5 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator L"" "" _my6 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator u"" "" _my7 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator U"" "" _my8 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator u8"" "" _my9 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator "" L"" _my10 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator "" u"" _my11 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator "" U"" _my12 (const char *); // { dg-error "invalid encoding prefix in literal operator" }
+float operator "" u8"" _my13 (const char *); // { dg-error "invalid encoding prefix in literal operator" "" { target c++20 } }
+float operator "\0" _my14 (const char *); // { dg-error "expected empty string after 'operator' keyword" }
+float operator "\x00" _my15 (const char *); // { dg-error "expected empty string after 'operator' keyword" }
+float operator "\h" _my16 (const char *); // { dg-error "expected empty string after 'operator' keyword" }
+                                          // { dg-warning "unknown escape sequence" "" { target *-*-* } .-1 }
--- gcc/testsuite/g++.dg/cpp0x/udlit-error1.C.jj 2023-01-26 22:03:00.657122433 +0100
+++ gcc/testsuite/g++.dg/cpp0x/udlit-error1.C 2023-08-24 15:46:18.149708095 +0200
@@ -13,7 +13,7 @@ void operator""_x(const char *, decltype
 extern "C"_x { void g(); } // { dg-error "before user-defined string literal" }
 static_assert(true, "foo"_x); // { dg-error "string literal with user-defined suffix is invalid in this context|expected" }

-[[deprecated("oof"_x)]]
+[[deprecated("oof"_x)]] // { dg-error "string literal with user-defined suffix is invalid in this context" "" { target c++26 } }
 void
 lol () // { dg-error "not a string" }
 {
--- libcpp/include/cpplib.h.jj 2023-08-22 16:12:27.709260416 +0200
+++ libcpp/include/cpplib.h 2023-08-23 11:24:56.100650548 +0200
@@ -129,17 +129,18 @@ struct _cpp_file;
   TK(UTF8STRING,        LITERAL) /* u8"string" */                       \
   TK(OBJC_STRING,       LITERAL) /* @"string" - Objective-C */          \
   TK(HEADER_NAME,       LITERAL) /* in #include */                      \
+  TK(UNEVAL_STRING,     LITERAL) /* unevaluated "string" - C++26 */     \
                                                                         \
-  TK(CHAR_USERDEF,      LITERAL) /* 'char'_suffix - C++-0x */           \
-  TK(WCHAR_USERDEF,     LITERAL) /* L'char'_suffix - C++-0x */          \
-  TK(CHAR16_USERDEF,    LITERAL) /* u'char'_suffix - C++-0x */          \
-  TK(CHAR32_USERDEF,    LITERAL) /* U'char'_suffix - C++-0x */          \
-  TK(UTF8CHAR_USERDEF,  LITERAL) /* u8'char'_suffix - C++-0x */         \
-  TK(STRING_USERDEF,    LITERAL) /* "string"_suffix - C++-0x */         \
-  TK(WSTRING_USERDEF,   LITERAL) /* L"string"_suffix - C++-0x */        \
-  TK(STRING16_USERDEF,  LITERAL) /* u"string"_suffix - C++-0x */        \
-  TK(STRING32_USERDEF,  LITERAL) /* U"string"_suffix - C++-0x */        \
-  TK(UTF8STRING_USERDEF,LITERAL) /* u8"string"_suffix - C++-0x */       \
+  TK(CHAR_USERDEF,      LITERAL) /* 'char'_suffix - C++11 */            \
+  TK(WCHAR_USERDEF,     LITERAL) /* L'char'_suffix - C++11 */           \
+  TK(CHAR16_USERDEF,    LITERAL) /* u'char'_suffix - C++11 */           \
+  TK(CHAR32_USERDEF,    LITERAL) /* U'char'_suffix - C++11 */           \
+  TK(UTF8CHAR_USERDEF,  LITERAL) /* u8'char'_suffix - C++11 */          \
+  TK(STRING_USERDEF,    LITERAL) /* "string"_suffix - C++11 */          \
+  TK(WSTRING_USERDEF,   LITERAL) /* L"string"_suffix - C++11 */         \
+  TK(STRING16_USERDEF,  LITERAL) /* u"string"_suffix - C++11 */         \
+  TK(STRING32_USERDEF,  LITERAL) /* U"string"_suffix - C++11 */         \
+  TK(UTF8STRING_USERDEF,LITERAL) /* u8"string"_suffix - C++11 */        \
                                                                         \
   TK(COMMENT,           LITERAL) /* Only if output comments.  */        \
                                  /* SPELL_LITERAL happens to DTRT.  */  \
--- libcpp/charset.cc.jj 2023-07-11 13:40:40.398430000 +0200
+++ libcpp/charset.cc 2023-08-23 12:56:48.926671275 +0200
@@ -2156,7 +2156,7 @@ static const uchar *
 convert_escape (cpp_reader *pfile, const uchar *from, const uchar *limit,
                 struct _cpp_strbuf *tbuf, struct cset_converter cvt,
                 cpp_string_location_reader *loc_reader,
-                cpp_substring_ranges *ranges)
+                cpp_substring_ranges *ranges, bool uneval)
 {
   /* Values of \a \b \e \f \n \r \t \v respectively.  */
 #if HOST_CHARSET == HOST_CHARSET_ASCII
@@ -2183,12 +2183,20 @@ convert_escape (cpp_reader *pfile, const
                           char_range, loc_reader, ranges);

     case 'x':
+      if (uneval && CPP_PEDANTIC (pfile))
+        cpp_error (pfile, CPP_DL_PEDWARN,
+                   "numeric escape sequence in unevaluated string: "
+                   "'\\%c'", (int) c);
       return convert_hex (pfile, from, limit, tbuf, cvt,
                           char_range, loc_reader, ranges);

     case '0':  case '1':  case '2':  case '3':
     case '4':  case '5':  case '6':  case '7':
     case 'o':
+      if (uneval && CPP_PEDANTIC (pfile))
+        cpp_error (pfile, CPP_DL_PEDWARN,
+                   "numeric escape sequence in unevaluated string: "
+                   "'\\%c'", (int) c);
       return convert_oct (pfile, from, limit, tbuf, cvt,
                           char_range, loc_reader, ranges);

@@ -2296,7 +2304,7 @@ converter_for_type (cpp_reader *pfile, e
 static bool
 cpp_interpret_string_1 (cpp_reader *pfile, const cpp_string *from,
                         size_t count,
-                        cpp_string *to, enum cpp_ttype type,
+                        cpp_string *to, enum cpp_ttype type,
                         cpp_string_location_reader *loc_readers,
                         cpp_substring_ranges *out)
 {
@@ -2427,7 +2435,7 @@ cpp_interpret_string_1 (cpp_reader *pfil
               struct _cpp_strbuf *tbuf_ptr = to ? &tbuf : NULL;

               p = convert_escape (pfile, p + 1, limit, tbuf_ptr, cvt,
-                                  loc_reader, out);
+                                  loc_reader, out, type == CPP_UNEVAL_STRING);
             }
         }

@@ -2465,7 +2473,7 @@ cpp_interpret_string_1 (cpp_reader *pfil
    false for failure.  */
 bool
 cpp_interpret_string (cpp_reader *pfile, const cpp_string *from, size_t count,
-                      cpp_string *to, enum cpp_ttype type)
+                      cpp_string *to, enum cpp_ttype type)
 {
   return cpp_interpret_string_1 (pfile, from, count, to, type, NULL, NULL);
 }
@@ -2548,7 +2556,7 @@ cpp_interpret_string_ranges (cpp_reader
 bool
 cpp_interpret_string_notranslate (cpp_reader *pfile, const cpp_string *from,
                                   size_t count, cpp_string *to,
-                                  enum cpp_ttype type ATTRIBUTE_UNUSED)
+                                  enum cpp_ttype type)
 {
   struct cset_converter save_narrow_cset_desc = pfile->narrow_cset_desc;
   bool retval;
@@ -2557,7 +2565,9 @@ cpp_interpret_string_notranslate (cpp_re
   pfile->narrow_cset_desc.cd = (iconv_t) -1;
   pfile->narrow_cset_desc.width = CPP_OPTION (pfile, char_precision);

-  retval = cpp_interpret_string (pfile, from, count, to, CPP_STRING);
+  retval = cpp_interpret_string (pfile, from, count, to,
+                                 type == CPP_UNEVAL_STRING
+                                 ? CPP_UNEVAL_STRING : CPP_STRING);

   pfile->narrow_cset_desc = save_narrow_cset_desc;
   return retval;
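
For anyone skimming the libcpp part: the two new checks in convert_escape fire on the character that follows the backslash, namely 'x', '0' through '7' and 'o'. The snippet below is only a rough standalone model of that classification, not GCC code. The names is_numeric_escape and has_numeric_escape are made up for illustration, and the sketch ignores the uneval and CPP_PEDANTIC conditions as well as whether 'o' is actually followed by '{'.

```cpp
// Hypothetical helpers, not part of GCC or libcpp.  They model which escapes
// the patched convert_escape diagnoses when UNEVAL is true and pedantic.

// True for the character after a backslash in the switch cases that gained
// the "numeric escape sequence in unevaluated string" pedwarn.
constexpr bool
is_numeric_escape (char c)
{
  return c == 'x' || c == 'o' || (c >= '0' && c <= '7');
}

// Scan the source spelling of a string literal body (the text between the
// quotes) and report whether any escape in it is a numeric one.  Written
// recursively so that it is still a valid constexpr function in C++11.
constexpr bool
has_numeric_escape (const char *s)
{
  return *s == '\0' ? false                          // end of spelling
         : *s != '\\' ? has_numeric_escape (s + 1)   // ordinary character
         : s[1] == '\0' ? false                      // stray backslash
         : is_numeric_escape (s[1]) ? true           // \x.. \0..\7 \o{..}
         : has_numeric_escape (s + 2);               // other escape, skip it
}

// Raw strings keep the backslashes, so the helpers see the source spelling.
static_assert (has_numeric_escape (R"(\x20)"), "hexadecimal escape");
static_assert (has_numeric_escape (R"(\17)"), "octal escape");
static_assert (has_numeric_escape (R"(\o{17})"), "delimited octal escape");
static_assert (!has_numeric_escape (R"(\u01FC\n\t\\)"), "no numeric escape");
```

The cases mirror the expectations in unevalstr1.C: "\x20", "\17" and "\o{17}" are diagnosed, while universal character names and simple escapes such as "\n" are accepted.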