From patchwork Mon Jan 15 20:45:33 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Wakely
X-Patchwork-Id: 188343
From: Jonathan Wakely
To: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: [PATCH v3] libstdc++: Implement C++26 std::text_encoding (P1885R12) [PR113318]
Date: Mon, 15 Jan 2024 20:45:33 +0000
Message-ID: <20240115204803.1550804-1-jwakely@redhat.com>
In-Reply-To: <20240113124834.1296437-1-jwakely@redhat.com>
References: <20240113124834.1296437-1-jwakely@redhat.com>

I think I'm happy with this now. It has tests for all the new functions,
and the performance of the charset alias match algorithm is improved by
reusing part of .

Tested x86_64-linux.

-- >8 --

This is another C++26 change, approved in Varna 2023. We require a new
static array of data that is extracted from the IANA Character Sets
database. A new Python script to generate a header from the IANA CSV
file is added.

libstdc++-v3/ChangeLog:

	PR libstdc++/113318
	* acinclude.m4 (GLIBCXX_CONFIGURE): Add c++26 directory.
	(GLIBCXX_CHECK_TEXT_ENCODING): Define.
	* config.h.in: Regenerate.
	* configure: Regenerate.
	* configure.ac: Use GLIBCXX_CHECK_TEXT_ENCODING.
	* include/Makefile.am: Add new headers.
	* include/Makefile.in: Regenerate.
	* include/bits/locale_classes.h (locale::encoding): Declare new
	member function.
	* include/bits/unicode.h (__charset_alias_match): New function.
	* include/bits/text_encoding-data.h: New file.
	* include/bits/version.def (text_encoding): Define.
	* include/bits/version.h: Regenerate.
	* include/std/text_encoding: New file.
	* src/Makefile.am: Add new subdirectory.
	* src/Makefile.in: Regenerate.
	* src/c++26/Makefile.am: New file.
	* src/c++26/Makefile.in: New file.
	* src/c++26/text_encoding.cc: New file.
	* src/experimental/Makefile.am: Include c++26 convenience library.
	* src/experimental/Makefile.in: Regenerate.
	* python/libstdcxx/v6/printers.py (StdTextEncodingPrinter): New
	printer.
	* scripts/gen_text_encoding_data.py: New file.
	* testsuite/22_locale/locale/encoding.cc: New test.
	* testsuite/ext/unicode/charset_alias_match.cc: New test.
	* testsuite/std/text_encoding/cons.cc: New test.
	* testsuite/std/text_encoding/members.cc: New test.
	* testsuite/std/text_encoding/requirements.cc: New test.
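
For readers unfamiliar with the charset alias match mentioned above, the following is a minimal sketch of loose name matching, assuming the rules of Unicode Technical Standard #22 section 1.4: drop everything except ASCII letters and digits, fold case, and drop each '0' that is not preceded by a digit. The names normalize_charset_name and charset_names_match are hypothetical, chosen for illustration; this is not the __charset_alias_match code added to bits/unicode.h, and it assumes an ASCII-compatible execution character set.

```cpp
#include <string>
#include <string_view>

// ASCII-only classification, so the result does not depend on the C locale.
constexpr bool is_ascii_digit(char c) { return c >= '0' && c <= '9'; }
constexpr bool is_ascii_alpha(char c)
{ return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }
constexpr char to_ascii_lower(char c)
{ return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c; }

// Hypothetical helper: reduce a charset name to a comparison key.
std::string normalize_charset_name(std::string_view name)
{
  std::string key;
  bool prev_digit = false; // was the last character kept in the key a digit?
  for (char c : name)
    {
      if (!is_ascii_digit(c) && !is_ascii_alpha(c))
        continue;                  // ignore punctuation, spaces, etc.
      if (c == '0' && !prev_digit)
        continue;                  // ignore a '0' not preceded by a digit
      prev_digit = is_ascii_digit(c);
      key += to_ascii_lower(c);    // compare case-insensitively
    }
  return key;
}

// Hypothetical helper: two names match if their keys are identical.
bool charset_names_match(std::string_view a, std::string_view b)
{
  return normalize_charset_name(a) == normalize_charset_name(b);
}
```

Under these rules "UTF-8", "utf8" and "u.t.f-008" all reduce to the same key, while "utf-80" and "ut8" do not. Building a temporary string per comparison is the simplest formulation, not the fastest; a production version would more likely walk both names in lockstep.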
---
 libstdc++-v3/acinclude.m4                    |  30 +-
 libstdc++-v3/config.h.in                     |   3 +
 libstdc++-v3/configure                       |  70 +-
 libstdc++-v3/configure.ac                    |   3 +
 libstdc++-v3/include/Makefile.am             |   2 +
 libstdc++-v3/include/Makefile.in             |   2 +
 libstdc++-v3/include/bits/locale_classes.h   |  14 +
 .../include/bits/text_encoding-data.h        | 902 ++++++++++++++++++
 libstdc++-v3/include/bits/unicode.h          |  53 +-
 libstdc++-v3/include/bits/version.def        |  10 +
 libstdc++-v3/include/bits/version.h          |  13 +-
 libstdc++-v3/include/std/text_encoding       | 704 ++++++++++++++
 libstdc++-v3/python/libstdcxx/v6/printers.py |  17 +
 .../scripts/gen_text_encoding_data.py        |  70 ++
 libstdc++-v3/src/Makefile.am                 |   3 +-
 libstdc++-v3/src/Makefile.in                 |   7 +-
 libstdc++-v3/src/c++26/Makefile.am           | 109 +++
 libstdc++-v3/src/c++26/Makefile.in           | 747 +++++++++++++++
 libstdc++-v3/src/c++26/text_encoding.cc      |  91 ++
 libstdc++-v3/src/experimental/Makefile.am    |   2 +
 libstdc++-v3/src/experimental/Makefile.in    |   2 +
 .../testsuite/22_locale/locale/encoding.cc   |  36 +
 .../ext/unicode/charset_alias_match.cc       |  18 +
 .../testsuite/std/text_encoding/cons.cc      | 113 +++
 .../testsuite/std/text_encoding/members.cc   |  41 +
 .../std/text_encoding/requirements.cc        |  31 +
 26 files changed, 3083 insertions(+), 10 deletions(-)
 create mode 100644 libstdc++-v3/include/bits/text_encoding-data.h
 create mode 100644 libstdc++-v3/include/std/text_encoding
 create mode 100755 libstdc++-v3/scripts/gen_text_encoding_data.py
 create mode 100644 libstdc++-v3/src/c++26/Makefile.am
 create mode 100644 libstdc++-v3/src/c++26/Makefile.in
 create mode 100644 libstdc++-v3/src/c++26/text_encoding.cc
 create mode 100644 libstdc++-v3/testsuite/22_locale/locale/encoding.cc
 create mode 100644 libstdc++-v3/testsuite/ext/unicode/charset_alias_match.cc
 create mode 100644 libstdc++-v3/testsuite/std/text_encoding/cons.cc
 create mode 100644 libstdc++-v3/testsuite/std/text_encoding/members.cc
 create mode 100644 libstdc++-v3/testsuite/std/text_encoding/requirements.cc

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index e7cbf0fcf96..f9ba7ef744b 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -49,7 +49,7 @@ AC_DEFUN([GLIBCXX_CONFIGURE], [
   # Keep these sync'd with the list in Makefile.am. The first provides an
   # expandable list at autoconf time; the second provides an expandable list
   # (i.e., shell variable) at configure time.
-  m4_define([glibcxx_SUBDIRS],[include libsupc++ src src/c++98 src/c++11 src/c++17 src/c++20 src/c++23 src/filesystem src/libbacktrace src/experimental doc po testsuite python])
+  m4_define([glibcxx_SUBDIRS],[include libsupc++ src src/c++98 src/c++11 src/c++17 src/c++20 src/c++23 src/c++26 src/filesystem src/libbacktrace src/experimental doc po testsuite python])
   SUBDIRS='glibcxx_SUBDIRS'
 
   # These need to be absolute paths, yet at the same time need to
@@ -5821,6 +5821,34 @@ AC_LANG_SAVE
   AC_LANG_RESTORE
 ])
 
+dnl
+dnl Check whether the dependencies for std::text_encoding are available.
+dnl
+dnl Defines:
+dnl _GLIBCXX_USE_NL_LANGINFO_L if nl_langinfo_l is in .
+dnl
+AC_DEFUN([GLIBCXX_CHECK_TEXT_ENCODING], [
+AC_LANG_SAVE
+  AC_LANG_CPLUSPLUS
+
+  AC_MSG_CHECKING([whether nl_langinfo_l is defined in ])
+  AC_TRY_COMPILE([
+  #include
+  #include
+  ],[
+    locale_t loc = newlocale(LC_ALL_MASK, "", (locale_t)0);
+    const char* enc = nl_langinfo_l(CODESET, loc);
+    freelocale(loc);
+  ], [ac_nl_langinfo_l=yes], [ac_nl_langinfo_l=no])
+  AC_MSG_RESULT($ac_nl_langinfo_l)
+  if test "$ac_nl_langinfo_l" = yes; then
+    AC_DEFINE_UNQUOTED(_GLIBCXX_USE_NL_LANGINFO_L, 1,
+      [Define if nl_langinfo_l should be used for std::text_encoding.])
+  fi
+
+  AC_LANG_RESTORE
+])
+
 # Macros from the top-level gcc directory.
 m4_include([../config/gc++filt.m4])
 m4_include([../config/tls.m4])
diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
index c8b36333019..c68cac4f345 100644
--- a/libstdc++-v3/configure.ac
+++ b/libstdc++-v3/configure.ac
@@ -557,6 +557,9 @@ GLIBCXX_CHECK_INIT_PRIORITY
 # For __basic_file::native_handle()
 GLIBCXX_CHECK_FILEBUF_NATIVE_HANDLES
 
+# For std::text_encoding
+GLIBCXX_CHECK_TEXT_ENCODING
+
 # Define documentation rules conditionally.
 
 # See if makeinfo has been installed and is modern enough
diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index c6d6a24eb9e..64152351ed0 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -104,6 +104,7 @@ std_headers = \
 	${std_srcdir}/streambuf \
 	${std_srcdir}/string \
 	${std_srcdir}/system_error \
+	${std_srcdir}/text_encoding \
 	${std_srcdir}/thread \
 	${std_srcdir}/unordered_map \
 	${std_srcdir}/unordered_set \
@@ -159,6 +160,7 @@ bits_freestanding = \
 	${bits_srcdir}/stl_raw_storage_iter.h \
 	${bits_srcdir}/stl_relops.h \
 	${bits_srcdir}/stl_uninitialized.h \
+	${bits_srcdir}/text_encoding-data.h \
 	${bits_srcdir}/version.h \
 	${bits_srcdir}/string_view.tcc \
 	${bits_srcdir}/unicode.h \
diff --git a/libstdc++-v3/include/bits/locale_classes.h b/libstdc++-v3/include/bits/locale_classes.h
index 621f2a29f50..a2e94217006 100644
--- a/libstdc++-v3/include/bits/locale_classes.h
+++ b/libstdc++-v3/include/bits/locale_classes.h
@@ -40,6 +40,10 @@
 #include
 #include
 
+#ifdef __glibcxx_text_encoding
+#include
+#endif
+
 namespace std _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
@@ -248,6 +252,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
     string
     name() const;
 
+#ifdef __glibcxx_text_encoding
+# if __CHAR_BIT__ == 8
+    text_encoding
+    encoding() const;
+# else
+    text_encoding
+    encoding() const = delete;
+# endif
+#endif
+
     /**
      * @brief Locale equality.
* diff --git a/libstdc++-v3/include/bits/text_encoding-data.h b/libstdc++-v3/include/bits/text_encoding-data.h new file mode 100644 index 00000000000..7ac2e9dc3d9 --- /dev/null +++ b/libstdc++-v3/include/bits/text_encoding-data.h @@ -0,0 +1,902 @@ +// Generated by gen_text_encoding_data.py, do not edit. + +#ifndef _GLIBCXX_GET_ENCODING_DATA +# error "This is not a public header, do not include it directly" +#endif + + { 3, "US-ASCII" }, + { 3, "iso-ir-6" }, + { 3, "ANSI_X3.4-1968" }, + { 3, "ANSI_X3.4-1986" }, + { 3, "ISO_646.irv:1991" }, + { 3, "ISO646-US" }, + { 3, "us" }, + { 3, "IBM367" }, + { 3, "cp367" }, + { 3, "csASCII" }, + { 4, "ISO_8859-1:1987" }, + { 4, "iso-ir-100" }, + { 4, "ISO_8859-1" }, + { 4, "ISO-8859-1" }, + { 4, "latin1" }, + { 4, "l1" }, + { 4, "IBM819" }, + { 4, "CP819" }, + { 4, "csISOLatin1" }, + { 5, "ISO_8859-2:1987" }, + { 5, "iso-ir-101" }, + { 5, "ISO_8859-2" }, + { 5, "ISO-8859-2" }, + { 5, "latin2" }, + { 5, "l2" }, + { 5, "csISOLatin2" }, + { 6, "ISO_8859-3:1988" }, + { 6, "iso-ir-109" }, + { 6, "ISO_8859-3" }, + { 6, "ISO-8859-3" }, + { 6, "latin3" }, + { 6, "l3" }, + { 6, "csISOLatin3" }, + { 7, "ISO_8859-4:1988" }, + { 7, "iso-ir-110" }, + { 7, "ISO_8859-4" }, + { 7, "ISO-8859-4" }, + { 7, "latin4" }, + { 7, "l4" }, + { 7, "csISOLatin4" }, + { 8, "ISO_8859-5:1988" }, + { 8, "iso-ir-144" }, + { 8, "ISO_8859-5" }, + { 8, "ISO-8859-5" }, + { 8, "cyrillic" }, + { 8, "csISOLatinCyrillic" }, + { 9, "ISO_8859-6:1987" }, + { 9, "iso-ir-127" }, + { 9, "ISO_8859-6" }, + { 9, "ISO-8859-6" }, + { 9, "ECMA-114" }, + { 9, "ASMO-708" }, + { 9, "arabic" }, + { 9, "csISOLatinArabic" }, + { 10, "ISO_8859-7:1987" }, + { 10, "iso-ir-126" }, + { 10, "ISO_8859-7" }, + { 10, "ISO-8859-7" }, + { 10, "ELOT_928" }, + { 10, "ECMA-118" }, + { 10, "greek" }, + { 10, "greek8" }, + { 10, "csISOLatinGreek" }, + { 11, "ISO_8859-8:1988" }, + { 11, "iso-ir-138" }, + { 11, "ISO_8859-8" }, + { 11, "ISO-8859-8" }, + { 11, "hebrew" }, + { 11, "csISOLatinHebrew" }, + { 
12, "ISO_8859-9:1989" }, + { 12, "iso-ir-148" }, + { 12, "ISO_8859-9" }, + { 12, "ISO-8859-9" }, + { 12, "latin5" }, + { 12, "l5" }, + { 12, "csISOLatin5" }, + { 13, "ISO-8859-10" }, + { 13, "iso-ir-157" }, + { 13, "l6" }, + { 13, "ISO_8859-10:1992" }, + { 13, "csISOLatin6" }, + { 13, "latin6" }, + { 14, "ISO_6937-2-add" }, + { 14, "iso-ir-142" }, + { 14, "csISOTextComm" }, + { 15, "JIS_X0201" }, + { 15, "X0201" }, + { 15, "csHalfWidthKatakana" }, + { 16, "JIS_Encoding" }, + { 16, "csJISEncoding" }, + { 17, "Shift_JIS" }, + { 17, "MS_Kanji" }, + { 17, "csShiftJIS" }, + { 18, "Extended_UNIX_Code_Packed_Format_for_Japanese" }, + { 18, "csEUCPkdFmtJapanese" }, + { 18, "EUC-JP" }, + { 19, "Extended_UNIX_Code_Fixed_Width_for_Japanese" }, + { 19, "csEUCFixWidJapanese" }, + { 20, "BS_4730" }, + { 20, "iso-ir-4" }, + { 20, "ISO646-GB" }, + { 20, "gb" }, + { 20, "uk" }, + { 20, "csISO4UnitedKingdom" }, + { 21, "SEN_850200_C" }, + { 21, "iso-ir-11" }, + { 21, "ISO646-SE2" }, + { 21, "se2" }, + { 21, "csISO11SwedishForNames" }, + { 22, "IT" }, + { 22, "iso-ir-15" }, + { 22, "ISO646-IT" }, + { 22, "csISO15Italian" }, + { 23, "ES" }, + { 23, "iso-ir-17" }, + { 23, "ISO646-ES" }, + { 23, "csISO17Spanish" }, + { 24, "DIN_66003" }, + { 24, "iso-ir-21" }, + { 24, "de" }, + { 24, "ISO646-DE" }, + { 24, "csISO21German" }, + { 25, "NS_4551-1" }, + { 25, "iso-ir-60" }, + { 25, "ISO646-NO" }, + { 25, "no" }, + { 25, "csISO60DanishNorwegian" }, + { 25, "csISO60Norwegian1" }, + { 26, "NF_Z_62-010" }, + { 26, "iso-ir-69" }, + { 26, "ISO646-FR" }, + { 26, "fr" }, + { 26, "csISO69French" }, + { 27, "ISO-10646-UTF-1" }, + { 27, "csISO10646UTF1" }, + { 28, "ISO_646.basic:1983" }, + { 28, "ref" }, + { 28, "csISO646basic1983" }, + { 29, "INVARIANT" }, + { 29, "csINVARIANT" }, + { 30, "ISO_646.irv:1983" }, + { 30, "iso-ir-2" }, + { 30, "irv" }, + { 30, "csISO2IntlRefVersion" }, + { 31, "NATS-SEFI" }, + { 31, "iso-ir-8-1" }, + { 31, "csNATSSEFI" }, + { 32, "NATS-SEFI-ADD" }, + { 32, "iso-ir-8-2" 
}, + { 32, "csNATSSEFIADD" }, + { 35, "SEN_850200_B" }, + { 35, "iso-ir-10" }, + { 35, "FI" }, + { 35, "ISO646-FI" }, + { 35, "ISO646-SE" }, + { 35, "se" }, + { 35, "csISO10Swedish" }, + { 36, "KS_C_5601-1987" }, + { 36, "iso-ir-149" }, + { 36, "KS_C_5601-1989" }, + { 36, "KSC_5601" }, + { 36, "korean" }, + { 36, "csKSC56011987" }, + { 37, "ISO-2022-KR" }, + { 37, "csISO2022KR" }, + { 38, "EUC-KR" }, + { 38, "csEUCKR" }, + { 39, "ISO-2022-JP" }, + { 39, "csISO2022JP" }, + { 40, "ISO-2022-JP-2" }, + { 40, "csISO2022JP2" }, + { 41, "JIS_C6220-1969-jp" }, + { 41, "JIS_C6220-1969" }, + { 41, "iso-ir-13" }, + { 41, "katakana" }, + { 41, "x0201-7" }, + { 41, "csISO13JISC6220jp" }, + { 42, "JIS_C6220-1969-ro" }, + { 42, "iso-ir-14" }, + { 42, "jp" }, + { 42, "ISO646-JP" }, + { 42, "csISO14JISC6220ro" }, + { 43, "PT" }, + { 43, "iso-ir-16" }, + { 43, "ISO646-PT" }, + { 43, "csISO16Portuguese" }, + { 44, "greek7-old" }, + { 44, "iso-ir-18" }, + { 44, "csISO18Greek7Old" }, + { 45, "latin-greek" }, + { 45, "iso-ir-19" }, + { 45, "csISO19LatinGreek" }, + { 46, "NF_Z_62-010_(1973)" }, + { 46, "iso-ir-25" }, + { 46, "ISO646-FR1" }, + { 46, "csISO25French" }, + { 47, "Latin-greek-1" }, + { 47, "iso-ir-27" }, + { 47, "csISO27LatinGreek1" }, + { 48, "ISO_5427" }, + { 48, "iso-ir-37" }, + { 48, "csISO5427Cyrillic" }, + { 49, "JIS_C6226-1978" }, + { 49, "iso-ir-42" }, + { 49, "csISO42JISC62261978" }, + { 50, "BS_viewdata" }, + { 50, "iso-ir-47" }, + { 50, "csISO47BSViewdata" }, + { 51, "INIS" }, + { 51, "iso-ir-49" }, + { 51, "csISO49INIS" }, + { 52, "INIS-8" }, + { 52, "iso-ir-50" }, + { 52, "csISO50INIS8" }, + { 53, "INIS-cyrillic" }, + { 53, "iso-ir-51" }, + { 53, "csISO51INISCyrillic" }, + { 54, "ISO_5427:1981" }, + { 54, "iso-ir-54" }, + { 54, "ISO5427Cyrillic1981" }, + { 54, "csISO54271981" }, + { 55, "ISO_5428:1980" }, + { 55, "iso-ir-55" }, + { 55, "csISO5428Greek" }, + { 56, "GB_1988-80" }, + { 56, "iso-ir-57" }, + { 56, "cn" }, + { 56, "ISO646-CN" }, + { 56, "csISO57GB1988" 
}, + { 57, "GB_2312-80" }, + { 57, "iso-ir-58" }, + { 57, "chinese" }, + { 57, "csISO58GB231280" }, + { 58, "NS_4551-2" }, + { 58, "ISO646-NO2" }, + { 58, "iso-ir-61" }, + { 58, "no2" }, + { 58, "csISO61Norwegian2" }, + { 59, "videotex-suppl" }, + { 59, "iso-ir-70" }, + { 59, "csISO70VideotexSupp1" }, + { 60, "PT2" }, + { 60, "iso-ir-84" }, + { 60, "ISO646-PT2" }, + { 60, "csISO84Portuguese2" }, + { 61, "ES2" }, + { 61, "iso-ir-85" }, + { 61, "ISO646-ES2" }, + { 61, "csISO85Spanish2" }, + { 62, "MSZ_7795.3" }, + { 62, "iso-ir-86" }, + { 62, "ISO646-HU" }, + { 62, "hu" }, + { 62, "csISO86Hungarian" }, + { 63, "JIS_C6226-1983" }, + { 63, "iso-ir-87" }, + { 63, "x0208" }, + { 63, "JIS_X0208-1983" }, + { 63, "csISO87JISX0208" }, + { 64, "greek7" }, + { 64, "iso-ir-88" }, + { 64, "csISO88Greek7" }, + { 65, "ASMO_449" }, + { 65, "ISO_9036" }, + { 65, "arabic7" }, + { 65, "iso-ir-89" }, + { 65, "csISO89ASMO449" }, + { 66, "iso-ir-90" }, + { 66, "csISO90" }, + { 67, "JIS_C6229-1984-a" }, + { 67, "iso-ir-91" }, + { 67, "jp-ocr-a" }, + { 67, "csISO91JISC62291984a" }, + { 68, "JIS_C6229-1984-b" }, + { 68, "iso-ir-92" }, + { 68, "ISO646-JP-OCR-B" }, + { 68, "jp-ocr-b" }, + { 68, "csISO92JISC62991984b" }, + { 69, "JIS_C6229-1984-b-add" }, + { 69, "iso-ir-93" }, + { 69, "jp-ocr-b-add" }, + { 69, "csISO93JIS62291984badd" }, + { 70, "JIS_C6229-1984-hand" }, + { 70, "iso-ir-94" }, + { 70, "jp-ocr-hand" }, + { 70, "csISO94JIS62291984hand" }, + { 71, "JIS_C6229-1984-hand-add" }, + { 71, "iso-ir-95" }, + { 71, "jp-ocr-hand-add" }, + { 71, "csISO95JIS62291984handadd" }, + { 72, "JIS_C6229-1984-kana" }, + { 72, "iso-ir-96" }, + { 72, "csISO96JISC62291984kana" }, + { 73, "ISO_2033-1983" }, + { 73, "iso-ir-98" }, + { 73, "e13b" }, + { 73, "csISO2033" }, + { 74, "ANSI_X3.110-1983" }, + { 74, "iso-ir-99" }, + { 74, "CSA_T500-1983" }, + { 74, "NAPLPS" }, + { 74, "csISO99NAPLPS" }, + { 75, "T.61-7bit" }, + { 75, "iso-ir-102" }, + { 75, "csISO102T617bit" }, + { 76, "T.61-8bit" }, + { 76, 
"T.61" }, + { 76, "iso-ir-103" }, + { 76, "csISO103T618bit" }, + { 77, "ECMA-cyrillic" }, + { 77, "iso-ir-111" }, + { 77, "KOI8-E" }, + { 77, "csISO111ECMACyrillic" }, + { 78, "CSA_Z243.4-1985-1" }, + { 78, "iso-ir-121" }, + { 78, "ISO646-CA" }, + { 78, "csa7-1" }, + { 78, "csa71" }, + { 78, "ca" }, + { 78, "csISO121Canadian1" }, + { 79, "CSA_Z243.4-1985-2" }, + { 79, "iso-ir-122" }, + { 79, "ISO646-CA2" }, + { 79, "csa7-2" }, + { 79, "csa72" }, + { 79, "csISO122Canadian2" }, + { 80, "CSA_Z243.4-1985-gr" }, + { 80, "iso-ir-123" }, + { 80, "csISO123CSAZ24341985gr" }, + { 81, "ISO_8859-6-E" }, + { 81, "csISO88596E" }, + { 81, "ISO-8859-6-E" }, + { 82, "ISO_8859-6-I" }, + { 82, "csISO88596I" }, + { 82, "ISO-8859-6-I" }, + { 83, "T.101-G2" }, + { 83, "iso-ir-128" }, + { 83, "csISO128T101G2" }, + { 84, "ISO_8859-8-E" }, + { 84, "csISO88598E" }, + { 84, "ISO-8859-8-E" }, + { 85, "ISO_8859-8-I" }, + { 85, "csISO88598I" }, + { 85, "ISO-8859-8-I" }, + { 86, "CSN_369103" }, + { 86, "iso-ir-139" }, + { 86, "csISO139CSN369103" }, + { 87, "JUS_I.B1.002" }, + { 87, "iso-ir-141" }, + { 87, "ISO646-YU" }, + { 87, "js" }, + { 87, "yu" }, + { 87, "csISO141JUSIB1002" }, + { 88, "IEC_P27-1" }, + { 88, "iso-ir-143" }, + { 88, "csISO143IECP271" }, + { 89, "JUS_I.B1.003-serb" }, + { 89, "iso-ir-146" }, + { 89, "serbian" }, + { 89, "csISO146Serbian" }, + { 90, "JUS_I.B1.003-mac" }, + { 90, "macedonian" }, + { 90, "iso-ir-147" }, + { 90, "csISO147Macedonian" }, + { 91, "greek-ccitt" }, + { 91, "iso-ir-150" }, + { 91, "csISO150" }, + { 91, "csISO150GreekCCITT" }, + { 92, "NC_NC00-10:81" }, + { 92, "cuba" }, + { 92, "iso-ir-151" }, + { 92, "ISO646-CU" }, + { 92, "csISO151Cuba" }, + { 93, "ISO_6937-2-25" }, + { 93, "iso-ir-152" }, + { 93, "csISO6937Add" }, + { 94, "GOST_19768-74" }, + { 94, "ST_SEV_358-88" }, + { 94, "iso-ir-153" }, + { 94, "csISO153GOST1976874" }, + { 95, "ISO_8859-supp" }, + { 95, "iso-ir-154" }, + { 95, "latin1-2-5" }, + { 95, "csISO8859Supp" }, + { 96, "ISO_10367-box" }, 
+ { 96, "iso-ir-155" }, + { 96, "csISO10367Box" }, + { 97, "latin-lap" }, + { 97, "lap" }, + { 97, "iso-ir-158" }, + { 97, "csISO158Lap" }, + { 98, "JIS_X0212-1990" }, + { 98, "x0212" }, + { 98, "iso-ir-159" }, + { 98, "csISO159JISX02121990" }, + { 99, "DS_2089" }, + { 99, "DS2089" }, + { 99, "ISO646-DK" }, + { 99, "dk" }, + { 99, "csISO646Danish" }, + { 100, "us-dk" }, + { 100, "csUSDK" }, + { 101, "dk-us" }, + { 101, "csDKUS" }, + { 102, "KSC5636" }, + { 102, "ISO646-KR" }, + { 102, "csKSC5636" }, + { 103, "UNICODE-1-1-UTF-7" }, + { 103, "csUnicode11UTF7" }, + { 104, "ISO-2022-CN" }, + { 104, "csISO2022CN" }, + { 105, "ISO-2022-CN-EXT" }, + { 105, "csISO2022CNEXT" }, +#define _GLIBCXX_TEXT_ENCODING_UTF8_OFFSET 413 + { 106, "UTF-8" }, + { 106, "csUTF8" }, + { 109, "ISO-8859-13" }, + { 109, "csISO885913" }, + { 110, "ISO-8859-14" }, + { 110, "iso-ir-199" }, + { 110, "ISO_8859-14:1998" }, + { 110, "ISO_8859-14" }, + { 110, "latin8" }, + { 110, "iso-celtic" }, + { 110, "l8" }, + { 110, "csISO885914" }, + { 111, "ISO-8859-15" }, + { 111, "ISO_8859-15" }, + { 111, "Latin-9" }, + { 111, "csISO885915" }, + { 112, "ISO-8859-16" }, + { 112, "iso-ir-226" }, + { 112, "ISO_8859-16:2001" }, + { 112, "ISO_8859-16" }, + { 112, "latin10" }, + { 112, "l10" }, + { 112, "csISO885916" }, + { 113, "GBK" }, + { 113, "CP936" }, + { 113, "MS936" }, + { 113, "windows-936" }, + { 113, "csGBK" }, + { 114, "GB18030" }, + { 114, "csGB18030" }, + { 115, "OSD_EBCDIC_DF04_15" }, + { 115, "csOSDEBCDICDF0415" }, + { 116, "OSD_EBCDIC_DF03_IRV" }, + { 116, "csOSDEBCDICDF03IRV" }, + { 117, "OSD_EBCDIC_DF04_1" }, + { 117, "csOSDEBCDICDF041" }, + { 118, "ISO-11548-1" }, + { 118, "ISO_11548-1" }, + { 118, "ISO_TR_11548-1" }, + { 118, "csISO115481" }, + { 119, "KZ-1048" }, + { 119, "STRK1048-2002" }, + { 119, "RK1048" }, + { 119, "csKZ1048" }, + { 1000, "ISO-10646-UCS-2" }, + { 1000, "csUnicode" }, + { 1001, "ISO-10646-UCS-4" }, + { 1001, "csUCS4" }, + { 1002, "ISO-10646-UCS-Basic" }, + { 1002, 
"csUnicodeASCII" }, + { 1003, "ISO-10646-Unicode-Latin1" }, + { 1003, "csUnicodeLatin1" }, + { 1003, "ISO-10646" }, + { 1004, "ISO-10646-J-1" }, + { 1004, "csUnicodeJapanese" }, + { 1005, "ISO-Unicode-IBM-1261" }, + { 1005, "csUnicodeIBM1261" }, + { 1006, "ISO-Unicode-IBM-1268" }, + { 1006, "csUnicodeIBM1268" }, + { 1007, "ISO-Unicode-IBM-1276" }, + { 1007, "csUnicodeIBM1276" }, + { 1008, "ISO-Unicode-IBM-1264" }, + { 1008, "csUnicodeIBM1264" }, + { 1009, "ISO-Unicode-IBM-1265" }, + { 1009, "csUnicodeIBM1265" }, + { 1010, "UNICODE-1-1" }, + { 1010, "csUnicode11" }, + { 1011, "SCSU" }, + { 1011, "csSCSU" }, + { 1012, "UTF-7" }, + { 1012, "csUTF7" }, + { 1013, "UTF-16BE" }, + { 1013, "csUTF16BE" }, + { 1014, "UTF-16LE" }, + { 1014, "csUTF16LE" }, + { 1015, "UTF-16" }, + { 1015, "csUTF16" }, + { 1016, "CESU-8" }, + { 1016, "csCESU8" }, + { 1016, "csCESU-8" }, + { 1017, "UTF-32" }, + { 1017, "csUTF32" }, + { 1018, "UTF-32BE" }, + { 1018, "csUTF32BE" }, + { 1019, "UTF-32LE" }, + { 1019, "csUTF32LE" }, + { 1020, "BOCU-1" }, + { 1020, "csBOCU1" }, + { 1020, "csBOCU-1" }, + { 1021, "UTF-7-IMAP" }, + { 1021, "csUTF7IMAP" }, + { 2000, "ISO-8859-1-Windows-3.0-Latin-1" }, + { 2000, "csWindows30Latin1" }, + { 2001, "ISO-8859-1-Windows-3.1-Latin-1" }, + { 2001, "csWindows31Latin1" }, + { 2002, "ISO-8859-2-Windows-Latin-2" }, + { 2002, "csWindows31Latin2" }, + { 2003, "ISO-8859-9-Windows-Latin-5" }, + { 2003, "csWindows31Latin5" }, + { 2004, "hp-roman8" }, + { 2004, "roman8" }, + { 2004, "r8" }, + { 2004, "csHPRoman8" }, + { 2005, "Adobe-Standard-Encoding" }, + { 2005, "csAdobeStandardEncoding" }, + { 2006, "Ventura-US" }, + { 2006, "csVenturaUS" }, + { 2007, "Ventura-International" }, + { 2007, "csVenturaInternational" }, + { 2008, "DEC-MCS" }, + { 2008, "dec" }, + { 2008, "csDECMCS" }, + { 2009, "IBM850" }, + { 2009, "cp850" }, + { 2009, "850" }, + { 2009, "csPC850Multilingual" }, + { 2010, "IBM852" }, + { 2010, "cp852" }, + { 2010, "852" }, + { 2010, "csPCp852" }, + { 2011, 
"IBM437" }, + { 2011, "cp437" }, + { 2011, "437" }, + { 2011, "csPC8CodePage437" }, + { 2012, "PC8-Danish-Norwegian" }, + { 2012, "csPC8DanishNorwegian" }, + { 2013, "IBM862" }, + { 2013, "cp862" }, + { 2013, "862" }, + { 2013, "csPC862LatinHebrew" }, + { 2014, "PC8-Turkish" }, + { 2014, "csPC8Turkish" }, + { 2015, "IBM-Symbols" }, + { 2015, "csIBMSymbols" }, + { 2016, "IBM-Thai" }, + { 2016, "csIBMThai" }, + { 2017, "HP-Legal" }, + { 2017, "csHPLegal" }, + { 2018, "HP-Pi-font" }, + { 2018, "csHPPiFont" }, + { 2019, "HP-Math8" }, + { 2019, "csHPMath8" }, + { 2020, "Adobe-Symbol-Encoding" }, + { 2020, "csHPPSMath" }, + { 2021, "HP-DeskTop" }, + { 2021, "csHPDesktop" }, + { 2022, "Ventura-Math" }, + { 2022, "csVenturaMath" }, + { 2023, "Microsoft-Publishing" }, + { 2023, "csMicrosoftPublishing" }, + { 2024, "Windows-31J" }, + { 2024, "csWindows31J" }, + { 2025, "GB2312" }, + { 2025, "csGB2312" }, + { 2026, "Big5" }, + { 2026, "csBig5" }, + { 2027, "macintosh" }, + { 2027, "mac" }, + { 2027, "csMacintosh" }, + { 2028, "IBM037" }, + { 2028, "cp037" }, + { 2028, "ebcdic-cp-us" }, + { 2028, "ebcdic-cp-ca" }, + { 2028, "ebcdic-cp-wt" }, + { 2028, "ebcdic-cp-nl" }, + { 2028, "csIBM037" }, + { 2029, "IBM038" }, + { 2029, "EBCDIC-INT" }, + { 2029, "cp038" }, + { 2029, "csIBM038" }, + { 2030, "IBM273" }, + { 2030, "CP273" }, + { 2030, "csIBM273" }, + { 2031, "IBM274" }, + { 2031, "EBCDIC-BE" }, + { 2031, "CP274" }, + { 2031, "csIBM274" }, + { 2032, "IBM275" }, + { 2032, "EBCDIC-BR" }, + { 2032, "cp275" }, + { 2032, "csIBM275" }, + { 2033, "IBM277" }, + { 2033, "EBCDIC-CP-DK" }, + { 2033, "EBCDIC-CP-NO" }, + { 2033, "csIBM277" }, + { 2034, "IBM278" }, + { 2034, "CP278" }, + { 2034, "ebcdic-cp-fi" }, + { 2034, "ebcdic-cp-se" }, + { 2034, "csIBM278" }, + { 2035, "IBM280" }, + { 2035, "CP280" }, + { 2035, "ebcdic-cp-it" }, + { 2035, "csIBM280" }, + { 2036, "IBM281" }, + { 2036, "EBCDIC-JP-E" }, + { 2036, "cp281" }, + { 2036, "csIBM281" }, + { 2037, "IBM284" }, + { 2037, "CP284" 
}, + { 2037, "ebcdic-cp-es" }, + { 2037, "csIBM284" }, + { 2038, "IBM285" }, + { 2038, "CP285" }, + { 2038, "ebcdic-cp-gb" }, + { 2038, "csIBM285" }, + { 2039, "IBM290" }, + { 2039, "cp290" }, + { 2039, "EBCDIC-JP-kana" }, + { 2039, "csIBM290" }, + { 2040, "IBM297" }, + { 2040, "cp297" }, + { 2040, "ebcdic-cp-fr" }, + { 2040, "csIBM297" }, + { 2041, "IBM420" }, + { 2041, "cp420" }, + { 2041, "ebcdic-cp-ar1" }, + { 2041, "csIBM420" }, + { 2042, "IBM423" }, + { 2042, "cp423" }, + { 2042, "ebcdic-cp-gr" }, + { 2042, "csIBM423" }, + { 2043, "IBM424" }, + { 2043, "cp424" }, + { 2043, "ebcdic-cp-he" }, + { 2043, "csIBM424" }, + { 2044, "IBM500" }, + { 2044, "CP500" }, + { 2044, "ebcdic-cp-be" }, + { 2044, "ebcdic-cp-ch" }, + { 2044, "csIBM500" }, + { 2045, "IBM851" }, + { 2045, "cp851" }, + { 2045, "851" }, + { 2045, "csIBM851" }, + { 2046, "IBM855" }, + { 2046, "cp855" }, + { 2046, "855" }, + { 2046, "csIBM855" }, + { 2047, "IBM857" }, + { 2047, "cp857" }, + { 2047, "857" }, + { 2047, "csIBM857" }, + { 2048, "IBM860" }, + { 2048, "cp860" }, + { 2048, "860" }, + { 2048, "csIBM860" }, + { 2049, "IBM861" }, + { 2049, "cp861" }, + { 2049, "861" }, + { 2049, "cp-is" }, + { 2049, "csIBM861" }, + { 2050, "IBM863" }, + { 2050, "cp863" }, + { 2050, "863" }, + { 2050, "csIBM863" }, + { 2051, "IBM864" }, + { 2051, "cp864" }, + { 2051, "csIBM864" }, + { 2052, "IBM865" }, + { 2052, "cp865" }, + { 2052, "865" }, + { 2052, "csIBM865" }, + { 2053, "IBM868" }, + { 2053, "CP868" }, + { 2053, "cp-ar" }, + { 2053, "csIBM868" }, + { 2054, "IBM869" }, + { 2054, "cp869" }, + { 2054, "869" }, + { 2054, "cp-gr" }, + { 2054, "csIBM869" }, + { 2055, "IBM870" }, + { 2055, "CP870" }, + { 2055, "ebcdic-cp-roece" }, + { 2055, "ebcdic-cp-yu" }, + { 2055, "csIBM870" }, + { 2056, "IBM871" }, + { 2056, "CP871" }, + { 2056, "ebcdic-cp-is" }, + { 2056, "csIBM871" }, + { 2057, "IBM880" }, + { 2057, "cp880" }, + { 2057, "EBCDIC-Cyrillic" }, + { 2057, "csIBM880" }, + { 2058, "IBM891" }, + { 2058, "cp891" }, + 
{ 2058, "csIBM891" }, + { 2059, "IBM903" }, + { 2059, "cp903" }, + { 2059, "csIBM903" }, + { 2060, "IBM904" }, + { 2060, "cp904" }, + { 2060, "904" }, + { 2060, "csIBBM904" }, + { 2061, "IBM905" }, + { 2061, "CP905" }, + { 2061, "ebcdic-cp-tr" }, + { 2061, "csIBM905" }, + { 2062, "IBM918" }, + { 2062, "CP918" }, + { 2062, "ebcdic-cp-ar2" }, + { 2062, "csIBM918" }, + { 2063, "IBM1026" }, + { 2063, "CP1026" }, + { 2063, "csIBM1026" }, + { 2064, "EBCDIC-AT-DE" }, + { 2064, "csIBMEBCDICATDE" }, + { 2065, "EBCDIC-AT-DE-A" }, + { 2065, "csEBCDICATDEA" }, + { 2066, "EBCDIC-CA-FR" }, + { 2066, "csEBCDICCAFR" }, + { 2067, "EBCDIC-DK-NO" }, + { 2067, "csEBCDICDKNO" }, + { 2068, "EBCDIC-DK-NO-A" }, + { 2068, "csEBCDICDKNOA" }, + { 2069, "EBCDIC-FI-SE" }, + { 2069, "csEBCDICFISE" }, + { 2070, "EBCDIC-FI-SE-A" }, + { 2070, "csEBCDICFISEA" }, + { 2071, "EBCDIC-FR" }, + { 2071, "csEBCDICFR" }, + { 2072, "EBCDIC-IT" }, + { 2072, "csEBCDICIT" }, + { 2073, "EBCDIC-PT" }, + { 2073, "csEBCDICPT" }, + { 2074, "EBCDIC-ES" }, + { 2074, "csEBCDICES" }, + { 2075, "EBCDIC-ES-A" }, + { 2075, "csEBCDICESA" }, + { 2076, "EBCDIC-ES-S" }, + { 2076, "csEBCDICESS" }, + { 2077, "EBCDIC-UK" }, + { 2077, "csEBCDICUK" }, + { 2078, "EBCDIC-US" }, + { 2078, "csEBCDICUS" }, + { 2079, "UNKNOWN-8BIT" }, + { 2079, "csUnknown8BiT" }, + { 2080, "MNEMONIC" }, + { 2080, "csMnemonic" }, + { 2081, "MNEM" }, + { 2081, "csMnem" }, + { 2082, "VISCII" }, + { 2082, "csVISCII" }, + { 2083, "VIQR" }, + { 2083, "csVIQR" }, + { 2084, "KOI8-R" }, + { 2084, "csKOI8R" }, + { 2085, "HZ-GB-2312" }, + { 2086, "IBM866" }, + { 2086, "cp866" }, + { 2086, "866" }, + { 2086, "csIBM866" }, + { 2087, "IBM775" }, + { 2087, "cp775" }, + { 2087, "csPC775Baltic" }, + { 2088, "KOI8-U" }, + { 2088, "csKOI8U" }, + { 2089, "IBM00858" }, + { 2089, "CCSID00858" }, + { 2089, "CP00858" }, + { 2089, "PC-Multilingual-850+euro" }, + { 2089, "csIBM00858" }, + { 2090, "IBM00924" }, + { 2090, "CCSID00924" }, + { 2090, "CP00924" }, + { 2090, 
"ebcdic-Latin9--euro" }, + { 2090, "csIBM00924" }, + { 2091, "IBM01140" }, + { 2091, "CCSID01140" }, + { 2091, "CP01140" }, + { 2091, "ebcdic-us-37+euro" }, + { 2091, "csIBM01140" }, + { 2092, "IBM01141" }, + { 2092, "CCSID01141" }, + { 2092, "CP01141" }, + { 2092, "ebcdic-de-273+euro" }, + { 2092, "csIBM01141" }, + { 2093, "IBM01142" }, + { 2093, "CCSID01142" }, + { 2093, "CP01142" }, + { 2093, "ebcdic-dk-277+euro" }, + { 2093, "ebcdic-no-277+euro" }, + { 2093, "csIBM01142" }, + { 2094, "IBM01143" }, + { 2094, "CCSID01143" }, + { 2094, "CP01143" }, + { 2094, "ebcdic-fi-278+euro" }, + { 2094, "ebcdic-se-278+euro" }, + { 2094, "csIBM01143" }, + { 2095, "IBM01144" }, + { 2095, "CCSID01144" }, + { 2095, "CP01144" }, + { 2095, "ebcdic-it-280+euro" }, + { 2095, "csIBM01144" }, + { 2096, "IBM01145" }, + { 2096, "CCSID01145" }, + { 2096, "CP01145" }, + { 2096, "ebcdic-es-284+euro" }, + { 2096, "csIBM01145" }, + { 2097, "IBM01146" }, + { 2097, "CCSID01146" }, + { 2097, "CP01146" }, + { 2097, "ebcdic-gb-285+euro" }, + { 2097, "csIBM01146" }, + { 2098, "IBM01147" }, + { 2098, "CCSID01147" }, + { 2098, "CP01147" }, + { 2098, "ebcdic-fr-297+euro" }, + { 2098, "csIBM01147" }, + { 2099, "IBM01148" }, + { 2099, "CCSID01148" }, + { 2099, "CP01148" }, + { 2099, "ebcdic-international-500+euro" }, + { 2099, "csIBM01148" }, + { 2100, "IBM01149" }, + { 2100, "CCSID01149" }, + { 2100, "CP01149" }, + { 2100, "ebcdic-is-871+euro" }, + { 2100, "csIBM01149" }, + { 2101, "Big5-HKSCS" }, + { 2101, "csBig5HKSCS" }, + { 2102, "IBM1047" }, + { 2102, "IBM-1047" }, + { 2102, "csIBM1047" }, + { 2103, "PTCP154" }, + { 2103, "csPTCP154" }, + { 2103, "PT154" }, + { 2103, "CP154" }, + { 2103, "Cyrillic-Asian" }, + { 2104, "Amiga-1251" }, + { 2104, "Ami1251" }, + { 2104, "Amiga1251" }, + { 2104, "Ami-1251" }, + { 2104, "csAmiga1251" }, + { 2104, "(Aliases" }, + { 2104, "are" }, + { 2104, "provided" }, + { 2104, "for" }, + { 2104, "historical" }, + { 2104, "reasons" }, + { 2104, "and" }, + { 2104, 
"should" }, + { 2104, "not" }, + { 2104, "be" }, + { 2104, "used)" }, + { 2104, "[Malyshev]" }, + { 2105, "KOI7-switched" }, + { 2105, "csKOI7switched" }, + { 2106, "BRF" }, + { 2106, "csBRF" }, + { 2107, "TSCII" }, + { 2107, "csTSCII" }, + { 2108, "CP51932" }, + { 2108, "csCP51932" }, + { 2109, "windows-874" }, + { 2109, "cswindows874" }, + { 2250, "windows-1250" }, + { 2250, "cswindows1250" }, + { 2251, "windows-1251" }, + { 2251, "cswindows1251" }, + { 2252, "windows-1252" }, + { 2252, "cswindows1252" }, + { 2253, "windows-1253" }, + { 2253, "cswindows1253" }, + { 2254, "windows-1254" }, + { 2254, "cswindows1254" }, + { 2255, "windows-1255" }, + { 2255, "cswindows1255" }, + { 2256, "windows-1256" }, + { 2256, "cswindows1256" }, + { 2257, "windows-1257" }, + { 2257, "cswindows1257" }, + { 2258, "windows-1258" }, + { 2258, "cswindows1258" }, + { 2259, "TIS-620" }, + { 2259, "csTIS620" }, + { 2259, "ISO-8859-11" }, + { 2260, "CP50220" }, + { 2260, "csCP50220" }, + +#undef _GLIBCXX_GET_ENCODING_DATA diff --git a/libstdc++-v3/include/bits/unicode.h b/libstdc++-v3/include/bits/unicode.h index f1b2b359bdf..8bc55e9c136 100644 --- a/libstdc++-v3/include/bits/unicode.h +++ b/libstdc++-v3/include/bits/unicode.h @@ -32,7 +32,8 @@ #if __cplusplus >= 202002L #include -#include +#include // bit_width +#include // __detail::__from_chars_alnum_to_val_table #include #include #include @@ -986,7 +987,7 @@ inline namespace __v15_1_0 return __n; } - template + template consteval bool __literal_encoding_is_unicode() { @@ -1056,6 +1057,54 @@ inline namespace __v15_1_0 __literal_encoding_is_utf8() { return __literal_encoding_is_unicode(); } + consteval bool + __literal_encoding_is_extended_ascii() + { + return '0' == 0x30 && 'A' == 0x41 && 'Z' == 0x5a + && 'a' == 0x61 && 'z' == 0x7a; + } + + // https://www.unicode.org/reports/tr22/tr22-8.html#Charset_Alias_Matching + constexpr bool + __charset_alias_match(string_view __a, string_view __b) + { + // Map alphanumeric chars to their base 64 
value, everything else to 127. + auto __map = [](char __c, bool& __num) -> unsigned char { + using __detail::__from_chars_alnum_to_val_table; + if (__c == '0') [[unlikely]] + return __num ? 0 : 127; + auto __v = __from_chars_alnum_to_val_table::value.__data[(unsigned char)__c]; + __num = __v < 10; + return __v; + }; + + auto __ptr_a = __a.begin(), __end_a = __a.end(); + auto __ptr_b = __b.begin(), __end_b = __b.end(); + bool __num_a = false, __num_b = false; + + while (true) + { + // Find the value of the next alphanumeric character in each string. + unsigned char __val_a, __val_b; + while (__ptr_a != __end_a + && (__val_a = __map(*__ptr_a, __num_a)) == 127) + ++__ptr_a; + while (__ptr_b != __end_b + && (__val_b = __map(*__ptr_b, __num_b)) == 127) + ++__ptr_b; + // Stop when we reach the end of a string, or get a mismatch. + if (__ptr_a == __end_a) + return __ptr_b == __end_b; + else if (__ptr_b == __end_b) + return false; + else if (__val_a != __val_b) + return false; // Found non-matching characters. 
+ ++__ptr_a; + ++__ptr_b; + } + return true; + } + } // namespace __unicode _GLIBCXX_END_NAMESPACE_VERSION diff --git a/libstdc++-v3/include/bits/version.def b/libstdc++-v3/include/bits/version.def index afbec6c3e6a..8fb8a2877ee 100644 --- a/libstdc++-v3/include/bits/version.def +++ b/libstdc++-v3/include/bits/version.def @@ -1751,6 +1751,16 @@ ftms = { }; }; +ftms = { + name = text_encoding; + values = { + v = 202306; + cxxmin = 26; + hosted = yes; + extra_cond = "_GLIBCXX_USE_NL_LANGINFO_L"; + }; +}; + ftms = { name = to_string; values = { diff --git a/libstdc++-v3/include/bits/version.h b/libstdc++-v3/include/bits/version.h index 9688b246ef4..9ba99deeda6 100644 --- a/libstdc++-v3/include/bits/version.h +++ b/libstdc++-v3/include/bits/version.h @@ -2137,6 +2137,17 @@ #undef __glibcxx_want_saturation_arithmetic // from version.def line 1755 +#if !defined(__cpp_lib_text_encoding) +# if (__cplusplus > 202302L) && _GLIBCXX_HOSTED && (_GLIBCXX_USE_NL_LANGINFO_L) +# define __glibcxx_text_encoding 202306L +# if defined(__glibcxx_want_all) || defined(__glibcxx_want_text_encoding) +# define __cpp_lib_text_encoding 202306L +# endif +# endif +#endif /* !defined(__cpp_lib_text_encoding) && defined(__glibcxx_want_text_encoding) */ +#undef __glibcxx_want_text_encoding + +// from version.def line 1765 #if !defined(__cpp_lib_to_string) # if (__cplusplus > 202302L) && _GLIBCXX_HOSTED && (__glibcxx_to_chars) # define __glibcxx_to_string 202306L @@ -2147,7 +2158,7 @@ #endif /* !defined(__cpp_lib_to_string) && defined(__glibcxx_want_to_string) */ #undef __glibcxx_want_to_string -// from version.def line 1765 +// from version.def line 1775 #if !defined(__cpp_lib_generator) # if (__cplusplus >= 202100L) && (__glibcxx_coroutine) # define __glibcxx_generator 202207L diff --git a/libstdc++-v3/include/std/text_encoding b/libstdc++-v3/include/std/text_encoding new file mode 100644 index 00000000000..df8a09c5810 --- /dev/null +++ b/libstdc++-v3/include/std/text_encoding @@ -0,0 +1,704 @@ 
+// -*- C++ -*- + +// Copyright The GNU Toolchain Authors. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// Under Section 7 of GPL version 3, you are granted additional +// permissions described in the GCC Runtime Library Exception, version +// 3.1, as published by the Free Software Foundation. + +// You should have received a copy of the GNU General Public License and +// a copy of the GCC Runtime Library Exception along with this program; +// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see +// . + +/** @file include/text_encoding + * This is a Standard C++ Library header. + */ + +#ifndef _GLIBCXX_TEXT_ENCODING +#define _GLIBCXX_TEXT_ENCODING + +#pragma GCC system_header + +#include + +#define __glibcxx_want_text_encoding +#include + +#ifdef __cpp_lib_text_encoding +#include +#include +#include // hash +#include // view_interface +#include // __charset_alias_match +#include // __int_traits + +namespace std _GLIBCXX_VISIBILITY(default) +{ +_GLIBCXX_BEGIN_NAMESPACE_VERSION + + /** + * @brief An interface for accessing the IANA Character Sets registry. 
* @ingroup locales + * @since C++26 + */ + struct text_encoding + { + private: + struct _Rep + { + using id = __INT_LEAST32_TYPE__; + id _M_id; + const char* _M_name; + + friend constexpr bool + operator<(const _Rep& __r, id __m) noexcept + { return __r._M_id < __m; } + + friend constexpr bool + operator==(const _Rep& __r, string_view __name) noexcept + { return __r._M_name == __name; } + }; + + public: + static constexpr size_t max_name_length = 63; + + enum class id : _Rep::id + { + other = 1, + unknown = 2, + ASCII = 3, + ISOLatin1 = 4, + ISOLatin2 = 5, + ISOLatin3 = 6, + ISOLatin4 = 7, + ISOLatinCyrillic = 8, + ISOLatinArabic = 9, + ISOLatinGreek = 10, + ISOLatinHebrew = 11, + ISOLatin5 = 12, + ISOLatin6 = 13, + ISOTextComm = 14, + HalfWidthKatakana = 15, + JISEncoding = 16, + ShiftJIS = 17, + EUCPkdFmtJapanese = 18, + EUCFixWidJapanese = 19, + ISO4UnitedKingdom = 20, + ISO11SwedishForNames = 21, + ISO15Italian = 22, + ISO17Spanish = 23, + ISO21German = 24, + ISO60DanishNorwegian = 25, + ISO69French = 26, + ISO10646UTF1 = 27, + ISO646basic1983 = 28, + INVARIANT = 29, + ISO2IntlRefVersion = 30, + NATSSEFI = 31, + NATSSEFIADD = 32, + ISO10Swedish = 35, + KSC56011987 = 36, + ISO2022KR = 37, + EUCKR = 38, + ISO2022JP = 39, + ISO2022JP2 = 40, + ISO13JISC6220jp = 41, + ISO14JISC6220ro = 42, + ISO16Portuguese = 43, + ISO18Greek7Old = 44, + ISO19LatinGreek = 45, + ISO25French = 46, + ISO27LatinGreek1 = 47, + ISO5427Cyrillic = 48, + ISO42JISC62261978 = 49, + ISO47BSViewdata = 50, + ISO49INIS = 51, + ISO50INIS8 = 52, + ISO51INISCyrillic = 53, + ISO54271981 = 54, + ISO5428Greek = 55, + ISO57GB1988 = 56, + ISO58GB231280 = 57, + ISO61Norwegian2 = 58, + ISO70VideotexSupp1 = 59, + ISO84Portuguese2 = 60, + ISO85Spanish2 = 61, + ISO86Hungarian = 62, + ISO87JISX0208 = 63, + ISO88Greek7 = 64, + ISO89ASMO449 = 65, + ISO90 = 66, + ISO91JISC62291984a = 67, + ISO92JISC62991984b = 68, + ISO93JIS62291984badd = 69, + ISO94JIS62291984hand = 70, + ISO95JIS62291984handadd = 71, + 
ISO96JISC62291984kana = 72, + ISO2033 = 73, + ISO99NAPLPS = 74, + ISO102T617bit = 75, + ISO103T618bit = 76, + ISO111ECMACyrillic = 77, + ISO121Canadian1 = 78, + ISO122Canadian2 = 79, + ISO123CSAZ24341985gr = 80, + ISO88596E = 81, + ISO88596I = 82, + ISO128T101G2 = 83, + ISO88598E = 84, + ISO88598I = 85, + ISO139CSN369103 = 86, + ISO141JUSIB1002 = 87, + ISO143IECP271 = 88, + ISO146Serbian = 89, + ISO147Macedonian = 90, + ISO150 = 91, + ISO151Cuba = 92, + ISO6937Add = 93, + ISO153GOST1976874 = 94, + ISO8859Supp = 95, + ISO10367Box = 96, + ISO158Lap = 97, + ISO159JISX02121990 = 98, + ISO646Danish = 99, + USDK = 100, + DKUS = 101, + KSC5636 = 102, + Unicode11UTF7 = 103, + ISO2022CN = 104, + ISO2022CNEXT = 105, + UTF8 = 106, + ISO885913 = 109, + ISO885914 = 110, + ISO885915 = 111, + ISO885916 = 112, + GBK = 113, + GB18030 = 114, + OSDEBCDICDF0415 = 115, + OSDEBCDICDF03IRV = 116, + OSDEBCDICDF041 = 117, + ISO115481 = 118, + KZ1048 = 119, + UCS2 = 1000, + UCS4 = 1001, + UnicodeASCII = 1002, + UnicodeLatin1 = 1003, + UnicodeJapanese = 1004, + UnicodeIBM1261 = 1005, + UnicodeIBM1268 = 1006, + UnicodeIBM1276 = 1007, + UnicodeIBM1264 = 1008, + UnicodeIBM1265 = 1009, + Unicode11 = 1010, + SCSU = 1011, + UTF7 = 1012, + UTF16BE = 1013, + UTF16LE = 1014, + UTF16 = 1015, + CESU8 = 1016, + UTF32 = 1017, + UTF32BE = 1018, + UTF32LE = 1019, + BOCU1 = 1020, + UTF7IMAP = 1021, + Windows30Latin1 = 2000, + Windows31Latin1 = 2001, + Windows31Latin2 = 2002, + Windows31Latin5 = 2003, + HPRoman8 = 2004, + AdobeStandardEncoding = 2005, + VenturaUS = 2006, + VenturaInternational = 2007, + DECMCS = 2008, + PC850Multilingual = 2009, + PC8DanishNorwegian = 2012, + PC862LatinHebrew = 2013, + PC8Turkish = 2014, + IBMSymbols = 2015, + IBMThai = 2016, + HPLegal = 2017, + HPPiFont = 2018, + HPMath8 = 2019, + HPPSMath = 2020, + HPDesktop = 2021, + VenturaMath = 2022, + MicrosoftPublishing = 2023, + Windows31J = 2024, + GB2312 = 2025, + Big5 = 2026, + Macintosh = 2027, + IBM037 = 2028, + IBM038 = 2029, 
+ IBM273 = 2030, + IBM274 = 2031, + IBM275 = 2032, + IBM277 = 2033, + IBM278 = 2034, + IBM280 = 2035, + IBM281 = 2036, + IBM284 = 2037, + IBM285 = 2038, + IBM290 = 2039, + IBM297 = 2040, + IBM420 = 2041, + IBM423 = 2042, + IBM424 = 2043, + PC8CodePage437 = 2011, + IBM500 = 2044, + IBM851 = 2045, + PCp852 = 2010, + IBM855 = 2046, + IBM857 = 2047, + IBM860 = 2048, + IBM861 = 2049, + IBM863 = 2050, + IBM864 = 2051, + IBM865 = 2052, + IBM868 = 2053, + IBM869 = 2054, + IBM870 = 2055, + IBM871 = 2056, + IBM880 = 2057, + IBM891 = 2058, + IBM903 = 2059, + IBM904 = 2060, + IBM905 = 2061, + IBM918 = 2062, + IBM1026 = 2063, + IBMEBCDICATDE = 2064, + EBCDICATDEA = 2065, + EBCDICCAFR = 2066, + EBCDICDKNO = 2067, + EBCDICDKNOA = 2068, + EBCDICFISE = 2069, + EBCDICFISEA = 2070, + EBCDICFR = 2071, + EBCDICIT = 2072, + EBCDICPT = 2073, + EBCDICES = 2074, + EBCDICESA = 2075, + EBCDICESS = 2076, + EBCDICUK = 2077, + EBCDICUS = 2078, + Unknown8BiT = 2079, + Mnemonic = 2080, + Mnem = 2081, + VISCII = 2082, + VIQR = 2083, + KOI8R = 2084, + HZGB2312 = 2085, + IBM866 = 2086, + PC775Baltic = 2087, + KOI8U = 2088, + IBM00858 = 2089, + IBM00924 = 2090, + IBM01140 = 2091, + IBM01141 = 2092, + IBM01142 = 2093, + IBM01143 = 2094, + IBM01144 = 2095, + IBM01145 = 2096, + IBM01146 = 2097, + IBM01147 = 2098, + IBM01148 = 2099, + IBM01149 = 2100, + Big5HKSCS = 2101, + IBM1047 = 2102, + PTCP154 = 2103, + Amiga1251 = 2104, + KOI7switched = 2105, + BRF = 2106, + TSCII = 2107, + CP51932 = 2108, + windows874 = 2109, + windows1250 = 2250, + windows1251 = 2251, + windows1252 = 2252, + windows1253 = 2253, + windows1254 = 2254, + windows1255 = 2255, + windows1256 = 2256, + windows1257 = 2257, + windows1258 = 2258, + TIS620 = 2259, + CP50220 = 2260 + }; + using enum id; + + constexpr text_encoding() = default; + + constexpr explicit + text_encoding(string_view __enc) noexcept + : _M_rep(_S_find_name(__enc)) + { + __enc.copy(_M_name, max_name_length); + } + + // @pre i has the value of one of the enumerators 
of id. + constexpr + text_encoding(id __i) noexcept + : _M_rep(_S_find_id(__i)) + { + if (string_view __name(_M_rep->_M_name); !__name.empty()) + __name.copy(_M_name, max_name_length); + } + + constexpr id mib() const noexcept { return id(_M_rep->_M_id); } + + constexpr const char* name() const noexcept { return _M_name; } + + struct aliases_view : ranges::view_interface + { + private: + class _Iterator; + struct _Sentinel { }; + + public: + constexpr _Iterator begin() const noexcept { return _Iterator(_M_begin); } + constexpr _Sentinel end() const noexcept { return _Sentinel{}; } + + private: + friend struct text_encoding; + + constexpr explicit aliases_view(const _Rep* __r) : _M_begin(__r) { } + + class _Iterator + { + public: + using value_type = const char*; + using reference = const char*; + using difference_type = int; + + constexpr _Iterator() = default; + constexpr value_type operator*() const; + constexpr _Iterator& operator++(); + constexpr _Iterator& operator--(); + constexpr _Iterator operator++(int); + constexpr _Iterator operator--(int); + constexpr value_type operator[](difference_type) const; + constexpr _Iterator& operator+=(difference_type); + constexpr _Iterator& operator-=(difference_type); + constexpr difference_type operator-(const _Iterator&) const; + constexpr bool operator==(const _Iterator&) const = default; + constexpr bool operator==(_Sentinel) const noexcept; + constexpr strong_ordering operator<=>(const _Iterator&) const; + + friend _Iterator + operator+(_Iterator __i, difference_type __n) + { + __i += __n; + return __i; + } + + friend _Iterator + operator+(difference_type __n, _Iterator __i) + { + __i += __n; + return __i; + } + + friend _Iterator + operator-(_Iterator __i, difference_type __n) + { + __i -= __n; + return __i; + } + + private: + friend class text_encoding; + + constexpr explicit + _Iterator(const _Rep* __r) noexcept + : _M_rep(__r), _M_id(__r ? 
__r->_M_id : 0) + { } + + constexpr bool _M_dereferenceable() const noexcept; + static constexpr difference_type _S_neg(difference_type) noexcept; + + const _Rep* _M_rep = nullptr; + _Rep::id _M_id = 0; + }; + + const _Rep* _M_begin = nullptr; + }; + + constexpr aliases_view + aliases() const noexcept + { + return _M_rep->_M_name[0] ? aliases_view(_M_rep) : aliases_view{nullptr}; + } + + friend constexpr bool + operator==(const text_encoding& __a, + const text_encoding& __b) noexcept + { + if (__a.mib() == id::other && __b.mib() == id::other) [[unlikely]] + return _S_comp(__a._M_name, __b._M_name); + else + return __a.mib() == __b.mib(); + } + + friend constexpr bool + operator==(const text_encoding& __encoding, id __i) noexcept + { return __encoding.mib() == __i; } + +#if __CHAR_BIT__ == 8 + static consteval text_encoding + literal() noexcept + { +#ifdef __GNUC_EXECUTION_CHARSET_NAME + return text_encoding(__GNUC_EXECUTION_CHARSET_NAME); +#elif defined __clang_literal_encoding__ + return text_encoding(__clang_literal_encoding__); +#else + return text_encoding(); +#endif + } + + static text_encoding + environment(); + + template + static bool + environment_is() + { return text_encoding(_Id)._M_is_environment(); } +#else + static text_encoding literal() = delete; + static text_encoding environment() = delete; + template static bool environment_is() = delete; +#endif + + private: + const _Rep* _M_rep = _S_reps + 1; // id::unknown + char _M_name[max_name_length + 1] = {0}; + + bool + _M_is_environment() const; + + static inline constexpr _Rep _S_reps[] = { + { 1, "" }, { 2, "" }, +#define _GLIBCXX_GET_ENCODING_DATA +#include +#ifdef _GLIBCXX_GET_ENCODING_DATA +# error "Invalid text_encoding data" +#endif + { 9999, nullptr }, // sentinel + }; + + static constexpr bool + _S_comp(string_view __a, string_view __b) + { return __unicode::__charset_alias_match(__a, __b); } + + static constexpr const _Rep* + _S_find_name(string_view __name) noexcept + { +#ifdef 
_GLIBCXX_TEXT_ENCODING_UTF8_OFFSET + // Optimize the common UTF-8 case to avoid a linear search through all + // strings in the table using the _S_comp function. + if (__name == "UTF-8") + return _S_reps + 2 + _GLIBCXX_TEXT_ENCODING_UTF8_OFFSET; +#endif + + // The first two array elements (other and unknown) don't have names. + // The last element is a sentinel that can never match anything. + const auto __first = _S_reps + 2, __end = std::end(_S_reps) - 1; + for (auto __r = __first; __r != __end; ++__r) + if (_S_comp(__r->_M_name, __name)) + { + // Might have matched an alias. Find the first entry for this ID. + const auto __id = __r->_M_id; + while (__r[-1]._M_id == __id) + --__r; + return __r; + } + return _S_reps; // id::other + } + + static constexpr const _Rep* + _S_find_id(id __id) noexcept + { + const auto __i = (_Rep::id)__id; + const auto __r = std::lower_bound(_S_reps, std::end(_S_reps) - 1, __i); + if (__r->_M_id == __i) [[likely]] + return __r; + else + { + // Preconditions: i has the value of one of the enumerators of id. 
+ __glibcxx_assert(__r->_M_id == __i); + return _S_reps + 1; // id::unknown + } + } + }; + + template<> + struct hash + { + size_t + operator()(const text_encoding& __enc) const noexcept + { return std::hash()(__enc.mib()); } + }; + + constexpr auto + text_encoding::aliases_view:: + _Iterator::operator*() const + -> value_type + { + if (_M_dereferenceable()) [[likely]] + return _M_rep->_M_name; + else + { + __glibcxx_assert(_M_dereferenceable()); + return ""; + } + } + + constexpr auto + text_encoding::aliases_view:: + _Iterator::operator++() + -> _Iterator& + { + if (_M_dereferenceable()) [[likely]] + ++_M_rep; + else + { + __glibcxx_assert(_M_dereferenceable()); + *this = _Iterator{}; + } + return *this; + } + + constexpr auto + text_encoding::aliases_view:: + _Iterator::operator--() + -> _Iterator& + { + const bool __decrementable = _M_rep != nullptr && _M_rep[-1]._M_id == _M_id; + if (__decrementable) [[likely]] + --_M_rep; + else + { + __glibcxx_assert(__decrementable); + *this = _Iterator{}; + } + return *this; + } + + constexpr auto + text_encoding::aliases_view:: + _Iterator::operator++(int) + -> _Iterator + { + auto __it = *this; + ++*this; + return __it; + } + + constexpr auto + text_encoding::aliases_view:: + _Iterator::operator--(int) + -> _Iterator + { + auto __it = *this; + --*this; + return __it; + } + + constexpr auto + text_encoding::aliases_view:: + _Iterator::operator[](difference_type __n) const + -> value_type + { return *(*this + __n); } + + constexpr auto + text_encoding::aliases_view:: + _Iterator::operator+=(difference_type __n) + -> _Iterator& + { + if (_M_rep != nullptr) + { + if ((__n > 0 && __n < (std::end(_S_reps) - _M_rep)) + || (__n < 0 && __n > (_S_reps - _M_rep))) + { + if (_M_rep[__n]._M_id == _M_id) + _M_rep += __n; + else + *this = _Iterator{}; + } + else if (__n != 0) + *this = _Iterator{}; + } + __glibcxx_assert(_M_rep != nullptr); + return *this; + } + + constexpr auto + text_encoding::aliases_view:: + 
_Iterator::operator-=(difference_type __n) + -> _Iterator& + { return operator+=(_S_neg(__n)); } + + constexpr auto + text_encoding::aliases_view:: + _Iterator::operator-(const _Iterator& __i) const + -> difference_type + { + if (_M_id == __i._M_id) + return _M_rep - __i._M_rep; + __glibcxx_assert(_M_id == __i._M_id); + return __gnu_cxx::__int_traits::__max; + } + + constexpr bool + text_encoding::aliases_view:: + _Iterator::operator==(_Sentinel) const noexcept + { return !_M_dereferenceable(); } + + constexpr strong_ordering + text_encoding::aliases_view:: + _Iterator::operator<=>(const _Iterator& __i) const + { + __glibcxx_assert(_M_id == __i._M_id); + return _M_rep <=> __i._M_rep; + } + + constexpr bool + text_encoding::aliases_view:: + _Iterator::_M_dereferenceable() const noexcept + { return _M_rep != nullptr && _M_rep->_M_id == _M_id; } + + constexpr auto + text_encoding::aliases_view:: + _Iterator::_S_neg(difference_type __n) noexcept + -> difference_type + { + using _Traits = __gnu_cxx::__int_traits; + if (__n == _Traits::__min) [[unlikely]] + return _Traits::__max; + return -__n; + } + +namespace ranges +{ + // Opt-in to borrowed_range concept + template<> + inline constexpr bool + enable_borrowed_range = true; +} + +_GLIBCXX_END_NAMESPACE_VERSION +} // namespace std + +#endif // __cpp_lib_text_encoding +#endif // _GLIBCXX_TEXT_ENCODING diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py b/libstdc++-v3/python/libstdcxx/v6/printers.py index 032a7aa58a2..a6c2ed4599f 100644 --- a/libstdc++-v3/python/libstdcxx/v6/printers.py +++ b/libstdc++-v3/python/libstdcxx/v6/printers.py @@ -2324,6 +2324,21 @@ class StdIntegralConstantPrinter(printer_base): typename = strip_versioned_namespace(self._typename) return "{}<{}, {}>".format(typename, value_type, value) +class StdTextEncodingPrinter(printer_base): + """Print a std::text_encoding.""" + + def __init__(self, typename, val): + self._val = val + self._typename = typename + + def to_string(self): + rep 
= self._val['_M_rep'].dereference() + if rep['_M_id'] == 1: + return self._val['_M_name'] + if rep['_M_id'] == 2: + return 'unknown' + return rep['_M_name'] + # A "regular expression" printer which conforms to the # "SubPrettyPrinter" protocol from gdb.printing. class RxPrinter(object): @@ -2807,6 +2822,8 @@ def build_libstdcxx_dictionary(): libstdcxx_printer.add_version('std::', 'integral_constant', StdIntegralConstantPrinter) + libstdcxx_printer.add_version('std::', 'text_encoding', + StdTextEncodingPrinter) if hasattr(gdb.Value, 'dynamic_type'): libstdcxx_printer.add_version('std::', 'error_code', diff --git a/libstdc++-v3/scripts/gen_text_encoding_data.py b/libstdc++-v3/scripts/gen_text_encoding_data.py new file mode 100755 index 00000000000..2d6f3e4077a --- /dev/null +++ b/libstdc++-v3/scripts/gen_text_encoding_data.py @@ -0,0 +1,70 @@ +#!/usr/bin/env python3 +# +# Script to generate tables for libstdc++ std::text_encoding. +# +# This file is part of GCC. +# +# GCC is free software; you can redistribute it and/or modify it under +# the terms of the GNU General Public License as published by the Free +# Software Foundation; either version 3, or (at your option) any later +# version. +# +# GCC is distributed in the hope that it will be useful, but WITHOUT ANY +# WARRANTY; without even the implied warranty of MERCHANTABILITY or +# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +# for more details. +# +# You should have received a copy of the GNU General Public License +# along with GCC; see the file COPYING3. If not see +# . 
+ +# To update the Libstdc++ static data in download +# the latest: +# https://www.iana.org/assignments/character-sets/character-sets-1.csv +# Then run this script and save the output to +# include/bits/text_encoding-data.h + +import sys +import csv + +if len(sys.argv) != 2: + print("Usage: %s " % sys.argv[0], file=sys.stderr) + sys.exit(1) + +print("// Generated by gen_text_encoding_data.py, do not edit.\n") +print("#ifndef _GLIBCXX_GET_ENCODING_DATA") +print('# error "This is not a public header, do not include it directly"') +print("#endif\n") + + +charsets = {} +with open(sys.argv[1], newline='') as f: + reader = csv.reader(f) + next(reader) # skip header row + for row in reader: + mib = int(row[2]) + if mib in charsets: + raise ValueError("Multiple rows for mibEnum={}".format(mib)) + name = row[1] + aliases = row[5].split() + # Ensure primary name comes first + if name in aliases: + aliases.remove(name) + charsets[mib] = [name] + aliases + +# Remove "NATS-DANO" and "NATS-DANO-ADD" +charsets.pop(33, None) +charsets.pop(34, None) + +count = 0 +for mib in sorted(charsets.keys()): + names = charsets[mib] + if names[0] == "UTF-8": + print("#define _GLIBCXX_TEXT_ENCODING_UTF8_OFFSET {}".format(count)) + for name in names: + print(' {{ {:4}, "{}" }},'.format(mib, name)) + count += len(names) + +# gives an error if this macro is left defined. +# Do this last, so that the generated output is not usable unless we reach here. +print("\n#undef _GLIBCXX_GET_ENCODING_DATA") diff --git a/libstdc++-v3/src/Makefile.am b/libstdc++-v3/src/Makefile.am index 7292ae70f81..37ba1491dea 100644 --- a/libstdc++-v3/src/Makefile.am +++ b/libstdc++-v3/src/Makefile.am @@ -43,7 +43,7 @@ experimental_dir = endif ## Keep this list sync'd with acinclude.m4:GLIBCXX_CONFIGURE. -SUBDIRS = c++98 c++11 c++17 c++20 c++23 \ +SUBDIRS = c++98 c++11 c++17 c++20 c++23 c++26 \ $(filesystem_dir) $(backtrace_dir) $(experimental_dir) # Cross compiler support. 
@@ -77,6 +77,7 @@ vpath % $(top_srcdir)/src/c++11 vpath % $(top_srcdir)/src/c++17 vpath % $(top_srcdir)/src/c++20 vpath % $(top_srcdir)/src/c++23 +vpath % $(top_srcdir)/src/c++26 if ENABLE_FILESYSTEM_TS vpath % $(top_srcdir)/src/filesystem endif diff --git a/libstdc++-v3/src/c++26/Makefile.am b/libstdc++-v3/src/c++26/Makefile.am new file mode 100644 index 00000000000..000ced1f501 --- /dev/null +++ b/libstdc++-v3/src/c++26/Makefile.am @@ -0,0 +1,109 @@ +## Makefile for the C++26 sources of the GNU C++ Standard library. +## +## Copyright (C) 1997-2023 Free Software Foundation, Inc. +## +## This file is part of the libstdc++ version 3 distribution. +## Process this file with automake to produce Makefile.in. + +## This file is part of the GNU ISO C++ Library. This library is free +## software; you can redistribute it and/or modify it under the +## terms of the GNU General Public License as published by the +## Free Software Foundation; either version 3, or (at your option) +## any later version. + +## This library is distributed in the hope that it will be useful, +## but WITHOUT ANY WARRANTY; without even the implied warranty of +## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +## GNU General Public License for more details. + +## You should have received a copy of the GNU General Public License along +## with this library; see the file COPYING3. If not see +## . + +include $(top_srcdir)/fragment.am + +# Convenience library for C++26 runtime. +noinst_LTLIBRARIES = libc++26convenience.la + +headers = + +if ENABLE_EXTERN_TEMPLATE +# XTEMPLATE_FLAGS = -fno-implicit-templates +inst_sources = +else +# XTEMPLATE_FLAGS = +inst_sources = +endif + +sources = text_encoding.cc + +vpath % $(top_srcdir)/src/c++26 + + +if GLIBCXX_HOSTED +libc__26convenience_la_SOURCES = $(sources) $(inst_sources) +else +libc__26convenience_la_SOURCES = +endif + +# AM_CXXFLAGS needs to be in each subdirectory so that it can be +# modified in a per-library or per-sub-library way. 
Need to manually +# set this option because CONFIG_CXXFLAGS has to be after +# OPTIMIZE_CXXFLAGS on the compile line so that -O2 can be overridden +# as the occasion calls for it. +AM_CXXFLAGS = \ + -std=gnu++26 \ + $(glibcxx_lt_pic_flag) $(glibcxx_compiler_shared_flag) \ + $(XTEMPLATE_FLAGS) $(VTV_CXXFLAGS) \ + $(WARN_CXXFLAGS) $(OPTIMIZE_CXXFLAGS) $(CONFIG_CXXFLAGS) \ + -fimplicit-templates + +AM_MAKEFLAGS = \ + "gxx_include_dir=$(gxx_include_dir)" + +# Libtool notes + +# 1) In general, libtool expects an argument such as `--tag=CXX' when +# using the C++ compiler, because that will enable the settings +# detected when C++ support was being configured. However, when no +# such flag is given in the command line, libtool attempts to figure +# it out by matching the compiler name in each configuration section +# against a prefix of the command line. The problem is that, if the +# compiler name and its initial flags stored in the libtool +# configuration file don't match those in the command line, libtool +# can't decide which configuration to use, and it gives up. The +# correct solution is to add `--tag CXX' to LTCXXCOMPILE and maybe +# CXXLINK, just after $(LIBTOOL), so that libtool doesn't have to +# attempt to infer which configuration to use. +# +# The second tag argument, `--tag disable-shared` means that libtool +# only compiles each source once, for static objects. In actuality, +# glibcxx_lt_pic_flag and glibcxx_compiler_shared_flag are added to +# the libtool command that is used to create the object, which is +# suitable for shared libraries. The `--tag disable-shared` must be +# placed after --tag CXX lest things CXX undo the effect of +# disable-shared. + +# 2) Need to explicitly set LTCXXCOMPILE so that EXTRA_CXX_FLAGS is +# last. (That way, things like -O2 passed down from the toplevel can +# be overridden by --enable-debug.) 
+LTCXXCOMPILE = \
+	$(LIBTOOL) --tag CXX --tag disable-shared \
+	$(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
+	--mode=compile $(CXX) $(TOPLEVEL_INCLUDES) \
+	$(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CXXFLAGS) $(CXXFLAGS) $(EXTRA_CXX_FLAGS)
+
+LTLDFLAGS = $(shell $(SHELL) $(top_srcdir)/../libtool-ldflags $(LDFLAGS))
+
+# 3) We'd have a problem when building the shared libstdc++ object if
+# the rules automake generates would be used. We cannot allow g++ to
+# be used since this would add -lstdc++ to the link line which of
+# course is problematic at this point. So, we get the top-level
+# directory to configure libstdc++-v3 to use gcc as the C++
+# compilation driver.
+CXXLINK = \
+	$(LIBTOOL) --tag CXX --tag disable-shared \
+	$(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
+	--mode=link $(CXX) \
+	$(VTV_CXXLINKFLAGS) \
+	$(OPT_LDFLAGS) $(SECTION_LDFLAGS) $(AM_CXXFLAGS) $(LTLDFLAGS) -o $@
diff --git a/libstdc++-v3/src/c++26/text_encoding.cc b/libstdc++-v3/src/c++26/text_encoding.cc
new file mode 100644
index 00000000000..9a7df07db29
--- /dev/null
+++ b/libstdc++-v3/src/c++26/text_encoding.cc
@@ -0,0 +1,91 @@
+// Definitions for -*- C++ -*-
+
+// Copyright The GNU Toolchain Authors.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+// .
+
+#include
+#include
+
+#ifdef _GLIBCXX_USE_NL_LANGINFO_L
+#include
+#include
+
+#if __CHAR_BIT__ == 8
+namespace std
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+text_encoding
+__locale_encoding(const char* name)
+{
+  text_encoding enc;
+  if (locale_t loc = ::newlocale(LC_ALL_MASK, name, (locale_t)0))
+    {
+      if (const char* codeset = ::nl_langinfo_l(CODESET, loc))
+        {
+          string_view s(codeset);
+          if (s.size() < text_encoding::max_name_length)
+            enc = text_encoding(s);
+        }
+      ::freelocale(loc);
+    }
+  return enc;
+}
+
+_GLIBCXX_END_NAMESPACE_VERSION
+} // namespace std
+
+std::text_encoding
+std::text_encoding::environment()
+{
+  return std::__locale_encoding("");
+}
+
+bool
+std::text_encoding::_M_is_environment() const
+{
+  bool matched = false;
+  if (locale_t loc = ::newlocale(LC_ALL_MASK, "", (locale_t)0))
+    {
+      if (const char* codeset = ::nl_langinfo_l(CODESET, loc))
+        {
+          string_view sv(codeset);
+          for (auto alias : aliases())
+            if (__unicode::__charset_alias_match(alias, sv))
+              {
+                matched = true;
+                break;
+              }
+        }
+      ::freelocale(loc);
+    }
+  return matched;
+}
+
+std::text_encoding
+std::locale::encoding() const
+{
+  return std::__locale_encoding(name().c_str());
+}
+#endif // CHAR_BIT == 8
+
+#endif // _GLIBCXX_USE_NL_LANGINFO_L
diff --git a/libstdc++-v3/src/experimental/Makefile.am b/libstdc++-v3/src/experimental/Makefile.am
index 8259f986d95..6241430988e 100644
--- a/libstdc++-v3/src/experimental/Makefile.am
+++ b/libstdc++-v3/src/experimental/Makefile.am
@@ -47,10 +47,12 @@ libstdc__exp_la_SOURCES = $(sources)
 
 libstdc__exp_la_LIBADD = \
 	$(top_builddir)/src/c++23/libc++23convenience.la \
+	$(top_builddir)/src/c++26/libc++26convenience.la \
 	$(filesystem_lib) $(backtrace_lib)
 
 libstdc__exp_la_DEPENDENCIES = \
 	$(top_builddir)/src/c++23/libc++23convenience.la \
+	$(top_builddir)/src/c++26/libc++26convenience.la \
 	$(filesystem_lib) $(backtrace_lib)
 
 # AM_CXXFLAGS needs to be in each subdirectory so that it can be
diff --git a/libstdc++-v3/testsuite/22_locale/locale/encoding.cc b/libstdc++-v3/testsuite/22_locale/locale/encoding.cc
new file mode 100644
index 00000000000..18825fb88b9
--- /dev/null
+++ b/libstdc++-v3/testsuite/22_locale/locale/encoding.cc
@@ -0,0 +1,36 @@
+// { dg-options "-lstdc++exp" }
+// { dg-do run { target c++26 } }
+// { dg-require-namedlocale "en_US.ISO8859-1" }
+// { dg-require-namedlocale "fr_FR.ISO8859-15" }
+
+#include
+#include
+
+void
+test_encoding()
+{
+  const std::locale c = std::locale::classic();
+  std::text_encoding c_enc = c.encoding();
+  VERIFY( c_enc == std::text_encoding::ASCII );
+
+  const std::locale fr = std::locale(ISO_8859(15, fr_FR));
+  std::text_encoding fr_enc = fr.encoding();
+  VERIFY( fr_enc == std::text_encoding::ISO885915 );
+
+  const std::locale en = std::locale(ISO_8859(1, en_US));
+  std::text_encoding en_enc = en.encoding();
+  VERIFY( en_enc == std::text_encoding::ISOLatin1 );
+
+#if __cpp_exceptions
+  try {
+    const std::locale c_utf8 = std::locale("C.UTF-8");
+    VERIFY( c_utf8.encoding() == std::text_encoding::UTF8 );
+  } catch (...) {
+  }
+#endif
+}
+
+int main()
+{
+  test_encoding();
+}
diff --git a/libstdc++-v3/testsuite/ext/unicode/charset_alias_match.cc b/libstdc++-v3/testsuite/ext/unicode/charset_alias_match.cc
new file mode 100644
index 00000000000..f6272ae998b
--- /dev/null
+++ b/libstdc++-v3/testsuite/ext/unicode/charset_alias_match.cc
@@ -0,0 +1,18 @@
+// { dg-do compile { target c++20 } }
+#include
+
+using std::__unicode::__charset_alias_match;
+static_assert( __charset_alias_match("UTF-8", "utf8") == true );
+static_assert( __charset_alias_match("UTF-8", "u.t.f-008") == true );
+static_assert( __charset_alias_match("UTF-8", "utf-80") == false );
+static_assert( __charset_alias_match("UTF-8", "ut8") == false );
+
+static_assert( __charset_alias_match("iso8859_1", "ISO-8859-1") == true );
+
+static_assert( __charset_alias_match("", "") == true );
+static_assert( __charset_alias_match("", ".") == true );
+static_assert( __charset_alias_match("--", "...") == true );
+static_assert( __charset_alias_match("--a", "a...") == true );
+static_assert( __charset_alias_match("--a010", "a..10.") == true );
+static_assert( __charset_alias_match("--a010", "a..1.0") == false );
+static_assert( __charset_alias_match("aaaa", "000.00.0a0a)0aa...") == true );
diff --git a/libstdc++-v3/testsuite/std/text_encoding/cons.cc b/libstdc++-v3/testsuite/std/text_encoding/cons.cc
new file mode 100644
index 00000000000..b9d93641de4
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/text_encoding/cons.cc
@@ -0,0 +1,113 @@
+// { dg-do run { target c++26 } }
+
+#include
+#include
+#include
+
+using namespace std::string_view_literals;
+
+constexpr void
+test_default_construct()
+{
+  std::text_encoding e0;
+  VERIFY( e0.mib() == std::text_encoding::unknown );
+  VERIFY( e0.name()[0] == '\0' ); // P2862R1 name() should never return null
+  VERIFY( e0.aliases().empty() );
+}
+
+constexpr void
+test_construct_by_name()
+{
+  std::string_view s;
+  std::text_encoding e0(s);
+  VERIFY( e0.mib() == std::text_encoding::other );
+  VERIFY( e0.name() == s );
+  VERIFY( e0.aliases().empty() );
+
+  s = "not a real encoding";
+  std::text_encoding e1(s);
+  VERIFY( e1.mib() == std::text_encoding::other );
+  VERIFY( e1.name() == s );
+  VERIFY( e1.aliases().empty() );
+
+  VERIFY( e1 != e0 );
+  VERIFY( e1 == e0.mib() );
+
+  s = "utf8";
+  std::text_encoding e2(s);
+  VERIFY( e2.mib() == std::text_encoding::UTF8 );
+  VERIFY( e2.name() == s );
+  VERIFY( ! e2.aliases().empty() );
+  VERIFY( e2.aliases().front() == "UTF-8"sv );
+
+  s = "Latin-1"; // matches "latin1"
+  std::text_encoding e3(s);
+  VERIFY( e3.mib() == std::text_encoding::ISOLatin1 );
+  VERIFY( e3.name() == s );
+  VERIFY( ! e3.aliases().empty() );
+  VERIFY( e3.aliases().front() == "ISO_8859-1:1987"sv ); // primary name
+
+  s = "U.S."; // matches "us"
+  std::text_encoding e4(s);
+  VERIFY( e4.mib() == std::text_encoding::ASCII );
+  VERIFY( e4.name() == s );
+  VERIFY( ! e4.aliases().empty() );
+  VERIFY( e4.aliases().front() == "US-ASCII"sv ); // primary name
+}
+
+constexpr void
+test_construct_by_id()
+{
+  std::text_encoding e0(std::text_encoding::other);
+  VERIFY( e0.mib() == std::text_encoding::other );
+  VERIFY( e0.name() == ""sv );
+  VERIFY( e0.aliases().empty() );
+
+  std::text_encoding e1(std::text_encoding::unknown);
+  VERIFY( e1.mib() == std::text_encoding::unknown );
+  VERIFY( e1.name() == ""sv );
+  VERIFY( e1.aliases().empty() );
+
+  std::text_encoding e2(std::text_encoding::UTF8);
+  VERIFY( e2.mib() == std::text_encoding::UTF8 );
+  VERIFY( e2.name() == "UTF-8"sv );
+  VERIFY( ! e2.aliases().empty() );
+  VERIFY( e2.aliases().front() == std::string_view(e2.name()) );
+  bool found = false;
+  for (auto alias : e2.aliases())
+    if (alias == "csUTF8"sv)
+    {
+      found = true;
+      break;
+    }
+  VERIFY( found );
+}
+
+constexpr void
+test_copy_construct()
+{
+  std::text_encoding e0;
+  std::text_encoding e1 = e0;
+  VERIFY( e1 == e0 );
+
+  std::text_encoding e2(std::text_encoding::UTF8);
+  auto e3 = e2;
+  VERIFY( e3 == e2 );
+
+  e1 = e3;
+  VERIFY( e1 == e2 );
+}
+
+int main()
+{
+  auto run_tests = [] {
+    test_default_construct();
+    test_construct_by_name();
+    test_construct_by_id();
+    test_copy_construct();
+    return true;
+  };
+
+  run_tests();
+  static_assert( run_tests() );
+}
diff --git a/libstdc++-v3/testsuite/std/text_encoding/members.cc b/libstdc++-v3/testsuite/std/text_encoding/members.cc
new file mode 100644
index 00000000000..0b0d6bd0c96
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/text_encoding/members.cc
@@ -0,0 +1,41 @@
+// { dg-options "-lstdc++exp" }
+// { dg-do run { target c++26 } }
+// { dg-require-namedlocale "en_US.ISO8859-1" }
+// { dg-require-namedlocale "fr_FR.ISO8859-15" }
+
+#include
+#include
+#include
+#include
+
+using namespace std::string_view_literals;
+
+void
+test_literal()
+{
+  const std::text_encoding lit = std::text_encoding::literal();
+  VERIFY( lit.name() == std::string_view(__GNUC_EXECUTION_CHARSET_NAME) );
+}
+
+void
+test_env()
+{
+  const std::text_encoding env = std::text_encoding::environment();
+
+  if (env.mib() == std::text_encoding::UTF8)
+    VERIFY( std::text_encoding::environment_is() );
+
+  ::setlocale(LC_ALL, ISO_8859(1, en_US));
+  const std::text_encoding env1 = std::text_encoding::environment();
+  VERIFY( env1 == env );
+
+  ::setlocale(LC_ALL, ISO_8859(15, fr_FR));
+  const std::text_encoding env2 = std::text_encoding::environment();
+  VERIFY( env2 == env );
+}
+
+int main()
+{
+  test_literal();
+  test_env();
+}
diff --git a/libstdc++-v3/testsuite/std/text_encoding/requirements.cc b/libstdc++-v3/testsuite/std/text_encoding/requirements.cc
new file mode 100644
index 00000000000..d62d93dcda4
--- /dev/null
+++ b/libstdc++-v3/testsuite/std/text_encoding/requirements.cc
@@ -0,0 +1,31 @@
+// { dg-do compile { target c++26 } }
+// { dg-add-options no_pch }
+
+#include
+#ifndef __cpp_lib_text_encoding
+# error "Feature-test macro for text_encoding missing in "
+#elif __cpp_lib_text_encoding != 202306L
+# error "Feature-test macro for text_encoding has wrong value in "
+#endif
+
+#undef __cpp_lib_expected
+#include
+#ifndef __cpp_lib_text_encoding
+# error "Feature-test macro for text_encoding missing in "
+#elif __cpp_lib_text_encoding != 202306L
+# error "Feature-test macro for text_encoding has wrong value in "
+#endif
+
+#include
+#include
+static_assert( std::is_trivially_copyable_v );
+
+using aliases_view = std::text_encoding::aliases_view;
+static_assert( std::copyable );
+static_assert( std::ranges::view );
+static_assert( std::ranges::random_access_range );
+static_assert( std::ranges::borrowed_range );
+static_assert( std::same_as,
+               const char*> );
+static_assert( std::same_as,
+               const char*> );