From patchwork Fri Nov 17 15:54:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Wakely X-Patchwork-Id: 166276 From: Jonathan
Wakely To: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org Subject: [PATCH 1/2] libstdc++: Implement C++23 <print> header [PR107760] Date: Fri, 17 Nov 2023 15:54:38 +0000 Message-ID: <20231117160320.1513815-1-jwakely@redhat.com> List-Id: Gcc-patches mailing list

There's a TODO here about checking for invalid UTF-8, which is done by the next patch. I don't know if the Windows code actually works. I tried to test it with mingw and Wine, but I got garbled text. I'm not sure whether that's caused by my code here, the conversion to UTF-16, how I'm testing, Wine in a Linux terminal not properly emulating the Windows console, or something else. This needs tests, so I need to write them before pushing, but I still plan to get that done for GCC 14.

-- >8 -- libstdc++-v3/ChangeLog: PR libstdc++/107760 * config/abi/pre/gnu.ver: Export new symbols. * include/Makefile.am: Add new header. * include/Makefile.in: Regenerate. * include/bits/version.def (__cpp_lib_print): Define. * include/bits/version.h: Regenerate. * include/std/ostream (vprint_nonunicode, vprint_unicode) (print, println): New functions. * include/std/print: New file. 
* src/c++20/Makefile.am: Add new source file. * src/c++20/Makefile.in: Regenerate. * src/c++98/globals_io.cc [_WIN32] (__fd_for_console): New function. * src/c++20/print.cc: New file. --- libstdc++-v3/config/abi/pre/gnu.ver | 4 + libstdc++-v3/include/Makefile.am | 1 + libstdc++-v3/include/Makefile.in | 1 + libstdc++-v3/include/bits/version.def | 9 ++ libstdc++-v3/include/bits/version.h | 29 ++++-- libstdc++-v3/include/std/ostream | 102 ++++++++++++++++++++ libstdc++-v3/include/std/print | 128 ++++++++++++++++++++++++++ libstdc++-v3/src/c++20/Makefile.am | 2 +- libstdc++-v3/src/c++20/Makefile.in | 4 +- libstdc++-v3/src/c++20/print.cc | 35 +++++++ libstdc++-v3/src/c++98/globals_io.cc | 23 +++++ 11 files changed, 326 insertions(+), 12 deletions(-) create mode 100644 libstdc++-v3/include/std/print create mode 100644 libstdc++-v3/src/c++20/print.cc diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver index 15b50d51251..c7200929e34 100644 --- a/libstdc++-v3/config/abi/pre/gnu.ver +++ b/libstdc++-v3/config/abi/pre/gnu.ver @@ -2514,6 +2514,10 @@ GLIBCXX_3.4.31 { _ZNKSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE[012]EEcvbEv; _ZNKSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE[012]EEcvbEv; + # These are only defined for *-*-mingw* + _ZSt16__fd_for_consolePSt15basic_streambufIcSt11char_traitsIcEE; + _ZSt24__write_utf16_to_consoleiNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE; + } GLIBCXX_3.4.30; GLIBCXX_3.4.32 { diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am index 17d9d9cec31..368b92eafbc 100644 --- a/libstdc++-v3/include/Makefile.am +++ b/libstdc++-v3/include/Makefile.am @@ -85,6 +85,7 @@ std_headers = \ ${std_srcdir}/memory_resource \ ${std_srcdir}/mutex \ ${std_srcdir}/ostream \ + ${std_srcdir}/print \ ${std_srcdir}/queue \ ${std_srcdir}/random \ ${std_srcdir}/regex \ diff --git 
a/libstdc++-v3/include/Makefile.in b/libstdc++-v3/include/Makefile.in index f038af709cc..a31588c0100 100644 --- a/libstdc++-v3/include/Makefile.in +++ b/libstdc++-v3/include/Makefile.in @@ -441,6 +441,7 @@ std_freestanding = \ @GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/memory_resource \ @GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/mutex \ @GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/ostream \ +@GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/print \ @GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/queue \ @GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/random \ @GLIBCXX_HOSTED_TRUE@ ${std_srcdir}/regex \ diff --git a/libstdc++-v3/include/bits/version.def b/libstdc++-v3/include/bits/version.def index 15bd502f52c..8b5cace3775 100644 --- a/libstdc++-v3/include/bits/version.def +++ b/libstdc++-v3/include/bits/version.def @@ -1578,6 +1578,15 @@ ftms = { }; }; +ftms = { + name = print; + values = { + v = 202211; + cxxmin = 23; + hosted = yes; + }; +}; + ftms = { name = spanstream; values = { diff --git a/libstdc++-v3/include/bits/version.h b/libstdc++-v3/include/bits/version.h index 9563b6cd2f7..f197408e60f 100644 --- a/libstdc++-v3/include/bits/version.h +++ b/libstdc++-v3/include/bits/version.h @@ -1923,6 +1923,17 @@ #undef __glibcxx_want_out_ptr // from version.def line 1582 +#if !defined(__cpp_lib_print) +# if (__cplusplus >= 202100L) && _GLIBCXX_HOSTED +# define __glibcxx_print 202211L +# if defined(__glibcxx_want_all) || defined(__glibcxx_want_print) +# define __cpp_lib_print 202211L +# endif +# endif +#endif /* !defined(__cpp_lib_print) && defined(__glibcxx_want_print) */ +#undef __glibcxx_want_print + +// from version.def line 1591 #if !defined(__cpp_lib_spanstream) # if (__cplusplus >= 202100L) && _GLIBCXX_HOSTED && (__glibcxx_span) # define __glibcxx_spanstream 202106L diff --git a/libstdc++-v3/include/std/ostream b/libstdc++-v3/include/std/ostream index 1de1c1bd359..e81c39a7c80 100644 --- a/libstdc++-v3/include/std/ostream +++ b/libstdc++-v3/include/std/ostream @@ -39,6 +39,11 @@ #include #include +#if __cplusplus > 
202002L +# include +#endif + +# define __glibcxx_want_print #include // __glibcxx_syncbuf namespace std _GLIBCXX_VISIBILITY(default) @@ -872,6 +877,103 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION } #endif // __glibcxx_syncbuf +#if __cpp_lib_print // C++ >= 23 + + inline void + vprint_nonunicode(ostream& __os, string_view __fmt, format_args __args) + { + ostream::sentry __cerb(__os); + if (__cerb) + { + string __out = std::vformat(__fmt, __args); + __try + { + const streamsize __w = __os.width(); + const streamsize __n = __out.size(); + if (__w > __n) + { + const bool __left + = (__os.flags() & ios_base::adjustfield) == ios_base::left; + if (!__left) + std::__ostream_fill(__os, __w - __n); + if (__os.good()) + std::__ostream_write(__os, __out.data(), __n); + if (__left && __os.good()) + std::__ostream_fill(__os, __w - __n); + } + else + std::__ostream_write(__os, __out.data(), __n); + } + __catch(const __cxxabiv1::__forced_unwind&) + { + __os._M_setstate(ios_base::badbit); + __throw_exception_again; + } + __catch(...) + { __os._M_setstate(ios_base::badbit); } + } + } + + inline void + vprint_unicode(ostream& __os, string_view __fmt, format_args __args) + { + // TODO: diagnose invalid UTF-8 code units +#ifdef _WIN32 + int __fd_for_console(std::streambuf*); + void __write_utf16_to_console(int, string); + + // If stream refers to a terminal convert to UTF-16 and use WriteConsoleW. + if (int __fd = __fd_for_console(__os.rdbuf()); __fd >= 0) + { + ostream::sentry __cerb(__os); + if (__cerb) + { + string __out = std::vformat(__fmt, __args); + ios_base::iostate __err = ios_base::goodbit; + __try + { + if (__os.rdbuf()->pubsync() == -1) + __err = ios::badbit; + else if (__write_utf16_to_console(__fd, __out)) + __err = ios::badbit; + } + __catch(const __cxxabiv1::__forced_unwind&) + { + __os._M_setstate(ios_base::badbit); + __throw_exception_again; + } + __catch(...) 
+ { __os._M_setstate(ios_base::badbit); } + + if (__err) + __os.setstate(__err); + } + } +#endif + std::vprint_nonunicode(__os, __fmt, __args); + } + + + template + inline void + print(ostream& __os, format_string<_Args...> __fmt, _Args&&... __args) + { + auto __fmtargs = std::make_format_args(std::forward<_Args>(__args)...); + if constexpr (string_view(__GNUC_EXECUTION_CHARSET_NAME) == "UTF-8") + std::vprint_unicode(__os, __fmt.get(), __fmtargs); + else + std::vprint_nonunicode(__os, __fmt.get(), __fmtargs); + } + + template + inline void + println(ostream& __os, format_string<_Args...> __fmt, _Args&&... __args) + { + std::print(__os, "{}\n", + std::format(__fmt, std::forward<_Args>(__args)...)); + } +#endif // __cpp_lib_print + #endif // C++11 _GLIBCXX_END_NAMESPACE_VERSION diff --git a/libstdc++-v3/include/std/print b/libstdc++-v3/include/std/print new file mode 100644 index 00000000000..75e78841247 --- /dev/null +++ b/libstdc++-v3/include/std/print @@ -0,0 +1,128 @@ +// Print functions -*- C++ -*- + +// Copyright The GNU Toolchain Authors. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// Under Section 7 of GPL version 3, you are granted additional +// permissions described in the GCC Runtime Library Exception, version +// 3.1, as published by the Free Software Foundation. 
+ +// You should have received a copy of the GNU General Public License and +// a copy of the GCC Runtime Library Exception along with this program; +// see the files COPYING3 and COPYING.RUNTIME respectively. If not, see +// . + +/** @file include/print + * This is a Standard C++ Library header. + */ + +#ifndef _GLIBCXX_PRINT +#define _GLIBCXX_PRINT 1 + +#pragma GCC system_header + +#include // for std::format + +#define __glibcxx_want_print +#include + +#ifdef __cpp_lib_print // C++ >= 23 + +#include +#include +#include +#include + +#ifdef _WIN32 +# include +#endif + +namespace std _GLIBCXX_VISIBILITY(default) +{ +_GLIBCXX_BEGIN_NAMESPACE_VERSION + + inline void + vprint_nonunicode(FILE* __stream, string_view __fmt, format_args __args) + { + string __out = std::vformat(__fmt, __args); + if (std::fwrite(__out.data(), 1, __out.size(), __stream) != __out.size()) + __throw_system_error(EIO); + } + + inline void + vprint_unicode(FILE* __stream, string_view __fmt, format_args __args) + { + // TODO: diagnose invalid UTF-8 code units +#ifdef _WIN32 + int __fd_for_console(FILE*); + void __write_utf16_to_console(int, string); + + // If stream refers to a terminal convert to UTF-16 and use WriteConsoleW. + if (int __fd = __fd_for_console(__stream); __fd >= 0) + { + string __out = std::vformat(__fmt, __args); + error_code __e; + if (!std::fflush(__stream)) + { + if (!(__e = __write_utf16_to_console(__fd, __out))) + return; + } + else + __e = error_code(errno, generic_category()); + _GLIBCXX_THROW_OR_ABORT(system_error(__e, "std::vprint_unicode")); + } +#endif + std::vprint_nonunicode(__stream, __fmt, __args); + } + + template + inline void + print(FILE* __stream, format_string<_Args...> __fmt, _Args&&... 
__args) + { + auto __fmtargs = std::make_format_args(std::forward<_Args>(__args)...); + if constexpr (string_view(__GNUC_EXECUTION_CHARSET_NAME) == "UTF-8") + std::vprint_unicode(__stream, __fmt.get(), __fmtargs); + else + std::vprint_nonunicode(__stream, __fmt.get(), __fmtargs); + } + + template + inline void + print(format_string<_Args...> __fmt, _Args&&... __args) + { std::print(stdout, __fmt, std::forward<_Args>(__args)...); } + + template + inline void + println(FILE* __stream, format_string<_Args...> __fmt, _Args&&... __args) + { + std::print(__stream, "{}\n", + std::format(__fmt, std::forward<_Args>(__args)...)); + } + + template + inline void + println(format_string<_Args...> __fmt, _Args&&... __args) + { std::println(stdout, __fmt, std::forward<_Args>(__args)...); } + + inline void + vprint_unicode(string_view __fmt, format_args __args) + { std::vprint_unicode(stdout, __fmt, __args); } + + inline void + vprint_nonunicode(string_view __fmt, format_args __args) + { std::vprint_nonunicode(stdout, __fmt, __args); } + +_GLIBCXX_END_NAMESPACE_VERSION +} // namespace std +#endif // __cpp_lib_print +#endif // _GLIBCXX_PRINT diff --git a/libstdc++-v3/src/c++20/Makefile.am b/libstdc++-v3/src/c++20/Makefile.am index e947855e6ae..3cdc6521bb4 100644 --- a/libstdc++-v3/src/c++20/Makefile.am +++ b/libstdc++-v3/src/c++20/Makefile.am @@ -36,7 +36,7 @@ else inst_sources = endif -sources = tzdb.cc +sources = tzdb.cc print.cc vpath % $(top_srcdir)/src/c++20 diff --git a/libstdc++-v3/src/c++20/Makefile.in b/libstdc++-v3/src/c++20/Makefile.in index 3ec8c5ce804..b732e6fc005 100644 --- a/libstdc++-v3/src/c++20/Makefile.in +++ b/libstdc++-v3/src/c++20/Makefile.in @@ -121,7 +121,7 @@ CONFIG_CLEAN_FILES = CONFIG_CLEAN_VPATH_FILES = LTLIBRARIES = $(noinst_LTLIBRARIES) libc__20convenience_la_LIBADD = -am__objects_1 = tzdb.lo +am__objects_1 = tzdb.lo print.lo @ENABLE_EXTERN_TEMPLATE_TRUE@am__objects_2 = sstream-inst.lo @GLIBCXX_HOSTED_TRUE@am_libc__20convenience_la_OBJECTS = \ 
@GLIBCXX_HOSTED_TRUE@ $(am__objects_1) $(am__objects_2) @@ -432,7 +432,7 @@ headers = @ENABLE_EXTERN_TEMPLATE_TRUE@inst_sources = \ @ENABLE_EXTERN_TEMPLATE_TRUE@ sstream-inst.cc -sources = tzdb.cc +sources = tzdb.cc print.cc @GLIBCXX_HOSTED_FALSE@libc__20convenience_la_SOURCES = @GLIBCXX_HOSTED_TRUE@libc__20convenience_la_SOURCES = $(sources) $(inst_sources) diff --git a/libstdc++-v3/src/c++20/print.cc b/libstdc++-v3/src/c++20/print.cc new file mode 100644 index 00000000000..d97a0c71dfe --- /dev/null +++ b/libstdc++-v3/src/c++20/print.cc @@ -0,0 +1,35 @@ +#ifdef _WIN32 +#include +#include +#include +#include // _fileno +#include // _get_osfhandle, _isatty +#include // GetLastError, WriteConsoleW + +namespace std _GLIBCXX_VISIBILITY(default) +{ + int + __fd_for_console(FILE* f) + { + if (int fd = _fileno(f); fd >= 0 && _isatty(fd)) + return fd; + return -1; + } + + error_code + __write_utf16_to_console(int fd, string str) + { + std::wstring wstr; + std::codecvt_utf8_utf16 wcvt; + const auto p = str.data(); + unsigned long nchars = 0; + if (!std::__str_codecvt_in_all(p, p + str.size(), wstr, wcvt)) + return std::make_error_code(errc::illegal_byte_sequence); + WriteConsoleW(reinterpret_cast(_get_osfhandle(fd)), + wstr.data(), wstr.size(), &nchars, nullptr); + if (nchars != wstr.size()) + return {(int)GetLastError(), system_category()}; + return {}; + } +} +#endif // _WIN32 diff --git a/libstdc++-v3/src/c++98/globals_io.cc b/libstdc++-v3/src/c++98/globals_io.cc index 0c4f270977d..538efd01b53 100644 --- a/libstdc++-v3/src/c++98/globals_io.cc +++ b/libstdc++-v3/src/c++98/globals_io.cc @@ -107,3 +107,26 @@ namespace __gnu_internal _GLIBCXX_VISIBILITY(hidden) fake_wfilebuf buf_wcerr; #endif } // namespace __gnu_internal + +#ifdef _WIN32 +namespace std _GLIBCXX_VISIBILITY(default) +{ + int __fd_for_console(FILE*); + + int __fd_for_console(std::streambuf* sb) + { + using namespace __gnu_internal; + using namespace __gnu_cxx; + + FILE* f = NULL; + void* p = sb; + if (p == 
buf_cout_sync || p == buf_cin_sync || p == buf_cerr_sync) + f = dynamic_cast*>(sb)->file(); + else if (p == buf_cout || p == buf_cin || p == buf_cerr) + f = dynamic_cast*>(sb)->file(); + else + return -1; + return __fd_for_console(f); + } +} +#endif
From patchwork Fri Nov 17 15:54:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Wakely X-Patchwork-Id: 166239 From: Jonathan Wakely To: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org Subject: [PATCH 2/2] libstdc++: Ensure valid UTF-8 in std::vprint_unicode Date: Fri, 17 Nov 2023 15:54:39 +0000 Message-ID: <20231117160320.1513815-2-jwakely@redhat.com> In-Reply-To: <20231117160320.1513815-1-jwakely@redhat.com> References: <20231117160320.1513815-1-jwakely@redhat.com> List-Id: Gcc-patches mailing list

This is a naive implementation of the UTF-8 validation algorithm, which could definitely be optimized. But it's faster than using std::codecvt_utf8 and checking the result of that, which is the only existing code we have to do it in the library. As the TODO suggests, we could do the UTF-8 to UTF-16 conversion at the same time. But that is only needed for Windows and as I said in the 1/2 email, the output for Windows seems to be broken currently anyway and I can't test it properly.
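
For anyone who wants to try the validation rules outside the library, here is a rough standalone sketch of the same idea. It is not the code in this patch: the name sanitize_utf8 is hypothetical, and it builds a new string directly instead of marking bytes in place and rewriting afterwards. It assumes the usual well-formedness ranges for UTF-8 (no overlong forms, no surrogates, nothing above U+10FFFF) and replaces each ill-formed subsequence with U+FFFD.

```cpp
#include <cassert>   // only needed for the usage checks mentioned below
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of the patch): returns a copy of `in` in
// which every ill-formed UTF-8 subsequence is replaced by U+FFFD.
std::string sanitize_utf8(std::string_view in)
{
  static constexpr char replacement[] = "\xEF\xBF\xBD"; // U+FFFD as UTF-8
  std::string out;
  out.reserve(in.size());

  std::size_t i = 0;
  while (i < in.size())
    {
      const unsigned char lead = in[i];
      unsigned needed = 0;                // continuation bytes expected
      unsigned char lo = 0x80, hi = 0xBF; // valid range for the next byte

      if (lead <= 0x7F)
        {
          out += in[i++];                 // ASCII, copy through
          continue;
        }
      else if (lead >= 0xC2 && lead <= 0xDF)
        needed = 1;
      else if (lead >= 0xE0 && lead <= 0xEF)
        {
          needed = 2;
          if (lead == 0xE0) lo = 0xA0;      // reject overlong forms
          else if (lead == 0xED) hi = 0x9F; // reject UTF-16 surrogates
        }
      else if (lead >= 0xF0 && lead <= 0xF4)
        {
          needed = 3;
          if (lead == 0xF0) lo = 0x90;      // reject overlong forms
          else if (lead == 0xF4) hi = 0x8F; // reject values above U+10FFFF
        }
      else
        {
          // 0x80..0xC1 and 0xF5..0xFF can never start a sequence.
          out += replacement;
          ++i;
          continue;
        }

      std::size_t j = i + 1;
      unsigned seen = 0;
      while (seen < needed && j < in.size())
        {
          const unsigned char b = in[j];
          if (b < lo || b > hi)
            break;
          lo = 0x80; hi = 0xBF; // only the first continuation is restricted
          ++seen;
          ++j;
        }

      if (seen == needed)
        out.append(in.data() + i, j - i); // well-formed, copy unchanged
      else
        out += replacement;               // truncated or invalid sequence
      i = j; // resume at the byte that failed so it is examined again
    }
  return out;
}
```

With this sketch, well-formed input such as "caf\xC3\xA9" comes back unchanged, a truncated sequence such as "\xE2\x82" becomes a single U+FFFD, and a stray byte as in "a\xFFz" becomes "a", U+FFFD, "z". Building a fresh string keeps the sketch short; the in-place approach in the patch avoids the extra allocation when the input is already valid.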
-- >8 -- libstdc++-v3/ChangeLog: * include/bits/locale_conv.h (__to_valid_utf8): New function. * include/std/ostream (vprint_unicode): Use it. * include/std/print (vprint_unicode): Use it. --- libstdc++-v3/include/bits/locale_conv.h | 104 ++++++++++++++++++++++++ libstdc++-v3/include/std/ostream | 74 +++++++++++------ libstdc++-v3/include/std/print | 8 +- 3 files changed, 160 insertions(+), 26 deletions(-) diff --git a/libstdc++-v3/include/bits/locale_conv.h b/libstdc++-v3/include/bits/locale_conv.h index 284142a360a..f6ade1d0395 100644 --- a/libstdc++-v3/include/bits/locale_conv.h +++ b/libstdc++-v3/include/bits/locale_conv.h @@ -624,6 +624,110 @@ _GLIBCXX_END_NAMESPACE_CXX11 bool _M_always_noconv; }; +#if __cplusplus >= 202002L + template + bool + __to_valid_utf8(string& __s) + { + // TODO if _CharT is wchar_t then transcode at the same time. + + unsigned __seen = 0, __needed = 0; + unsigned char __lo_bound = 0x80, __hi_bound = 0xBF; + size_t __errors = 0; + + auto __q = __s.data(), __eoq = __q + __s.size(); + while (__q != __eoq) + { + unsigned char __byte = *__q; + if (__needed == 0) + { + if (__byte <= 0x7F) // 0x00 to 0x7F + { + while (++__q != __eoq && (unsigned char)*__q <= 0x7F) + { } // Fast forward to the next non-ASCII character. 
+ continue; + } + else if (__byte < 0xC2) + { + *__q = 0xFF; + ++__errors; + } + else if (__byte <= 0xDF) // 0xC2 to 0xDF + { + __needed = 1; + } + else if (__byte <= 0xEF) // 0xE0 to 0xEF + { + if (__byte == 0xE0) + __lo_bound = 0xA0; + else if (__byte == 0xED) + __hi_bound = 0x9F; + + __needed = 2; + } + else if (__byte <= 0xF4) // 0xF0 to 0xF4 + { + if (__byte == 0xF0) + __lo_bound = 0x90; + else if (__byte == 0xF4) + __hi_bound = 0x8F; + + __needed = 3; + } + else + { + *__q = 0xFF; + ++__errors; + } + } + else + { + if (__byte < __lo_bound || __byte > __hi_bound) + { + *(__q - __seen - 1) = 0xFF; + __builtin_memset(__q - __seen, 0xFE, __seen); + ++__errors; + __needed = __seen = 0; + __lo_bound = 0x80; + __hi_bound = 0xBF; + continue; // Reprocess the current character. + } + + __lo_bound = 0x80; + __hi_bound = 0xBF; + ++__seen; + if (__seen == __needed) + __needed = __seen = 0; + } + __q++; + } + + if (__needed) + { + // The string ends with an incomplete multibyte sequence. + if (__seen) + __s.resize(__s.size() - __seen); + __s.back() = 0xFF; + ++__errors; + } + + if (__errors == 0) + return true; + + string __s2; + __s2.reserve(__s.size() + __errors * 2); + for (unsigned char __byte : __s) + { + if (__byte == 0xFF) + __s2 += "\uFFFD"; + else if (__byte != 0xFE) + __s2 += (char)__byte; + } + __s = std::move(__s2); + return false; + } +#endif // C++20 + /// @} group locales _GLIBCXX_END_NAMESPACE_VERSION diff --git a/libstdc++-v3/include/std/ostream b/libstdc++-v3/include/std/ostream index e81c39a7c80..760aaa206da 100644 --- a/libstdc++-v3/include/std/ostream +++ b/libstdc++-v3/include/std/ostream @@ -917,42 +917,68 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION inline void vprint_unicode(ostream& __os, string_view __fmt, format_args __args) { - // TODO: diagnose invalid UTF-8 code units -#ifdef _WIN32 - int __fd_for_console(std::streambuf*); - void __write_utf16_to_console(int, string); - - // If stream refers to a terminal convert to UTF-16 and use WriteConsoleW. 
- if (int __fd = __fd_for_console(__os.rdbuf()); __fd >= 0) + ostream::sentry __cerb(__os); + if (__cerb) { - ostream::sentry __cerb(__os); - if (__cerb) + string __out = std::vformat(__fmt, __args); + std::__to_valid_utf8(__out); + +#ifdef _WIN32 + int __fd_for_console(std::streambuf*); + void __write_utf16_to_console(int, string); + + // If stream refers to a terminal output UTF-16 using WriteConsoleW. + if (int __fd = __fd_for_console(__os.rdbuf()); __fd >= 0) { - string __out = std::vformat(__fmt, __args); ios_base::iostate __err = ios_base::goodbit; __try - { - if (__os.rdbuf()->pubsync() == -1) - __err = ios::badbit; - else if (__write_utf16_to_console(__fd, __out)) - __err = ios::badbit; - } + { + if (__os.rdbuf()->pubsync() == -1) + __err = ios::badbit; + else if (__write_utf16_to_console(__fd, __out)) + __err = ios::badbit; + } __catch(const __cxxabiv1::__forced_unwind&) - { - __os._M_setstate(ios_base::badbit); - __throw_exception_again; - } + { + __os._M_setstate(ios_base::badbit); + __throw_exception_again; + } __catch(...) - { __os._M_setstate(ios_base::badbit); } + { __os._M_setstate(ios_base::badbit); } if (__err) __os.setstate(__err); + return; } - } #endif - std::vprint_nonunicode(__os, __fmt, __args); - } + __try + { + const streamsize __w = __os.width(); + const streamsize __n = __out.size(); + if (__w > __n) + { + const bool __left + = (__os.flags() & ios_base::adjustfield) == ios_base::left; + if (!__left) + std::__ostream_fill(__os, __w - __n); + if (__os.good()) + std::__ostream_write(__os, __out.data(), __n); + if (__left && __os.good()) + std::__ostream_fill(__os, __w - __n); + } + else + std::__ostream_write(__os, __out.data(), __n); + } + __catch(const __cxxabiv1::__forced_unwind&) + { + __os._M_setstate(ios_base::badbit); + __throw_exception_again; + } + __catch(...) 
+ { __os._M_setstate(ios_base::badbit); } + } + } template inline void diff --git a/libstdc++-v3/include/std/print b/libstdc++-v3/include/std/print index 75e78841247..096b97b1ef7 100644 --- a/libstdc++-v3/include/std/print +++ b/libstdc++-v3/include/std/print @@ -62,7 +62,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION inline void vprint_unicode(FILE* __stream, string_view __fmt, format_args __args) { - // TODO: diagnose invalid UTF-8 code units + string __out = std::vformat(__fmt, __args); + std::__to_valid_utf8(__out); + #ifdef _WIN32 int __fd_for_console(FILE*); void __write_utf16_to_console(int, string); @@ -82,7 +84,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION _GLIBCXX_THROW_OR_ABORT(system_error(__e, "std::vprint_unicode")); } #endif - std::vprint_nonunicode(__stream, __fmt, __args); + + if (std::fwrite(__out.data(), 1, __out.size(), __stream) != __out.size()) + __throw_system_error(EIO); } template