From patchwork Tue Jan 10 12:58:59 2023
X-Patchwork-Submitter: Dimitrij Mijoski
X-Patchwork-Id: 41451
From: Dimitrij Mijoski
Reply-To: Dimitrij Mijoski
Subject: [PATCH v2] libstdc++: Fix Unicode codecvt and add tests [PR86419]
To: gcc-patches@gcc.gnu.org, libstdc++@gcc.gnu.org
Date: Tue, 10 Jan 2023 13:58:59 +0100
List-Id: Gcc-patches mailing list

Fixes the conversion from UTF-8 to UTF-16 to properly return partial
instead of ok.
Fixes the conversion from UTF-16 to UTF-8 to properly return partial
instead of ok.
Fixes the conversion from UTF-8 to UCS-2 to properly return partial
instead of error.
Fixes the conversion from UTF-8 to UCS-2 to treat 4-byte UTF-8 sequences
as an error just by seeing the leading byte.
Fixes UTF-8 decoding for all codecvts so they detect an error at the end
of the input range when the last code point is also incomplete.

libstdc++-v3/ChangeLog:

	PR libstdc++/86419
	* src/c++11/codecvt.cc: Fix bugs.
	* testsuite/22_locale/codecvt/codecvt_unicode.cc: New tests.
	* testsuite/22_locale/codecvt/codecvt_unicode.h: New tests.
	* testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc: New tests.
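The last fix listed above (report an invalid byte even when the final sequence is also cut short) can be illustrated with a small standalone classifier. This is only a sketch of the checking order, not the code in codecvt.cc: `utf8_status`, `classify_utf8` and `check_examples` are hypothetical names, the lead-byte range check is an assumption of the sketch, and surrogate code points are not rejected here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type for this sketch; codecvt.cc uses sentinel
// code point values instead.
enum class utf8_status { ok, incomplete, invalid };

// Classify the UTF-8 sequence that starts at p, given `avail` readable bytes.
// Every continuation byte is validated as soon as it is available, so a
// malformed prefix is reported as invalid even if the sequence is truncated.
inline utf8_status classify_utf8(const unsigned char* p, std::size_t avail)
{
  if (avail == 0)
    return utf8_status::incomplete;
  const unsigned char c1 = p[0];
  if (c1 < 0x80)
    return utf8_status::ok;               // 1-byte sequence
  if (c1 < 0xC2 || c1 > 0xF4)             // assumption: reject bad lead bytes
    return utf8_status::invalid;
  const std::size_t len = c1 < 0xE0 ? 2 : c1 < 0xF0 ? 3 : 4;

  if (avail < 2)
    return utf8_status::incomplete;       // only the lead byte is present
  const unsigned char c2 = p[1];
  if ((c2 & 0xC0) != 0x80)
    return utf8_status::invalid;
  if (c1 == 0xE0 && c2 < 0xA0)            // overlong 3-byte form
    return utf8_status::invalid;
  if (c1 == 0xF0 && c2 < 0x90)            // overlong 4-byte form
    return utf8_status::invalid;
  if (c1 == 0xF4 && c2 >= 0x90)           // above U+10FFFF
    return utf8_status::invalid;

  // Remaining continuation bytes: check availability, then validity.
  for (std::size_t i = 2; i < len; ++i)
    {
      if (avail <= i)
        return utf8_status::incomplete;
      if ((p[i] & 0xC0) != 0x80)
        return utf8_status::invalid;
    }
  return utf8_status::ok;
}

// Two inputs of three bytes each, both shorter than the 4 bytes that the
// lead byte 0xF4 announces.
inline void check_examples()
{
  const unsigned char truncated[] = {0xF4, 0x8A, 0xAA};  // valid prefix of U+10AAAA
  const unsigned char malformed[] = {0xF4, 'z', 0xAA};   // bad first trailing byte
  assert(classify_utf8(truncated, 3) == utf8_status::incomplete);
  assert(classify_utf8(malformed, 3) == utf8_status::invalid);
}
```

Calling `check_examples()` from a `main` exercises both cases. Checking `avail < 4` up front, as the code did before this patch, would classify both inputs as incomplete.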
---
 libstdc++-v3/src/c++11/codecvt.cc             |   38 +-
 .../22_locale/codecvt/codecvt_unicode.cc      |   68 +
 .../22_locale/codecvt/codecvt_unicode.h       | 1268 +++++++++++++++++
 .../codecvt/codecvt_unicode_wchar_t.cc        |   59 +
 4 files changed, 1414 insertions(+), 19 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
 create mode 100644 libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
 create mode 100644 libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc

diff --git a/libstdc++-v3/src/c++11/codecvt.cc b/libstdc++-v3/src/c++11/codecvt.cc
index 9f8cb7677..49282a510 100644
--- a/libstdc++-v3/src/c++11/codecvt.cc
+++ b/libstdc++-v3/src/c++11/codecvt.cc
@@ -277,13 +277,15 @@ namespace
     }
     else if (c1 < 0xF0) // 3-byte sequence
     {
-      if (avail < 3)
+      if (avail < 2)
         return incomplete_mb_character;
       char32_t c2 = (unsigned char) from[1];
       if ((c2 & 0xC0) != 0x80)
         return invalid_mb_sequence;
       if (c1 == 0xE0 && c2 < 0xA0) // overlong
         return invalid_mb_sequence;
+      if (avail < 3)
+        return incomplete_mb_character;
       char32_t c3 = (unsigned char) from[2];
       if ((c3 & 0xC0) != 0x80)
         return invalid_mb_sequence;
@@ -292,9 +294,9 @@ namespace
       from += 3;
       return c;
     }
-    else if (c1 < 0xF5) // 4-byte sequence
+    else if (c1 < 0xF5 && maxcode > 0xFFFF) // 4-byte sequence
     {
-      if (avail < 4)
+      if (avail < 2)
         return incomplete_mb_character;
       char32_t c2 = (unsigned char) from[1];
       if ((c2 & 0xC0) != 0x80)
@@ -302,10 +304,14 @@ namespace
       if (c1 == 0xF0 && c2 < 0x90) // overlong
         return invalid_mb_sequence;
       if (c1 == 0xF4 && c2 >= 0x90) // > U+10FFFF
-        return invalid_mb_sequence;
+        return invalid_mb_sequence;
+      if (avail < 3)
+        return incomplete_mb_character;
       char32_t c3 = (unsigned char) from[2];
       if ((c3 & 0xC0) != 0x80)
         return invalid_mb_sequence;
+      if (avail < 4)
+        return incomplete_mb_character;
       char32_t c4 = (unsigned char) from[3];
       if ((c4 & 0xC0) != 0x80)
         return invalid_mb_sequence;
@@ -527,12 +533,11 @@ namespace
   // Flag indicating whether to process UTF-16 or UCS2
   enum class surrogates { allowed, disallowed };
 
-  // utf8 -> utf16 (or utf8 -> ucs2 if s == surrogates::disallowed)
-  template
-  codecvt_base::result
-  utf16_in(range& from, range& to,
-           unsigned long maxcode = max_code_point, codecvt_mode mode = {},
-           surrogates s = surrogates::allowed)
+  // utf8 -> utf16 (or utf8 -> ucs2 if maxcode <= 0xFFFF)
+  template
+  codecvt_base::result utf16_in (range &from, range &to,
+                                 unsigned long maxcode = max_code_point,
+                                 codecvt_mode mode = {})
   {
     read_utf8_bom(from, mode);
     while (from.size() && to.size())
@@ -540,12 +545,7 @@ namespace
         auto orig = from;
         const char32_t codepoint = read_utf8_code_point(from, maxcode);
         if (codepoint == incomplete_mb_character)
-          {
-            if (s == surrogates::allowed)
-              return codecvt_base::partial;
-            else
-              return codecvt_base::error; // No surrogates in UCS2
-          }
+          return codecvt_base::partial;
         if (codepoint > maxcode)
           return codecvt_base::error;
         if (!write_utf16_code_point(to, codepoint, mode))
@@ -554,7 +554,7 @@ namespace
             return codecvt_base::partial;
           }
       }
-    return codecvt_base::ok;
+    return from.size () ? codecvt_base::partial : codecvt_base::ok;
   }
 
   // utf16 -> utf8 (or ucs2 -> utf8 if s == surrogates::disallowed)
@@ -576,7 +576,7 @@ namespace
               return codecvt_base::error; // No surrogates in UCS-2
 
             if (from.size() < 2)
-              return codecvt_base::ok; // stop converting at this point
+              return codecvt_base::partial; // stop converting at this point
 
             const char32_t c2 = from[1];
             if (is_low_surrogate(c2))
@@ -629,7 +629,7 @@ namespace
   {
     // UCS-2 only supports characters in the BMP, i.e. one UTF-16 code unit:
     maxcode = std::min(max_single_utf16_unit, maxcode);
-    return utf16_in(from, to, maxcode, mode, surrogates::disallowed);
+    return utf16_in (from, to, maxcode, mode);
   }
 
   // ucs2 -> utf8
diff --git a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
new file mode 100644
index 000000000..ae4b6c896
--- /dev/null
+++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
@@ -0,0 +1,68 @@
+// Copyright (C) 2020-2023 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3. If not see
+// .
+
+// { dg-do run { target c++11 } }
+
+#include "codecvt_unicode.h"
+
+#include
+
+using namespace std;
+
+void
+test_utf8_utf32_codecvts ()
+{
+  using codecvt_c32 = codecvt;
+  auto loc_c = locale::classic ();
+  VERIFY (has_facet (loc_c));
+  auto &cvt = use_facet (loc_c);
+  test_utf8_utf32_codecvts (cvt);
+
+  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
+  test_utf8_utf32_codecvts (*cvt_ptr);
+}
+
+void
+test_utf8_utf16_codecvts ()
+{
+  using codecvt_c16 = codecvt;
+  auto loc_c = locale::classic ();
+  VERIFY (has_facet (loc_c));
+  auto &cvt = use_facet (loc_c);
+  test_utf8_utf16_cvts (cvt);
+
+  auto cvt_ptr = to_unique_ptr (new codecvt_utf8_utf16 ());
+  test_utf8_utf16_cvts (*cvt_ptr);
+
+  auto cvt_ptr2 = to_unique_ptr (new codecvt_utf8_utf16 ());
+  test_utf8_utf16_cvts (*cvt_ptr2);
+}
+
+void
+test_utf8_ucs2_codecvts ()
+{
+  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
+  test_utf8_ucs2_cvts (*cvt_ptr);
+}
+
+int
+main ()
+{
+  test_utf8_utf32_codecvts ();
+  test_utf8_utf16_codecvts ();
+  test_utf8_ucs2_codecvts ();
+}
diff --git a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
new file mode 100644
index 000000000..70d079286
--- /dev/null
+++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
@@ -0,0 +1,1268 @@
+// Copyright (C) 2020-2023 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+ +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// . + +#include +#include +#include + +template +std::unique_ptr +to_unique_ptr (T *ptr) +{ + return std::unique_ptr (ptr); +} + +struct test_offsets_ok +{ + size_t in_size, out_size; +}; +struct test_offsets_partial +{ + size_t in_size, out_size, expected_in_next, expected_out_next; +}; + +template struct test_offsets_error +{ + size_t in_size, out_size, expected_in_next, expected_out_next; + CharT replace_char; + size_t replace_pos; +}; + +template +auto constexpr array_size (const T (&)[N]) -> size_t +{ + return N; +} + +template +void +utf8_to_utf32_in_ok (const std::codecvt &cvt) +{ + using namespace std; + // UTF-8 string of 1-byte CP, 2-byte CP, 3-byte CP and 4-byte CP + const char in[] = "bш\uAAAA\U0010AAAA"; + const char32_t exp_literal[] = U"bш\uAAAA\U0010AAAA"; + CharT exp[array_size (exp_literal)] = {}; + std::copy (begin (exp_literal), end (exp_literal), begin (exp)); + + static_assert (array_size (in) == 11, ""); + static_assert (array_size (exp_literal) == 5, ""); + static_assert (array_size (exp) == 5, ""); + VERIFY (char_traits::length (in) == 10); + VERIFY (char_traits::length (exp_literal) == 4); + VERIFY (char_traits::length (exp) == 4); + + test_offsets_ok offsets[] = {{0, 0}, {1, 1}, {3, 2}, {6, 3}, {10, 4}}; + for (auto t : offsets) + { + CharT out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + auto state = mbstate_t{}; + auto in_next = (const char *) nullptr; + auto out_next = (CharT *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.ok); + VERIFY (in_next == in + t.in_size); + VERIFY (out_next == out + t.out_size); + VERIFY (char_traits::compare (out, exp, t.out_size) == 0); + if (t.out_size < array_size (out)) + VERIFY 
(out[t.out_size] == 0); + } + + for (auto t : offsets) + { + CharT out[array_size (exp)] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + auto state = mbstate_t{}; + auto in_next = (const char *) nullptr; + auto out_next = (CharT *) nullptr; + auto res = codecvt_base::result (); + + res + = cvt.in (state, in, in + t.in_size, in_next, out, end (out), out_next); + VERIFY (res == cvt.ok); + VERIFY (in_next == in + t.in_size); + VERIFY (out_next == out + t.out_size); + VERIFY (char_traits::compare (out, exp, t.out_size) == 0); + if (t.out_size < array_size (out)) + VERIFY (out[t.out_size] == 0); + } +} + +template +void +utf8_to_utf32_in_partial (const std::codecvt &cvt) +{ + using namespace std; + // UTF-8 string of 1-byte CP, 2-byte CP, 3-byte CP and 4-byte CP + const char in[] = "bш\uAAAA\U0010AAAA"; + const char32_t exp_literal[] = U"bш\uAAAA\U0010AAAA"; + CharT exp[array_size (exp_literal)] = {}; + std::copy (begin (exp_literal), end (exp_literal), begin (exp)); + + static_assert (array_size (in) == 11, ""); + static_assert (array_size (exp_literal) == 5, ""); + static_assert (array_size (exp) == 5, ""); + VERIFY (char_traits::length (in) == 10); + VERIFY (char_traits::length (exp_literal) == 4); + VERIFY (char_traits::length (exp) == 4); + + test_offsets_partial offsets[] = { + {1, 0, 0, 0}, // no space for first CP + + {3, 1, 1, 1}, // no space for second CP + {2, 2, 1, 1}, // incomplete second CP + {2, 1, 1, 1}, // incomplete second CP, and no space for it + + {6, 2, 3, 2}, // no space for third CP + {4, 3, 3, 2}, // incomplete third CP + {5, 3, 3, 2}, // incomplete third CP + {4, 2, 3, 2}, // incomplete third CP, and no space for it + {5, 2, 3, 2}, // incomplete third CP, and no space for it + + {10, 3, 6, 3}, // no space for fourth CP + {7, 4, 6, 3}, // incomplete fourth CP + {8, 4, 6, 3}, // incomplete fourth CP + {9, 4, 6, 3}, // incomplete fourth CP + {7, 3, 6, 3}, // incomplete fourth CP, and no space for it + 
{8, 3, 6, 3}, // incomplete fourth CP, and no space for it + {9, 3, 6, 3}, // incomplete fourth CP, and no space for it + }; + + for (auto t : offsets) + { + CharT out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + VERIFY (t.expected_in_next <= t.in_size); + VERIFY (t.expected_out_next <= t.out_size); + auto state = mbstate_t{}; + auto in_next = (const char *) nullptr; + auto out_next = (CharT *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.partial); + VERIFY (in_next == in + t.expected_in_next); + VERIFY (out_next == out + t.expected_out_next); + VERIFY (char_traits::compare (out, exp, t.expected_out_next) == 0); + if (t.expected_out_next < array_size (out)) + VERIFY (out[t.expected_out_next] == 0); + } +} + +template +void +utf8_to_utf32_in_error (const std::codecvt &cvt) +{ + using namespace std; + // UTF-8 string of 1-byte CP, 2-byte CP, 3-byte CP and 4-byte CP + const char valid_in[] = "bш\uAAAA\U0010AAAA"; + const char32_t exp_literal[] = U"bш\uAAAA\U0010AAAA"; + CharT exp[array_size (exp_literal)] = {}; + std::copy (begin (exp_literal), end (exp_literal), begin (exp)); + + static_assert (array_size (valid_in) == 11, ""); + static_assert (array_size (exp_literal) == 5, ""); + static_assert (array_size (exp) == 5, ""); + VERIFY (char_traits::length (valid_in) == 10); + VERIFY (char_traits::length (exp_literal) == 4); + VERIFY (char_traits::length (exp) == 4); + + test_offsets_error offsets[] = { + + // replace leading byte with invalid byte + {1, 4, 0, 0, '\xFF', 0}, + {3, 4, 1, 1, '\xFF', 1}, + {6, 4, 3, 2, '\xFF', 3}, + {10, 4, 6, 3, '\xFF', 6}, + + // replace first trailing byte with ASCII byte + {3, 4, 1, 1, 'z', 2}, + {6, 4, 3, 2, 'z', 4}, + {10, 4, 6, 3, 'z', 7}, + + // replace first trailing byte with invalid byte + {3, 4, 1, 1, '\xFF', 2}, + {6, 4, 3, 2, '\xFF', 4}, + {10, 4, 
6, 3, '\xFF', 7}, + + // replace second trailing byte with ASCII byte + {6, 4, 3, 2, 'z', 5}, + {10, 4, 6, 3, 'z', 8}, + + // replace second trailing byte with invalid byte + {6, 4, 3, 2, '\xFF', 5}, + {10, 4, 6, 3, '\xFF', 8}, + + // replace third trailing byte + {10, 4, 6, 3, 'z', 9}, + {10, 4, 6, 3, '\xFF', 9}, + + // replace first trailing byte with ASCII byte, also incomplete at end + {5, 4, 3, 2, 'z', 4}, + {8, 4, 6, 3, 'z', 7}, + {9, 4, 6, 3, 'z', 7}, + + // replace first trailing byte with invalid byte, also incomplete at end + {5, 4, 3, 2, '\xFF', 4}, + {8, 4, 6, 3, '\xFF', 7}, + {9, 4, 6, 3, '\xFF', 7}, + + // replace second trailing byte with ASCII byte, also incomplete at end + {9, 4, 6, 3, 'z', 8}, + + // replace second trailing byte with invalid byte, also incomplete at end + {9, 4, 6, 3, '\xFF', 8}, + }; + for (auto t : offsets) + { + char in[array_size (valid_in)] = {}; + CharT out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + VERIFY (t.expected_in_next <= t.in_size); + VERIFY (t.expected_out_next <= t.out_size); + char_traits::copy (in, valid_in, array_size (valid_in)); + in[t.replace_pos] = t.replace_char; + + auto state = mbstate_t{}; + auto in_next = (const char *) nullptr; + auto out_next = (CharT *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.error); + VERIFY (in_next == in + t.expected_in_next); + VERIFY (out_next == out + t.expected_out_next); + VERIFY (char_traits::compare (out, exp, t.expected_out_next) == 0); + if (t.expected_out_next < array_size (out)) + VERIFY (out[t.expected_out_next] == 0); + } +} + +template +void +utf8_to_utf32_in (const std::codecvt &cvt) +{ + utf8_to_utf32_in_ok (cvt); + utf8_to_utf32_in_partial (cvt); + utf8_to_utf32_in_error (cvt); +} + +template +void +utf32_to_utf8_out_ok (const std::codecvt &cvt) +{ + using namespace std; + // 
UTF-8 string of 1-byte CP, 2-byte CP, 3-byte CP and 4-byte CP + const char32_t in_literal[] = U"bш\uAAAA\U0010AAAA"; + const char exp[] = "bш\uAAAA\U0010AAAA"; + CharT in[array_size (in_literal)] = {}; + copy (begin (in_literal), end (in_literal), begin (in)); + + static_assert (array_size (in_literal) == 5, ""); + static_assert (array_size (in) == 5, ""); + static_assert (array_size (exp) == 11, ""); + VERIFY (char_traits::length (in_literal) == 4); + VERIFY (char_traits::length (in) == 4); + VERIFY (char_traits::length (exp) == 10); + + const test_offsets_ok offsets[] = {{0, 0}, {1, 1}, {2, 3}, {3, 6}, {4, 10}}; + for (auto t : offsets) + { + char out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + auto state = mbstate_t{}; + auto in_next = (const CharT *) nullptr; + auto out_next = (char *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.ok); + VERIFY (in_next == in + t.in_size); + VERIFY (out_next == out + t.out_size); + VERIFY (char_traits::compare (out, exp, t.out_size) == 0); + if (t.out_size < array_size (out)) + VERIFY (out[t.out_size] == 0); + } +} + +template +void +utf32_to_utf8_out_partial (const std::codecvt &cvt) +{ + using namespace std; + // UTF-8 string of 1-byte CP, 2-byte CP, 3-byte CP and 4-byte CP + const char32_t in_literal[] = U"bш\uAAAA\U0010AAAA"; + const char exp[] = "bш\uAAAA\U0010AAAA"; + CharT in[array_size (in_literal)] = {}; + copy (begin (in_literal), end (in_literal), begin (in)); + + static_assert (array_size (in_literal) == 5, ""); + static_assert (array_size (in) == 5, ""); + static_assert (array_size (exp) == 11, ""); + VERIFY (char_traits::length (in_literal) == 4); + VERIFY (char_traits::length (in) == 4); + VERIFY (char_traits::length (exp) == 10); + + const test_offsets_partial offsets[] = { + {1, 0, 0, 0}, // no space for first CP + + {2, 1, 1, 1}, 
// no space for second CP + {2, 2, 1, 1}, // no space for second CP + + {3, 3, 2, 3}, // no space for third CP + {3, 4, 2, 3}, // no space for third CP + {3, 5, 2, 3}, // no space for third CP + + {4, 6, 3, 6}, // no space for fourth CP + {4, 7, 3, 6}, // no space for fourth CP + {4, 8, 3, 6}, // no space for fourth CP + {4, 9, 3, 6}, // no space for fourth CP + }; + for (auto t : offsets) + { + char out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + VERIFY (t.expected_in_next <= t.in_size); + VERIFY (t.expected_out_next <= t.out_size); + auto state = mbstate_t{}; + auto in_next = (const CharT *) nullptr; + auto out_next = (char *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.partial); + VERIFY (in_next == in + t.expected_in_next); + VERIFY (out_next == out + t.expected_out_next); + VERIFY (char_traits::compare (out, exp, t.expected_out_next) == 0); + if (t.expected_out_next < array_size (out)) + VERIFY (out[t.expected_out_next] == 0); + } +} + +template +void +utf32_to_utf8_out_error (const std::codecvt &cvt) +{ + using namespace std; + const char32_t valid_in[] = U"bш\uAAAA\U0010AAAA"; + const char exp[] = "bш\uAAAA\U0010AAAA"; + + static_assert (array_size (valid_in) == 5, ""); + static_assert (array_size (exp) == 11, ""); + VERIFY (char_traits::length (valid_in) == 4); + VERIFY (char_traits::length (exp) == 10); + + test_offsets_error offsets[] = {{4, 10, 0, 0, 0x00110000, 0}, + {4, 10, 1, 1, 0x00110000, 1}, + {4, 10, 2, 3, 0x00110000, 2}, + {4, 10, 3, 6, 0x00110000, 3}}; + + for (auto t : offsets) + { + CharT in[array_size (valid_in)] = {}; + char out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + VERIFY (t.expected_in_next <= t.in_size); + VERIFY (t.expected_out_next <= t.out_size); + copy (begin (valid_in), end 
(valid_in), begin (in)); + in[t.replace_pos] = t.replace_char; + + auto state = mbstate_t{}; + auto in_next = (const CharT *) nullptr; + auto out_next = (char *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.error); + VERIFY (in_next == in + t.expected_in_next); + VERIFY (out_next == out + t.expected_out_next); + VERIFY (char_traits::compare (out, exp, t.expected_out_next) == 0); + if (t.expected_out_next < array_size (out)) + VERIFY (out[t.expected_out_next] == 0); + } +} + +template +void +utf32_to_utf8_out (const std::codecvt &cvt) +{ + utf32_to_utf8_out_ok (cvt); + utf32_to_utf8_out_partial (cvt); + utf32_to_utf8_out_error (cvt); +} + +template +void +test_utf8_utf32_codecvts (const std::codecvt &cvt) +{ + utf8_to_utf32_in (cvt); + utf32_to_utf8_out (cvt); +} + +template +void +utf8_to_utf16_in_ok (const std::codecvt &cvt) +{ + using namespace std; + // UTF-8 string of 1-byte CP, 2-byte CP, 3-byte CP and 4-byte CP + const char in[] = "bш\uAAAA\U0010AAAA"; + const char16_t exp_literal[] = u"bш\uAAAA\U0010AAAA"; + CharT exp[array_size (exp_literal)] = {}; + copy (begin (exp_literal), end (exp_literal), begin (exp)); + + static_assert (array_size (in) == 11, ""); + static_assert (array_size (exp_literal) == 6, ""); + static_assert (array_size (exp) == 6, ""); + VERIFY (char_traits::length (in) == 10); + VERIFY (char_traits::length (exp_literal) == 5); + VERIFY (char_traits::length (exp) == 5); + + test_offsets_ok offsets[] = {{0, 0}, {1, 1}, {3, 2}, {6, 3}, {10, 5}}; + for (auto t : offsets) + { + CharT out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + auto state = mbstate_t{}; + auto in_next = (const char *) nullptr; + auto out_next = (CharT *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + 
VERIFY (res == cvt.ok); + VERIFY (in_next == in + t.in_size); + VERIFY (out_next == out + t.out_size); + VERIFY (char_traits::compare (out, exp, t.out_size) == 0); + if (t.out_size < array_size (out)) + VERIFY (out[t.out_size] == 0); + } + + for (auto t : offsets) + { + CharT out[array_size (exp)] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + auto state = mbstate_t{}; + auto in_next = (const char *) nullptr; + auto out_next = (CharT *) nullptr; + auto res = codecvt_base::result (); + + res + = cvt.in (state, in, in + t.in_size, in_next, out, end (out), out_next); + VERIFY (res == cvt.ok); + VERIFY (in_next == in + t.in_size); + VERIFY (out_next == out + t.out_size); + VERIFY (char_traits::compare (out, exp, t.out_size) == 0); + if (t.out_size < array_size (out)) + VERIFY (out[t.out_size] == 0); + } +} + +template +void +utf8_to_utf16_in_partial (const std::codecvt &cvt) +{ + using namespace std; + // UTF-8 string of 1-byte CP, 2-byte CP, 3-byte CP and 4-byte CP + const char in[] = "bш\uAAAA\U0010AAAA"; + const char16_t exp_literal[] = u"bш\uAAAA\U0010AAAA"; + CharT exp[array_size (exp_literal)] = {}; + copy (begin (exp_literal), end (exp_literal), begin (exp)); + + static_assert (array_size (in) == 11, ""); + static_assert (array_size (exp_literal) == 6, ""); + static_assert (array_size (exp) == 6, ""); + VERIFY (char_traits::length (in) == 10); + VERIFY (char_traits::length (exp_literal) == 5); + VERIFY (char_traits::length (exp) == 5); + + test_offsets_partial offsets[] = { + {1, 0, 0, 0}, // no space for first CP + + {3, 1, 1, 1}, // no space for second CP + {2, 2, 1, 1}, // incomplete second CP + {2, 1, 1, 1}, // incomplete second CP, and no space for it + + {6, 2, 3, 2}, // no space for third CP + {4, 3, 3, 2}, // incomplete third CP + {5, 3, 3, 2}, // incomplete third CP + {4, 2, 3, 2}, // incomplete third CP, and no space for it + {5, 2, 3, 2}, // incomplete third CP, and no space for it + + {10, 3, 6, 3}, // no 
space for fourth CP + {10, 4, 6, 3}, // no space for fourth CP + {7, 5, 6, 3}, // incomplete fourth CP + {8, 5, 6, 3}, // incomplete fourth CP + {9, 5, 6, 3}, // incomplete fourth CP + {7, 3, 6, 3}, // incomplete fourth CP, and no space for it + {8, 3, 6, 3}, // incomplete fourth CP, and no space for it + {9, 3, 6, 3}, // incomplete fourth CP, and no space for it + {7, 4, 6, 3}, // incomplete fourth CP, and no space for it + {8, 4, 6, 3}, // incomplete fourth CP, and no space for it + {9, 4, 6, 3}, // incomplete fourth CP, and no space for it + + }; + + for (auto t : offsets) + { + CharT out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + VERIFY (t.expected_in_next <= t.in_size); + VERIFY (t.expected_out_next <= t.out_size); + auto state = mbstate_t{}; + auto in_next = (const char *) nullptr; + auto out_next = (CharT *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.partial); + VERIFY (in_next == in + t.expected_in_next); + VERIFY (out_next == out + t.expected_out_next); + VERIFY (char_traits::compare (out, exp, t.expected_out_next) == 0); + if (t.expected_out_next < array_size (out)) + VERIFY (out[t.expected_out_next] == 0); + } +} + +template +void +utf8_to_utf16_in_error (const std::codecvt &cvt) +{ + using namespace std; + const char valid_in[] = "bш\uAAAA\U0010AAAA"; + const char16_t exp_literal[] = u"bш\uAAAA\U0010AAAA"; + CharT exp[array_size (exp_literal)] = {}; + copy (begin (exp_literal), end (exp_literal), begin (exp)); + + static_assert (array_size (valid_in) == 11, ""); + static_assert (array_size (exp_literal) == 6, ""); + static_assert (array_size (exp) == 6, ""); + VERIFY (char_traits::length (valid_in) == 10); + VERIFY (char_traits::length (exp_literal) == 5); + VERIFY (char_traits::length (exp) == 5); + + test_offsets_error offsets[] = { + + // replace leading byte with 
invalid byte + {1, 5, 0, 0, '\xFF', 0}, + {3, 5, 1, 1, '\xFF', 1}, + {6, 5, 3, 2, '\xFF', 3}, + {10, 5, 6, 3, '\xFF', 6}, + + // replace first trailing byte with ASCII byte + {3, 5, 1, 1, 'z', 2}, + {6, 5, 3, 2, 'z', 4}, + {10, 5, 6, 3, 'z', 7}, + + // replace first trailing byte with invalid byte + {3, 5, 1, 1, '\xFF', 2}, + {6, 5, 3, 2, '\xFF', 4}, + {10, 5, 6, 3, '\xFF', 7}, + + // replace second trailing byte with ASCII byte + {6, 5, 3, 2, 'z', 5}, + {10, 5, 6, 3, 'z', 8}, + + // replace second trailing byte with invalid byte + {6, 5, 3, 2, '\xFF', 5}, + {10, 5, 6, 3, '\xFF', 8}, + + // replace third trailing byte + {10, 5, 6, 3, 'z', 9}, + {10, 5, 6, 3, '\xFF', 9}, + + // replace first trailing byte with ASCII byte, also incomplete at end + {5, 5, 3, 2, 'z', 4}, + {8, 5, 6, 3, 'z', 7}, + {9, 5, 6, 3, 'z', 7}, + + // replace first trailing byte with invalid byte, also incomplete at end + {5, 5, 3, 2, '\xFF', 4}, + {8, 5, 6, 3, '\xFF', 7}, + {9, 5, 6, 3, '\xFF', 7}, + + // replace second trailing byte with ASCII byte, also incomplete at end + {9, 5, 6, 3, 'z', 8}, + + // replace second trailing byte with invalid byte, also incomplete at end + {9, 5, 6, 3, '\xFF', 8}, + }; + for (auto t : offsets) + { + char in[array_size (valid_in)] = {}; + CharT out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + VERIFY (t.expected_in_next <= t.in_size); + VERIFY (t.expected_out_next <= t.out_size); + char_traits::copy (in, valid_in, array_size (valid_in)); + in[t.replace_pos] = t.replace_char; + + auto state = mbstate_t{}; + auto in_next = (const char *) nullptr; + auto out_next = (CharT *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.error); + VERIFY (in_next == in + t.expected_in_next); + VERIFY (out_next == out + t.expected_out_next); + VERIFY (char_traits::compare (out, exp, t.expected_out_next) 
== 0); + if (t.expected_out_next < array_size (out)) + VERIFY (out[t.expected_out_next] == 0); + } +} + +template +void +utf8_to_utf16_in (const std::codecvt &cvt) +{ + utf8_to_utf16_in_ok (cvt); + utf8_to_utf16_in_partial (cvt); + utf8_to_utf16_in_error (cvt); +} + +template +void +utf16_to_utf8_out_ok (const std::codecvt &cvt) +{ + using namespace std; + // UTF-8 string of 1-byte CP, 2-byte CP, 3-byte CP and 4-byte CP + const char16_t in_literal[] = u"bш\uAAAA\U0010AAAA"; + const char exp[] = "bш\uAAAA\U0010AAAA"; + CharT in[array_size (in_literal)]; + copy (begin (in_literal), end (in_literal), begin (in)); + + static_assert (array_size (in_literal) == 6, ""); + static_assert (array_size (exp) == 11, ""); + static_assert (array_size (in) == 6, ""); + VERIFY (char_traits::length (in_literal) == 5); + VERIFY (char_traits::length (exp) == 10); + VERIFY (char_traits::length (in) == 5); + + const test_offsets_ok offsets[] = {{0, 0}, {1, 1}, {2, 3}, {3, 6}, {5, 10}}; + for (auto t : offsets) + { + char out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + auto state = mbstate_t{}; + auto in_next = (const CharT *) nullptr; + auto out_next = (char *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.ok); + VERIFY (in_next == in + t.in_size); + VERIFY (out_next == out + t.out_size); + VERIFY (char_traits::compare (out, exp, t.out_size) == 0); + if (t.out_size < array_size (out)) + VERIFY (out[t.out_size] == 0); + } +} + +template +void +utf16_to_utf8_out_partial (const std::codecvt &cvt) +{ + using namespace std; + // UTF-8 string of 1-byte CP, 2-byte CP, 3-byte CP and 4-byte CP + const char16_t in_literal[] = u"bш\uAAAA\U0010AAAA"; + const char exp[] = "bш\uAAAA\U0010AAAA"; + CharT in[array_size (in_literal)]; + copy (begin (in_literal), end (in_literal), begin (in)); + + static_assert (array_size 
(in_literal) == 6, ""); + static_assert (array_size (exp) == 11, ""); + static_assert (array_size (in) == 6, ""); + VERIFY (char_traits::length (in_literal) == 5); + VERIFY (char_traits::length (exp) == 10); + VERIFY (char_traits::length (in) == 5); + + const test_offsets_partial offsets[] = { + {1, 0, 0, 0}, // no space for first CP + + {2, 1, 1, 1}, // no space for second CP + {2, 2, 1, 1}, // no space for second CP + + {3, 3, 2, 3}, // no space for third CP + {3, 4, 2, 3}, // no space for third CP + {3, 5, 2, 3}, // no space for third CP + + {5, 6, 3, 6}, // no space for fourth CP + {5, 7, 3, 6}, // no space for fourth CP + {5, 8, 3, 6}, // no space for fourth CP + {5, 9, 3, 6}, // no space for fourth CP + + {4, 10, 3, 6}, // incomplete fourth CP + + {4, 6, 3, 6}, // incomplete fourth CP, and no space for it + {4, 7, 3, 6}, // incomplete fourth CP, and no space for it + {4, 8, 3, 6}, // incomplete fourth CP, and no space for it + {4, 9, 3, 6}, // incomplete fourth CP, and no space for it + }; + for (auto t : offsets) + { + char out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + VERIFY (t.expected_in_next <= t.in_size); + VERIFY (t.expected_out_next <= t.out_size); + auto state = mbstate_t{}; + auto in_next = (const CharT *) nullptr; + auto out_next = (char *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.partial); + VERIFY (in_next == in + t.expected_in_next); + VERIFY (out_next == out + t.expected_out_next); + VERIFY (char_traits::compare (out, exp, t.expected_out_next) == 0); + if (t.expected_out_next < array_size (out)) + VERIFY (out[t.expected_out_next] == 0); + } +} + +template +void +utf16_to_utf8_out_error (const std::codecvt &cvt) +{ + using namespace std; + const char16_t valid_in[] = u"bш\uAAAA\U0010AAAA"; + const char exp[] = "bш\uAAAA\U0010AAAA"; + + static_assert 
(array_size (valid_in) == 6, ""); + static_assert (array_size (exp) == 11, ""); + VERIFY (char_traits::length (valid_in) == 5); + VERIFY (char_traits::length (exp) == 10); + + test_offsets_error offsets[] = { + {5, 10, 0, 0, 0xD800, 0}, + {5, 10, 0, 0, 0xDBFF, 0}, + {5, 10, 0, 0, 0xDC00, 0}, + {5, 10, 0, 0, 0xDFFF, 0}, + + {5, 10, 1, 1, 0xD800, 1}, + {5, 10, 1, 1, 0xDBFF, 1}, + {5, 10, 1, 1, 0xDC00, 1}, + {5, 10, 1, 1, 0xDFFF, 1}, + + {5, 10, 2, 3, 0xD800, 2}, + {5, 10, 2, 3, 0xDBFF, 2}, + {5, 10, 2, 3, 0xDC00, 2}, + {5, 10, 2, 3, 0xDFFF, 2}, + + // make the leading surrogate a trailing one + {5, 10, 3, 6, 0xDC00, 3}, + {5, 10, 3, 6, 0xDFFF, 3}, + + // make the trailing surrogate a leading one + {5, 10, 3, 6, 0xD800, 4}, + {5, 10, 3, 6, 0xDBFF, 4}, + + // make the trailing surrogate a BMP char + {5, 10, 3, 6, u'z', 4}, + }; + + for (auto t : offsets) + { + CharT in[array_size (valid_in)] = {}; + char out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + VERIFY (t.expected_in_next <= t.in_size); + VERIFY (t.expected_out_next <= t.out_size); + copy (begin (valid_in), end (valid_in), begin (in)); + in[t.replace_pos] = t.replace_char; + + auto state = mbstate_t{}; + auto in_next = (const CharT *) nullptr; + auto out_next = (char *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.error); + VERIFY (in_next == in + t.expected_in_next); + VERIFY (out_next == out + t.expected_out_next); + VERIFY (char_traits::compare (out, exp, t.expected_out_next) == 0); + if (t.expected_out_next < array_size (out)) + VERIFY (out[t.expected_out_next] == 0); + } +} + +template +void +utf16_to_utf8_out (const std::codecvt &cvt) +{ + utf16_to_utf8_out_ok (cvt); + utf16_to_utf8_out_partial (cvt); + utf16_to_utf8_out_error (cvt); +} + +template +void +test_utf8_utf16_cvts (const std::codecvt &cvt) +{ + utf8_to_utf16_in 
(cvt); + utf16_to_utf8_out (cvt); +} + +template +void +utf8_to_ucs2_in_ok (const std::codecvt &cvt) +{ + using namespace std; + // UTF-8 string of 1-byte CP, 2-byte CP and 3-byte CP + const char in[] = "bш\uAAAA"; + const char16_t exp_literal[] = u"bш\uAAAA"; + CharT exp[array_size (exp_literal)] = {}; + copy (begin (exp_literal), end (exp_literal), begin (exp)); + + static_assert (array_size (in) == 7, ""); + static_assert (array_size (exp_literal) == 4, ""); + static_assert (array_size (exp) == 4, ""); + VERIFY (char_traits::length (in) == 6); + VERIFY (char_traits::length (exp_literal) == 3); + VERIFY (char_traits::length (exp) == 3); + + test_offsets_ok offsets[] = {{0, 0}, {1, 1}, {3, 2}, {6, 3}}; + for (auto t : offsets) + { + CharT out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + auto state = mbstate_t{}; + auto in_next = (const char *) nullptr; + auto out_next = (CharT *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.ok); + VERIFY (in_next == in + t.in_size); + VERIFY (out_next == out + t.out_size); + VERIFY (char_traits::compare (out, exp, t.out_size) == 0); + if (t.out_size < array_size (out)) + VERIFY (out[t.out_size] == 0); + } + + for (auto t : offsets) + { + CharT out[array_size (exp)] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + auto state = mbstate_t{}; + auto in_next = (const char *) nullptr; + auto out_next = (CharT *) nullptr; + auto res = codecvt_base::result (); + + res + = cvt.in (state, in, in + t.in_size, in_next, out, end (out), out_next); + VERIFY (res == cvt.ok); + VERIFY (in_next == in + t.in_size); + VERIFY (out_next == out + t.out_size); + VERIFY (char_traits::compare (out, exp, t.out_size) == 0); + if (t.out_size < array_size (out)) + VERIFY (out[t.out_size] == 0); + } +} + +template +void 
+utf8_to_ucs2_in_partial (const std::codecvt &cvt) +{ + using namespace std; + // UTF-8 string of 1-byte CP, 2-byte CP and 3-byte CP + const char in[] = "bш\uAAAA"; + const char16_t exp_literal[] = u"bш\uAAAA"; + CharT exp[array_size (exp_literal)] = {}; + copy (begin (exp_literal), end (exp_literal), begin (exp)); + + static_assert (array_size (in) == 7, ""); + static_assert (array_size (exp_literal) == 4, ""); + static_assert (array_size (exp) == 4, ""); + VERIFY (char_traits::length (in) == 6); + VERIFY (char_traits::length (exp_literal) == 3); + VERIFY (char_traits::length (exp) == 3); + + test_offsets_partial offsets[] = { + {1, 0, 0, 0}, // no space for first CP + + {3, 1, 1, 1}, // no space for second CP + {2, 2, 1, 1}, // incomplete second CP + {2, 1, 1, 1}, // incomplete second CP, and no space for it + + {6, 2, 3, 2}, // no space for third CP + {4, 3, 3, 2}, // incomplete third CP + {5, 3, 3, 2}, // incomplete third CP + {4, 2, 3, 2}, // incomplete third CP, and no space for it + {5, 2, 3, 2}, // incomplete third CP, and no space for it + }; + + for (auto t : offsets) + { + CharT out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + VERIFY (t.expected_in_next <= t.in_size); + VERIFY (t.expected_out_next <= t.out_size); + auto state = mbstate_t{}; + auto in_next = (const char *) nullptr; + auto out_next = (CharT *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.partial); + VERIFY (in_next == in + t.expected_in_next); + VERIFY (out_next == out + t.expected_out_next); + VERIFY (char_traits::compare (out, exp, t.expected_out_next) == 0); + if (t.expected_out_next < array_size (out)) + VERIFY (out[t.expected_out_next] == 0); + } +} + +template +void +utf8_to_ucs2_in_error (const std::codecvt &cvt) +{ + using namespace std; + const char valid_in[] = "bш\uAAAA\U0010AAAA"; + const char16_t 
exp_literal[] = u"bш\uAAAA\U0010AAAA"; + CharT exp[array_size (exp_literal)] = {}; + copy (begin (exp_literal), end (exp_literal), begin (exp)); + + static_assert (array_size (valid_in) == 11, ""); + static_assert (array_size (exp_literal) == 6, ""); + static_assert (array_size (exp) == 6, ""); + VERIFY (char_traits::length (valid_in) == 10); + VERIFY (char_traits::length (exp_literal) == 5); + VERIFY (char_traits::length (exp) == 5); + + test_offsets_error offsets[] = { + + // replace leading byte with invalid byte + {1, 5, 0, 0, '\xFF', 0}, + {3, 5, 1, 1, '\xFF', 1}, + {6, 5, 3, 2, '\xFF', 3}, + {10, 5, 6, 3, '\xFF', 6}, + + // replace first trailing byte with ASCII byte + {3, 5, 1, 1, 'z', 2}, + {6, 5, 3, 2, 'z', 4}, + {10, 5, 6, 3, 'z', 7}, + + // replace first trailing byte with invalid byte + {3, 5, 1, 1, '\xFF', 2}, + {6, 5, 3, 2, '\xFF', 4}, + {10, 5, 6, 3, '\xFF', 7}, + + // replace second trailing byte with ASCII byte + {6, 5, 3, 2, 'z', 5}, + {10, 5, 6, 3, 'z', 8}, + + // replace second trailing byte with invalid byte + {6, 5, 3, 2, '\xFF', 5}, + {10, 5, 6, 3, '\xFF', 8}, + + // replace third trailing byte + {10, 5, 6, 3, 'z', 9}, + {10, 5, 6, 3, '\xFF', 9}, + + // When we see a leading byte of 4-byte CP, we should return error, no + // matter if it is incomplete at the end or has errors in the trailing + // bytes. 
+ + // Don't replace anything, show full 4-byte CP + {10, 4, 6, 3, 'b', 0}, + {10, 5, 6, 3, 'b', 0}, + + // Don't replace anything, show incomplete 4-byte CP at the end + {7, 4, 6, 3, 'b', 0}, // incomplete fourth CP + {8, 4, 6, 3, 'b', 0}, // incomplete fourth CP + {9, 4, 6, 3, 'b', 0}, // incomplete fourth CP + {7, 5, 6, 3, 'b', 0}, // incomplete fourth CP + {8, 5, 6, 3, 'b', 0}, // incomplete fourth CP + {9, 5, 6, 3, 'b', 0}, // incomplete fourth CP + + // replace first trailing byte with ASCII byte, also incomplete at end + {5, 5, 3, 2, 'z', 4}, + + // replace first trailing byte with invalid byte, also incomplete at end + {5, 5, 3, 2, '\xFF', 4}, + + // replace first trailing byte with ASCII byte, also incomplete at end + {8, 5, 6, 3, 'z', 7}, + {9, 5, 6, 3, 'z', 7}, + + // replace first trailing byte with invalid byte, also incomplete at end + {8, 5, 6, 3, '\xFF', 7}, + {9, 5, 6, 3, '\xFF', 7}, + + // replace second trailing byte with ASCII byte, also incomplete at end + {9, 5, 6, 3, 'z', 8}, + + // replace second trailing byte with invalid byte, also incomplete at end + {9, 5, 6, 3, '\xFF', 8}, + }; + for (auto t : offsets) + { + char in[array_size (valid_in)] = {}; + CharT out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + VERIFY (t.expected_in_next <= t.in_size); + VERIFY (t.expected_out_next <= t.out_size); + char_traits::copy (in, valid_in, array_size (valid_in)); + in[t.replace_pos] = t.replace_char; + + auto state = mbstate_t{}; + auto in_next = (const char *) nullptr; + auto out_next = (CharT *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.error); + VERIFY (in_next == in + t.expected_in_next); + VERIFY (out_next == out + t.expected_out_next); + VERIFY (char_traits::compare (out, exp, t.expected_out_next) == 0); + if (t.expected_out_next < array_size (out)) + VERIFY 
(out[t.expected_out_next] == 0); + } +} + +template +void +utf8_to_ucs2_in (const std::codecvt &cvt) +{ + utf8_to_ucs2_in_ok (cvt); + utf8_to_ucs2_in_partial (cvt); + utf8_to_ucs2_in_error (cvt); +} + +template +void +ucs2_to_utf8_out_ok (const std::codecvt &cvt) +{ + using namespace std; + // UTF-8 string of 1-byte CP, 2-byte CP and 3-byte CP + const char16_t in_literal[] = u"bш\uAAAA"; + const char exp[] = "bш\uAAAA"; + CharT in[array_size (in_literal)] = {}; + copy (begin (in_literal), end (in_literal), begin (in)); + + static_assert (array_size (in_literal) == 4, ""); + static_assert (array_size (exp) == 7, ""); + static_assert (array_size (in) == 4, ""); + VERIFY (char_traits::length (in_literal) == 3); + VERIFY (char_traits::length (exp) == 6); + VERIFY (char_traits::length (in) == 3); + + const test_offsets_ok offsets[] = {{0, 0}, {1, 1}, {2, 3}, {3, 6}}; + for (auto t : offsets) + { + char out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + auto state = mbstate_t{}; + auto in_next = (const CharT *) nullptr; + auto out_next = (char *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.ok); + VERIFY (in_next == in + t.in_size); + VERIFY (out_next == out + t.out_size); + VERIFY (char_traits::compare (out, exp, t.out_size) == 0); + if (t.out_size < array_size (out)) + VERIFY (out[t.out_size] == 0); + } +} + +template +void +ucs2_to_utf8_out_partial (const std::codecvt &cvt) +{ + using namespace std; + // UTF-8 string of 1-byte CP, 2-byte CP and 3-byte CP + const char16_t in_literal[] = u"bш\uAAAA"; + const char exp[] = "bш\uAAAA"; + CharT in[array_size (in_literal)] = {}; + copy (begin (in_literal), end (in_literal), begin (in)); + + static_assert (array_size (in_literal) == 4, ""); + static_assert (array_size (exp) == 7, ""); + static_assert (array_size (in) == 4, ""); + VERIFY 
(char_traits::length (in_literal) == 3); + VERIFY (char_traits::length (exp) == 6); + VERIFY (char_traits::length (in) == 3); + + const test_offsets_partial offsets[] = { + {1, 0, 0, 0}, // no space for first CP + + {2, 1, 1, 1}, // no space for second CP + {2, 2, 1, 1}, // no space for second CP + + {3, 3, 2, 3}, // no space for third CP + {3, 4, 2, 3}, // no space for third CP + {3, 5, 2, 3}, // no space for third CP + }; + for (auto t : offsets) + { + char out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + VERIFY (t.expected_in_next <= t.in_size); + VERIFY (t.expected_out_next <= t.out_size); + auto state = mbstate_t{}; + auto in_next = (const CharT *) nullptr; + auto out_next = (char *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.partial); + VERIFY (in_next == in + t.expected_in_next); + VERIFY (out_next == out + t.expected_out_next); + VERIFY (char_traits::compare (out, exp, t.expected_out_next) == 0); + if (t.expected_out_next < array_size (out)) + VERIFY (out[t.expected_out_next] == 0); + } +} + +template +void +ucs2_to_utf8_out_error (const std::codecvt &cvt) +{ + using namespace std; + const char16_t valid_in[] = u"bш\uAAAA\U0010AAAA"; + const char exp[] = "bш\uAAAA\U0010AAAA"; + + static_assert (array_size (valid_in) == 6, ""); + static_assert (array_size (exp) == 11, ""); + VERIFY (char_traits::length (valid_in) == 5); + VERIFY (char_traits::length (exp) == 10); + + test_offsets_error offsets[] = { + {5, 10, 0, 0, 0xD800, 0}, + {5, 10, 0, 0, 0xDBFF, 0}, + {5, 10, 0, 0, 0xDC00, 0}, + {5, 10, 0, 0, 0xDFFF, 0}, + + {5, 10, 1, 1, 0xD800, 1}, + {5, 10, 1, 1, 0xDBFF, 1}, + {5, 10, 1, 1, 0xDC00, 1}, + {5, 10, 1, 1, 0xDFFF, 1}, + + {5, 10, 2, 3, 0xD800, 2}, + {5, 10, 2, 3, 0xDBFF, 2}, + {5, 10, 2, 3, 0xDC00, 2}, + {5, 10, 2, 3, 0xDFFF, 2}, + + // dont replace anything, just show the 
surrogate pair + {5, 10, 3, 6, u'b', 0}, + + // make the leading surrogate a trailing one + {5, 10, 3, 6, 0xDC00, 3}, + {5, 10, 3, 6, 0xDFFF, 3}, + + // make the trailing surrogate a leading one + {5, 10, 3, 6, 0xD800, 4}, + {5, 10, 3, 6, 0xDBFF, 4}, + + // make the trailing surrogate a BMP char + {5, 10, 3, 6, u'z', 4}, + + {5, 7, 3, 6, u'b', 0}, // no space for fourth CP + {5, 8, 3, 6, u'b', 0}, // no space for fourth CP + {5, 9, 3, 6, u'b', 0}, // no space for fourth CP + + {4, 10, 3, 6, u'b', 0}, // incomplete fourth CP + {4, 7, 3, 6, u'b', 0}, // incomplete fourth CP, and no space for it + {4, 8, 3, 6, u'b', 0}, // incomplete fourth CP, and no space for it + {4, 9, 3, 6, u'b', 0}, // incomplete fourth CP, and no space for it + + }; + + for (auto t : offsets) + { + CharT in[array_size (valid_in)] = {}; + char out[array_size (exp) - 1] = {}; + VERIFY (t.in_size <= array_size (in)); + VERIFY (t.out_size <= array_size (out)); + VERIFY (t.expected_in_next <= t.in_size); + VERIFY (t.expected_out_next <= t.out_size); + copy (begin (valid_in), end (valid_in), begin (in)); + in[t.replace_pos] = t.replace_char; + + auto state = mbstate_t{}; + auto in_next = (const CharT *) nullptr; + auto out_next = (char *) nullptr; + auto res = codecvt_base::result (); + + res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size, + out_next); + VERIFY (res == cvt.error); + VERIFY (in_next == in + t.expected_in_next); + VERIFY (out_next == out + t.expected_out_next); + VERIFY (char_traits::compare (out, exp, t.expected_out_next) == 0); + if (t.expected_out_next < array_size (out)) + VERIFY (out[t.expected_out_next] == 0); + } +} + +template +void +ucs2_to_utf8_out (const std::codecvt &cvt) +{ + ucs2_to_utf8_out_ok (cvt); + ucs2_to_utf8_out_partial (cvt); + ucs2_to_utf8_out_error (cvt); +} + +template +void +test_utf8_ucs2_cvts (const std::codecvt &cvt) +{ + utf8_to_ucs2_in (cvt); + ucs2_to_utf8_out (cvt); +} diff --git 
a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
new file mode 100644
index 000000000..169504939
--- /dev/null
+++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
@@ -0,0 +1,59 @@
+// Copyright (C) 2020-2023 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library. This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3. If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run { target c++11 } }
+
+#include "codecvt_unicode.h"
+
+#include <codecvt>
+
+using namespace std;
+
+void
+test_utf8_utf32_codecvts ()
+{
+#if __SIZEOF_WCHAR_T__ == 4
+  auto cvt_ptr = to_unique_ptr (new codecvt_utf8<wchar_t> ());
+  test_utf8_utf32_codecvts (*cvt_ptr);
+#endif
+}
+
+void
+test_utf8_utf16_codecvts ()
+{
+#if __SIZEOF_WCHAR_T__ >= 2
+  auto cvt_ptr = to_unique_ptr (new codecvt_utf8_utf16<wchar_t> ());
+  test_utf8_utf16_cvts (*cvt_ptr);
+#endif
+}
+
+void
+test_utf8_ucs2_codecvts ()
+{
+#if __SIZEOF_WCHAR_T__ == 2
+  auto cvt_ptr = to_unique_ptr (new codecvt_utf8<wchar_t> ());
+  test_utf8_ucs2_cvts (*cvt_ptr);
+#endif
+}
+
+int
+main ()
+{
+  test_utf8_utf32_codecvts ();
+  test_utf8_utf16_codecvts ();
+  test_utf8_ucs2_codecvts ();
+}
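
Note on the offset tables in codecvt_unicode.h: the {in_size, out_size} pairs used by the *_ok tests appear to be the cumulative code unit counts at each code point boundary of the test string. The following is a minimal sketch, not part of the patch, that derives those counts. The names utf8_length, utf16_length and boundary_offsets are hypothetical helpers introduced only for illustration, and the input is assumed to contain valid Unicode scalar values (no surrogates, nothing above U+10FFFF).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: number of UTF-8 bytes needed for code point cp.
// Assumes cp is a valid Unicode scalar value.
constexpr std::size_t
utf8_length (char32_t cp)
{
  return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
}

// Hypothetical helper: number of UTF-16 code units needed for cp.
// Code points outside the BMP take a surrogate pair.
constexpr std::size_t
utf16_length (char32_t cp)
{
  return cp < 0x10000 ? 1 : 2;
}

// Returns {utf8_bytes, utf16_units} at every code point boundary,
// starting with {0, 0} for the empty prefix.
inline std::vector<std::pair<std::size_t, std::size_t>>
boundary_offsets (const std::vector<char32_t> &cps)
{
  std::vector<std::pair<std::size_t, std::size_t>> res;
  res.emplace_back (0, 0);
  for (char32_t cp : cps)
    {
      // Copy the previous entry before growing the vector.
      const auto last = res.back ();
      res.emplace_back (last.first + utf8_length (cp),
			last.second + utf16_length (cp));
    }
  assert (res.size () == cps.size () + 1);
  return res;
}
```

For the test string U+0062, U+0448, U+AAAA, U+10AAAA, calling boundary_offsets ({0x62, 0x448, 0xAAAA, 0x10AAAA}) gives UTF-8 byte offsets 0, 1, 3, 6, 10 and UTF-16 unit offsets 0, 1, 2, 3, 5. That agrees with the {0, 0}, {1, 1}, {2, 3}, {3, 6}, {5, 10} table in utf16_to_utf8_out_ok, where in_size counts UTF-16 units and out_size counts bytes.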