From patchwork Tue Jan 10 12:58:59 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Dimitrij Mijoski <dmjpp@hotmail.com>
X-Patchwork-Id: 41451
Return-Path: <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org>
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a5d:4e01:0:0:0:0:0 with SMTP id p1csp2730133wrt;
        Tue, 10 Jan 2023 05:00:43 -0800 (PST)
X-Google-Smtp-Source: 
 AMrXdXt89mHIJEi03uVQw1w4AFGpGGlQphN8v2oxkvVk/T9YWlsVRlhxMxpMSnBnPRPjxwHK7bHL
X-Received: by 2002:a17:906:5f98:b0:84d:1b67:cecb with SMTP id
 a24-20020a1709065f9800b0084d1b67cecbmr13063940eju.43.1673355643572;
        Tue, 10 Jan 2023 05:00:43 -0800 (PST)
Received: from sourceware.org (server2.sourceware.org. [8.43.85.97])
        by mx.google.com with ESMTPS id
 oz12-20020a1709077d8c00b00857bd89d27dsi1580184ejc.76.2023.01.10.05.00.43
        for <ouuuleilei@gmail.com>
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Tue, 10 Jan 2023 05:00:43 -0800 (PST)
Received-SPF: pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as
 permitted sender) client-ip=8.43.85.97;
Authentication-Results: mx.google.com;
       dkim=pass header.i=@gcc.gnu.org header.s=default header.b=WYripy0x;
       arc=fail (signature failed);
       spf=pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as
 permitted sender)
 smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org";
       dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 119FE38493FD
	for <ouuuleilei@gmail.com>; Tue, 10 Jan 2023 13:00:09 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 119FE38493FD
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1673355609;
	bh=F1b1nXjKLEsAfvR5km4/qM5k/FZXlu2mKOM5ssjSbJU=;
	h=Subject:To:Date:List-Id:List-Unsubscribe:List-Archive:List-Post:
	 List-Help:List-Subscribe:From:Reply-To:From;
	b=WYripy0xI+oEFiPSsSpHUhI2GKDL2Rmq/PXlZOVbPlJbGxz6qO1z+weHjV9nJ5lwI
	 MoBqEp/4KLVBLUchuI4lyueHncxqjgvDKuFZGjLz2WGDzgpNYxY5f4qfS+s3auICxB
	 LCi+k7KJggzadddemNqS/E8eO4TfTMw4Iom9PUd8=
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from EUR04-VI1-obe.outbound.protection.outlook.com
 (mail-vi1eur04olkn2066.outbound.protection.outlook.com [40.92.75.66])
 by sourceware.org (Postfix) with ESMTPS id B3A573858C78;
 Tue, 10 Jan 2023 12:59:08 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B3A573858C78
ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;
 b=jKafXVgdOqB+/AWE/NBkmciW56e/N3GFbPvzBGnnixbH6nuKmfh2gxH+PS9MW85s1eCRVBWrDRubbrtW3iYrf8XbOBNG5vfw5Ga8XgdqJbj3IbtBFYyyOmK1K1Wn0u9ex29AH5xGUw6RhuyuTydnMJ9BxOWhpS3NocPbMCskb1lGPMfE5DJA+yJs/g2Dz7H4TIx1yU7zsc3GZ85cuGs2Rg4kHEbgdZGlWZHG8aeNyQ9buYR+2LYX0sCCD4tdEpckDexPK3iJMyynNFcwAq8qUNjf4IiTXT0AqMIiooEvqIniyG0hIg7fnFrWeZCcJCqbutZovOCw2t/T93AYQU/cCQ==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com;
 s=arcselector9901;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1;
 bh=F1b1nXjKLEsAfvR5km4/qM5k/FZXlu2mKOM5ssjSbJU=;
 b=RWg/41AnswaprsyQ/V8BCjM6GGHcGpT95EfdOsCpk5V2MQimnup3DLjga8Rckdpm9XZNk3NKYZ3ZhFpM9G9AXoYbIH7dJVtdc2JkwuAIGwMMvjhxeeEYcTYGD8imaWBDXjql0QkRu0nocWUSBQOtGu49f4kw7dSsID+DlkFuySv5pIw/jbx8h4A6tA4pWK5+OAqSOGWMI78arnaMsFVg5rAHOhk1miHwj1xvawcXShGI9midVuxn/X1VT6/OKkkAkGfjp2TC8IPpUO2Q4cZGQDyo7Ya2iROr3GUbSGlTLitubpo5SmipdAAFJPNptKzyARXapK59iD93S0Y8/ZbUnA==
ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=none; dmarc=none;
 dkim=none; arc=none
Received: from AM0PR04MB5412.eurprd04.prod.outlook.com (2603:10a6:208:10f::11)
 by DB8PR04MB7018.eurprd04.prod.outlook.com (2603:10a6:10:121::24)
 with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5986.18; Tue, 10 Jan
 2023 12:59:05 +0000
Received: from AM0PR04MB5412.eurprd04.prod.outlook.com
 ([fe80::8021:ef99:c515:b20f]) by AM0PR04MB5412.eurprd04.prod.outlook.com
 ([fe80::8021:ef99:c515:b20f%3]) with mapi id 15.20.5986.018; Tue, 10 Jan 2023
 12:59:03 +0000
Message-ID: 
 <AM0PR04MB541256BD6B9838E4BE055B85ACFF9@AM0PR04MB5412.eurprd04.prod.outlook.com>
Subject: [PATCH v2] libstdc++: Fix Unicode codecvt and add tests [PR86419]
To: gcc-patches@gcc.gnu.org, libstdc++@gcc.gnu.org
Date: Tue, 10 Jan 2023 13:58:59 +0100
User-Agent: Evolution 3.44.4-0ubuntu1 
X-TMN: [Ix00PYKkgxB28EmSOVxZW88nnBSdWG2F]
X-ClientProxiedBy: VI1PR0102CA0002.eurprd01.prod.exchangelabs.com
 (2603:10a6:802::15) To AM0PR04MB5412.eurprd04.prod.outlook.com
 (2603:10a6:208:10f::11)
X-Microsoft-Original-Message-ID: 
 <c18f61fae7d0cad8386db5a03a66b1c33d63806f.camel@hotmail.com>
MIME-Version: 1.0
X-MS-Exchange-MessageSentRepresentingType: 1
X-MS-PublicTrafficType: Email
X-MS-TrafficTypeDiagnostic: AM0PR04MB5412:EE_|DB8PR04MB7018:EE_
X-MS-Office365-Filtering-Correlation-Id: 97c88ebc-040b-42d0-3f24-08daf30a7263
X-Microsoft-Antispam: BCL:0;
X-OriginatorOrg: sct-15-20-4755-11-msonline-outlook-03a34.templateTenant
X-MS-Exchange-CrossTenant-Network-Message-Id: 
 97c88ebc-040b-42d0-3f24-08daf30a7263
X-MS-Exchange-CrossTenant-AuthSource: AM0PR04MB5412.eurprd04.prod.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Internal
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 10 Jan 2023 12:59:02.9737 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Id: 84df9e7f-e9f6-40af-b435-aaaaaaaaaaaa
X-MS-Exchange-CrossTenant-RMS-PersistedConsumerOrg: 
 00000000-0000-0000-0000-000000000000
X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB8PR04MB7018
X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0,
 HK_RANDOM_ENVFROM, HK_RANDOM_FROM, KAM_SHORT, RCVD_IN_DNSWL_NONE,
 RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-Patchwork-Original-From: Dimitrij Mijoski via Gcc-patches
 <gcc-patches@gcc.gnu.org>
From: Dimitrij Mijoski <dmjpp@hotmail.com>
Reply-To: Dimitrij Mijoski <dmjpp@hotmail.com>
Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
Sender: "Gcc-patches" <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org>
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1754640567609666176?=
X-GMAIL-MSGID: =?utf-8?q?1754640567609666176?=

Fixes the conversion from UTF-8 to UTF-16 to properly return partial
instead of ok.
Fixes the conversion from UTF-16 to UTF-8 to properly return partial
instead of ok.
Fixes the conversion from UTF-8 to UCS-2 to properly return partial
instead of error.
Fixes the conversion from UTF-8 to UCS-2 to treat 4-byte UTF-8 sequences
as an error as soon as the leading byte is seen.
Fixes UTF-8 decoding for all codecvts so they detect an error at the end
of the input range even when the last code point is also incomplete.

libstdc++-v3/ChangeLog:
	PR libstdc++/86419
	* src/c++11/codecvt.cc: Fix bugs.
	* testsuite/22_locale/codecvt/codecvt_unicode.cc: New tests.
	* testsuite/22_locale/codecvt/codecvt_unicode.h: New tests.
	* testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc: New
	  tests.
---
 libstdc++-v3/src/c++11/codecvt.cc             |   38 +-
 .../22_locale/codecvt/codecvt_unicode.cc      |   68 +
 .../22_locale/codecvt/codecvt_unicode.h       | 1268 +++++++++++++++++
 .../codecvt/codecvt_unicode_wchar_t.cc        |   59 +
 4 files changed, 1414 insertions(+), 19 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
 create mode 100644 libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
 create mode 100644 libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
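
Illustration only, not part of the patch: the decoding fix described in
the commit message amounts to validating each trailing byte as soon as
it is available, and reporting "incomplete" only when the bytes seen so
far are all valid. Below is a minimal standalone sketch of that check
order. The names decode_status, decode_result, is_continuation and
decode_one are hypothetical and are not the identifiers used in
codecvt.cc. The sketch also omits the maxcode limit and does not reject
UTF-8-encoded surrogates.

```cpp
#include <cstddef>

// Hypothetical types for this sketch only.
enum class decode_status { ok, incomplete, invalid };

struct decode_result
{
  decode_status status;
  char32_t code_point; // meaningful only when status == ok
  std::size_t length;  // bytes consumed when status == ok
};

inline bool
is_continuation (unsigned char b)
{
  return (b & 0xC0) == 0x80;
}

// Decode one code point from the first `avail` bytes at `p`.
// A trailing byte is checked as soon as it is available, so a bad byte
// is reported as invalid even if the sequence is also truncated.
inline decode_result
decode_one (const unsigned char *p, std::size_t avail)
{
  if (avail == 0)
    return {decode_status::incomplete, 0, 0};

  const unsigned char c1 = p[0];
  if (c1 < 0x80) // 1-byte sequence
    return {decode_status::ok, c1, 1};

  std::size_t len = 0;
  char32_t cp = 0;
  if (c1 >= 0xC2 && c1 < 0xE0) // 2-byte sequence
    { len = 2; cp = c1 & 0x1F; }
  else if (c1 >= 0xE0 && c1 < 0xF0) // 3-byte sequence
    { len = 3; cp = c1 & 0x0F; }
  else if (c1 >= 0xF0 && c1 < 0xF5) // 4-byte sequence
    { len = 4; cp = c1 & 0x07; }
  else // stray continuation byte, 0xC0/0xC1 (overlong), or > 0xF4
    return {decode_status::invalid, 0, 0};

  for (std::size_t i = 1; i < len; ++i)
    {
      // Only now, after all earlier bytes passed validation, may the
      // sequence be reported as incomplete.
      if (avail <= i)
	return {decode_status::incomplete, 0, 0};
      const unsigned char c = p[i];
      if (!is_continuation (c))
	return {decode_status::invalid, 0, 0};
      if (i == 1)
	{
	  if (c1 == 0xE0 && c < 0xA0) // overlong 3-byte form
	    return {decode_status::invalid, 0, 0};
	  if (c1 == 0xF0 && c < 0x90) // overlong 4-byte form
	    return {decode_status::invalid, 0, 0};
	  if (c1 == 0xF4 && c >= 0x90) // > U+10FFFF
	    return {decode_status::invalid, 0, 0};
	}
      cp = (cp << 6) | (c & 0x3F);
    }
  return {decode_status::ok, cp, len};
}
```

With this ordering, the two bytes 0xEA 'z' are classified as invalid
even though a 3-byte sequence would need one more byte, while 0xEA 0xAA
alone is classified as incomplete. A caller that maps "incomplete" to
codecvt_base::partial and "invalid" to codecvt_base::error would then
get the distinction that the new tests check.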

diff --git a/libstdc++-v3/src/c++11/codecvt.cc b/libstdc++-v3/src/c++11/codecvt.cc
index 9f8cb7677..49282a510 100644
--- a/libstdc++-v3/src/c++11/codecvt.cc
+++ b/libstdc++-v3/src/c++11/codecvt.cc
@@ -277,13 +277,15 @@ namespace
     }
     else if (c1 < 0xF0) // 3-byte sequence
     {
-      if (avail < 3)
+      if (avail < 2)
 	return incomplete_mb_character;
       char32_t c2 = (unsigned char) from[1];
       if ((c2 & 0xC0) != 0x80)
 	return invalid_mb_sequence;
       if (c1 == 0xE0 && c2 < 0xA0) // overlong
 	return invalid_mb_sequence;
+      if (avail < 3)
+	return incomplete_mb_character;
       char32_t c3 = (unsigned char) from[2];
       if ((c3 & 0xC0) != 0x80)
 	return invalid_mb_sequence;
@@ -292,9 +294,9 @@ namespace
 	from += 3;
       return c;
     }
-    else if (c1 < 0xF5) // 4-byte sequence
+    else if (c1 < 0xF5 && maxcode > 0xFFFF) // 4-byte sequence
     {
-      if (avail < 4)
+      if (avail < 2)
 	return incomplete_mb_character;
       char32_t c2 = (unsigned char) from[1];
       if ((c2 & 0xC0) != 0x80)
@@ -302,10 +304,14 @@ namespace
       if (c1 == 0xF0 && c2 < 0x90) // overlong
 	return invalid_mb_sequence;
       if (c1 == 0xF4 && c2 >= 0x90) // > U+10FFFF
-      return invalid_mb_sequence;
+	return invalid_mb_sequence;
+      if (avail < 3)
+	return incomplete_mb_character;
       char32_t c3 = (unsigned char) from[2];
       if ((c3 & 0xC0) != 0x80)
 	return invalid_mb_sequence;
+      if (avail < 4)
+	return incomplete_mb_character;
       char32_t c4 = (unsigned char) from[3];
       if ((c4 & 0xC0) != 0x80)
 	return invalid_mb_sequence;
@@ -527,12 +533,11 @@ namespace
   // Flag indicating whether to process UTF-16 or UCS2
   enum class surrogates { allowed, disallowed };
 
-  // utf8 -> utf16 (or utf8 -> ucs2 if s == surrogates::disallowed)
-  template<typename C8, typename C16>
-  codecvt_base::result
-  utf16_in(range<const C8>& from, range<C16>& to,
-	   unsigned long maxcode = max_code_point, codecvt_mode mode = {},
-	   surrogates s = surrogates::allowed)
+  // utf8 -> utf16 (or utf8 -> ucs2 if maxcode <= 0xFFFF)
+  template <typename C8, typename C16>
+  codecvt_base::result utf16_in (range<const C8> &from, range<C16> &to,
+				 unsigned long maxcode = max_code_point,
+				 codecvt_mode mode = {})
   {
     read_utf8_bom(from, mode);
     while (from.size() && to.size())
@@ -540,12 +545,7 @@ namespace
 	auto orig = from;
 	const char32_t codepoint = read_utf8_code_point(from, maxcode);
 	if (codepoint == incomplete_mb_character)
-	  {
-	    if (s == surrogates::allowed)
-	      return codecvt_base::partial;
-	    else
-	      return codecvt_base::error; // No surrogates in UCS2
-	  }
+	  return codecvt_base::partial;
 	if (codepoint > maxcode)
 	  return codecvt_base::error;
 	if (!write_utf16_code_point(to, codepoint, mode))
@@ -554,7 +554,7 @@ namespace
 	    return codecvt_base::partial;
 	  }
       }
-    return codecvt_base::ok;
+    return from.size () ? codecvt_base::partial : codecvt_base::ok;
   }
 
   // utf16 -> utf8 (or ucs2 -> utf8 if s == surrogates::disallowed)
@@ -576,7 +576,7 @@ namespace
 	      return codecvt_base::error; // No surrogates in UCS-2
 
 	    if (from.size() < 2)
-	      return codecvt_base::ok; // stop converting at this point
+	      return codecvt_base::partial; // stop converting at this point
 
 	    const char32_t c2 = from[1];
 	    if (is_low_surrogate(c2))
@@ -629,7 +629,7 @@ namespace
   {
     // UCS-2 only supports characters in the BMP, i.e. one UTF-16 code unit:
     maxcode = std::min(max_single_utf16_unit, maxcode);
-    return utf16_in(from, to, maxcode, mode, surrogates::disallowed);
+    return utf16_in (from, to, maxcode, mode);
   }
 
   // ucs2 -> utf8
diff --git a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
new file mode 100644
index 000000000..ae4b6c896
--- /dev/null
+++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
@@ -0,0 +1,68 @@
+// Copyright (C) 2020-2023 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run { target c++11 } }
+
+#include "codecvt_unicode.h"
+
+#include <codecvt>
+
+using namespace std;
+
+void
+test_utf8_utf32_codecvts ()
+{
+  using codecvt_c32 = codecvt<char32_t, char, mbstate_t>;
+  auto loc_c = locale::classic ();
+  VERIFY (has_facet<codecvt_c32> (loc_c));
+  auto &cvt = use_facet<codecvt_c32> (loc_c);
+  test_utf8_utf32_codecvts (cvt);
+
+  auto cvt_ptr = to_unique_ptr (new codecvt_utf8<char32_t> ());
+  test_utf8_utf32_codecvts (*cvt_ptr);
+}
+
+void
+test_utf8_utf16_codecvts ()
+{
+  using codecvt_c16 = codecvt<char16_t, char, mbstate_t>;
+  auto loc_c = locale::classic ();
+  VERIFY (has_facet<codecvt_c16> (loc_c));
+  auto &cvt = use_facet<codecvt_c16> (loc_c);
+  test_utf8_utf16_cvts (cvt);
+
+  auto cvt_ptr = to_unique_ptr (new codecvt_utf8_utf16<char16_t> ());
+  test_utf8_utf16_cvts (*cvt_ptr);
+
+  auto cvt_ptr2 = to_unique_ptr (new codecvt_utf8_utf16<char32_t> ());
+  test_utf8_utf16_cvts (*cvt_ptr2);
+}
+
+void
+test_utf8_ucs2_codecvts ()
+{
+  auto cvt_ptr = to_unique_ptr (new codecvt_utf8<char16_t> ());
+  test_utf8_ucs2_cvts (*cvt_ptr);
+}
+
+int
+main ()
+{
+  test_utf8_utf32_codecvts ();
+  test_utf8_utf16_codecvts ();
+  test_utf8_ucs2_codecvts ();
+}
diff --git a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
new file mode 100644
index 000000000..70d079286
--- /dev/null
+++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
@@ -0,0 +1,1268 @@
+// Copyright (C) 2020-2023 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include <locale>
+#include <string>
+#include <testsuite_hooks.h>
+
+template <typename T>
+std::unique_ptr<T>
+to_unique_ptr (T *ptr)
+{
+  return std::unique_ptr<T> (ptr);
+}
+
+struct test_offsets_ok
+{
+  size_t in_size, out_size;
+};
+struct test_offsets_partial
+{
+  size_t in_size, out_size, expected_in_next, expected_out_next;
+};
+
+template <class CharT> struct test_offsets_error
+{
+  size_t in_size, out_size, expected_in_next, expected_out_next;
+  CharT replace_char;
+  size_t replace_pos;
+};
+
+template <class T, size_t N>
+auto constexpr array_size (const T (&)[N]) -> size_t
+{
+  return N;
+}
+
+template <class CharT>
+void
+utf8_to_utf32_in_ok (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  // UTF-8 string of 1-byte CP, 2-byte CP, 3-byte CP and 4-byte CP
+  const char in[] = "bш\uAAAA\U0010AAAA";
+  const char32_t exp_literal[] = U"bш\uAAAA\U0010AAAA";
+  CharT exp[array_size (exp_literal)] = {};
+  std::copy (begin (exp_literal), end (exp_literal), begin (exp));
+
+  static_assert (array_size (in) == 11, "");
+  static_assert (array_size (exp_literal) == 5, "");
+  static_assert (array_size (exp) == 5, "");
+  VERIFY (char_traits<char>::length (in) == 10);
+  VERIFY (char_traits<char32_t>::length (exp_literal) == 4);
+  VERIFY (char_traits<CharT>::length (exp) == 4);
+
+  test_offsets_ok offsets[] = {{0, 0}, {1, 1}, {3, 2}, {6, 3}, {10, 4}};
+  for (auto t : offsets)
+    {
+      CharT out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      auto state = mbstate_t{};
+      auto in_next = (const char *) nullptr;
+      auto out_next = (CharT *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		    out_next);
+      VERIFY (res == cvt.ok);
+      VERIFY (in_next == in + t.in_size);
+      VERIFY (out_next == out + t.out_size);
+      VERIFY (char_traits<CharT>::compare (out, exp, t.out_size) == 0);
+      if (t.out_size < array_size (out))
+	VERIFY (out[t.out_size] == 0);
+    }
+
+  for (auto t : offsets)
+    {
+      CharT out[array_size (exp)] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      auto state = mbstate_t{};
+      auto in_next = (const char *) nullptr;
+      auto out_next = (CharT *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res
+	= cvt.in (state, in, in + t.in_size, in_next, out, end (out), out_next);
+      VERIFY (res == cvt.ok);
+      VERIFY (in_next == in + t.in_size);
+      VERIFY (out_next == out + t.out_size);
+      VERIFY (char_traits<CharT>::compare (out, exp, t.out_size) == 0);
+      if (t.out_size < array_size (out))
+	VERIFY (out[t.out_size] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf8_to_utf32_in_partial (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  // UTF-8 string of 1-byte CP, 2-byte CP, 3-byte CP and 4-byte CP
+  const char in[] = "bш\uAAAA\U0010AAAA";
+  const char32_t exp_literal[] = U"bш\uAAAA\U0010AAAA";
+  CharT exp[array_size (exp_literal)] = {};
+  std::copy (begin (exp_literal), end (exp_literal), begin (exp));
+
+  static_assert (array_size (in) == 11, "");
+  static_assert (array_size (exp_literal) == 5, "");
+  static_assert (array_size (exp) == 5, "");
+  VERIFY (char_traits<char>::length (in) == 10);
+  VERIFY (char_traits<char32_t>::length (exp_literal) == 4);
+  VERIFY (char_traits<CharT>::length (exp) == 4);
+
+  test_offsets_partial offsets[] = {
+    {1, 0, 0, 0}, // no space for first CP
+
+    {3, 1, 1, 1}, // no space for second CP
+    {2, 2, 1, 1}, // incomplete second CP
+    {2, 1, 1, 1}, // incomplete second CP, and no space for it
+
+    {6, 2, 3, 2}, // no space for third CP
+    {4, 3, 3, 2}, // incomplete third CP
+    {5, 3, 3, 2}, // incomplete third CP
+    {4, 2, 3, 2}, // incomplete third CP, and no space for it
+    {5, 2, 3, 2}, // incomplete third CP, and no space for it
+
+    {10, 3, 6, 3}, // no space for fourth CP
+    {7, 4, 6, 3},  // incomplete fourth CP
+    {8, 4, 6, 3},  // incomplete fourth CP
+    {9, 4, 6, 3},  // incomplete fourth CP
+    {7, 3, 6, 3},  // incomplete fourth CP, and no space for it
+    {8, 3, 6, 3},  // incomplete fourth CP, and no space for it
+    {9, 3, 6, 3},  // incomplete fourth CP, and no space for it
+  };
+
+  for (auto t : offsets)
+    {
+      CharT out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      VERIFY (t.expected_in_next <= t.in_size);
+      VERIFY (t.expected_out_next <= t.out_size);
+      auto state = mbstate_t{};
+      auto in_next = (const char *) nullptr;
+      auto out_next = (CharT *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		    out_next);
+      VERIFY (res == cvt.partial);
+      VERIFY (in_next == in + t.expected_in_next);
+      VERIFY (out_next == out + t.expected_out_next);
+      VERIFY (char_traits<CharT>::compare (out, exp, t.expected_out_next) == 0);
+      if (t.expected_out_next < array_size (out))
+	VERIFY (out[t.expected_out_next] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf8_to_utf32_in_error (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  // UTF-8 string of 1-byte CP, 2-byte CP, 3-byte CP and 4-byte CP
+  const char valid_in[] = "bш\uAAAA\U0010AAAA";
+  const char32_t exp_literal[] = U"bш\uAAAA\U0010AAAA";
+  CharT exp[array_size (exp_literal)] = {};
+  std::copy (begin (exp_literal), end (exp_literal), begin (exp));
+
+  static_assert (array_size (valid_in) == 11, "");
+  static_assert (array_size (exp_literal) == 5, "");
+  static_assert (array_size (exp) == 5, "");
+  VERIFY (char_traits<char>::length (valid_in) == 10);
+  VERIFY (char_traits<char32_t>::length (exp_literal) == 4);
+  VERIFY (char_traits<CharT>::length (exp) == 4);
+
+  test_offsets_error<char> offsets[] = {
+
+    // replace leading byte with invalid byte
+    {1, 4, 0, 0, '\xFF', 0},
+    {3, 4, 1, 1, '\xFF', 1},
+    {6, 4, 3, 2, '\xFF', 3},
+    {10, 4, 6, 3, '\xFF', 6},
+
+    // replace first trailing byte with ASCII byte
+    {3, 4, 1, 1, 'z', 2},
+    {6, 4, 3, 2, 'z', 4},
+    {10, 4, 6, 3, 'z', 7},
+
+    // replace first trailing byte with invalid byte
+    {3, 4, 1, 1, '\xFF', 2},
+    {6, 4, 3, 2, '\xFF', 4},
+    {10, 4, 6, 3, '\xFF', 7},
+
+    // replace second trailing byte with ASCII byte
+    {6, 4, 3, 2, 'z', 5},
+    {10, 4, 6, 3, 'z', 8},
+
+    // replace second trailing byte with invalid byte
+    {6, 4, 3, 2, '\xFF', 5},
+    {10, 4, 6, 3, '\xFF', 8},
+
+    // replace third trailing byte
+    {10, 4, 6, 3, 'z', 9},
+    {10, 4, 6, 3, '\xFF', 9},
+
+    // replace first trailing byte with ASCII byte, also incomplete at end
+    {5, 4, 3, 2, 'z', 4},
+    {8, 4, 6, 3, 'z', 7},
+    {9, 4, 6, 3, 'z', 7},
+
+    // replace first trailing byte with invalid byte, also incomplete at end
+    {5, 4, 3, 2, '\xFF', 4},
+    {8, 4, 6, 3, '\xFF', 7},
+    {9, 4, 6, 3, '\xFF', 7},
+
+    // replace second trailing byte with ASCII byte, also incomplete at end
+    {9, 4, 6, 3, 'z', 8},
+
+    // replace second trailing byte with invalid byte, also incomplete at end
+    {9, 4, 6, 3, '\xFF', 8},
+  };
+  for (auto t : offsets)
+    {
+      char in[array_size (valid_in)] = {};
+      CharT out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      VERIFY (t.expected_in_next <= t.in_size);
+      VERIFY (t.expected_out_next <= t.out_size);
+      char_traits<char>::copy (in, valid_in, array_size (valid_in));
+      in[t.replace_pos] = t.replace_char;
+
+      auto state = mbstate_t{};
+      auto in_next = (const char *) nullptr;
+      auto out_next = (CharT *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		    out_next);
+      VERIFY (res == cvt.error);
+      VERIFY (in_next == in + t.expected_in_next);
+      VERIFY (out_next == out + t.expected_out_next);
+      VERIFY (char_traits<CharT>::compare (out, exp, t.expected_out_next) == 0);
+      if (t.expected_out_next < array_size (out))
+	VERIFY (out[t.expected_out_next] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf8_to_utf32_in (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  utf8_to_utf32_in_ok (cvt);
+  utf8_to_utf32_in_partial (cvt);
+  utf8_to_utf32_in_error (cvt);
+}
+
+template <class CharT>
+void
+utf32_to_utf8_out_ok (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  // UTF-32 string whose CPs encode to 1, 2, 3 and 4 bytes in UTF-8
+  const char32_t in_literal[] = U"bш\uAAAA\U0010AAAA";
+  const char exp[] = "bш\uAAAA\U0010AAAA";
+  CharT in[array_size (in_literal)] = {};
+  copy (begin (in_literal), end (in_literal), begin (in));
+
+  static_assert (array_size (in_literal) == 5, "");
+  static_assert (array_size (in) == 5, "");
+  static_assert (array_size (exp) == 11, "");
+  VERIFY (char_traits<char32_t>::length (in_literal) == 4);
+  VERIFY (char_traits<CharT>::length (in) == 4);
+  VERIFY (char_traits<char>::length (exp) == 10);
+
+  const test_offsets_ok offsets[] = {{0, 0}, {1, 1}, {2, 3}, {3, 6}, {4, 10}};
+  for (auto t : offsets)
+    {
+      char out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      auto state = mbstate_t{};
+      auto in_next = (const CharT *) nullptr;
+      auto out_next = (char *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		     out_next);
+      VERIFY (res == cvt.ok);
+      VERIFY (in_next == in + t.in_size);
+      VERIFY (out_next == out + t.out_size);
+      VERIFY (char_traits<char>::compare (out, exp, t.out_size) == 0);
+      if (t.out_size < array_size (out))
+	VERIFY (out[t.out_size] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf32_to_utf8_out_partial (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  // UTF-32 string whose CPs encode to 1, 2, 3 and 4 bytes in UTF-8
+  const char32_t in_literal[] = U"bш\uAAAA\U0010AAAA";
+  const char exp[] = "bш\uAAAA\U0010AAAA";
+  CharT in[array_size (in_literal)] = {};
+  copy (begin (in_literal), end (in_literal), begin (in));
+
+  static_assert (array_size (in_literal) == 5, "");
+  static_assert (array_size (in) == 5, "");
+  static_assert (array_size (exp) == 11, "");
+  VERIFY (char_traits<char32_t>::length (in_literal) == 4);
+  VERIFY (char_traits<CharT>::length (in) == 4);
+  VERIFY (char_traits<char>::length (exp) == 10);
+
+  const test_offsets_partial offsets[] = {
+    {1, 0, 0, 0}, // no space for first CP
+
+    {2, 1, 1, 1}, // no space for second CP
+    {2, 2, 1, 1}, // no space for second CP
+
+    {3, 3, 2, 3}, // no space for third CP
+    {3, 4, 2, 3}, // no space for third CP
+    {3, 5, 2, 3}, // no space for third CP
+
+    {4, 6, 3, 6}, // no space for fourth CP
+    {4, 7, 3, 6}, // no space for fourth CP
+    {4, 8, 3, 6}, // no space for fourth CP
+    {4, 9, 3, 6}, // no space for fourth CP
+  };
+  for (auto t : offsets)
+    {
+      char out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      VERIFY (t.expected_in_next <= t.in_size);
+      VERIFY (t.expected_out_next <= t.out_size);
+      auto state = mbstate_t{};
+      auto in_next = (const CharT *) nullptr;
+      auto out_next = (char *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		     out_next);
+      VERIFY (res == cvt.partial);
+      VERIFY (in_next == in + t.expected_in_next);
+      VERIFY (out_next == out + t.expected_out_next);
+      VERIFY (char_traits<char>::compare (out, exp, t.expected_out_next) == 0);
+      if (t.expected_out_next < array_size (out))
+	VERIFY (out[t.expected_out_next] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf32_to_utf8_out_error (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  const char32_t valid_in[] = U"bш\uAAAA\U0010AAAA";
+  const char exp[] = "bш\uAAAA\U0010AAAA";
+
+  static_assert (array_size (valid_in) == 5, "");
+  static_assert (array_size (exp) == 11, "");
+  VERIFY (char_traits<char32_t>::length (valid_in) == 4);
+  VERIFY (char_traits<char>::length (exp) == 10);
+
+  test_offsets_error<CharT> offsets[] = {{4, 10, 0, 0, 0x00110000, 0},
+					 {4, 10, 1, 1, 0x00110000, 1},
+					 {4, 10, 2, 3, 0x00110000, 2},
+					 {4, 10, 3, 6, 0x00110000, 3}};
+
+  for (auto t : offsets)
+    {
+      CharT in[array_size (valid_in)] = {};
+      char out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      VERIFY (t.expected_in_next <= t.in_size);
+      VERIFY (t.expected_out_next <= t.out_size);
+      copy (begin (valid_in), end (valid_in), begin (in));
+      in[t.replace_pos] = t.replace_char;
+
+      auto state = mbstate_t{};
+      auto in_next = (const CharT *) nullptr;
+      auto out_next = (char *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		     out_next);
+      VERIFY (res == cvt.error);
+      VERIFY (in_next == in + t.expected_in_next);
+      VERIFY (out_next == out + t.expected_out_next);
+      VERIFY (char_traits<char>::compare (out, exp, t.expected_out_next) == 0);
+      if (t.expected_out_next < array_size (out))
+	VERIFY (out[t.expected_out_next] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf32_to_utf8_out (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  utf32_to_utf8_out_ok (cvt);
+  utf32_to_utf8_out_partial (cvt);
+  utf32_to_utf8_out_error (cvt);
+}
+
+template <class CharT>
+void
+test_utf8_utf32_codecvts (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  utf8_to_utf32_in (cvt);
+  utf32_to_utf8_out (cvt);
+}
+
+template <class CharT>
+void
+utf8_to_utf16_in_ok (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  // UTF-8 string of 1-byte CP, 2-byte CP, 3-byte CP and 4-byte CP
+  const char in[] = "bш\uAAAA\U0010AAAA";
+  const char16_t exp_literal[] = u"bш\uAAAA\U0010AAAA";
+  CharT exp[array_size (exp_literal)] = {};
+  copy (begin (exp_literal), end (exp_literal), begin (exp));
+
+  static_assert (array_size (in) == 11, "");
+  static_assert (array_size (exp_literal) == 6, "");
+  static_assert (array_size (exp) == 6, "");
+  VERIFY (char_traits<char>::length (in) == 10);
+  VERIFY (char_traits<char16_t>::length (exp_literal) == 5);
+  VERIFY (char_traits<CharT>::length (exp) == 5);
+
+  test_offsets_ok offsets[] = {{0, 0}, {1, 1}, {3, 2}, {6, 3}, {10, 5}};
+  for (auto t : offsets)
+    {
+      CharT out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      auto state = mbstate_t{};
+      auto in_next = (const char *) nullptr;
+      auto out_next = (CharT *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		    out_next);
+      VERIFY (res == cvt.ok);
+      VERIFY (in_next == in + t.in_size);
+      VERIFY (out_next == out + t.out_size);
+      VERIFY (char_traits<CharT>::compare (out, exp, t.out_size) == 0);
+      if (t.out_size < array_size (out))
+	VERIFY (out[t.out_size] == 0);
+    }
+
+  for (auto t : offsets)
+    {
+      CharT out[array_size (exp)] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      auto state = mbstate_t{};
+      auto in_next = (const char *) nullptr;
+      auto out_next = (CharT *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res
+	= cvt.in (state, in, in + t.in_size, in_next, out, end (out), out_next);
+      VERIFY (res == cvt.ok);
+      VERIFY (in_next == in + t.in_size);
+      VERIFY (out_next == out + t.out_size);
+      VERIFY (char_traits<CharT>::compare (out, exp, t.out_size) == 0);
+      if (t.out_size < array_size (out))
+	VERIFY (out[t.out_size] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf8_to_utf16_in_partial (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  // UTF-8 string of 1-byte CP, 2-byte CP, 3-byte CP and 4-byte CP
+  const char in[] = "bш\uAAAA\U0010AAAA";
+  const char16_t exp_literal[] = u"bш\uAAAA\U0010AAAA";
+  CharT exp[array_size (exp_literal)] = {};
+  copy (begin (exp_literal), end (exp_literal), begin (exp));
+
+  static_assert (array_size (in) == 11, "");
+  static_assert (array_size (exp_literal) == 6, "");
+  static_assert (array_size (exp) == 6, "");
+  VERIFY (char_traits<char>::length (in) == 10);
+  VERIFY (char_traits<char16_t>::length (exp_literal) == 5);
+  VERIFY (char_traits<CharT>::length (exp) == 5);
+
+  test_offsets_partial offsets[] = {
+    {1, 0, 0, 0}, // no space for first CP
+
+    {3, 1, 1, 1}, // no space for second CP
+    {2, 2, 1, 1}, // incomplete second CP
+    {2, 1, 1, 1}, // incomplete second CP, and no space for it
+
+    {6, 2, 3, 2}, // no space for third CP
+    {4, 3, 3, 2}, // incomplete third CP
+    {5, 3, 3, 2}, // incomplete third CP
+    {4, 2, 3, 2}, // incomplete third CP, and no space for it
+    {5, 2, 3, 2}, // incomplete third CP, and no space for it
+
+    {10, 3, 6, 3}, // no space for fourth CP
+    {10, 4, 6, 3}, // no space for fourth CP
+    {7, 5, 6, 3},  // incomplete fourth CP
+    {8, 5, 6, 3},  // incomplete fourth CP
+    {9, 5, 6, 3},  // incomplete fourth CP
+    {7, 3, 6, 3},  // incomplete fourth CP, and no space for it
+    {8, 3, 6, 3},  // incomplete fourth CP, and no space for it
+    {9, 3, 6, 3},  // incomplete fourth CP, and no space for it
+    {7, 4, 6, 3},  // incomplete fourth CP, and no space for it
+    {8, 4, 6, 3},  // incomplete fourth CP, and no space for it
+    {9, 4, 6, 3},  // incomplete fourth CP, and no space for it
+
+  };
+
+  for (auto t : offsets)
+    {
+      CharT out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      VERIFY (t.expected_in_next <= t.in_size);
+      VERIFY (t.expected_out_next <= t.out_size);
+      auto state = mbstate_t{};
+      auto in_next = (const char *) nullptr;
+      auto out_next = (CharT *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		    out_next);
+      VERIFY (res == cvt.partial);
+      VERIFY (in_next == in + t.expected_in_next);
+      VERIFY (out_next == out + t.expected_out_next);
+      VERIFY (char_traits<CharT>::compare (out, exp, t.expected_out_next) == 0);
+      if (t.expected_out_next < array_size (out))
+	VERIFY (out[t.expected_out_next] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf8_to_utf16_in_error (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  const char valid_in[] = "bш\uAAAA\U0010AAAA";
+  const char16_t exp_literal[] = u"bш\uAAAA\U0010AAAA";
+  CharT exp[array_size (exp_literal)] = {};
+  copy (begin (exp_literal), end (exp_literal), begin (exp));
+
+  static_assert (array_size (valid_in) == 11, "");
+  static_assert (array_size (exp_literal) == 6, "");
+  static_assert (array_size (exp) == 6, "");
+  VERIFY (char_traits<char>::length (valid_in) == 10);
+  VERIFY (char_traits<char16_t>::length (exp_literal) == 5);
+  VERIFY (char_traits<CharT>::length (exp) == 5);
+
+  test_offsets_error<char> offsets[] = {
+
+    // replace leading byte with invalid byte
+    {1, 5, 0, 0, '\xFF', 0},
+    {3, 5, 1, 1, '\xFF', 1},
+    {6, 5, 3, 2, '\xFF', 3},
+    {10, 5, 6, 3, '\xFF', 6},
+
+    // replace first trailing byte with ASCII byte
+    {3, 5, 1, 1, 'z', 2},
+    {6, 5, 3, 2, 'z', 4},
+    {10, 5, 6, 3, 'z', 7},
+
+    // replace first trailing byte with invalid byte
+    {3, 5, 1, 1, '\xFF', 2},
+    {6, 5, 3, 2, '\xFF', 4},
+    {10, 5, 6, 3, '\xFF', 7},
+
+    // replace second trailing byte with ASCII byte
+    {6, 5, 3, 2, 'z', 5},
+    {10, 5, 6, 3, 'z', 8},
+
+    // replace second trailing byte with invalid byte
+    {6, 5, 3, 2, '\xFF', 5},
+    {10, 5, 6, 3, '\xFF', 8},
+
+    // replace third trailing byte
+    {10, 5, 6, 3, 'z', 9},
+    {10, 5, 6, 3, '\xFF', 9},
+
+    // replace first trailing byte with ASCII byte, also incomplete at end
+    {5, 5, 3, 2, 'z', 4},
+    {8, 5, 6, 3, 'z', 7},
+    {9, 5, 6, 3, 'z', 7},
+
+    // replace first trailing byte with invalid byte, also incomplete at end
+    {5, 5, 3, 2, '\xFF', 4},
+    {8, 5, 6, 3, '\xFF', 7},
+    {9, 5, 6, 3, '\xFF', 7},
+
+    // replace second trailing byte with ASCII byte, also incomplete at end
+    {9, 5, 6, 3, 'z', 8},
+
+    // replace second trailing byte with invalid byte, also incomplete at end
+    {9, 5, 6, 3, '\xFF', 8},
+  };
+  for (auto t : offsets)
+    {
+      char in[array_size (valid_in)] = {};
+      CharT out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      VERIFY (t.expected_in_next <= t.in_size);
+      VERIFY (t.expected_out_next <= t.out_size);
+      char_traits<char>::copy (in, valid_in, array_size (valid_in));
+      in[t.replace_pos] = t.replace_char;
+
+      auto state = mbstate_t{};
+      auto in_next = (const char *) nullptr;
+      auto out_next = (CharT *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		    out_next);
+      VERIFY (res == cvt.error);
+      VERIFY (in_next == in + t.expected_in_next);
+      VERIFY (out_next == out + t.expected_out_next);
+      VERIFY (char_traits<CharT>::compare (out, exp, t.expected_out_next) == 0);
+      if (t.expected_out_next < array_size (out))
+	VERIFY (out[t.expected_out_next] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf8_to_utf16_in (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  utf8_to_utf16_in_ok (cvt);
+  utf8_to_utf16_in_partial (cvt);
+  utf8_to_utf16_in_error (cvt);
+}
+
+template <class CharT>
+void
+utf16_to_utf8_out_ok (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  // UTF-16 string of CPs that take 1, 2, 3 and 4 bytes in UTF-8
+  const char16_t in_literal[] = u"bш\uAAAA\U0010AAAA";
+  const char exp[] = "bш\uAAAA\U0010AAAA";
+  CharT in[array_size (in_literal)];
+  copy (begin (in_literal), end (in_literal), begin (in));
+
+  static_assert (array_size (in_literal) == 6, "");
+  static_assert (array_size (exp) == 11, "");
+  static_assert (array_size (in) == 6, "");
+  VERIFY (char_traits<char16_t>::length (in_literal) == 5);
+  VERIFY (char_traits<char>::length (exp) == 10);
+  VERIFY (char_traits<CharT>::length (in) == 5);
+
+  const test_offsets_ok offsets[] = {{0, 0}, {1, 1}, {2, 3}, {3, 6}, {5, 10}};
+  for (auto t : offsets)
+    {
+      char out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      auto state = mbstate_t{};
+      auto in_next = (const CharT *) nullptr;
+      auto out_next = (char *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		     out_next);
+      VERIFY (res == cvt.ok);
+      VERIFY (in_next == in + t.in_size);
+      VERIFY (out_next == out + t.out_size);
+      VERIFY (char_traits<char>::compare (out, exp, t.out_size) == 0);
+      if (t.out_size < array_size (out))
+	VERIFY (out[t.out_size] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf16_to_utf8_out_partial (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  // UTF-16 string of CPs that take 1, 2, 3 and 4 bytes in UTF-8
+  const char16_t in_literal[] = u"bш\uAAAA\U0010AAAA";
+  const char exp[] = "bш\uAAAA\U0010AAAA";
+  CharT in[array_size (in_literal)];
+  copy (begin (in_literal), end (in_literal), begin (in));
+
+  static_assert (array_size (in_literal) == 6, "");
+  static_assert (array_size (exp) == 11, "");
+  static_assert (array_size (in) == 6, "");
+  VERIFY (char_traits<char16_t>::length (in_literal) == 5);
+  VERIFY (char_traits<char>::length (exp) == 10);
+  VERIFY (char_traits<CharT>::length (in) == 5);
+
+  const test_offsets_partial offsets[] = {
+    {1, 0, 0, 0}, // no space for first CP
+
+    {2, 1, 1, 1}, // no space for second CP
+    {2, 2, 1, 1}, // no space for second CP
+
+    {3, 3, 2, 3}, // no space for third CP
+    {3, 4, 2, 3}, // no space for third CP
+    {3, 5, 2, 3}, // no space for third CP
+
+    {5, 6, 3, 6}, // no space for fourth CP
+    {5, 7, 3, 6}, // no space for fourth CP
+    {5, 8, 3, 6}, // no space for fourth CP
+    {5, 9, 3, 6}, // no space for fourth CP
+
+    {4, 10, 3, 6}, // incomplete fourth CP
+
+    {4, 6, 3, 6}, // incomplete fourth CP, and no space for it
+    {4, 7, 3, 6}, // incomplete fourth CP, and no space for it
+    {4, 8, 3, 6}, // incomplete fourth CP, and no space for it
+    {4, 9, 3, 6}, // incomplete fourth CP, and no space for it
+  };
+  for (auto t : offsets)
+    {
+      char out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      VERIFY (t.expected_in_next <= t.in_size);
+      VERIFY (t.expected_out_next <= t.out_size);
+      auto state = mbstate_t{};
+      auto in_next = (const CharT *) nullptr;
+      auto out_next = (char *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		     out_next);
+      VERIFY (res == cvt.partial);
+      VERIFY (in_next == in + t.expected_in_next);
+      VERIFY (out_next == out + t.expected_out_next);
+      VERIFY (char_traits<char>::compare (out, exp, t.expected_out_next) == 0);
+      if (t.expected_out_next < array_size (out))
+	VERIFY (out[t.expected_out_next] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf16_to_utf8_out_error (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  const char16_t valid_in[] = u"bш\uAAAA\U0010AAAA";
+  const char exp[] = "bш\uAAAA\U0010AAAA";
+
+  static_assert (array_size (valid_in) == 6, "");
+  static_assert (array_size (exp) == 11, "");
+  VERIFY (char_traits<char16_t>::length (valid_in) == 5);
+  VERIFY (char_traits<char>::length (exp) == 10);
+
+  test_offsets_error<CharT> offsets[] = {
+    {5, 10, 0, 0, 0xD800, 0},
+    {5, 10, 0, 0, 0xDBFF, 0},
+    {5, 10, 0, 0, 0xDC00, 0},
+    {5, 10, 0, 0, 0xDFFF, 0},
+
+    {5, 10, 1, 1, 0xD800, 1},
+    {5, 10, 1, 1, 0xDBFF, 1},
+    {5, 10, 1, 1, 0xDC00, 1},
+    {5, 10, 1, 1, 0xDFFF, 1},
+
+    {5, 10, 2, 3, 0xD800, 2},
+    {5, 10, 2, 3, 0xDBFF, 2},
+    {5, 10, 2, 3, 0xDC00, 2},
+    {5, 10, 2, 3, 0xDFFF, 2},
+
+    // make the leading surrogate a trailing one
+    {5, 10, 3, 6, 0xDC00, 3},
+    {5, 10, 3, 6, 0xDFFF, 3},
+
+    // make the trailing surrogate a leading one
+    {5, 10, 3, 6, 0xD800, 4},
+    {5, 10, 3, 6, 0xDBFF, 4},
+
+    // make the trailing surrogate a BMP char
+    {5, 10, 3, 6, u'z', 4},
+  };
+
+  for (auto t : offsets)
+    {
+      CharT in[array_size (valid_in)] = {};
+      char out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      VERIFY (t.expected_in_next <= t.in_size);
+      VERIFY (t.expected_out_next <= t.out_size);
+      copy (begin (valid_in), end (valid_in), begin (in));
+      in[t.replace_pos] = t.replace_char;
+
+      auto state = mbstate_t{};
+      auto in_next = (const CharT *) nullptr;
+      auto out_next = (char *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		     out_next);
+      VERIFY (res == cvt.error);
+      VERIFY (in_next == in + t.expected_in_next);
+      VERIFY (out_next == out + t.expected_out_next);
+      VERIFY (char_traits<char>::compare (out, exp, t.expected_out_next) == 0);
+      if (t.expected_out_next < array_size (out))
+	VERIFY (out[t.expected_out_next] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf16_to_utf8_out (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  utf16_to_utf8_out_ok (cvt);
+  utf16_to_utf8_out_partial (cvt);
+  utf16_to_utf8_out_error (cvt);
+}
+
+template <class CharT>
+void
+test_utf8_utf16_cvts (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  utf8_to_utf16_in (cvt);
+  utf16_to_utf8_out (cvt);
+}
+
+template <class CharT>
+void
+utf8_to_ucs2_in_ok (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  // UTF-8 string of 1-byte CP, 2-byte CP and 3-byte CP
+  const char in[] = "bш\uAAAA";
+  const char16_t exp_literal[] = u"bш\uAAAA";
+  CharT exp[array_size (exp_literal)] = {};
+  copy (begin (exp_literal), end (exp_literal), begin (exp));
+
+  static_assert (array_size (in) == 7, "");
+  static_assert (array_size (exp_literal) == 4, "");
+  static_assert (array_size (exp) == 4, "");
+  VERIFY (char_traits<char>::length (in) == 6);
+  VERIFY (char_traits<char16_t>::length (exp_literal) == 3);
+  VERIFY (char_traits<CharT>::length (exp) == 3);
+
+  test_offsets_ok offsets[] = {{0, 0}, {1, 1}, {3, 2}, {6, 3}};
+  for (auto t : offsets)
+    {
+      CharT out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      auto state = mbstate_t{};
+      auto in_next = (const char *) nullptr;
+      auto out_next = (CharT *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		    out_next);
+      VERIFY (res == cvt.ok);
+      VERIFY (in_next == in + t.in_size);
+      VERIFY (out_next == out + t.out_size);
+      VERIFY (char_traits<CharT>::compare (out, exp, t.out_size) == 0);
+      if (t.out_size < array_size (out))
+	VERIFY (out[t.out_size] == 0);
+    }
+
+  for (auto t : offsets)
+    {
+      CharT out[array_size (exp)] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      auto state = mbstate_t{};
+      auto in_next = (const char *) nullptr;
+      auto out_next = (CharT *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res
+	= cvt.in (state, in, in + t.in_size, in_next, out, end (out), out_next);
+      VERIFY (res == cvt.ok);
+      VERIFY (in_next == in + t.in_size);
+      VERIFY (out_next == out + t.out_size);
+      VERIFY (char_traits<CharT>::compare (out, exp, t.out_size) == 0);
+      if (t.out_size < array_size (out))
+	VERIFY (out[t.out_size] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf8_to_ucs2_in_partial (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  // UTF-8 string of 1-byte CP, 2-byte CP and 3-byte CP
+  const char in[] = "bш\uAAAA";
+  const char16_t exp_literal[] = u"bш\uAAAA";
+  CharT exp[array_size (exp_literal)] = {};
+  copy (begin (exp_literal), end (exp_literal), begin (exp));
+
+  static_assert (array_size (in) == 7, "");
+  static_assert (array_size (exp_literal) == 4, "");
+  static_assert (array_size (exp) == 4, "");
+  VERIFY (char_traits<char>::length (in) == 6);
+  VERIFY (char_traits<char16_t>::length (exp_literal) == 3);
+  VERIFY (char_traits<CharT>::length (exp) == 3);
+
+  test_offsets_partial offsets[] = {
+    {1, 0, 0, 0}, // no space for first CP
+
+    {3, 1, 1, 1}, // no space for second CP
+    {2, 2, 1, 1}, // incomplete second CP
+    {2, 1, 1, 1}, // incomplete second CP, and no space for it
+
+    {6, 2, 3, 2}, // no space for third CP
+    {4, 3, 3, 2}, // incomplete third CP
+    {5, 3, 3, 2}, // incomplete third CP
+    {4, 2, 3, 2}, // incomplete third CP, and no space for it
+    {5, 2, 3, 2}, // incomplete third CP, and no space for it
+  };
+
+  for (auto t : offsets)
+    {
+      CharT out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      VERIFY (t.expected_in_next <= t.in_size);
+      VERIFY (t.expected_out_next <= t.out_size);
+      auto state = mbstate_t{};
+      auto in_next = (const char *) nullptr;
+      auto out_next = (CharT *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		    out_next);
+      VERIFY (res == cvt.partial);
+      VERIFY (in_next == in + t.expected_in_next);
+      VERIFY (out_next == out + t.expected_out_next);
+      VERIFY (char_traits<CharT>::compare (out, exp, t.expected_out_next) == 0);
+      if (t.expected_out_next < array_size (out))
+	VERIFY (out[t.expected_out_next] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf8_to_ucs2_in_error (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  const char valid_in[] = "bш\uAAAA\U0010AAAA";
+  const char16_t exp_literal[] = u"bш\uAAAA\U0010AAAA";
+  CharT exp[array_size (exp_literal)] = {};
+  copy (begin (exp_literal), end (exp_literal), begin (exp));
+
+  static_assert (array_size (valid_in) == 11, "");
+  static_assert (array_size (exp_literal) == 6, "");
+  static_assert (array_size (exp) == 6, "");
+  VERIFY (char_traits<char>::length (valid_in) == 10);
+  VERIFY (char_traits<char16_t>::length (exp_literal) == 5);
+  VERIFY (char_traits<CharT>::length (exp) == 5);
+
+  test_offsets_error<char> offsets[] = {
+
+    // replace leading byte with invalid byte
+    {1, 5, 0, 0, '\xFF', 0},
+    {3, 5, 1, 1, '\xFF', 1},
+    {6, 5, 3, 2, '\xFF', 3},
+    {10, 5, 6, 3, '\xFF', 6},
+
+    // replace first trailing byte with ASCII byte
+    {3, 5, 1, 1, 'z', 2},
+    {6, 5, 3, 2, 'z', 4},
+    {10, 5, 6, 3, 'z', 7},
+
+    // replace first trailing byte with invalid byte
+    {3, 5, 1, 1, '\xFF', 2},
+    {6, 5, 3, 2, '\xFF', 4},
+    {10, 5, 6, 3, '\xFF', 7},
+
+    // replace second trailing byte with ASCII byte
+    {6, 5, 3, 2, 'z', 5},
+    {10, 5, 6, 3, 'z', 8},
+
+    // replace second trailing byte with invalid byte
+    {6, 5, 3, 2, '\xFF', 5},
+    {10, 5, 6, 3, '\xFF', 8},
+
+    // replace third trailing byte
+    {10, 5, 6, 3, 'z', 9},
+    {10, 5, 6, 3, '\xFF', 9},
+
+    // When we see a leading byte of 4-byte CP, we should return error, no
+    // matter if it is incomplete at the end or has errors in the trailing
+    // bytes.
+
+    // Don't replace anything, show full 4-byte CP
+    {10, 4, 6, 3, 'b', 0},
+    {10, 5, 6, 3, 'b', 0},
+
+    // Don't replace anything, show incomplete 4-byte CP at the end
+    {7, 4, 6, 3, 'b', 0}, // incomplete fourth CP
+    {8, 4, 6, 3, 'b', 0}, // incomplete fourth CP
+    {9, 4, 6, 3, 'b', 0}, // incomplete fourth CP
+    {7, 5, 6, 3, 'b', 0}, // incomplete fourth CP
+    {8, 5, 6, 3, 'b', 0}, // incomplete fourth CP
+    {9, 5, 6, 3, 'b', 0}, // incomplete fourth CP
+
+    // replace first trailing byte with ASCII byte, also incomplete at end
+    {5, 5, 3, 2, 'z', 4},
+
+    // replace first trailing byte with invalid byte, also incomplete at end
+    {5, 5, 3, 2, '\xFF', 4},
+
+    // replace first trailing byte with ASCII byte, also incomplete at end
+    {8, 5, 6, 3, 'z', 7},
+    {9, 5, 6, 3, 'z', 7},
+
+    // replace first trailing byte with invalid byte, also incomplete at end
+    {8, 5, 6, 3, '\xFF', 7},
+    {9, 5, 6, 3, '\xFF', 7},
+
+    // replace second trailing byte with ASCII byte, also incomplete at end
+    {9, 5, 6, 3, 'z', 8},
+
+    // replace second trailing byte with invalid byte, also incomplete at end
+    {9, 5, 6, 3, '\xFF', 8},
+  };
+  for (auto t : offsets)
+    {
+      char in[array_size (valid_in)] = {};
+      CharT out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      VERIFY (t.expected_in_next <= t.in_size);
+      VERIFY (t.expected_out_next <= t.out_size);
+      char_traits<char>::copy (in, valid_in, array_size (valid_in));
+      in[t.replace_pos] = t.replace_char;
+
+      auto state = mbstate_t{};
+      auto in_next = (const char *) nullptr;
+      auto out_next = (CharT *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.in (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		    out_next);
+      VERIFY (res == cvt.error);
+      VERIFY (in_next == in + t.expected_in_next);
+      VERIFY (out_next == out + t.expected_out_next);
+      VERIFY (char_traits<CharT>::compare (out, exp, t.expected_out_next) == 0);
+      if (t.expected_out_next < array_size (out))
+	VERIFY (out[t.expected_out_next] == 0);
+    }
+}
+
+template <class CharT>
+void
+utf8_to_ucs2_in (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  utf8_to_ucs2_in_ok (cvt);
+  utf8_to_ucs2_in_partial (cvt);
+  utf8_to_ucs2_in_error (cvt);
+}
+
+template <class CharT>
+void
+ucs2_to_utf8_out_ok (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  // UCS-2 string of CPs that take 1, 2 and 3 bytes in UTF-8
+  const char16_t in_literal[] = u"bш\uAAAA";
+  const char exp[] = "bш\uAAAA";
+  CharT in[array_size (in_literal)] = {};
+  copy (begin (in_literal), end (in_literal), begin (in));
+
+  static_assert (array_size (in_literal) == 4, "");
+  static_assert (array_size (exp) == 7, "");
+  static_assert (array_size (in) == 4, "");
+  VERIFY (char_traits<char16_t>::length (in_literal) == 3);
+  VERIFY (char_traits<char>::length (exp) == 6);
+  VERIFY (char_traits<CharT>::length (in) == 3);
+
+  const test_offsets_ok offsets[] = {{0, 0}, {1, 1}, {2, 3}, {3, 6}};
+  for (auto t : offsets)
+    {
+      char out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      auto state = mbstate_t{};
+      auto in_next = (const CharT *) nullptr;
+      auto out_next = (char *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		     out_next);
+      VERIFY (res == cvt.ok);
+      VERIFY (in_next == in + t.in_size);
+      VERIFY (out_next == out + t.out_size);
+      VERIFY (char_traits<char>::compare (out, exp, t.out_size) == 0);
+      if (t.out_size < array_size (out))
+	VERIFY (out[t.out_size] == 0);
+    }
+}
+
+template <class CharT>
+void
+ucs2_to_utf8_out_partial (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  // UCS-2 string of CPs that take 1, 2 and 3 bytes in UTF-8
+  const char16_t in_literal[] = u"bш\uAAAA";
+  const char exp[] = "bш\uAAAA";
+  CharT in[array_size (in_literal)] = {};
+  copy (begin (in_literal), end (in_literal), begin (in));
+
+  static_assert (array_size (in_literal) == 4, "");
+  static_assert (array_size (exp) == 7, "");
+  static_assert (array_size (in) == 4, "");
+  VERIFY (char_traits<char16_t>::length (in_literal) == 3);
+  VERIFY (char_traits<char>::length (exp) == 6);
+  VERIFY (char_traits<CharT>::length (in) == 3);
+
+  const test_offsets_partial offsets[] = {
+    {1, 0, 0, 0}, // no space for first CP
+
+    {2, 1, 1, 1}, // no space for second CP
+    {2, 2, 1, 1}, // no space for second CP
+
+    {3, 3, 2, 3}, // no space for third CP
+    {3, 4, 2, 3}, // no space for third CP
+    {3, 5, 2, 3}, // no space for third CP
+  };
+  for (auto t : offsets)
+    {
+      char out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      VERIFY (t.expected_in_next <= t.in_size);
+      VERIFY (t.expected_out_next <= t.out_size);
+      auto state = mbstate_t{};
+      auto in_next = (const CharT *) nullptr;
+      auto out_next = (char *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		     out_next);
+      VERIFY (res == cvt.partial);
+      VERIFY (in_next == in + t.expected_in_next);
+      VERIFY (out_next == out + t.expected_out_next);
+      VERIFY (char_traits<char>::compare (out, exp, t.expected_out_next) == 0);
+      if (t.expected_out_next < array_size (out))
+	VERIFY (out[t.expected_out_next] == 0);
+    }
+}
+
+template <class CharT>
+void
+ucs2_to_utf8_out_error (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  using namespace std;
+  const char16_t valid_in[] = u"bш\uAAAA\U0010AAAA";
+  const char exp[] = "bш\uAAAA\U0010AAAA";
+
+  static_assert (array_size (valid_in) == 6, "");
+  static_assert (array_size (exp) == 11, "");
+  VERIFY (char_traits<char16_t>::length (valid_in) == 5);
+  VERIFY (char_traits<char>::length (exp) == 10);
+
+  test_offsets_error<CharT> offsets[] = {
+    {5, 10, 0, 0, 0xD800, 0},
+    {5, 10, 0, 0, 0xDBFF, 0},
+    {5, 10, 0, 0, 0xDC00, 0},
+    {5, 10, 0, 0, 0xDFFF, 0},
+
+    {5, 10, 1, 1, 0xD800, 1},
+    {5, 10, 1, 1, 0xDBFF, 1},
+    {5, 10, 1, 1, 0xDC00, 1},
+    {5, 10, 1, 1, 0xDFFF, 1},
+
+    {5, 10, 2, 3, 0xD800, 2},
+    {5, 10, 2, 3, 0xDBFF, 2},
+    {5, 10, 2, 3, 0xDC00, 2},
+    {5, 10, 2, 3, 0xDFFF, 2},
+
+    // don't replace anything, just show the surrogate pair
+    {5, 10, 3, 6, u'b', 0},
+
+    // make the leading surrogate a trailing one
+    {5, 10, 3, 6, 0xDC00, 3},
+    {5, 10, 3, 6, 0xDFFF, 3},
+
+    // make the trailing surrogate a leading one
+    {5, 10, 3, 6, 0xD800, 4},
+    {5, 10, 3, 6, 0xDBFF, 4},
+
+    // make the trailing surrogate a BMP char
+    {5, 10, 3, 6, u'z', 4},
+
+    {5, 7, 3, 6, u'b', 0}, // no space for fourth CP
+    {5, 8, 3, 6, u'b', 0}, // no space for fourth CP
+    {5, 9, 3, 6, u'b', 0}, // no space for fourth CP
+
+    {4, 10, 3, 6, u'b', 0}, // incomplete fourth CP
+    {4, 7, 3, 6, u'b', 0},  // incomplete fourth CP, and no space for it
+    {4, 8, 3, 6, u'b', 0},  // incomplete fourth CP, and no space for it
+    {4, 9, 3, 6, u'b', 0},  // incomplete fourth CP, and no space for it
+
+  };
+
+  for (auto t : offsets)
+    {
+      CharT in[array_size (valid_in)] = {};
+      char out[array_size (exp) - 1] = {};
+      VERIFY (t.in_size <= array_size (in));
+      VERIFY (t.out_size <= array_size (out));
+      VERIFY (t.expected_in_next <= t.in_size);
+      VERIFY (t.expected_out_next <= t.out_size);
+      copy (begin (valid_in), end (valid_in), begin (in));
+      in[t.replace_pos] = t.replace_char;
+
+      auto state = mbstate_t{};
+      auto in_next = (const CharT *) nullptr;
+      auto out_next = (char *) nullptr;
+      auto res = codecvt_base::result ();
+
+      res = cvt.out (state, in, in + t.in_size, in_next, out, out + t.out_size,
+		     out_next);
+      VERIFY (res == cvt.error);
+      VERIFY (in_next == in + t.expected_in_next);
+      VERIFY (out_next == out + t.expected_out_next);
+      VERIFY (char_traits<char>::compare (out, exp, t.expected_out_next) == 0);
+      if (t.expected_out_next < array_size (out))
+	VERIFY (out[t.expected_out_next] == 0);
+    }
+}
+
+template <class CharT>
+void
+ucs2_to_utf8_out (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  ucs2_to_utf8_out_ok (cvt);
+  ucs2_to_utf8_out_partial (cvt);
+  ucs2_to_utf8_out_error (cvt);
+}
+
+template <class CharT>
+void
+test_utf8_ucs2_cvts (const std::codecvt<CharT, char, mbstate_t> &cvt)
+{
+  utf8_to_ucs2_in (cvt);
+  ucs2_to_utf8_out (cvt);
+}
diff --git a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
new file mode 100644
index 000000000..169504939
--- /dev/null
+++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
@@ -0,0 +1,59 @@
+// Copyright (C) 2020-2023 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do run { target c++11 } }
+
+#include "codecvt_unicode.h"
+
+#include <codecvt>
+
+using namespace std;
+
+void
+test_utf8_utf32_codecvts ()
+{
+#if __SIZEOF_WCHAR_T__ == 4
+  auto cvt_ptr = to_unique_ptr (new codecvt_utf8<wchar_t> ());
+  test_utf8_utf32_codecvts (*cvt_ptr);
+#endif
+}
+
+void
+test_utf8_utf16_codecvts ()
+{
+#if __SIZEOF_WCHAR_T__ >= 2
+  auto cvt_ptr = to_unique_ptr (new codecvt_utf8_utf16<wchar_t> ());
+  test_utf8_utf16_cvts (*cvt_ptr);
+#endif
+}
+
+void
+test_utf8_ucs2_codecvts ()
+{
+#if __SIZEOF_WCHAR_T__ == 2
+  auto cvt_ptr = to_unique_ptr (new codecvt_utf8<wchar_t> ());
+  test_utf8_ucs2_cvts (*cvt_ptr);
+#endif
+}
+
+int
+main ()
+{
+  test_utf8_utf32_codecvts ();
+  test_utf8_utf16_codecvts ();
+  test_utf8_ucs2_codecvts ();
+}