From patchwork Sat Dec 16 20:10:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Antony Polukhin
X-Patchwork-Id: 179921
From: Antony Polukhin
Date: Sat, 16 Dec 2023 23:10:33 +0300
Subject: [PATCH] PR libstdc++/112682 More efficient std::basic_string move
To: "libstdc++" , gcc-patches List

A few places in bits/basic_string.h use `traits_type::copy` to copy
`__str.length() + 1` bytes. Even though `__str.length()` is known to be at
most 15 in those places, the compiler emits (and sometimes inlines) a
`memcpy` call, which results in a fairly large instruction sequence:
https://godbolt.org/z/j35MMfxzq

Replacing `__str.length() + 1` with `_S_local_capacity + 1` explicitly
forces the compiler to copy the whole `__str._M_local_buf`. As a result the
assembly becomes almost 5 times shorter, with no function calls and no
chains of conditional jumps: https://godbolt.org/z/bfq8bxra9

This patch always copies `_S_local_capacity + 1` characters when working
with `std::char_traits`.

	PR libstdc++/112682:
	* include/bits/basic_string.h: Optimize string moves.

P.S.: I am still not sure whether this optimization is UB-free and
acceptable for libstdc++. However, the assembly looks much better with it.

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index 1b8ebca7dad..7a5e348280c 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -188,6 +188,23 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
       : basic_string(__svw._M_sv.data(), __svw._M_sv.size(), __a) { }
 #endif
 
+      _GLIBCXX17_CONSTEXPR
+      static bool
+      _S_permit_copying_indeterminate() noexcept
+      {
+        // Copying compile-time known _S_local_capacity + 1 bytes is much more
+        // efficient than copying runtime known __str.length() + 1. This
+        // function returns true, if such initialization is permitted even if
+        // the right side has indeterminate values.
+        //
+        // [dcl.init] permits initializing with indeterminate value of unsigned
+        // narrow character type.
+        //
+        // Library users should not specialize char_traits so this is
+        // not observable for user.
+        return is_same >::value;
+      }
+
       // Use empty-base optimization: http://www.cantrip.org/emptyopt.html
       struct _Alloc_hider : allocator_type // TODO check __is_final
       {
@@ -672,8 +689,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
       {
         if (__str._M_is_local())
           {
-            traits_type::copy(_M_local_buf, __str._M_local_buf,
-                              __str.length() + 1);
+            size_type __copy_count = _S_local_capacity + 1;
+            if _GLIBCXX17_CONSTEXPR (!_S_permit_copying_indeterminate())
+              __copy_count = __str.length() + 1;
+            traits_type::copy(_M_local_buf, __str._M_local_buf, __copy_count);
           }
         else
           {
@@ -711,8 +730,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
       {
         if (__str._M_is_local())
           {
-            traits_type::copy(_M_local_buf, __str._M_local_buf,
-                              __str.length() + 1);
+            size_type __copy_count = _S_local_capacity + 1;
+            if _GLIBCXX17_CONSTEXPR (!_S_permit_copying_indeterminate())
+              __copy_count = __str.length() + 1;
+            traits_type::copy(_M_local_buf, __str._M_local_buf, __copy_count);
             _M_length(__str.length());
             __str._M_set_length(0);
           }
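
For experimenting with the idea outside of libstdc++, here is a minimal standalone sketch of the same count selection. It is not the library code: `SsoBuf`, `permit_full_copy`, `assign` and `take` are hypothetical names, the `is_same` condition is only an assumption about the intent of `_S_permit_copying_indeterminate()`, and the buffer is value-initialized, so the sketch never reads indeterminate bytes and does not settle the UB question from the P.S.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the small-string part of std::basic_string.
// None of these names are libstdc++ internals.
template <typename CharT, typename Traits = std::char_traits<CharT>>
struct SsoBuf
{
  // Same formula libstdc++ uses for _S_local_capacity.
  static constexpr std::size_t local_capacity = 15 / sizeof(CharT);

  // Value-initialized on purpose: every element has a determinate value.
  CharT buf[local_capacity + 1] = {};
  std::size_t len = 0;

  // Assumption: the whole-buffer copy is enabled only for the unmodified
  // std::char_traits<char>.
  static constexpr bool permit_full_copy()
  { return std::is_same<Traits, std::char_traits<char>>::value; }

  void assign(const CharT* s, std::size_t n)
  {
    assert(n <= local_capacity);
    Traits::copy(buf, s, n);
    buf[n] = CharT();
    len = n;
  }

  // Move-like transfer of the local buffer from `other` (a distinct object).
  void take(SsoBuf& other)
  {
    // Compile-time count by default: the whole local buffer.
    std::size_t count = local_capacity + 1;
    // The patch uses `if _GLIBCXX17_CONSTEXPR` here; the condition is a
    // constant expression, so a plain `if` is folded by the optimizer too.
    if (!permit_full_copy())
      count = other.len + 1;  // runtime count: characters plus terminator
    Traits::copy(buf, other.buf, count);
    len = other.len;
    other.len = 0;
    other.buf[0] = CharT();
  }
};
```

With `SsoBuf<char>` the copy in `take` has a constant size of 16 bytes, which is the case the godbolt links above compare against the runtime-sized copy. With `SsoBuf<wchar_t>` the condition is false and the length-based count is kept.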