From patchwork Thu Aug 17 20:32:19 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Wakely
X-Patchwork-Id: 135945
From: Jonathan Wakely
To: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: [committed] libstdc++: Optimize std::string::assign(Iter, Iter) [PR110945]
Date: Thu, 17 Aug 2023 21:32:19 +0100
Message-ID: <20230817203223.1131562-1-jwakely@redhat.com>
List-Id: Gcc-patches mailing list

Tested x86_64-linux. Pushed to trunk.

-- >8 --

Calling string::assign(Iter, Iter) with "foreign" iterators (not the
string's own iterator or pointer types) currently constructs a temporary
string and then calls replace to copy the characters from it. That means
we copy from the iterators twice, and if the replace operation has to
grow the string then we also allocate twice.

By using *this = basic_string(first, last, get_allocator()) we only
perform a single allocation+copy and then do a cheap move assignment
instead of a second copy (and possible allocation). But that alternative
has to be done conditionally, so that we don't pessimize the native
iterator case (the string's own iterator and pointer types), which
currently selects efficient overloads of replace that will not allocate
at all if the string already has sufficient capacity. For C++20 we can
extend that efficient case to work for any contiguous iterator with the
right value type, not just for the string's native iterators.

So the change is to inline the code that decides whether to work in
place or to allocate+copy (instead of deciding that via overload
resolution for replace), and for the allocate+copy case do a move
assignment instead of another call to replace.

For C++98 there is no change, as we can't do an efficient move
assignment anyway, so keep the current code.

We can also simplify assign(initializer_list) because the backing array
for an initializer_list is always disjoint from *this, so most of the
code in _M_replace is not needed.

libstdc++-v3/ChangeLog:

        PR libstdc++/110945
        * include/bits/basic_string.h (basic_string::assign(Iter, Iter)):
        Dispatch to _M_replace or move assignment from a temporary,
        based on the iterator type.
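
For readers who want to see the dispatch described above outside the library, here is a minimal user-level sketch. It is not the libstdc++ implementation: assign_range and is_native_char_iter_v are hypothetical names, the code works on std::string through its public interface only, and it targets C++17, so it checks a fixed list of iterator types rather than using the C++20 contiguous_iterator concept.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <type_traits>

// Hypothetical trait (not a libstdc++ name): true for iterator types that
// point directly at contiguous char storage.
template<typename It>
inline constexpr bool is_native_char_iter_v =
    std::is_same_v<It, char*> || std::is_same_v<It, const char*>
    || std::is_same_v<It, std::string::iterator>
    || std::is_same_v<It, std::string::const_iterator>;

// Hypothetical free function illustrating the compile-time dispatch.
template<typename It>
std::string& assign_range(std::string& s, It first, It last)
{
    if constexpr (is_native_char_iter_v<It>)
    {
        if (first == last)   // avoid dereferencing an end iterator
        {
            s.clear();
            return s;
        }
        // In-place path: the pointer+length overload can reuse the
        // existing capacity instead of allocating.
        return s.assign(&*first, static_cast<std::size_t>(last - first));
    }
    else
    {
        // Foreign iterators: read the range once into a temporary,
        // then move-assign it.
        return s = std::string(first, last, s.get_allocator());
    }
}
```

Calling assign_range(s, p, p + n) with a const char* takes the first branch, while passing std::list<char> iterators takes the second.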
---
 libstdc++-v3/include/bits/basic_string.h | 42 +++++++++++++++++++++---
 1 file changed, 38 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index f4bbf521bba..09fd62afa66 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -1711,15 +1711,36 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
        *  Sets value of string to characters in the range [__first,__last).
       */
 #if __cplusplus >= 201103L
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wc++17-extensions"
       template<class _InputIterator,
                typename = std::_RequireInputIter<_InputIterator>>
         _GLIBCXX20_CONSTEXPR
+        basic_string&
+        assign(_InputIterator __first, _InputIterator __last)
+        {
+#if __cplusplus >= 202002L
+          if constexpr (contiguous_iterator<_InputIterator>
+                          && is_same_v<iter_value_t<_InputIterator>, _CharT>)
+#else
+          if constexpr (__is_one_of<_InputIterator, const_iterator, iterator,
+                                    const _CharT*, _CharT*>::value)
+#endif
+            {
+              __glibcxx_requires_valid_range(__first, __last);
+              return _M_replace(size_type(0), size(),
+                                std::__to_address(__first), __last - __first);
+            }
+          else
+            return *this = basic_string(__first, __last, get_allocator());
+        }
+#pragma GCC diagnostic pop
 #else
       template<class _InputIterator>
+        basic_string&
+        assign(_InputIterator __first, _InputIterator __last)
+        { return this->replace(begin(), end(), __first, __last); }
 #endif
-        basic_string&
-        assign(_InputIterator __first, _InputIterator __last)
-        { return this->replace(begin(), end(), __first, __last); }
 
 #if __cplusplus >= 201103L
       /**
@@ -1730,7 +1751,20 @@ _GLIBCXX_BEGIN_NAMESPACE_CXX11
       _GLIBCXX20_CONSTEXPR
       basic_string&
       assign(initializer_list<_CharT> __l)
-      { return this->assign(__l.begin(), __l.size()); }
+      {
+        // The initializer_list array cannot alias the characters in *this
+        // so we don't need to use replace to that case.
+        const size_type __n = __l.size();
+        if (__n > capacity())
+          *this = basic_string(__l.begin(), __l.end(), get_allocator());
+        else
+          {
+            if (__n)
+              _S_copy(_M_data(), __l.begin(), __n);
+            _M_set_length(__n);
+          }
+        return *this;
+      }
 #endif // C++11
 
 #if __cplusplus >= 201703L
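
The assign(initializer_list) hunk can be approximated the same way in user code. The sketch below is an illustration only, with a hypothetical function name (assign_chars). It uses resize followed by std::copy where the patch uses the internal _S_copy and _M_set_length, so unlike the real code it may write the characters twice. It also assumes, as mainstream implementations do, that resizing within the current capacity does not reallocate.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>
#include <string>

// Hypothetical free function mirroring the capacity check in the patch.
std::string& assign_chars(std::string& s, std::initializer_list<char> il)
{
    if (il.size() > s.capacity())
    {
        // Not enough room: build a new string once and move-assign it.
        s = std::string(il.begin(), il.end(), s.get_allocator());
    }
    else
    {
        // The initializer_list's backing array cannot overlap s, so a
        // plain forward copy into the existing buffer is safe.
        s.resize(il.size());
        std::copy(il.begin(), il.end(), s.begin());
    }
    return s;
}
```

With a string that has already reserved space, the second branch runs and the existing buffer is reused. A list larger than the current capacity goes through the temporary instead.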