libstdc++: Optimize operator+(string/char*, string/char*) equally

Message ID 20220824061648.1119635-2-whh8b@obs.cr
State New, archived
Headers
Series libstdc++: Optimize operator+(string/char*, string/char*) equally |

Commit Message

Will Hawkins Aug. 24, 2022, 6:16 a.m. UTC
  From: Will Hawkins <whh8b@obs.cr>

Until now operator+(char*, string) and operator+(string, char*) had
different performance characteristics. The former required a single
memory allocation and the latter required two. This patch makes the
performance equal.

libstdc++-v3/ChangeLog:
        * libstdc++-v3/include/bits/basic_string.h (operator+(string, char*)):
        Remove naive implementation.
        * libstdc++-v3/include/bits/basic_string.tcc (operator+(string, char*)):
        Add single-allocation implementation.

Signed-off-by: Will Hawkins <whh8b@obs.cr>
---
 libstdc++-v3/include/bits/basic_string.h   |  9 ++-------
 libstdc++-v3/include/bits/basic_string.tcc | 21 +++++++++++++++++++++
 2 files changed, 23 insertions(+), 7 deletions(-)
  

Comments

Jonathan Wakely Aug. 24, 2022, 2:24 p.m. UTC | #1
On Wed, 24 Aug 2022 at 07:17, Will Hawkins wrote:
>
> Until now operator+(char*, string) and operator+(string, char*) had
> different performance characteristics. The former required a single
> memory allocation and the latter required two. This patch makes the
> performance equal.
>
> libstdc++-v3/ChangeLog:

There should be a blank line here.

>         * libstdc++-v3/include/bits/basic_string.h (operator+(string, char*)):

The path should be relative to the ChangeLog, so should not include
the libstdc++-v3/ directory component. You can use the git gcc-verify
alias to check your commit msgs format before submitting. That runs
the same checks as will be used for the server-side hook that decides
whether to allow a push. See the customization script described at
https://gcc.gnu.org/gitwrite.html#vendor for the alaises.

Also, the overload you're changing is operator+(const string&, const
char*). The distinction matters, because there is also
operator+(string&&, const char*) and what you wrote looks more like
that one.

So I've committed it with this changelog:


   libstdc++-v3/ChangeLog:

           * include/bits/basic_string.h (operator+(const string&,
const char*)):
           Remove naive implementation.
           * include/bits/basic_string.tcc (operator+(const string&,
const char*)):
           Add single-allocation implementation.

Thanks for the patch!
  
Alexandre Oliva Aug. 24, 2022, 10:39 p.m. UTC | #2
On Aug 24, 2022, Jonathan Wakely via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:

>            * include/bits/basic_string.h (operator+(const string&,
> const char*)):
>            Remove naive implementation.
>            * include/bits/basic_string.tcc (operator+(const string&,
> const char*)):
>            Add single-allocation implementation.

ISTM this requires the following additional tweak:

diff --git a/libstdc++-v3/src/c++11/string-inst.cc b/libstdc++-v3/src/c++11/string-inst.cc
index bfae6d902a1dd..2ec0e9d85f947 100644
--- a/libstdc++-v3/src/c++11/string-inst.cc
+++ b/libstdc++-v3/src/c++11/string-inst.cc
@@ -58,6 +58,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template class basic_string<C>;
   template S operator+(const C*, const S&);
+  template S operator+(const S&, const C*);
   template S operator+(C, const S&);
   template S operator+(const S&, const S&);
 

Without this, I'm getting undefined references to this specialization in
libstdc++.so, that I tracked down to a std::system_error ctor in
cxx11-ios_failure.o.  I got this while testing another patch that might
be the reason why the template instantiation doesn't get inlined, but...
we can't depend on its being inlined, can we?
  
Jonathan Wakely Aug. 24, 2022, 10:47 p.m. UTC | #3
On Wed, 24 Aug 2022 at 23:39, Alexandre Oliva <oliva@gnu.org> wrote:
>
> On Aug 24, 2022, Jonathan Wakely via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>
> >            * include/bits/basic_string.h (operator+(const string&,
> > const char*)):
> >            Remove naive implementation.
> >            * include/bits/basic_string.tcc (operator+(const string&,
> > const char*)):
> >            Add single-allocation implementation.
>
> ISTM this requires the following additional tweak:
>
> diff --git a/libstdc++-v3/src/c++11/string-inst.cc b/libstdc++-v3/src/c++11/string-inst.cc
> index bfae6d902a1dd..2ec0e9d85f947 100644
> --- a/libstdc++-v3/src/c++11/string-inst.cc
> +++ b/libstdc++-v3/src/c++11/string-inst.cc
> @@ -58,6 +58,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>    template class basic_string<C>;
>    template S operator+(const C*, const S&);
> +  template S operator+(const S&, const C*);
>    template S operator+(C, const S&);
>    template S operator+(const S&, const S&);
>
>
> Without this, I'm getting undefined references to this specialization in
> libstdc++.so, that I tracked down to a std::system_error ctor in
> cxx11-ios_failure.o.  I got this while testing another patch that might
> be the reason why the template instantiation doesn't get inlined, but...
> we can't depend on its being inlined, can we?

Right. But adding that will cause another symbol to be exported,
probably with the wrong symbol version.

To fix https://gcc.gnu.org/PR106735 I'm just going to revert it for
now, and revisit in the morning.
  
Jonathan Wakely Aug. 24, 2022, 11:02 p.m. UTC | #4
On Wed, 24 Aug 2022 at 23:47, Jonathan Wakely <jwakely@redhat.com> wrote:
>
> On Wed, 24 Aug 2022 at 23:39, Alexandre Oliva <oliva@gnu.org> wrote:
> >
> > On Aug 24, 2022, Jonathan Wakely via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> >
> > >            * include/bits/basic_string.h (operator+(const string&,
> > > const char*)):
> > >            Remove naive implementation.
> > >            * include/bits/basic_string.tcc (operator+(const string&,
> > > const char*)):
> > >            Add single-allocation implementation.
> >
> > ISTM this requires the following additional tweak:
> >
> > diff --git a/libstdc++-v3/src/c++11/string-inst.cc b/libstdc++-v3/src/c++11/string-inst.cc
> > index bfae6d902a1dd..2ec0e9d85f947 100644
> > --- a/libstdc++-v3/src/c++11/string-inst.cc
> > +++ b/libstdc++-v3/src/c++11/string-inst.cc
> > @@ -58,6 +58,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >
> >    template class basic_string<C>;
> >    template S operator+(const C*, const S&);
> > +  template S operator+(const S&, const C*);
> >    template S operator+(C, const S&);
> >    template S operator+(const S&, const S&);
> >
> >
> > Without this, I'm getting undefined references to this specialization in
> > libstdc++.so, that I tracked down to a std::system_error ctor in
> > cxx11-ios_failure.o.  I got this while testing another patch that might
> > be the reason why the template instantiation doesn't get inlined, but...
> > we can't depend on its being inlined, can we?
>
> Right. But adding that will cause another symbol to be exported,
> probably with the wrong symbol version.

Oh, but those symbols aren't exported ... they're just needed because
we compile with -fno-implicit-templatesand other symbols in
libstdc++.so require them.

So that probably is the right fix. I have another change for operator+
in mind now anyway, which optimizes operator(const string&, char) the
same way, and removes the duplication in those five operator+
overloads.

>
> To fix https://gcc.gnu.org/PR106735 I'm just going to revert it for
> now, and revisit in the morning.
  
Will Hawkins Aug. 25, 2022, 5:52 a.m. UTC | #5
On Wed, Aug 24, 2022 at 7:03 PM Jonathan Wakely <jwakely@redhat.com> wrote:
>
> On Wed, 24 Aug 2022 at 23:47, Jonathan Wakely <jwakely@redhat.com> wrote:
> >
> > On Wed, 24 Aug 2022 at 23:39, Alexandre Oliva <oliva@gnu.org> wrote:
> > >
> > > On Aug 24, 2022, Jonathan Wakely via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > >            * include/bits/basic_string.h (operator+(const string&,
> > > > const char*)):
> > > >            Remove naive implementation.
> > > >            * include/bits/basic_string.tcc (operator+(const string&,
> > > > const char*)):
> > > >            Add single-allocation implementation.
> > >
> > > ISTM this requires the following additional tweak:
> > >
> > > diff --git a/libstdc++-v3/src/c++11/string-inst.cc b/libstdc++-v3/src/c++11/string-inst.cc
> > > index bfae6d902a1dd..2ec0e9d85f947 100644
> > > --- a/libstdc++-v3/src/c++11/string-inst.cc
> > > +++ b/libstdc++-v3/src/c++11/string-inst.cc
> > > @@ -58,6 +58,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >
> > >    template class basic_string<C>;
> > >    template S operator+(const C*, const S&);
> > > +  template S operator+(const S&, const C*);
> > >    template S operator+(C, const S&);
> > >    template S operator+(const S&, const S&);
> > >

I realize that I should have noticed that asymmetry as well. Sorry!

> > >
> > > Without this, I'm getting undefined references to this specialization in
> > > libstdc++.so, that I tracked down to a std::system_error ctor in
> > > cxx11-ios_failure.o.  I got this while testing another patch that might
> > > be the reason why the template instantiation doesn't get inlined, but...
> > > we can't depend on its being inlined, can we?
> >
> > Right. But adding that will cause another symbol to be exported,
> > probably with the wrong symbol version.
>
> Oh, but those symbols aren't exported ... they're just needed because
> we compile with -fno-implicit-templatesand other symbols in
> libstdc++.so require them.
>
> So that probably is the right fix. I have another change for operator+
> in mind now anyway, which optimizes operator(const string&, char) the
> same way, and removes the duplication in those five operator+
> overloads.

Let me know if/how I can help.

Will

>
> >
> > To fix https://gcc.gnu.org/PR106735 I'm just going to revert it for
> > now, and revisit in the morning.
>
  

Patch

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index b04fba95678..fa6738925bb 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -3521,14 +3521,9 @@  _GLIBCXX_END_NAMESPACE_CXX11
    */
   template<typename _CharT, typename _Traits, typename _Alloc>
     _GLIBCXX20_CONSTEXPR
-    inline basic_string<_CharT, _Traits, _Alloc>
+    basic_string<_CharT, _Traits, _Alloc>
     operator+(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
-	      const _CharT* __rhs)
-    {
-      basic_string<_CharT, _Traits, _Alloc> __str(__lhs);
-      __str.append(__rhs);
-      return __str;
-    }
+	      const _CharT* __rhs);
 
   /**
    *  @brief  Concatenate string and character.
diff --git a/libstdc++-v3/include/bits/basic_string.tcc b/libstdc++-v3/include/bits/basic_string.tcc
index 4563c61429a..95ba8e503e9 100644
--- a/libstdc++-v3/include/bits/basic_string.tcc
+++ b/libstdc++-v3/include/bits/basic_string.tcc
@@ -640,6 +640,27 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
       return __str;
     }
 
+  template<typename _CharT, typename _Traits, typename _Alloc>
+    _GLIBCXX20_CONSTEXPR
+    basic_string<_CharT, _Traits, _Alloc>
+    operator+(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
+	      const _CharT* __rhs)
+    {
+      __glibcxx_requires_string(__rhs);
+      typedef basic_string<_CharT, _Traits, _Alloc> __string_type;
+      typedef typename __string_type::size_type	  __size_type;
+      typedef typename __gnu_cxx::__alloc_traits<_Alloc>::template
+	rebind<_CharT>::other _Char_alloc_type;
+      typedef __gnu_cxx::__alloc_traits<_Char_alloc_type> _Alloc_traits;
+      const __size_type __len = _Traits::length(__rhs);
+      __string_type __str(_Alloc_traits::_S_select_on_copy(
+          __lhs.get_allocator()));
+      __str.reserve(__len + __lhs.size());
+      __str.append(__lhs);
+      __str.append(__rhs, __len);
+      return __str;
+    }
+
   template<typename _CharT, typename _Traits, typename _Alloc>
     _GLIBCXX_STRING_CONSTEXPR
     typename basic_string<_CharT, _Traits, _Alloc>::size_type