libstdc++: Optimize operator+(string/char*, string/char*) equally

Message ID 20220822181509.1032874-2-whh8b@obs.cr
State New, archived
Headers
Series libstdc++: Optimize operator+(string/char*, string/char*) equally |

Commit Message

Will Hawkins Aug. 22, 2022, 6:15 p.m. UTC
  From: Will Hawkins <whh8b@obs.cr>

Until now operator+(char*, string) and operator+(string, char*) had
different performance characteristics. The former required a single
memory allocation and the latter required two. This patch makes the
performance equal.

libstdc++-v3/ChangeLog:
        * libstdc++-v3/include/bits/basic_string.h (operator+(string, char*)):
        Remove naive implementation.
        * libstdc++-v3/include/bits/basic_string.tcc (operator+(string, char*)):
        Add single-allocation implementation.
---
 libstdc++-v3/include/bits/basic_string.h   |  7 +------
 libstdc++-v3/include/bits/basic_string.tcc | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+), 6 deletions(-)
  

Comments

Jonathan Wakely Aug. 23, 2022, 4:33 p.m. UTC | #1
On Mon, 22 Aug 2022 at 19:15, Will Hawkins wrote:
>
> Until now operator+(char*, string) and operator+(string, char*) had
> different performance characteristics. The former required a single
> memory allocation and the latter required two. This patch makes the
> performance equal.

If you don't have a GCC copyright assignment on file with the FSF then
please follow the procedure described at https://gcc.gnu.org/dco.html


> libstdc++-v3/ChangeLog:
>         * libstdc++-v3/include/bits/basic_string.h (operator+(string, char*)):
>         Remove naive implementation.
>         * libstdc++-v3/include/bits/basic_string.tcc (operator+(string, char*)):
>         Add single-allocation implementation.
> ---
>  libstdc++-v3/include/bits/basic_string.h   |  7 +------
>  libstdc++-v3/include/bits/basic_string.tcc | 21 +++++++++++++++++++++
>  2 files changed, 22 insertions(+), 6 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
> index b04fba95678..bc048fc0689 100644
> --- a/libstdc++-v3/include/bits/basic_string.h
> +++ b/libstdc++-v3/include/bits/basic_string.h
> @@ -3523,12 +3523,7 @@ _GLIBCXX_END_NAMESPACE_CXX11
>      _GLIBCXX20_CONSTEXPR
>      inline basic_string<_CharT, _Traits, _Alloc>

Please remove the 'inline' specifier here, since you're moving the
definition into the non-inline .tcc file.

There's a separate discussion to be had about whether these operator+
overloads *should* be inline. But for the purposes of this change, we
want these two operator+ overloads to be consistent, and so they
should both be non-inline.

>      operator+(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
> -             const _CharT* __rhs)
> -    {
> -      basic_string<_CharT, _Traits, _Alloc> __str(__lhs);
> -      __str.append(__rhs);
> -      return __str;
> -    }
> +             const _CharT* __rhs);
>
>    /**
>     *  @brief  Concatenate string and character.
> diff --git a/libstdc++-v3/include/bits/basic_string.tcc b/libstdc++-v3/include/bits/basic_string.tcc
> index 4563c61429a..95ba8e503e9 100644
> --- a/libstdc++-v3/include/bits/basic_string.tcc
> +++ b/libstdc++-v3/include/bits/basic_string.tcc
> @@ -640,6 +640,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        return __str;
>      }
>
> +  template<typename _CharT, typename _Traits, typename _Alloc>
> +    _GLIBCXX20_CONSTEXPR
> +    basic_string<_CharT, _Traits, _Alloc>
> +    operator+(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
> +             const _CharT* __rhs)
> +    {
> +      __glibcxx_requires_string(__rhs);
> +      typedef basic_string<_CharT, _Traits, _Alloc> __string_type;
> +      typedef typename __string_type::size_type          __size_type;
> +      typedef typename __gnu_cxx::__alloc_traits<_Alloc>::template
> +       rebind<_CharT>::other _Char_alloc_type;
> +      typedef __gnu_cxx::__alloc_traits<_Char_alloc_type> _Alloc_traits;
> +      const __size_type __len = _Traits::length(__rhs);
> +      __string_type __str(_Alloc_traits::_S_select_on_copy(
> +          __lhs.get_allocator()));
> +      __str.reserve(__len + __lhs.size());
> +      __str.append(__lhs);
> +      __str.append(__rhs, __len);
> +      return __str;
> +    }
> +
>    template<typename _CharT, typename _Traits, typename _Alloc>
>      _GLIBCXX_STRING_CONSTEXPR
>      typename basic_string<_CharT, _Traits, _Alloc>::size_type
> --
> 2.34.1
>
  
Will Hawkins Aug. 24, 2022, 6:16 a.m. UTC | #2
A revision of the original patch -- based on the feedback from Jonathan -- that
removes the `inline` specifier is attached.
  
Will Hawkins Aug. 24, 2022, 6:18 a.m. UTC | #3
On Tue, Aug 23, 2022 at 12:33 PM Jonathan Wakely <jwakely@redhat.com> wrote:
>
> On Mon, 22 Aug 2022 at 19:15, Will Hawkins wrote:
> >
> > Until now operator+(char*, string) and operator+(string, char*) had
> > different performance characteristics. The former required a single
> > memory allocation and the latter required two. This patch makes the
> > performance equal.
>
> If you don't have a GCC copyright assignment on file with the FSF then
> please follow the procedure described at https://gcc.gnu.org/dco.html

Thank you.

>
>
> > libstdc++-v3/ChangeLog:
> >         * libstdc++-v3/include/bits/basic_string.h (operator+(string, char*)):
> >         Remove naive implementation.
> >         * libstdc++-v3/include/bits/basic_string.tcc (operator+(string, char*)):
> >         Add single-allocation implementation.
> > ---
> >  libstdc++-v3/include/bits/basic_string.h   |  7 +------
> >  libstdc++-v3/include/bits/basic_string.tcc | 21 +++++++++++++++++++++
> >  2 files changed, 22 insertions(+), 6 deletions(-)
> >
> > diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
> > index b04fba95678..bc048fc0689 100644
> > --- a/libstdc++-v3/include/bits/basic_string.h
> > +++ b/libstdc++-v3/include/bits/basic_string.h
> > @@ -3523,12 +3523,7 @@ _GLIBCXX_END_NAMESPACE_CXX11
> >      _GLIBCXX20_CONSTEXPR
> >      inline basic_string<_CharT, _Traits, _Alloc>
>
> Please remove the 'inline' specifier here, since you're moving the
> definition into the non-inline .tcc file.
>
> There's a separate discussion to be had about whether these operator+
> overloads *should* be inline. But for the purposes of this change, we
> want these two operator+ overloads to be consistent, and so they
> should both be non-inline.

Thank you for the feedback. I sent out a v2 of the patch. Again, I
hope that I followed the proper procedure by having my mailer put the
patch in reply to my previous message.

Thank you again!
Will

>
> >      operator+(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
> > -             const _CharT* __rhs)
> > -    {
> > -      basic_string<_CharT, _Traits, _Alloc> __str(__lhs);
> > -      __str.append(__rhs);
> > -      return __str;
> > -    }
> > +             const _CharT* __rhs);
> >
> >    /**
> >     *  @brief  Concatenate string and character.
> > diff --git a/libstdc++-v3/include/bits/basic_string.tcc b/libstdc++-v3/include/bits/basic_string.tcc
> > index 4563c61429a..95ba8e503e9 100644
> > --- a/libstdc++-v3/include/bits/basic_string.tcc
> > +++ b/libstdc++-v3/include/bits/basic_string.tcc
> > @@ -640,6 +640,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >        return __str;
> >      }
> >
> > +  template<typename _CharT, typename _Traits, typename _Alloc>
> > +    _GLIBCXX20_CONSTEXPR
> > +    basic_string<_CharT, _Traits, _Alloc>
> > +    operator+(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
> > +             const _CharT* __rhs)
> > +    {
> > +      __glibcxx_requires_string(__rhs);
> > +      typedef basic_string<_CharT, _Traits, _Alloc> __string_type;
> > +      typedef typename __string_type::size_type          __size_type;
> > +      typedef typename __gnu_cxx::__alloc_traits<_Alloc>::template
> > +       rebind<_CharT>::other _Char_alloc_type;
> > +      typedef __gnu_cxx::__alloc_traits<_Char_alloc_type> _Alloc_traits;
> > +      const __size_type __len = _Traits::length(__rhs);
> > +      __string_type __str(_Alloc_traits::_S_select_on_copy(
> > +          __lhs.get_allocator()));
> > +      __str.reserve(__len + __lhs.size());
> > +      __str.append(__lhs);
> > +      __str.append(__rhs, __len);
> > +      return __str;
> > +    }
> > +
> >    template<typename _CharT, typename _Traits, typename _Alloc>
> >      _GLIBCXX_STRING_CONSTEXPR
> >      typename basic_string<_CharT, _Traits, _Alloc>::size_type
> > --
> > 2.34.1
> >
>
  
Jonathan Wakely Aug. 24, 2022, 9:49 a.m. UTC | #4
On Wed, 24 Aug 2022 at 07:18, Will Hawkins <hawkinsw@obs.cr> wrote:
>
> On Tue, Aug 23, 2022 at 12:33 PM Jonathan Wakely <jwakely@redhat.com> wrote:
> >
> > On Mon, 22 Aug 2022 at 19:15, Will Hawkins wrote:
> > >
> > > Until now operator+(char*, string) and operator+(string, char*) had
> > > different performance characteristics. The former required a single
> > > memory allocation and the latter required two. This patch makes the
> > > performance equal.
> >
> > If you don't have a GCC copyright assignment on file with the FSF then
> > please follow the procedure described at https://gcc.gnu.org/dco.html
>
> Thank you.
>
> >
> >
> > > libstdc++-v3/ChangeLog:
> > >         * libstdc++-v3/include/bits/basic_string.h (operator+(string, char*)):
> > >         Remove naive implementation.
> > >         * libstdc++-v3/include/bits/basic_string.tcc (operator+(string, char*)):
> > >         Add single-allocation implementation.
> > > ---
> > >  libstdc++-v3/include/bits/basic_string.h   |  7 +------
> > >  libstdc++-v3/include/bits/basic_string.tcc | 21 +++++++++++++++++++++
> > >  2 files changed, 22 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
> > > index b04fba95678..bc048fc0689 100644
> > > --- a/libstdc++-v3/include/bits/basic_string.h
> > > +++ b/libstdc++-v3/include/bits/basic_string.h
> > > @@ -3523,12 +3523,7 @@ _GLIBCXX_END_NAMESPACE_CXX11
> > >      _GLIBCXX20_CONSTEXPR
> > >      inline basic_string<_CharT, _Traits, _Alloc>
> >
> > Please remove the 'inline' specifier here, since you're moving the
> > definition into the non-inline .tcc file.
> >
> > There's a separate discussion to be had about whether these operator+
> > overloads *should* be inline. But for the purposes of this change, we
> > want these two operator+ overloads to be consistent, and so they
> > should both be non-inline.
>
> Thank you for the feedback. I sent out a v2 of the patch. Again, I
> hope that I followed the proper procedure by having my mailer put the
> patch in reply to my previous message.

It looks like the patch got attached in this thread, not the [PATCH v2] thread:
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600176.html
Presumably it was meant as a reply to:
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600175.html

It's more conventional to put the patch in the same email, not as a
separate reply, which would avoid that problem.

You can use git scissors to separate the patch submission from the
preceding discussion and comments, see
https://git-scm.com/docs/git-mailinfo#Documentation/git-mailinfo.txt---scissors

For example, see my patches like the one at
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600109.html
where the "informational" part comes first, describing where it was
tested, then the patch (and its commit msg) come after the -- >8 --
scissors.

If you use git send-email to send mails, you can use
--in-reply-to=$MessageId to make the email a reply to the specified
$MessageId from a previous mail in the thread.

Anyway, the v2 patch looks fine and I'll apply it to trunk after
testing - thanks!
  

Patch

diff --git a/libstdc++-v3/include/bits/basic_string.h b/libstdc++-v3/include/bits/basic_string.h
index b04fba95678..bc048fc0689 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -3523,12 +3523,7 @@  _GLIBCXX_END_NAMESPACE_CXX11
     _GLIBCXX20_CONSTEXPR
     inline basic_string<_CharT, _Traits, _Alloc>
     operator+(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
-	      const _CharT* __rhs)
-    {
-      basic_string<_CharT, _Traits, _Alloc> __str(__lhs);
-      __str.append(__rhs);
-      return __str;
-    }
+	      const _CharT* __rhs);
 
   /**
    *  @brief  Concatenate string and character.
diff --git a/libstdc++-v3/include/bits/basic_string.tcc b/libstdc++-v3/include/bits/basic_string.tcc
index 4563c61429a..95ba8e503e9 100644
--- a/libstdc++-v3/include/bits/basic_string.tcc
+++ b/libstdc++-v3/include/bits/basic_string.tcc
@@ -640,6 +640,27 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
       return __str;
     }
 
+  template<typename _CharT, typename _Traits, typename _Alloc>
+    _GLIBCXX20_CONSTEXPR
+    basic_string<_CharT, _Traits, _Alloc>
+    operator+(const basic_string<_CharT, _Traits, _Alloc>& __lhs,
+	      const _CharT* __rhs)
+    {
+      __glibcxx_requires_string(__rhs);
+      typedef basic_string<_CharT, _Traits, _Alloc> __string_type;
+      typedef typename __string_type::size_type	  __size_type;
+      typedef typename __gnu_cxx::__alloc_traits<_Alloc>::template
+	rebind<_CharT>::other _Char_alloc_type;
+      typedef __gnu_cxx::__alloc_traits<_Char_alloc_type> _Alloc_traits;
+      const __size_type __len = _Traits::length(__rhs);
+      __string_type __str(_Alloc_traits::_S_select_on_copy(
+          __lhs.get_allocator()));
+      __str.reserve(__len + __lhs.size());
+      __str.append(__lhs);
+      __str.append(__rhs, __len);
+      return __str;
+    }
+
   template<typename _CharT, typename _Traits, typename _Alloc>
     _GLIBCXX_STRING_CONSTEXPR
     typename basic_string<_CharT, _Traits, _Alloc>::size_type