diff mbox

Add header implementation of std::to_string for integers (PR libstdc++/71108)

Message ID CAPoJ1RH7BDKNCF9yo8XEXx_mSZ4=G2LWPxxG0rfL9oSyrQSUWQ@mail.gmail.com
State New
Headers show

Commit Message

Adrian Wielgosik May 27, 2017, 9:46 p.m. UTC
Currently std::to_string takes a fairly long trip to vs(n/w)printf. The patch
implements int-to-string formatting in header, to improve function performance.

Results of a benchmark on my PC:
to_string(integer) takes 85-107ns per call (depending on result string size),
out of which ~75ns is spent in vsnprintf. The new implementation removes the
overhead, reducing time per call to 12-29ns.

Tested on x64.

As a side note, this is my first patch for libstdc++/GCC, so please try to
forgive any silly mistakes. Regarding requesting copyright assignment, do I
have to contact assign@gnu.org separately, or is mentioning it here enough?


2017-05-27  Adrian Wielgosik  <adrian.wielgosik@gmail.com>

    PR libstdc++/71108
    * libstdc++-v3/include/ext/string_conversions.h:
    (__int_to_string, __format_uint): New.
    * libstdc++-v3/include/bits/basic_string.h: Use the new function.


 libstdc++-v3/include/bits/basic_string.h      | 44
++++++++++++--------------------------------
 libstdc++-v3/include/ext/string_conversions.h | 38
++++++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+), 32 deletions(-)

Comments

Daniel Krügler May 28, 2017, 1:37 p.m. UTC | #1
2017-05-27 23:46 GMT+02:00 Adrian Wielgosik <adrian.wielgosik@gmail.com>:
> Currently std::to_string takes a fairly long trip to vs(n/w)printf. The patch
> implements int-to-string formatting in header, to improve function performance.

The existing specification of std::to_string is specified in a way
that would make this replacement implementation non-conforming, IMO.
Reason for this is, that the C++ specification says:

"Each function returns a string object holding the character
representation of the value of
its argument that would be generated by calling sprintf(buf, fmt, val)
with a format specifier of
"%d", "%u", "%ld", "%lu", "%lld", "%llu", "%f", "%f", or "%Lf",
respectively, where buf designates
an internal character buffer of sufficient size."

The explicit reference to sprintf without specification of any locale
means that the outcome should depend on the currently set locale, so a
conforming program could notice the difference by either calling
std::setlocale.

This is different for the new "primitive numeric output" functions
(std::to_chars, std::from_chars), which explicitly require a
conversion as if the "C" locale would be active.

- Daniel
Adrian Wielgosik May 28, 2017, 7:38 p.m. UTC | #2
> so a conforming program could notice the difference by either calling std::setlocale.

Unless I missed or misunderstood something about locale (please let me
know if I did), I don't know of any way for locale to affect %d and
its integer friends.
Daniel Krügler May 28, 2017, 8:03 p.m. UTC | #3
2017-05-28 21:38 GMT+02:00 Adrian Wielgosik <adrian.wielgosik@gmail.com>:
>> so a conforming program could notice the difference by either calling std::setlocale.
>
> Unless I missed or misunderstood something about locale (please let me
> know if I did), I don't know of any way for locale to affect %d and
> its integer friends.

Hmmh, rethinking about it, I agree that I also cannot see a possible
effect here for the integer types. Initially I was alarmed by the fact
alone that the current wording is basically local-depending, but now I
agree that in this case I cannot find how a conforming program could
observe the difference. Thanks for remaining stubbornly ;-)

- Daniel
Tim Song May 28, 2017, 8:29 p.m. UTC | #4
libstdc++ has a to_chars implementation already. Assuming that the
locale issue isn't a problem, can that be reused?
Ville Voutilainen May 28, 2017, 9:05 p.m. UTC | #5
On 28 May 2017 at 23:29, Tim Song <t.canens.cpp@gmail.com> wrote:
> libstdc++ has a to_chars implementation already. Assuming that the
> locale issue isn't a problem, can that be reused?

Maybe, but that implementation hasn't landed yet. The patch has been circulated
but I don't see it having been committed.
Adrian Wielgosik May 29, 2017, 8:46 a.m. UTC | #6
> Assuming that the locale issue isn't a problem, can that be reused?

The to_chars patch uses C++14 features, while to_string is C++11. If
that was solved, it probably could be used.
However, as far as I know, simply using to_chars in to_string would
technically be suboptimal, because it needs three loops:
- in to_chars, to determine length of the string
- in to_chars, to format the number
- in to_string, to copy the formatted string to std::string

Meanwhile, ideally to_string can use only two loops:
- to determine length of the std::string
- to format the number in constructed std::string

OR (this is what I am doing)
- to format the number in temporary buffer
- to copy the formatted string to std::string

That said, the proposed not-yet-committed to_chars implementation is
more optimized than my code (by doing less integer divisions), so it
may perform just as good as mine or better.
diff mbox

Patch

diff --git a/libstdc++-v3/include/bits/basic_string.h
b/libstdc++-v3/include/bits/basic_string.h
index b6693c440c01..7732b246a962 100644
--- a/libstdc++-v3/include/bits/basic_string.h
+++ b/libstdc++-v3/include/bits/basic_string.h
@@ -6210,37 +6210,27 @@  _GLIBCXX_BEGIN_NAMESPACE_CXX11
   // DR 1261.
   inline string
   to_string(int __val)
-  { return __gnu_cxx::__to_xstring<string>(&std::vsnprintf, 4 * sizeof(int),
-   "%d", __val); }
+  { return __gnu_cxx::__int_to_string<string>(__val); }

   inline string
   to_string(unsigned __val)
-  { return __gnu_cxx::__to_xstring<string>(&std::vsnprintf,
-   4 * sizeof(unsigned),
-   "%u", __val); }
+  { return __gnu_cxx::__int_to_string<string>(__val); }

   inline string
   to_string(long __val)
-  { return __gnu_cxx::__to_xstring<string>(&std::vsnprintf, 4 * sizeof(long),
-   "%ld", __val); }
+  { return __gnu_cxx::__int_to_string<string>(__val); }

   inline string
   to_string(unsigned long __val)
-  { return __gnu_cxx::__to_xstring<string>(&std::vsnprintf,
-   4 * sizeof(unsigned long),
-   "%lu", __val); }
+  { return __gnu_cxx::__int_to_string<string>(__val); }

   inline string
   to_string(long long __val)
-  { return __gnu_cxx::__to_xstring<string>(&std::vsnprintf,
-   4 * sizeof(long long),
-   "%lld", __val); }
+  { return __gnu_cxx::__int_to_string<string>(__val); }

   inline string
   to_string(unsigned long long __val)
-  { return __gnu_cxx::__to_xstring<string>(&std::vsnprintf,
-   4 * sizeof(unsigned long long),
-   "%llu", __val); }
+  { return __gnu_cxx::__int_to_string<string>(__val); }

   inline string
   to_string(float __val)
@@ -6313,37 +6303,27 @@  _GLIBCXX_BEGIN_NAMESPACE_CXX11
   // DR 1261.
   inline wstring
   to_wstring(int __val)
-  { return __gnu_cxx::__to_xstring<wstring>(&std::vswprintf, 4 * sizeof(int),
-    L"%d", __val); }
+  { return __gnu_cxx::__int_to_string<wstring>(__val); }

   inline wstring
   to_wstring(unsigned __val)
-  { return __gnu_cxx::__to_xstring<wstring>(&std::vswprintf,
-    4 * sizeof(unsigned),
-    L"%u", __val); }
+  { return __gnu_cxx::__int_to_string<wstring>(__val); }

   inline wstring
   to_wstring(long __val)
-  { return __gnu_cxx::__to_xstring<wstring>(&std::vswprintf, 4 * sizeof(long),
-    L"%ld", __val); }
+  { return __gnu_cxx::__int_to_string<wstring>(__val); }

   inline wstring
   to_wstring(unsigned long __val)
-  { return __gnu_cxx::__to_xstring<wstring>(&std::vswprintf,
-    4 * sizeof(unsigned long),
-    L"%lu", __val); }
+  { return __gnu_cxx::__int_to_string<wstring>(__val); }

   inline wstring
   to_wstring(long long __val)
-  { return __gnu_cxx::__to_xstring<wstring>(&std::vswprintf,
-    4 * sizeof(long long),
-    L"%lld", __val); }
+  { return __gnu_cxx::__int_to_string<wstring>(__val); }

   inline wstring
   to_wstring(unsigned long long __val)
-  { return __gnu_cxx::__to_xstring<wstring>(&std::vswprintf,
-    4 * sizeof(unsigned long long),
-    L"%llu", __val); }
+  { return __gnu_cxx::__int_to_string<wstring>(__val); }

   inline wstring
   to_wstring(float __val)
diff --git a/libstdc++-v3/include/ext/string_conversions.h
b/libstdc++-v3/include/ext/string_conversions.h
index 89cba61857d4..617a626a7955 100644
--- a/libstdc++-v3/include/ext/string_conversions.h
+++ b/libstdc++-v3/include/ext/string_conversions.h
@@ -93,6 +93,44 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
       return __ret;
     }

+  template<typename _CharT, typename _T>
+    _CharT*
+    __format_uint(_CharT *__end, _T __value)
+    {
+      do
+      {
+        *--__end = '0' + __value % 10;
+        __value /= 10;
+      } while(__value);
+
+      return __end;
+    }
+
+  template<typename _String, typename _T>
+    _String
+    __int_to_string(_T __value)
+    {
+      using _UT = typename std::make_unsigned<_T>::type;
+      using _CharT = typename _String::value_type;
+      constexpr std::size_t __n = 4 * sizeof(_T);
+
+      _UT __absolute;
+      bool __negative = __value < 0;
+      if (__negative)
+        __absolute = -static_cast<_UT>(__value);
+      else
+        __absolute = static_cast<_UT>(__value);
+
+      _CharT __s[__n];
+      _CharT* __end = __s + __n;
+      _CharT* __start = __format_uint(__end, __absolute);
+
+      if (__negative)
+        *--__start = '-';
+
+      return _String(__start, __end);
+    }
+
   // Helper for the to_string / to_wstring functions.
   template<typename _String, typename _CharT = typename _String::value_type>
     _String