Patchwork hasher speed traits

login
register
mail settings
Submitter François Dumont
Date Feb. 2, 2013, 9:57 p.m.
Message ID <510D8BE2.3020105@gmail.com>
Download mbox | patch
Permalink /patch/217703/
State New
Headers show

Comments

François Dumont - Feb. 2, 2013, 9:57 p.m.
Hi

     Here is the last patch I can think of for 4.8. Thanks to it default 
performance reported in performance/23_containers/insert/54075.cc and 
performance/23_containers/insert_erase/41975.cc are always the best:

54075.cc                     std::unordered_set without hash code cached 
300000 insertion attempts, 300000 inserted      10r    8u 1s 
13761936mem    0pf
54075.cc                     std::unordered_set without hash code cached 
10 times insertion of 300000 elements      31r   31u 0s        0mem    0pf

54075.cc                     std::unordered_set with hash code cached 
300000 insertion attempts, 300000 inserted      10r    9u 1s 
18562000mem    0pf
54075.cc                     std::unordered_set with hash code cached 10 
times insertion of 300000 elements      34r   35u 0s        0mem    0pf

54075.cc                     std::unordered_set default cache 300000 
insertion attempts, 300000 inserted       9r    8u    0s 13761936mem    0pf
54075.cc                     std::unordered_set default cache 10 times 
insertion of 300000 elements      31r   32u    0s 0mem    0pf


41975.cc                     std::unordered_set<string> without hash 
code cached: first insert       9r    9u    0s 8450336mem    0pf
41975.cc                     std::unordered_set<string> without hash 
code cached: erase from iterator       6r    5u    0s -6400096mem    0pf
41975.cc                     std::unordered_set<string> without hash 
code cached: second insert       6r    5u    0s 6400000mem    0pf
41975.cc                     std::unordered_set<string> without hash 
code cached: erase from key       5r    5u    0s -6400000mem    0pf

41975.cc                     std::unordered_set<string> with hash code 
cached: first insert       5r    5u    1s  8450336mem 0pf
41975.cc                     std::unordered_set<string> with hash code 
cached: erase from iterator       4r    3u    0s -6400096mem    0pf
41975.cc                     std::unordered_set<string> with hash code 
cached: second insert       3r    3u    0s  6400016mem 0pf
41975.cc                     std::unordered_set<string> with hash code 
cached: erase from key       4r    3u    0s -6400016mem 0pf

41975.cc                     std::unordered_set<string> default cache: 
first insert       5r    5u    1s  8450336mem    0pf
41975.cc                     std::unordered_set<string> default cache: 
erase from iterator       4r    3u    0s -6400096mem    0pf
41975.cc                     std::unordered_set<string> default cache: 
second insert       3r    3u    0s  6400000mem    0pf
41975.cc                     std::unordered_set<string> default cache: 
erase from key       4r    3u    0s -6400000mem 0pf

2013-02-02  François Dumont  <fdumont@gcc.gnu.org>

     * include/bits/functional_hash.h (std::__is_fast_hash<>): New.
     * include/bits/basic_string.h: Specialize previous to mark
     std::hash for string types as slow.
     * include/bits/hashtable.h (__cache_default): Replace is_integral
     with __is_fast_hash.
     * src/c++11/hash_c++0x.cc: Add type_traits include.

Tested under Linux x86_64.

Ok to commit ?

François
Jonathan Wakely - Feb. 3, 2013, 9:03 p.m.
On 2 February 2013 21:57, François Dumont wrote:
> Hi
>
>     Here is the last patch I can think of for 4.8. Thanks to it default
> performance reported in performance/23_containers/insert/54075.cc and
> performance/23_containers/insert_erase/41975.cc are always the best:

Excellent.

> Ok to commit ?

Yes, thanks for all your work on this.

As Benjamin suggested, we should think about which changes are
user-visible and worth documenting in the GCC 4.8 release notes.

We should certainly note somewhere that it's important for the hash
function to be noexcept to avoid decreased performance.

Patch

Index: include/bits/functional_hash.h
===================================================================
--- include/bits/functional_hash.h	(revision 195686)
+++ include/bits/functional_hash.h	(working copy)
@@ -195,6 +195,18 @@ 
 
   // @} group hashes
 
+  // Hint about performance of hash functor. If not fast the hash based
+  // containers will cache the hash code.
+  // Default behavior is to consider that hasher are fast unless specified
+  // otherwise.
+  template<typename _Hash>
+    struct __is_fast_hash : public std::true_type
+    { };
+
+  template<>
+    struct __is_fast_hash<hash<long double>> : public std::false_type
+    { };
+
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
Index: include/bits/basic_string.h
===================================================================
--- include/bits/basic_string.h	(revision 195686)
+++ include/bits/basic_string.h	(working copy)
@@ -3053,6 +3053,10 @@ 
       { return std::_Hash_impl::hash(__s.data(), __s.length()); }
     };
 
+  template<>
+    struct __is_fast_hash<hash<string>> : std::false_type
+    { };
+
 #ifdef _GLIBCXX_USE_WCHAR_T
   /// std::hash specialization for wstring.
   template<>
@@ -3064,6 +3068,10 @@ 
       { return std::_Hash_impl::hash(__s.data(),
                                      __s.length() * sizeof(wchar_t)); }
     };
+
+  template<>
+    struct __is_fast_hash<hash<wstring>> : std::false_type
+    { };
 #endif
 #endif /* _GLIBCXX_COMPATIBILITY_CXX0X */
 
@@ -3079,6 +3087,10 @@ 
                                      __s.length() * sizeof(char16_t)); }
     };
 
+  template<>
+    struct __is_fast_hash<hash<u16string>> : std::false_type
+    { };
+
   /// std::hash specialization for u32string.
   template<>
     struct hash<u32string>
@@ -3089,6 +3101,10 @@ 
       { return std::_Hash_impl::hash(__s.data(),
                                      __s.length() * sizeof(char32_t)); }
     };
+
+  template<>
+    struct __is_fast_hash<hash<u32string>> : std::false_type
+    { };
 #endif
 
 _GLIBCXX_END_NAMESPACE_VERSION
Index: include/bits/hashtable.h
===================================================================
--- include/bits/hashtable.h	(revision 195686)
+++ include/bits/hashtable.h	(working copy)
@@ -40,9 +40,8 @@ 
 
   template<typename _Tp, typename _Hash>
     using __cache_default
-      =  __not_<__and_<// Do not cache for builtin integral types having trivial
-		       // hasher.
-		       is_integral<_Tp>,
+      =  __not_<__and_<// Do not cache for fast hasher.
+		       __is_fast_hash<_Hash>,
 		       // Mandatory to make local_iterator default
 		       // constructible.
 		       is_default_constructible<_Hash>,
Index: src/c++11/hash_c++0x.cc
===================================================================
--- src/c++11/hash_c++0x.cc	(revision 195686)
+++ src/c++11/hash_c++0x.cc	(working copy)
@@ -26,6 +26,7 @@ 
 # error "hash_c++0x.cc must be compiled with -std=gnu++0x"
 #endif
 
+#include <type_traits>
 #include <bits/functional_hash.h>
 
 namespace std _GLIBCXX_VISIBILITY(default)