Message ID | 554A751E.9030009@gmail.com |
---|---|
State | New |
Headers | show |
On Wed, May 6, 2015 at 10:10 PM, François Dumont <frs.dumont@gmail.com> wrote: > Hi > > Following Marc Glisse comment #4 > on:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65641 I would like to > propose this enhancement to the hash functor for pointers. It simply gets > rid of the irrelevant bits on pointers hash code based on memory alignment > of the pointed type. The only drawback I can think of is that the type needs > to be complete at std::hash instantiation time but is it really an issue ? > > IMO it is quite obvious that the resulting hash code will be better but If you use a real hashing function that's not true. That is, something else than GCCs pointer_hash (void *p) { return (uintptr_t)p >>3; }. Richard. > if anyone has a good method to prove it I can try to implement it. The test > I have added in quality.cc is very basic and just reflect enhancement > following Marc's comment. > > 2015-05-05 François Dumont <fdumont@gcc.gnu.org> > > * include/bits/functional_hash.h > (std::__detail::_Lowest_power_of_two<size_t>): New. > (std::hash<_Tp*>::operator()): Use latter. > * testsuite/20_util/hash/quality.cc (pointer_quality_test): New. > > Tested under Linux x86_64. > > François >
On 08/05/2015 10:02, Richard Biener wrote: > On Wed, May 6, 2015 at 10:10 PM, François Dumont <frs.dumont@gmail.com> wrote: >> Hi >> >> Following Marc Glisse comment #4 >> on:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65641 I would like to >> propose this enhancement to the hash functor for pointers. It simply gets >> rid of the irrelevant bits on pointers hash code based on memory alignment >> of the pointed type. The only drawback I can think of is that the type needs >> to be complete at std::hash instantiation time but is it really an issue ? >> >> IMO it is quite obvious that the resulting hash code will be better but > If you use a real hashing function that's not true. That is, something > else than GCCs pointer_hash (void *p) { return (uintptr_t)p >>3; }. Sorry, I don't understand your remark. Do you mean that if someone is not using std::hash to hash its pointers he won't benefit from the enhancement ? It is a good point however to see that gcc is using something similar internally. François
My concern with accepting this patch is that many of libstdc++'s hash functions are awful from a mixing point of view -- you would get exactly the same problem from users who have integers which are always a multiple of a power of 2 (which is in practice not uncommon). This would give exactly the same problem. Rather than try to "fix" one hash function like this, we should just accept our hash functions might have low quality lower order bits. On 8 May 2015 at 21:18, François Dumont <frs.dumont@gmail.com> wrote: > On 08/05/2015 10:02, Richard Biener wrote: >> >> On Wed, May 6, 2015 at 10:10 PM, François Dumont <frs.dumont@gmail.com> >> wrote: >>> >>> Hi >>> >>> Following Marc Glisse comment #4 >>> on:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65641 I would like to >>> propose this enhancement to the hash functor for pointers. It simply gets >>> rid of the irrelevant bits on pointers hash code based on memory >>> alignment >>> of the pointed type. The only drawback I can think of is that the type >>> needs >>> to be complete at std::hash instantiation time but is it really an issue >>> ? >>> >>> IMO it is quite obvious that the resulting hash code will be better >>> but >> >> If you use a real hashing function that's not true. That is, something >> else than GCCs pointer_hash (void *p) { return (uintptr_t)p >>3; }. > > > Sorry, I don't understand your remark. Do you mean that if someone is not > using std::hash to hash its pointers he won't benefit from the enhancement ? > > It is a good point however to see that gcc is using something similar > internally. > > François >
My proposal should be consider out of any context. We don't know what std::hash is used for in user code, this is why I am proposing this patch even if for the moment it doesn't make any difference considering only our usage of it. Your remark would make more sens if we were talking about changing std::unordered_xxx containers number of buckets policy to a power of 2 François On 09/05/2015 00:10, Christopher Jefferson wrote: > My concern with accepting this patch is that many of libstdc++'s hash > functions are awful from a mixing point of view -- you would get > exactly the same problem from users who have integers which are always > a multiple of a power of 2 (which is in practice not uncommon). This > would give exactly the same problem. > > Rather than try to "fix" one hash function like this, we should just > accept our hash functions might have low quality lower order bits. > > > > On 8 May 2015 at 21:18, François Dumont<frs.dumont@gmail.com> wrote: >> On 08/05/2015 10:02, Richard Biener wrote: >>> On Wed, May 6, 2015 at 10:10 PM, François Dumont<frs.dumont@gmail.com> >>> wrote: >>>> Hi >>>> >>>> Following Marc Glisse comment #4 >>>> on:https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65641 I would like to >>>> propose this enhancement to the hash functor for pointers. It simply gets >>>> rid of the irrelevant bits on pointers hash code based on memory >>>> alignment >>>> of the pointed type. The only drawback I can think of is that the type >>>> needs >>>> to be complete at std::hash instantiation time but is it really an issue >>>> ? >>>> >>>> IMO it is quite obvious that the resulting hash code will be better >>>> but >>> If you use a real hashing function that's not true. That is, something >>> else than GCCs pointer_hash (void *p) { return (uintptr_t)p >>3; }. >> Sorry, I don't understand your remark. Do you mean that if someone is not >> using std::hash to hash its pointers he won't benefit from the enhancement ? >> >> It is a good point however to see that gcc is using something similar >> internally. >> >> François >>
diff --git a/libstdc++-v3/include/bits/functional_hash.h b/libstdc++-v3/include/bits/functional_hash.h index d94843f..a217f8a 100644 --- a/libstdc++-v3/include/bits/functional_hash.h +++ b/libstdc++-v3/include/bits/functional_hash.h @@ -36,6 +36,29 @@ namespace std _GLIBCXX_VISIBILITY(default) { +namespace __detail +{ +_GLIBCXX_BEGIN_NAMESPACE_VERSION + + // Compute highest power of 2 lower or equal to __n. + template<size_t __n> + struct _Lowest_power_of_two + { + static const size_t __val + = _Lowest_power_of_two< (__n >> 1) >::__val + 1; + }; + + template<> + struct _Lowest_power_of_two<1> + { static const size_t __val = 0; }; + + template<> + struct _Lowest_power_of_two<0> + { static const size_t __val = 0; }; + +_GLIBCXX_END_NAMESPACE_VERSION +} + _GLIBCXX_BEGIN_NAMESPACE_VERSION /** @defgroup hashes Hashes @@ -63,7 +86,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { size_t operator()(_Tp* __p) const noexcept - { return reinterpret_cast<size_t>(__p); } + { + return reinterpret_cast<size_t>(__p) + >> __detail::_Lowest_power_of_two<alignof(_Tp)>::__val; + } }; // Explicit specializations for integer types. diff --git a/libstdc++-v3/testsuite/20_util/hash/quality.cc b/libstdc++-v3/testsuite/20_util/hash/quality.cc index af417ed..d9c72c7 100644 --- a/libstdc++-v3/testsuite/20_util/hash/quality.cc +++ b/libstdc++-v3/testsuite/20_util/hash/quality.cc @@ -164,9 +164,20 @@ quality_test() } } +void +pointer_quality_test() +{ + bool test __attribute__((unused)) = true; + + double d1, d2; + std::hash<double*> dh; + VERIFY( dh(&d1) % sizeof(double) != dh(&d2) % sizeof(double) ); +} + int main() { quality_test(); + pointer_quality_test(); return 0; }