[RFC] Make iterating over hash-map elide copying/destructing

Message ID alpine.LSU.2.20.1807101037290.16707@zhemvz.fhfr.qr
State New
Headers show
Series
  • [RFC] Make iterating over hash-map elide copying/destructing
Related show

Commit Message

Richard Biener July 10, 2018, 8:43 a.m.
The following makes the hash-map iterator dereference return a pair<Key, 
Value&> rather than a copy of Value.  This matches the hash-table iterator
behavior and avoids issues with

  hash_map<tree, auto_vec<..., 2> >

where iterating over the hash-table will call the auto_vec destructor
when dereferencing the iterator.  I note that the copy ctor of
auto_vec should probably be deleted and the hash-table/map iterators
should possibly support an alternate "reference" type to the stored
Values so we can use vec<> for "references" and auto_vec<> for
stored members.

But that's out of scope - the patch below seems to survive minimal
testing at least.

I suppose we still want to somehow hide the copy ctors of auto_vec?
How does hash-map growth work here?  (I suppose it doesn't...?)

Any further comments?

Thanks,
Richard.

2018-07-10  Richard Biener  <rguenther@suse.de>

	* hash-map.h (hash_map::iterator::operator*): Return
	a reference to Value.

Comments

Trevor Saunders July 10, 2018, 9:15 a.m. | #1
On Tue, Jul 10, 2018 at 10:43:20AM +0200, Richard Biener wrote:
> 
> The following makes the hash-map iterator dereference return a pair<Key, 
> Value&> rather than a copy of Value.  This matches the hash-table iterator
> behavior and avoids issues with
> 
>   hash_map<tree, auto_vec<..., 2> >

Eventually somebodies probably going to want
hash_map<unique_ptr<x>>, auto_vec<tree>> too, so we might as well go ahead
and make it pair<key &, value &>?

> where iterating over the hash-table will call the auto_vec destructor
> when dereferencing the iterator.  I note that the copy ctor of
> auto_vec should probably be deleted and the hash-table/map iterators
> should possibly support an alternate "reference" type to the stored
> Values so we can use vec<> for "references" and auto_vec<> for
> stored members.

I think code somewhere uses the auto_vec copy ctor to return a auto_vec,
this is pretty similar to the situation with unique_ptr in c++98 mode.

> But that's out of scope - the patch below seems to survive minimal
> testing at least.
> 
> I suppose we still want to somehow hide the copy ctors of auto_vec?

I suspec the best we can do is delete it in c++11 mode and provide a
auto_vec<T>(auto_vec<T> &&) move ctor instead.  Though I think for the
case where auto_vec has inline storage we should be able to just delete
the copy ctor?

> How does hash-map growth work here?  (I suppose it doesn't...?)

Yeah was going to ask, I think hash_table memcpy's the elements? in
which case memcpying a pointer into yourself isn't going to work.
However I think if you use the auto_vec specialization for 0 internal
elements that should be able to work if we null out the old auto_vec or
avoid running dtors on the old elements.

> Any further comments?

other than using a reference for the key type seems good.

thanks

trev
Richard Biener July 10, 2018, 9:46 a.m. | #2
On Tue, 10 Jul 2018, Trevor Saunders wrote:

> On Tue, Jul 10, 2018 at 10:43:20AM +0200, Richard Biener wrote:
> > 
> > The following makes the hash-map iterator dereference return a pair<Key, 
> > Value&> rather than a copy of Value.  This matches the hash-table iterator
> > behavior and avoids issues with
> > 
> >   hash_map<tree, auto_vec<..., 2> >
> 
> Eventually somebodies probably going to want
> hash_map<unique_ptr<x>>, auto_vec<tree>> too, so we might as well go ahead
> and make it pair<key &, value &>?
> 
> > where iterating over the hash-table will call the auto_vec destructor
> > when dereferencing the iterator.  I note that the copy ctor of
> > auto_vec should probably be deleted and the hash-table/map iterators
> > should possibly support an alternate "reference" type to the stored
> > Values so we can use vec<> for "references" and auto_vec<> for
> > stored members.
> 
> I think code somewhere uses the auto_vec copy ctor to return a auto_vec,
> this is pretty similar to the situation with unique_ptr in c++98 mode.
> 
> > But that's out of scope - the patch below seems to survive minimal
> > testing at least.
> > 
> > I suppose we still want to somehow hide the copy ctors of auto_vec?
> 
> I suspec the best we can do is delete it in c++11 mode and provide a
> auto_vec<T>(auto_vec<T> &&) move ctor instead.  Though I think for the
> case where auto_vec has inline storage we should be able to just delete
> the copy ctor?
> 
> > How does hash-map growth work here?  (I suppose it doesn't...?)
> 
> Yeah was going to ask, I think hash_table memcpy's the elements? in
> which case memcpying a pointer into yourself isn't going to work.

It doesn't work.  It uses assignment but auto_vec doesn't implement
that so auto-storage breaks.  So you say it should use
std::move<> where that's obviously not available for us :/

> However I think if you use the auto_vec specialization for 0 internal
> elements that should be able to work if we null out the old auto_vec or
> avoid running dtors on the old elements.

Well, then I don't really need auto_vec, I'm more interested in the
embedded storage than the destructor ;)

> > Any further comments?
> 
> other than using a reference for the key type seems good.

OK, I suppose it should be 'const Key&' then (hopefully that
works for Key == const X / X * as intended).

I guess given the expansion problem I'm going to re-think using
auto_vec for now :/

Can we please move to C++11? ;)

Richard.
Richard Biener July 10, 2018, 1:01 p.m. | #3
On Tue, 10 Jul 2018, Richard Biener wrote:

> On Tue, 10 Jul 2018, Trevor Saunders wrote:
> 
> > On Tue, Jul 10, 2018 at 10:43:20AM +0200, Richard Biener wrote:
> > > 
> > > The following makes the hash-map iterator dereference return a pair<Key, 
> > > Value&> rather than a copy of Value.  This matches the hash-table iterator
> > > behavior and avoids issues with
> > > 
> > >   hash_map<tree, auto_vec<..., 2> >
> > 
> > Eventually somebodies probably going to want
> > hash_map<unique_ptr<x>>, auto_vec<tree>> too, so we might as well go ahead
> > and make it pair<key &, value &>?
> > 
> > > where iterating over the hash-table will call the auto_vec destructor
> > > when dereferencing the iterator.  I note that the copy ctor of
> > > auto_vec should probably be deleted and the hash-table/map iterators
> > > should possibly support an alternate "reference" type to the stored
> > > Values so we can use vec<> for "references" and auto_vec<> for
> > > stored members.
> > 
> > I think code somewhere uses the auto_vec copy ctor to return a auto_vec,
> > this is pretty similar to the situation with unique_ptr in c++98 mode.
> > 
> > > But that's out of scope - the patch below seems to survive minimal
> > > testing at least.
> > > 
> > > I suppose we still want to somehow hide the copy ctors of auto_vec?
> > 
> > I suspec the best we can do is delete it in c++11 mode and provide a
> > auto_vec<T>(auto_vec<T> &&) move ctor instead.  Though I think for the
> > case where auto_vec has inline storage we should be able to just delete
> > the copy ctor?
> > 
> > > How does hash-map growth work here?  (I suppose it doesn't...?)
> > 
> > Yeah was going to ask, I think hash_table memcpy's the elements? in
> > which case memcpying a pointer into yourself isn't going to work.
> 
> It doesn't work.  It uses assignment but auto_vec doesn't implement
> that so auto-storage breaks.  So you say it should use
> std::move<> where that's obviously not available for us :/
> 
> > However I think if you use the auto_vec specialization for 0 internal
> > elements that should be able to work if we null out the old auto_vec or
> > avoid running dtors on the old elements.
> 
> Well, then I don't really need auto_vec, I'm more interested in the
> embedded storage than the destructor ;)
> 
> > > Any further comments?
> > 
> > other than using a reference for the key type seems good.
> 
> OK, I suppose it should be 'const Key&' then (hopefully that
> works for Key == const X / X * as intended).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-07-10  Richard Biener  <rguenther@suse.de>

	* hash-map.h (hash_map::iterator::operator*): Return
	references to key and value.

diff --git a/gcc/hash-map.h b/gcc/hash-map.h
index 7861440f3b3..39848289d80 100644
--- a/gcc/hash-map.h
+++ b/gcc/hash-map.h
@@ -223,10 +223,10 @@ public:
       return *this;
     }
 
-    std::pair<Key, Value> operator* ()
+    std::pair<const Key&, Value&> operator* ()
     {
       hash_entry &e = *m_iter;
-      return std::pair<Key, Value> (e.m_key, e.m_value);
+      return std::pair<const Key&, Value&> (e.m_key, e.m_value);
     }
 
     bool
Trevor Saunders July 11, 2018, 11:24 a.m. | #4
On Tue, Jul 10, 2018 at 11:46:54AM +0200, Richard Biener wrote:
> On Tue, 10 Jul 2018, Trevor Saunders wrote:
> 
> > On Tue, Jul 10, 2018 at 10:43:20AM +0200, Richard Biener wrote:
> > > 
> > > The following makes the hash-map iterator dereference return a pair<Key, 
> > > Value&> rather than a copy of Value.  This matches the hash-table iterator
> > > behavior and avoids issues with
> > > 
> > >   hash_map<tree, auto_vec<..., 2> >
> > 
> > Eventually somebodies probably going to want
> > hash_map<unique_ptr<x>>, auto_vec<tree>> too, so we might as well go ahead
> > and make it pair<key &, value &>?
> > 
> > > where iterating over the hash-table will call the auto_vec destructor
> > > when dereferencing the iterator.  I note that the copy ctor of
> > > auto_vec should probably be deleted and the hash-table/map iterators
> > > should possibly support an alternate "reference" type to the stored
> > > Values so we can use vec<> for "references" and auto_vec<> for
> > > stored members.
> > 
> > I think code somewhere uses the auto_vec copy ctor to return a auto_vec,
> > this is pretty similar to the situation with unique_ptr in c++98 mode.
> > 
> > > But that's out of scope - the patch below seems to survive minimal
> > > testing at least.
> > > 
> > > I suppose we still want to somehow hide the copy ctors of auto_vec?
> > 
> > I suspec the best we can do is delete it in c++11 mode and provide a
> > auto_vec<T>(auto_vec<T> &&) move ctor instead.  Though I think for the
> > case where auto_vec has inline storage we should be able to just delete
> > the copy ctor?
> > 
> > > How does hash-map growth work here?  (I suppose it doesn't...?)
> > 
> > Yeah was going to ask, I think hash_table memcpy's the elements? in
> > which case memcpying a pointer into yourself isn't going to work.
> 
> It doesn't work.  It uses assignment but auto_vec doesn't implement
> that so auto-storage breaks.  So you say it should use
> std::move<> where that's obviously not available for us :/

Well, since it doesn't define an assignment operator, but its members
are copyable it gets a copy assignment operator generated by the
compiler that just copies the data leading to it being broken.  I
suppose we could implement a copy assignment that handles the different
cases of using inline storage or not, but that seems complicated, and
kind of slow for a "assignment" so I'd be inclined to not support
copying it.  However if we went that route we should prevent use of the
assignment operator by declaring one explicitly and making it private but
then not implementing it, so it at least fails to link and with some
macros you can actually tell the compiler in c++11 its deleted and may
not be used.

> 
> > However I think if you use the auto_vec specialization for 0 internal
> > elements that should be able to work if we null out the old auto_vec or
> > avoid running dtors on the old elements.
> 
> Well, then I don't really need auto_vec, I'm more interested in the
> embedded storage than the destructor ;)

A hash table with inline storage like that is really going to eat
memory, but I suppose you know that ;) anyway fair enough.

> > > Any further comments?
> > 
> > other than using a reference for the key type seems good.
> 
> OK, I suppose it should be 'const Key&' then (hopefully that
> works for Key == const X / X * as intended).

Unfortunately I can never remember, but hopefully.

> I guess given the expansion problem I'm going to re-think using
> auto_vec for now :/

Seems like the destructor would help with cleanup, but yeah kind of an
odd beast to put in a datastructure.

> Can we please move to C++11? ;)

;) talk to the IBM folks I think their basically the only ones who
really care.

thanks

Trev

> 
> Richard.
Pedro Alves July 11, 2018, 9:16 p.m. | #5
On 07/11/2018 12:24 PM, Trevor Saunders wrote:
> However if we went that route we should prevent use of the
> assignment operator by declaring one explicitly and making it private but
> then not implementing it, so it at least fails to link and with some
> macros you can actually tell the compiler in c++11 its deleted and may
> not be used.

The macro already exists --- DISABLE_COPY_AND_ASSIGN in include/ansidecl.h.

Thanks,
Pedro Alves

Patch

diff --git a/gcc/hash-map.h b/gcc/hash-map.h
index 7861440f3b3..9d2b38a843e 100644
--- a/gcc/hash-map.h
+++ b/gcc/hash-map.h
@@ -223,10 +223,10 @@  public:
       return *this;
     }
 
-    std::pair<Key, Value> operator* ()
+    std::pair<Key, Value&> operator* ()
     {
       hash_entry &e = *m_iter;
-      return std::pair<Key, Value> (e.m_key, e.m_value);
+      return std::pair<Key, Value&> (e.m_key, e.m_value);
     }
 
     bool