diff mbox

C++ edge_iterator (was: Re: [SH] PR 53976 - Add RTL pass to eliminate clrt, sett insns)

Message ID 1386968457.2455.129.camel@yam-132-YW-E178-FTW
State New
Headers show

Commit Message

Oleg Endo Dec. 13, 2013, 9 p.m. UTC
On Thu, 2013-12-12 at 03:13 -0500, Trevor Saunders wrote:
> On Wed, Dec 11, 2013 at 06:47:37PM +0100, Oleg Endo wrote:
> > On Thu, 2013-11-21 at 00:04 +0100, Steven Bosscher wrote:
> > > Declaring the edge_iterator inside the for() is not a good argument
> > > against FOR_EACH_EDGE. Of course, brownie points are up for grabs for
> > > the brave soul daring enough to make edge iterators be proper C++
> > > iterators... ;-)
> 
> so, as a first question why do we have a special edge iterator at all? it
> seems like we could just have a vec iterator and use that removing a
> bunch of indirection that seems pretty useless.

I don't know why it's there.  Looks like a remainder from the pre-C++
code, as the conversion is being done step by step.

> 
> > So, I gave it a try -- see the attached patch.
> > It allows edge iteration to look more like STL container iteration:
> > 
> > for (basic_block::edge_iterator ei = bb->pred_edges ().begin ();
> >      ei != bb->pred_edges ().end (); ++ei)
> > {
> >   basic_block pred_bb = (*ei)->src;
> >   ...
> > }
> 
> personally I'm not really a fan of overloading ++ / * that way, but I
> can't speak for anyone else.  I'd prefer something like
> 
> for (vec_iterator i = vec.forward_iterator (); !i.done (); i.next ())
> and
> for (backward_vec_iterator i = vec.backward_iterator (); !i.done (); i.next ())
> 
> but that might break range base for loops?

Right, that doesn't work with range-based for loops, since it doesn't
follow the standard concept of iteration.  For a more detailed
explanation, see also for example
http://www.codesynthesis.com/~boris/blog/2012/05/16/cxx11-range-based-for-loop/

BTW, if you look at the patch, I haven't overloaded any ++ operators:

> > Then the
> > typedef struct basic_block_def* basic_block;
> > 
> > is replaced with a wrapper class 'basic_block', which is just a simple
> > POD wrapper around a basic_block_def*.  There should be no penalties
> > compared to passing/storing raw pointers.  Because of the union with
> > constructor restriction of C++98 an additional wrapper class
> > 'basic_block_in_union' is required, which doesn't have any constructors
> > defined.
> > 
> > Having 'basic_block' as a class allows putting typedefs for the edge
> > iterator types in there (initially I tried putting the typedefs into
> > struct basic_block_def, but gengtype would bail out).
> 
> namespacing like that seems a little messy, but so is vec_iterator or
> such I guess.

I'm not sure which part of the namespacing you're referring to exactly.
The basic_block::edge_iterator thing?  Usually the iterator type is
defined in the container type.  In this case it would be vec<edge,
va_gc>.  The choice of the container type for storing edges is done in
basic_block_def.  Thus, ideally the iterator type should be obtained
from the basic_block_def class somehow.  A more bureaucratic way would
be to have a typedef inside basic_block_def (which is not possible
because of gengtype as mentioned before, so let's assume it's in
basic_block)... 

class basic_block
{
public:
  typedef vec<edge, va_gc> edge_container;

  edge_container& pred_edges (void);
  edge_container& succ_edges (void);

...
};

and then access the iterator via 
for (basic_block::edge_container::iterator i = bb->bb->pred_edges
().begin (); ...)

Having to type out iterator types is a well known annoyance of C++98.
Of course it's shorter to write
for (edge_iterator i = ...)

but that means, that there can be only one type of edge container ever.


> > It would also be possible to have a free standing definition / typedef
> > of edge_iterator, but it would conflict with the existing one and
> > require too many changes at once.  Moreover, the iterator type actually
> 
> I bet it'll be a lot of work but changing everything seems nice so maybe
> its worth just sitting down for a couple days and banging it out if it
> gives nicer names?

Nicer names than "edge_iterator" you mean?  I can't think of any at the
moment... 

> 
> > depends on the container type, which is vec<edge, ...>, and the
> > container type is defined/selected by the basic_block class.
> 
> I don't see how this is relevent

I hope that the explanation above makes it somewhat clearer.

> 
> > The following
> >   basic_block pred_bb = (*ei)->src;
> > 
> > can also be written as
> >   basic_block pred_bb = ei->src;
> > 
> > after converting the edge typedef to a wrapper of edge_def*.
> 
> this is assuming you overload operator -> on the iterator? I'm a c++ guy
> not a stl guy, but that seems pretty dubious to me.

Yes, that requires overloading of "operator ->".  However, in this case
not in the iterator, but in the pointer wrapper as I've done it already
in the patch for class basic_block (in the file basic_block2.h).  This
is common practice for pointer wrappers (see e.g. std::shared_ptr).

Overloading "operator ->" is also required in iterators.  See
http://www.cplusplus.com/reference/iterator/
If raw pointers are used as iterators (as in my example patch), there's
nothing to overload for those of course.

> > The idea of the approach is to allow co-existence of the new
> > edge_iterator and the old and thus be able to gradually convert code.
> > The wrappers around raw pointers also helo encapsulating the underlying
> > memory management issues.  For example, it would be much easier to
> > replace garbage collected objects with intrusive reference counting.
> 
> I don't think there's actually a memory management issue here,
> edge_iterator can only work if you allocate it on the stack since its
> not marked for gty, and afaik ggc doesn't scan the stack so the
> edge_iterator can't keep the vector alive.  Now I think it would be nice
> if these vectors moved out of gc memory, but I don't think this is
> particularly helpful for that.

Sorry, I think I caused a misunderstanding here.  By "memory management
issue" I just meant the way a container stores its objects, like whether
it's storing pointers to garbage collected objects, smart pointers like
shared_ptr<edge> or whatever.  I didn't mean that the iterator should
somehow influence the lifetime of the container.

Cheers,
Oleg

Comments

Richard Biener Dec. 14, 2013, 9:19 a.m. UTC | #1
Oleg Endo <oleg.endo@t-online.de> wrote:
>On Thu, 2013-12-12 at 03:13 -0500, Trevor Saunders wrote:
>> On Wed, Dec 11, 2013 at 06:47:37PM +0100, Oleg Endo wrote:
>> > On Thu, 2013-11-21 at 00:04 +0100, Steven Bosscher wrote:
>> > > Declaring the edge_iterator inside the for() is not a good
>argument
>> > > against FOR_EACH_EDGE. Of course, brownie points are up for grabs
>for
>> > > the brave soul daring enough to make edge iterators be proper C++
>> > > iterators... ;-)
>> 
>> so, as a first question why do we have a special edge iterator at
>all? it
>> seems like we could just have a vec iterator and use that removing a
>> bunch of indirection that seems pretty useless.
>
>I don't know why it's there.  Looks like a remainder from the pre-C++
>code, as the conversion is being done step by step.

The fact that we iterate over a vector is an implementation detail that should be hidden.

Richard.

>> 
>> > So, I gave it a try -- see the attached patch.
>> > It allows edge iteration to look more like STL container iteration:
>> > 
>> > for (basic_block::edge_iterator ei = bb->pred_edges ().begin ();
>> >      ei != bb->pred_edges ().end (); ++ei)
>> > {
>> >   basic_block pred_bb = (*ei)->src;
>> >   ...
>> > }
>> 
>> personally I'm not really a fan of overloading ++ / * that way, but I
>> can't speak for anyone else.  I'd prefer something like
>> 
>> for (vec_iterator i = vec.forward_iterator (); !i.done (); i.next ())
>> and
>> for (backward_vec_iterator i = vec.backward_iterator (); !i.done ();
>i.next ())
>> 
>> but that might break range base for loops?
>
>Right, that doesn't work with range-based for loops, since it doesn't
>follow the standard concept of iteration.  For a more detailed
>explanation, see also for example
>http://www.codesynthesis.com/~boris/blog/2012/05/16/cxx11-range-based-for-loop/
>
>BTW, if you look at the patch, I haven't overloaded any ++ operators:
>
>Index: gcc/vec.h
>===================================================================
>--- gcc/vec.h	(revision 205866)
>+++ gcc/vec.h	(working copy)
>@@ -482,6 +482,15 @@
>   void quick_grow (unsigned len);
>   void quick_grow_cleared (unsigned len);
> 
>+  /* STL like iterator interface.  */
>+  typedef T* iterator;
>+  typedef const T* const_iterator;
>+
>+  iterator begin (void) { return &m_vecdata[0]; }
>+  iterator end (void) { return &m_vecdata[m_vecpfx.m_num]; }
>+  const_iterator begin (void) const { return &m_vecdata[0]; }
>+  const_iterator end (void) const { &m_vecdata[m_vecpfx.m_num]; }
>
>This is because raw pointers can be used as random access iterators.
>
>
>> > Then the
>> > typedef struct basic_block_def* basic_block;
>> > 
>> > is replaced with a wrapper class 'basic_block', which is just a
>simple
>> > POD wrapper around a basic_block_def*.  There should be no
>penalties
>> > compared to passing/storing raw pointers.  Because of the union
>with
>> > constructor restriction of C++98 an additional wrapper class
>> > 'basic_block_in_union' is required, which doesn't have any
>constructors
>> > defined.
>> > 
>> > Having 'basic_block' as a class allows putting typedefs for the
>edge
>> > iterator types in there (initially I tried putting the typedefs
>into
>> > struct basic_block_def, but gengtype would bail out).
>> 
>> namespacing like that seems a little messy, but so is vec_iterator or
>> such I guess.
>
>I'm not sure which part of the namespacing you're referring to exactly.
>The basic_block::edge_iterator thing?  Usually the iterator type is
>defined in the container type.  In this case it would be vec<edge,
>va_gc>.  The choice of the container type for storing edges is done in
>basic_block_def.  Thus, ideally the iterator type should be obtained
>from the basic_block_def class somehow.  A more bureaucratic way would
>be to have a typedef inside basic_block_def (which is not possible
>because of gengtype as mentioned before, so let's assume it's in
>basic_block)... 
>
>class basic_block
>{
>public:
>  typedef vec<edge, va_gc> edge_container;
>
>  edge_container& pred_edges (void);
>  edge_container& succ_edges (void);
>
>...
>};
>
>and then access the iterator via 
>for (basic_block::edge_container::iterator i = bb->bb->pred_edges
>().begin (); ...)
>
>Having to type out iterator types is a well known annoyance of C++98.
>Of course it's shorter to write
>for (edge_iterator i = ...)
>
>but that means, that there can be only one type of edge container ever.
>
>
>> > It would also be possible to have a free standing definition /
>typedef
>> > of edge_iterator, but it would conflict with the existing one and
>> > require too many changes at once.  Moreover, the iterator type
>actually
>> 
>> I bet it'll be a lot of work but changing everything seems nice so
>maybe
>> its worth just sitting down for a couple days and banging it out if
>it
>> gives nicer names?
>
>Nicer names than "edge_iterator" you mean?  I can't think of any at the
>moment... 
>
>> 
>> > depends on the container type, which is vec<edge, ...>, and the
>> > container type is defined/selected by the basic_block class.
>> 
>> I don't see how this is relevent
>
>I hope that the explanation above makes it somewhat clearer.
>
>> 
>> > The following
>> >   basic_block pred_bb = (*ei)->src;
>> > 
>> > can also be written as
>> >   basic_block pred_bb = ei->src;
>> > 
>> > after converting the edge typedef to a wrapper of edge_def*.
>> 
>> this is assuming you overload operator -> on the iterator? I'm a c++
>guy
>> not a stl guy, but that seems pretty dubious to me.
>
>Yes, that requires overloading of "operator ->".  However, in this case
>not in the iterator, but in the pointer wrapper as I've done it already
>in the patch for class basic_block (in the file basic_block2.h).  This
>is common practice for pointer wrappers (see e.g. std::shared_ptr).
>
>Overloading "operator ->" is also required in iterators.  See
>http://www.cplusplus.com/reference/iterator/
>If raw pointers are used as iterators (as in my example patch), there's
>nothing to overload for those of course.
>
>> > The idea of the approach is to allow co-existence of the new
>> > edge_iterator and the old and thus be able to gradually convert
>code.
>> > The wrappers around raw pointers also helo encapsulating the
>underlying
>> > memory management issues.  For example, it would be much easier to
>> > replace garbage collected objects with intrusive reference
>counting.
>> 
>> I don't think there's actually a memory management issue here,
>> edge_iterator can only work if you allocate it on the stack since its
>> not marked for gty, and afaik ggc doesn't scan the stack so the
>> edge_iterator can't keep the vector alive.  Now I think it would be
>nice
>> if these vectors moved out of gc memory, but I don't think this is
>> particularly helpful for that.
>
>Sorry, I think I caused a misunderstanding here.  By "memory management
>issue" I just meant the way a container stores its objects, like
>whether
>it's storing pointers to garbage collected objects, smart pointers like
>shared_ptr<edge> or whatever.  I didn't mean that the iterator should
>somehow influence the lifetime of the container.
>
>Cheers,
>Oleg
Oleg Endo Dec. 14, 2013, 12:54 p.m. UTC | #2
On Sat, 2013-12-14 at 10:19 +0100, Richard Biener wrote:
> Oleg Endo <oleg.endo@t-online.de> wrote:
> >On Thu, 2013-12-12 at 03:13 -0500, Trevor Saunders wrote:
> >> On Wed, Dec 11, 2013 at 06:47:37PM +0100, Oleg Endo wrote:
> >> > On Thu, 2013-11-21 at 00:04 +0100, Steven Bosscher wrote:
> >> > > Declaring the edge_iterator inside the for() is not a good
> >argument
> >> > > against FOR_EACH_EDGE. Of course, brownie points are up for grabs
> >for
> >> > > the brave soul daring enough to make edge iterators be proper C++
> >> > > iterators... ;-)
> >> 
> >> so, as a first question why do we have a special edge iterator at
> >all? it
> >> seems like we could just have a vec iterator and use that removing a
> >> bunch of indirection that seems pretty useless.
> >
> >I don't know why it's there.  Looks like a remainder from the pre-C++
> >code, as the conversion is being done step by step.
> 
> The fact that we iterate over a vector is an implementation detail that should be hidden.

Yep.  Actually there are two implementation details:

> The fact that we iterate over a vector
                1) ^^^^^^^     2) ^^^^^^

This is addressed in the STL by splitting iterators and algorithms.
Whether a specialized/optimized version of an algorithm is available
when used with a particular iterator type (or inherently container type)
becomes an implementation detail and does require changes to code that
uses algorithms or iterators.

Cheers,
Oleg
Trevor Saunders Dec. 16, 2013, 2:38 p.m. UTC | #3
On Sat, Dec 14, 2013 at 10:19:48AM +0100, Richard Biener wrote:
> Oleg Endo <oleg.endo@t-online.de> wrote:
> >On Thu, 2013-12-12 at 03:13 -0500, Trevor Saunders wrote:
> >> On Wed, Dec 11, 2013 at 06:47:37PM +0100, Oleg Endo wrote:
> >> > On Thu, 2013-11-21 at 00:04 +0100, Steven Bosscher wrote:
> >> > > Declaring the edge_iterator inside the for() is not a good
> >argument
> >> > > against FOR_EACH_EDGE. Of course, brownie points are up for grabs
> >for
> >> > > the brave soul daring enough to make edge iterators be proper C++
> >> > > iterators... ;-)
> >> 
> >> so, as a first question why do we have a special edge iterator at
> >all? it
> >> seems like we could just have a vec iterator and use that removing a
> >> bunch of indirection that seems pretty useless.
> >
> >I don't know why it's there.  Looks like a remainder from the pre-C++
> >code, as the conversion is being done step by step.
> 
> The fact that we iterate over a vector is an implementation detail that should be hidden.

presumably then c++ifying this stuff should result in the vectors being
private to basic_block with accessors that return an iterator or
something?

btw what is the reason for it to be an implementation detail that basic
block keeps edges in a vector instead it being the api that the accessor
returns a vector of edges?

Trev

> 
> Richard.
> 
> >> 
> >> > So, I gave it a try -- see the attached patch.
> >> > It allows edge iteration to look more like STL container iteration:
> >> > 
> >> > for (basic_block::edge_iterator ei = bb->pred_edges ().begin ();
> >> >      ei != bb->pred_edges ().end (); ++ei)
> >> > {
> >> >   basic_block pred_bb = (*ei)->src;
> >> >   ...
> >> > }
> >> 
> >> personally I'm not really a fan of overloading ++ / * that way, but I
> >> can't speak for anyone else.  I'd prefer something like
> >> 
> >> for (vec_iterator i = vec.forward_iterator (); !i.done (); i.next ())
> >> and
> >> for (backward_vec_iterator i = vec.backward_iterator (); !i.done ();
> >i.next ())
> >> 
> >> but that might break range base for loops?
> >
> >Right, that doesn't work with range-based for loops, since it doesn't
> >follow the standard concept of iteration.  For a more detailed
> >explanation, see also for example
> >http://www.codesynthesis.com/~boris/blog/2012/05/16/cxx11-range-based-for-loop/
> >
> >BTW, if you look at the patch, I haven't overloaded any ++ operators:
> >
> >Index: gcc/vec.h
> >===================================================================
> >--- gcc/vec.h	(revision 205866)
> >+++ gcc/vec.h	(working copy)
> >@@ -482,6 +482,15 @@
> >   void quick_grow (unsigned len);
> >   void quick_grow_cleared (unsigned len);
> > 
> >+  /* STL like iterator interface.  */
> >+  typedef T* iterator;
> >+  typedef const T* const_iterator;
> >+
> >+  iterator begin (void) { return &m_vecdata[0]; }
> >+  iterator end (void) { return &m_vecdata[m_vecpfx.m_num]; }
> >+  const_iterator begin (void) const { return &m_vecdata[0]; }
> >+  const_iterator end (void) const { &m_vecdata[m_vecpfx.m_num]; }
> >
> >This is because raw pointers can be used as random access iterators.
> >
> >
> >> > Then the
> >> > typedef struct basic_block_def* basic_block;
> >> > 
> >> > is replaced with a wrapper class 'basic_block', which is just a
> >simple
> >> > POD wrapper around a basic_block_def*.  There should be no
> >penalties
> >> > compared to passing/storing raw pointers.  Because of the union
> >with
> >> > constructor restriction of C++98 an additional wrapper class
> >> > 'basic_block_in_union' is required, which doesn't have any
> >constructors
> >> > defined.
> >> > 
> >> > Having 'basic_block' as a class allows putting typedefs for the
> >edge
> >> > iterator types in there (initially I tried putting the typedefs
> >into
> >> > struct basic_block_def, but gengtype would bail out).
> >> 
> >> namespacing like that seems a little messy, but so is vec_iterator or
> >> such I guess.
> >
> >I'm not sure which part of the namespacing you're referring to exactly.
> >The basic_block::edge_iterator thing?  Usually the iterator type is
> >defined in the container type.  In this case it would be vec<edge,
> >va_gc>.  The choice of the container type for storing edges is done in
> >basic_block_def.  Thus, ideally the iterator type should be obtained
> >from the basic_block_def class somehow.  A more bureaucratic way would
> >be to have a typedef inside basic_block_def (which is not possible
> >because of gengtype as mentioned before, so let's assume it's in
> >basic_block)... 
> >
> >class basic_block
> >{
> >public:
> >  typedef vec<edge, va_gc> edge_container;
> >
> >  edge_container& pred_edges (void);
> >  edge_container& succ_edges (void);
> >
> >...
> >};
> >
> >and then access the iterator via 
> >for (basic_block::edge_container::iterator i = bb->bb->pred_edges
> >().begin (); ...)
> >
> >Having to type out iterator types is a well known annoyance of C++98.
> >Of course it's shorter to write
> >for (edge_iterator i = ...)
> >
> >but that means, that there can be only one type of edge container ever.
> >
> >
> >> > It would also be possible to have a free standing definition /
> >typedef
> >> > of edge_iterator, but it would conflict with the existing one and
> >> > require too many changes at once.  Moreover, the iterator type
> >actually
> >> 
> >> I bet it'll be a lot of work but changing everything seems nice so
> >maybe
> >> its worth just sitting down for a couple days and banging it out if
> >it
> >> gives nicer names?
> >
> >Nicer names than "edge_iterator" you mean?  I can't think of any at the
> >moment... 
> >
> >> 
> >> > depends on the container type, which is vec<edge, ...>, and the
> >> > container type is defined/selected by the basic_block class.
> >> 
> >> I don't see how this is relevent
> >
> >I hope that the explanation above makes it somewhat clearer.
> >
> >> 
> >> > The following
> >> >   basic_block pred_bb = (*ei)->src;
> >> > 
> >> > can also be written as
> >> >   basic_block pred_bb = ei->src;
> >> > 
> >> > after converting the edge typedef to a wrapper of edge_def*.
> >> 
> >> this is assuming you overload operator -> on the iterator? I'm a c++
> >guy
> >> not a stl guy, but that seems pretty dubious to me.
> >
> >Yes, that requires overloading of "operator ->".  However, in this case
> >not in the iterator, but in the pointer wrapper as I've done it already
> >in the patch for class basic_block (in the file basic_block2.h).  This
> >is common practice for pointer wrappers (see e.g. std::shared_ptr).
> >
> >Overloading "operator ->" is also required in iterators.  See
> >http://www.cplusplus.com/reference/iterator/
> >If raw pointers are used as iterators (as in my example patch), there's
> >nothing to overload for those of course.
> >
> >> > The idea of the approach is to allow co-existence of the new
> >> > edge_iterator and the old and thus be able to gradually convert
> >code.
> >> > The wrappers around raw pointers also helo encapsulating the
> >underlying
> >> > memory management issues.  For example, it would be much easier to
> >> > replace garbage collected objects with intrusive reference
> >counting.
> >> 
> >> I don't think there's actually a memory management issue here,
> >> edge_iterator can only work if you allocate it on the stack since its
> >> not marked for gty, and afaik ggc doesn't scan the stack so the
> >> edge_iterator can't keep the vector alive.  Now I think it would be
> >nice
> >> if these vectors moved out of gc memory, but I don't think this is
> >> particularly helpful for that.
> >
> >Sorry, I think I caused a misunderstanding here.  By "memory management
> >issue" I just meant the way a container stores its objects, like
> >whether
> >it's storing pointers to garbage collected objects, smart pointers like
> >shared_ptr<edge> or whatever.  I didn't mean that the iterator should
> >somehow influence the lifetime of the container.
> >
> >Cheers,
> >Oleg
> 
>
Trevor Saunders Dec. 16, 2013, 2:57 p.m. UTC | #4
On Fri, Dec 13, 2013 at 10:00:57PM +0100, Oleg Endo wrote:
> On Thu, 2013-12-12 at 03:13 -0500, Trevor Saunders wrote:
> > On Wed, Dec 11, 2013 at 06:47:37PM +0100, Oleg Endo wrote:
> > > On Thu, 2013-11-21 at 00:04 +0100, Steven Bosscher wrote:
> > > > Declaring the edge_iterator inside the for() is not a good argument
> > > > against FOR_EACH_EDGE. Of course, brownie points are up for grabs for
> > > > the brave soul daring enough to make edge iterators be proper C++
> > > > iterators... ;-)
> > 
> > so, as a first question why do we have a special edge iterator at all? it
> > seems like we could just have a vec iterator and use that removing a
> > bunch of indirection that seems pretty useless.
> 
> I don't know why it's there.  Looks like a remainder from the pre-C++
> code, as the conversion is being done step by step.
> 
> > 
> > > So, I gave it a try -- see the attached patch.
> > > It allows edge iteration to look more like STL container iteration:
> > > 
> > > for (basic_block::edge_iterator ei = bb->pred_edges ().begin ();
> > >      ei != bb->pred_edges ().end (); ++ei)
> > > {
> > >   basic_block pred_bb = (*ei)->src;
> > >   ...
> > > }
> > 
> > personally I'm not really a fan of overloading ++ / * that way, but I
> > can't speak for anyone else.  I'd prefer something like
> > 
> > for (vec_iterator i = vec.forward_iterator (); !i.done (); i.next ())
> > and
> > for (backward_vec_iterator i = vec.backward_iterator (); !i.done (); i.next ())
> > 
> > but that might break range base for loops?
> 
> Right, that doesn't work with range-based for loops, since it doesn't
> follow the standard concept of iteration.  For a more detailed
> explanation, see also for example
> http://www.codesynthesis.com/~boris/blog/2012/05/16/cxx11-range-based-for-loop/
> 
> BTW, if you look at the patch, I haven't overloaded any ++ operators:
> 
> Index: gcc/vec.h
> ===================================================================
> --- gcc/vec.h	(revision 205866)
> +++ gcc/vec.h	(working copy)
> @@ -482,6 +482,15 @@
>    void quick_grow (unsigned len);
>    void quick_grow_cleared (unsigned len);
>  
> +  /* STL like iterator interface.  */
> +  typedef T* iterator;
> +  typedef const T* const_iterator;
> +
> +  iterator begin (void) { return &m_vecdata[0]; }
> +  iterator end (void) { return &m_vecdata[m_vecpfx.m_num]; }
> +  const_iterator begin (void) const { return &m_vecdata[0]; }
> +  const_iterator end (void) const { &m_vecdata[m_vecpfx.m_num]; }
> 
> This is because raw pointers can be used as random access iterators.

yeah, somehow didn't occur to me T* just works (not that it wouldn't be
easier to see if the type had * in it)

> > > Then the
> > > typedef struct basic_block_def* basic_block;
> > > 
> > > is replaced with a wrapper class 'basic_block', which is just a simple
> > > POD wrapper around a basic_block_def*.  There should be no penalties
> > > compared to passing/storing raw pointers.  Because of the union with
> > > constructor restriction of C++98 an additional wrapper class
> > > 'basic_block_in_union' is required, which doesn't have any constructors
> > > defined.
> > > 
> > > Having 'basic_block' as a class allows putting typedefs for the edge
> > > iterator types in there (initially I tried putting the typedefs into
> > > struct basic_block_def, but gengtype would bail out).
> > 
> > namespacing like that seems a little messy, but so is vec_iterator or
> > such I guess.
> 
> I'm not sure which part of the namespacing you're referring to exactly.

I'm saying I'd rather have edge_iterator than
basic_blokc::edge_iterator.

> The basic_block::edge_iterator thing?  Usually the iterator type is
> defined in the container type.  In this case it would be vec<edge,
> va_gc>.  The choice of the container type for storing edges is done in
> basic_block_def.  Thus, ideally the iterator type should be obtained
> from the basic_block_def class somehow.  A more bureaucratic way would
> be to have a typedef inside basic_block_def (which is not possible
> because of gengtype as mentioned before, so let's assume it's in
> basic_block)... 
> 
> class basic_block
> {
> public:
>   typedef vec<edge, va_gc> edge_container;
> 
>   edge_container& pred_edges (void);
>   edge_container& succ_edges (void);
> 
> ...
> };
> 
> and then access the iterator via 
> for (basic_block::edge_container::iterator i = bb->bb->pred_edges
> ().begin (); ...)
> 
> Having to type out iterator types is a well known annoyance of C++98.
> Of course it's shorter to write
> for (edge_iterator i = ...)

actually what I really want to do is make returning a const vector part
of the api for getting the edge list from a basic block and then just
write

for (edge *e = vec.begin (); e != vec.end (); e++)

which imo makes it clear what's happening and is short, but I guess
richi doesn't like the interface being that a vector is returned :/

> but that means, that there can be only one type of edge container ever.

gcc seems to have survived for a long time with a single edge_iterator
so I don't see much reason to worry about allowing
basic_block::edge_iterator and foobar::edge_iterator.

> > > It would also be possible to have a free standing definition / typedef
> > > of edge_iterator, but it would conflict with the existing one and
> > > require too many changes at once.  Moreover, the iterator type actually
> > 
> > I bet it'll be a lot of work but changing everything seems nice so maybe
> > its worth just sitting down for a couple days and banging it out if it
> > gives nicer names?
> 
> Nicer names than "edge_iterator" you mean?  I can't think of any at the
> moment... 

well, I meant edge_iterator instead of basic_block::edge_iterator which
afaik is what you proposed?

> > > depends on the container type, which is vec<edge, ...>, and the
> > > container type is defined/selected by the basic_block class.
> > 
> > I don't see how this is relevent
> 
> I hope that the explanation above makes it somewhat clearer.
> 
> > 
> > > The following
> > >   basic_block pred_bb = (*ei)->src;
> > > 
> > > can also be written as
> > >   basic_block pred_bb = ei->src;
> > > 
> > > after converting the edge typedef to a wrapper of edge_def*.
> > 
> > this is assuming you overload operator -> on the iterator? I'm a c++ guy
> > not a stl guy, but that seems pretty dubious to me.
> 
> Yes, that requires overloading of "operator ->".  However, in this case
> not in the iterator, but in the pointer wrapper as I've done it already
> in the patch for class basic_block (in the file basic_block2.h).  This
> is common practice for pointer wrappers (see e.g. std::shared_ptr).

I know doing this for pointer wrappers is common and I think that's
fine.  However I'm not really convinced we need to or should make
basic_block a pointer wrapper class, I'd wrather see basic_block_def
become basic_block, and use basic_block * all over.

> Overloading "operator ->" is also required in iterators.  See
> http://www.cplusplus.com/reference/iterator/
> If raw pointers are used as iterators (as in my example patch), there's
> nothing to overload for those of course.

yeah, I'm not really a fan of that style of iterator :/

> > > The idea of the approach is to allow co-existence of the new
> > > edge_iterator and the old and thus be able to gradually convert code.
> > > The wrappers around raw pointers also helo encapsulating the underlying
> > > memory management issues.  For example, it would be much easier to
> > > replace garbage collected objects with intrusive reference counting.
> > 
> > I don't think there's actually a memory management issue here,
> > edge_iterator can only work if you allocate it on the stack since its
> > not marked for gty, and afaik ggc doesn't scan the stack so the
> > edge_iterator can't keep the vector alive.  Now I think it would be nice
> > if these vectors moved out of gc memory, but I don't think this is
> > particularly helpful for that.
> 
> Sorry, I think I caused a misunderstanding here.  By "memory management
> issue" I just meant the way a container stores its objects, like whether
> it's storing pointers to garbage collected objects, smart pointers like
> shared_ptr<edge> or whatever.  I didn't mean that the iterator should
> somehow influence the lifetime of the container.

 sure, I agree C++ makes memory management easier.

Trev

> Cheers,
> Oleg
>
Oleg Endo Dec. 16, 2013, 3:44 p.m. UTC | #5
On Mon, 2013-12-16 at 09:57 -0500, Trevor Saunders wrote:
> > 
> > BTW, if you look at the patch, I haven't overloaded any ++ operators:
> > 
> > Index: gcc/vec.h
> > ===================================================================
> > --- gcc/vec.h	(revision 205866)
> > +++ gcc/vec.h	(working copy)
> > @@ -482,6 +482,15 @@
> >    void quick_grow (unsigned len);
> >    void quick_grow_cleared (unsigned len);
> >  
> > +  /* STL like iterator interface.  */
> > +  typedef T* iterator;
> > +  typedef const T* const_iterator;
> > +
> > +  iterator begin (void) { return &m_vecdata[0]; }
> > +  iterator end (void) { return &m_vecdata[m_vecpfx.m_num]; }
> > +  const_iterator begin (void) const { return &m_vecdata[0]; }
> > +  const_iterator end (void) const { &m_vecdata[m_vecpfx.m_num]; }
> > 
> > This is because raw pointers can be used as random access iterators.

> yeah, somehow didn't occur to me T* just works (not that it wouldn't be
> easier to see if the type had * in it)
> 

Actually, it's the other way around.  (STL) Iterators behave like
pointers.  The only difference is that depending on the iterator
capabilities (which depend on the container type), the amount of pointer
arithmetic that can be done with an iterator is restricted, where a
random access iterator is a like a raw pointer.

> actually what I really want to do is make returning a const vector part
> of the api for getting the edge list from a basic block and then just
> write
> 
> for (edge *e = vec.begin (); e != vec.end (); e++)
> 
> which imo makes it clear what's happening and is short, but I guess
> richi doesn't like the interface being that a vector is returned :/

The above usage assumes that the container stores elements in contiguous
memory.  I know it's highly hypothetical in this case, but if one day
somebody wanted to try out a more suitable container to store edges in
basic blocks (let's say a linked list), it would be slightly impossible
to replace it without touching all the code.
Apart from that, I guess iteration over a linked list or some kind of
tree would look totally different.  I think iteration should look the
same regardless of the container type being used.  It makes it easier to
jump into foreign code and helps focusing on the actual problem the code
is trying to solve.


> > but that means, that there can be only one type of edge container ever.
> 
> gcc seems to have survived for a long time with a single edge_iterator
> so I don't see much reason to worry about allowing
> basic_block::edge_iterator and foobar::edge_iterator.

Yes, having one freestanding edge_iterator type is probably sufficient
for the near future.  I just didn't want to break existing code.

> well, I meant edge_iterator instead of basic_block::edge_iterator which
> afaik is what you proposed?

Yes, I was proposing to have a basic_block::edge_iterator or
basic_block::edge_container::iterator or something similar.

> I know doing this for pointer wrappers is common and I think that's
> fine.  However I'm not really convinced we need to or should make
> basic_block a pointer wrapper class, I'd wrather see basic_block_def
> become basic_block, and use basic_block * all over.

That could also be an option.  Although having pointer wrappers already
in place might open some other opportunities in the future.  For
instance it would make it relatively easy to try out reducing the number
of garbage collected objects by adding smart pointer functionality to
the pointer wrapper.

> 
> > Overloading "operator ->" is also required in iterators.  See
> > http://www.cplusplus.com/reference/iterator/
> > If raw pointers are used as iterators (as in my example patch), there's
> > nothing to overload for those of course.
> 
> yeah, I'm not really a fan of that style of iterator :/

But in the end, that iterator style allows writing code as you suggested
above:

> for (edge *e = vec.begin (); e != vec.end (); e++)

The only difference is that you don't write "edge*" but e.g.
"vec<edge>::iterator e", or just "auto" in C++11.

Cheers,
Oleg
Trevor Saunders Dec. 16, 2013, 4:15 p.m. UTC | #6
On Mon, Dec 16, 2013 at 04:44:00PM +0100, Oleg Endo wrote:
> On Mon, 2013-12-16 at 09:57 -0500, Trevor Saunders wrote:
> > > 
> > > BTW, if you look at the patch, I haven't overloaded any ++ operators:
> > > 
> > > Index: gcc/vec.h
> > > ===================================================================
> > > --- gcc/vec.h	(revision 205866)
> > > +++ gcc/vec.h	(working copy)
> > > @@ -482,6 +482,15 @@
> > >    void quick_grow (unsigned len);
> > >    void quick_grow_cleared (unsigned len);
> > >  
> > > +  /* STL like iterator interface.  */
> > > +  typedef T* iterator;
> > > +  typedef const T* const_iterator;
> > > +
> > > +  iterator begin (void) { return &m_vecdata[0]; }
> > > +  iterator end (void) { return &m_vecdata[m_vecpfx.m_num]; }
> > > +  const_iterator begin (void) const { return &m_vecdata[0]; }
> > > +  const_iterator end (void) const { &m_vecdata[m_vecpfx.m_num]; }
> > > 
> > > This is because raw pointers can be used as random access iterators.
> 
> > yeah, somehow didn't occur to me T* just works (not that it wouldn't be
> > easier to see if the type had * in it)
> > 
> 
> Actually, it's the other way around.  (STL) Iterators behave like
> pointers.  The only difference is that depending on the iterator
> capabilities (which depend on the container type), the amount of pointer
> arithmetic that can be done with an iterator is restricted, where a
> random access iterator is a like a raw pointer.

Well, iterators are sort of irrelevent to what I was commenting on, the
point is the original code

for (iterator_type i = thing.begin (); i != thing.end (); i++)
works because i is a pointer and they define all the necessary
operations.

> > actually what I really want to do is make returning a const vector part
> > of the api for getting the edge list from a basic block and then just
> > write
> > 
> > for (edge *e = vec.begin (); e != vec.end (); e++)
> > 
> > which imo makes it clear what's happening and is short, but I guess
> > richi doesn't like the interface being that a vector is returned :/
> 
> The above usage assumes that the container stores elements in contiguous
> memory.  I know it's highly hypothetical in this case, but if one day
> somebody wanted to try out a more suitable container to store edges in
> basic blocks (let's say a linked list), it would be slightly impossible
> to replace it without touching all the code.

sure, you'd have to either touch all the code or leave this interface
with a slow wrapper that converted a linked list or whatever to a
vector.

> Apart from that, I guess iteration over a linked list or some kind of
> tree would look totally different.  I think iteration should look the
> same regardless of the container type being used.  It makes it easier to
> jump into foreign code and helps focusing on the actual problem the code
> is trying to solve.

I'd disagree, vectors linked lists and trees all work differently and
have totally different performance characteristics, so it makes sense
they have different interfaces.

> > > but that means, that there can be only one type of edge container ever.
> > 
> > gcc seems to have survived for a long time with a single edge_iterator
> > so I don't see much reason to worry about allowing
> > basic_block::edge_iterator and foobar::edge_iterator.
> 
> Yes, having one freestanding edge_iterator type is probably sufficient
> for the near future.  I just didn't want to break existing code.

and I was suggesting we just change all the existing code.

> > well, I meant edge_iterator instead of basic_block::edge_iterator which
> > afaik is what you proposed?
> 
> Yes, I was proposing to have a basic_block::edge_iterator or
> basic_block::edge_container::iterator or something similar.
> 
> > I know doing this for pointer wrappers is common and I think that's
> > fine.  However I'm not really convinced we need to or should make
> > basic_block a pointer wrapper class, I'd wrather see basic_block_def
> > become basic_block, and use basic_block * all over.
> 
> That could also be an option.  Although having pointer wrappers already
> in place might open some other opportunities in the future.  For
> instance it would make it relatively easy to try out reducing the number
> of garbage collected objects by adding smart pointer functionality to
> the pointer wrapper.

Well, I'd expect freeing stuff owned by basic_block should happen in
basic_block::~basic_block not some pointer wrapper which would be more
error prone.  However making that happen is kind of tricky while
basic_block is in gc memory, I've been idly considering allowing gentype
to call dtors somehow...

> > > Overloading "operator ->" is also required in iterators.  See
> > > http://www.cplusplus.com/reference/iterator/
> > > If raw pointers are used as iterators (as in my example patch), there's
> > > nothing to overload for those of course.
> > 
> > yeah, I'm not really a fan of that style of iterator :/
> 
> But in the end, that iterator style allows writing code as you suggested
> above:
> 
> > for (edge *e = vec.begin (); e != vec.end (); e++)

sure, I think writing that for a vector makes sense, though I might
manually pull the vec.end() out of the loop at which point I might just
use indicies instead of end() / begin(), but I don't think it makes much
sense to write that to iterate over a linked list.

Trev


> 
> The only difference is that you don't write "edge*" but e.g.
> "vec<edge>::iterator e", or just "auto" in C++11.
> 
> Cheers,
> Oleg
>
Oleg Endo Dec. 16, 2013, 5:26 p.m. UTC | #7
On Mon, 2013-12-16 at 11:15 -0500, Trevor Saunders wrote:

> > That could also be an option.  Although having pointer wrappers already
> > in place might open some other opportunities in the future.  For
> > instance it would make it relatively easy to try out reducing the number
> > of garbage collected objects by adding smart pointer functionality to
> > the pointer wrapper.
> 
> Well, I'd expect freeing stuff owned by basic_block should happen in
> basic_block::~basic_block not some pointer wrapper which would be more
> error prone.  However making that happen is kind of tricky while
> basic_block is in gc memory, I've been idly considering allowing gentype
> to call dtors somehow...

Usually smart pointers (i.e. pointer wrappers) don't delete aggregated
stuff in the pointee, but delete the pointee only (if at all).  The
pointee's destructor is then supposed to free whatever it needs to,
which would meet your expectations.

> > 
> > > for (edge *e = vec.begin (); e != vec.end (); e++)
> 
> sure, I think writing that for a vector makes sense, though I might
> manually pull the vec.end() out of the loop

With which intention?

Cheers,
Oleg
diff mbox

Patch

Index: gcc/vec.h
===================================================================
--- gcc/vec.h	(revision 205866)
+++ gcc/vec.h	(working copy)
@@ -482,6 +482,15 @@ 
   void quick_grow (unsigned len);
   void quick_grow_cleared (unsigned len);
 
+  /* STL like iterator interface.  */
+  typedef T* iterator;
+  typedef const T* const_iterator;
+
+  iterator begin (void) { return &m_vecdata[0]; }
+  iterator end (void) { return &m_vecdata[m_vecpfx.m_num]; }
+  const_iterator begin (void) const { return &m_vecdata[0]; }
+  const_iterator end (void) const { &m_vecdata[m_vecpfx.m_num]; }

This is because raw pointers can be used as random access iterators.