diff mbox

[gimple-classes,committed,4/6] tree-ssa-tail-merge.c: Use gassign

Message ID 1415373690-26193-5-git-send-email-dmalcolm@redhat.com
State New
Headers show

Commit Message

David Malcolm Nov. 7, 2014, 3:21 p.m. UTC
gcc/ChangeLog.gimple-classes:
	* tree-ssa-tail-merge.c (same_succ_hash): Add checked cast.
	(gimple_equal_p): Add checked casts.
---
 gcc/ChangeLog.gimple-classes | 5 +++++
 gcc/tree-ssa-tail-merge.c    | 8 +++++---
 2 files changed, 10 insertions(+), 3 deletions(-)

Comments

Richard Biener Nov. 7, 2014, 9:01 p.m. UTC | #1
On Fri, Nov 7, 2014 at 4:21 PM, David Malcolm <dmalcolm@redhat.com> wrote:
> gcc/ChangeLog.gimple-classes:
>         * tree-ssa-tail-merge.c (same_succ_hash): Add checked cast.
>         (gimple_equal_p): Add checked casts.
> ---
>  gcc/ChangeLog.gimple-classes | 5 +++++
>  gcc/tree-ssa-tail-merge.c    | 8 +++++---
>  2 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/ChangeLog.gimple-classes b/gcc/ChangeLog.gimple-classes
> index f43df63..0bd0421 100644
> --- a/gcc/ChangeLog.gimple-classes
> +++ b/gcc/ChangeLog.gimple-classes
> @@ -1,5 +1,10 @@
>  2014-11-06  David Malcolm  <dmalcolm@redhat.com>
>
> +       * tree-ssa-tail-merge.c (same_succ_hash): Add checked cast.
> +       (gimple_equal_p): Add checked casts.
> +
> +2014-11-06  David Malcolm  <dmalcolm@redhat.com>
> +
>         * tree-ssa-structalias.c (find_func_aliases): Replace
>         is_gimple_assign with a dyn_cast, introducing local gassign *
>         "t_assign", using it in place of "t" for typesafety.
> diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
> index 5678657..b822214 100644
> --- a/gcc/tree-ssa-tail-merge.c
> +++ b/gcc/tree-ssa-tail-merge.c
> @@ -484,7 +484,7 @@ same_succ_hash (const_same_succ e)
>
>        hstate.add_int (gimple_code (stmt));
>        if (is_gimple_assign (stmt))
> -       hstate.add_int (gimple_assign_rhs_code (stmt));
> +       hstate.add_int (gimple_assign_rhs_code (as_a <gassign *> (stmt)));
>        if (!is_gimple_call (stmt))
>         continue;
>        if (gimple_call_internal_p (stmt))
> @@ -1172,8 +1172,10 @@ gimple_equal_p (same_succ same_succ, gimple s1, gimple s2)
>        if (TREE_CODE (lhs1) != SSA_NAME
>           && TREE_CODE (lhs2) != SSA_NAME)
>         return (operand_equal_p (lhs1, lhs2, 0)
> -               && gimple_operand_equal_value_p (gimple_assign_rhs1 (s1),
> -                                                gimple_assign_rhs1 (s2)));
> +               && gimple_operand_equal_value_p (gimple_assign_rhs1 (
> +                                                  as_a <gassign *> (s1)),
> +                                                gimple_assign_rhs1 (
> +                                                  as_a <gassign *> (s2))));

Just a comment as these patches flow by - I think this is a huge step
backwards from "enforcing" s1/s2 being a gimple_assign inside
gimple_assign_rhs1 to this as_a <gassign *> boilerplate at _each_ callsite!

Which means this step of the refactoring is totally broken and probably
requires much more manual work to avoid this kind of uglyness.

I definitely won't approve of this kind of changes.

Thanks,
Richard.

>        else if (TREE_CODE (lhs1) == SSA_NAME
>                && TREE_CODE (lhs2) == SSA_NAME)
>         return vn_valueize (lhs1) == vn_valueize (lhs2);
> --
> 1.7.11.7
>
Jakub Jelinek Nov. 7, 2014, 9:23 p.m. UTC | #2
On Fri, Nov 07, 2014 at 10:01:45PM +0100, Richard Biener wrote:
> > --- a/gcc/tree-ssa-tail-merge.c
> > +++ b/gcc/tree-ssa-tail-merge.c
> > @@ -484,7 +484,7 @@ same_succ_hash (const_same_succ e)
> >
> >        hstate.add_int (gimple_code (stmt));
> >        if (is_gimple_assign (stmt))
> > -       hstate.add_int (gimple_assign_rhs_code (stmt));
> > +       hstate.add_int (gimple_assign_rhs_code (as_a <gassign *> (stmt)));
> >        if (!is_gimple_call (stmt))
> >         continue;
> >        if (gimple_call_internal_p (stmt))
> > @@ -1172,8 +1172,10 @@ gimple_equal_p (same_succ same_succ, gimple s1, gimple s2)
> >        if (TREE_CODE (lhs1) != SSA_NAME
> >           && TREE_CODE (lhs2) != SSA_NAME)
> >         return (operand_equal_p (lhs1, lhs2, 0)
> > -               && gimple_operand_equal_value_p (gimple_assign_rhs1 (s1),
> > -                                                gimple_assign_rhs1 (s2)));
> > +               && gimple_operand_equal_value_p (gimple_assign_rhs1 (
> > +                                                  as_a <gassign *> (s1)),
> > +                                                gimple_assign_rhs1 (
> > +                                                  as_a <gassign *> (s2))));
> 
> Just a comment as these patches flow by - I think this is a huge step
> backwards from "enforcing" s1/s2 being a gimple_assign inside
> gimple_assign_rhs1 to this as_a <gassign *> boilerplate at _each_ callsite!
> 
> Which means this step of the refactoring is totally broken and probably
> requires much more manual work to avoid this kind of uglyness.
> 
> I definitely won't approve of this kind of changes.

I have to agree with this, this is too ugly to live with.
I must say I don't find anything wrong with what we have right now,
unlike RTL checking, the gimple checking is inexpensive, and much better
to do it that way then enforce all all developers to write it this way.
Otherwise we'll end up with code as ugly as in LLVM :(.

	Jakub
Marek Polacek Nov. 8, 2014, 10:13 a.m. UTC | #3
On Fri, Nov 07, 2014 at 10:01:45PM +0100, Richard Biener wrote:
> Just a comment as these patches flow by - I think this is a huge step
> backwards from "enforcing" s1/s2 being a gimple_assign inside
> gimple_assign_rhs1 to this as_a <gassign *> boilerplate at _each_ callsite!

FWIW, I feel the same way.  More to type, worse readability, a lot
more of line-wrapping.

Sorry,

	Marek
Richard Biener Nov. 8, 2014, 12:07 p.m. UTC | #4
On Fri, Nov 7, 2014 at 10:01 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Fri, Nov 7, 2014 at 4:21 PM, David Malcolm <dmalcolm@redhat.com> wrote:
>> gcc/ChangeLog.gimple-classes:
>>         * tree-ssa-tail-merge.c (same_succ_hash): Add checked cast.
>>         (gimple_equal_p): Add checked casts.
>> ---
>>  gcc/ChangeLog.gimple-classes | 5 +++++
>>  gcc/tree-ssa-tail-merge.c    | 8 +++++---
>>  2 files changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/gcc/ChangeLog.gimple-classes b/gcc/ChangeLog.gimple-classes
>> index f43df63..0bd0421 100644
>> --- a/gcc/ChangeLog.gimple-classes
>> +++ b/gcc/ChangeLog.gimple-classes
>> @@ -1,5 +1,10 @@
>>  2014-11-06  David Malcolm  <dmalcolm@redhat.com>
>>
>> +       * tree-ssa-tail-merge.c (same_succ_hash): Add checked cast.
>> +       (gimple_equal_p): Add checked casts.
>> +
>> +2014-11-06  David Malcolm  <dmalcolm@redhat.com>
>> +
>>         * tree-ssa-structalias.c (find_func_aliases): Replace
>>         is_gimple_assign with a dyn_cast, introducing local gassign *
>>         "t_assign", using it in place of "t" for typesafety.
>> diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
>> index 5678657..b822214 100644
>> --- a/gcc/tree-ssa-tail-merge.c
>> +++ b/gcc/tree-ssa-tail-merge.c
>> @@ -484,7 +484,7 @@ same_succ_hash (const_same_succ e)
>>
>>        hstate.add_int (gimple_code (stmt));
>>        if (is_gimple_assign (stmt))
>> -       hstate.add_int (gimple_assign_rhs_code (stmt));
>> +       hstate.add_int (gimple_assign_rhs_code (as_a <gassign *> (stmt)));
>>        if (!is_gimple_call (stmt))
>>         continue;
>>        if (gimple_call_internal_p (stmt))
>> @@ -1172,8 +1172,10 @@ gimple_equal_p (same_succ same_succ, gimple s1, gimple s2)
>>        if (TREE_CODE (lhs1) != SSA_NAME
>>           && TREE_CODE (lhs2) != SSA_NAME)
>>         return (operand_equal_p (lhs1, lhs2, 0)
>> -               && gimple_operand_equal_value_p (gimple_assign_rhs1 (s1),
>> -                                                gimple_assign_rhs1 (s2)));
>> +               && gimple_operand_equal_value_p (gimple_assign_rhs1 (
>> +                                                  as_a <gassign *> (s1)),
>> +                                                gimple_assign_rhs1 (
>> +                                                  as_a <gassign *> (s2))));
>
> Just a comment as these patches flow by - I think this is a huge step
> backwards from "enforcing" s1/s2 being a gimple_assign inside
> gimple_assign_rhs1 to this as_a <gassign *> boilerplate at _each_ callsite!
>
> Which means this step of the refactoring is totally broken and probably
> requires much more manual work to avoid this kind of uglyness.
>
> I definitely won't approve of this kind of changes.

To be constructive here - the above case is from within a
GIMPLE_ASSIGN case label
and thus I'd have expected

    case GIMPLE_ASSIGN:
      {
        gassign *a1 = as_a <gassign *> (s1);
        gassign *a2 = as_a <gassign *> (s2);
      lhs1 = gimple_assign_lhs (a1);
      lhs2 = gimple_assign_lhs (a2);
      if (TREE_CODE (lhs1) != SSA_NAME
          && TREE_CODE (lhs2) != SSA_NAME)
        return (operand_equal_p (lhs1, lhs2, 0)
                && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
                                                 gimple_assign_rhs1 (a2)));
      else if (TREE_CODE (lhs1) == SSA_NAME
               && TREE_CODE (lhs2) == SSA_NAME)
        return vn_valueize (lhs1) == vn_valueize (lhs2);
      return false;
      }

instead.  That's the kind of changes I have expected and have approved of.

Thanks,
Richard.

> Thanks,
> Richard.
>
>>        else if (TREE_CODE (lhs1) == SSA_NAME
>>                && TREE_CODE (lhs2) == SSA_NAME)
>>         return vn_valueize (lhs1) == vn_valueize (lhs2);
>> --
>> 1.7.11.7
>>
Jakub Jelinek Nov. 8, 2014, 1:56 p.m. UTC | #5
On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
> To be constructive here - the above case is from within a
> GIMPLE_ASSIGN case label
> and thus I'd have expected
> 
>     case GIMPLE_ASSIGN:
>       {
>         gassign *a1 = as_a <gassign *> (s1);
>         gassign *a2 = as_a <gassign *> (s2);
>       lhs1 = gimple_assign_lhs (a1);
>       lhs2 = gimple_assign_lhs (a2);
>       if (TREE_CODE (lhs1) != SSA_NAME
>           && TREE_CODE (lhs2) != SSA_NAME)
>         return (operand_equal_p (lhs1, lhs2, 0)
>                 && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
>                                                  gimple_assign_rhs1 (a2)));
>       else if (TREE_CODE (lhs1) == SSA_NAME
>                && TREE_CODE (lhs2) == SSA_NAME)
>         return vn_valueize (lhs1) == vn_valueize (lhs2);
>       return false;
>       }
> 
> instead.  That's the kind of changes I have expected and have approved of.

But even that looks like just adding extra work for all developers, with no
gain.  You only have to add extra code and extra temporaries, in switches
typically also have to add {} because of the temporaries and thus extra
indentation level, and it doesn't simplify anything in the code.

	Jakub
David Malcolm Nov. 8, 2014, 2 p.m. UTC | #6
On Sat, 2014-11-08 at 13:07 +0100, Richard Biener wrote:
> On Fri, Nov 7, 2014 at 10:01 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
> > On Fri, Nov 7, 2014 at 4:21 PM, David Malcolm <dmalcolm@redhat.com> wrote:
> >> gcc/ChangeLog.gimple-classes:
> >>         * tree-ssa-tail-merge.c (same_succ_hash): Add checked cast.
> >>         (gimple_equal_p): Add checked casts.
> >> ---
> >>  gcc/ChangeLog.gimple-classes | 5 +++++
> >>  gcc/tree-ssa-tail-merge.c    | 8 +++++---
> >>  2 files changed, 10 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/gcc/ChangeLog.gimple-classes b/gcc/ChangeLog.gimple-classes
> >> index f43df63..0bd0421 100644
> >> --- a/gcc/ChangeLog.gimple-classes
> >> +++ b/gcc/ChangeLog.gimple-classes
> >> @@ -1,5 +1,10 @@
> >>  2014-11-06  David Malcolm  <dmalcolm@redhat.com>
> >>
> >> +       * tree-ssa-tail-merge.c (same_succ_hash): Add checked cast.
> >> +       (gimple_equal_p): Add checked casts.
> >> +
> >> +2014-11-06  David Malcolm  <dmalcolm@redhat.com>
> >> +
> >>         * tree-ssa-structalias.c (find_func_aliases): Replace
> >>         is_gimple_assign with a dyn_cast, introducing local gassign *
> >>         "t_assign", using it in place of "t" for typesafety.
> >> diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
> >> index 5678657..b822214 100644
> >> --- a/gcc/tree-ssa-tail-merge.c
> >> +++ b/gcc/tree-ssa-tail-merge.c
> >> @@ -484,7 +484,7 @@ same_succ_hash (const_same_succ e)
> >>
> >>        hstate.add_int (gimple_code (stmt));
> >>        if (is_gimple_assign (stmt))
> >> -       hstate.add_int (gimple_assign_rhs_code (stmt));
> >> +       hstate.add_int (gimple_assign_rhs_code (as_a <gassign *> (stmt)));
> >>        if (!is_gimple_call (stmt))
> >>         continue;
> >>        if (gimple_call_internal_p (stmt))
> >> @@ -1172,8 +1172,10 @@ gimple_equal_p (same_succ same_succ, gimple s1, gimple s2)
> >>        if (TREE_CODE (lhs1) != SSA_NAME
> >>           && TREE_CODE (lhs2) != SSA_NAME)
> >>         return (operand_equal_p (lhs1, lhs2, 0)
> >> -               && gimple_operand_equal_value_p (gimple_assign_rhs1 (s1),
> >> -                                                gimple_assign_rhs1 (s2)));
> >> +               && gimple_operand_equal_value_p (gimple_assign_rhs1 (
> >> +                                                  as_a <gassign *> (s1)),
> >> +                                                gimple_assign_rhs1 (
> >> +                                                  as_a <gassign *> (s2))));
> >
> > Just a comment as these patches flow by - I think this is a huge step
> > backwards from "enforcing" s1/s2 being a gimple_assign inside
> > gimple_assign_rhs1 to this as_a <gassign *> boilerplate at _each_ callsite!
> >
> > Which means this step of the refactoring is totally broken and probably
> > requires much more manual work to avoid this kind of uglyness.
> >
> > I definitely won't approve of this kind of changes.
> 
> To be constructive here - the above case is from within a
> GIMPLE_ASSIGN case label
> and thus I'd have expected
> 
>     case GIMPLE_ASSIGN:
>       {
>         gassign *a1 = as_a <gassign *> (s1);
>         gassign *a2 = as_a <gassign *> (s2);
>       lhs1 = gimple_assign_lhs (a1);
>       lhs2 = gimple_assign_lhs (a2);
>       if (TREE_CODE (lhs1) != SSA_NAME
>           && TREE_CODE (lhs2) != SSA_NAME)
>         return (operand_equal_p (lhs1, lhs2, 0)
>                 && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
>                                                  gimple_assign_rhs1 (a2)));
>       else if (TREE_CODE (lhs1) == SSA_NAME
>                && TREE_CODE (lhs2) == SSA_NAME)
>         return vn_valueize (lhs1) == vn_valueize (lhs2);
>       return false;
>       }
> 
> instead.  That's the kind of changes I have expected and have approved of.

I do make the above kind of change in some places within the
gimple-classes branch.

I think I didn't do it in this case because the body of the "case
GIMPLE_ASSIGN" doesn't yet have braces, so adding locals requires adding
them and re-indenting the case body.  I didn't spot the opportunity to
speed up the code as you do above by converting the two gimple_get_lhs
to gimple_assign_lhs.  Without that, I guess I decided to simply add the
two as_a<> directly in-place to avoid the reindent.   With your speedup
it's clearly better to reindent the code.


(Got to go now, sorry; I hope to write a better reply on Monday)

Thanks
Dave
David Malcolm Nov. 10, 2014, 10:27 p.m. UTC | #7
On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
> On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
> > To be constructive here - the above case is from within a
> > GIMPLE_ASSIGN case label
> > and thus I'd have expected
> > 
> >     case GIMPLE_ASSIGN:
> >       {
> >         gassign *a1 = as_a <gassign *> (s1);
> >         gassign *a2 = as_a <gassign *> (s2);
> >       lhs1 = gimple_assign_lhs (a1);
> >       lhs2 = gimple_assign_lhs (a2);
> >       if (TREE_CODE (lhs1) != SSA_NAME
> >           && TREE_CODE (lhs2) != SSA_NAME)
> >         return (operand_equal_p (lhs1, lhs2, 0)
> >                 && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
> >                                                  gimple_assign_rhs1 (a2)));
> >       else if (TREE_CODE (lhs1) == SSA_NAME
> >                && TREE_CODE (lhs2) == SSA_NAME)
> >         return vn_valueize (lhs1) == vn_valueize (lhs2);
> >       return false;
> >       }
> > 
> > instead.  That's the kind of changes I have expected and have approved of.
> 
> But even that looks like just adding extra work for all developers, with no
> gain.  You only have to add extra code and extra temporaries, in switches
> typically also have to add {} because of the temporaries and thus extra
> indentation level, and it doesn't simplify anything in the code.

The branch attempts to use the C++ typesystem to capture information
about the kinds of gimple statement we expect, both:
  (A) so that the compiler can detect type errors, and
  (B) as a comprehension aid to the human reader of the code

The ideal here is when function params and struct field can be
strengthened from "gimple" to a subclass ptr.  This captures the
knowledge that every use of a function or within a struct has a given
gimple code.

Examples of this for the initial patchkit were:

* the "call_stmt" field of a cgraph_edge becoming a gcall *,
  rather than a plain gimple.

* various variables in tree-into-ssa.c change from just
  vec<gimple> to being vec<gphi *>, capturing the "phi-ness" of
  the contents as a compile-time check

* tree-inline.h's struct copy_body_data, the field "debug_stmts"
  can be "concretized" from a vec<gimple> to a vec<gdebug *>.

A more recent example, from:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=5bd16d92b9e928b5a5a7aebd571d92f72dd517f8
The fields "arr_ref_first" and "arr_ref_last" of
tree-switch-conversion.c's struct switch_conv_info can be strengthened
from gimple to gassign *: they are always GIMPLE_ASSIGN.

I applied cleanups to do my initial patchkit, which Jeff approved (with
some provisos), and which became the first 92 commits on the branch:

"[gimple-classes, committed 00/92] Initial slew of commits":
  https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02791.html

followed by a merger from trunk into the branch:
"[gimple-classes] Merge trunk r216157-r216746 into branch":
  https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02982.html

With those commits, I was able to convert 180 accessors to taking a
concrete subclass, with 158 left taking a gimple or
const_gimple i.e. about half of them.
(My script to analyze this is gimple_typesafety.py
within https://github.com/davidmalcolm/gcc-refactoring-scripts)

I got it into my head that it was desirable to convert *all*
gimple accessors to this form, and to eliminate the GIMPLE_CHECK
macros (given that gcc development community seems to dislike
partial transitions).

I've been attempting this full conversion - convert all of the
gimple_ accessors, to require an appropriate gimple subclass
ptr, in particular focusing on the gimple_assign_ ones, but it's a *lot*
of extra work, and much more invasive than the patches
that Jeff conditionally approved.

I now suspect that it's going too far - in the initial patchkit I was
doing the clean, obvious ones, but now I'm left with the awkward ones
that would require me to uglify the code to "fix".

If it's OK to only convert some of them, then I'd rather just do that.

The type-strengthening is rarely as neat as being able to simply convert
a param or field type.  Some examples:

Functions passed a gsi
======================
Sometimes functions are passed a gsi, where it can be known that the gsi
currently references a stmt of known kind (although that isn't
necessarily obvious from reading the body of the function):

Example from tree-ssa-strlen.c:

handle_char_store (gimple_stmt_iterator *gsi)
 {
   int idx = -1;
   strinfo si = NULL;
-  gimple stmt = gsi_stmt (*gsi);
+  gassign *stmt = as_a <gassign *> (gsi_stmt (*gsi));
   tree ssaname = NULL_TREE, lhs = gimple_assign_lhs (stmt);
 
   if (TREE_CODE (lhs) == MEM_REF

from
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=78aae552f15ad5f8f5290fb825f9ae33f4a7cad9


Only acting on one kind of gimple
=================================

Some functions accept any kind of gimple, but only act on e.g. a
GIMPLE_ASSIGN, immediately returning if they got a different kind.

So I make this kind of change, where:

  void
  foo (gimple stmt, other params)
  {
    if (!is_gimple_assign (stmt))
       return;

    use_various gimple_assign_accessors (stmt);
  }

becomes:

  void
  foo (gimple gs, other params)
  {
    gassign *stmt = dyn_cast <gassign *> (gs);
    if (!stmt)
       return;

    use_various gimple_assign_accessors (stmt);
  }

renaming the param to "gs" to avoid a mass rename of "stmt".
Example tree-ssa-sink.c (statement_sink_location):
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=da19cf71e540f52a924d0985efdacd9e03684a6e


Suites where we know the type of a statement
============================================

If we have:

   if (is_gimple_assign (stmt))
     {
        /* 20 lines of logic, with numerous gimple_assign_ accessors
           on "stmt". */ 
     }

then I've been converting it to:

   if (gassign *assign_stmt = dyn_cast <gassign *> (stmt))
     {
        /* rename the "stmt" to "assign_stmt", wrapping lines as
           needed. */ 
     }

Is "assign_stmt" too long here?  It's handy to be able to distinguish
the meaning of the stmt e.g. "use_stmt", "def_stmt" can have synonyms
"use_assign" and "def_assign" for the regions where they're known to be
GIMPLE_ASSIGN).

The above is overkill for a 1-liner e.g.:

   if (gimple_get_code (stmt) == GIMPLE_ASSIGN)
     expr = gimple_get_lhs (stmt);

It could be converted to:

   if (gassign *assign_stmt = dyn_cast <gassign *> (stmt))
     expr = gimple_assign_get_lhs (assign_stmt);

or to:

   if (gimple_get_code (stmt) == GIMPLE_ASSIGN)
     expr = gimple_assign_get_lhs (as_a <gassign *> (stmt));

or left as-is if we don't want gimple_assign_get_lhs to require a
gassign *.

Casting functions
=================

Taking RTL's single_set as inspiration, I converted the predicates:
  gimple_assign_copy_p
  gimple_assign_ssa_name_copy_p
  gimple_assign_single_p
to return a gassign * rather than a bool, so that they can be used as
casting functions:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=06a4829d92f383a0fc1e9d488f1634d3764a0171

(also burning a register, since without LTO or some attribute the
compiler can't know the invariant that if non-NULL, then the return
value == the param (albeit with a cast) - do we have a function
attribute for that?).

Example of use, from tree-ssa-pre.c

@@ -4040,6 +4041,7 @@ eliminate_dom_walker::before_dom_children (basic_block b)
       tree sprime = NULL_TREE;
       gimple stmt = gsi_stmt (gsi);
       tree lhs = gimple_get_lhs (stmt);
+      gassign *assign_stmt;
       if (lhs && TREE_CODE (lhs) == SSA_NAME
          && !gimple_has_volatile_ops (stmt)
          /* See PR43491.  Do not replace a global register variable when
@@ -4049,10 +4051,10 @@ eliminate_dom_walker::before_dom_children (basic_block b)
             ???  The fix isn't effective here.  This should instead
             be ensured by not value-numbering them the same but treating
             them like volatiles?  */
-         && !(gimple_assign_single_p (stmt)
-              && (TREE_CODE (gimple_assign_rhs1 (stmt)) == VAR_DECL
-                  && DECL_HARD_REGISTER (gimple_assign_rhs1 (stmt))
-                  && is_global_var (gimple_assign_rhs1 (stmt)))))
+         && !((assign_stmt = gimple_assign_single_p (stmt))
+              && (TREE_CODE (gimple_assign_rhs1 (assign_stmt)) == VAR_DECL
+                  && DECL_HARD_REGISTER (gimple_assign_rhs1 (assign_stmt))
+                  && is_global_var (gimple_assign_rhs1 (assign_stmt)))))

(followed by lots of renaming of "stmt" to "assign_stmt" within the
guarded scope; this is from:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=18277e45b407ddd9f15012b48caf7403354ebaec
)

More awkward cases
==================
e.g. code that can work on both GIMPLE_CALL and GIMPLE_ASSIGN; once
we've dealt with GIMPLE_CALL, we can add:

  gassign *assign_stmt = as_a <gassign *> (stmt);

and have the rest of the code work on "assign_stmt".

etc


Current status
==============
I currently have 261 accessors converted, with 75 to go i.e.
about 75%, as compared to the 50% from the original patchkit.
I believe that the additional 50->75% work is relatively clean, but I'm
hitting a wall where the final 25% is requiring
much more invasive work, of the kind objected to earlier in this thread.
I'm having to do uglier and uglier patches.


How to proceed?
===============

I like the initial work I did, the:
  "[gimple-classes, committed 00/92] Initial slew of commits":
     https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02791.html
work, but I agree that the work on the branch after that is ugly in
places, and the volume of the work may resemble a DoS attack on the
reviewers - sorry.  (to answer a question on IRC, those followup commits
*were* handwritten, not autogenerated).

Is it acceptable to have the partial transition of strengthening the
types of only 50% or 75% of the accessors?

I'd like to apply those first commits to trunk (applying necessary
merger work), but I know we're approaching the close of stage 1 for gcc
5.

I can try to identify a subset of patches I think are likely to be more
acceptable if this work is still viable, and maybe put them on a
different git branch.  I hope I can get close to the 75% mark without
too much ugliness and verbosity.


Thoughts?

Thanks
Dave

(fwiw, I'm getting rather sick of refactoring, and keen to focus on
user-visible work for gcc 6)
Andrew Pinski Nov. 10, 2014, 11:37 p.m. UTC | #8
On Mon, Nov 10, 2014 at 2:27 PM, David Malcolm <dmalcolm@redhat.com> wrote:
> On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
>> On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
>> > To be constructive here - the above case is from within a
>> > GIMPLE_ASSIGN case label
>> > and thus I'd have expected
>> >
>> >     case GIMPLE_ASSIGN:
>> >       {
>> >         gassign *a1 = as_a <gassign *> (s1);
>> >         gassign *a2 = as_a <gassign *> (s2);
>> >       lhs1 = gimple_assign_lhs (a1);
>> >       lhs2 = gimple_assign_lhs (a2);
>> >       if (TREE_CODE (lhs1) != SSA_NAME
>> >           && TREE_CODE (lhs2) != SSA_NAME)
>> >         return (operand_equal_p (lhs1, lhs2, 0)
>> >                 && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
>> >                                                  gimple_assign_rhs1 (a2)));
>> >       else if (TREE_CODE (lhs1) == SSA_NAME
>> >                && TREE_CODE (lhs2) == SSA_NAME)
>> >         return vn_valueize (lhs1) == vn_valueize (lhs2);
>> >       return false;
>> >       }
>> >
>> > instead.  That's the kind of changes I have expected and have approved of.
>>
>> But even that looks like just adding extra work for all developers, with no
>> gain.  You only have to add extra code and extra temporaries, in switches
>> typically also have to add {} because of the temporaries and thus extra
>> indentation level, and it doesn't simplify anything in the code.
>
> The branch attempts to use the C++ typesystem to capture information
> about the kinds of gimple statement we expect, both:
>   (A) so that the compiler can detect type errors, and
>   (B) as a comprehension aid to the human reader of the code
>
> The ideal here is when function params and struct field can be
> strengthened from "gimple" to a subclass ptr.  This captures the
> knowledge that every use of a function or within a struct has a given
> gimple code.
>
> Examples of this for the initial patchkit were:
>
> * the "call_stmt" field of a cgraph_edge becoming a gcall *,
>   rather than a plain gimple.
>
> * various variables in tree-into-ssa.c change from just
>   vec<gimple> to being vec<gphi *>, capturing the "phi-ness" of
>   the contents as a compile-time check
>
> * tree-inline.h's struct copy_body_data, the field "debug_stmts"
>   can be "concretized" from a vec<gimple> to a vec<gdebug *>.
>
> A more recent example, from:
> https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=5bd16d92b9e928b5a5a7aebd571d92f72dd517f8
> The fields "arr_ref_first" and "arr_ref_last" of
> tree-switch-conversion.c's struct switch_conv_info can be strengthened
> from gimple to gassign *: they are always GIMPLE_ASSIGN.
>
> I applied cleanups to do my initial patchkit, which Jeff approved (with
> some provisos), and which became the first 92 commits on the branch:
>
> "[gimple-classes, committed 00/92] Initial slew of commits":
>   https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02791.html
>
> followed by a merger from trunk into the branch:
> "[gimple-classes] Merge trunk r216157-r216746 into branch":
>   https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02982.html
>
> With those commits, I was able to convert 180 accessors to taking a
> concrete subclass, with 158 left taking a gimple or
> const_gimple i.e. about half of them.
> (My script to analyze this is gimple_typesafety.py
> within https://github.com/davidmalcolm/gcc-refactoring-scripts)
>
> I got it into my head that it was desirable to convert *all*
> gimple accessors to this form, and to eliminate the GIMPLE_CHECK
> macros (given that gcc development community seems to dislike
> partial transitions).
>
> I've been attempting this full conversion - convert all of the
> gimple_ accessors, to require an appropriate gimple subclass
> ptr, in particular focusing on the gimple_assign_ ones, but it's a *lot*
> of extra work, and much more invasive than the patches
> that Jeff conditionally approved.
>
> I now suspect that it's going too far - in the initial patchkit I was
> doing the clean, obvious ones, but now I'm left with the awkward ones
> that would require me to uglify the code to "fix".
>
> If it's OK to only convert some of them, then I'd rather just do that.
>
> The type-strengthening is rarely as neat as being able to simply convert
> a param or field type.  Some examples:
>
> Functions passed a gsi
> ======================
> Sometimes functions are passed a gsi, where it can be known that the gsi
> currently references a stmt of known kind (although that isn't
> necessarily obvious from reading the body of the function):
>
> Example from tree-ssa-strlen.c:
>
> handle_char_store (gimple_stmt_iterator *gsi)
>  {
>    int idx = -1;
>    strinfo si = NULL;
> -  gimple stmt = gsi_stmt (*gsi);
> +  gassign *stmt = as_a <gassign *> (gsi_stmt (*gsi));


Can we have something like:
gsi_assign (*gsi)
instead so there is less typing when we want an gassign rather than a
gimple stmt.  This should allow for easier converting also and puts
most of the as_a in one place.

Thanks,
Andrew Pinski

>    tree ssaname = NULL_TREE, lhs = gimple_assign_lhs (stmt);
>
>    if (TREE_CODE (lhs) == MEM_REF
>
> from
> https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=78aae552f15ad5f8f5290fb825f9ae33f4a7cad9
>
>
> Only acting on one kind of gimple
> =================================
>
> Some functions accept any kind of gimple, but only act on e.g. a
> GIMPLE_ASSIGN, immediately returning if they got a different kind.
>
> So I make this kind of change, where:
>
>   void
>   foo (gimple stmt, other params)
>   {
>     if (!is_gimple_assign (stmt))
>        return;
>
>     use_various gimple_assign_accessors (stmt);
>   }
>
> becomes:
>
>   void
>   foo (gimple gs, other params)
>   {
>     gassign *stmt = dyn_cast <gassign *> (gs);
>     if (!stmt)
>        return;
>
>     use_various gimple_assign_accessors (stmt);
>   }
>
> renaming the param to "gs" to avoid a mass rename of "stmt".
> Example tree-ssa-sink.c (statement_sink_location):
> https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=da19cf71e540f52a924d0985efdacd9e03684a6e
>
>
> Suites where we know the type of a statement
> ============================================
>
> If we have:
>
>    if (is_gimple_assign (stmt))
>      {
>         /* 20 lines of logic, with numerous gimple_assign_ accessors
>            on "stmt". */
>      }
>
> then I've been converting it to:
>
>    if (gassign *assign_stmt = dyn_cast <gassign *> (stmt))
>      {
>         /* rename the "stmt" to "assign_stmt", wrapping lines as
>            needed. */
>      }
>
> Is "assign_stmt" too long here?  It's handy to be able to distinguish
> the meaning of the stmt e.g. "use_stmt", "def_stmt" can have synonyms
> "use_assign" and "def_assign" for the regions where they're known to be
> GIMPLE_ASSIGN).
>
> The above is overkill for a 1-liner e.g.:
>
>    if (gimple_get_code (stmt) == GIMPLE_ASSIGN)
>      expr = gimple_get_lhs (stmt);
>
> It could be converted to:
>
>    if (gassign *assign_stmt = dyn_cast <gassign *> (stmt))
>      expr = gimple_assign_get_lhs (assign_stmt);
>
> or to:
>
>    if (gimple_get_code (stmt) == GIMPLE_ASSIGN)
>      expr = gimple_assign_get_lhs (as_a <gassign *> (stmt));
>
> or left as-is if we don't want gimple_assign_get_lhs to require a
> gassign *.
>
> Casting functions
> =================
>
> Taking RTL's single_set as inspiration, I converted the predicates:
>   gimple_assign_copy_p
>   gimple_assign_ssa_name_copy_p
>   gimple_assign_single_p
> to return a gassign * rather than a bool, so that they can be used as
> casting functions:
> https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=06a4829d92f383a0fc1e9d488f1634d3764a0171
>
> (also burning a register, since without LTO or some attribute the
> compiler can't know the invariant that if non-NULL, then the return
> value == the param (albeit with a cast) - do we have a function
> attribute for that?).
>
> Example of use, from tree-ssa-pre.c
>
> @@ -4040,6 +4041,7 @@ eliminate_dom_walker::before_dom_children (basic_block b)
>        tree sprime = NULL_TREE;
>        gimple stmt = gsi_stmt (gsi);
>        tree lhs = gimple_get_lhs (stmt);
> +      gassign *assign_stmt;
>        if (lhs && TREE_CODE (lhs) == SSA_NAME
>           && !gimple_has_volatile_ops (stmt)
>           /* See PR43491.  Do not replace a global register variable when
> @@ -4049,10 +4051,10 @@ eliminate_dom_walker::before_dom_children (basic_block b)
>              ???  The fix isn't effective here.  This should instead
>              be ensured by not value-numbering them the same but treating
>              them like volatiles?  */
> -         && !(gimple_assign_single_p (stmt)
> -              && (TREE_CODE (gimple_assign_rhs1 (stmt)) == VAR_DECL
> -                  && DECL_HARD_REGISTER (gimple_assign_rhs1 (stmt))
> -                  && is_global_var (gimple_assign_rhs1 (stmt)))))
> +         && !((assign_stmt = gimple_assign_single_p (stmt))
> +              && (TREE_CODE (gimple_assign_rhs1 (assign_stmt)) == VAR_DECL
> +                  && DECL_HARD_REGISTER (gimple_assign_rhs1 (assign_stmt))
> +                  && is_global_var (gimple_assign_rhs1 (assign_stmt)))))
>
> (followed by lots of renaming of "stmt" to "assign_stmt" within the
> guarded scope; this is from:
> https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=18277e45b407ddd9f15012b48caf7403354ebaec
> )
>
> More awkward cases
> ==================
> e.g. code that can work on both GIMPLE_CALL and GIMPLE_ASSIGN; once
> we've dealt with GIMPLE_CALL, we can add:
>
>   gassign *assign_stmt = as_a <gassign *> (stmt);
>
> and have the rest of the code work on "assign_stmt".
>
> etc
>
>
> Current status
> ==============
> I currently have 261 accessors converted, with 75 to go i.e.
> about 75%, as compared to the 50% from the original patchkit.
> I believe that the additional 50->75% work is relatively clean, but I'm
> hitting a wall where the final 25% is requiring
> much more invasive work, of the kind objected to earlier in this thread.
> I'm having to do uglier and uglier patches.
>
>
> How to proceed?
> ===============
>
> I like the initial work I did, the:
>   "[gimple-classes, committed 00/92] Initial slew of commits":
>      https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02791.html
> work, but I agree that the work on the branch after that is ugly in
> places, and the volume of the work may resemble a DoS attack on the
> reviewers - sorry.  (to answer a question on IRC, those followup commits
> *were* handwritten, not autogenerated).
>
> Is it acceptable to have the partial transition of strengthening the
> types of only 50% or 75% of the accessors?
>
> I'd like to apply those first commits to trunk (applying necessary
> merger work), but I know we're approaching the close of stage 1 for gcc
> 5.
>
> I can try to identify a subset of patches I think are likely to be more
> acceptable if this work is still viable, and maybe put them on a
> different git branch.  I hope I can get close to the 75% mark without
> too much ugliness and verbosity.
>
>
> Thoughts?
>
> Thanks
> Dave
>
> (fwiw, I'm getting rather sick of refactoring, and keen to focus on
> user-visible work for gcc 6)
>
Jakub Jelinek Nov. 11, 2014, 7:26 a.m. UTC | #9
On Mon, Nov 10, 2014 at 05:27:50PM -0500, David Malcolm wrote:
> On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
> > On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
> > > To be constructive here - the above case is from within a
> > > GIMPLE_ASSIGN case label
> > > and thus I'd have expected
> > > 
> > >     case GIMPLE_ASSIGN:
> > >       {
> > >         gassign *a1 = as_a <gassign *> (s1);
> > >         gassign *a2 = as_a <gassign *> (s2);
> > >       lhs1 = gimple_assign_lhs (a1);
> > >       lhs2 = gimple_assign_lhs (a2);
> > >       if (TREE_CODE (lhs1) != SSA_NAME
> > >           && TREE_CODE (lhs2) != SSA_NAME)
> > >         return (operand_equal_p (lhs1, lhs2, 0)
> > >                 && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
> > >                                                  gimple_assign_rhs1 (a2)));
> > >       else if (TREE_CODE (lhs1) == SSA_NAME
> > >                && TREE_CODE (lhs2) == SSA_NAME)
> > >         return vn_valueize (lhs1) == vn_valueize (lhs2);
> > >       return false;
> > >       }
> > > 
> > > instead.  That's the kind of changes I have expected and have approved of.
> > 
> > But even that looks like just adding extra work for all developers, with no
> > gain.  You only have to add extra code and extra temporaries, in switches
> > typically also have to add {} because of the temporaries and thus extra
> > indentation level, and it doesn't simplify anything in the code.
> 
> The branch attempts to use the C++ typesystem to capture information
> about the kinds of gimple statement we expect, both:
>   (A) so that the compiler can detect type errors, and
>   (B) as a comprehension aid to the human reader of the code
> 
> The ideal here is when function params and struct field can be
> strengthened from "gimple" to a subclass ptr.  This captures the
> knowledge that every use of a function or within a struct has a given
> gimple code.

I just don't like all the as_a/is_a stuff enforced everywhere,
it means more typing, more temporaries, more indentation.
So, as I view it, instead of the checks being done cheaply (yes, I think
the gimple checking as we have right now is very cheap) under the
hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those changes
put the burden on the developers, who has to check that manually through
the as_a/is_a stuff everywhere, more typing and uglier syntax.
I just don't see that as a step forward, instead a huge step backwards.
But perhaps I'm alone with this.
Can you e.g. compare the size of - lines in your patchset combined, and
size of + lines in your patchset?  As in, if your changes lead to less
typing or more.

	Jakub
Eric Botcazou Nov. 11, 2014, 8:30 a.m. UTC | #10
> I just don't like all the as_a/is_a stuff enforced everywhere,
> it means more typing, more temporaries, more indentation.
> So, as I view it, instead of the checks being done cheaply (yes, I think
> the gimple checking as we have right now is very cheap) under the
> hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those changes
> put the burden on the developers, who has to check that manually through
> the as_a/is_a stuff everywhere, more typing and uglier syntax.
> I just don't see that as a step forward, instead a huge step backwards.
> But perhaps I'm alone with this.

IMO that's the sort of things some of us were afraid of when the C++ switch 
was being discussed and IIRC we were told this would not happen...
Richard Biener Nov. 11, 2014, 10:43 a.m. UTC | #11
On Tue, Nov 11, 2014 at 8:26 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Mon, Nov 10, 2014 at 05:27:50PM -0500, David Malcolm wrote:
>> On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
>> > On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
>> > > To be constructive here - the above case is from within a
>> > > GIMPLE_ASSIGN case label
>> > > and thus I'd have expected
>> > >
>> > >     case GIMPLE_ASSIGN:
>> > >       {
>> > >         gassign *a1 = as_a <gassign *> (s1);
>> > >         gassign *a2 = as_a <gassign *> (s2);
>> > >       lhs1 = gimple_assign_lhs (a1);
>> > >       lhs2 = gimple_assign_lhs (a2);
>> > >       if (TREE_CODE (lhs1) != SSA_NAME
>> > >           && TREE_CODE (lhs2) != SSA_NAME)
>> > >         return (operand_equal_p (lhs1, lhs2, 0)
>> > >                 && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
>> > >                                                  gimple_assign_rhs1 (a2)));
>> > >       else if (TREE_CODE (lhs1) == SSA_NAME
>> > >                && TREE_CODE (lhs2) == SSA_NAME)
>> > >         return vn_valueize (lhs1) == vn_valueize (lhs2);
>> > >       return false;
>> > >       }
>> > >
>> > > instead.  That's the kind of changes I have expected and have approved of.
>> >
>> > But even that looks like just adding extra work for all developers, with no
>> > gain.  You only have to add extra code and extra temporaries, in switches
>> > typically also have to add {} because of the temporaries and thus extra
>> > indentation level, and it doesn't simplify anything in the code.
>>
>> The branch attempts to use the C++ typesystem to capture information
>> about the kinds of gimple statement we expect, both:
>>   (A) so that the compiler can detect type errors, and
>>   (B) as a comprehension aid to the human reader of the code
>>
>> The ideal here is when function params and struct field can be
>> strengthened from "gimple" to a subclass ptr.  This captures the
>> knowledge that every use of a function or within a struct has a given
>> gimple code.
>
> I just don't like all the as_a/is_a stuff enforced everywhere,
> it means more typing, more temporaries, more indentation.
> So, as I view it, instead of the checks being done cheaply (yes, I think
> the gimple checking as we have right now is very cheap) under the
> hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those changes
> put the burden on the developers, who has to check that manually through
> the as_a/is_a stuff everywhere, more typing and uglier syntax.
> I just don't see that as a step forward, instead a huge step backwards.
> But perhaps I'm alone with this.
> Can you e.g. compare the size of - lines in your patchset combined, and
> size of + lines in your patchset?  As in, if your changes lead to less
> typing or more.

I see two ways out here.  One is to add overloads to all the functions
taking the special types like

tree
gimple_assign_rhs1 (gimple *);

or simply add

gassign *operator ()(gimple *g) { return as_a <gassign *> (g); }

into a gimple-compat.h header which you include in places that
are not converted "nicely".

Both avoid manually making the compiler happy (which the
explicit as_a<> stuff is!  It doesn't add any "checking" - it's
just placing the as_a<> at the callers and thus make the
runtine ICE fire there).

As much as I don't like "global" conversion operators I don't
like adding overloads to all of the accessor functions even more.

Whether you enable them generally or just for selected files
via a gimple-compat.h will be up to you (but I'd rather get
rid of them at some point).

Note this allows seamless transform of "random" functions
taking a gimple now but really only expecting a single kind.

Note that we don't absolutely have to rush this all in for GCC 5.
Being the very first for GCC 6 stage1 is another possibility.
We just should get it right.

Thanks,
Richard.


>         Jakub
Bernd Schmidt Nov. 11, 2014, 11:52 a.m. UTC | #12
On 11/11/2014 09:30 AM, Eric Botcazou wrote:
>> I just don't like all the as_a/is_a stuff enforced everywhere,
>> it means more typing, more temporaries, more indentation.
>> So, as I view it, instead of the checks being done cheaply (yes, I think
>> the gimple checking as we have right now is very cheap) under the
>> hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those changes
>> put the burden on the developers, who has to check that manually through
>> the as_a/is_a stuff everywhere, more typing and uglier syntax.
>> I just don't see that as a step forward, instead a huge step backwards.
>> But perhaps I'm alone with this.
>
> IMO that's the sort of things some of us were afraid of when the C++ switch
> was being discussed and IIRC we were told this would not happen...

I'm with both of you on this.


Bernd
David Malcolm Nov. 13, 2014, 1:41 a.m. UTC | #13
On Tue, 2014-11-11 at 11:43 +0100, Richard Biener wrote:
> On Tue, Nov 11, 2014 at 8:26 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> > On Mon, Nov 10, 2014 at 05:27:50PM -0500, David Malcolm wrote:
> >> On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
> >> > On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
> >> > > To be constructive here - the above case is from within a
> >> > > GIMPLE_ASSIGN case label
> >> > > and thus I'd have expected
> >> > >
> >> > >     case GIMPLE_ASSIGN:
> >> > >       {
> >> > >         gassign *a1 = as_a <gassign *> (s1);
> >> > >         gassign *a2 = as_a <gassign *> (s2);
> >> > >       lhs1 = gimple_assign_lhs (a1);
> >> > >       lhs2 = gimple_assign_lhs (a2);
> >> > >       if (TREE_CODE (lhs1) != SSA_NAME
> >> > >           && TREE_CODE (lhs2) != SSA_NAME)
> >> > >         return (operand_equal_p (lhs1, lhs2, 0)
> >> > >                 && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
> >> > >                                                  gimple_assign_rhs1 (a2)));
> >> > >       else if (TREE_CODE (lhs1) == SSA_NAME
> >> > >                && TREE_CODE (lhs2) == SSA_NAME)
> >> > >         return vn_valueize (lhs1) == vn_valueize (lhs2);
> >> > >       return false;
> >> > >       }
> >> > >
> >> > > instead.  That's the kind of changes I have expected and have approved of.
> >> >
> >> > But even that looks like just adding extra work for all developers, with no
> >> > gain.  You only have to add extra code and extra temporaries, in switches
> >> > typically also have to add {} because of the temporaries and thus extra
> >> > indentation level, and it doesn't simplify anything in the code.
> >>
> >> The branch attempts to use the C++ typesystem to capture information
> >> about the kinds of gimple statement we expect, both:
> >>   (A) so that the compiler can detect type errors, and
> >>   (B) as a comprehension aid to the human reader of the code
> >>
> >> The ideal here is when function params and struct field can be
> >> strengthened from "gimple" to a subclass ptr.  This captures the
> >> knowledge that every use of a function or within a struct has a given
> >> gimple code.
> >
> > I just don't like all the as_a/is_a stuff enforced everywhere,
> > it means more typing, more temporaries, more indentation.
> > So, as I view it, instead of the checks being done cheaply (yes, I think
> > the gimple checking as we have right now is very cheap) under the
> > hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those changes
> > put the burden on the developers, who has to check that manually through
> > the as_a/is_a stuff everywhere, more typing and uglier syntax.
> > I just don't see that as a step forward, instead a huge step backwards.
> > But perhaps I'm alone with this.
> > Can you e.g. compare the size of - lines in your patchset combined, and
> > size of + lines in your patchset?  As in, if your changes lead to less
> > typing or more.
> 
> I see two ways out here.  One is to add overloads to all the functions
> taking the special types like
> 
> tree
> gimple_assign_rhs1 (gimple *);
> 
> or simply add
> 
> gassign *operator ()(gimple *g) { return as_a <gassign *> (g); }
> 
> into a gimple-compat.h header which you include in places that
> are not converted "nicely".

Thanks for the suggestions.

Am I missing something, or is the gimple-compat.h idea above not valid C
++?

Note that "gimple" is still a typedef to
  gimple_statement_base *
(as noted before, the gimple -> gimple * change would break everyone
else's patches, so we talked about that as a followup patch for early
stage3).

Given that, if I try to create an "operator ()" outside of a class, I
get this error:

‘gassign* operator()(gimple)’ must be a nonstatic member function

which is emitted from cp/decl.c's grok_op_properties:
      /* An operator function must either be a non-static member function
	 or have at least one parameter of a class, a reference to a class,
	 an enumeration, or a reference to an enumeration.  13.4.0.6 */

I tried making it a member function of gimple_statement_base, but that
doesn't work either: we want a conversion
  from a gimple_statement_base * to a gassign *, not
  from a gimple_statement_base   to a gassign *.

Is there some syntactic trick here that I'm missing?  Sorry if I'm being
dumb (I can imagine there's a way of doing it by making "gimple" become
some kind of wrapped ptr class, but that way lies madness, surely).

> Both avoid manually making the compiler happy (which the
> explicit as_a<> stuff is!  It doesn't add any "checking" - it's
> just placing the as_a<> at the callers and thus make the
> runtine ICE fire there).
> 
> As much as I don't like "global" conversion operators I don't
> like adding overloads to all of the accessor functions even more.

(nods)

Some other options:

Option 3: only convert the "easy" accessors: the ones I already did in
the /89 patch kit, as reviewed by Jeff, and rebased by me recently,
which is this 92-patch kit:
  "[gimple-classes, committed 00/92] Initial slew of commits":
     https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02791.html
Doing so converts about half of the gimple_foo_ accessors to taking a
gfoo *, giving a mixture of GIMPLE_CHECK vs subclass use.   I believe
the quality of those patches was higher than the later ones on the
branch: I was doing the places that didn't require the invasive/verbose
changes seen in the later patches.  Shelve the remaining ~80
increasingly ugly patches, starting a new branch to contain just the
good ones.


Option 4: don't convert any accessors, but instead focus on fields of
structs (e.g. "call_stmt" within a cgraph_edge), and on params of other
functions (e.g. phi-manipulation code).  That way we'd avoid the
inconsistency of some accessors using GIMPLE_CHECK and some using
subclasses - all would continue to consistently use GIMPLE_CHECK, but
there would be some extra type-checking and self-documentation of the
expected statement kinds in the code.



FWIW, option 3 is my preferred approach (I've already done the bulk of
the work and it's already been reviewed; it would need an update merger
from trunk, and there's the big gimple to gimple * fixup you wanted).

> Whether you enable them generally or just for selected files
> via a gimple-compat.h will be up to you (but I'd rather get
> rid of them at some point).
> 
> Note this allows seamless transform of "random" functions
> taking a gimple now but really only expecting a single kind.
> 
> Note that we don't absolutely have to rush this all in for GCC 5.
> Being the very first for GCC 6 stage1 is another possibility.
> We just should get it right.

Thanks
Dave
David Malcolm Nov. 13, 2014, 2:12 a.m. UTC | #14
On Tue, 2014-11-11 at 08:26 +0100, Jakub Jelinek wrote:
> On Mon, Nov 10, 2014 at 05:27:50PM -0500, David Malcolm wrote:
> > On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
> > > On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
> > > > To be constructive here - the above case is from within a
> > > > GIMPLE_ASSIGN case label
> > > > and thus I'd have expected
> > > > 
> > > >     case GIMPLE_ASSIGN:
> > > >       {
> > > >         gassign *a1 = as_a <gassign *> (s1);
> > > >         gassign *a2 = as_a <gassign *> (s2);
> > > >       lhs1 = gimple_assign_lhs (a1);
> > > >       lhs2 = gimple_assign_lhs (a2);
> > > >       if (TREE_CODE (lhs1) != SSA_NAME
> > > >           && TREE_CODE (lhs2) != SSA_NAME)
> > > >         return (operand_equal_p (lhs1, lhs2, 0)
> > > >                 && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
> > > >                                                  gimple_assign_rhs1 (a2)));
> > > >       else if (TREE_CODE (lhs1) == SSA_NAME
> > > >                && TREE_CODE (lhs2) == SSA_NAME)
> > > >         return vn_valueize (lhs1) == vn_valueize (lhs2);
> > > >       return false;
> > > >       }
> > > > 
> > > > instead.  That's the kind of changes I have expected and have approved of.
> > > 
> > > But even that looks like just adding extra work for all developers, with no
> > > gain.  You only have to add extra code and extra temporaries, in switches
> > > typically also have to add {} because of the temporaries and thus extra
> > > indentation level, and it doesn't simplify anything in the code.
> > 
> > The branch attempts to use the C++ typesystem to capture information
> > about the kinds of gimple statement we expect, both:
> >   (A) so that the compiler can detect type errors, and
> >   (B) as a comprehension aid to the human reader of the code
> > 
> > The ideal here is when function params and struct field can be
> > strengthened from "gimple" to a subclass ptr.  This captures the
> > knowledge that every use of a function or within a struct has a given
> > gimple code.
> 
> I just don't like all the as_a/is_a stuff enforced everywhere,
> it means more typing, more temporaries, more indentation.
> So, as I view it, instead of the checks being done cheaply (yes, I think
> the gimple checking as we have right now is very cheap) under the
> hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those changes
> put the burden on the developers, who has to check that manually through
> the as_a/is_a stuff everywhere, more typing and uglier syntax.
> I just don't see that as a step forward, instead a huge step backwards.
> But perhaps I'm alone with this.

I agree with you w.r.t. my later patches.   I like my initial set of 89
patches, but I overdid things with the attempt to convert all of the
gimple_assign_ accessors.

> Can you e.g. compare the size of - lines in your patchset combined, and
> size of + lines in your patchset?  As in, if your changes lead to less
> typing or more.

I don't know if this data matches the proposed interpretation
(autoindentation in emacs is wonderful), but here goes:

Diff of my current working copy vs last merge point, ignoring ChangeLog
additions, obtaining lines starting "-", piping into wc:

$ git diff fddbd0194b01f44c5b5f16379fd5405dcf6d71c0 $(git diff
fddbd0194b01f44c5b5f16379fd5405dcf6d71c0 | diffstat -lp1 | grep -v
ChangeLog) | grep -e "^-" | wc
   6169   31032  272916

Likewise for lines starting with "+":

$ git diff fddbd0194b01f44c5b5f16379fd5405dcf6d71c0 $(git diff
fddbd0194b01f44c5b5f16379fd5405dcf6d71c0 | diffstat -lp1 | grep -v
ChangeLog) | grep -e "^+" | wc
   7295   36566  309380

so the + lines have 13% more bytes than the - lines


Dave
Richard Biener Nov. 13, 2014, 10:45 a.m. UTC | #15
On Thu, Nov 13, 2014 at 2:41 AM, David Malcolm <dmalcolm@redhat.com> wrote:
> On Tue, 2014-11-11 at 11:43 +0100, Richard Biener wrote:
>> On Tue, Nov 11, 2014 at 8:26 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>> > On Mon, Nov 10, 2014 at 05:27:50PM -0500, David Malcolm wrote:
>> >> On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
>> >> > On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
>> >> > > To be constructive here - the above case is from within a
>> >> > > GIMPLE_ASSIGN case label
>> >> > > and thus I'd have expected
>> >> > >
>> >> > >     case GIMPLE_ASSIGN:
>> >> > >       {
>> >> > >         gassign *a1 = as_a <gassign *> (s1);
>> >> > >         gassign *a2 = as_a <gassign *> (s2);
>> >> > >       lhs1 = gimple_assign_lhs (a1);
>> >> > >       lhs2 = gimple_assign_lhs (a2);
>> >> > >       if (TREE_CODE (lhs1) != SSA_NAME
>> >> > >           && TREE_CODE (lhs2) != SSA_NAME)
>> >> > >         return (operand_equal_p (lhs1, lhs2, 0)
>> >> > >                 && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
>> >> > >                                                  gimple_assign_rhs1 (a2)));
>> >> > >       else if (TREE_CODE (lhs1) == SSA_NAME
>> >> > >                && TREE_CODE (lhs2) == SSA_NAME)
>> >> > >         return vn_valueize (lhs1) == vn_valueize (lhs2);
>> >> > >       return false;
>> >> > >       }
>> >> > >
>> >> > > instead.  That's the kind of changes I have expected and have approved of.
>> >> >
>> >> > But even that looks like just adding extra work for all developers, with no
>> >> > gain.  You only have to add extra code and extra temporaries, in switches
>> >> > typically also have to add {} because of the temporaries and thus extra
>> >> > indentation level, and it doesn't simplify anything in the code.
>> >>
>> >> The branch attempts to use the C++ typesystem to capture information
>> >> about the kinds of gimple statement we expect, both:
>> >>   (A) so that the compiler can detect type errors, and
>> >>   (B) as a comprehension aid to the human reader of the code
>> >>
>> >> The ideal here is when function params and struct field can be
>> >> strengthened from "gimple" to a subclass ptr.  This captures the
>> >> knowledge that every use of a function or within a struct has a given
>> >> gimple code.
>> >
>> > I just don't like all the as_a/is_a stuff enforced everywhere,
>> > it means more typing, more temporaries, more indentation.
>> > So, as I view it, instead of the checks being done cheaply (yes, I think
>> > the gimple checking as we have right now is very cheap) under the
>> > hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those changes
>> > put the burden on the developers, who has to check that manually through
>> > the as_a/is_a stuff everywhere, more typing and uglier syntax.
>> > I just don't see that as a step forward, instead a huge step backwards.
>> > But perhaps I'm alone with this.
>> > Can you e.g. compare the size of - lines in your patchset combined, and
>> > size of + lines in your patchset?  As in, if your changes lead to less
>> > typing or more.
>>
>> I see two ways out here.  One is to add overloads to all the functions
>> taking the special types like
>>
>> tree
>> gimple_assign_rhs1 (gimple *);
>>
>> or simply add
>>
>> gassign *operator ()(gimple *g) { return as_a <gassign *> (g); }
>>
>> into a gimple-compat.h header which you include in places that
>> are not converted "nicely".
>
> Thanks for the suggestions.
>
> Am I missing something, or is the gimple-compat.h idea above not valid C
> ++?
>
> Note that "gimple" is still a typedef to
>   gimple_statement_base *
> (as noted before, the gimple -> gimple * change would break everyone
> else's patches, so we talked about that as a followup patch for early
> stage3).
>
> Given that, if I try to create an "operator ()" outside of a class, I
> get this error:
>
> ‘gassign* operator()(gimple)’ must be a nonstatic member function
>
> which is emitted from cp/decl.c's grok_op_properties:
>       /* An operator function must either be a non-static member function
>          or have at least one parameter of a class, a reference to a class,
>          an enumeration, or a reference to an enumeration.  13.4.0.6 */
>
> I tried making it a member function of gimple_statement_base, but that
> doesn't work either: we want a conversion
>   from a gimple_statement_base * to a gassign *, not
>   from a gimple_statement_base   to a gassign *.
>
> Is there some syntactic trick here that I'm missing?  Sorry if I'm being
> dumb (I can imagine there's a way of doing it by making "gimple" become
> some kind of wrapped ptr class, but that way lies madness, surely).

Hmm.

struct assign;
struct base {
  operator assign *() const { return (assign *)this; }
};
struct assign : base {
};

void foo (assign *);
void bar (base *b)
{
  foo (b);
}

doesn't work, but

void bar (base &b)
{
  foo (b);
}

does.  Indeed C++ doesn't seem to provide what is necessary
for the compat trick :(

So the gimple-compat.h header would need to provide
additional overloads for the affected functions like

inline tree
gimple_assign_rhs1 (gimple *g)
{
  return gimple_assign_rhs1 (as_a <gassign *> (g));
}

that would work for me as well.

>> Both avoid manually making the compiler happy (which the
>> explicit as_a<> stuff is!  It doesn't add any "checking" - it's
>> just placing the as_a<> at the callers and thus make the
>> runtine ICE fire there).
>>
>> As much as I don't like "global" conversion operators I don't
>> like adding overloads to all of the accessor functions even more.
>
> (nods)
>
> Some other options:
>
> Option 3: only convert the "easy" accessors: the ones I already did in
> the /89 patch kit, as reviewed by Jeff, and rebased by me recently,
> which is this 92-patch kit:
>   "[gimple-classes, committed 00/92] Initial slew of commits":
>      https://gcc.gnu.org/ml/gcc-patches/2014-10/msg02791.html
> Doing so converts about half of the gimple_foo_ accessors to taking a
> gfoo *, giving a mixture of GIMPLE_CHECK vs subclass use.   I believe
> the quality of those patches was higher than the later ones on the
> branch: I was doing the places that didn't require the invasive/verbose
> changes seen in the later patches.  Shelve the remaining ~80
> increasingly ugly patches, starting a new branch to contain just the
> good ones.
>
>
> Option 4: don't convert any accessors, but instead focus on fields of
> structs (e.g. "call_stmt" within a cgraph_edge), and on params of other
> functions (e.g. phi-manipulation code).  That way we'd avoid the
> inconsistency of some accessors using GIMPLE_CHECK and some using
> subclasses - all would continue to consistently use GIMPLE_CHECK, but
> there would be some extra type-checking and self-documentation of the
> expected statement kinds in the code.
>
>
>
> FWIW, option 3 is my preferred approach (I've already done the bulk of
> the work and it's already been reviewed; it would need an update merger
> from trunk, and there's the big gimple to gimple * fixup you wanted).

Works for me as well.  The compat solution looks somewhat appealing
as we can then incrementally fix up things rather than requiring to
mass-convert everything.

Thanks,
Richard.

>> Whether you enable them generally or just for selected files
>> via a gimple-compat.h will be up to you (but I'd rather get
>> rid of them at some point).
>>
>> Note this allows seamless transform of "random" functions
>> taking a gimple now but really only expecting a single kind.
>>
>> Note that we don't absolutely have to rush this all in for GCC 5.
>> Being the very first for GCC 6 stage1 is another possibility.
>> We just should get it right.
>
> Thanks
> Dave
>
Jonathan Wakely Nov. 13, 2014, 10:59 a.m. UTC | #16
On 13 November 2014 10:45, Richard Biener wrote:
>
> Hmm.
>
> struct assign;
> struct base {
>   operator assign *() const { return (assign *)this; }
> };
> struct assign : base {
> };
>
> void foo (assign *);
> void bar (base *b)
> {
>   foo (b);
> }
>
> doesn't work, but
>
> void bar (base &b)
> {
>   foo (b);
> }
>
> does.  Indeed C++ doesn't seem to provide what is necessary
> for the compat trick :(

Right, base* is a built-in type, you can't call a member function on it.

There is no implicit conversion between unrelated pointer types, and
no implicit conversion from base* to base& that would be necessary to
call the conversion operator of base.
Andrew MacLeod Nov. 13, 2014, 2:24 p.m. UTC | #17
On 11/13/2014 05:45 AM, Richard Biener wrote:
> On Thu, Nov 13, 2014 at 2:41 AM, David Malcolm <dmalcolm@redhat.com> wrote:
>> On Tue, 2014-11-11 at 11:43 +0100, Richard Biener wrote:
>>> On Tue, Nov 11, 2014 at 8:26 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>>> On Mon, Nov 10, 2014 at 05:27:50PM -0500, David Malcolm wrote:
>>>>> On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
>>>>>> On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
>>>>>>> To be constructive here - the above case is from within a
>>>>>>> GIMPLE_ASSIGN case label
>>>>>>> and thus I'd have expected
>>>>>>>
>>>>>>>      case GIMPLE_ASSIGN:
>>>>>>>        {
>>>>>>>          gassign *a1 = as_a <gassign *> (s1);
>>>>>>>          gassign *a2 = as_a <gassign *> (s2);
>>>>>>>        lhs1 = gimple_assign_lhs (a1);
>>>>>>>        lhs2 = gimple_assign_lhs (a2);
>>>>>>>        if (TREE_CODE (lhs1) != SSA_NAME
>>>>>>>            && TREE_CODE (lhs2) != SSA_NAME)
>>>>>>>          return (operand_equal_p (lhs1, lhs2, 0)
>>>>>>>                  && gimple_operand_equal_value_p (gimple_assign_rhs1 (a1),
>>>>>>>                                                   gimple_assign_rhs1 (a2)));
>>>>>>>        else if (TREE_CODE (lhs1) == SSA_NAME
>>>>>>>                 && TREE_CODE (lhs2) == SSA_NAME)
>>>>>>>          return vn_valueize (lhs1) == vn_valueize (lhs2);
>>>>>>>        return false;
>>>>>>>        }
>>>>>>>
>>>>>>> instead.  That's the kind of changes I have expected and have approved of.
>>>>>> But even that looks like just adding extra work for all developers, with no
>>>>>> gain.  You only have to add extra code and extra temporaries, in switches
>>>>>> typically also have to add {} because of the temporaries and thus extra
>>>>>> indentation level, and it doesn't simplify anything in the code.
>>>>> The branch attempts to use the C++ typesystem to capture information
>>>>> about the kinds of gimple statement we expect, both:
>>>>>    (A) so that the compiler can detect type errors, and
>>>>>    (B) as a comprehension aid to the human reader of the code
>>>>>
>>>>> The ideal here is when function params and struct field can be
>>>>> strengthened from "gimple" to a subclass ptr.  This captures the
>>>>> knowledge that every use of a function or within a struct has a given
>>>>> gimple code.
>>>> I just don't like all the as_a/is_a stuff enforced everywhere,
>>>> it means more typing, more temporaries, more indentation.
>>>> So, as I view it, instead of the checks being done cheaply (yes, I think
>>>> the gimple checking as we have right now is very cheap) under the
>>>> hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those changes
>>>> put the burden on the developers, who has to check that manually through
>>>> the as_a/is_a stuff everywhere, more typing and uglier syntax.
>>>> I just don't see that as a step forward, instead a huge step backwards.
>>>> But perhaps I'm alone with this.
>>>> Can you e.g. compare the size of - lines in your patchset combined, and
>>>> size of + lines in your patchset?  As in, if your changes lead to less
>>>> typing or more.
>>> I see two ways out here.  One is to add overloads to all the functions
>>> taking the special types like
>>>
>>> tree
>>> gimple_assign_rhs1 (gimple *);
>>>
>>> or simply add
>>>
>>> gassign *operator ()(gimple *g) { return as_a <gassign *> (g); }
>>>
>>> into a gimple-compat.h header which you include in places that
>>> are not converted "nicely".
>> Thanks for the suggestions.
>>
>> Am I missing something, or is the gimple-compat.h idea above not valid C
>> ++?
>>
>> Note that "gimple" is still a typedef to
>>    gimple_statement_base *
>> (as noted before, the gimple -> gimple * change would break everyone
>> else's patches, so we talked about that as a followup patch for early
>> stage3).
>>
>> Given that, if I try to create an "operator ()" outside of a class, I
>> get this error:
>>
>> ‘gassign* operator()(gimple)’ must be a nonstatic member function
>>
>> which is emitted from cp/decl.c's grok_op_properties:
>>        /* An operator function must either be a non-static member function
>>           or have at least one parameter of a class, a reference to a class,
>>           an enumeration, or a reference to an enumeration.  13.4.0.6 */
>>
>> I tried making it a member function of gimple_statement_base, but that
>> doesn't work either: we want a conversion
>>    from a gimple_statement_base * to a gassign *, not
>>    from a gimple_statement_base   to a gassign *.
>>
>> Is there some syntactic trick here that I'm missing?  Sorry if I'm being
>> dumb (I can imagine there's a way of doing it by making "gimple" become
>> some kind of wrapped ptr class, but that way lies madness, surely).
> Hmm.
>
> struct assign;
> struct base {
>    operator assign *() const { return (assign *)this; }
> };
> struct assign : base {
> };
>
> void foo (assign *);
> void bar (base *b)
> {
>    foo (b);
> }
>
> doesn't work, but
>
> void bar (base &b)
> {
>    foo (b);
> }
>
> does.  Indeed C++ doesn't seem to provide what is necessary
> for the compat trick :(
>
> So the gimple-compat.h header would need to provide
> additional overloads for the affected functions like
>
> inline tree
> gimple_assign_rhs1 (gimple *g)
> {
>    return gimple_assign_rhs1 (as_a <gassign *> (g));
> }
>
> that would work for me as well.
>
>
Won't that defeat the desire for checking tho?  If you dont do a 
dyn_cast<>  in gimple_assign_rhs1 (gimple *g)  anyone can call it and 
upcast any kind of gimple into a gassign without checking that it is 
really a gassign...   Actually, a gcc_assert here may be sufficient... 
and better.

It'd be pretty easy to miss someone calling it when its not a gassign.

Andrew
Richard Biener Nov. 13, 2014, 2:34 p.m. UTC | #18
On Thu, Nov 13, 2014 at 3:24 PM, Andrew MacLeod <amacleod@redhat.com> wrote:
> On 11/13/2014 05:45 AM, Richard Biener wrote:
>>
>> On Thu, Nov 13, 2014 at 2:41 AM, David Malcolm <dmalcolm@redhat.com>
>> wrote:
>>>
>>> On Tue, 2014-11-11 at 11:43 +0100, Richard Biener wrote:
>>>>
>>>> On Tue, Nov 11, 2014 at 8:26 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>>>>
>>>>> On Mon, Nov 10, 2014 at 05:27:50PM -0500, David Malcolm wrote:
>>>>>>
>>>>>> On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
>>>>>>>
>>>>>>> On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
>>>>>>>>
>>>>>>>> To be constructive here - the above case is from within a
>>>>>>>> GIMPLE_ASSIGN case label
>>>>>>>> and thus I'd have expected
>>>>>>>>
>>>>>>>>      case GIMPLE_ASSIGN:
>>>>>>>>        {
>>>>>>>>          gassign *a1 = as_a <gassign *> (s1);
>>>>>>>>          gassign *a2 = as_a <gassign *> (s2);
>>>>>>>>        lhs1 = gimple_assign_lhs (a1);
>>>>>>>>        lhs2 = gimple_assign_lhs (a2);
>>>>>>>>        if (TREE_CODE (lhs1) != SSA_NAME
>>>>>>>>            && TREE_CODE (lhs2) != SSA_NAME)
>>>>>>>>          return (operand_equal_p (lhs1, lhs2, 0)
>>>>>>>>                  && gimple_operand_equal_value_p (gimple_assign_rhs1
>>>>>>>> (a1),
>>>>>>>>                                                   gimple_assign_rhs1
>>>>>>>> (a2)));
>>>>>>>>        else if (TREE_CODE (lhs1) == SSA_NAME
>>>>>>>>                 && TREE_CODE (lhs2) == SSA_NAME)
>>>>>>>>          return vn_valueize (lhs1) == vn_valueize (lhs2);
>>>>>>>>        return false;
>>>>>>>>        }
>>>>>>>>
>>>>>>>> instead.  That's the kind of changes I have expected and have
>>>>>>>> approved of.
>>>>>>>
>>>>>>> But even that looks like just adding extra work for all developers,
>>>>>>> with no
>>>>>>> gain.  You only have to add extra code and extra temporaries, in
>>>>>>> switches
>>>>>>> typically also have to add {} because of the temporaries and thus
>>>>>>> extra
>>>>>>> indentation level, and it doesn't simplify anything in the code.
>>>>>>
>>>>>> The branch attempts to use the C++ typesystem to capture information
>>>>>> about the kinds of gimple statement we expect, both:
>>>>>>    (A) so that the compiler can detect type errors, and
>>>>>>    (B) as a comprehension aid to the human reader of the code
>>>>>>
>>>>>> The ideal here is when function params and struct field can be
>>>>>> strengthened from "gimple" to a subclass ptr.  This captures the
>>>>>> knowledge that every use of a function or within a struct has a given
>>>>>> gimple code.
>>>>>
>>>>> I just don't like all the as_a/is_a stuff enforced everywhere,
>>>>> it means more typing, more temporaries, more indentation.
>>>>> So, as I view it, instead of the checks being done cheaply (yes, I
>>>>> think
>>>>> the gimple checking as we have right now is very cheap) under the
>>>>> hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those changes
>>>>> put the burden on the developers, who has to check that manually
>>>>> through
>>>>> the as_a/is_a stuff everywhere, more typing and uglier syntax.
>>>>> I just don't see that as a step forward, instead a huge step backwards.
>>>>> But perhaps I'm alone with this.
>>>>> Can you e.g. compare the size of - lines in your patchset combined, and
>>>>> size of + lines in your patchset?  As in, if your changes lead to less
>>>>> typing or more.
>>>>
>>>> I see two ways out here.  One is to add overloads to all the functions
>>>> taking the special types like
>>>>
>>>> tree
>>>> gimple_assign_rhs1 (gimple *);
>>>>
>>>> or simply add
>>>>
>>>> gassign *operator ()(gimple *g) { return as_a <gassign *> (g); }
>>>>
>>>> into a gimple-compat.h header which you include in places that
>>>> are not converted "nicely".
>>>
>>> Thanks for the suggestions.
>>>
>>> Am I missing something, or is the gimple-compat.h idea above not valid C
>>> ++?
>>>
>>> Note that "gimple" is still a typedef to
>>>    gimple_statement_base *
>>> (as noted before, the gimple -> gimple * change would break everyone
>>> else's patches, so we talked about that as a followup patch for early
>>> stage3).
>>>
>>> Given that, if I try to create an "operator ()" outside of a class, I
>>> get this error:
>>>
>>> ‘gassign* operator()(gimple)’ must be a nonstatic member function
>>>
>>> which is emitted from cp/decl.c's grok_op_properties:
>>>        /* An operator function must either be a non-static member
>>> function
>>>           or have at least one parameter of a class, a reference to a
>>> class,
>>>           an enumeration, or a reference to an enumeration.  13.4.0.6 */
>>>
>>> I tried making it a member function of gimple_statement_base, but that
>>> doesn't work either: we want a conversion
>>>    from a gimple_statement_base * to a gassign *, not
>>>    from a gimple_statement_base   to a gassign *.
>>>
>>> Is there some syntactic trick here that I'm missing?  Sorry if I'm being
>>> dumb (I can imagine there's a way of doing it by making "gimple" become
>>> some kind of wrapped ptr class, but that way lies madness, surely).
>>
>> Hmm.
>>
>> struct assign;
>> struct base {
>>    operator assign *() const { return (assign *)this; }
>> };
>> struct assign : base {
>> };
>>
>> void foo (assign *);
>> void bar (base *b)
>> {
>>    foo (b);
>> }
>>
>> doesn't work, but
>>
>> void bar (base &b)
>> {
>>    foo (b);
>> }
>>
>> does.  Indeed C++ doesn't seem to provide what is necessary
>> for the compat trick :(
>>
>> So the gimple-compat.h header would need to provide
>> additional overloads for the affected functions like
>>
>> inline tree
>> gimple_assign_rhs1 (gimple *g)
>> {
>>    return gimple_assign_rhs1 (as_a <gassign *> (g));
>> }
>>
>> that would work for me as well.
>>
>>
> Won't that defeat the desire for checking tho?  If you dont do a dyn_cast<>
> in gimple_assign_rhs1 (gimple *g)  anyone can call it and upcast any kind of
> gimple into a gassign without checking that it is really a gassign...
> Actually, a gcc_assert here may be sufficient... and better.
>
> It'd be pretty easy to miss someone calling it when its not a gassign.

as_a performs that checking.

Richard.

> Andrew
>
>
Andrew MacLeod Nov. 13, 2014, 3:17 p.m. UTC | #19
On 11/13/2014 09:34 AM, Richard Biener wrote:
> On Thu, Nov 13, 2014 at 3:24 PM, Andrew MacLeod <amacleod@redhat.com> wrote:
>> On 11/13/2014 05:45 AM, Richard Biener wrote:
>>> On Thu, Nov 13, 2014 at 2:41 AM, David Malcolm <dmalcolm@redhat.com>
>>> wrote:
>>>> On Tue, 2014-11-11 at 11:43 +0100, Richard Biener wrote:
>>>>> On Tue, Nov 11, 2014 at 8:26 AM, Jakub Jelinek <jakub@redhat.com> wrote:
>>>>>> On Mon, Nov 10, 2014 at 05:27:50PM -0500, David Malcolm wrote:
>>>>>>> On Sat, 2014-11-08 at 14:56 +0100, Jakub Jelinek wrote:
>>>>>>>> On Sat, Nov 08, 2014 at 01:07:28PM +0100, Richard Biener wrote:
>>>>>>>>> To be constructive here - the above case is from within a
>>>>>>>>> GIMPLE_ASSIGN case label
>>>>>>>>> and thus I'd have expected
>>>>>>>>>
>>>>>>>>>       case GIMPLE_ASSIGN:
>>>>>>>>>         {
>>>>>>>>>           gassign *a1 = as_a <gassign *> (s1);
>>>>>>>>>           gassign *a2 = as_a <gassign *> (s2);
>>>>>>>>>         lhs1 = gimple_assign_lhs (a1);
>>>>>>>>>         lhs2 = gimple_assign_lhs (a2);
>>>>>>>>>         if (TREE_CODE (lhs1) != SSA_NAME
>>>>>>>>>             && TREE_CODE (lhs2) != SSA_NAME)
>>>>>>>>>           return (operand_equal_p (lhs1, lhs2, 0)
>>>>>>>>>                   && gimple_operand_equal_value_p (gimple_assign_rhs1
>>>>>>>>> (a1),
>>>>>>>>>                                                    gimple_assign_rhs1
>>>>>>>>> (a2)));
>>>>>>>>>         else if (TREE_CODE (lhs1) == SSA_NAME
>>>>>>>>>                  && TREE_CODE (lhs2) == SSA_NAME)
>>>>>>>>>           return vn_valueize (lhs1) == vn_valueize (lhs2);
>>>>>>>>>         return false;
>>>>>>>>>         }
>>>>>>>>>
>>>>>>>>> instead.  That's the kind of changes I have expected and have
>>>>>>>>> approved of.
>>>>>>>> But even that looks like just adding extra work for all developers,
>>>>>>>> with no
>>>>>>>> gain.  You only have to add extra code and extra temporaries, in
>>>>>>>> switches
>>>>>>>> typically also have to add {} because of the temporaries and thus
>>>>>>>> extra
>>>>>>>> indentation level, and it doesn't simplify anything in the code.
>>>>>>> The branch attempts to use the C++ typesystem to capture information
>>>>>>> about the kinds of gimple statement we expect, both:
>>>>>>>     (A) so that the compiler can detect type errors, and
>>>>>>>     (B) as a comprehension aid to the human reader of the code
>>>>>>>
>>>>>>> The ideal here is when function params and struct field can be
>>>>>>> strengthened from "gimple" to a subclass ptr.  This captures the
>>>>>>> knowledge that every use of a function or within a struct has a given
>>>>>>> gimple code.
>>>>>> I just don't like all the as_a/is_a stuff enforced everywhere,
>>>>>> it means more typing, more temporaries, more indentation.
>>>>>> So, as I view it, instead of the checks being done cheaply (yes, I
>>>>>> think
>>>>>> the gimple checking as we have right now is very cheap) under the
>>>>>> hood by the accessors (gimple_assign_{lhs,rhs1} etc.), those changes
>>>>>> put the burden on the developers, who has to check that manually
>>>>>> through
>>>>>> the as_a/is_a stuff everywhere, more typing and uglier syntax.
>>>>>> I just don't see that as a step forward, instead a huge step backwards.
>>>>>> But perhaps I'm alone with this.
>>>>>> Can you e.g. compare the size of - lines in your patchset combined, and
>>>>>> size of + lines in your patchset?  As in, if your changes lead to less
>>>>>> typing or more.
>>>>> I see two ways out here.  One is to add overloads to all the functions
>>>>> taking the special types like
>>>>>
>>>>> tree
>>>>> gimple_assign_rhs1 (gimple *);
>>>>>
>>>>> or simply add
>>>>>
>>>>> gassign *operator ()(gimple *g) { return as_a <gassign *> (g); }
>>>>>
>>>>> into a gimple-compat.h header which you include in places that
>>>>> are not converted "nicely".
>>>> Thanks for the suggestions.
>>>>
>>>> Am I missing something, or is the gimple-compat.h idea above not valid C
>>>> ++?
>>>>
>>>> Note that "gimple" is still a typedef to
>>>>     gimple_statement_base *
>>>> (as noted before, the gimple -> gimple * change would break everyone
>>>> else's patches, so we talked about that as a followup patch for early
>>>> stage3).
>>>>
>>>> Given that, if I try to create an "operator ()" outside of a class, I
>>>> get this error:
>>>>
>>>> ‘gassign* operator()(gimple)’ must be a nonstatic member function
>>>>
>>>> which is emitted from cp/decl.c's grok_op_properties:
>>>>         /* An operator function must either be a non-static member
>>>> function
>>>>            or have at least one parameter of a class, a reference to a
>>>> class,
>>>>            an enumeration, or a reference to an enumeration.  13.4.0.6 */
>>>>
>>>> I tried making it a member function of gimple_statement_base, but that
>>>> doesn't work either: we want a conversion
>>>>     from a gimple_statement_base * to a gassign *, not
>>>>     from a gimple_statement_base   to a gassign *.
>>>>
>>>> Is there some syntactic trick here that I'm missing?  Sorry if I'm being
>>>> dumb (I can imagine there's a way of doing it by making "gimple" become
>>>> some kind of wrapped ptr class, but that way lies madness, surely).
>>> Hmm.
>>>
>>> struct assign;
>>> struct base {
>>>     operator assign *() const { return (assign *)this; }
>>> };
>>> struct assign : base {
>>> };
>>>
>>> void foo (assign *);
>>> void bar (base *b)
>>> {
>>>     foo (b);
>>> }
>>>
>>> doesn't work, but
>>>
>>> void bar (base &b)
>>> {
>>>     foo (b);
>>> }
>>>
>>> does.  Indeed C++ doesn't seem to provide what is necessary
>>> for the compat trick :(
>>>
>>> So the gimple-compat.h header would need to provide
>>> additional overloads for the affected functions like
>>>
>>> inline tree
>>> gimple_assign_rhs1 (gimple *g)
>>> {
>>>     return gimple_assign_rhs1 (as_a <gassign *> (g));
>>> }
>>>
>>> that would work for me as well.
>>>
>>>
>> Won't that defeat the desire for checking tho?  If you dont do a dyn_cast<>
>> in gimple_assign_rhs1 (gimple *g)  anyone can call it and upcast any kind of
>> gimple into a gassign without checking that it is really a gassign...
>> Actually, a gcc_assert here may be sufficient... and better.
>>
>> It'd be pretty easy to miss someone calling it when its not a gassign.
> as_a performs that checking.
>
> Richard.
>
>
Ah, so it does.  Nevermind :-)

Andrew
diff mbox

Patch

diff --git a/gcc/ChangeLog.gimple-classes b/gcc/ChangeLog.gimple-classes
index f43df63..0bd0421 100644
--- a/gcc/ChangeLog.gimple-classes
+++ b/gcc/ChangeLog.gimple-classes
@@ -1,5 +1,10 @@ 
 2014-11-06  David Malcolm  <dmalcolm@redhat.com>
 
+	* tree-ssa-tail-merge.c (same_succ_hash): Add checked cast.
+	(gimple_equal_p): Add checked casts.
+
+2014-11-06  David Malcolm  <dmalcolm@redhat.com>
+
 	* tree-ssa-structalias.c (find_func_aliases): Replace
 	is_gimple_assign with a dyn_cast, introducing local gassign *
 	"t_assign", using it in place of "t" for typesafety.
diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
index 5678657..b822214 100644
--- a/gcc/tree-ssa-tail-merge.c
+++ b/gcc/tree-ssa-tail-merge.c
@@ -484,7 +484,7 @@  same_succ_hash (const_same_succ e)
 
       hstate.add_int (gimple_code (stmt));
       if (is_gimple_assign (stmt))
-	hstate.add_int (gimple_assign_rhs_code (stmt));
+	hstate.add_int (gimple_assign_rhs_code (as_a <gassign *> (stmt)));
       if (!is_gimple_call (stmt))
 	continue;
       if (gimple_call_internal_p (stmt))
@@ -1172,8 +1172,10 @@  gimple_equal_p (same_succ same_succ, gimple s1, gimple s2)
       if (TREE_CODE (lhs1) != SSA_NAME
 	  && TREE_CODE (lhs2) != SSA_NAME)
 	return (operand_equal_p (lhs1, lhs2, 0)
-		&& gimple_operand_equal_value_p (gimple_assign_rhs1 (s1),
-						 gimple_assign_rhs1 (s2)));
+		&& gimple_operand_equal_value_p (gimple_assign_rhs1 (
+						   as_a <gassign *> (s1)),
+						 gimple_assign_rhs1 (
+						   as_a <gassign *> (s2))));
       else if (TREE_CODE (lhs1) == SSA_NAME
 	       && TREE_CODE (lhs2) == SSA_NAME)
 	return vn_valueize (lhs1) == vn_valueize (lhs2);