diff mbox

move increase_alignment from simple to regular ipa pass

Message ID CAAgBjMmzvQG5GLsWsX86cP2c=ZcRJXU+iRzho0RcNy_ej+WVvw@mail.gmail.com
State New

Commit Message

Prathamesh Kulkarni June 9, 2016, 8:18 p.m. UTC
On 8 June 2016 at 20:38, Jan Hubicka <hubicka@ucw.cz> wrote:
>> I think it would be nice to work towards transitioning
>> flag_section_anchors to a flag on varpool nodes, thereby removing
>> the Optimization flag from common.opt:fsection-anchors
>>
>> That would simplify the walk over varpool candidates.
>
> Makes sense to me, too. There are more candidates for stuff that should be
> variable specific in common.opt (such as variable alignment, -fdata-sections,
> -fmerge-constants) and targets.  We may try to do it in an easy to extend way
> so incrementally we can get rid of those global flags, too.
In this version I removed Optimization from the fsection-anchors entry in
common.opt and gated the increase_alignment pass on flag_section_anchors != 0.
Cross-tested on arm*-*-* and aarch64*-*-*.
Does it look OK?
>
> One thing that needs to be done for LTO is sane merging, I guess in this case
> it is clear that the variable should be anchored when its prevailing definition
> is.
Um, could we determine during WPA if a symbol is a section anchor for merging?
It seems to me SYMBOL_REF_ANCHOR_P is defined only on DECL_RTL and not at
tree level.
Do we have DECL_RTL info available during WPA?

Thanks,
Prathamesh
>
> Honza
>>
>> Richard.
>>
>> > Thanks,
>> > Prathamesh
>> > >
>> > > Honza
>> > >>
>> > >> Richard.
>> >
>> >
>>
>> --
>> Richard Biener <rguenther@suse.de>
>> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

Comments

Jan Hubicka June 9, 2016, 8:23 p.m. UTC | #1
> On 8 June 2016 at 20:38, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> I think it would be nice to work towards transitioning
> >> flag_section_anchors to a flag on varpool nodes, thereby removing
> >> the Optimization flag from common.opt:fsection-anchors
> >>
> >> That would simplify the walk over varpool candidates.
> >
> > Makes sense to me, too. There are more candidates for stuff that should be
> > variable specific in common.opt (such as variable alignment, -fdata-sections,
> > -fmerge-constants) and targets.  We may try to do it in an easy to extend way
> > so incrementally we can get rid of those global flags, too.
> In this version I removed Optimization from the fsection-anchors entry in
> common.opt and gated the increase_alignment pass on flag_section_anchors != 0.
> Cross-tested on arm*-*-* and aarch64*-*-*.
> Does it look OK?

If you go this way you will need to do something sane for LTO.  Here one can compile
some object files with -fsection-anchors and others without, and link with a random
setting (because in traditional compilation link-time flags do not matter).

For global flags we have magic in merge_and_complain that determines which flags to pass
to the LTO compiler.
It is not very robust, though.
> >
> > One thing that needs to be done for LTO is sane merging, I guess in this case
> > it is clear that the variable should be anchored when its prevailing definition
> > is.
> Um, could we determine during WPA if a symbol is a section anchor for merging?
> It seems to me SYMBOL_REF_ANCHOR_P is defined only on DECL_RTL and not at
> tree level.
> Do we have DECL_RTL info available during WPA?

We don't have anchors computed, but we can decide whether we want to potentially
anchor the variable if we can.

I would say all you need is to have a section_anchor flag in the varpool node itself
which controls RTL production. At varpool_finalize_decl you will set it
according to flag_varpool and stream it to LTO objects. At WPA when doing
linking, the section_anchor flag of the prevailing decl wins.
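
The "prevailing decl wins" rule can be sketched outside of GCC as follows. This is only a minimal model under stated assumptions: VarDef and merged_section_anchor are hypothetical names, not GCC's varpool_node or its symbol-merging code, and returning false when no definition prevails is an assumption of the sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one definition of a variable, as seen in one
// LTO object file.  This is NOT GCC's varpool_node.
struct VarDef {
  std::string object_file;
  bool section_anchor;  // per-variable flag recorded when the object was compiled
  bool prevailing;      // true for the definition chosen by symbol resolution
};

// Return the section_anchor flag of the merged symbol: the flag of the
// prevailing definition wins, whatever the other definitions say.
bool merged_section_anchor(const std::vector<VarDef> &defs) {
  for (const VarDef &d : defs)
    if (d.prevailing)
      return d.section_anchor;
  // Assumption of this sketch: with no prevailing definition, do not anchor.
  return false;
}
```

With this rule the result depends only on which definition prevails, so the link-time setting of -fsection-anchors no longer matters for the variable.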

Honza

> diff --git a/gcc/common.opt b/gcc/common.opt
> index f0d7196..f93f26c 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2133,7 +2133,7 @@ Common Report Var(flag_sched_dep_count_heuristic) Init(1) Optimization
>  Enable the dependent count heuristic in the scheduler.
>  
>  fsection-anchors
> -Common Report Var(flag_section_anchors) Optimization
> +Common Report Var(flag_section_anchors)
>  Access data in the same section from shared anchor points.
>  
>  fsee
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 3647e90..3a8063c 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -138,12 +138,12 @@ along with GCC; see the file COPYING3.  If not see
>    PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile)
>        NEXT_PASS (pass_feedback_split_functions);
>    POP_INSERT_PASSES ()
> -  NEXT_PASS (pass_ipa_increase_alignment);
>    NEXT_PASS (pass_ipa_tm);
>    NEXT_PASS (pass_ipa_lower_emutls);
>    TERMINATE_PASS_LIST (all_small_ipa_passes)
>  
>    INSERT_PASSES_AFTER (all_regular_ipa_passes)
> +  NEXT_PASS (pass_ipa_increase_alignment);
>    NEXT_PASS (pass_ipa_whole_program_visibility);
>    NEXT_PASS (pass_ipa_profile);
>    NEXT_PASS (pass_ipa_icf);
> diff --git a/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c
> new file mode 100644
> index 0000000..74eaed8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c
> @@ -0,0 +1,25 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target section_anchors } */
> +/* { dg-require-effective-target vect_int } */
> +
> +#define N 32
> +
> +/* Clone of section-anchors-vect-70.c with foo() having -fno-tree-loop-vectorize.  */ 
> +
> +static struct A {
> +  int p1, p2;
> +  int e[N];
> +} a, b, c;
> +
> +__attribute__((optimize("-fno-tree-loop-vectorize")))
> +int foo(void)
> +{
> +  for (int i = 0; i < N; i++)
> +    a.e[i] = b.e[i] + c.e[i];
> +
> +   return a.e[0];
> +}
> +
> +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target aarch64*-*-* } } } */
> +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target powerpc64*-*-* } } } */
> +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target arm*-*-* } } } */
> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
> index 36299a6..d36aa1d 100644
> --- a/gcc/tree-pass.h
> +++ b/gcc/tree-pass.h
> @@ -483,7 +483,7 @@ extern simple_ipa_opt_pass *make_pass_local_optimization_passes (gcc::context *c
>  
>  extern ipa_opt_pass_d *make_pass_ipa_whole_program_visibility (gcc::context
>  							       *ctxt);
> -extern simple_ipa_opt_pass *make_pass_ipa_increase_alignment (gcc::context
> +extern ipa_opt_pass_d *make_pass_ipa_increase_alignment (gcc::context
>  							      *ctxt);
>  extern ipa_opt_pass_d *make_pass_ipa_inline (gcc::context *ctxt);
>  extern simple_ipa_opt_pass *make_pass_ipa_free_lang_data (gcc::context *ctxt);
> diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
> index 2669813..d34e560 100644
> --- a/gcc/tree-vectorizer.c
> +++ b/gcc/tree-vectorizer.c
> @@ -899,6 +899,34 @@ get_vec_alignment_for_type (tree type)
>    return (alignment > TYPE_ALIGN (type)) ? alignment : 0;
>  }
>  
> +/* Return true if alignment should be increased for this vnode.
> +   This is done if every function that references/referring to vnode
> +   has flag_tree_loop_vectorize and flag_section_anchors set.  */
> +
> +static bool
> +increase_alignment_p (varpool_node *vnode)
> +{
> +  ipa_ref *ref;
> +
> +  for (int i = 0; vnode->iterate_reference (i, ref); i++)
> +    if (cgraph_node *cnode = dyn_cast<cgraph_node *> (ref->referred))
> +      {
> +	struct cl_optimization *opts = opts_for_fn (cnode->decl);
> +	if (!opts->x_flag_tree_loop_vectorize)
> +	  return false;
> +      }
> +
> +  for (int i = 0; vnode->iterate_referring (i, ref); i++)
> +    if (cgraph_node *cnode = dyn_cast<cgraph_node *> (ref->referring))
> +      {
> +	struct cl_optimization *opts = opts_for_fn (cnode->decl);
> +	if (!opts->x_flag_tree_loop_vectorize)
> +	  return false;
> +      }
> +
> +  return true;
> +}
> +
>  /* Entry point to increase_alignment pass.  */
>  static unsigned int
>  increase_alignment (void)
> @@ -916,7 +944,8 @@ increase_alignment (void)
>  
>        if ((decl_in_symtab_p (decl)
>  	  && !symtab_node::get (decl)->can_increase_alignment_p ())
> -	  || DECL_USER_ALIGN (decl) || DECL_ARTIFICIAL (decl))
> +	  || DECL_USER_ALIGN (decl) || DECL_ARTIFICIAL (decl)
> +	  || !increase_alignment_p (vnode))
>  	continue;
>  
>        alignment = get_vec_alignment_for_type (TREE_TYPE (decl));
> @@ -938,7 +967,7 @@ namespace {
>  
>  const pass_data pass_data_ipa_increase_alignment =
>  {
> -  SIMPLE_IPA_PASS, /* type */
> +  IPA_PASS, /* type */
>    "increase_alignment", /* name */
>    OPTGROUP_LOOP | OPTGROUP_VEC, /* optinfo_flags */
>    TV_IPA_OPT, /* tv_id */
> @@ -949,17 +978,26 @@ const pass_data pass_data_ipa_increase_alignment =
>    0, /* todo_flags_finish */
>  };
>  
> -class pass_ipa_increase_alignment : public simple_ipa_opt_pass
> +class pass_ipa_increase_alignment : public ipa_opt_pass_d
>  {
>  public:
>    pass_ipa_increase_alignment (gcc::context *ctxt)
> -    : simple_ipa_opt_pass (pass_data_ipa_increase_alignment, ctxt)
> +    : ipa_opt_pass_d (pass_data_ipa_increase_alignment, ctxt,
> +			   NULL, /* generate_summary  */
> +			   NULL, /* write summary  */
> +			   NULL, /* read summary  */
> +			   NULL, /* write optimization summary  */
> +			   NULL, /* read optimization summary  */
> +			   NULL, /* stmt fixup  */
> +			   0, /* function_transform_todo_flags_start  */
> +			   NULL, /* transform function  */
> +			   NULL )/* variable transform  */
>    {}
>  
>    /* opt_pass methods: */
>    virtual bool gate (function *)
>      {
> -      return flag_section_anchors && flag_tree_loop_vectorize;
> +      return flag_section_anchors != 0; 
>      }
>  
>    virtual unsigned int execute (function *) { return increase_alignment (); }
> @@ -968,7 +1006,7 @@ public:
>  
>  } // anon namespace
>  
> -simple_ipa_opt_pass *
> +ipa_opt_pass_d *
>  make_pass_ipa_increase_alignment (gcc::context *ctxt)
>  {
>    return new pass_ipa_increase_alignment (ctxt);
Prathamesh Kulkarni June 10, 2016, 9:33 a.m. UTC | #2
On 10 June 2016 at 01:53, Jan Hubicka <hubicka@ucw.cz> wrote:
>> On 8 June 2016 at 20:38, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> I think it would be nice to work towards transitioning
>> >> flag_section_anchors to a flag on varpool nodes, thereby removing
>> >> the Optimization flag from common.opt:fsection-anchors
>> >>
>> >> That would simplify the walk over varpool candidates.
>> >
>> > Makes sense to me, too. There are more candidates for stuff that should be
>> > variable specific in common.opt (such as variable alignment, -fdata-sections,
>> > -fmerge-constants) and targets.  We may try to do it in an easy to extend way
>> > so incrementally we can get rid of those global flags, too.
>> In this version I removed Optimization from the fsection-anchors entry in
>> common.opt and gated the increase_alignment pass on flag_section_anchors != 0.
>> Cross-tested on arm*-*-* and aarch64*-*-*.
>> Does it look OK?
>
> If you go this way you will need to do something sane for LTO.  Here one can compile
> some object files with -fsection-anchors and others without, and link with a random
> setting (because in traditional compilation link-time flags do not matter).
>
> For global flags we have magic in merge_and_complain that determines which flags to pass
> to the LTO compiler.
> It is not very robust, though.
>> >
>> > One thing that needs to be done for LTO is sane merging, I guess in this case
>> > it is clear that the variable should be anchored when its prevailing definition
>> > is.
>> Um, could we determine during WPA if a symbol is a section anchor for merging?
>> It seems to me SYMBOL_REF_ANCHOR_P is defined only on DECL_RTL and not at
>> tree level.
>> Do we have DECL_RTL info available during WPA?
>
> We don't have anchors computed, but we can decide whether we want to potentially
> anchor the variable if we can.
>
> I would say all you need is to have a section_anchor flag in the varpool node itself
> which controls RTL production. At varpool_finalize_decl you will set it
> according to flag_varpool and stream it to LTO objects. At WPA when doing
> linking, the section_anchor flag of the prevailing decl wins.
Thanks for the suggestions.
IIUC, we want to add a new section_anchor flag to the varpool_node class,
set it in varpool_node::finalize_decl and stream it to LTO byte-code,
and then during WPA set the section_anchor flag during symbol merging if it is set
for the prevailing decl.
In the increase_alignment pass, if a vnode has the section_anchor flag set,
we will walk all functions that reference it to check whether they have
-ftree-loop-vectorize set.
Is that correct?
Could you please elaborate a bit more on "at varpool_finalize_decl you will
set section_anchor flag according to flag_varpool"?
flag_varpool doesn't appear to be defined.
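
To make the intended check concrete, here is a small standalone model of the walk described above. It is a sketch only: FnModel, VarModel and should_increase_alignment are hypothetical stand-ins, not GCC types, and the tree_loop_vectorize member merely models what opts_for_fn (decl)->x_flag_tree_loop_vectorize reports in the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a function and its per-function options.
struct FnModel {
  std::string name;
  bool tree_loop_vectorize;  // models x_flag_tree_loop_vectorize for this function
};

// Hypothetical stand-in for a varpool node.
struct VarModel {
  std::string name;
  bool section_anchor;                 // the proposed per-variable flag
  std::vector<const FnModel *> users;  // functions that refer to the variable
};

// Return true only if the variable may be anchored and every function
// that refers to it has loop vectorization enabled.
bool should_increase_alignment(const VarModel &var) {
  if (!var.section_anchor)
    return false;
  for (const FnModel *fn : var.users)
    if (!fn->tree_loop_vectorize)
      return false;  // one non-vectorized user is enough to refuse
  return true;
}
```

Note that in this model a variable with the flag set and no referring functions passes the check vacuously; whether that is the desired behaviour in the real pass is a separate question.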

Thanks,
Prathamesh
Richard Biener June 10, 2016, 11:17 a.m. UTC | #3
On Fri, 10 Jun 2016, Prathamesh Kulkarni wrote:

> On 10 June 2016 at 01:53, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> On 8 June 2016 at 20:38, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> >> I think it would be nice to work towards transitioning
> >> >> flag_section_anchors to a flag on varpool nodes, thereby removing
> >> >> the Optimization flag from common.opt:fsection-anchors
> >> >>
> >> >> That would simplify the walk over varpool candidates.
> >> >
> >> > Makes sense to me, too. There are more candidates for stuff that should be
> >> > variable specific in common.opt (such as variable alignment, -fdata-sections,
> >> > -fmerge-constants) and targets.  We may try to do it in an easy to extend way
> >> > so incrementally we can get rid of those global flags, too.
> >> In this version I removed Optimization from the fsection-anchors entry in
> >> common.opt and gated the increase_alignment pass on flag_section_anchors != 0.
> >> Cross-tested on arm*-*-* and aarch64*-*-*.
> >> Does it look OK?
> >
> > If you go this way you will need to do something sane for LTO.  Here one can compile
> > some object files with -fsection-anchors and others without, and link with a random
> > setting (because in traditional compilation link-time flags do not matter).
> >
> > For global flags we have magic in merge_and_complain that determines which flags to pass
> > to the LTO compiler.
> > It is not very robust, though.
> >> >
> >> > One thing that needs to be done for LTO is sane merging, I guess in this case
> >> > it is clear that the variable should be anchored when its prevailing definition
> >> > is.
> >> Um, could we determine during WPA if a symbol is a section anchor for merging?
> >> It seems to me SYMBOL_REF_ANCHOR_P is defined only on DECL_RTL and not at
> >> tree level.
> >> Do we have DECL_RTL info available during WPA?
> >
> > We don't have anchors computed, but we can decide whether we want to potentially
> > anchor the variable if we can.
> >
> > I would say all you need is to have a section_anchor flag in the varpool node itself
> > which controls RTL production. At varpool_finalize_decl you will set it
> > according to flag_varpool and stream it to LTO objects. At WPA when doing
> > linking, the section_anchor flag of the prevailing decl wins.
> Thanks for the suggestions.
> IIUC, we want to add a new section_anchor flag to the varpool_node class,
> set it in varpool_node::finalize_decl and stream it to LTO byte-code,
> and then during WPA set the section_anchor flag during symbol merging if it is set
> for the prevailing decl.

Yes.

> In the increase_alignment pass, if a vnode has the section_anchor flag set,
> we will walk all functions that reference it to check whether they have
> -ftree-loop-vectorize set.
> Is that correct?

Yes.

> Could you please elaborate a bit more on "at varpool_finalize_decl you will
> set section_anchor flag according to flag_varpool"?
> flag_varpool doesn't appear to be defined.

flag_section_anchors.
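
That is, the per-variable flag is seeded from the global option when the variable is finalized, and then travels with the variable through the LTO stream. A minimal standalone sketch of that flow, where VarFlags, finalize_var, stream_out and stream_in are hypothetical names for illustration and not the actual varpool or LTO streamer interfaces:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-variable flags; NOT GCC's varpool_node.
struct VarFlags {
  bool section_anchor = false;
};

// Model of finalization: copy the command-line setting in effect for
// this translation unit into the variable itself.
VarFlags finalize_var(bool flag_section_anchors) {
  VarFlags f;
  f.section_anchor = flag_section_anchors;
  return f;
}

// Model of streaming out: pack the flag into the low bit of a byte.
std::uint8_t stream_out(const VarFlags &f) {
  return f.section_anchor ? 1u : 0u;
}

// Model of streaming in: recover the flag from the low bit.
VarFlags stream_in(std::uint8_t bits) {
  VarFlags f;
  f.section_anchor = (bits & 1u) != 0;
  return f;
}
```

Once the flag has been read back, later stages can consult the variable rather than the global option, which is the point of moving the setting onto the node.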

Richard.


Patch

diff --git a/gcc/common.opt b/gcc/common.opt
index f0d7196..f93f26c 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2133,7 +2133,7 @@  Common Report Var(flag_sched_dep_count_heuristic) Init(1) Optimization
 Enable the dependent count heuristic in the scheduler.
 
 fsection-anchors
-Common Report Var(flag_section_anchors) Optimization
+Common Report Var(flag_section_anchors)
 Access data in the same section from shared anchor points.
 
 fsee
diff --git a/gcc/passes.def b/gcc/passes.def
index 3647e90..3a8063c 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -138,12 +138,12 @@  along with GCC; see the file COPYING3.  If not see
   PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile)
       NEXT_PASS (pass_feedback_split_functions);
   POP_INSERT_PASSES ()
-  NEXT_PASS (pass_ipa_increase_alignment);
   NEXT_PASS (pass_ipa_tm);
   NEXT_PASS (pass_ipa_lower_emutls);
   TERMINATE_PASS_LIST (all_small_ipa_passes)
 
   INSERT_PASSES_AFTER (all_regular_ipa_passes)
+  NEXT_PASS (pass_ipa_increase_alignment);
   NEXT_PASS (pass_ipa_whole_program_visibility);
   NEXT_PASS (pass_ipa_profile);
   NEXT_PASS (pass_ipa_icf);
diff --git a/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c
new file mode 100644
index 0000000..74eaed8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c
@@ -0,0 +1,25 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target section_anchors } */
+/* { dg-require-effective-target vect_int } */
+
+#define N 32
+
+/* Clone of section-anchors-vect-70.c with foo() having -fno-tree-loop-vectorize.  */
+
+static struct A {
+  int p1, p2;
+  int e[N];
+} a, b, c;
+
+__attribute__((optimize("-fno-tree-loop-vectorize")))
+int foo(void)
+{
+  for (int i = 0; i < N; i++)
+    a.e[i] = b.e[i] + c.e[i];
+
+  return a.e[0];
+}
+
+/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target aarch64*-*-* } } } */
+/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target powerpc64*-*-* } } } */
+/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target arm*-*-* } } } */
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 36299a6..d36aa1d 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -483,7 +483,7 @@  extern simple_ipa_opt_pass *make_pass_local_optimization_passes (gcc::context *c
 
 extern ipa_opt_pass_d *make_pass_ipa_whole_program_visibility (gcc::context
 							       *ctxt);
-extern simple_ipa_opt_pass *make_pass_ipa_increase_alignment (gcc::context
+extern ipa_opt_pass_d *make_pass_ipa_increase_alignment (gcc::context
 							      *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_inline (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_ipa_free_lang_data (gcc::context *ctxt);
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index 2669813..d34e560 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -899,6 +899,34 @@  get_vec_alignment_for_type (tree type)
   return (alignment > TYPE_ALIGN (type)) ? alignment : 0;
 }
 
+/* Return true if alignment should be increased for this vnode.
+   This is the case only if every function that vnode references or
+   that refers to vnode has flag_tree_loop_vectorize set.  */
+
+static bool
+increase_alignment_p (varpool_node *vnode)
+{
+  ipa_ref *ref;
+
+  for (int i = 0; vnode->iterate_reference (i, ref); i++)
+    if (cgraph_node *cnode = dyn_cast<cgraph_node *> (ref->referred))
+      {
+	struct cl_optimization *opts = opts_for_fn (cnode->decl);
+	if (!opts->x_flag_tree_loop_vectorize)
+	  return false;
+      }
+
+  for (int i = 0; vnode->iterate_referring (i, ref); i++)
+    if (cgraph_node *cnode = dyn_cast<cgraph_node *> (ref->referring))
+      {
+	struct cl_optimization *opts = opts_for_fn (cnode->decl);
+	if (!opts->x_flag_tree_loop_vectorize)
+	  return false;
+      }
+
+  return true;
+}
+
 /* Entry point to increase_alignment pass.  */
 static unsigned int
 increase_alignment (void)
@@ -916,7 +944,8 @@  increase_alignment (void)
 
       if ((decl_in_symtab_p (decl)
 	  && !symtab_node::get (decl)->can_increase_alignment_p ())
-	  || DECL_USER_ALIGN (decl) || DECL_ARTIFICIAL (decl))
+	  || DECL_USER_ALIGN (decl) || DECL_ARTIFICIAL (decl)
+	  || !increase_alignment_p (vnode))
 	continue;
 
       alignment = get_vec_alignment_for_type (TREE_TYPE (decl));
@@ -938,7 +967,7 @@  namespace {
 
 const pass_data pass_data_ipa_increase_alignment =
 {
-  SIMPLE_IPA_PASS, /* type */
+  IPA_PASS, /* type */
   "increase_alignment", /* name */
   OPTGROUP_LOOP | OPTGROUP_VEC, /* optinfo_flags */
   TV_IPA_OPT, /* tv_id */
@@ -949,17 +978,26 @@  const pass_data pass_data_ipa_increase_alignment =
   0, /* todo_flags_finish */
 };
 
-class pass_ipa_increase_alignment : public simple_ipa_opt_pass
+class pass_ipa_increase_alignment : public ipa_opt_pass_d
 {
 public:
   pass_ipa_increase_alignment (gcc::context *ctxt)
-    : simple_ipa_opt_pass (pass_data_ipa_increase_alignment, ctxt)
+    : ipa_opt_pass_d (pass_data_ipa_increase_alignment, ctxt,
+			   NULL, /* generate_summary  */
+			   NULL, /* write summary  */
+			   NULL, /* read summary  */
+			   NULL, /* write optimization summary  */
+			   NULL, /* read optimization summary  */
+			   NULL, /* stmt fixup  */
+			   0, /* function_transform_todo_flags_start  */
+			   NULL, /* transform function  */
+			   NULL) /* variable transform  */
   {}
 
   /* opt_pass methods: */
   virtual bool gate (function *)
     {
-      return flag_section_anchors && flag_tree_loop_vectorize;
+      return flag_section_anchors != 0;
     }
 
   virtual unsigned int execute (function *) { return increase_alignment (); }
@@ -968,7 +1006,7 @@  public:
 
 } // anon namespace
 
-simple_ipa_opt_pass *
+ipa_opt_pass_d *
 make_pass_ipa_increase_alignment (gcc::context *ctxt)
 {
   return new pass_ipa_increase_alignment (ctxt);
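
For readers following the thread, the per-variable check added in
increase_alignment_p can be illustrated outside GCC with a toy model.
This is only a sketch under stated assumptions: toy_function,
toy_variable and toy_increase_alignment_p are hypothetical names, not
GCC's symtab API, and the model only tracks the functions that refer to
a variable.

```cpp
#include <vector>

// Toy stand-in for a function's optimization options (hypothetical;
// the patch itself reads them via opts_for_fn (cnode->decl)).
struct toy_function
{
  bool tree_loop_vectorize;
};

// Toy stand-in for a varpool node: only the functions referring to it.
struct toy_variable
{
  std::vector<const toy_function *> referring;
};

// Same idea as increase_alignment_p: one referring function with loop
// vectorization disabled vetoes the alignment increase.
static bool
toy_increase_alignment_p (const toy_variable &var)
{
  for (const toy_function *fn : var.referring)
    if (!fn->tree_loop_vectorize)
      return false;
  return true;
}
```

In this model, a variable referred to only by a function built with
-fno-tree-loop-vectorize (as foo is for a, b and c in the new test) is
never a candidate, which matches the test's expectation of zero
"Increasing alignment of decl" lines in the dump.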