move increase_alignment from simple to regular ipa pass

Message ID CAAgBjMnu_tV+8fc5mqfkM2hr7sjvuuVAyHV7nDyRZQv4bX2Zgg@mail.gmail.com
State New

Commit Message

Prathamesh Kulkarni June 13, 2016, 8:57 a.m. UTC
On 10 June 2016 at 16:47, Richard Biener <rguenther@suse.de> wrote:
> On Fri, 10 Jun 2016, Prathamesh Kulkarni wrote:
>
>> On 10 June 2016 at 01:53, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> On 8 June 2016 at 20:38, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> >> I think it would be nice to work towards transitioning
>> >> >> flag_section_anchors to a flag on varpool nodes, thereby removing
>> >> >> the Optimization flag from common.opt:fsection-anchors
>> >> >>
>> >> >> That would simplify the walk over varpool candidates.
>> >> >
>> >> > Makes sense to me, too. There are more candidates for stuff that should be
>> >> > variable specific in common.opt (such as variable alignment, -fdata-sections,
>> >> > -fmerge-constants) and targets.  We may try to do it in an easy to extend way
>> >> > so incrementally we can get rid of those global flags, too.
>> >> In this version I removed Optimization from fsection-anchors entry in
>> >> common.opt,
>> >> and gated the increase_alignment pass on flag_section_anchors != 0.
>> >> Cross tested on arm*-*-*, aarch64*-*-*.
>> >> Does it look OK ?
>> >
>> > If you go this way you will need to do something sane for LTO.  Here one can compile
>> > some object files with -fsection-anchors and others without and link with a random setting
>> > (because in traditional compilation link-time flags do not matter).
>> >
>> > For global flags we have magic in merge_and_complain that determines flags to pass
>> > to the LTO compiler.
>> > It is not very robust though.
>> >> >
>> >> > One thing that needs to be done for LTO is sane merging, I guess in this case
>> >> > it is clear that the variable should be anchored when its prevailing definition
>> >> > is.
>> >> Um could we determine during WPA if symbol is a section anchor for merging ?
>> >> Seems to me SYMBOL_REF_ANCHOR_P is defined only on DECL_RTL and not at
>> >> tree level.
>> >> Do we have DECL_RTL info available during WPA ?
>> >
>> > We don't have anchors computed, but we can decide whether we want to potentially
>> > anchor the variable if we can.
>> >
>> > I would say all you need is to have section_anchor flag in varpool node itself
>> > which controls RTL production. At varpool_finalize_decl you will set it
>> > according to flag_varpool and stream it to LTO objects. At WPA when doing
>> > linking, the section_anchor flag of the prevailing decl wins.
>> Thanks for the suggestions.
>> IIUC, we want to add new section_anchor flag to varpool_node class
>> and set it in varpool_node::finalize_decl and stream it to LTO byte-code,
>> and then during WPA set section_anchor_flag during symbol merging if it is set
>> for prevailing decl.
>
> Yes.
>
>> In the increase_alignment_pass if a vnode has section_anchor flag set,
>> we will walk all functions that reference it to check if they have
>> -ftree-loop-vectorize set.
>> Is that correct ?
>
> Yes.
>
>> Could you please elaborate a bit more on "at varpool_finalize_decl you will
>> set section_anchor flag according to flag_varpool" ?
>> flag_varpool doesn't appear to be defined.
>
> flag_section_anchors.
Hi,
I have made these changes in this version.
In varpool_node::finalize_decl,
I just set vnode->section_anchor = flag_section_anchors.
Should that be sufficient ?
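
For readers following along: setting the flag at finalization only reaches WPA
if the bit is also streamed, and the writer and reader must pack and unpack
their fields in the same order. Below is a minimal standalone sketch of that
round trip. Every name in it (ToyBitpack, toy_pack, toy_unpack, ToyVarNode,
toy_write_node, toy_read_node) is hypothetical and only models the idea; it is
not GCC's bitpack API.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a bit-pack stream. Hypothetical; NOT GCC's bitpack_d.
struct ToyBitpack {
  std::uint64_t word = 0;  // packed bits, first field in the lowest bits
  unsigned pos = 0;        // next bit position to write or read
};

// Append the low NBITS bits of VAL to the stream.
inline void toy_pack(ToyBitpack &bp, std::uint64_t val, unsigned nbits) {
  assert(nbits > 0 && nbits < 64 && bp.pos + nbits <= 64);
  const std::uint64_t mask = (std::uint64_t{1} << nbits) - 1;
  bp.word |= (val & mask) << bp.pos;
  bp.pos += nbits;
}

// Read the next NBITS bits from the stream.
inline std::uint64_t toy_unpack(ToyBitpack &bp, unsigned nbits) {
  assert(nbits > 0 && nbits < 64 && bp.pos + nbits <= 64);
  const std::uint64_t mask = (std::uint64_t{1} << nbits) - 1;
  const std::uint64_t val = (bp.word >> bp.pos) & mask;
  bp.pos += nbits;
  return val;
}

// Toy varpool node carrying only the fields this sketch streams.
struct ToyVarNode {
  unsigned tls_model : 3;
  unsigned used_by_single_function : 1;
  unsigned section_anchor : 1;  // the new per-variable flag
};

// Writer side: the field order here defines the on-disk layout.
inline ToyBitpack toy_write_node(const ToyVarNode &n) {
  ToyBitpack bp;
  toy_pack(bp, n.tls_model, 3);
  toy_pack(bp, n.used_by_single_function, 1);
  toy_pack(bp, n.section_anchor, 1);
  return bp;
}

// Reader side: must unpack in exactly the writer's order and widths.
inline ToyVarNode toy_read_node(ToyBitpack bp) {
  bp.pos = 0;  // rewind before reading
  ToyVarNode n{};
  n.tls_model = static_cast<unsigned>(toy_unpack(bp, 3));
  n.used_by_single_function = static_cast<unsigned>(toy_unpack(bp, 1));
  n.section_anchor = static_cast<unsigned>(toy_unpack(bp, 1));
  return n;
}
```

Passing a node through toy_write_node and then toy_read_node should return the
same field values, which is the property the two lto-cgraph.c hunks in the
patch have to preserve.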

I tried a couple of test cases, once with prevailing->section_anchor == 1
and once with entry->section_anchor == 1, and in both cases
prevailing->section_anchor took precedence.
So I wonder if the change to lto_symtab_merge () in the patch is necessary ?
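
To make the intended merge rule concrete, here is a small standalone model of
"the prevailing decl's flag wins". It is only a sketch: ToyVarNode and
toy_merge_section_anchor are hypothetical names, and the vector stands in for
the copies of one symbol seen across several LTO object files.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy varpool node; only the flag under discussion is modelled.
struct ToyVarNode {
  bool section_anchor;
};

// Give every copy of the symbol the flag of the prevailing definition.
// PREVAILING is the index of the copy that symbol resolution picked.
inline void toy_merge_section_anchor(std::vector<ToyVarNode> &copies,
                                     std::size_t prevailing) {
  assert(prevailing < copies.size());
  const bool wins = copies[prevailing].section_anchor;
  for (ToyVarNode &entry : copies)
    entry.section_anchor = wins;
}
```

With one copy compiled with -fsection-anchors and one without, the result
depends only on which copy prevails, not on the order in which the copies are
merged.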

Re-introduced flag_ipa_increase_alignment to gate the pass on, so it runs only
for targets supporting section anchors.
Cross tested on aarch64*-*-*, arm*-*-*.

Thanks,
Prathamesh
>
> Richard.
>
>> Thanks,
>> Prathamesh
>> >
>> > Honza
>> >>
>> >> Thanks,
>> >> Prathamesh
>> >> >
>> >> > Honza
>> >> >>
>> >> >> Richard.
>> >> >>
>> >> >> > Thanks,
>> >> >> > Prathamesh
>> >> >> > >
>> >> >> > > Honza
>> >> >> > >>
>> >> >> > >> Richard.
>> >> >> >
>> >> >> >
>> >> >>
>> >> >> --
>> >> >> Richard Biener <rguenther@suse.de>
>> >> >> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
>> >
>> >> diff --git a/gcc/common.opt b/gcc/common.opt
>> >> index f0d7196..f93f26c 100644
>> >> --- a/gcc/common.opt
>> >> +++ b/gcc/common.opt
>> >> @@ -2133,7 +2133,7 @@ Common Report Var(flag_sched_dep_count_heuristic) Init(1) Optimization
>> >>  Enable the dependent count heuristic in the scheduler.
>> >>
>> >>  fsection-anchors
>> >> -Common Report Var(flag_section_anchors) Optimization
>> >> +Common Report Var(flag_section_anchors)
>> >>  Access data in the same section from shared anchor points.
>> >>
>> >>  fsee
>> >> diff --git a/gcc/passes.def b/gcc/passes.def
>> >> index 3647e90..3a8063c 100644
>> >> --- a/gcc/passes.def
>> >> +++ b/gcc/passes.def
>> >> @@ -138,12 +138,12 @@ along with GCC; see the file COPYING3.  If not see
>> >>    PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile)
>> >>        NEXT_PASS (pass_feedback_split_functions);
>> >>    POP_INSERT_PASSES ()
>> >> -  NEXT_PASS (pass_ipa_increase_alignment);
>> >>    NEXT_PASS (pass_ipa_tm);
>> >>    NEXT_PASS (pass_ipa_lower_emutls);
>> >>    TERMINATE_PASS_LIST (all_small_ipa_passes)
>> >>
>> >>    INSERT_PASSES_AFTER (all_regular_ipa_passes)
>> >> +  NEXT_PASS (pass_ipa_increase_alignment);
>> >>    NEXT_PASS (pass_ipa_whole_program_visibility);
>> >>    NEXT_PASS (pass_ipa_profile);
>> >>    NEXT_PASS (pass_ipa_icf);
>> >> diff --git a/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c
>> >> new file mode 100644
>> >> index 0000000..74eaed8
>> >> --- /dev/null
>> >> +++ b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c
>> >> @@ -0,0 +1,25 @@
>> >> +/* { dg-do compile } */
>> >> +/* { dg-require-effective-target section_anchors } */
>> >> +/* { dg-require-effective-target vect_int } */
>> >> +
>> >> +#define N 32
>> >> +
>> >> +/* Clone of section-anchors-vect-70.c with foo() having -fno-tree-loop-vectorize.  */
>> >> +
>> >> +static struct A {
>> >> +  int p1, p2;
>> >> +  int e[N];
>> >> +} a, b, c;
>> >> +
>> >> +__attribute__((optimize("-fno-tree-loop-vectorize")))
>> >> +int foo(void)
>> >> +{
>> >> +  for (int i = 0; i < N; i++)
>> >> +    a.e[i] = b.e[i] + c.e[i];
>> >> +
>> >> +   return a.e[0];
>> >> +}
>> >> +
>> >> +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target aarch64*-*-* } } } */
>> >> +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target powerpc64*-*-* } } } */
>> >> +/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target arm*-*-* } } } */
>> >> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
>> >> index 36299a6..d36aa1d 100644
>> >> --- a/gcc/tree-pass.h
>> >> +++ b/gcc/tree-pass.h
>> >> @@ -483,7 +483,7 @@ extern simple_ipa_opt_pass *make_pass_local_optimization_passes (gcc::context *c
>> >>
>> >>  extern ipa_opt_pass_d *make_pass_ipa_whole_program_visibility (gcc::context
>> >>                                                              *ctxt);
>> >> -extern simple_ipa_opt_pass *make_pass_ipa_increase_alignment (gcc::context
>> >> +extern ipa_opt_pass_d *make_pass_ipa_increase_alignment (gcc::context
>> >>                                                             *ctxt);
>> >>  extern ipa_opt_pass_d *make_pass_ipa_inline (gcc::context *ctxt);
>> >>  extern simple_ipa_opt_pass *make_pass_ipa_free_lang_data (gcc::context *ctxt);
>> >> diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
>> >> index 2669813..d34e560 100644
>> >> --- a/gcc/tree-vectorizer.c
>> >> +++ b/gcc/tree-vectorizer.c
>> >> @@ -899,6 +899,34 @@ get_vec_alignment_for_type (tree type)
>> >>    return (alignment > TYPE_ALIGN (type)) ? alignment : 0;
>> >>  }
>> >>
>> >> +/* Return true if alignment should be increased for this vnode.
>> >> +   This is done if every function that references/referring to vnode
>> >> +   has flag_tree_loop_vectorize and flag_section_anchors set.  */
>> >> +
>> >> +static bool
>> >> +increase_alignment_p (varpool_node *vnode)
>> >> +{
>> >> +  ipa_ref *ref;
>> >> +
>> >> +  for (int i = 0; vnode->iterate_reference (i, ref); i++)
>> >> +    if (cgraph_node *cnode = dyn_cast<cgraph_node *> (ref->referred))
>> >> +      {
>> >> +     struct cl_optimization *opts = opts_for_fn (cnode->decl);
>> >> +     if (!opts->x_flag_tree_loop_vectorize)
>> >> +       return false;
>> >> +      }
>> >> +
>> >> +  for (int i = 0; vnode->iterate_referring (i, ref); i++)
>> >> +    if (cgraph_node *cnode = dyn_cast<cgraph_node *> (ref->referring))
>> >> +      {
>> >> +     struct cl_optimization *opts = opts_for_fn (cnode->decl);
>> >> +     if (!opts->x_flag_tree_loop_vectorize)
>> >> +       return false;
>> >> +      }
>> >> +
>> >> +  return true;
>> >> +}
>> >> +
>> >>  /* Entry point to increase_alignment pass.  */
>> >>  static unsigned int
>> >>  increase_alignment (void)
>> >> @@ -916,7 +944,8 @@ increase_alignment (void)
>> >>
>> >>        if ((decl_in_symtab_p (decl)
>> >>         && !symtab_node::get (decl)->can_increase_alignment_p ())
>> >> -       || DECL_USER_ALIGN (decl) || DECL_ARTIFICIAL (decl))
>> >> +       || DECL_USER_ALIGN (decl) || DECL_ARTIFICIAL (decl)
>> >> +       || !increase_alignment_p (vnode))
>> >>       continue;
>> >>
>> >>        alignment = get_vec_alignment_for_type (TREE_TYPE (decl));
>> >> @@ -938,7 +967,7 @@ namespace {
>> >>
>> >>  const pass_data pass_data_ipa_increase_alignment =
>> >>  {
>> >> -  SIMPLE_IPA_PASS, /* type */
>> >> +  IPA_PASS, /* type */
>> >>    "increase_alignment", /* name */
>> >>    OPTGROUP_LOOP | OPTGROUP_VEC, /* optinfo_flags */
>> >>    TV_IPA_OPT, /* tv_id */
>> >> @@ -949,17 +978,26 @@ const pass_data pass_data_ipa_increase_alignment =
>> >>    0, /* todo_flags_finish */
>> >>  };
>> >>
>> >> -class pass_ipa_increase_alignment : public simple_ipa_opt_pass
>> >> +class pass_ipa_increase_alignment : public ipa_opt_pass_d
>> >>  {
>> >>  public:
>> >>    pass_ipa_increase_alignment (gcc::context *ctxt)
>> >> -    : simple_ipa_opt_pass (pass_data_ipa_increase_alignment, ctxt)
>> >> +    : ipa_opt_pass_d (pass_data_ipa_increase_alignment, ctxt,
>> >> +                        NULL, /* generate_summary  */
>> >> +                        NULL, /* write summary  */
>> >> +                        NULL, /* read summary  */
>> >> +                        NULL, /* write optimization summary  */
>> >> +                        NULL, /* read optimization summary  */
>> >> +                        NULL, /* stmt fixup  */
>> >> +                        0, /* function_transform_todo_flags_start  */
>> >> +                        NULL, /* transform function  */
>> >> +                        NULL )/* variable transform  */
>> >>    {}
>> >>
>> >>    /* opt_pass methods: */
>> >>    virtual bool gate (function *)
>> >>      {
>> >> -      return flag_section_anchors && flag_tree_loop_vectorize;
>> >> +      return flag_section_anchors != 0;
>> >>      }
>> >>
>> >>    virtual unsigned int execute (function *) { return increase_alignment (); }
>> >> @@ -968,7 +1006,7 @@ public:
>> >>
>> >>  } // anon namespace
>> >>
>> >> -simple_ipa_opt_pass *
>> >> +ipa_opt_pass_d *
>> >>  make_pass_ipa_increase_alignment (gcc::context *ctxt)
>> >>  {
>> >>    return new pass_ipa_increase_alignment (ctxt);
>> >
>>
>>
>
> --
> Richard Biener <rguenther@suse.de>
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

Comments

Jan Hubicka June 13, 2016, 10:43 a.m. UTC | #1
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index ecafe63..41ac408 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -1874,6 +1874,9 @@ public:
>       if we did not do any inter-procedural code movement.  */
>    unsigned used_by_single_function : 1;
>  
> +  /* Set if -fsection-anchors is set.  */
> +  unsigned section_anchor : 1;
> +
>  private:
>    /* Assemble thunks and aliases associated to varpool node.  */
>    void assemble_aliases (void);
> diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
> index 4bfcad7..e75d5c0 100644
> --- a/gcc/cgraphunit.c
> +++ b/gcc/cgraphunit.c
> @@ -800,6 +800,9 @@ varpool_node::finalize_decl (tree decl)
>       it is available to notice_global_symbol.  */
>    node->definition = true;
>    notice_global_symbol (decl);
> +
> +  node->section_anchor = flag_section_anchors;
> +
>    if (TREE_THIS_VOLATILE (decl) || DECL_PRESERVE_P (decl)
>        /* Traditionally we do not eliminate static variables when not
>  	 optimizing and when not doing toplevel reoder.  */
> diff --git a/gcc/common.opt b/gcc/common.opt
> index f0d7196..e497795 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1590,6 +1590,10 @@ fira-algorithm=
>  Common Joined RejectNegative Enum(ira_algorithm) Var(flag_ira_algorithm) Init(IRA_ALGORITHM_CB) Optimization
>  -fira-algorithm=[CB|priority] Set the used IRA algorithm.
>  
> +fipa-increase_alignment
> +Common Report Var(flag_ipa_increase_alignment) Init(0) Optimization
> +Option to gate increase_alignment ipa pass.
> +
>  Enum
>  Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA algorithm %qs)
>  
> @@ -2133,7 +2137,7 @@ Common Report Var(flag_sched_dep_count_heuristic) Init(1) Optimization
>  Enable the dependent count heuristic in the scheduler.
>  
>  fsection-anchors
> -Common Report Var(flag_section_anchors) Optimization
> +Common Report Var(flag_section_anchors)
>  Access data in the same section from shared anchor points.
>  
>  fsee
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index a0db3a4..1482566 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -8252,6 +8252,8 @@ aarch64_override_options (void)
>  
>    aarch64_register_fma_steering ();
>  
> +  /* Enable increase_alignment pass.  */
> +  flag_ipa_increase_alignment = 1;

I would rather enable it always on targets that do support anchors.
> diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c
> index ce9e146..7f09f3a 100644
> --- a/gcc/lto/lto-symtab.c
> +++ b/gcc/lto/lto-symtab.c
> @@ -342,6 +342,13 @@ lto_symtab_merge (symtab_node *prevailing, symtab_node *entry)
>       The type compatibility checks or the completing of types has properly
>       dealt with most issues.  */
>  
> +  /* ??? is this assert necessary ?  */
> +  varpool_node *v_prevailing = dyn_cast<varpool_node *> (prevailing);
> +  varpool_node *v_entry = dyn_cast<varpool_node *> (entry);
> +  gcc_assert (v_prevailing && v_entry);
> +  /* section_anchor of prevailing_decl wins.  */
> +  v_entry->section_anchor = v_prevailing->section_anchor;
> +
Other flags are merged in lto_varpool_replace_node so please move this there.
> +/* Return true if alignment should be increased for this vnode.
> +   This is done if every function that references/referring to vnode
> +   has flag_tree_loop_vectorize set.  */ 
> +
> +static bool
> +increase_alignment_p (varpool_node *vnode)
> +{
> +  ipa_ref *ref;
> +
> +  for (int i = 0; vnode->iterate_reference (i, ref); i++)
> +    if (cgraph_node *cnode = dyn_cast<cgraph_node *> (ref->referred))
> +      {
> +	struct cl_optimization *opts = opts_for_fn (cnode->decl);
> +	if (!opts->x_flag_tree_loop_vectorize) 
> +	  return false;
> +      }

Taking the address of a function that has the vectorizer enabled probably doesn't
imply a need to increase the alignment of that variable, so please drop this loop.

You only want functions that read, write, or take the address of the symbol. On
the other hand, you need to walk all aliases of the symbol with
call_for_symbol_and_aliases.
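
A rough standalone model of the check described above, for illustration only:
it looks at the functions referring to a variable and repeats the check for
each alias. ToyFunction, ToyVariable and toy_increase_alignment_p are
hypothetical names, the alias graph is assumed to be acyclic, and the explicit
recursion merely stands in for the alias walk; it does not use the real
symtab API.

```cpp
#include <cassert>
#include <vector>

// Toy function node: only the per-function vectorizer setting is modelled.
struct ToyFunction {
  bool loop_vectorize;
};

// Toy variable node (hypothetical, not GCC's varpool_node).
struct ToyVariable {
  // Functions that read, write, or take the address of this variable.
  std::vector<const ToyFunction *> referring;
  // Direct aliases of this variable; assumed to form no cycles.
  std::vector<const ToyVariable *> aliases;
};

// Return true if every function referring to VAR, or to any of its
// aliases, has loop vectorization enabled.
inline bool toy_increase_alignment_p(const ToyVariable &var) {
  for (const ToyFunction *fn : var.referring) {
    assert(fn != nullptr);
    if (!fn->loop_vectorize)
      return false;  // one non-vectorizing user is enough to refuse
  }
  for (const ToyVariable *alias : var.aliases) {
    assert(alias != nullptr);
    if (!toy_increase_alignment_p(*alias))
      return false;
  }
  return true;
}
```

Note that in this sketch a variable with no referring functions still returns
true, as in the posted increase_alignment_p; whether that is desirable is a
separate question.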
> +
> +  for (int i = 0; vnode->iterate_referring (i, ref); i++)
> +    if (cgraph_node *cnode = dyn_cast<cgraph_node *> (ref->referring))
> +      {
> +	struct cl_optimization *opts = opts_for_fn (cnode->decl);
> +	if (!opts->x_flag_tree_loop_vectorize) 
> +	  return false;
> +      }
> +
> +  return true;
> +}
> +
>  /* Entry point to increase_alignment pass.  */
>  static unsigned int
>  increase_alignment (void)
> @@ -914,9 +942,12 @@ increase_alignment (void)
>        tree decl = vnode->decl;
>        unsigned int alignment;
>  
> -      if ((decl_in_symtab_p (decl)
> -	  && !symtab_node::get (decl)->can_increase_alignment_p ())
> -	  || DECL_USER_ALIGN (decl) || DECL_ARTIFICIAL (decl))
> +      if (!vnode->section_anchor
> +	  || (decl_in_symtab_p (decl)
> +	      && !symtab_node::get (decl)->can_increase_alignment_p ())
> +	  || DECL_USER_ALIGN (decl)
> +	  || DECL_ARTIFICIAL (decl)
> +	  || !increase_alignment_p (vnode))

Incrementally we should probably do more testing of whether the variable looks like
something that can be vectorized, i.e. it contains an array, has its address taken, or the
accesses are array accesses within a loop.
This can be done by the analysis phase of the IPA pass inspecting the function
bodies.

I think it is a significant waste to bump up everything, including error messages etc.
At least on i386 the effect of various alignment settings on the Firefox data segment is
very visible.

Looks OK to me otherwise. Please send an updated patch.

Honza

Patch

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index ecafe63..41ac408 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -1874,6 +1874,9 @@  public:
      if we did not do any inter-procedural code movement.  */
   unsigned used_by_single_function : 1;
 
+  /* Set if -fsection-anchors is set.  */
+  unsigned section_anchor : 1;
+
 private:
   /* Assemble thunks and aliases associated to varpool node.  */
   void assemble_aliases (void);
diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 4bfcad7..e75d5c0 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -800,6 +800,9 @@  varpool_node::finalize_decl (tree decl)
      it is available to notice_global_symbol.  */
   node->definition = true;
   notice_global_symbol (decl);
+
+  node->section_anchor = flag_section_anchors;
+
   if (TREE_THIS_VOLATILE (decl) || DECL_PRESERVE_P (decl)
       /* Traditionally we do not eliminate static variables when not
 	 optimizing and when not doing toplevel reoder.  */
diff --git a/gcc/common.opt b/gcc/common.opt
index f0d7196..e497795 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1590,6 +1590,10 @@  fira-algorithm=
 Common Joined RejectNegative Enum(ira_algorithm) Var(flag_ira_algorithm) Init(IRA_ALGORITHM_CB) Optimization
 -fira-algorithm=[CB|priority] Set the used IRA algorithm.
 
+fipa-increase_alignment
+Common Report Var(flag_ipa_increase_alignment) Init(0) Optimization
+Option to gate increase_alignment ipa pass.
+
 Enum
 Name(ira_algorithm) Type(enum ira_algorithm) UnknownError(unknown IRA algorithm %qs)
 
@@ -2133,7 +2137,7 @@  Common Report Var(flag_sched_dep_count_heuristic) Init(1) Optimization
 Enable the dependent count heuristic in the scheduler.
 
 fsection-anchors
-Common Report Var(flag_section_anchors) Optimization
+Common Report Var(flag_section_anchors)
 Access data in the same section from shared anchor points.
 
 fsee
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a0db3a4..1482566 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -8252,6 +8252,8 @@  aarch64_override_options (void)
 
   aarch64_register_fma_steering ();
 
+  /* Enable increase_alignment pass.  */
+  flag_ipa_increase_alignment = 1;
 }
 
 /* Implement targetm.override_options_after_change.  */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 3503c15..b7f448e 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -3458,6 +3458,9 @@  arm_option_override (void)
 
   /* Init initial mode for testing.  */
   thumb_flipper = TARGET_THUMB;
+
+  /* Enable increase_alignment pass.  */
+  flag_ipa_increase_alignment = 1;
 }
 
 static void
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 2d7df6b..ed59068 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -5011,6 +5011,9 @@  rs6000_option_override (void)
     = { pass_analyze_swaps, "cse1", 1, PASS_POS_INSERT_BEFORE };
 
   register_pass (&analyze_swaps_info);
+
+  /* Enable increase_alignment pass.  */
+  flag_ipa_increase_alignment = 1;
 }
 
 
diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index 5cef2ba..289d9c3 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -627,6 +627,7 @@  lto_output_varpool_node (struct lto_simple_output_block *ob, varpool_node *node,
   bp_pack_value (&bp, node->tls_model, 3);
   bp_pack_value (&bp, node->used_by_single_function, 1);
   bp_pack_value (&bp, node->need_bounds_init, 1);
+  bp_pack_value (&bp, node->section_anchor, 1);
   streamer_write_bitpack (&bp);
 
   group = node->get_comdat_group ();
@@ -1401,6 +1402,7 @@  input_varpool_node (struct lto_file_decl_data *file_data,
   node->tls_model = (enum tls_model)bp_unpack_value (&bp, 3);
   node->used_by_single_function = (enum tls_model)bp_unpack_value (&bp, 1);
   node->need_bounds_init = bp_unpack_value (&bp, 1);
+  node->section_anchor = bp_unpack_value (&bp, 1);
   group = read_identifier (ib);
   if (group)
     {
diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c
index ce9e146..7f09f3a 100644
--- a/gcc/lto/lto-symtab.c
+++ b/gcc/lto/lto-symtab.c
@@ -342,6 +342,13 @@  lto_symtab_merge (symtab_node *prevailing, symtab_node *entry)
      The type compatibility checks or the completing of types has properly
      dealt with most issues.  */
 
+  /* ??? is this assert necessary ?  */
+  varpool_node *v_prevailing = dyn_cast<varpool_node *> (prevailing);
+  varpool_node *v_entry = dyn_cast<varpool_node *> (entry);
+  gcc_assert (v_prevailing && v_entry);
+  /* section_anchor of prevailing_decl wins.  */
+  v_entry->section_anchor = v_prevailing->section_anchor;
+
   /* The following should all not invoke fatal errors as in non-LTO
      mode the linker wouldn't complain either.  Just emit warnings.  */
 
diff --git a/gcc/passes.def b/gcc/passes.def
index 3647e90..3a8063c 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -138,12 +138,12 @@  along with GCC; see the file COPYING3.  If not see
   PUSH_INSERT_PASSES_WITHIN (pass_ipa_tree_profile)
       NEXT_PASS (pass_feedback_split_functions);
   POP_INSERT_PASSES ()
-  NEXT_PASS (pass_ipa_increase_alignment);
   NEXT_PASS (pass_ipa_tm);
   NEXT_PASS (pass_ipa_lower_emutls);
   TERMINATE_PASS_LIST (all_small_ipa_passes)
 
   INSERT_PASSES_AFTER (all_regular_ipa_passes)
+  NEXT_PASS (pass_ipa_increase_alignment);
   NEXT_PASS (pass_ipa_whole_program_visibility);
   NEXT_PASS (pass_ipa_profile);
   NEXT_PASS (pass_ipa_icf);
diff --git a/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c
new file mode 100644
index 0000000..74eaed8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/aligned-section-anchors-vect-73.c
@@ -0,0 +1,25 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target section_anchors } */
+/* { dg-require-effective-target vect_int } */
+
+#define N 32
+
+/* Clone of section-anchors-vect-70.c with foo() having -fno-tree-loop-vectorize.  */ 
+
+static struct A {
+  int p1, p2;
+  int e[N];
+} a, b, c;
+
+__attribute__((optimize("-fno-tree-loop-vectorize")))
+int foo(void)
+{
+  for (int i = 0; i < N; i++)
+    a.e[i] = b.e[i] + c.e[i];
+
+   return a.e[0];
+}
+
+/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target aarch64*-*-* } } } */
+/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target powerpc64*-*-* } } } */
+/* { dg-final { scan-ipa-dump-times "Increasing alignment of decl" 0 "increase_alignment" { target arm*-*-* } } } */
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 36299a6..d36aa1d 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -483,7 +483,7 @@  extern simple_ipa_opt_pass *make_pass_local_optimization_passes (gcc::context *c
 
 extern ipa_opt_pass_d *make_pass_ipa_whole_program_visibility (gcc::context
 							       *ctxt);
-extern simple_ipa_opt_pass *make_pass_ipa_increase_alignment (gcc::context
+extern ipa_opt_pass_d *make_pass_ipa_increase_alignment (gcc::context
 							      *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_inline (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_ipa_free_lang_data (gcc::context *ctxt);
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index 2669813..abd7030 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -899,6 +899,34 @@  get_vec_alignment_for_type (tree type)
   return (alignment > TYPE_ALIGN (type)) ? alignment : 0;
 }
 
+/* Return true if alignment should be increased for this vnode.
+   This is done if every function that references/referring to vnode
+   has flag_tree_loop_vectorize set.  */ 
+
+static bool
+increase_alignment_p (varpool_node *vnode)
+{
+  ipa_ref *ref;
+
+  for (int i = 0; vnode->iterate_reference (i, ref); i++)
+    if (cgraph_node *cnode = dyn_cast<cgraph_node *> (ref->referred))
+      {
+	struct cl_optimization *opts = opts_for_fn (cnode->decl);
+	if (!opts->x_flag_tree_loop_vectorize) 
+	  return false;
+      }
+
+  for (int i = 0; vnode->iterate_referring (i, ref); i++)
+    if (cgraph_node *cnode = dyn_cast<cgraph_node *> (ref->referring))
+      {
+	struct cl_optimization *opts = opts_for_fn (cnode->decl);
+	if (!opts->x_flag_tree_loop_vectorize) 
+	  return false;
+      }
+
+  return true;
+}
+
 /* Entry point to increase_alignment pass.  */
 static unsigned int
 increase_alignment (void)
@@ -914,9 +942,12 @@  increase_alignment (void)
       tree decl = vnode->decl;
       unsigned int alignment;
 
-      if ((decl_in_symtab_p (decl)
-	  && !symtab_node::get (decl)->can_increase_alignment_p ())
-	  || DECL_USER_ALIGN (decl) || DECL_ARTIFICIAL (decl))
+      if (!vnode->section_anchor
+	  || (decl_in_symtab_p (decl)
+	      && !symtab_node::get (decl)->can_increase_alignment_p ())
+	  || DECL_USER_ALIGN (decl)
+	  || DECL_ARTIFICIAL (decl)
+	  || !increase_alignment_p (vnode))
 	continue;
 
       alignment = get_vec_alignment_for_type (TREE_TYPE (decl));
@@ -938,7 +969,7 @@  namespace {
 
 const pass_data pass_data_ipa_increase_alignment =
 {
-  SIMPLE_IPA_PASS, /* type */
+  IPA_PASS, /* type */
   "increase_alignment", /* name */
   OPTGROUP_LOOP | OPTGROUP_VEC, /* optinfo_flags */
   TV_IPA_OPT, /* tv_id */
@@ -949,17 +980,26 @@  const pass_data pass_data_ipa_increase_alignment =
   0, /* todo_flags_finish */
 };
 
-class pass_ipa_increase_alignment : public simple_ipa_opt_pass
+class pass_ipa_increase_alignment : public ipa_opt_pass_d
 {
 public:
   pass_ipa_increase_alignment (gcc::context *ctxt)
-    : simple_ipa_opt_pass (pass_data_ipa_increase_alignment, ctxt)
+    : ipa_opt_pass_d (pass_data_ipa_increase_alignment, ctxt,
+			   NULL, /* generate_summary  */
+			   NULL, /* write summary  */
+			   NULL, /* read summary  */
+			   NULL, /* write optimization summary  */
+			   NULL, /* read optimization summary  */
+			   NULL, /* stmt fixup  */
+			   0, /* function_transform_todo_flags_start  */
+			   NULL, /* transform function  */
+			   NULL )/* variable transform  */
   {}
 
   /* opt_pass methods: */
   virtual bool gate (function *)
     {
-      return flag_section_anchors && flag_tree_loop_vectorize;
+      return flag_ipa_increase_alignment != 0; 
     }
 
   virtual unsigned int execute (function *) { return increase_alignment (); }
@@ -968,7 +1008,7 @@  public:
 
 } // anon namespace
 
-simple_ipa_opt_pass *
+ipa_opt_pass_d *
 make_pass_ipa_increase_alignment (gcc::context *ctxt)
 {
   return new pass_ipa_increase_alignment (ctxt);