
PR81136: ICE from inconsistent DR_MISALIGNMENTs

Message ID 87k244z2c5.fsf@linaro.org
State New

Commit Message

Richard Sandiford June 22, 2017, 11:30 a.m. UTC
The test case triggered this assert in vect_update_misalignment_for_peel:

          gcc_assert (DR_MISALIGNMENT (dr) / dr_size ==
                      DR_MISALIGNMENT (dr_peel) / dr_peel_size);

We knew that the two DRs had the same misalignment at runtime, but when
considered in isolation, one data reference guaranteed a higher compile-time
base alignment than the other.

In the test case this looks like a missed opportunity.  Both references
are unconditional, so it should be possible to use the highest of the
available base alignment guarantees when analyzing each reference.
The patch does this.
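
For reference, the test case added by the patch (gcc.dg/vect/pr81136.c,
quoted in full below) boils down to:

   struct __attribute__((aligned (32)))
   {
     char misaligner;
     int foo[100];
     int bar[100];
   } *a;

   void
   fn1 (int n)
   {
     int *b = a->foo;
     for (int i = 0; i < n; i++)
       a->bar[i] = b[i];
   }

The two accesses have the same misalignment at runtime, but only the
a->bar[i] reference lets us prove the 32-byte base alignment at
compile time.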

However, as the comment in the patch says, the base alignment guarantees
provided by a conditional reference only apply if the reference occurs
at least once.  In this case it would be legitimate for two references
to have the same runtime misalignment and for one reference to provide a
stronger compile-time guarantee than the other about what the misalignment
actually is.  The patch therefore relaxes the assert to handle that case.
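
As a sketch of where a conditional reference in this sense arises
(the if-converted masked-load case; the variable names here are only
illustrative):

   /* c[i] and d[i] may be unsafe to access when the condition goes the
      other way, so if-conversion turns them into masked loads.  Each
      load is then conditional: it might not occur even though the
      if-converted statement itself always executes.  */
   for (int i = 0; i < n; i++)
     a[i] = b[i] ? c[i] : d[i];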

Tested on powerpc64-linux-gnu, aarch64-linux-gnu and x86_64-linux-gnu.
OK to install?

Richard


2017-06-22  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	PR tree-optimization/81136
	* tree-vectorizer.h: Include tree-hash-traits.h.
	(vec_base_alignments): New typedef.
	(vec_info): Add a base_alignments field.
	(vect_compute_base_alignments): Declare.
	* tree-data-ref.h (data_reference): Add an is_conditional field.
	(DR_IS_CONDITIONAL): New macro.
	(create_data_ref): Add an is_conditional argument.
	* tree-data-ref.c (create_data_ref): Likewise.  Use it to initialize
	the is_conditional field.
	(data_ref_loc): Add an is_conditional field.
	(get_references_in_stmt): Set the is_conditional field.
	(find_data_references_in_stmt): Update call to create_data_ref.
	(graphite_find_data_references_in_stmt): Likewise.
	* tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Likewise.
	* tree-vect-data-refs.c (vect_analyze_data_refs): Likewise.
	(vect_get_base_address): New function.
	(vect_compute_base_alignments): Likewise.
	(vect_compute_base_alignment): Likewise, split out from...
	(vect_compute_data_ref_alignment): ...here.  Use precomputed
	base alignments.  Only compute a new base alignment here if the
	reference is conditional.
	(vect_update_misalignment_for_peel): Allow the compile-time
	DR_MISALIGNMENTs of two references with the same runtime alignment
	to be different if one of the references is conditional.
	(vect_find_same_alignment_drs): Compare base addresses instead
	of base objects.
	(vect_analyze_data_refs_alignment): Call vect_compute_base_alignments.
	* tree-vect-slp.c (vect_slp_analyze_bb_1): Likewise.
	(new_bb_vec_info): Initialize base_alignments.
	* tree-vect-loop.c (new_loop_vec_info): Likewise.
	* tree-vectorizer.c (vect_destroy_datarefs): Release base_alignments.

gcc/testsuite/
	PR tree-optimization/81136
	* gcc.dg/vect/pr81136.c: New test.

Comments

Richard Biener June 23, 2017, 11:19 a.m. UTC | #1
On Thu, Jun 22, 2017 at 1:30 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> The test case triggered this assert in vect_update_misalignment_for_peel:
>
>           gcc_assert (DR_MISALIGNMENT (dr) / dr_size ==
>                       DR_MISALIGNMENT (dr_peel) / dr_peel_size);
>
> We knew that the two DRs had the same misalignment at runtime, but when
> considered in isolation, one data reference guaranteed a higher compile-time
> base alignment than the other.
>
> In the test case this looks like a missed opportunity.  Both references
> are unconditional, so it should be possible to use the highest of the
> available base alignment guarantees when analyzing each reference.
> The patch does this.
>
> However, as the comment in the patch says, the base alignment guarantees
> provided by a conditional reference only apply if the reference occurs
> at least once.  In this case it would be legitimate for two references
> to have the same runtime misalignment and for one reference to provide a
> stronger compile-time guarantee than the other about what the misalignment
> actually is.  The patch therefore relaxes the assert to handle that case.

Hmm, but you don't actually check whether a reference occurs only conditionally,
do you?  You just seem to say that for masked loads/stores the reference
is conditional (I believe that's not true).  But for a loop like

 for (;;)
   if (a[i])
     sum += b[j];

you still assume b[j] executes unconditionally?

The vectorizer of course only sees unconditionally executed stmts.

So - I'd simply not add this DR_IS_CONDITIONAL.  Did you run into
any real-world (testsuite) issues without this?

Note that the assert is to prevent bogus information.  If we aligned
DR with base alignment 8 and misalign 3, then if another same-align
DR has base alignment 16, we can't simply zero its DR_MISALIGNMENT,
as it could still be 8 after aligning DR.
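
One way to make that concrete (vector alignment 16; illustrative
numbers only):

  addr1 % 8 == addr2 % 8          (all that the shared base alignment gives)
  peeling forces addr1 % 16 == 0
  => addr2 % 16 may be 0 or 8, so its DR_MISALIGNMENT can't be zeroed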

So I think it's wrong to put DRs with differing base alignment into
the same-align-refs chain; those should get their DR_MISALIGNMENT
updated independently after peeling.

I'd rather not mix fixing this with the improvement to eventually use a
larger alignment for the other DR if possible.

Thanks,
Richard.

> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h       2017-06-08 08:51:43.347264181 +0100
> +++ gcc/tree-vectorizer.h       2017-06-22 12:23:21.288421018 +0100
> @@ -22,6 +22,7 @@ Software Foundation; either version 3, o
>  #define GCC_TREE_VECTORIZER_H
>
>  #include "tree-data-ref.h"
> +#include "tree-hash-traits.h"
>  #include "target.h"
>
>  /* Used for naming of new temporaries.  */
> @@ -84,6 +85,10 @@ struct stmt_info_for_cost {
>
>  typedef vec<stmt_info_for_cost> stmt_vector_for_cost;
>
> +/* Maps base addresses to the largest alignment that we've been able
> +   to calculate for them.  */
> +typedef hash_map<tree_operand_hash, unsigned int> vec_base_alignments;
> +
>  /************************************************************************
>    SLP
>   ************************************************************************/
> @@ -156,6 +161,10 @@ struct vec_info {
>    /* All data references.  */
>    vec<data_reference_p> datarefs;
>
> +  /* Maps the base addresses of all data references in DATAREFS to the
> +     largest alignment that we've been able to calculate for them.  */
> +  vec_base_alignments base_alignments;
> +
>    /* All data dependences.  */
>    vec<ddr_p> ddrs;
>
> @@ -1117,6 +1126,7 @@ extern bool vect_prune_runtime_alias_tes
>  extern bool vect_check_gather_scatter (gimple *, loop_vec_info,
>                                        gather_scatter_info *);
>  extern bool vect_analyze_data_refs (vec_info *, int *);
> +extern void vect_compute_base_alignments (vec_info *);
>  extern tree vect_create_data_ref_ptr (gimple *, tree, struct loop *, tree,
>                                       tree *, gimple_stmt_iterator *,
>                                       gimple **, bool, bool *,
> Index: gcc/tree-data-ref.h
> ===================================================================
> --- gcc/tree-data-ref.h 2017-06-08 08:51:43.349263895 +0100
> +++ gcc/tree-data-ref.h 2017-06-22 12:23:21.285421180 +0100
> @@ -119,6 +119,10 @@ struct data_reference
>    /* True when the data reference is in RHS of a stmt.  */
>    bool is_read;
>
> +  /* True when the data reference is conditional, i.e. if it might not
> +     occur even when the statement runs to completion.  */
> +  bool is_conditional;
> +
>    /* Behavior of the memory reference in the innermost loop.  */
>    struct innermost_loop_behavior innermost;
>
> @@ -138,6 +142,7 @@ #define DR_ACCESS_FN(DR, I)        DR_AC
>  #define DR_NUM_DIMENSIONS(DR)      DR_ACCESS_FNS (DR).length ()
>  #define DR_IS_READ(DR)             (DR)->is_read
>  #define DR_IS_WRITE(DR)            (!DR_IS_READ (DR))
> +#define DR_IS_CONDITIONAL(DR)      (DR)->is_conditional
>  #define DR_BASE_ADDRESS(DR)        (DR)->innermost.base_address
>  #define DR_OFFSET(DR)              (DR)->innermost.offset
>  #define DR_INIT(DR)                (DR)->innermost.init
> @@ -350,7 +355,8 @@ extern bool graphite_find_data_reference
>                                                    vec<data_reference_p> *);
>  tree find_data_references_in_loop (struct loop *, vec<data_reference_p> *);
>  bool loop_nest_has_data_refs (loop_p loop);
> -struct data_reference *create_data_ref (loop_p, loop_p, tree, gimple *, bool);
> +struct data_reference *create_data_ref (loop_p, loop_p, tree, gimple *, bool,
> +                                       bool);
>  extern bool find_loop_nest (struct loop *, vec<loop_p> *);
>  extern struct data_dependence_relation *initialize_data_dependence_relation
>       (struct data_reference *, struct data_reference *, vec<loop_p>);
> Index: gcc/tree-data-ref.c
> ===================================================================
> --- gcc/tree-data-ref.c 2017-06-08 08:51:43.349263895 +0100
> +++ gcc/tree-data-ref.c 2017-06-22 12:23:21.284421233 +0100
> @@ -1053,15 +1053,18 @@ free_data_ref (data_reference_p dr)
>    free (dr);
>  }
>
> -/* Analyzes memory reference MEMREF accessed in STMT.  The reference
> -   is read if IS_READ is true, write otherwise.  Returns the
> -   data_reference description of MEMREF.  NEST is the outermost loop
> -   in which the reference should be instantiated, LOOP is the loop in
> -   which the data reference should be analyzed.  */
> +/* Analyze memory reference MEMREF, which is accessed in STMT.  The reference
> +   is a read if IS_READ is true, otherwise it is a write.  IS_CONDITIONAL
> +   indicates that the reference is conditional, i.e. that it might not
> +   occur every time that STMT runs to completion.
> +
> +   Return the data_reference description of MEMREF.  NEST is the outermost
> +   loop in which the reference should be instantiated, LOOP is the loop
> +   in which the data reference should be analyzed.  */
>
>  struct data_reference *
>  create_data_ref (loop_p nest, loop_p loop, tree memref, gimple *stmt,
> -                bool is_read)
> +                bool is_read, bool is_conditional)
>  {
>    struct data_reference *dr;
>
> @@ -1076,6 +1079,7 @@ create_data_ref (loop_p nest, loop_p loo
>    DR_STMT (dr) = stmt;
>    DR_REF (dr) = memref;
>    DR_IS_READ (dr) = is_read;
> +  DR_IS_CONDITIONAL (dr) = is_conditional;
>
>    dr_analyze_innermost (dr, nest);
>    dr_analyze_indices (dr, nest, loop);
> @@ -4446,6 +4450,10 @@ struct data_ref_loc
>
>    /* True if the memory reference is read.  */
>    bool is_read;
> +
> +  /* True if the data reference is conditional, i.e. if it might not
> +     occur even when the statement runs to completion.  */
> +  bool is_conditional;
>  };
>
>
> @@ -4512,6 +4520,7 @@ get_references_in_stmt (gimple *stmt, ve
>         {
>           ref.ref = op1;
>           ref.is_read = true;
> +         ref.is_conditional = false;
>           references->safe_push (ref);
>         }
>      }
> @@ -4539,6 +4548,7 @@ get_references_in_stmt (gimple *stmt, ve
>               type = TREE_TYPE (gimple_call_arg (stmt, 3));
>             if (TYPE_ALIGN (type) != align)
>               type = build_aligned_type (type, align);
> +           ref.is_conditional = true;
>             ref.ref = fold_build2 (MEM_REF, type, gimple_call_arg (stmt, 0),
>                                    ptr);
>             references->safe_push (ref);
> @@ -4558,6 +4568,7 @@ get_references_in_stmt (gimple *stmt, ve
>             {
>               ref.ref = op1;
>               ref.is_read = true;
> +             ref.is_conditional = false;
>               references->safe_push (ref);
>             }
>         }
> @@ -4571,6 +4582,7 @@ get_references_in_stmt (gimple *stmt, ve
>      {
>        ref.ref = op0;
>        ref.is_read = false;
> +      ref.is_conditional = false;
>        references->safe_push (ref);
>      }
>    return clobbers_memory;
> @@ -4635,8 +4647,8 @@ find_data_references_in_stmt (struct loo
>
>    FOR_EACH_VEC_ELT (references, i, ref)
>      {
> -      dr = create_data_ref (nest, loop_containing_stmt (stmt),
> -                           ref->ref, stmt, ref->is_read);
> +      dr = create_data_ref (nest, loop_containing_stmt (stmt), ref->ref,
> +                           stmt, ref->is_read, ref->is_conditional);
>        gcc_assert (dr != NULL);
>        datarefs->safe_push (dr);
>      }
> @@ -4665,7 +4677,8 @@ graphite_find_data_references_in_stmt (l
>
>    FOR_EACH_VEC_ELT (references, i, ref)
>      {
> -      dr = create_data_ref (nest, loop, ref->ref, stmt, ref->is_read);
> +      dr = create_data_ref (nest, loop, ref->ref, stmt, ref->is_read,
> +                           ref->is_conditional);
>        gcc_assert (dr != NULL);
>        datarefs->safe_push (dr);
>      }
> Index: gcc/tree-ssa-loop-prefetch.c
> ===================================================================
> --- gcc/tree-ssa-loop-prefetch.c        2017-06-07 21:58:55.928557601 +0100
> +++ gcc/tree-ssa-loop-prefetch.c        2017-06-22 12:23:21.285421180 +0100
> @@ -1633,7 +1633,7 @@ determine_loop_nest_reuse (struct loop *
>      for (ref = gr->refs; ref; ref = ref->next)
>        {
>         dr = create_data_ref (nest, loop_containing_stmt (ref->stmt),
> -                             ref->mem, ref->stmt, !ref->write_p);
> +                             ref->mem, ref->stmt, !ref->write_p, false);
>
>         if (dr)
>           {
> Index: gcc/tree-vect-data-refs.c
> ===================================================================
> --- gcc/tree-vect-data-refs.c   2017-06-08 08:51:43.350263752 +0100
> +++ gcc/tree-vect-data-refs.c   2017-06-22 12:23:21.286421126 +0100
> @@ -646,6 +646,102 @@ vect_slp_analyze_instance_dependence (sl
>    return res;
>  }
>
> +/* If DR is nested in a loop that is being vectorized, return the base
> +   address in the context of the vectorized loop (rather than the
> +   nested loop).  Otherwise return the base address in the context
> +   of the containing statement.  */
> +
> +static tree
> +vect_get_base_address (data_reference *dr)
> +{
> +  gimple *stmt = DR_STMT (dr);
> +  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
> +  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
> +  struct loop *loop = loop_vinfo != NULL ? LOOP_VINFO_LOOP (loop_vinfo) : NULL;
> +  if (loop && nested_in_vect_loop_p (loop, stmt))
> +    return STMT_VINFO_DR_BASE_ADDRESS (stmt_info);
> +  else
> +    return DR_BASE_ADDRESS (dr);
> +}
> +
> +/* Compute and return the alignment of base address BASE_ADDR in DR.  */
> +
> +static unsigned int
> +vect_compute_base_alignment (data_reference *dr, tree base_addr)
> +{
> +  /* To look at the alignment of the base we have to preserve an inner
> +     MEM_REF as that carries the alignment information of the actual
> +     access.  */
> +  tree base = DR_REF (dr);
> +  while (handled_component_p (base))
> +    base = TREE_OPERAND (base, 0);
> +  unsigned int base_alignment = 0;
> +  unsigned HOST_WIDE_INT base_bitpos;
> +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
> +
> +  /* As data-ref analysis strips the MEM_REF down to its base operand
> +     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
> +     adjust things to make base_alignment valid as the alignment of
> +     DR_BASE_ADDRESS.  */
> +  if (TREE_CODE (base) == MEM_REF)
> +    {
> +      /* Note all this only works if DR_BASE_ADDRESS is the same as
> +        MEM_REF operand zero, otherwise DR/SCEV analysis might have factored
> +        in other offsets.  We need to rework DR to compute the alignment
> +        of DR_BASE_ADDRESS as long as all information is still available.  */
> +      if (operand_equal_p (TREE_OPERAND (base, 0), base_addr, 0))
> +       {
> +         base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
> +         base_bitpos &= (base_alignment - 1);
> +       }
> +      else
> +       base_bitpos = BITS_PER_UNIT;
> +    }
> +  if (base_bitpos != 0)
> +    base_alignment = base_bitpos & -base_bitpos;
> +
> +  /* Also look at the alignment of the base address DR analysis
> +     computed.  */
> +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
> +  if (base_addr_alignment > base_alignment)
> +    base_alignment = base_addr_alignment;
> +
> +  return base_alignment;
> +}
> +
> +/* Compute alignments for the base addresses of all datarefs in VINFO.  */
> +
> +void
> +vect_compute_base_alignments (vec_info *vinfo)
> +{
> +  /* If the region we're going to vectorize is reached, all unconditional
> +     data references occur at least once.  We can therefore pool the base
> +     alignment guarantees from each unconditional reference.  */
> +  data_reference *dr;
> +  unsigned int i;
> +  FOR_EACH_VEC_ELT (vinfo->datarefs, i, dr)
> +    if (!DR_IS_CONDITIONAL (dr))
> +      {
> +       tree base_addr = vect_get_base_address (dr);
> +       unsigned int alignment = vect_compute_base_alignment (dr, base_addr);
> +       bool existed;
> +       unsigned int &entry
> +         = vinfo->base_alignments.get_or_insert (base_addr, &existed);
> +       if (!existed || entry < alignment)
> +         {
> +           entry = alignment;
> +           if (dump_enabled_p ())
> +             {
> +               dump_printf_loc (MSG_NOTE, vect_location,
> +                                "setting base alignment for ");
> +               dump_generic_expr (MSG_NOTE, TDF_SLIM, base_addr);
> +               dump_printf (MSG_NOTE, " to %d, based on ", alignment);
> +               dump_gimple_stmt (MSG_NOTE, TDF_SLIM, DR_STMT (dr), 0);
> +             }
> +         }
> +      }
> +}
> +
>  /* Function vect_compute_data_ref_alignment
>
>     Compute the misalignment of the data reference DR.
> @@ -663,6 +759,7 @@ vect_compute_data_ref_alignment (struct
>  {
>    gimple *stmt = DR_STMT (dr);
>    stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
> +  vec_base_alignments *base_alignments = &stmt_info->vinfo->base_alignments;
>    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>    struct loop *loop = NULL;
>    tree ref = DR_REF (dr);
> @@ -699,6 +796,8 @@ vect_compute_data_ref_alignment (struct
>      {
>        tree step = DR_STEP (dr);
>
> +      base_addr = STMT_VINFO_DR_BASE_ADDRESS (stmt_info);
> +      aligned_to = STMT_VINFO_DR_ALIGNED_TO (stmt_info);
>        if (tree_fits_shwi_p (step)
>           && tree_to_shwi (step) % GET_MODE_SIZE (TYPE_MODE (vectype)) == 0)
>          {
> @@ -706,8 +805,6 @@ vect_compute_data_ref_alignment (struct
>              dump_printf_loc (MSG_NOTE, vect_location,
>                               "inner step divides the vector-size.\n");
>           misalign = STMT_VINFO_DR_INIT (stmt_info);
> -         aligned_to = STMT_VINFO_DR_ALIGNED_TO (stmt_info);
> -         base_addr = STMT_VINFO_DR_BASE_ADDRESS (stmt_info);
>          }
>        else
>         {
> @@ -738,39 +835,15 @@ vect_compute_data_ref_alignment (struct
>         }
>      }
>
> -  /* To look at alignment of the base we have to preserve an inner MEM_REF
> -     as that carries alignment information of the actual access.  */
> -  base = ref;
> -  while (handled_component_p (base))
> -    base = TREE_OPERAND (base, 0);
> +  /* Calculate the maximum of the pooled base address alignment and the
> +     alignment that we can compute for DR itself.  The latter should
> +     already be included in the former for unconditional references.  */
>    unsigned int base_alignment = 0;
> -  unsigned HOST_WIDE_INT base_bitpos;
> -  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
> -  /* As data-ref analysis strips the MEM_REF down to its base operand
> -     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
> -     adjust things to make base_alignment valid as the alignment of
> -     DR_BASE_ADDRESS.  */
> -  if (TREE_CODE (base) == MEM_REF)
> -    {
> -      /* Note all this only works if DR_BASE_ADDRESS is the same as
> -        MEM_REF operand zero, otherwise DR/SCEV analysis might have factored
> -        in other offsets.  We need to rework DR to compute the alingment
> -        of DR_BASE_ADDRESS as long as all information is still available.  */
> -      if (operand_equal_p (TREE_OPERAND (base, 0), base_addr, 0))
> -       {
> -         base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
> -         base_bitpos &= (base_alignment - 1);
> -       }
> -      else
> -       base_bitpos = BITS_PER_UNIT;
> -    }
> -  if (base_bitpos != 0)
> -    base_alignment = base_bitpos & -base_bitpos;
> -  /* Also look at the alignment of the base address DR analysis
> -     computed.  */
> -  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
> -  if (base_addr_alignment > base_alignment)
> -    base_alignment = base_addr_alignment;
> +  if (DR_IS_CONDITIONAL (dr))
> +    base_alignment = vect_compute_base_alignment (dr, base_addr);
> +  if (unsigned int *entry = base_alignments->get (base_addr))
> +    base_alignment = MAX (base_alignment, *entry);
> +  gcc_assert (base_alignment != 0);
>
>    if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
>      DR_VECT_AUX (dr)->base_element_aligned = true;
> @@ -906,8 +979,29 @@ vect_update_misalignment_for_peel (struc
>      {
>        if (current_dr != dr)
>          continue;
> -      gcc_assert (DR_MISALIGNMENT (dr) / dr_size ==
> -                  DR_MISALIGNMENT (dr_peel) / dr_peel_size);
> +      /* Any alignment guarantees provided by a reference only apply if
> +        the reference actually occurs.  For example, in:
> +
> +           struct s __attribute__((aligned(32))) {
> +             int misaligner;
> +             int array[N];
> +           };
> +
> +           int *ptr;
> +           for (int i = 0; i < n; ++i)
> +             ptr[i] = c[i] ? ((struct s *) (ptr - 1))->array[i] : 0;
> +
> +        we can only assume that ptr is part of a struct s if at least one
> +        c[i] is true.  This in turn means that we have a higher base
> +        alignment guarantee for the read from ptr (if it occurs) than for
> +        the write to ptr, and we cannot unconditionally carry the former
> +        over to the latter.  We still know that the two address values
> +        have the same misalignment, so if peeling has forced one of them
> +        to be aligned, the other must be too.  */
> +      gcc_assert (DR_IS_CONDITIONAL (dr_peel)
> +                 || DR_IS_CONDITIONAL (dr)
> +                 || (DR_MISALIGNMENT (dr) / dr_size
> +                     == DR_MISALIGNMENT (dr_peel) / dr_peel_size));
>        SET_DR_MISALIGNMENT (dr, 0);
>        return;
>      }
> @@ -2117,8 +2211,7 @@ vect_find_same_alignment_drs (struct dat
>    if (dra == drb)
>      return;
>
> -  if (!operand_equal_p (DR_BASE_OBJECT (dra), DR_BASE_OBJECT (drb),
> -                       OEP_ADDRESS_OF)
> +  if (!operand_equal_p (DR_BASE_ADDRESS (dra), DR_BASE_ADDRESS (drb), 0)
>        || !operand_equal_p (DR_OFFSET (dra), DR_OFFSET (drb), 0)
>        || !operand_equal_p (DR_STEP (dra), DR_STEP (drb), 0))
>      return;
> @@ -2176,6 +2269,7 @@ vect_analyze_data_refs_alignment (loop_v
>    vec<data_reference_p> datarefs = vinfo->datarefs;
>    struct data_reference *dr;
>
> +  vect_compute_base_alignments (vinfo);
>    FOR_EACH_VEC_ELT (datarefs, i, dr)
>      {
>        stmt_vec_info stmt_info = vinfo_for_stmt (DR_STMT (dr));
> @@ -3374,7 +3468,8 @@ vect_analyze_data_refs (vec_info *vinfo,
>             {
>               struct data_reference *newdr
>                 = create_data_ref (NULL, loop_containing_stmt (stmt),
> -                                  DR_REF (dr), stmt, maybe_scatter ? false : true);
> +                                  DR_REF (dr), stmt, !maybe_scatter,
> +                                  DR_IS_CONDITIONAL (dr));
>               gcc_assert (newdr != NULL && DR_REF (newdr));
>               if (DR_BASE_ADDRESS (newdr)
>                   && DR_OFFSET (newdr)
> Index: gcc/tree-vect-slp.c
> ===================================================================
> --- gcc/tree-vect-slp.c 2017-06-07 21:58:56.336475882 +0100
> +++ gcc/tree-vect-slp.c 2017-06-22 12:23:21.288421018 +0100
> @@ -2367,6 +2367,7 @@ new_bb_vec_info (gimple_stmt_iterator re
>    gimple_stmt_iterator gsi;
>
>    res = (bb_vec_info) xcalloc (1, sizeof (struct _bb_vec_info));
> +  new (&res->base_alignments) vec_base_alignments ();
>    res->kind = vec_info::bb;
>    BB_VINFO_BB (res) = bb;
>    res->region_begin = region_begin;
> @@ -2741,6 +2742,8 @@ vect_slp_analyze_bb_1 (gimple_stmt_itera
>        return NULL;
>      }
>
> +  vect_compute_base_alignments (bb_vinfo);
> +
>    /* Analyze and verify the alignment of data references and the
>       dependence in the SLP instances.  */
>    for (i = 0; BB_VINFO_SLP_INSTANCES (bb_vinfo).iterate (i, &instance); )
> Index: gcc/tree-vect-loop.c
> ===================================================================
> --- gcc/tree-vect-loop.c        2017-06-22 12:22:57.734313143 +0100
> +++ gcc/tree-vect-loop.c        2017-06-22 12:23:21.287421072 +0100
> @@ -1157,6 +1157,7 @@ new_loop_vec_info (struct loop *loop)
>    LOOP_VINFO_VECT_FACTOR (res) = 0;
>    LOOP_VINFO_LOOP_NEST (res) = vNULL;
>    LOOP_VINFO_DATAREFS (res) = vNULL;
> +  new (&res->base_alignments) vec_base_alignments ();
>    LOOP_VINFO_DDRS (res) = vNULL;
>    LOOP_VINFO_UNALIGNED_DR (res) = NULL;
>    LOOP_VINFO_MAY_MISALIGN_STMTS (res) = vNULL;
> Index: gcc/tree-vectorizer.c
> ===================================================================
> --- gcc/tree-vectorizer.c       2017-06-22 12:22:57.732313220 +0100
> +++ gcc/tree-vectorizer.c       2017-06-22 12:23:21.288421018 +0100
> @@ -370,6 +370,8 @@ vect_destroy_datarefs (vec_info *vinfo)
>        }
>
>    free_data_refs (vinfo->datarefs);
> +
> +  vinfo->base_alignments.~vec_base_alignments ();
>  }
>
>  /* A helper function to free scev and LOOP niter information, as well as
> Index: gcc/testsuite/gcc.dg/vect/pr81136.c
> ===================================================================
> --- /dev/null   2017-06-22 07:43:14.805493307 +0100
> +++ gcc/testsuite/gcc.dg/vect/pr81136.c 2017-06-22 12:23:21.283421287 +0100
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +
> +struct __attribute__((aligned (32)))
> +{
> +  char misaligner;
> +  int foo[100];
> +  int bar[100];
> +} *a;
> +
> +void
> +fn1 (int n)
> +{
> +  int *b = a->foo;
> +  for (int i = 0; i < n; i++)
> +    a->bar[i] = b[i];
> +}
Richard Biener June 23, 2017, 11:24 a.m. UTC | #2
On Fri, Jun 23, 2017 at 1:19 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Thu, Jun 22, 2017 at 1:30 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> The test case triggered this assert in vect_update_misalignment_for_peel:
>>
>>           gcc_assert (DR_MISALIGNMENT (dr) / dr_size ==
>>                       DR_MISALIGNMENT (dr_peel) / dr_peel_size);
>>
>> We knew that the two DRs had the same misalignment at runtime, but when
>> considered in isolation, one data reference guaranteed a higher compile-time
>> base alignment than the other.
>>
>> In the test case this looks like a missed opportunity.  Both references
>> are unconditional, so it should be possible to use the highest of the
>> available base alignment guarantees when analyzing each reference.
>> The patch does this.
>>
>> However, as the comment in the patch says, the base alignment guarantees
>> provided by a conditional reference only apply if the reference occurs
>> at least once.  In this case it would be legitimate for two references
>> to have the same runtime misalignment and for one reference to provide a
>> stronger compile-time guarantee than the other about what the misalignment
>> actually is.  The patch therefore relaxes the assert to handle that case.
>
> Hmm, but you don't actually check whether a reference occurs only conditional,
> do you?  You just seem to say that for masked loads/stores the reference
> is conditional (I believe that's not true).  But for a loop like
>
>  for (;;)
>    if (a[i])
>      sum += b[j];
>
> you still assume b[j] executes unconditionally?
>
> The vectorizer of course only sees unconditionally executed stmts.
>
> So - I'd simply not add this DR_IS_CONDITIONAL.  Did you run into
> any real-world (testsuite) issues without this?

+           for (int i = 0; i < n; ++i)
+             ptr[i] = c[i] ? ((struct s *) (ptr - 1))->array[i] : 0;
+

Note that this example shows an if-conversion bug: if-conversion
makes the access to ((struct s *) (ptr - 1))->array[i] unconditional
if it is deemed unable to trap (or if the trap helper doesn't consider
alignment-related traps).

I'm not convinced we should worry ... (ok, I didn't say that).

Richard.

Richard Sandiford June 23, 2017, 12:05 p.m. UTC | #3
Richard Biener <richard.guenther@gmail.com> writes:
> Hmm, but you don't actually check whether a reference occurs only conditionally,
> do you?  You just seem to say that for masked loads/stores the reference
> is conditional (I believe that's not true).  But for a loop like
>
>  for (;;)
>    if (a[i])
>      sum += b[j];
>
> you still assume b[j] executes unconditionally?

Maybe the documentation isn't clear enough, but DR_IS_CONDITIONAL
was supposed to mean "even if the containing statement executes
and runs to completion, the reference might not actually occur".
The example above isn't conditional in that sense because the
reference to b[j] does occur if the store is reached and completes.

Masked loads and stores are conditional in that sense though.
The reference only occurs if the mask is nonzero; the memory
isn't touched otherwise.  The functions are used to if-convert
things like:

   for (...)
     a[i] = b[i] ? c[i] : d[i];

where there's no guarantee that it's safe to access c[i] when !b[i]
(or d[i] when b[i]).  No reference occurs for an all-false mask.
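
For concreteness, here's a compilable sketch of that kind of loop
(the function and parameter names are made up):

   void
   f (int *a, int *b, int *c, int *d, int n)
   {
     for (int i = 0; i < n; i++)
       a[i] = b[i] ? c[i] : d[i];
   }

On targets with masked loads, if-conversion can turn the c[i] and d[i]
accesses into IFN_MASK_LOADs under complementary masks, so each of
those data references is conditional in the sense above.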

> The vectorizer of course only sees unconditionally executed stmts.
>
> So - I'd simply not add this DR_IS_CONDITIONAL.  Did you run into
> any real-world (testsuite) issues without this?

Dropping DR_IS_CONDITIONAL would cause us to make invalid alignment
assumptions in silly corner cases.  I could add a scan test for it,
for targets with masked loads and stores.  It wouldn't trigger
an execution failure though because we assume that targets with
masked loads and stores allow unaligned accesses:

  /* For now assume all conditional loads/stores support unaligned
     access without any special code.  */
  if (is_gimple_call (stmt)
      && gimple_call_internal_p (stmt)
      && (gimple_call_internal_fn (stmt) == IFN_MASK_LOAD
	  || gimple_call_internal_fn (stmt) == IFN_MASK_STORE))
    return dr_unaligned_supported;

So the worst that would happen is that we'd supposedly peel for
alignment, but actually misalign everything instead, and so make
things slower rather than quicker.

> Note that the assert is to prevent bogus information.  Iff we aligned
> DR with base alignment 8 and misalign 3 then if another same-align
> DR has base alignment 16 we can't simply zero its DR_MISALIGNMENT
> as it still can be 8 after aligning DR.
>
> So I think it's wrong to put DRs with differing base-alignment into
> the same-align-refs chain, those should get their DR_MISALIGNMENT
> updated independently after peeling.

DR_MISALIGNMENT is relative to the vector alignment rather than
the base alignment though.  So:

a) when looking for references *A1 and *A2 with the same alignment,
   we simply have to prove that A1 % vecalign == A2 % vecalign.
   This doesn't require any knowledge about the base alignment.
   If we break the addresses down as:

      A1 = BASE1 + REST1,  REST1 = INIT1 + OFFSET1 + X * STEP1
      A2 = BASE2 + REST2,  REST2 = INIT2 + OFFSET2 + X * STEP2

   and can prove that BASE1 == BASE2, the alignment of that base
   isn't important.  We simply need to prove that REST1 % vecalign
   == REST2 % vecalign for all X.

b) In the assert, we've peeled the loop so that DR_PEEL is guaranteed
   to be vector-aligned.  If DR_PEEL is A1 in the example above, we have
   A1 % vecalign == 0, so A2 % vecalign must be 0 too.  This again doesn't
   rely on the base alignment being known.
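
To put made-up numbers on a) and b): suppose vecalign == 16 and

      A1 = BASE + 4 + X * 16
      A2 = BASE + 20 + X * 16

The X * 16 terms vanish modulo 16, so REST1 % 16 == REST2 % 16 == 4
for all X, whatever BASE % 16 happens to be.  And if peeling makes
A1 % 16 == 0, then BASE % 16 must be 12, so A2 % 16 == 0 follows.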

What a high base alignment for DR_PEEL gives us is the ability to know
at compile time how many iterations need to be peeled to make DR_PEEL aligned.
But the points above apply regardless of whether we know that value at
compile time or not.

In examples like the test case, we would have known at compile time that
VF-1 iterations would need to be peeled if we'd picked the store as the
DR_PEEL, but would have treated the number of peels as variable if we'd
picked the load.  The value calculated at runtime would still have been
VF-1, it's just that the code wouldn't have been as efficient.
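
(Concretely in the test case, assuming 4-byte ints and 16-byte vectors:
a->bar starts 404 bytes into a 32-byte-aligned structure, i.e. 4 bytes
past a 16-byte boundary, so peeling VF - 1 == 3 iterations (12 bytes)
aligns the store, and b = a->foo starts 4 bytes in and needs the same
3 peels.)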

One of the benefits of pooling the alignments for unconditional references
is that it no longer matters which DR we pick: the number of peels will
be a compile-time constant both ways.

Thanks,
Richard

Richard Biener June 26, 2017, 10:25 a.m. UTC | #4
On Fri, Jun 23, 2017 at 2:05 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Thu, Jun 22, 2017 at 1:30 PM, Richard Sandiford
>> <richard.sandiford@linaro.org> wrote:
>>> The test case triggered this assert in vect_update_misalignment_for_peel:
>>>
>>>           gcc_assert (DR_MISALIGNMENT (dr) / dr_size ==
>>>                       DR_MISALIGNMENT (dr_peel) / dr_peel_size);
>>>
>>> We knew that the two DRs had the same misalignment at runtime, but when
>>> considered in isolation, one data reference guaranteed a higher compile-time
>>> base alignment than the other.
>>>
>>> In the test case this looks like a missed opportunity.  Both references
>>> are unconditional, so it should be possible to use the highest of the
>>> available base alignment guarantees when analyzing each reference.
>>> The patch does this.
>>>
>>> However, as the comment in the patch says, the base alignment guarantees
>>> provided by a conditional reference only apply if the reference occurs
>>> at least once.  In this case it would be legitimate for two references
>>> to have the same runtime misalignment and for one reference to provide a
>>> stronger compile-time guarantee than the other about what the misalignment
>>> actually is.  The patch therefore relaxes the assert to handle that case.
>>
>> Hmm, but you don't actually check whether a reference occurs only conditionally,
>> do you?  You just seem to say that for masked loads/stores the reference
>> is conditional (I believe that's not true).  But for a loop like
>>
>>  for (;;)
>>    if (a[i])
>>      sum += b[j];
>>
>> you still assume b[j] executes unconditionally?
>
> Maybe the documentation isn't clear enough, but DR_IS_CONDITIONAL
> was supposed to mean "even if the containing statement executes
> and runs to completion, the reference might not actually occur".
> The example above isn't conditional in that sense because the
> reference to b[j] does occur if the store is reached and completes.
>
> Masked loads and stores are conditional in that sense though.
> The reference only occurs if the mask is nonzero; the memory
> isn't touched otherwise.  The functions are used to if-convert
> things like:
>
>    for (...)
>      a[i] = b[i] ? c[i] : d[i];
>
> where there's no guarantee that it's safe to access c[i] when !b[i]
> (or d[i] when b[i]).  No reference occurs for an all-false mask.

But as you touch generic data-ref code here you should apply more
sensible semantics to DR_IS_CONDITIONAL than just marking
masked loads/stores but not DRs occurring inside BBs only executed
conditionally ...

>> The vectorizer of course only sees unconditionally executed stmts.
>>
>> So - I'd simply not add this DR_IS_CONDITIONAL.  Did you run into
>> any real-world (testsuite) issues without this?
>
> Dropping DR_IS_CONDITIONAL would cause us to make invalid alignment
> assumptions in silly corner cases.  I could add a scan test for it,
> for targets with masked loads and stores.  It wouldn't trigger
> an execution failure though because we assume that targets with
> masked loads and stores allow unaligned accesses:
>
>   /* For now assume all conditional loads/stores support unaligned
>      access without any special code.  */
>   if (is_gimple_call (stmt)
>       && gimple_call_internal_p (stmt)
>       && (gimple_call_internal_fn (stmt) == IFN_MASK_LOAD
>           || gimple_call_internal_fn (stmt) == IFN_MASK_STORE))
>     return dr_unaligned_supported;
>
> So the worst that would happen is that we'd supposedly peel for
> alignment, but actually misalign everything instead, and so make
> things slower rather than quicker.
>
>> Note that the assert is to prevent bogus information.  Iff we aligned
>> DR with base alignment 8 and misalign 3 then if another same-align
>> DR has base alignment 16 we can't simply zero its DR_MISALIGNMENT
>> as it still can be 8 after aligning DR.
>>
>> So I think it's wrong to put DRs with differing base-alignment into
>> the same-align-refs chain, those should get their DR_MISALIGNMENT
>> updated independently after peeling.
>
> DR_MISALIGNMENT is relative to the vector alignment rather than
> the base alignment though.  So:

We seem to use it that way, yes (looking at set_ptr_info_alignment
uses).  So why not fix the assert then by capping the alignment/misalignment
we compute at this value as well?  (and document this in the header
around DR_MISALIGNMENT)
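
Something like this completely untested sketch, at the point where
vect_compute_data_ref_alignment has computed base_alignment:

  /* DR_MISALIGNMENT is measured relative to the vector alignment,
     so cap the base alignment we use at TYPE_ALIGN (vectype) to keep
     the misalignments of same-aligned DRs consistent.  */
  base_alignment = MIN (base_alignment,
                        (unsigned int) TYPE_ALIGN (vectype));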

Ideally we'd do alignment analysis independent of the vector size
though (for those stupid targets with multiple vector sizes to consider...).

>> I'd rather not mix fixing this with the improvement to eventually use a
>> larger align for the other DR if possible.

^^^

So can you fix the ICE with capping base alignment / DR_MISALIGNMENT?

Richard.

Richard Biener June 26, 2017, 10:27 a.m. UTC | #5
On Mon, Jun 26, 2017 at 12:25 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Fri, Jun 23, 2017 at 2:05 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Thu, Jun 22, 2017 at 1:30 PM, Richard Sandiford
>>> <richard.sandiford@linaro.org> wrote:
>>>> The test case triggered this assert in vect_update_misalignment_for_peel:
>>>>
>>>>           gcc_assert (DR_MISALIGNMENT (dr) / dr_size ==
>>>>                       DR_MISALIGNMENT (dr_peel) / dr_peel_size);
>>>>
>>>> We knew that the two DRs had the same misalignment at runtime, but when
>>>> considered in isolation, one data reference guaranteed a higher compile-time
>>>> base alignment than the other.
>>>>
>>>> In the test case this looks like a missed opportunity.  Both references
>>>> are unconditional, so it should be possible to use the highest of the
>>>> available base alignment guarantees when analyzing each reference.
>>>> The patch does this.
>>>>
>>>> However, as the comment in the patch says, the base alignment guarantees
>>>> provided by a conditional reference only apply if the reference occurs
>>>> at least once.  In this case it would be legitimate for two references
>>>> to have the same runtime misalignment and for one reference to provide a
>>>> stronger compile-time guarantee than the other about what the misalignment
>>>> actually is.  The patch therefore relaxes the assert to handle that case.
>>>
>>> Hmm, but you don't actually check whether a reference occurs only conditionally,
>>> do you?  You just seem to say that for masked loads/stores the reference
>>> is conditional (I believe that's not true).  But for a loop like
>>>
>>>  for (;;)
>>>    if (a[i])
>>>      sum += b[j];
>>>
>>> you still assume b[j] executes unconditionally?
>>
>> Maybe the documentation isn't clear enough, but DR_IS_CONDITIONAL
>> was supposed to mean "even if the containing statement executes
>> and runs to completion, the reference might not actually occur".
>> The example above isn't conditional in that sense because the
>> reference to b[j] does occur if the store is reached and completes.
>>
>> Masked loads and stores are conditional in that sense though.
>> The reference only occurs if the mask is nonzero; the memory
>> isn't touched otherwise.  The functions are used to if-convert
>> things like:
>>
>>    for (...)
>>      a[i] = b[i] ? c[i] : d[i];
>>
>> where there's no guarantee that it's safe to access c[i] when !b[i]
>> (or d[i] when b[i]).  No reference occurs for an all-false mask.
>
> But as you touch generic data-ref code here you should apply more
> sensible semantics to DR_IS_CONDITIONAL than just marking
> masked loads/stores but not DRs occurring inside BBs only executed
> conditionally ...
>
>>> The vectorizer of course only sees unconditionally executed stmts.
>>>
>>> So - I'd simply not add this DR_IS_CONDITIONAL.  Did you run into
>>> any real-world (testsuite) issues without this?
>>
>> Dropping DR_IS_CONDITIONAL would cause us to make invalid alignment
>> assumptions in silly corner cases.  I could add a scan test for it,
>> for targets with masked loads and stores.  It wouldn't trigger
>> an execution failure though because we assume that targets with
>> masked loads and stores allow unaligned accesses:
>>
>>   /* For now assume all conditional loads/stores support unaligned
>>      access without any special code.  */
>>   if (is_gimple_call (stmt)
>>       && gimple_call_internal_p (stmt)
>>       && (gimple_call_internal_fn (stmt) == IFN_MASK_LOAD
>>           || gimple_call_internal_fn (stmt) == IFN_MASK_STORE))
>>     return dr_unaligned_supported;
>>
>> So the worst that would happen is that we'd supposedly peel for
>> alignment, but actually misalign everything instead, and so make
>> things slower rather than quicker.
>>
>>> Note that the assert is to prevent bogus information.  Iff we aligned
>>> DR with base alignment 8 and misalign 3 then if another same-align
>>> DR has base alignment 16 we can't simply zero its DR_MISALIGNMENT
>>> as it still can be 8 after aligning DR.
>>>
>>> So I think it's wrong to put DRs with differing base-alignment into
>>> the same-align-refs chain, those should get their DR_MISALIGNMENT
> updated independently after peeling.
>>
>> DR_MISALIGNMENT is relative to the vector alignment rather than
>> the base alignment though.  So:
>
> We seem to use it that way, yes (looking at set_ptr_info_alignment
> uses).  So why not fix the assert then by capping the alignment/misalignment
> we compute at this value as well?  (and document this in the header
> around DR_MISALIGNMENT)
>
> Ideally we'd do alignment analysis independent of the vector size
> though (for those stupid targets with multiple vector sizes to consider...).
>
>> a) when looking for references *A1 and *A2 with the same alignment,
>>    we simply have to prove that A1 % vecalign == A2 % vecalign.
>>    This doesn't require any knowledge about the base alignment.
>>    If we break the addresses down as:
>>
>>       A1 = BASE1 + REST1,  REST1 = INIT1 + OFFSET1 + X * STEP1
>>       A2 = BASE2 + REST2,  REST2 = INIT2 + OFFSET2 + X * STEP2
>>
>>    and can prove that BASE1 == BASE2, the alignment of that base
>>    isn't important.  We simply need to prove that REST1 % vecalign
>>    == REST2 % vecalign for all X.
>>
>> b) In the assert, we've peeled the loop so that DR_PEEL is guaranteed
>>    to be vector-aligned.  If DR_PEEL is A1 in the example above, we have
>>    A1 % vecalign == 0, so A2 % vecalign must be 0 too.  This again doesn't
>>    rely on the base alignment being known.
>>
>> What a high base alignment for DR_PEEL gives us is the ability to know
>> at compile how many iterations need to be peeled to make DR_PEEL aligned.
>> But the points above apply regardless of whether we know that value at
>> compile time or not.
>>
>> In examples like the test case, we would have known at compile time that
>> VF-1 iterations would need to be peeled if we'd picked the store as the
>> DR_PEEL, but would have treated the number of peels as variable if we'd
>> picked the load.  The value calculated at runtime would still have been
>> VF-1, it's just that the code wouldn't have been as efficient.
>>
>> One of the benefits of pooling the alignments for unconditional references
>> is that it no longer matters which DR we pick: the number of peels will
>> be a compile-time constant both ways.
>>
>> Thanks,
>> Richard
>>
>>> I'd rather not mix fixing this with the improvement to eventually use a
>>> larger align for the other DR if possible.
>
> ^^^
>
> So can you fix the ICE with capping base alignment / DR_MISALIGNMENT?

Oh, and as you are touching the generic tree-data-ref.[ch] code: on my
wishlist is still "properly" computing the base alignment in
tree-data-ref.[ch] itself rather than trying to undo analyze_innermost
in the vectorizer and second-guessing the alignment.  Basically, make
alignment/misalignment first-class citizens there.
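
Something like this, say (purely a sketch -- the new fields and the
accessor names below are hypothetical, not existing API):

  /* innermost_loop_behavior, extended so that dr_analyze_innermost
     itself records what it can prove about the base.  All names here
     are made up for illustration.  */
  struct innermost_loop_behavior
  {
    tree base_address;
    tree offset;
    tree init;
    tree step;
    /* Known alignment of BASE_ADDRESS in bytes, and the known
       misalignment of BASE_ADDRESS relative to that alignment.  */
    unsigned int base_alignment;
    unsigned int base_misalignment;
  };

  #define DR_BASE_ALIGNMENT(DR)    (DR)->innermost.base_alignment
  #define DR_BASE_MISALIGNMENT(DR) (DR)->innermost.base_misalignment

The vectorizer could then derive DR_MISALIGNMENT for any vector size
from those two values plus DR_INIT/DR_OFFSET/DR_STEP, instead of
re-walking DR_REF after the fact.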

Richard.

> Richard.
>
>>> Thanks,
>>> Richard.
>>>
>>>> Tested on powerpc64-linux-gnu, aarch64-linux-gnu and x86_64-linux-gnu.
>>>> OK to install?
>>>>
>>>> Richard
>>>>
>>>>
>>>> 2017-06-22  Richard Sandiford  <richard.sandiford@linaro.org>
>>>>
>>>> gcc/
>>>>         PR tree-optimization/81136
>>>>         * tree-vectorizer.h: Include tree-hash-traits.h.
>>>>         (vec_base_alignments): New typedef.
>>>>         (vec_info): Add a base_alignments field.
>>>>         (vect_compute_base_alignments: Declare.
>>>>         * tree-data-ref.h (data_reference): Add an is_conditional field.
>>>>         (DR_IS_CONDITIONAL): New macro.
>>>>         (create_data_ref): Add an is_conditional argument.
>>>>         * tree-data-ref.c (create_data_ref): Likewise.  Use it to initialize
>>>>         the is_conditional field.
>>>>         (data_ref_loc): Add an is_conditional field.
>>>>         (get_references_in_stmt): Set the is_conditional field.
>>>>         (find_data_references_in_stmt): Update call to create_data_ref.
>>>>         (graphite_find_data_references_in_stmt): Likewise.
>>>>         * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Likewise.
>>>>         * tree-vect-data-refs.c (vect_analyze_data_refs): Likewise.
>>>>         (vect_get_base_address): New function.
>>>>         (vect_compute_base_alignments): Likewise.
>>>>         (vect_compute_base_alignment): Likewise, split out from...
>>>>         (vect_compute_data_ref_alignment): ...here.  Use precomputed
>>>>         base alignments.  Only compute a new base alignment here if the
>>>>         reference is conditional.
>>>>         (vect_update_misalignment_for_peel): Allow the compile-time
>>>>         DR_MISALIGNMENTs of two references with the same runtime alignment
>>>>         to be different if one of the references is conditional.
>>>>         (vect_find_same_alignment_drs): Compare base addresses instead
>>>>         of base objects.
>>>>         (vect_compute_data_ref_alignment): Call vect_compute_base_alignments.
>>>>         * tree-vect-slp.c (vect_slp_analyze_bb_1): Likewise.
>>>>         (new_bb_vec_info): Initialize base_alignments.
>>>>         * tree-vect-loop.c (new_loop_vec_info): Likewise.
>>>>         * tree-vectorizer.c (vect_destroy_datarefs): Release base_alignments.
>>>>
>>>> gcc/testsuite/
>>>>         PR tree-optimization/81136
>>>>         * gcc.dg/vect/pr81136.c: New test.
>>>>
>>>> Index: gcc/tree-vectorizer.h
>>>> ===================================================================
>>>> --- gcc/tree-vectorizer.h       2017-06-08 08:51:43.347264181 +0100
>>>> +++ gcc/tree-vectorizer.h       2017-06-22 12:23:21.288421018 +0100
>>>> @@ -22,6 +22,7 @@ Software Foundation; either version 3, o
>>>>  #define GCC_TREE_VECTORIZER_H
>>>>
>>>>  #include "tree-data-ref.h"
>>>> +#include "tree-hash-traits.h"
>>>>  #include "target.h"
>>>>
>>>>  /* Used for naming of new temporaries.  */
>>>> @@ -84,6 +85,10 @@ struct stmt_info_for_cost {
>>>>
>>>>  typedef vec<stmt_info_for_cost> stmt_vector_for_cost;
>>>>
>>>> +/* Maps base addresses to the largest alignment that we've been able
>>>> +   to calculate for them.  */
>>>> +typedef hash_map<tree_operand_hash, unsigned int> vec_base_alignments;
>>>> +
>>>>  /************************************************************************
>>>>    SLP
>>>>   ************************************************************************/
>>>> @@ -156,6 +161,10 @@ struct vec_info {
>>>>    /* All data references.  */
>>>>    vec<data_reference_p> datarefs;
>>>>
>>>> +  /* Maps the base addresses of all data references in DATAREFS to the
>>>> +     largest alignment that we've been able to calculate for them.  */
>>>> +  vec_base_alignments base_alignments;
>>>> +
>>>>    /* All data dependences.  */
>>>>    vec<ddr_p> ddrs;
>>>>
>>>> @@ -1117,6 +1126,7 @@ extern bool vect_prune_runtime_alias_tes
>>>>  extern bool vect_check_gather_scatter (gimple *, loop_vec_info,
>>>>                                        gather_scatter_info *);
>>>>  extern bool vect_analyze_data_refs (vec_info *, int *);
>>>> +extern void vect_compute_base_alignments (vec_info *);
>>>>  extern tree vect_create_data_ref_ptr (gimple *, tree, struct loop *, tree,
>>>>                                       tree *, gimple_stmt_iterator *,
>>>>                                       gimple **, bool, bool *,
>>>> Index: gcc/tree-data-ref.h
>>>> ===================================================================
>>>> --- gcc/tree-data-ref.h 2017-06-08 08:51:43.349263895 +0100
>>>> +++ gcc/tree-data-ref.h 2017-06-22 12:23:21.285421180 +0100
>>>> @@ -119,6 +119,10 @@ struct data_reference
>>>>    /* True when the data reference is in RHS of a stmt.  */
>>>>    bool is_read;
>>>>
>>>> +  /* True when the data reference is conditional, i.e. if it might not
>>>> +     occur even when the statement runs to completion.  */
>>>> +  bool is_conditional;
>>>> +
>>>>    /* Behavior of the memory reference in the innermost loop.  */
>>>>    struct innermost_loop_behavior innermost;
>>>>
>>>> @@ -138,6 +142,7 @@ #define DR_ACCESS_FN(DR, I)        DR_AC
>>>>  #define DR_NUM_DIMENSIONS(DR)      DR_ACCESS_FNS (DR).length ()
>>>>  #define DR_IS_READ(DR)             (DR)->is_read
>>>>  #define DR_IS_WRITE(DR)            (!DR_IS_READ (DR))
>>>> +#define DR_IS_CONDITIONAL(DR)      (DR)->is_conditional
>>>>  #define DR_BASE_ADDRESS(DR)        (DR)->innermost.base_address
>>>>  #define DR_OFFSET(DR)              (DR)->innermost.offset
>>>>  #define DR_INIT(DR)                (DR)->innermost.init
>>>> @@ -350,7 +355,8 @@ extern bool graphite_find_data_reference
>>>>                                                    vec<data_reference_p> *);
>>>>  tree find_data_references_in_loop (struct loop *, vec<data_reference_p> *);
>>>>  bool loop_nest_has_data_refs (loop_p loop);
>>>> -struct data_reference *create_data_ref (loop_p, loop_p, tree, gimple *, bool);
>>>> +struct data_reference *create_data_ref (loop_p, loop_p, tree, gimple *, bool,
>>>> +                                       bool);
>>>>  extern bool find_loop_nest (struct loop *, vec<loop_p> *);
>>>>  extern struct data_dependence_relation *initialize_data_dependence_relation
>>>>       (struct data_reference *, struct data_reference *, vec<loop_p>);
>>>> Index: gcc/tree-data-ref.c
>>>> ===================================================================
>>>> --- gcc/tree-data-ref.c 2017-06-08 08:51:43.349263895 +0100
>>>> +++ gcc/tree-data-ref.c 2017-06-22 12:23:21.284421233 +0100
>>>> @@ -1053,15 +1053,18 @@ free_data_ref (data_reference_p dr)
>>>>    free (dr);
>>>>  }
>>>>
>>>> -/* Analyzes memory reference MEMREF accessed in STMT.  The reference
>>>> -   is read if IS_READ is true, write otherwise.  Returns the
>>>> -   data_reference description of MEMREF.  NEST is the outermost loop
>>>> -   in which the reference should be instantiated, LOOP is the loop in
>>>> -   which the data reference should be analyzed.  */
>>>> +/* Analyze memory reference MEMREF, which is accessed in STMT.  The reference
>>>> +   is a read if IS_READ is true, otherwise it is a write.  IS_CONDITIONAL
>>>> +   indicates that the reference is conditional, i.e. that it might not
>>>> +   occur every time that STMT runs to completion.
>>>> +
>>>> +   Return the data_reference description of MEMREF.  NEST is the outermost
>>>> +   loop in which the reference should be instantiated, LOOP is the loop
>>>> +   in which the data reference should be analyzed.  */
>>>>
>>>>  struct data_reference *
>>>>  create_data_ref (loop_p nest, loop_p loop, tree memref, gimple *stmt,
>>>> -                bool is_read)
>>>> +                bool is_read, bool is_conditional)
>>>>  {
>>>>    struct data_reference *dr;
>>>>
>>>> @@ -1076,6 +1079,7 @@ create_data_ref (loop_p nest, loop_p loo
>>>>    DR_STMT (dr) = stmt;
>>>>    DR_REF (dr) = memref;
>>>>    DR_IS_READ (dr) = is_read;
>>>> +  DR_IS_CONDITIONAL (dr) = is_conditional;
>>>>
>>>>    dr_analyze_innermost (dr, nest);
>>>>    dr_analyze_indices (dr, nest, loop);
>>>> @@ -4446,6 +4450,10 @@ struct data_ref_loc
>>>>
>>>>    /* True if the memory reference is read.  */
>>>>    bool is_read;
>>>> +
>>>> +  /* True if the data reference is conditional, i.e. if it might not
>>>> +     occur even when the statement runs to completion.  */
>>>> +  bool is_conditional;
>>>>  };
>>>>
>>>>
>>>> @@ -4512,6 +4520,7 @@ get_references_in_stmt (gimple *stmt, ve
>>>>         {
>>>>           ref.ref = op1;
>>>>           ref.is_read = true;
>>>> +         ref.is_conditional = false;
>>>>           references->safe_push (ref);
>>>>         }
>>>>      }
>>>> @@ -4539,6 +4548,7 @@ get_references_in_stmt (gimple *stmt, ve
>>>>               type = TREE_TYPE (gimple_call_arg (stmt, 3));
>>>>             if (TYPE_ALIGN (type) != align)
>>>>               type = build_aligned_type (type, align);
>>>> +           ref.is_conditional = true;
>>>>             ref.ref = fold_build2 (MEM_REF, type, gimple_call_arg (stmt, 0),
>>>>                                    ptr);
>>>>             references->safe_push (ref);
>>>> @@ -4558,6 +4568,7 @@ get_references_in_stmt (gimple *stmt, ve
>>>>             {
>>>>               ref.ref = op1;
>>>>               ref.is_read = true;
>>>> +             ref.is_conditional = false;
>>>>               references->safe_push (ref);
>>>>             }
>>>>         }
>>>> @@ -4571,6 +4582,7 @@ get_references_in_stmt (gimple *stmt, ve
>>>>      {
>>>>        ref.ref = op0;
>>>>        ref.is_read = false;
>>>> +      ref.is_conditional = false;
>>>>        references->safe_push (ref);
>>>>      }
>>>>    return clobbers_memory;
>>>> @@ -4635,8 +4647,8 @@ find_data_references_in_stmt (struct loo
>>>>
>>>>    FOR_EACH_VEC_ELT (references, i, ref)
>>>>      {
>>>> -      dr = create_data_ref (nest, loop_containing_stmt (stmt),
>>>> -                           ref->ref, stmt, ref->is_read);
>>>> +      dr = create_data_ref (nest, loop_containing_stmt (stmt), ref->ref,
>>>> +                           stmt, ref->is_read, ref->is_conditional);
>>>>        gcc_assert (dr != NULL);
>>>>        datarefs->safe_push (dr);
>>>>      }
>>>> @@ -4665,7 +4677,8 @@ graphite_find_data_references_in_stmt (l
>>>>
>>>>    FOR_EACH_VEC_ELT (references, i, ref)
>>>>      {
>>>> -      dr = create_data_ref (nest, loop, ref->ref, stmt, ref->is_read);
>>>> +      dr = create_data_ref (nest, loop, ref->ref, stmt, ref->is_read,
>>>> +                           ref->is_conditional);
>>>>        gcc_assert (dr != NULL);
>>>>        datarefs->safe_push (dr);
>>>>      }
>>>> Index: gcc/tree-ssa-loop-prefetch.c
>>>> ===================================================================
>>>> --- gcc/tree-ssa-loop-prefetch.c        2017-06-07 21:58:55.928557601 +0100
>>>> +++ gcc/tree-ssa-loop-prefetch.c        2017-06-22 12:23:21.285421180 +0100
>>>> @@ -1633,7 +1633,7 @@ determine_loop_nest_reuse (struct loop *
>>>>      for (ref = gr->refs; ref; ref = ref->next)
>>>>        {
>>>>         dr = create_data_ref (nest, loop_containing_stmt (ref->stmt),
>>>> -                             ref->mem, ref->stmt, !ref->write_p);
>>>> +                             ref->mem, ref->stmt, !ref->write_p, false);
>>>>
>>>>         if (dr)
>>>>           {
>>>> Index: gcc/tree-vect-data-refs.c
>>>> ===================================================================
>>>> --- gcc/tree-vect-data-refs.c   2017-06-08 08:51:43.350263752 +0100
>>>> +++ gcc/tree-vect-data-refs.c   2017-06-22 12:23:21.286421126 +0100
>>>> @@ -646,6 +646,102 @@ vect_slp_analyze_instance_dependence (sl
>>>>    return res;
>>>>  }
>>>>
>>>> +/* If DR is nested in a loop that is being vectorized, return the base
>>>> +   address in the context of the vectorized loop (rather than the
>>>> +   nested loop).  Otherwise return the base address in the context
>>>> +   of the containing statement.  */
>>>> +
>>>> +static tree
>>>> +vect_get_base_address (data_reference *dr)
>>>> +{
>>>> +  gimple *stmt = DR_STMT (dr);
>>>> +  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
>>>> +  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>>>> +  struct loop *loop = loop_vinfo != NULL ? LOOP_VINFO_LOOP (loop_vinfo) : NULL;
>>>> +  if (loop && nested_in_vect_loop_p (loop, stmt))
>>>> +    return STMT_VINFO_DR_BASE_ADDRESS (stmt_info);
>>>> +  else
>>>> +    return DR_BASE_ADDRESS (dr);
>>>> +}
>>>> +
>>>> +/* Compute and return the alignment of base address BASE_ADDR in DR.  */
>>>> +
>>>> +static unsigned int
>>>> +vect_compute_base_alignment (data_reference *dr, tree base_addr)
>>>> +{
>>>> +  /* To look at the alignment of the base we have to preserve an inner
>>>> +     MEM_REF as that carries the alignment information of the actual
>>>> +     access.  */
>>>> +  tree base = DR_REF (dr);
>>>> +  while (handled_component_p (base))
>>>> +    base = TREE_OPERAND (base, 0);
>>>> +  unsigned int base_alignment = 0;
>>>> +  unsigned HOST_WIDE_INT base_bitpos;
>>>> +  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
>>>> +
>>>> +  /* As data-ref analysis strips the MEM_REF down to its base operand
>>>> +     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
>>>> +     adjust things to make base_alignment valid as the alignment of
>>>> +     DR_BASE_ADDRESS.  */
>>>> +  if (TREE_CODE (base) == MEM_REF)
>>>> +    {
>>>> +      /* Note all this only works if DR_BASE_ADDRESS is the same as
>>>> +        MEM_REF operand zero, otherwise DR/SCEV analysis might have factored
>>>> +        in other offsets.  We need to rework DR to compute the alignment
>>>> +        of DR_BASE_ADDRESS as long as all information is still available.  */
>>>> +      if (operand_equal_p (TREE_OPERAND (base, 0), base_addr, 0))
>>>> +       {
>>>> +         base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
>>>> +         base_bitpos &= (base_alignment - 1);
>>>> +       }
>>>> +      else
>>>> +       base_bitpos = BITS_PER_UNIT;
>>>> +    }
>>>> +  if (base_bitpos != 0)
>>>> +    base_alignment = base_bitpos & -base_bitpos;
>>>> +
>>>> +  /* Also look at the alignment of the base address DR analysis
>>>> +     computed.  */
>>>> +  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
>>>> +  if (base_addr_alignment > base_alignment)
>>>> +    base_alignment = base_addr_alignment;
>>>> +
>>>> +  return base_alignment;
>>>> +}
>>>> +
>>>> +/* Compute alignments for the base addresses of all datarefs in VINFO.  */
>>>> +
>>>> +void
>>>> +vect_compute_base_alignments (vec_info *vinfo)
>>>> +{
>>>> +  /* If the region we're going to vectorize is reached, all unconditional
>>>> +     data references occur at least once.  We can therefore pool the base
>>>> +     alignment guarantees from each unconditional reference.  */
>>>> +  data_reference *dr;
>>>> +  unsigned int i;
>>>> +  FOR_EACH_VEC_ELT (vinfo->datarefs, i, dr)
>>>> +    if (!DR_IS_CONDITIONAL (dr))
>>>> +      {
>>>> +       tree base_addr = vect_get_base_address (dr);
>>>> +       unsigned int alignment = vect_compute_base_alignment (dr, base_addr);
>>>> +       bool existed;
>>>> +       unsigned int &entry
>>>> +         = vinfo->base_alignments.get_or_insert (base_addr, &existed);
>>>> +       if (!existed || entry < alignment)
>>>> +         {
>>>> +           entry = alignment;
>>>> +           if (dump_enabled_p ())
>>>> +             {
>>>> +               dump_printf_loc (MSG_NOTE, vect_location,
>>>> +                                "setting base alignment for ");
>>>> +               dump_generic_expr (MSG_NOTE, TDF_SLIM, base_addr);
>>>> +               dump_printf (MSG_NOTE, " to %d, based on ", alignment);
>>>> +               dump_gimple_stmt (MSG_NOTE, TDF_SLIM, DR_STMT (dr), 0);
>>>> +             }
>>>> +         }
>>>> +      }
>>>> +}
>>>> +
>>>>  /* Function vect_compute_data_ref_alignment
>>>>
>>>>     Compute the misalignment of the data reference DR.
>>>> @@ -663,6 +759,7 @@ vect_compute_data_ref_alignment (struct
>>>>  {
>>>>    gimple *stmt = DR_STMT (dr);
>>>>    stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
>>>> +  vec_base_alignments *base_alignments = &stmt_info->vinfo->base_alignments;
>>>>    loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
>>>>    struct loop *loop = NULL;
>>>>    tree ref = DR_REF (dr);
>>>> @@ -699,6 +796,8 @@ vect_compute_data_ref_alignment (struct
>>>>      {
>>>>        tree step = DR_STEP (dr);
>>>>
>>>> +      base_addr = STMT_VINFO_DR_BASE_ADDRESS (stmt_info);
>>>> +      aligned_to = STMT_VINFO_DR_ALIGNED_TO (stmt_info);
>>>>        if (tree_fits_shwi_p (step)
>>>>           && tree_to_shwi (step) % GET_MODE_SIZE (TYPE_MODE (vectype)) == 0)
>>>>          {
>>>> @@ -706,8 +805,6 @@ vect_compute_data_ref_alignment (struct
>>>>              dump_printf_loc (MSG_NOTE, vect_location,
>>>>                               "inner step divides the vector-size.\n");
>>>>           misalign = STMT_VINFO_DR_INIT (stmt_info);
>>>> -         aligned_to = STMT_VINFO_DR_ALIGNED_TO (stmt_info);
>>>> -         base_addr = STMT_VINFO_DR_BASE_ADDRESS (stmt_info);
>>>>          }
>>>>        else
>>>>         {
>>>> @@ -738,39 +835,15 @@ vect_compute_data_ref_alignment (struct
>>>>         }
>>>>      }
>>>>
>>>> -  /* To look at alignment of the base we have to preserve an inner MEM_REF
>>>> -     as that carries alignment information of the actual access.  */
>>>> -  base = ref;
>>>> -  while (handled_component_p (base))
>>>> -    base = TREE_OPERAND (base, 0);
>>>> +  /* Calculate the maximum of the pooled base address alignment and the
>>>> +     alignment that we can compute for DR itself.  The latter should
>>>> +     already be included in the former for unconditional references.  */
>>>>    unsigned int base_alignment = 0;
>>>> -  unsigned HOST_WIDE_INT base_bitpos;
>>>> -  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
>>>> -  /* As data-ref analysis strips the MEM_REF down to its base operand
>>>> -     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
>>>> -     adjust things to make base_alignment valid as the alignment of
>>>> -     DR_BASE_ADDRESS.  */
>>>> -  if (TREE_CODE (base) == MEM_REF)
>>>> -    {
>>>> -      /* Note all this only works if DR_BASE_ADDRESS is the same as
>>>> -        MEM_REF operand zero, otherwise DR/SCEV analysis might have factored
>>>> -        in other offsets.  We need to rework DR to compute the alignment
>>>> -        of DR_BASE_ADDRESS as long as all information is still available.  */
>>>> -      if (operand_equal_p (TREE_OPERAND (base, 0), base_addr, 0))
>>>> -       {
>>>> -         base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
>>>> -         base_bitpos &= (base_alignment - 1);
>>>> -       }
>>>> -      else
>>>> -       base_bitpos = BITS_PER_UNIT;
>>>> -    }
>>>> -  if (base_bitpos != 0)
>>>> -    base_alignment = base_bitpos & -base_bitpos;
>>>> -  /* Also look at the alignment of the base address DR analysis
>>>> -     computed.  */
>>>> -  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
>>>> -  if (base_addr_alignment > base_alignment)
>>>> -    base_alignment = base_addr_alignment;
>>>> +  if (DR_IS_CONDITIONAL (dr))
>>>> +    base_alignment = vect_compute_base_alignment (dr, base_addr);
>>>> +  if (unsigned int *entry = base_alignments->get (base_addr))
>>>> +    base_alignment = MAX (base_alignment, *entry);
>>>> +  gcc_assert (base_alignment != 0);
>>>>
>>>>    if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
>>>>      DR_VECT_AUX (dr)->base_element_aligned = true;
>>>> @@ -906,8 +979,29 @@ vect_update_misalignment_for_peel (struc
>>>>      {
>>>>        if (current_dr != dr)
>>>>          continue;
>>>> -      gcc_assert (DR_MISALIGNMENT (dr) / dr_size ==
>>>> -                  DR_MISALIGNMENT (dr_peel) / dr_peel_size);
>>>> +      /* Any alignment guarantees provided by a reference only apply if
>>>> +        the reference actually occurs.  For example, in:
>>>> +
>>>> +           struct s __attribute__((aligned(32))) {
>>>> +             int misaligner;
>>>> +             int array[N];
>>>> +           };
>>>> +
>>>> +           int *ptr;
>>>> +           for (int i = 0; i < n; ++i)
>>>> +             ptr[i] = c[i] ? ((struct s *) (ptr - 1))->array[i] : 0;
>>>> +
>>>> +        we can only assume that ptr is part of a struct s if at least one
>>>> +        c[i] is true.  This in turn means that we have a higher base
>>>> +        alignment guarantee for the read from ptr (if it occurs) than for
>>>> +        the write to ptr, and we cannot unconditionally carry the former
>>>> +        over to the latter.  We still know that the two address values
>>>> +        have the same misalignment, so if peeling has forced one of them
>>>> +        to be aligned, the other must be too.  */
>>>> +      gcc_assert (DR_IS_CONDITIONAL (dr_peel)
>>>> +                 || DR_IS_CONDITIONAL (dr)
>>>> +                 || (DR_MISALIGNMENT (dr) / dr_size
>>>> +                     == DR_MISALIGNMENT (dr_peel) / dr_peel_size));
>>>>        SET_DR_MISALIGNMENT (dr, 0);
>>>>        return;
>>>>      }
>>>> @@ -2117,8 +2211,7 @@ vect_find_same_alignment_drs (struct dat
>>>>    if (dra == drb)
>>>>      return;
>>>>
>>>> -  if (!operand_equal_p (DR_BASE_OBJECT (dra), DR_BASE_OBJECT (drb),
>>>> -                       OEP_ADDRESS_OF)
>>>> +  if (!operand_equal_p (DR_BASE_ADDRESS (dra), DR_BASE_ADDRESS (drb), 0)
>>>>        || !operand_equal_p (DR_OFFSET (dra), DR_OFFSET (drb), 0)
>>>>        || !operand_equal_p (DR_STEP (dra), DR_STEP (drb), 0))
>>>>      return;
>>>> @@ -2176,6 +2269,7 @@ vect_analyze_data_refs_alignment (loop_v
>>>>    vec<data_reference_p> datarefs = vinfo->datarefs;
>>>>    struct data_reference *dr;
>>>>
>>>> +  vect_compute_base_alignments (vinfo);
>>>>    FOR_EACH_VEC_ELT (datarefs, i, dr)
>>>>      {
>>>>        stmt_vec_info stmt_info = vinfo_for_stmt (DR_STMT (dr));
>>>> @@ -3374,7 +3468,8 @@ vect_analyze_data_refs (vec_info *vinfo,
>>>>             {
>>>>               struct data_reference *newdr
>>>>                 = create_data_ref (NULL, loop_containing_stmt (stmt),
>>>> -                                  DR_REF (dr), stmt, maybe_scatter ? false : true);
>>>> +                                  DR_REF (dr), stmt, !maybe_scatter,
>>>> +                                  DR_IS_CONDITIONAL (dr));
>>>>               gcc_assert (newdr != NULL && DR_REF (newdr));
>>>>               if (DR_BASE_ADDRESS (newdr)
>>>>                   && DR_OFFSET (newdr)
>>>> Index: gcc/tree-vect-slp.c
>>>> ===================================================================
>>>> --- gcc/tree-vect-slp.c 2017-06-07 21:58:56.336475882 +0100
>>>> +++ gcc/tree-vect-slp.c 2017-06-22 12:23:21.288421018 +0100
>>>> @@ -2367,6 +2367,7 @@ new_bb_vec_info (gimple_stmt_iterator re
>>>>    gimple_stmt_iterator gsi;
>>>>
>>>>    res = (bb_vec_info) xcalloc (1, sizeof (struct _bb_vec_info));
>>>> +  new (&res->base_alignments) vec_base_alignments ();
>>>>    res->kind = vec_info::bb;
>>>>    BB_VINFO_BB (res) = bb;
>>>>    res->region_begin = region_begin;
>>>> @@ -2741,6 +2742,8 @@ vect_slp_analyze_bb_1 (gimple_stmt_itera
>>>>        return NULL;
>>>>      }
>>>>
>>>> +  vect_compute_base_alignments (bb_vinfo);
>>>> +
>>>>    /* Analyze and verify the alignment of data references and the
>>>>       dependence in the SLP instances.  */
>>>>    for (i = 0; BB_VINFO_SLP_INSTANCES (bb_vinfo).iterate (i, &instance); )
>>>> Index: gcc/tree-vect-loop.c
>>>> ===================================================================
>>>> --- gcc/tree-vect-loop.c        2017-06-22 12:22:57.734313143 +0100
>>>> +++ gcc/tree-vect-loop.c        2017-06-22 12:23:21.287421072 +0100
>>>> @@ -1157,6 +1157,7 @@ new_loop_vec_info (struct loop *loop)
>>>>    LOOP_VINFO_VECT_FACTOR (res) = 0;
>>>>    LOOP_VINFO_LOOP_NEST (res) = vNULL;
>>>>    LOOP_VINFO_DATAREFS (res) = vNULL;
>>>> +  new (&res->base_alignments) vec_base_alignments ();
>>>>    LOOP_VINFO_DDRS (res) = vNULL;
>>>>    LOOP_VINFO_UNALIGNED_DR (res) = NULL;
>>>>    LOOP_VINFO_MAY_MISALIGN_STMTS (res) = vNULL;
>>>> Index: gcc/tree-vectorizer.c
>>>> ===================================================================
>>>> --- gcc/tree-vectorizer.c       2017-06-22 12:22:57.732313220 +0100
>>>> +++ gcc/tree-vectorizer.c       2017-06-22 12:23:21.288421018 +0100
>>>> @@ -370,6 +370,8 @@ vect_destroy_datarefs (vec_info *vinfo)
>>>>        }
>>>>
>>>>    free_data_refs (vinfo->datarefs);
>>>> +
>>>> +  vinfo->base_alignments.~vec_base_alignments ();
>>>>  }
>>>>
>>>>  /* A helper function to free scev and LOOP niter information, as well as
>>>> Index: gcc/testsuite/gcc.dg/vect/pr81136.c
>>>> ===================================================================
>>>> --- /dev/null   2017-06-22 07:43:14.805493307 +0100
>>>> +++ gcc/testsuite/gcc.dg/vect/pr81136.c 2017-06-22 12:23:21.283421287 +0100
>>>> @@ -0,0 +1,16 @@
>>>> +/* { dg-do compile } */
>>>> +
>>>> +struct __attribute__((aligned (32)))
>>>> +{
>>>> +  char misaligner;
>>>> +  int foo[100];
>>>> +  int bar[100];
>>>> +} *a;
>>>> +
>>>> +void
>>>> +fn1 (int n)
>>>> +{
>>>> +  int *b = a->foo;
>>>> +  for (int i = 0; i < n; i++)
>>>> +    a->bar[i] = b[i];
>>>> +}
Richard Sandiford June 26, 2017, 11:14 a.m. UTC | #6
Richard Biener <richard.guenther@gmail.com> writes:
> On Fri, Jun 23, 2017 at 2:05 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Thu, Jun 22, 2017 at 1:30 PM, Richard Sandiford
>>> <richard.sandiford@linaro.org> wrote:
>>>> The test case triggered this assert in vect_update_misalignment_for_peel:
>>>>
>>>>           gcc_assert (DR_MISALIGNMENT (dr) / dr_size ==
>>>>                       DR_MISALIGNMENT (dr_peel) / dr_peel_size);
>>>>
>>>> We knew that the two DRs had the same misalignment at runtime, but when
>>>> considered in isolation, one data reference guaranteed a higher compile-time
>>>> base alignment than the other.
>>>>
>>>> In the test case this looks like a missed opportunity.  Both references
>>>> are unconditional, so it should be possible to use the highest of the
>>>> available base alignment guarantees when analyzing each reference.
>>>> The patch does this.
>>>>
>>>> However, as the comment in the patch says, the base alignment guarantees
>>>> provided by a conditional reference only apply if the reference occurs
>>>> at least once.  In this case it would be legitimate for two references
>>>> to have the same runtime misalignment and for one reference to provide a
>>>> stronger compile-time guarantee than the other about what the misalignment
>>>> actually is.  The patch therefore relaxes the assert to handle that case.
>>>
>>> Hmm, but you don't actually check whether a reference occurs only
>>> conditionally,
>>> do you?  You just seem to say that for masked loads/stores the reference
>>> is conditional (I believe that's not true).  But for a loop like
>>>
>>>  for (;;)
>>>    if (a[i])
>>>      sum += b[j];
>>>
>>> you still assume b[j] executes unconditionally?
>>
>> Maybe the documentation isn't clear enough, but DR_IS_CONDITIONAL
>> was supposed to mean "even if the containing statement executes
>> and runs to completion, the reference might not actually occur".
>> The example above isn't conditional in that sense because the
>> reference to b[j] does occur if the store is reached and completes.
>>
>> Masked loads and stores are conditional in that sense though.
>> The reference only occurs if the mask is nonzero; the memory
>> isn't touched otherwise.  The functions are used to if-convert
>> things like:
>>
>>    for (...)
>>      a[i] = b[i] ? c[i] : d[i];
>>
>> where there's no guarantee that it's safe to access c[i] when !b[i]
>> (or d[i] when b[i]).  No reference occurs for an all-false mask.
>
> But as you touch generic data-ref code here you should apply more
> sensible semantics to DR_IS_CONDITIONAL than just marking
> masked loads/stores but not DRs occurring inside BBs only executed
> conditionally ...

I don't see why that's more sensible though.  If a statement is only
conditionally executed in a loop, it's up to the consumer to decide
what to do about that.  The conditions under which the statement
is reached are a control-flow issue and tree-data-ref.c doesn't
have any special information about it.

Masked loads and stores are special because the DR_REFs created by
tree-data-ref.c are artificial: they didn't exist as MEM_REFs in the
original DR_STMT.  And AIUI they didn't exist as MEM_REFs precisely
because they're not guaranteed to happen, even if the load or store
statement itself is executed.  So in this case the DR_IS_CONDITIONAL
is reflecting something that tree-data-ref.c itself has done.
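
E.g. for the a[i] = b[i] ? c[i] : d[i] loop above, tree-if-conv.c
produces something like this (paraphrased gimple -- the exact argument
list of the internal calls is from memory, not an actual dump):

  _mask_1 = b[i_5] != 0;
  _mask_2 = b[i_5] == 0;
  _2 = MASK_LOAD (&c[i_5], 0B, _mask_1);   /* internal call, no MEM_REF */
  _3 = MASK_LOAD (&d[i_5], 0B, _mask_2);
  _4 = _mask_1 ? _2 : _3;
  a[i_5] = _4;

and it's only get_references_in_stmt that manufactures a MEM_REF from
the pointer argument so that the rest of data-ref analysis has
something to work with.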

How about calling it DR_IS_CONDITIONAL_IN_STMT to avoid the
general-sounding name?

>>> The vectorizer of course only sees unconditionally executed stmts.
>>>
>>> So - I'd simply not add this DR_IS_CONDITIONAL.  Did you run into
>>> any real-world (testsuite) issues without this?
>>
>> Dropping DR_IS_CONDITIONAL would cause us to make invalid alignment
>> assumptions in silly corner cases.  I could add a scan test for it,
>> for targets with masked loads and stores.  It wouldn't trigger
>> an execution failure though because we assume that targets with
>> masked loads and stores allow unaligned accesses:
>>
>>   /* For now assume all conditional loads/stores support unaligned
>>      access without any special code.  */
>>   if (is_gimple_call (stmt)
>>       && gimple_call_internal_p (stmt)
>>       && (gimple_call_internal_fn (stmt) == IFN_MASK_LOAD
>>           || gimple_call_internal_fn (stmt) == IFN_MASK_STORE))
>>     return dr_unaligned_supported;
>>
>> So the worst that would happen is that we'd supposedly peel for
>> alignment, but actually misalign everything instead, and so make
>> things slower rather than quicker.
>>
>>> Note that the assert is to prevent bogus information.  Iff we aligned
>>> DR with base alignment 8 and misalign 3 then if another same-align
>>> DR has base alignment 16 we can't simply zero its DR_MISALIGNMENT
>>> as it still can be 8 after aligning DR.
>>>
>>> So I think it's wrong to put DRs with differing base-alignment into
>>> the same-align-refs chain, those should get their DR_MISALIGNMENT
>>> updated independently after peeling.
>>
>> DR_MISALIGNMENT is relative to the vector alignment rather than
>> the base alignment though.  So:
>
> We seem to use it that way, yes (looking at set_ptr_info_alignment
> uses).  So why not fix the assert then by capping the alignment/misalignment
> we compute at this value as well?  (and document this in the header
> around DR_MISALIGNMENT)
>
> Ideally we'd do alignment analysis independent of the vector size
> though (for those stupid targets with multiple vector sizes to consider...).
>
>> a) when looking for references *A1 and *A2 with the same alignment,
>>    we simply have to prove that A1 % vecalign == A2 % vecalign.
>>    This doesn't require any knowledge about the base alignment.
>>    If we break the addresses down as:
>>
>>       A1 = BASE1 + REST1,  REST1 = INIT1 + OFFSET1 + X * STEP1
>>       A2 = BASE2 + REST2,  REST2 = INIT2 + OFFSET2 + X * STEP2
>>
>>    and can prove that BASE1 == BASE2, the alignment of that base
>>    isn't important.  We simply need to prove that REST1 % vecalign
>>    == REST2 % vecalign for all X.
>>
>> b) In the assert, we've peeled the loop so that DR_PEEL is guaranteed
>>    to be vector-aligned.  If DR_PEEL is A1 in the example above, we have
>>    A1 % vecalign == 0, so A2 % vecalign must be 0 too.  This again doesn't
>>    rely on the base alignment being known.
>>
>> What a high base alignment for DR_PEEL gives us is the ability to know
>> at compile how many iterations need to be peeled to make DR_PEEL aligned.
>> But the points above apply regardless of whether we know that value at
>> compile time or not.
>>
>> In examples like the test case, we would have known at compile time that
>> VF-1 iterations would need to be peeled if we'd picked the store as the
>> DR_PEEL, but would have treated the number of peels as variable if we'd
>> picked the load.  The value calculated at runtime would still have been
>> VF-1, it's just that the code wouldn't have been as efficient.
>>
>> One of the benefits of pooling the alignments for unconditional references
>> is that it no longer matters which DR we pick: the number of peels will
>> be a compile-time constant both ways.
>>
>> Thanks,
>> Richard
>>
>>> I'd rather not mix fixing this with the improvement to eventually use a
>>> larger align for the other DR if possible.
>
> ^^^
>
> So can you fix the ICE with capping base alignment / DR_MISALIGNMENT?

I don't think the problem is the lack of a cap.  In the test case we
see that:

1. B is known at compile time to be X * vecsize + Y when considered in
   isolation, because the base alignment derived from its DR_REF >= vecsize.
   So DR_MISALIGNMENT (B) == Y.

2. A's misalignment wrt vecsize is not known at compile time when
   considered in isolation, because no useful base alignment can be
   derived from its DR_REF.  (The DR_REF is to a plain int rather than
   to a structure with a high alignment.)  So DR_MISALIGNMENT (A) == -1.

3. A and B when considered as a pair trivially have the same misalignment
   wrt vecsize, for the reasons above.

Each of these results is individually correct.  The problem is that the
assert is conflating two things: it's saying that if we know two datarefs
have the same misalignment, we must either be able to calculate a
compile-time misalignment for both datarefs in isolation, or we must
fail to calculate a compile-time misalignment for both datarefs in
isolation.  That isn't true: it's valid to have situations in which the
compile-time misalignment is known for one dataref in isolation but not
for the other.
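
To put concrete numbers on it (assuming a 16-byte vector size purely
for the sake of the example): in pr81136.c the struct is 32-byte
aligned, "foo" starts at byte 4 and "bar" at byte 4 + 400 = 404, so:

  store a->bar[i]:  base alignment 32 known   => DR_MISALIGNMENT = 404 % 16 = 4
  load b[i]:        no usable base alignment  => DR_MISALIGNMENT = -1

And yet (404 - 4) % 16 == 0, so the two addresses provably have the
same misalignment, and the pair legitimately ends up in the
same-align-refs chain.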

Thanks,
Richard
Richard Biener June 26, 2017, 11:36 a.m. UTC | #7
On Mon, Jun 26, 2017 at 1:14 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Fri, Jun 23, 2017 at 2:05 PM, Richard Sandiford
>> <richard.sandiford@linaro.org> wrote:
>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>> On Thu, Jun 22, 2017 at 1:30 PM, Richard Sandiford
>>>> <richard.sandiford@linaro.org> wrote:
>>>>> The test case triggered this assert in vect_update_misalignment_for_peel:
>>>>>
>>>>>           gcc_assert (DR_MISALIGNMENT (dr) / dr_size ==
>>>>>                       DR_MISALIGNMENT (dr_peel) / dr_peel_size);
>>>>>
>>>>> We knew that the two DRs had the same misalignment at runtime, but when
>>>>> considered in isolation, one data reference guaranteed a higher compile-time
>>>>> base alignment than the other.
>>>>>
>>>>> In the test case this looks like a missed opportunity.  Both references
>>>>> are unconditional, so it should be possible to use the highest of the
>>>>> available base alignment guarantees when analyzing each reference.
>>>>> The patch does this.
>>>>>
>>>>> However, as the comment in the patch says, the base alignment guarantees
>>>>> provided by a conditional reference only apply if the reference occurs
>>>>> at least once.  In this case it would be legitimate for two references
>>>>> to have the same runtime misalignment and for one reference to provide a
>>>>> stronger compile-time guarantee than the other about what the misalignment
>>>>> actually is.  The patch therefore relaxes the assert to handle that case.
>>>>
>>>> Hmm, but you don't actually check whether a reference occurs only
>>>> conditionally,
>>>> do you?  You just seem to say that for masked loads/stores the reference
>>>> is conditional (I believe that's not true).  But for a loop like
>>>>
>>>>  for (;;)
>>>>    if (a[i])
>>>>      sum += b[j];
>>>>
>>>> you still assume b[j] executes unconditionally?
>>>
>>> Maybe the documentation isn't clear enough, but DR_IS_CONDITIONAL
>>> was supposed to mean "even if the containing statement executes
>>> and runs to completion, the reference might not actually occur".
>>> The example above isn't conditional in that sense because the
>>> reference to b[j] does occur if the store is reached and completes.
>>>
>>> Masked loads and stores are conditional in that sense though.
>>> The reference only occurs if the mask is nonzero; the memory
>>> isn't touched otherwise.  The functions are used to if-convert
>>> things like:
>>>
>>>    for (...)
>>>      a[i] = b[i] ? c[i] : d[i];
>>>
>>> where there's no guarantee that it's safe to access c[i] when !b[i]
>>> (or d[i] when b[i]).  No reference occurs for an all-false mask.
>>
>> But as you touch generic data-ref code here you should apply more
>> sensible semantics to DR_IS_CONDITIONAL than just marking
>> masked loads/stores but not DRs occurring inside BBs only executed
>> conditionally ...
>
> I don't see why that's more sensible though.  If a statement is only
> conditionally executed in a loop, it's up to the consumer to decide
> what to do about that.  The conditions under which the statement
> is reached are a control-flow issue and tree-data-ref.c doesn't
> have any special information about it.
>
> Masked loads and stores are special because the DR_REFs created by
> tree-data-ref.c are artificial: they didn't exist as MEM_REFs in the
> original DR_STMT.  And AIUI they didn't exist as MEM_REFs precisely
> because they're not guaranteed to happen, even if the load or store
> statement itself is executed.  So in this case the DR_IS_CONDITIONAL
> is reflecting something that tree-data-ref.c itself has done.
>
> How about calling it DR_IS_CONDITIONAL_IN_STMT to avoid the
> general-sounding name?

That sounds better and avoids the ambiguity.

>>>> The vectorizer of course only sees unconditionally executed stmts.
>>>>
>>>> So - I'd simply not add this DR_IS_CONDITIONAL.  Did you run into
>>>> any real-world (testsuite) issues without this?
>>>
>>> Dropping DR_IS_CONDITIONAL would cause us to make invalid alignment
>>> assumptions in silly corner cases.  I could add a scan test for it,
>>> for targets with masked loads and stores.  It wouldn't trigger
>>> an execution failure though because we assume that targets with
>>> masked loads and stores allow unaligned accesses:
>>>
>>>   /* For now assume all conditional loads/stores support unaligned
>>>      access without any special code.  */
>>>   if (is_gimple_call (stmt)
>>>       && gimple_call_internal_p (stmt)
>>>       && (gimple_call_internal_fn (stmt) == IFN_MASK_LOAD
>>>           || gimple_call_internal_fn (stmt) == IFN_MASK_STORE))
>>>     return dr_unaligned_supported;
>>>
>>> So the worst that would happen is that we'd supposedly peel for
>>> alignment, but actually misalign everything instead, and so make
>>> things slower rather than quicker.
>>>
>>>> Note that the assert is to prevent bogus information.  Iff we aligned
>>>> DR with base alignment 8 and misalign 3 then if another same-align
>>>> DR has base alignment 16 we can't simply zero its DR_MISALIGNMENT
>>>> as it still can be 8 after aligning DR.
>>>>
>>>> So I think it's wrong to put DRs with differing base-alignment into
>>>> the same-align-refs chain, those should get their DR_MISALIGNMENT
>>>> updated independently after peeling.
>>>
>>> DR_MISALIGNMENT is relative to the vector alignment rather than
>>> the base alignment though.  So:
>>
>> We seem to use it that way, yes (looking at set_ptr_info_alignment
>> uses).  So why not fix the assert then by capping the alignment/misalignment
>> we compute at this value as well?  (and document this in the header
>> around DR_MISALIGNMENT)
>>
>> Ideally we'd do alignment analysis independent of the vector size
>> though (for those stupid targets with multiple vector sizes to consider...).
>>
>>> a) when looking for references *A1 and *A2 with the same alignment,
>>>    we simply have to prove that A1 % vecalign == A2 % vecalign.
>>>    This doesn't require any knowledge about the base alignment.
>>>    If we break the addresses down as:
>>>
>>>       A1 = BASE1 + REST1,  REST1 = INIT1 + OFFSET1 + X * STEP1
>>>       A2 = BASE2 + REST2,  REST2 = INIT2 + OFFSET2 + X * STEP2
>>>
>>>    and can prove that BASE1 == BASE2, the alignment of that base
>>>    isn't important.  We simply need to prove that REST1 % vecalign
>>>    == REST2 % vecalign for all X.
>>>
>>> b) In the assert, we've peeled the loop so that DR_PEEL is guaranteed
>>>    to be vector-aligned.  If DR_PEEL is A1 in the example above, we have
>>>    A1 % vecalign == 0, so A2 % vecalign must be 0 too.  This again doesn't
>>>    rely on the base alignment being known.
>>>
>>> What a high base alignment for DR_PEEL gives us is the ability to know
>>> at compile how many iterations need to be peeled to make DR_PEEL aligned.
>>> But the points above apply regardless of whether we know that value at
>>> compile time or not.
>>>
>>> In examples like the test case, we would have known at compile time that
>>> VF-1 iterations would need to be peeled if we'd picked the store as the
>>> DR_PEEL, but would have treated the number of peels as variable if we'd
>>> picked the load.  The value calculated at runtime would still have been
>>> VF-1, it's just that the code wouldn't have been as efficient.
>>>
>>> One of the benefits of pooling the alignments for unconditional references
>>> is that it no longer matters which DR we pick: the number of peels will
>>> be a compile-time constant both ways.
>>>
>>> Thanks,
>>> Richard
>>>
>>>> I'd rather not mix fixing this with the improvement to eventually use a
>>>> larger align for the other DR if possible.
>>
>> ^^^
>>
>> So can you fix the ICE with capping base alignment / DR_MISALIGNMENT?
>
> I don't think the problem is the lack of a cap.  In the test case we
> see that:
>
> 1. B is known at compile time to be X * vecsize + Y when considered in
>    isolation, because the base alignment derived from its DR_REF >= vecsize.
>    So DR_MISALIGNMENT (B) == Y.
>
> 2. A's misalignment wrt vecsize is not known at compile time when
>    considered in isolation, because no useful base alignment can be
>    derived from its DR_REF.  (The DR_REF is to a plain int rather than
>    to a structure with a high alignment.)  So DR_MISALIGNMENT (A) == -1.
>
> 3. A and B when considered as a pair trivially have the same misalignment
>    wrt vecsize, for the reasons above.
>
> Each of these results is individually correct.  The problem is that the
> assert is conflating two things: it's saying that if we know two datarefs
> have the same misaligment, we must either be able to calculate a
> compile-time misalignment for both datarefs in isolation, or we must
> fail to calculate a compile-time misalignment for both datarefs in
> isolation.  That isn't true: it's valid to have situations in which the
> compile-time misalignment is known for one dataref in isolation but not
> for the other.

True.  So the assert should then become

      gcc_assert (! known_alignment_for_access_p (dr)
                  || DR_MISALIGNMENT (dr) / dr_size ==
                    DR_MISALIGNMENT (dr_peel) / dr_peel_size);

?
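
(known_alignment_for_access_p being the usual "is there a compile-time
value" test, roughly as defined in tree-vectorizer.h:

  static inline bool
  known_alignment_for_access_p (struct data_reference *data_ref_info)
  {
    return (DR_MISALIGNMENT (data_ref_info) != -1);
  }

so the equality would only be enforced when DR's misalignment is
actually known at compile time.)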

Richard.


>
> Thanks,
> Richard
>
Richard Sandiford June 26, 2017, 11:50 a.m. UTC | #8
Richard Biener <richard.guenther@gmail.com> writes:
> On Mon, Jun 26, 2017 at 1:14 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Fri, Jun 23, 2017 at 2:05 PM, Richard Sandiford
>>> <richard.sandiford@linaro.org> wrote:
>>>> Richard Biener <richard.guenther@gmail.com> writes:
>>>>> On Thu, Jun 22, 2017 at 1:30 PM, Richard Sandiford
>>>>> <richard.sandiford@linaro.org> wrote:
>>>>>> The test case triggered this assert in vect_update_misalignment_for_peel:
>>>>>>
>>>>>>           gcc_assert (DR_MISALIGNMENT (dr) / dr_size ==
>>>>>>                       DR_MISALIGNMENT (dr_peel) / dr_peel_size);
>>>>>>
>>>>>> We knew that the two DRs had the same misalignment at runtime, but when
>>>>>> considered in isolation, one data reference guaranteed a higher compile-time
>>>>>> base alignment than the other.
>>>>>>
>>>>>> In the test case this looks like a missed opportunity.  Both references
>>>>>> are unconditional, so it should be possible to use the highest of the
>>>>>> available base alignment guarantees when analyzing each reference.
>>>>>> The patch does this.
>>>>>>
>>>>>> However, as the comment in the patch says, the base alignment guarantees
>>>>>> provided by a conditional reference only apply if the reference occurs
>>>>>> at least once.  In this case it would be legitimate for two references
>>>>>> to have the same runtime misalignment and for one reference to provide a
>>>>>> stronger compile-time guarantee than the other about what the misalignment
>>>>>> actually is.  The patch therefore relaxes the assert to handle that case.
>>>>>
>>>>> Hmm, but you don't actually check whether a reference occurs only
>>>>> conditionally,
>>>>> do you?  You just seem to say that for masked loads/stores the reference
>>>>> is conditional (I believe that's not true).  But for a loop like
>>>>>
>>>>>  for (;;)
>>>>>    if (a[i])
>>>>>      sum += b[j];
>>>>>
>>>>> you still assume b[j] executes unconditionally?
>>>>
>>>> Maybe the documentation isn't clear enough, but DR_IS_CONDITIONAL
>>>> was supposed to mean "even if the containing statement executes
>>>> and runs to completion, the reference might not actually occur".
>>>> The example above isn't conditional in that sense because the
>>>> reference to b[j] does occur if the store is reached and completes.
>>>>
>>>> Masked loads and stores are conditional in that sense though.
>>>> The reference only occurs if the mask is nonzero; the memory
>>>> isn't touched otherwise.  The functions are used to if-convert
>>>> things like:
>>>>
>>>>    for (...)
>>>>      a[i] = b[i] ? c[i] : d[i];
>>>>
>>>> where there's no guarantee that it's safe to access c[i] when !b[i]
>>>> (or d[i] when b[i]).  No reference occurs for an all-false mask.
>>>
>>> But as you touch generic data-ref code here you should apply more
>>> sensible semantics to DR_IS_CONDITIONAL than just marking
>>> masked loads/stores but not DRs occurring inside BBs only executed
>>> conditionally ...
>>
>> I don't see why that's more sensible though.  If a statement is only
>> conditionally executed in a loop, it's up to the consumer to decide
>> what to do about that.  The conditions under which the statement
>> is reached are a control-flow issue and tree-data-ref.c doesn't
>> have any special information about it.
>>
>> Masked loads and stores are special because the DR_REFs created by
>> tree-data-ref.c are artificial: they didn't exist as MEM_REFs in the
>> original DR_STMT.  And AIUI they didn't exist as MEM_REFs precisely
>> because they're not guaranteed to happen, even if the load or store
>> statement itself is executed.  So in this case the DR_IS_CONDITIONAL
>> is reflecting something that tree-data-ref.c itself has done.
>>
>> How about calling it DR_IS_CONDITIONAL_IN_STMT to avoid the
>> general-sounding name?
>
> That sounds better and avoids the ambiguity.

OK.

>>>>> The vectorizer of course only sees unconditionally executed stmts.
>>>>>
>>>>> So - I'd simply not add this DR_IS_CONDITIONAL.  Did you run into
>>>>> any real-world (testsuite) issues without this?
>>>>
>>>> Dropping DR_IS_CONDITIONAL would cause us to make invalid alignment
>>>> assumptions in silly corner cases.  I could add a scan test for it,
>>>> for targets with masked loads and stores.  It wouldn't trigger
>>>> an execution failure though because we assume that targets with
>>>> masked loads and stores allow unaligned accesses:
>>>>
>>>>   /* For now assume all conditional loads/stores support unaligned
>>>>      access without any special code.  */
>>>>   if (is_gimple_call (stmt)
>>>>       && gimple_call_internal_p (stmt)
>>>>       && (gimple_call_internal_fn (stmt) == IFN_MASK_LOAD
>>>>           || gimple_call_internal_fn (stmt) == IFN_MASK_STORE))
>>>>     return dr_unaligned_supported;
>>>>
>>>> So the worst that would happen is that we'd supposedly peel for
>>>> alignment, but actually misalign everything instead, and so make
>>>> things slower rather than quicker.
>>>>
>>>>> Note that the assert is there to prevent bogus information.  If we
>>>>> align DR with base alignment 8 and misalignment 3, then if another
>>>>> same-align DR has base alignment 16 we can't simply zero its
>>>>> DR_MISALIGNMENT, as it can still be 8 after aligning DR.
>>>>>
>>>>> So I think it's wrong to put DRs with differing base alignments into
>>>>> the same-align-refs chain; those should get their DR_MISALIGNMENT
>>>>> updated independently after peeling.
>>>>
>>>> DR_MISALIGNMENT is relative to the vector alignment rather than
>>>> the base alignment though.  So:
>>>
>>> We seem to use it that way, yes (looking at set_ptr_info_alignment
>>> uses).  So why not fix the assert then by capping the alignment/misalignment
>>> we compute at this value as well?  (and document this in the header
>>> around DR_MISALIGNMENT)
>>>
>>> Ideally we'd do alignment analysis independent of the vector size
>>> though (for those stupid targets with multiple vector sizes to consider...).
>>>
>>>> a) when looking for references *A1 and *A2 with the same alignment,
>>>>    we simply have to prove that A1 % vecalign == A2 % vecalign.
>>>>    This doesn't require any knowledge about the base alignment.
>>>>    If we break the addresses down as:
>>>>
>>>>       A1 = BASE1 + REST1,  REST1 = INIT1 + OFFSET1 + X * STEP1
>>>>       A2 = BASE2 + REST2,  REST2 = INIT2 + OFFSET2 + X * STEP2
>>>>
>>>>    and can prove that BASE1 == BASE2, the alignment of that base
>>>>    isn't important.  We simply need to prove that REST1 % vecalign
>>>>    == REST2 % vecalign for all X.
>>>>
>>>> b) In the assert, we've peeled the loop so that DR_PEEL is guaranteed
>>>>    to be vector-aligned.  If DR_PEEL is A1 in the example above, we have
>>>>    A1 % vecalign == 0, so A2 % vecalign must be 0 too.  This again doesn't
>>>>    rely on the base alignment being known.
>>>>
>>>> What a high base alignment for DR_PEEL gives us is the ability to know
>>>> at compile time how many iterations need to be peeled to make DR_PEEL
>>>> aligned.
>>>> But the points above apply regardless of whether we know that value at
>>>> compile time or not.
>>>>
>>>> In examples like the test case, we would have known at compile time that
>>>> VF-1 iterations would need to be peeled if we'd picked the store as the
>>>> DR_PEEL, but would have treated the number of peels as variable if we'd
>>>> picked the load.  The value calculated at runtime would still have been
>>>> VF-1, it's just that the code wouldn't have been as efficient.
>>>>
>>>> One of the benefits of pooling the alignments for unconditional references
>>>> is that it no longer matters which DR we pick: the number of peels will
>>>> be a compile-time constant both ways.
>>>>
>>>> Thanks,
>>>> Richard
>>>>
>>>>> I'd rather not mix fixing this with the improvement to eventually use a
>>>>> larger align for the other DR if possible.
>>>
>>> ^^^
>>>
>>> So can you fix the ICE with capping base alignment / DR_MISALIGNMENT?
>>
>> I don't think the problem is the lack of a cap.  In the test case we
>> see that:
>>
>> 1. B is known at compile time to be X * vecsize + Y when considered in
>>    isolation, because the base alignment derived from its DR_REF >= vecsize.
>>    So DR_MISALIGNMENT (B) == Y.
>>
>> 2. A's misalignment wrt vecsize is not known at compile time when
>>    considered in isolation, because no useful base alignment can be
>>    derived from its DR_REF.  (The DR_REF is to a plain int rather than
>>    to a structure with a high alignment.)  So DR_MISALIGNMENT (A) == -1.
>>
>> 3. A and B when considered as a pair trivially have the same misalignment
>>    wrt vecsize, for the reasons above.
>>
>> Each of these results is individually correct.  The problem is that the
>> assert is conflating two things: it's saying that if we know two datarefs
>> have the same misalignment, we must either be able to calculate a
>> compile-time misalignment for both datarefs in isolation, or we must
>> fail to calculate a compile-time misalignment for both datarefs in
>> isolation.  That isn't true: it's valid to have situations in which the
>> compile-time misalignment is known for one dataref in isolation but not
>> for the other.
>
> True.  So the assert should then become
>
>       gcc_assert (! known_alignment_for_access_p (dr)
>                   || DR_MISALIGNMENT (dr) / dr_size ==
>                     DR_MISALIGNMENT (dr_peel) / dr_peel_size);
>
> ?

I think it would need to be:

      gcc_assert (!known_alignment_for_access_p (dr)
		  || !known_alignment_for_access_p (dr_peel)
		  || (DR_MISALIGNMENT (dr) / dr_size
		      == DR_MISALIGNMENT (dr_peel) / dr_peel_size));

But yeah, that would work too.  The idea with the assert in the patch was
that for unconditional references we probably *do* want to try to compute
the same compile-time misalignment, but for optimisation reasons rather
than correctness.  Maybe that's more properly a gcc_checking_assert
though, since nothing goes wrong if it fails.  So perhaps:

     gcc_checking_assert (DR_IS_CONDITIONAL_IN_STMT (dr)
			  || DR_IS_CONDITIONAL_IN_STMT (dr_peel)
			  || (known_alignment_for_access_p (dr)
			      == known_alignment_for_access_p (dr_peel)));

as a follow-on assert.
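
To make the "conditional" case concrete: the situation the relaxed
assert has to allow is the one described in the comment the patch adds
to vect_update_misalignment_for_peel, reduced here to a compilable
sketch (hypothetical code, not the PR testcase):

  struct s __attribute__((aligned (32)))
  {
    int misaligner;
    int array[100];
  };

  void
  f (int *ptr, char *c, int n)
  {
    for (int i = 0; i < n; i++)
      /* The read (if it occurs) implies that ptr - 1 points to a
         struct s and so has a 32-byte base alignment; the
         unconditional write to ptr[i] guarantees no such thing.  */
      ptr[i] = c[i] ? ((struct s *) (ptr - 1))->array[i] : 0;
  }

Here known_alignment_for_access_p could legitimately be true for the
(conditional) read but false for the write, even though the two
addresses trivially have the same runtime misalignment.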

Should I split it into two patches, one to change the gcc_assert and
another to add the optimisation?
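
For reference, the optimisation half boils down to keeping, for each
base address, the maximum alignment guaranteed by any unconditional
reference.  A minimal standalone sketch of the idea -- made-up names,
no GCC internals:

  #include <assert.h>
  #include <stddef.h>
  #include <string.h>

  /* Toy pool mapping base addresses (strings here, trees in the patch)
     to the largest alignment any unconditional reference guarantees.  */
  struct base_alignment { const char *base; unsigned int align; };
  static struct base_alignment pool[16];
  static size_t n_pool;

  static void
  record_base_alignment (const char *base, unsigned int align)
  {
    for (size_t i = 0; i < n_pool; i++)
      if (strcmp (pool[i].base, base) == 0)
        {
          if (pool[i].align < align)
            pool[i].align = align;
          return;
        }
    pool[n_pool].base = base;
    pool[n_pool].align = align;
    n_pool++;
  }

  int
  main (void)
  {
    /* A plain int access through base "a" only guarantees 4 bytes, but
       a same-base access to a 32-byte-aligned struct member raises the
       pooled guarantee to 32, which both references can then use when
       computing their compile-time misalignment.  */
    record_base_alignment ("a", 4);
    record_base_alignment ("a", 32);
    assert (pool[0].align == 32);
    return 0;
  }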

Thanks,
Richard
Richard Biener June 27, 2017, 8:41 a.m. UTC | #9
On Mon, Jun 26, 2017 at 1:50 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> True.  So the assert should then become
>>
>>       gcc_assert (! known_alignment_for_access_p (dr)
>>                   || DR_MISALIGNMENT (dr) / dr_size ==
>>                     DR_MISALIGNMENT (dr_peel) / dr_peel_size);
>>
>> ?
>
> I think it would need to be:
>
>       gcc_assert (!known_alignment_for_access_p (dr)
>                   || !known_alignment_for_access_p (dr_peel)
>                   || (DR_MISALIGNMENT (dr) / dr_size
>                       == DR_MISALIGNMENT (dr_peel) / dr_peel_size));

I think for !known_alignment_for_access_p (dr_peel) the assert doesn't make
any sense (DR_MISALIGNMENT is -1), so yes, you are right.

> But yeah, that would work too.  The idea with the assert in the patch was
> that for unconditional references we probably *do* want to try to compute
> the same compile-time misalignment, but for optimisation reasons rather
> than correctness.  Maybe that's more properly a gcc_checking_assert
> though, since nothing goes wrong if it fails.  So perhaps:

We shouldn't have asserts for optimization reasons, even with checking
IMHO.

>
>      gcc_checking_assert (DR_IS_CONDITIONAL_IN_STMT (dr)
>                           || DR_IS_CONDITIONAL_IN_STMT (dr_peel)
>                           || (known_alignment_for_access_p (dr)
>                               == known_alignment_for_access_p (dr_peel)));
>
> as a follow-on assert.
>
> Should I split it into two patches, one to change the gcc_assert and
> another to add the optimisation?

Yes please.

Thanks,
Richard.

Patch

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2017-06-08 08:51:43.347264181 +0100
+++ gcc/tree-vectorizer.h	2017-06-22 12:23:21.288421018 +0100
@@ -22,6 +22,7 @@  Software Foundation; either version 3, o
 #define GCC_TREE_VECTORIZER_H
 
 #include "tree-data-ref.h"
+#include "tree-hash-traits.h"
 #include "target.h"
 
 /* Used for naming of new temporaries.  */
@@ -84,6 +85,10 @@  struct stmt_info_for_cost {
 
 typedef vec<stmt_info_for_cost> stmt_vector_for_cost;
 
+/* Maps base addresses to the largest alignment that we've been able
+   to calculate for them.  */
+typedef hash_map<tree_operand_hash, unsigned int> vec_base_alignments;
+
 /************************************************************************
   SLP
  ************************************************************************/
@@ -156,6 +161,10 @@  struct vec_info {
   /* All data references.  */
   vec<data_reference_p> datarefs;
 
+  /* Maps the base addresses of all data references in DATAREFS to the
+     largest alignment that we've been able to calculate for them.  */
+  vec_base_alignments base_alignments;
+
   /* All data dependences.  */
   vec<ddr_p> ddrs;
 
@@ -1117,6 +1126,7 @@  extern bool vect_prune_runtime_alias_tes
 extern bool vect_check_gather_scatter (gimple *, loop_vec_info,
 				       gather_scatter_info *);
 extern bool vect_analyze_data_refs (vec_info *, int *);
+extern void vect_compute_base_alignments (vec_info *);
 extern tree vect_create_data_ref_ptr (gimple *, tree, struct loop *, tree,
 				      tree *, gimple_stmt_iterator *,
 				      gimple **, bool, bool *,
Index: gcc/tree-data-ref.h
===================================================================
--- gcc/tree-data-ref.h	2017-06-08 08:51:43.349263895 +0100
+++ gcc/tree-data-ref.h	2017-06-22 12:23:21.285421180 +0100
@@ -119,6 +119,10 @@  struct data_reference
   /* True when the data reference is in RHS of a stmt.  */
   bool is_read;
 
+  /* True when the data reference is conditional, i.e. if it might not
+     occur even when the statement runs to completion.  */
+  bool is_conditional;
+
   /* Behavior of the memory reference in the innermost loop.  */
   struct innermost_loop_behavior innermost;
 
@@ -138,6 +142,7 @@  #define DR_ACCESS_FN(DR, I)        DR_AC
 #define DR_NUM_DIMENSIONS(DR)      DR_ACCESS_FNS (DR).length ()
 #define DR_IS_READ(DR)             (DR)->is_read
 #define DR_IS_WRITE(DR)            (!DR_IS_READ (DR))
+#define DR_IS_CONDITIONAL(DR)      (DR)->is_conditional
 #define DR_BASE_ADDRESS(DR)        (DR)->innermost.base_address
 #define DR_OFFSET(DR)              (DR)->innermost.offset
 #define DR_INIT(DR)                (DR)->innermost.init
@@ -350,7 +355,8 @@  extern bool graphite_find_data_reference
 						   vec<data_reference_p> *);
 tree find_data_references_in_loop (struct loop *, vec<data_reference_p> *);
 bool loop_nest_has_data_refs (loop_p loop);
-struct data_reference *create_data_ref (loop_p, loop_p, tree, gimple *, bool);
+struct data_reference *create_data_ref (loop_p, loop_p, tree, gimple *, bool,
+					bool);
 extern bool find_loop_nest (struct loop *, vec<loop_p> *);
 extern struct data_dependence_relation *initialize_data_dependence_relation
      (struct data_reference *, struct data_reference *, vec<loop_p>);
Index: gcc/tree-data-ref.c
===================================================================
--- gcc/tree-data-ref.c	2017-06-08 08:51:43.349263895 +0100
+++ gcc/tree-data-ref.c	2017-06-22 12:23:21.284421233 +0100
@@ -1053,15 +1053,18 @@  free_data_ref (data_reference_p dr)
   free (dr);
 }
 
-/* Analyzes memory reference MEMREF accessed in STMT.  The reference
-   is read if IS_READ is true, write otherwise.  Returns the
-   data_reference description of MEMREF.  NEST is the outermost loop
-   in which the reference should be instantiated, LOOP is the loop in
-   which the data reference should be analyzed.  */
+/* Analyze memory reference MEMREF, which is accessed in STMT.  The reference
+   is a read if IS_READ is true, otherwise it is a write.  IS_CONDITIONAL
+   indicates that the reference is conditional, i.e. that it might not
+   occur every time that STMT runs to completion.
+
+   Return the data_reference description of MEMREF.  NEST is the outermost
+   loop in which the reference should be instantiated, LOOP is the loop
+   in which the data reference should be analyzed.  */
 
 struct data_reference *
 create_data_ref (loop_p nest, loop_p loop, tree memref, gimple *stmt,
-		 bool is_read)
+		 bool is_read, bool is_conditional)
 {
   struct data_reference *dr;
 
@@ -1076,6 +1079,7 @@  create_data_ref (loop_p nest, loop_p loo
   DR_STMT (dr) = stmt;
   DR_REF (dr) = memref;
   DR_IS_READ (dr) = is_read;
+  DR_IS_CONDITIONAL (dr) = is_conditional;
 
   dr_analyze_innermost (dr, nest);
   dr_analyze_indices (dr, nest, loop);
@@ -4446,6 +4450,10 @@  struct data_ref_loc
 
   /* True if the memory reference is read.  */
   bool is_read;
+
+  /* True if the data reference is conditional, i.e. if it might not
+     occur even when the statement runs to completion.  */
+  bool is_conditional;
 };
 
 
@@ -4512,6 +4520,7 @@  get_references_in_stmt (gimple *stmt, ve
 	{
 	  ref.ref = op1;
 	  ref.is_read = true;
+	  ref.is_conditional = false;
 	  references->safe_push (ref);
 	}
     }
@@ -4539,6 +4548,7 @@  get_references_in_stmt (gimple *stmt, ve
 	      type = TREE_TYPE (gimple_call_arg (stmt, 3));
 	    if (TYPE_ALIGN (type) != align)
 	      type = build_aligned_type (type, align);
+	    ref.is_conditional = true;
 	    ref.ref = fold_build2 (MEM_REF, type, gimple_call_arg (stmt, 0),
 				   ptr);
 	    references->safe_push (ref);
@@ -4558,6 +4568,7 @@  get_references_in_stmt (gimple *stmt, ve
 	    {
 	      ref.ref = op1;
 	      ref.is_read = true;
+	      ref.is_conditional = false;
 	      references->safe_push (ref);
 	    }
 	}
@@ -4571,6 +4582,7 @@  get_references_in_stmt (gimple *stmt, ve
     {
       ref.ref = op0;
       ref.is_read = false;
+      ref.is_conditional = false;
       references->safe_push (ref);
     }
   return clobbers_memory;
@@ -4635,8 +4647,8 @@  find_data_references_in_stmt (struct loo
 
   FOR_EACH_VEC_ELT (references, i, ref)
     {
-      dr = create_data_ref (nest, loop_containing_stmt (stmt),
-			    ref->ref, stmt, ref->is_read);
+      dr = create_data_ref (nest, loop_containing_stmt (stmt), ref->ref,
+			    stmt, ref->is_read, ref->is_conditional);
       gcc_assert (dr != NULL);
       datarefs->safe_push (dr);
     }
@@ -4665,7 +4677,8 @@  graphite_find_data_references_in_stmt (l
 
   FOR_EACH_VEC_ELT (references, i, ref)
     {
-      dr = create_data_ref (nest, loop, ref->ref, stmt, ref->is_read);
+      dr = create_data_ref (nest, loop, ref->ref, stmt, ref->is_read,
+			    ref->is_conditional);
       gcc_assert (dr != NULL);
       datarefs->safe_push (dr);
     }
Index: gcc/tree-ssa-loop-prefetch.c
===================================================================
--- gcc/tree-ssa-loop-prefetch.c	2017-06-07 21:58:55.928557601 +0100
+++ gcc/tree-ssa-loop-prefetch.c	2017-06-22 12:23:21.285421180 +0100
@@ -1633,7 +1633,7 @@  determine_loop_nest_reuse (struct loop *
     for (ref = gr->refs; ref; ref = ref->next)
       {
 	dr = create_data_ref (nest, loop_containing_stmt (ref->stmt),
-			      ref->mem, ref->stmt, !ref->write_p);
+			      ref->mem, ref->stmt, !ref->write_p, false);
 
 	if (dr)
 	  {
Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c	2017-06-08 08:51:43.350263752 +0100
+++ gcc/tree-vect-data-refs.c	2017-06-22 12:23:21.286421126 +0100
@@ -646,6 +646,102 @@  vect_slp_analyze_instance_dependence (sl
   return res;
 }
 
+/* If DR is nested in a loop that is being vectorized, return the base
+   address in the context of the vectorized loop (rather than the
+   nested loop).  Otherwise return the base address in the context
+   of the containing statement.  */
+
+static tree
+vect_get_base_address (data_reference *dr)
+{
+  gimple *stmt = DR_STMT (dr);
+  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+  struct loop *loop = loop_vinfo != NULL ? LOOP_VINFO_LOOP (loop_vinfo) : NULL;
+  if (loop && nested_in_vect_loop_p (loop, stmt))
+    return STMT_VINFO_DR_BASE_ADDRESS (stmt_info);
+  else
+    return DR_BASE_ADDRESS (dr);
+}
+
+/* Compute and return the alignment of base address BASE_ADDR in DR.  */
+
+static unsigned int
+vect_compute_base_alignment (data_reference *dr, tree base_addr)
+{
+  /* To look at the alignment of the base we have to preserve an inner
+     MEM_REF as that carries the alignment information of the actual
+     access.  */
+  tree base = DR_REF (dr);
+  while (handled_component_p (base))
+    base = TREE_OPERAND (base, 0);
+  unsigned int base_alignment = 0;
+  unsigned HOST_WIDE_INT base_bitpos;
+  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
+
+  /* As data-ref analysis strips the MEM_REF down to its base operand
+     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
+     adjust things to make base_alignment valid as the alignment of
+     DR_BASE_ADDRESS.  */
+  if (TREE_CODE (base) == MEM_REF)
+    {
+      /* Note all this only works if DR_BASE_ADDRESS is the same as
+	 MEM_REF operand zero, otherwise DR/SCEV analysis might have factored
+	 in other offsets.  We need to rework DR to compute the alignment
+	 of DR_BASE_ADDRESS as long as all information is still available.  */
+      if (operand_equal_p (TREE_OPERAND (base, 0), base_addr, 0))
+	{
+	  base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
+	  base_bitpos &= (base_alignment - 1);
+	}
+      else
+	base_bitpos = BITS_PER_UNIT;
+    }
+  if (base_bitpos != 0)
+    base_alignment = base_bitpos & -base_bitpos;
+
+  /* Also look at the alignment of the base address DR analysis
+     computed.  */
+  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
+  if (base_addr_alignment > base_alignment)
+    base_alignment = base_addr_alignment;
+
+  return base_alignment;
+}
+
+/* Compute alignments for the base addresses of all datarefs in VINFO.  */
+
+void
+vect_compute_base_alignments (vec_info *vinfo)
+{
+  /* If the region we're going to vectorize is reached, all unconditional
+     data references occur at least once.  We can therefore pool the base
+     alignment guarantees from each unconditional reference.  */
+  data_reference *dr;
+  unsigned int i;
+  FOR_EACH_VEC_ELT (vinfo->datarefs, i, dr)
+    if (!DR_IS_CONDITIONAL (dr))
+      {
+	tree base_addr = vect_get_base_address (dr);
+	unsigned int alignment = vect_compute_base_alignment (dr, base_addr);
+	bool existed;
+	unsigned int &entry
+	  = vinfo->base_alignments.get_or_insert (base_addr, &existed);
+	if (!existed || entry < alignment)
+	  {
+	    entry = alignment;
+	    if (dump_enabled_p ())
+	      {
+		dump_printf_loc (MSG_NOTE, vect_location,
+				 "setting base alignment for ");
+		dump_generic_expr (MSG_NOTE, TDF_SLIM, base_addr);
+		dump_printf (MSG_NOTE, " to %d, based on ", alignment);
+		dump_gimple_stmt (MSG_NOTE, TDF_SLIM, DR_STMT (dr), 0);
+	      }
+	  }
+      }
+}
+
 /* Function vect_compute_data_ref_alignment
 
    Compute the misalignment of the data reference DR.
@@ -663,6 +759,7 @@  vect_compute_data_ref_alignment (struct
 {
   gimple *stmt = DR_STMT (dr);
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+  vec_base_alignments *base_alignments = &stmt_info->vinfo->base_alignments;
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
   struct loop *loop = NULL;
   tree ref = DR_REF (dr);
@@ -699,6 +796,8 @@  vect_compute_data_ref_alignment (struct
     {
       tree step = DR_STEP (dr);
 
+      base_addr = STMT_VINFO_DR_BASE_ADDRESS (stmt_info);
+      aligned_to = STMT_VINFO_DR_ALIGNED_TO (stmt_info);
       if (tree_fits_shwi_p (step)
 	  && tree_to_shwi (step) % GET_MODE_SIZE (TYPE_MODE (vectype)) == 0)
         {
@@ -706,8 +805,6 @@  vect_compute_data_ref_alignment (struct
             dump_printf_loc (MSG_NOTE, vect_location,
                              "inner step divides the vector-size.\n");
 	  misalign = STMT_VINFO_DR_INIT (stmt_info);
-	  aligned_to = STMT_VINFO_DR_ALIGNED_TO (stmt_info);
-	  base_addr = STMT_VINFO_DR_BASE_ADDRESS (stmt_info);
         }
       else
 	{
@@ -738,39 +835,15 @@  vect_compute_data_ref_alignment (struct
 	}
     }
 
-  /* To look at alignment of the base we have to preserve an inner MEM_REF
-     as that carries alignment information of the actual access.  */
-  base = ref;
-  while (handled_component_p (base))
-    base = TREE_OPERAND (base, 0);
+  /* Calculate the maximum of the pooled base address alignment and the
+     alignment that we can compute for DR itself.  The latter should
+     already be included in the former for unconditional references.  */
   unsigned int base_alignment = 0;
-  unsigned HOST_WIDE_INT base_bitpos;
-  get_object_alignment_1 (base, &base_alignment, &base_bitpos);
-  /* As data-ref analysis strips the MEM_REF down to its base operand
-     to form DR_BASE_ADDRESS and adds the offset to DR_INIT we have to
-     adjust things to make base_alignment valid as the alignment of
-     DR_BASE_ADDRESS.  */
-  if (TREE_CODE (base) == MEM_REF)
-    {
-      /* Note all this only works if DR_BASE_ADDRESS is the same as
-	 MEM_REF operand zero, otherwise DR/SCEV analysis might have factored
-	 in other offsets.  We need to rework DR to compute the alingment
-	 of DR_BASE_ADDRESS as long as all information is still available.  */
-      if (operand_equal_p (TREE_OPERAND (base, 0), base_addr, 0))
-	{
-	  base_bitpos -= mem_ref_offset (base).to_short_addr () * BITS_PER_UNIT;
-	  base_bitpos &= (base_alignment - 1);
-	}
-      else
-	base_bitpos = BITS_PER_UNIT;
-    }
-  if (base_bitpos != 0)
-    base_alignment = base_bitpos & -base_bitpos;
-  /* Also look at the alignment of the base address DR analysis
-     computed.  */
-  unsigned int base_addr_alignment = get_pointer_alignment (base_addr);
-  if (base_addr_alignment > base_alignment)
-    base_alignment = base_addr_alignment;
+  if (DR_IS_CONDITIONAL (dr))
+    base_alignment = vect_compute_base_alignment (dr, base_addr);
+  if (unsigned int *entry = base_alignments->get (base_addr))
+    base_alignment = MAX (base_alignment, *entry);
+  gcc_assert (base_alignment != 0);
 
   if (base_alignment >= TYPE_ALIGN (TREE_TYPE (vectype)))
     DR_VECT_AUX (dr)->base_element_aligned = true;
@@ -906,8 +979,29 @@  vect_update_misalignment_for_peel (struc
     {
       if (current_dr != dr)
         continue;
-      gcc_assert (DR_MISALIGNMENT (dr) / dr_size ==
-                  DR_MISALIGNMENT (dr_peel) / dr_peel_size);
+      /* Any alignment guarantees provided by a reference only apply if
+	 the reference actually occurs.  For example, in:
+
+	    struct s __attribute__((aligned(32))) {
+	      int misaligner;
+	      int array[N];
+	    };
+
+	    int *ptr;
+	    for (int i = 0; i < n; ++i)
+	      ptr[i] = c[i] ? ((struct s *) (ptr - 1))->array[i] : 0;
+
+	 we can only assume that ptr is part of a struct s if at least one
+	 c[i] is true.  This in turn means that we have a higher base
+	 alignment guarantee for the read from ptr (if it occurs) than for
+	 the write to ptr, and we cannot unconditionally carry the former
+	 over to the latter.  We still know that the two address values
+	 have the same misalignment, so if peeling has forced one of them
+	 to be aligned, the other must be too.  */
+      gcc_assert (DR_IS_CONDITIONAL (dr_peel)
+		  || DR_IS_CONDITIONAL (dr)
+		  || (DR_MISALIGNMENT (dr) / dr_size
+		      == DR_MISALIGNMENT (dr_peel) / dr_peel_size));
       SET_DR_MISALIGNMENT (dr, 0);
       return;
     }
@@ -2117,8 +2211,7 @@  vect_find_same_alignment_drs (struct dat
   if (dra == drb)
     return;
 
-  if (!operand_equal_p (DR_BASE_OBJECT (dra), DR_BASE_OBJECT (drb),
-			OEP_ADDRESS_OF)
+  if (!operand_equal_p (DR_BASE_ADDRESS (dra), DR_BASE_ADDRESS (drb), 0)
       || !operand_equal_p (DR_OFFSET (dra), DR_OFFSET (drb), 0)
       || !operand_equal_p (DR_STEP (dra), DR_STEP (drb), 0))
     return;
@@ -2176,6 +2269,7 @@  vect_analyze_data_refs_alignment (loop_v
   vec<data_reference_p> datarefs = vinfo->datarefs;
   struct data_reference *dr;
 
+  vect_compute_base_alignments (vinfo);
   FOR_EACH_VEC_ELT (datarefs, i, dr)
     {
       stmt_vec_info stmt_info = vinfo_for_stmt (DR_STMT (dr));
@@ -3374,7 +3468,8 @@  vect_analyze_data_refs (vec_info *vinfo,
 	    {
 	      struct data_reference *newdr
 		= create_data_ref (NULL, loop_containing_stmt (stmt),
-				   DR_REF (dr), stmt, maybe_scatter ? false : true);
+				   DR_REF (dr), stmt, !maybe_scatter,
+				   DR_IS_CONDITIONAL (dr));
 	      gcc_assert (newdr != NULL && DR_REF (newdr));
 	      if (DR_BASE_ADDRESS (newdr)
 		  && DR_OFFSET (newdr)
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2017-06-07 21:58:56.336475882 +0100
+++ gcc/tree-vect-slp.c	2017-06-22 12:23:21.288421018 +0100
@@ -2367,6 +2367,7 @@  new_bb_vec_info (gimple_stmt_iterator re
   gimple_stmt_iterator gsi;
 
   res = (bb_vec_info) xcalloc (1, sizeof (struct _bb_vec_info));
+  new (&res->base_alignments) vec_base_alignments ();
   res->kind = vec_info::bb;
   BB_VINFO_BB (res) = bb;
   res->region_begin = region_begin;
@@ -2741,6 +2742,8 @@  vect_slp_analyze_bb_1 (gimple_stmt_itera
       return NULL;
     }
 
+  vect_compute_base_alignments (bb_vinfo);
+
   /* Analyze and verify the alignment of data references and the
      dependence in the SLP instances.  */
   for (i = 0; BB_VINFO_SLP_INSTANCES (bb_vinfo).iterate (i, &instance); )
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2017-06-22 12:22:57.734313143 +0100
+++ gcc/tree-vect-loop.c	2017-06-22 12:23:21.287421072 +0100
@@ -1157,6 +1157,7 @@  new_loop_vec_info (struct loop *loop)
   LOOP_VINFO_VECT_FACTOR (res) = 0;
   LOOP_VINFO_LOOP_NEST (res) = vNULL;
   LOOP_VINFO_DATAREFS (res) = vNULL;
+  new (&res->base_alignments) vec_base_alignments ();
   LOOP_VINFO_DDRS (res) = vNULL;
   LOOP_VINFO_UNALIGNED_DR (res) = NULL;
   LOOP_VINFO_MAY_MISALIGN_STMTS (res) = vNULL;
Index: gcc/tree-vectorizer.c
===================================================================
--- gcc/tree-vectorizer.c	2017-06-22 12:22:57.732313220 +0100
+++ gcc/tree-vectorizer.c	2017-06-22 12:23:21.288421018 +0100
@@ -370,6 +370,8 @@  vect_destroy_datarefs (vec_info *vinfo)
       }
 
   free_data_refs (vinfo->datarefs);
+
+  vinfo->base_alignments.~vec_base_alignments ();
 }
 
 /* A helper function to free scev and LOOP niter information, as well as
Index: gcc/testsuite/gcc.dg/vect/pr81136.c
===================================================================
--- /dev/null	2017-06-22 07:43:14.805493307 +0100
+++ gcc/testsuite/gcc.dg/vect/pr81136.c	2017-06-22 12:23:21.283421287 +0100
@@ -0,0 +1,16 @@ 
+/* { dg-do compile } */
+
+struct __attribute__((aligned (32)))
+{
+  char misaligner;
+  int foo[100];
+  int bar[100];
+} *a;
+
+void
+fn1 (int n)
+{
+  int *b = a->foo;
+  for (int i = 0; i < n; i++)
+    a->bar[i] = b[i];
+}
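
For anyone reproducing the PR by hand, an execution harness appended to
the test file might look like the following -- just a sketch, not part
of the patch:

  /* Hypothetical driver for the test above; not part of the patch.  */
  #include <stdlib.h>

  int
  main (void)
  {
    /* Honour the struct's 32-byte alignment; C11 aligned_alloc wants
       the size rounded up to a multiple of the alignment.  */
    a = aligned_alloc (32, (sizeof *a + 31) & ~(size_t) 31);
    for (int i = 0; i < 100; i++)
      a->foo[i] = i;
    fn1 (100);
    for (int i = 0; i < 100; i++)
      if (a->bar[i] != i)
        abort ();
    return 0;
  }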