Handle data dependence relations with different bases

Message ID	87wp6urs0j.fsf@linaro.org
State	New
Headers	show Return-Path: <gcc-patches-return-459153-incoming=patchwork.ozlabs.org@gcc.gnu.org> DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:references:date:in-reply-to:message-id :mime-version:content-type; q=dns; s=default; b=VtM7qWwZarASeEN+ aTnkq4dLSOy7RsPmPQWq+lv7eIYnqvOdzN2dSyo2GYYrcuVEMqL61TQgzSYw2tFY vis3q7I/DCd+bZDlr2Adg1/RQz/IHFVxoQGlyIKhJS1BVxzphuVkvCdF80mjeR2i N9TBc7oGK5mRthNuVWxmCHOw4Sg= Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk Sender: gcc-patches-owner@gcc.gnu.org From: Richard Sandiford <richard.sandiford@linaro.org> To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, Richard Biener <richard.guenther@gmail.com>, richard.sandiford@linaro.org Cc: Richard Biener <richard.guenther@gmail.com> Subject: Re: Handle data dependence relations with different bases References: <87tw52s73r.fsf@linaro.org> <CAFiYyc0MREgfp6GCL6=d1hkRaEVX8NOAaX=hNgaGooEs8L5QbQ@mail.gmail.com> <87lgp38p59.fsf@linaro.org> <3947361.mLME8MOua7@polaris> <87y3s08hy2.fsf@linaro.org> Date: Thu, 27 Jul 2017 13:19:56 +0100 In-Reply-To: <87y3s08hy2.fsf@linaro.org> (Richard Sandiford's message of "Fri, 07 Jul 2017 15:01:09 +0100") Message-ID: <87wp6urs0j.fsf@linaro.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain

Message ID

87wp6urs0j.fsf@linaro.org

State

New

Headers

DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender:from
	:to:cc:subject:references:date:in-reply-to:message-id
	:mime-version:content-type; q=dns; s=default; b=VtM7qWwZarASeEN+
	aTnkq4dLSOy7RsPmPQWq+lv7eIYnqvOdzN2dSyo2GYYrcuVEMqL61TQgzSYw2tFY
	vis3q7I/DCd+bZDlr2Adg1/RQz/IHFVxoQGlyIKhJS1BVxzphuVkvCdF80mjeR2i
	N9TBc7oGK5mRthNuVWxmCHOw4Sg=
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
From: Richard Sandiford <richard.sandiford@linaro.org>
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org,
	Richard Biener <richard.guenther@gmail.com>,
	richard.sandiford@linaro.org
Cc: Richard Biener <richard.guenther@gmail.com>
Subject: Re: Handle data dependence relations with different bases
References: <87tw52s73r.fsf@linaro.org>	<CAFiYyc0MREgfp6GCL6=d1hkRaEVX8NOAaX=hNgaGooEs8L5QbQ@mail.gmail.com>	<87lgp38p59.fsf@linaro.org>
	<3947361.mLME8MOua7@polaris>	<87y3s08hy2.fsf@linaro.org>
Date: Thu, 27 Jul 2017 13:19:56 +0100
In-Reply-To: <87y3s08hy2.fsf@linaro.org> (Richard Sandiford's message of
	"Fri, 07 Jul 2017 15:01:09 +0100")
Message-ID: <87wp6urs0j.fsf@linaro.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain

Commit Message

Richard Sandiford July 27, 2017, 12:19 p.m. UTC

Richard Sandiford <richard.sandiford@linaro.org> writes:
> Eric Botcazou <ebotcazou@adacore.com> writes:
>> [Sorry for missing the previous messages]
>>
>>> Thanks.  Just been retesting, and I think I must have forgotten
>>> to include Ada last time.  It turns out that the patch causes a dg-scan
>>> regression in gnat.dg/vect17.adb, because we now think that if the
>>> array RECORD_TYPEs *do* alias in:
>>> 
>>>    procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>>>    begin
>>>       for I in Sarray'Range loop
>>>          R(I) := X(I) + Y(I);
>>>       end loop;
>>>    end;
>>> 
>>> then the dependence distance must be zero.  Eric, does that hold true
>>> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
>>> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?
>>
>> Yes, I'd think so (even without the artificial RECORD_TYPE around the arrays).
>
> Good!
>
>>> 2017-06-07  Richard Sandiford  <richard.sandiford@linaro.org>
>>> 
>>> gcc/testsuite/
>>> 	* gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>>> 	* gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>>> 	when X = R or Y = R.
>>
>> I think that you need to modify vect15 and vect16 the same way.
>
> Ah, yeah.  And doing that shows that I'd not handled safelen for
> DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.
>
> How does this look?  Tested on x86_64-linux-gnu both without the
> vectoriser changes and with the fixed vectoriser patch.

Here's a version of the patch that handles safelen.  I split the
handling out into a new function (vect_analyze_possibly_independent_ddr)
since it was getting too big to do inline.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Thanks,
Richard


2017-07-27  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* tree-data-ref.h (subscript): Add access_fn field.
	(data_dependence_relation): Add could_be_independent_p.
	(SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
	(same_access_functions): Move to tree-data-ref.c.
	* tree-data-ref.c (ref_contains_union_access_p): New function.
	(access_fn_component_p): Likewise.
	(access_fn_components_comparable_p): Likewise.
	(dr_analyze_indices): Add a reference to access_fn_component_p.
	(dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
	DR_ACCESS_FN.
	(constant_access_functions): Likewise.
	(add_other_self_distances): Likewise.
	(same_access_functions): Likewise.  (Moved from tree-data-ref.h.)
	(initialize_data_dependence_relation): Use XCNEW and remove
	explicit zeroing of DDR_REVERSED_P.  Look for a subsequence
	of access functions that have the same type.  Allow the
	subsequence to end with different bases in some circumstances.
	Record the chosen access functions in SUB_ACCESS_FN.
	(build_classic_dist_vector_1): Replace ddr_a and ddr_b with
	a_index and b_index.  Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
	(subscript_dependence_tester_1): Likewise dra and drb.
	(build_classic_dist_vector): Update calls accordingly.
	(subscript_dependence_tester): Likewise.
	* tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
	DDR_COULD_BE_INDEPENDENT_P.
	* tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
	comp_alias_ddrs instead of may_alias_ddrs.
	* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
	New function.
	(vect_analyze_data_ref_dependence): Use it if
	DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
	distance vectors if that fails.
	(dependence_distance_ge_vf): New function.
	(vect_prune_runtime_alias_test_list): Use it.  Don't clear
	LOOP_VINFO_MAY_ALIAS_DDRS.

gcc/testsuite/
	* gcc.dg/vect/vect-alias-check-3.c: New test.
	* gcc.dg/vect/vect-alias-check-4.c: Likewise.
	* gcc.dg/vect/vect-alias-check-5.c: Likewise.

Comments

Richard Sandiford Aug. 3, 2017, 8:52 p.m. UTC | #1

Ping

Richard Sandiford <richard.sandiford@linaro.org> writes:
> Richard Sandiford <richard.sandiford@linaro.org> writes:
>> Eric Botcazou <ebotcazou@adacore.com> writes:
>>> [Sorry for missing the previous messages]
>>>
>>>> Thanks.  Just been retesting, and I think I must have forgotten
>>>> to include Ada last time.  It turns out that the patch causes a dg-scan
>>>> regression in gnat.dg/vect17.adb, because we now think that if the
>>>> array RECORD_TYPEs *do* alias in:
>>>> 
>>>>    procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>>>>    begin
>>>>       for I in Sarray'Range loop
>>>>          R(I) := X(I) + Y(I);
>>>>       end loop;
>>>>    end;
>>>> 
>>>> then the dependence distance must be zero.  Eric, does that hold true
>>>> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
>>>> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?
>>>
>>> Yes, I'd think so (even without the artificial RECORD_TYPE around the arrays).
>>
>> Good!
>>
>>>> 2017-06-07  Richard Sandiford  <richard.sandiford@linaro.org>
>>>> 
>>>> gcc/testsuite/
>>>> 	* gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>>>> 	* gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>>>> 	when X = R or Y = R.
>>>
>>> I think that you need to modify vect15 and vect16 the same way.
>>
>> Ah, yeah.  And doing that shows that I'd not handled safelen for
>> DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.
>>
>> How does this look?  Tested on x86_64-linux-gnu both without the
>> vectoriser changes and with the fixed vectoriser patch.
>
> Here's a version of the patch that handles safelen.  I split the
> handling out into a new function (vect_analyze_possibly_independent_ddr)
> since it was getting too big to do inline.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>
> Thanks,
> Richard
>
>
> 2017-07-27  Richard Sandiford  <richard.sandiford@linaro.org>
>
> gcc/
> 	* tree-data-ref.h (subscript): Add access_fn field.
> 	(data_dependence_relation): Add could_be_independent_p.
> 	(SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
> 	(same_access_functions): Move to tree-data-ref.c.
> 	* tree-data-ref.c (ref_contains_union_access_p): New function.
> 	(access_fn_component_p): Likewise.
> 	(access_fn_components_comparable_p): Likewise.
> 	(dr_analyze_indices): Add a reference to access_fn_component_p.
> 	(dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
> 	DR_ACCESS_FN.
> 	(constant_access_functions): Likewise.
> 	(add_other_self_distances): Likewise.
> 	(same_access_functions): Likewise.  (Moved from tree-data-ref.h.)
> 	(initialize_data_dependence_relation): Use XCNEW and remove
> 	explicit zeroing of DDR_REVERSED_P.  Look for a subsequence
> 	of access functions that have the same type.  Allow the
> 	subsequence to end with different bases in some circumstances.
> 	Record the chosen access functions in SUB_ACCESS_FN.
> 	(build_classic_dist_vector_1): Replace ddr_a and ddr_b with
> 	a_index and b_index.  Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
> 	(subscript_dependence_tester_1): Likewise dra and drb.
> 	(build_classic_dist_vector): Update calls accordingly.
> 	(subscript_dependence_tester): Likewise.
> 	* tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
> 	DDR_COULD_BE_INDEPENDENT_P.
> 	* tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
> 	comp_alias_ddrs instead of may_alias_ddrs.
> 	* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
> 	New function.
> 	(vect_analyze_data_ref_dependence): Use it if
> 	DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
> 	distance vectors if that fails.
> 	(dependence_distance_ge_vf): New function.
> 	(vect_prune_runtime_alias_test_list): Use it.  Don't clear
> 	LOOP_VINFO_MAY_ALIAS_DDRS.
>
> gcc/testsuite/
> 	* gcc.dg/vect/vect-alias-check-3.c: New test.
> 	* gcc.dg/vect/vect-alias-check-4.c: Likewise.
> 	* gcc.dg/vect/vect-alias-check-5.c: Likewise.
>
> Index: gcc/tree-data-ref.h
> ===================================================================
> --- gcc/tree-data-ref.h	2017-07-27 13:10:29.620045506 +0100
> +++ gcc/tree-data-ref.h	2017-07-27 13:10:33.023912613 +0100
> @@ -260,6 +260,9 @@ struct conflict_function
>  
>  struct subscript
>  {
> +  /* The access functions of the two references.  */
> +  tree access_fn[2];
> +
>    /* A description of the iterations for which the elements are
>       accessed twice.  */
>    conflict_function *conflicting_iterations_in_a;
> @@ -278,6 +281,7 @@ struct subscript
>  
>  typedef struct subscript *subscript_p;
>  
> +#define SUB_ACCESS_FN(SUB, I) (SUB)->access_fn[I]
>  #define SUB_CONFLICTS_IN_A(SUB) (SUB)->conflicting_iterations_in_a
>  #define SUB_CONFLICTS_IN_B(SUB) (SUB)->conflicting_iterations_in_b
>  #define SUB_LAST_CONFLICT(SUB) (SUB)->last_conflict
> @@ -333,6 +337,33 @@ struct data_dependence_relation
>    /* Set to true when the dependence relation is on the same data
>       access.  */
>    bool self_reference_p;
> +
> +  /* True if the dependence described is conservatively correct rather
> +     than exact, and if it is still possible for the accesses to be
> +     conditionally independent.  For example, the a and b references in:
> +
> +       struct s *a, *b;
> +       for (int i = 0; i < n; ++i)
> +         a->f[i] += b->f[i];
> +
> +     conservatively have a distance vector of (0), for the case in which
> +     a == b, but the accesses are independent if a != b.  Similarly,
> +     the a and b references in:
> +
> +       struct s *a, *b;
> +       for (int i = 0; i < n; ++i)
> +         a[0].f[i] += b[i].f[i];
> +
> +     conservatively have a distance vector of (0), but they are indepenent
> +     when a != b + i.  In contrast, the references in:
> +
> +       struct s *a;
> +       for (int i = 0; i < n; ++i)
> +         a->f[i] += a->f[i];
> +
> +     have the same distance vector of (0), but the accesses can never be
> +     independent.  */
> +  bool could_be_independent_p;
>  };
>  
>  typedef struct data_dependence_relation *ddr_p;
> @@ -363,6 +394,7 @@ #define DDR_DIR_VECT(DDR, I) \
>  #define DDR_DIST_VECT(DDR, I) \
>    DDR_DIST_VECTS (DDR)[I]
>  #define DDR_REVERSED_P(DDR) (DDR)->reversed_p
> +#define DDR_COULD_BE_INDEPENDENT_P(DDR) (DDR)->could_be_independent_p
>  
>  
>  bool dr_analyze_innermost (innermost_loop_behavior *, tree, struct loop *);
> @@ -457,22 +489,6 @@ same_data_refs (data_reference_p a, data
>        return false;
>  
>    return true;
> -}
> -
> -/* Return true when the DDR contains two data references that have the
> -   same access functions.  */
> -
> -static inline bool
> -same_access_functions (const struct data_dependence_relation *ddr)
> -{
> -  unsigned i;
> -
> -  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
> -    if (!eq_evolutions_p (DR_ACCESS_FN (DDR_A (ddr), i),
> -			  DR_ACCESS_FN (DDR_B (ddr), i)))
> -      return false;
> -
> -  return true;
>  }
>  
>  /* Returns true when all the dependences are computable.  */
> Index: gcc/tree-data-ref.c
> ===================================================================
> --- gcc/tree-data-ref.c	2017-07-27 13:10:29.620045506 +0100
> +++ gcc/tree-data-ref.c	2017-07-27 13:10:33.023912613 +0100
> @@ -124,8 +124,7 @@ Software Foundation; either version 3, o
>  } dependence_stats;
>  
>  static bool subscript_dependence_tester_1 (struct data_dependence_relation *,
> -					   struct data_reference *,
> -					   struct data_reference *,
> +					   unsigned int, unsigned int,
>  					   struct loop *);
>  /* Returns true iff A divides B.  */
>  
> @@ -145,6 +144,21 @@ int_divides_p (int a, int b)
>    return ((b % a) == 0);
>  }
>  
> +/* Return true if reference REF contains a union access.  */
> +
> +static bool
> +ref_contains_union_access_p (tree ref)
> +{
> +  while (handled_component_p (ref))
> +    {
> +      ref = TREE_OPERAND (ref, 0);
> +      if (TREE_CODE (TREE_TYPE (ref)) == UNION_TYPE
> +	  || TREE_CODE (TREE_TYPE (ref)) == QUAL_UNION_TYPE)
> +	return true;
> +    }
> +  return false;
> +}
> +
>  
>  
>  /* Dump into FILE all the data references from DATAREFS.  */
> @@ -434,13 +448,14 @@ dump_data_dependence_relation (FILE *out
>        unsigned int i;
>        struct loop *loopi;
>  
> -      for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
> +      subscript *sub;
> +      FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>  	{
>  	  fprintf (outf, "  access_fn_A: ");
> -	  print_generic_stmt (outf, DR_ACCESS_FN (dra, i));
> +	  print_generic_stmt (outf, SUB_ACCESS_FN (sub, 0));
>  	  fprintf (outf, "  access_fn_B: ");
> -	  print_generic_stmt (outf, DR_ACCESS_FN (drb, i));
> -	  dump_subscript (outf, DDR_SUBSCRIPT (ddr, i));
> +	  print_generic_stmt (outf, SUB_ACCESS_FN (sub, 1));
> +	  dump_subscript (outf, sub);
>  	}
>  
>        fprintf (outf, "  inner loop index: %d\n", DDR_INNER_LOOP (ddr));
> @@ -920,6 +935,27 @@ dr_analyze_innermost (innermost_loop_beh
>    return true;
>  }
>  
> +/* Return true if OP is a valid component reference for a DR access
> +   function.  This accepts a subset of what handled_component_p accepts.  */
> +
> +static bool
> +access_fn_component_p (tree op)
> +{
> +  switch (TREE_CODE (op))
> +    {
> +    case REALPART_EXPR:
> +    case IMAGPART_EXPR:
> +    case ARRAY_REF:
> +      return true;
> +
> +    case COMPONENT_REF:
> +      return TREE_CODE (TREE_TYPE (TREE_OPERAND (op, 0))) == RECORD_TYPE;
> +
> +    default:
> +      return false;
> +    }
> +}
> +
>  /* Determines the base object and the list of indices of memory reference
>     DR, analyzed in LOOP and instantiated in loop nest NEST.  */
>  
> @@ -957,7 +993,9 @@ dr_analyze_indices (struct data_referenc
>        access_fns.safe_push (integer_one_node);
>      }
>  
> -  /* Analyze access functions of dimensions we know to be independent.  */
> +  /* Analyze access functions of dimensions we know to be independent.
> +     The list of component references handled here should be kept in
> +     sync with access_fn_component_p.  */
>    while (handled_component_p (ref))
>      {
>        if (TREE_CODE (ref) == ARRAY_REF)
> @@ -2148,6 +2186,38 @@ dr_may_alias_p (const struct data_refere
>    return refs_may_alias_p (addr_a, addr_b);
>  }
>  
> +/* REF_A and REF_B both satisfy access_fn_component_p.  Return true
> +   if it is meaningful to compare their associated access functions
> +   when checking for dependencies.  */
> +
> +static bool
> +access_fn_components_comparable_p (tree ref_a, tree ref_b)
> +{
> +  /* Allow pairs of component refs from the following sets:
> +
> +       { REALPART_EXPR, IMAGPART_EXPR }
> +       { COMPONENT_REF }
> +       { ARRAY_REF }.  */
> +  tree_code code_a = TREE_CODE (ref_a);
> +  tree_code code_b = TREE_CODE (ref_b);
> +  if (code_a == IMAGPART_EXPR)
> +    code_a = REALPART_EXPR;
> +  if (code_b == IMAGPART_EXPR)
> +    code_b = REALPART_EXPR;
> +  if (code_a != code_b)
> +    return false;
> +
> +  if (TREE_CODE (ref_a) == COMPONENT_REF)
> +    /* ??? We cannot simply use the type of operand #0 of the refs here as
> +       the Fortran compiler smuggles type punning into COMPONENT_REFs.
> +       Use the DECL_CONTEXT of the FIELD_DECLs instead.  */
> +    return (DECL_CONTEXT (TREE_OPERAND (ref_a, 1))
> +	    == DECL_CONTEXT (TREE_OPERAND (ref_b, 1)));
> +
> +  return types_compatible_p (TREE_TYPE (TREE_OPERAND (ref_a, 0)),
> +			     TREE_TYPE (TREE_OPERAND (ref_b, 0)));
> +}
> +
>  /* Initialize a data dependence relation between data accesses A and
>     B.  NB_LOOPS is the number of loops surrounding the references: the
>     size of the classic distance/direction vectors.  */
> @@ -2160,11 +2230,10 @@ initialize_data_dependence_relation (str
>    struct data_dependence_relation *res;
>    unsigned int i;
>  
> -  res = XNEW (struct data_dependence_relation);
> +  res = XCNEW (struct data_dependence_relation);
>    DDR_A (res) = a;
>    DDR_B (res) = b;
>    DDR_LOOP_NEST (res).create (0);
> -  DDR_REVERSED_P (res) = false;
>    DDR_SUBSCRIPTS (res).create (0);
>    DDR_DIR_VECTS (res).create (0);
>    DDR_DIST_VECTS (res).create (0);
> @@ -2182,82 +2251,277 @@ initialize_data_dependence_relation (str
>        return res;
>      }
>  
> -  /* The case where the references are exactly the same.  */
> -  if (operand_equal_p (DR_REF (a), DR_REF (b), 0))
> +  unsigned int num_dimensions_a = DR_NUM_DIMENSIONS (a);
> +  unsigned int num_dimensions_b = DR_NUM_DIMENSIONS (b);
> +  if (num_dimensions_a == 0 || num_dimensions_b == 0)
>      {
> -      if ((loop_nest.exists ()
> -	   && !object_address_invariant_in_loop_p (loop_nest[0],
> -						   DR_BASE_OBJECT (a)))
> -	  || DR_NUM_DIMENSIONS (a) == 0)
> +      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> +      return res;
> +    }
> +
> +  /* For unconstrained bases, the root (highest-indexed) subscript
> +     describes a variation in the base of the original DR_REF rather
> +     than a component access.  We have no type that accurately describes
> +     the new DR_BASE_OBJECT (whose TREE_TYPE describes the type *after*
> +     applying this subscript) so limit the search to the last real
> +     component access.
> +
> +     E.g. for:
> +
> +	void
> +	f (int a[][8], int b[][8])
> +	{
> +	  for (int i = 0; i < 8; ++i)
> +	    a[i * 2][0] = b[i][0];
> +	}
> +
> +     the a and b accesses have a single ARRAY_REF component reference [0]
> +     but have two subscripts.  */
> +  if (DR_UNCONSTRAINED_BASE (a))
> +    num_dimensions_a -= 1;
> +  if (DR_UNCONSTRAINED_BASE (b))
> +    num_dimensions_b -= 1;
> +
> +  /* These structures describe sequences of component references in
> +     DR_REF (A) and DR_REF (B).  Each component reference is tied to a
> +     specific access function.  */
> +  struct {
> +    /* The sequence starts at DR_ACCESS_FN (A, START_A) of A and
> +       DR_ACCESS_FN (B, START_B) of B (inclusive) and extends to higher
> +       indices.  In C notation, these are the indices of the rightmost
> +       component references; e.g. for a sequence .b.c.d, the start
> +       index is for .d.  */
> +    unsigned int start_a;
> +    unsigned int start_b;
> +
> +    /* The sequence contains LENGTH consecutive access functions from
> +       each DR.  */
> +    unsigned int length;
> +
> +    /* The enclosing objects for the A and B sequences respectively,
> +       i.e. the objects to which DR_ACCESS_FN (A, START_A + LENGTH - 1)
> +       and DR_ACCESS_FN (B, START_B + LENGTH - 1) are applied.  */
> +    tree object_a;
> +    tree object_b;
> +  } full_seq = {}, struct_seq = {};
> +
> +  /* Before each iteration of the loop:
> +
> +     - REF_A is what you get after applying DR_ACCESS_FN (A, INDEX_A) and
> +     - REF_B is what you get after applying DR_ACCESS_FN (B, INDEX_B).  */
> +  unsigned int index_a = 0;
> +  unsigned int index_b = 0;
> +  tree ref_a = DR_REF (a);
> +  tree ref_b = DR_REF (b);
> +
> +  /* Now walk the component references from the final DR_REFs back up to
> +     the enclosing base objects.  Each component reference corresponds
> +     to one access function in the DR, with access function 0 being for
> +     the final DR_REF and the highest-indexed access function being the
> +     one that is applied to the base of the DR.
> +
> +     Look for a sequence of component references whose access functions
> +     are comparable (see access_fn_components_comparable_p).  If more
> +     than one such sequence exists, pick the one nearest the base
> +     (which is the leftmost sequence in C notation).  Store this sequence
> +     in FULL_SEQ.
> +
> +     For example, if we have:
> +
> +	struct foo { struct bar s; ... } (*a)[10], (*b)[10];
> +
> +	A: a[0][i].s.c.d
> +	B: __real b[0][i].s.e[i].f
> +
> +     (where d is the same type as the real component of f) then the access
> +     functions would be:
> +
> +			 0   1   2   3
> +	A:              .d  .c  .s [i]
> +
> +		 0   1   2   3   4   5
> +	B:  __real  .f [i]  .e  .s [i]
> +
> +     The A0/B2 column isn't comparable, since .d is a COMPONENT_REF
> +     and [i] is an ARRAY_REF.  However, the A1/B3 column contains two
> +     COMPONENT_REF accesses for struct bar, so is comparable.  Likewise
> +     the A2/B4 column contains two COMPONENT_REF accesses for struct foo,
> +     so is comparable.  The A3/B5 column contains two ARRAY_REFs that
> +     index foo[10] arrays, so is again comparable.  The sequence is
> +     therefore:
> +
> +        A: [1, 3]  (i.e. [i].s.c)
> +        B: [3, 5]  (i.e. [i].s.e)
> +
> +     Also look for sequences of component references whose access
> +     functions are comparable and whose enclosing objects have the same
> +     RECORD_TYPE.  Store this sequence in STRUCT_SEQ.  In the above
> +     example, STRUCT_SEQ would be:
> +
> +        A: [1, 2]  (i.e. s.c)
> +        B: [3, 4]  (i.e. s.e)  */
> +  while (index_a < num_dimensions_a && index_b < num_dimensions_b)
> +    {
> +      /* REF_A and REF_B must be one of the component access types
> +	 allowed by dr_analyze_indices.  */
> +      gcc_checking_assert (access_fn_component_p (ref_a));
> +      gcc_checking_assert (access_fn_component_p (ref_b));
> +
> +      /* Get the immediately-enclosing objects for REF_A and REF_B,
> +	 i.e. the references *before* applying DR_ACCESS_FN (A, INDEX_A)
> +	 and DR_ACCESS_FN (B, INDEX_B).  */
> +      tree object_a = TREE_OPERAND (ref_a, 0);
> +      tree object_b = TREE_OPERAND (ref_b, 0);
> +
> +      tree type_a = TREE_TYPE (object_a);
> +      tree type_b = TREE_TYPE (object_b);
> +      if (access_fn_components_comparable_p (ref_a, ref_b))
> +	{
> +	  /* This pair of component accesses is comparable for dependence
> +	     analysis, so we can include DR_ACCESS_FN (A, INDEX_A) and
> +	     DR_ACCESS_FN (B, INDEX_B) in the sequence.  */
> +	  if (full_seq.start_a + full_seq.length != index_a
> +	      || full_seq.start_b + full_seq.length != index_b)
> +	    {
> +	      /* The accesses don't extend the current sequence,
> +		 so start a new one here.  */
> +	      full_seq.start_a = index_a;
> +	      full_seq.start_b = index_b;
> +	      full_seq.length = 0;
> +	    }
> +
> +	  /* Add this pair of references to the sequence.  */
> +	  full_seq.length += 1;
> +	  full_seq.object_a = object_a;
> +	  full_seq.object_b = object_b;
> +
> +	  /* If the enclosing objects are structures (and thus have the
> +	     same RECORD_TYPE), record the new sequence in STRUCT_SEQ.  */
> +	  if (TREE_CODE (type_a) == RECORD_TYPE)
> +	    struct_seq = full_seq;
> +
> +	  /* Move to the next containing reference for both A and B.  */
> +	  ref_a = object_a;
> +	  ref_b = object_b;
> +	  index_a += 1;
> +	  index_b += 1;
> +	  continue;
> +	}
> +
> +      /* Try to approach equal type sizes.  */
> +      if (!COMPLETE_TYPE_P (type_a)
> +	  || !COMPLETE_TYPE_P (type_b)
> +	  || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
> +	  || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
> +	break;
> +
> +      unsigned HOST_WIDE_INT size_a = tree_to_uhwi (TYPE_SIZE_UNIT (type_a));
> +      unsigned HOST_WIDE_INT size_b = tree_to_uhwi (TYPE_SIZE_UNIT (type_b));
> +      if (size_a <= size_b)
>  	{
> -	  DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> -	  return res;
> +	  index_a += 1;
> +	  ref_a = object_a;
> +	}
> +      if (size_b <= size_a)
> +	{
> +	  index_b += 1;
> +	  ref_b = object_b;
>  	}
> -      DDR_AFFINE_P (res) = true;
> -      DDR_ARE_DEPENDENT (res) = NULL_TREE;
> -      DDR_SUBSCRIPTS (res).create (DR_NUM_DIMENSIONS (a));
> -      DDR_LOOP_NEST (res) = loop_nest;
> -      DDR_INNER_LOOP (res) = 0;
> -      DDR_SELF_REFERENCE (res) = true;
> -      for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
> -       {
> -         struct subscript *subscript;
> -
> -         subscript = XNEW (struct subscript);
> -         SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
> -         SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
> -         SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
> -         SUB_DISTANCE (subscript) = chrec_dont_know;
> -         DDR_SUBSCRIPTS (res).safe_push (subscript);
> -       }
> -      return res;
>      }
>  
> -  /* If the references do not access the same object, we do not know
> -     whether they alias or not.  We do not care about TBAA or alignment
> -     info so we can use OEP_ADDRESS_OF to avoid false negatives.
> -     But the accesses have to use compatible types as otherwise the
> -     built indices would not match.  */
> -  if (!operand_equal_p (DR_BASE_OBJECT (a), DR_BASE_OBJECT (b), OEP_ADDRESS_OF)
> -      || !types_compatible_p (TREE_TYPE (DR_BASE_OBJECT (a)),
> -			      TREE_TYPE (DR_BASE_OBJECT (b))))
> +  /* See whether FULL_SEQ ends at the base and whether the two bases
> +     are equal.  We do not care about TBAA or alignment info so we can
> +     use OEP_ADDRESS_OF to avoid false negatives.  */
> +  tree base_a = DR_BASE_OBJECT (a);
> +  tree base_b = DR_BASE_OBJECT (b);
> +  bool same_base_p = (full_seq.start_a + full_seq.length == num_dimensions_a
> +		      && full_seq.start_b + full_seq.length == num_dimensions_b
> +		      && DR_UNCONSTRAINED_BASE (a) == DR_UNCONSTRAINED_BASE (b)
> +		      && operand_equal_p (base_a, base_b, OEP_ADDRESS_OF)
> +		      && types_compatible_p (TREE_TYPE (base_a),
> +					     TREE_TYPE (base_b))
> +		      && (!loop_nest.exists ()
> +			  || (object_address_invariant_in_loop_p
> +			      (loop_nest[0], base_a))));
> +
> +  /* If the bases are the same, we can include the base variation too.
> +     E.g. the b accesses in:
> +
> +       for (int i = 0; i < n; ++i)
> +         b[i + 4][0] = b[i][0];
> +
> +     have a definite dependence distance of 4, while for:
> +
> +       for (int i = 0; i < n; ++i)
> +         a[i + 4][0] = b[i][0];
> +
> +     the dependence distance depends on the gap between a and b.
> +
> +     If the bases are different then we can only rely on the sequence
> +     rooted at a structure access, since arrays are allowed to overlap
> +     arbitrarily and change shape arbitrarily.  E.g. we treat this as
> +     valid code:
> +
> +       int a[256];
> +       ...
> +       ((int (*)[4][3]) &a[1])[i][0] += ((int (*)[4][3]) &a[2])[i][0];
> +
> +     where two lvalues with the same int[4][3] type overlap, and where
> +     both lvalues are distinct from the object's declared type.  */
> +  if (same_base_p)
>      {
> -      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> -      return res;
> +      if (DR_UNCONSTRAINED_BASE (a))
> +	full_seq.length += 1;
>      }
> +  else
> +    full_seq = struct_seq;
>  
> -  /* If the base of the object is not invariant in the loop nest, we cannot
> -     analyze it.  TODO -- in fact, it would suffice to record that there may
> -     be arbitrary dependences in the loops where the base object varies.  */
> -  if ((loop_nest.exists ()
> -       && !object_address_invariant_in_loop_p (loop_nest[0], DR_BASE_OBJECT (a)))
> -      || DR_NUM_DIMENSIONS (a) == 0)
> +  /* Punt if we didn't find a suitable sequence.  */
> +  if (full_seq.length == 0)
>      {
>        DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>        return res;
>      }
>  
> -  /* If the number of dimensions of the access to not agree we can have
> -     a pointer access to a component of the array element type and an
> -     array access while the base-objects are still the same.  Punt.  */
> -  if (DR_NUM_DIMENSIONS (a) != DR_NUM_DIMENSIONS (b))
> +  if (!same_base_p)
>      {
> -      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> -      return res;
> +      /* Partial overlap is possible for different bases when strict aliasing
> +	 is not in effect.  It's also possible if either base involves a union
> +	 access; e.g. for:
> +
> +	   struct s1 { int a[2]; };
> +	   struct s2 { struct s1 b; int c; };
> +	   struct s3 { int d; struct s1 e; };
> +	   union u { struct s2 f; struct s3 g; } *p, *q;
> +
> +	 the s1 at "p->f.b" (base "p->f") partially overlaps the s1 at
> +	 "p->g.e" (base "p->g") and might partially overlap the s1 at
> +	 "q->g.e" (base "q->g").  */
> +      if (!flag_strict_aliasing
> +	  || ref_contains_union_access_p (full_seq.object_a)
> +	  || ref_contains_union_access_p (full_seq.object_b))
> +	{
> +	  DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> +	  return res;
> +	}
> +
> +      DDR_COULD_BE_INDEPENDENT_P (res) = true;
>      }
>  
>    DDR_AFFINE_P (res) = true;
>    DDR_ARE_DEPENDENT (res) = NULL_TREE;
> -  DDR_SUBSCRIPTS (res).create (DR_NUM_DIMENSIONS (a));
> +  DDR_SUBSCRIPTS (res).create (full_seq.length);
>    DDR_LOOP_NEST (res) = loop_nest;
>    DDR_INNER_LOOP (res) = 0;
>    DDR_SELF_REFERENCE (res) = false;
>  
> -  for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
> +  for (i = 0; i < full_seq.length; ++i)
>      {
>        struct subscript *subscript;
>  
>        subscript = XNEW (struct subscript);
> +      SUB_ACCESS_FN (subscript, 0) = DR_ACCESS_FN (a, full_seq.start_a + i);
> +      SUB_ACCESS_FN (subscript, 1) = DR_ACCESS_FN (b, full_seq.start_b + i);
>        SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
>        SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
>        SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
> @@ -3839,14 +4103,15 @@ add_outer_distances (struct data_depende
>  }
>  
>  /* Return false when fail to represent the data dependence as a
> -   distance vector.  INIT_B is set to true when a component has been
> +   distance vector.  A_INDEX is the index of the first reference
> +   (0 for DDR_A, 1 for DDR_B) and B_INDEX is the index of the
> +   second reference.  INIT_B is set to true when a component has been
>     added to the distance vector DIST_V.  INDEX_CARRY is then set to
>     the index in DIST_V that carries the dependence.  */
>  
>  static bool
>  build_classic_dist_vector_1 (struct data_dependence_relation *ddr,
> -			     struct data_reference *ddr_a,
> -			     struct data_reference *ddr_b,
> +			     unsigned int a_index, unsigned int b_index,
>  			     lambda_vector dist_v, bool *init_b,
>  			     int *index_carry)
>  {
> @@ -3864,8 +4129,8 @@ build_classic_dist_vector_1 (struct data
>  	  return false;
>  	}
>  
> -      access_fn_a = DR_ACCESS_FN (ddr_a, i);
> -      access_fn_b = DR_ACCESS_FN (ddr_b, i);
> +      access_fn_a = SUB_ACCESS_FN (subscript, a_index);
> +      access_fn_b = SUB_ACCESS_FN (subscript, b_index);
>  
>        if (TREE_CODE (access_fn_a) == POLYNOMIAL_CHREC
>  	  && TREE_CODE (access_fn_b) == POLYNOMIAL_CHREC)
> @@ -3925,10 +4190,11 @@ build_classic_dist_vector_1 (struct data
>  constant_access_functions (const struct data_dependence_relation *ddr)
>  {
>    unsigned i;
> +  subscript *sub;
>  
> -  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
> -    if (!evolution_function_is_constant_p (DR_ACCESS_FN (DDR_A (ddr), i))
> -	|| !evolution_function_is_constant_p (DR_ACCESS_FN (DDR_B (ddr), i)))
> +  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
> +    if (!evolution_function_is_constant_p (SUB_ACCESS_FN (sub, 0))
> +	|| !evolution_function_is_constant_p (SUB_ACCESS_FN (sub, 1)))
>        return false;
>  
>    return true;
> @@ -3991,10 +4257,11 @@ add_other_self_distances (struct data_de
>    lambda_vector dist_v;
>    unsigned i;
>    int index_carry = DDR_NB_LOOPS (ddr);
> +  subscript *sub;
>  
> -  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
> +  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>      {
> -      tree access_fun = DR_ACCESS_FN (DDR_A (ddr), i);
> +      tree access_fun = SUB_ACCESS_FN (sub, 0);
>  
>        if (TREE_CODE (access_fun) == POLYNOMIAL_CHREC)
>  	{
> @@ -4006,7 +4273,7 @@ add_other_self_distances (struct data_de
>  		  return;
>  		}
>  
> -	      access_fun = DR_ACCESS_FN (DDR_A (ddr), 0);
> +	      access_fun = SUB_ACCESS_FN (DDR_SUBSCRIPT (ddr, 0), 0);
>  
>  	      if (TREE_CODE (CHREC_LEFT (access_fun)) == POLYNOMIAL_CHREC)
>  		add_multivariate_self_dist (ddr, access_fun);
> @@ -4077,6 +4344,23 @@ add_distance_for_zero_overlaps (struct d
>      }
>  }
>  
> +/* Return true when the DDR contains two data references that have the
> +   same access functions.  */
> +
> +static inline bool
> +same_access_functions (const struct data_dependence_relation *ddr)
> +{
> +  unsigned i;
> +  subscript *sub;
> +
> +  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
> +    if (!eq_evolutions_p (SUB_ACCESS_FN (sub, 0),
> +			  SUB_ACCESS_FN (sub, 1)))
> +      return false;
> +
> +  return true;
> +}
> +
>  /* Compute the classic per loop distance vector.  DDR is the data
>     dependence relation to build a vector from.  Return false when fail
>     to represent the data dependence as a distance vector.  */
> @@ -4108,8 +4392,7 @@ build_classic_dist_vector (struct data_d
>      }
>  
>    dist_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
> -  if (!build_classic_dist_vector_1 (ddr, DDR_A (ddr), DDR_B (ddr),
> -				    dist_v, &init_b, &index_carry))
> +  if (!build_classic_dist_vector_1 (ddr, 0, 1, dist_v, &init_b, &index_carry))
>      return false;
>  
>    /* Save the distance vector if we initialized one.  */
> @@ -4142,12 +4425,11 @@ build_classic_dist_vector (struct data_d
>        if (!lambda_vector_lexico_pos (dist_v, DDR_NB_LOOPS (ddr)))
>  	{
>  	  lambda_vector save_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
> -	  if (!subscript_dependence_tester_1 (ddr, DDR_B (ddr), DDR_A (ddr),
> -					      loop_nest))
> +	  if (!subscript_dependence_tester_1 (ddr, 1, 0, loop_nest))
>  	    return false;
>  	  compute_subscript_distance (ddr);
> -	  if (!build_classic_dist_vector_1 (ddr, DDR_B (ddr), DDR_A (ddr),
> -					    save_v, &init_b, &index_carry))
> +	  if (!build_classic_dist_vector_1 (ddr, 1, 0, save_v, &init_b,
> +					    &index_carry))
>  	    return false;
>  	  save_dist_v (ddr, save_v);
>  	  DDR_REVERSED_P (ddr) = true;
> @@ -4183,12 +4465,10 @@ build_classic_dist_vector (struct data_d
>  	    {
>  	      lambda_vector opposite_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
>  
> -	      if (!subscript_dependence_tester_1 (ddr, DDR_B (ddr),
> -						  DDR_A (ddr), loop_nest))
> +	      if (!subscript_dependence_tester_1 (ddr, 1, 0, loop_nest))
>  		return false;
>  	      compute_subscript_distance (ddr);
> -	      if (!build_classic_dist_vector_1 (ddr, DDR_B (ddr), DDR_A (ddr),
> -						opposite_v, &init_b,
> +	      if (!build_classic_dist_vector_1 (ddr, 1, 0, opposite_v, &init_b,
>  						&index_carry))
>  		return false;
>  
> @@ -4267,13 +4547,13 @@ build_classic_dir_vector (struct data_de
>      }
>  }
>  
> -/* Helper function.  Returns true when there is a dependence between
> -   data references DRA and DRB.  */
> +/* Helper function.  Returns true when there is a dependence between the
> +   data references.  A_INDEX is the index of the first reference (0 for
> +   DDR_A, 1 for DDR_B) and B_INDEX is the index of the second reference.  */
>  
>  static bool
>  subscript_dependence_tester_1 (struct data_dependence_relation *ddr,
> -			       struct data_reference *dra,
> -			       struct data_reference *drb,
> +			       unsigned int a_index, unsigned int b_index,
>  			       struct loop *loop_nest)
>  {
>    unsigned int i;
> @@ -4285,8 +4565,8 @@ subscript_dependence_tester_1 (struct da
>      {
>        conflict_function *overlaps_a, *overlaps_b;
>  
> -      analyze_overlapping_iterations (DR_ACCESS_FN (dra, i),
> -				      DR_ACCESS_FN (drb, i),
> +      analyze_overlapping_iterations (SUB_ACCESS_FN (subscript, a_index),
> +				      SUB_ACCESS_FN (subscript, b_index),
>  				      &overlaps_a, &overlaps_b,
>  				      &last_conflicts, loop_nest);
>  
> @@ -4335,7 +4615,7 @@ subscript_dependence_tester_1 (struct da
>  subscript_dependence_tester (struct data_dependence_relation *ddr,
>  			     struct loop *loop_nest)
>  {
> -  if (subscript_dependence_tester_1 (ddr, DDR_A (ddr), DDR_B (ddr), loop_nest))
> +  if (subscript_dependence_tester_1 (ddr, 0, 1, loop_nest))
>      dependence_stats.num_dependence_dependent++;
>  
>    compute_subscript_distance (ddr);
> Index: gcc/tree-ssa-loop-prefetch.c
> ===================================================================
> --- gcc/tree-ssa-loop-prefetch.c	2017-07-27 13:10:29.620045506 +0100
> +++ gcc/tree-ssa-loop-prefetch.c	2017-07-27 13:10:33.023912613 +0100
> @@ -1668,6 +1668,7 @@ determine_loop_nest_reuse (struct loop *
>        refb = (struct mem_ref *) DDR_B (dep)->aux;
>  
>        if (DDR_ARE_DEPENDENT (dep) == chrec_dont_know
> +	  || DDR_COULD_BE_INDEPENDENT_P (dep)
>  	  || DDR_NUM_DIST_VECTS (dep) == 0)
>  	{
>  	  /* If the dependence cannot be analyzed, assume that there might be
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h	2017-07-27 13:10:29.620045506 +0100
> +++ gcc/tree-vectorizer.h	2017-07-27 13:10:33.024912868 +0100
> @@ -358,7 +358,7 @@ #define LOOP_VINFO_ORIG_LOOP_INFO(L)
>  #define LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT(L)	\
>    ((L)->may_misalign_stmts.length () > 0)
>  #define LOOP_REQUIRES_VERSIONING_FOR_ALIAS(L)		\
> -  ((L)->may_alias_ddrs.length () > 0)
> +  ((L)->comp_alias_ddrs.length () > 0)
>  #define LOOP_REQUIRES_VERSIONING_FOR_NITERS(L)		\
>    (LOOP_VINFO_NITERS_ASSUMPTIONS (L))
>  #define LOOP_REQUIRES_VERSIONING(L)			\
> Index: gcc/tree-vect-data-refs.c
> ===================================================================
> --- gcc/tree-vect-data-refs.c	2017-07-27 13:10:29.620045506 +0100
> +++ gcc/tree-vect-data-refs.c	2017-07-27 13:10:33.024912868 +0100
> @@ -160,6 +160,60 @@ vect_mark_for_runtime_alias_test (ddr_p
>  }
>  
>  
> +/* A subroutine of vect_analyze_data_ref_dependence.  Handle
> +   DDR_COULD_BE_INDEPENDENT_P ddr DDR that has a known set of dependence
> +   distances.  These distances are conservatively correct but they don't
> +   reflect a guaranteed dependence.
> +
> +   Return true if this function does all the work necessary to avoid
> +   an alias or false if the caller should use the dependence distances
> +   to limit the vectorization factor in the usual way.  LOOP_DEPTH is
> +   the depth of the loop described by LOOP_VINFO and the other arguments
> +   are as for vect_analyze_data_ref_dependence.  */
> +
> +static bool
> +vect_analyze_possibly_independent_ddr (data_dependence_relation *ddr,
> +				       loop_vec_info loop_vinfo,
> +				       int loop_depth, int *max_vf)
> +{
> +  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> +  lambda_vector dist_v;
> +  unsigned int i;
> +  FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
> +    {
> +      int dist = dist_v[loop_depth];
> +      if (dist != 0 && !(dist > 0 && DDR_REVERSED_P (ddr)))
> +	{
> +	  /* If the user asserted safelen >= DIST consecutive iterations
> +	     can be executed concurrently, assume independence.
> +
> +	     ??? An alternative would be to add the alias check even
> +	     in this case, and vectorize the fallback loop with the
> +	     maximum VF set to safelen.  However, if the user has
> +	     explicitly given a length, it's less likely that that
> +	     would be a win.  */
> +	  if (loop->safelen >= 2 && abs_hwi (dist) <= loop->safelen)
> +	    {
> +	      if (loop->safelen < *max_vf)
> +		*max_vf = loop->safelen;
> +	      LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo) = false;
> +	      continue;
> +	    }
> +
> +	  /* For dependence distances of 2 or more, we have the option
> +	     of limiting VF or checking for an alias at runtime.
> +	     Prefer to check at runtime if we can, to avoid limiting
> +	     the VF unnecessarily when the bases are in fact independent.
> +
> +	     Note that the alias checks will be removed if the VF ends up
> +	     being small enough.  */
> +	  return vect_mark_for_runtime_alias_test (ddr, loop_vinfo);
> +	}
> +    }
> +  return true;
> +}
> +
> +
>  /* Function vect_analyze_data_ref_dependence.
>  
>     Return TRUE if there (might) exist a dependence between a memory-reference
> @@ -305,6 +359,12 @@ vect_analyze_data_ref_dependence (struct
>      }
>  
>    loop_depth = index_in_loop_nest (loop->num, DDR_LOOP_NEST (ddr));
> +
> +  if (DDR_COULD_BE_INDEPENDENT_P (ddr)
> +      && vect_analyze_possibly_independent_ddr (ddr, loop_vinfo,
> +						loop_depth, max_vf))
> +    return false;
> +
>    FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
>      {
>        int dist = dist_v[loop_depth];
> @@ -2878,6 +2938,44 @@ vect_no_alias_p (struct data_reference *
>    return false;
>  }
>  
> +/* Return true if the minimum nonzero dependence distance for loop LOOP_DEPTH
> +   in DDR is >= VF.  */
> +
> +static bool
> +dependence_distance_ge_vf (data_dependence_relation *ddr,
> +			   unsigned int loop_depth, unsigned HOST_WIDE_INT vf)
> +{
> +  if (DDR_ARE_DEPENDENT (ddr) != NULL_TREE
> +      || DDR_NUM_DIST_VECTS (ddr) == 0)
> +    return false;
> +
> +  /* If the dependence is exact, we should have limited the VF instead.  */
> +  gcc_checking_assert (DDR_COULD_BE_INDEPENDENT_P (ddr));
> +
> +  unsigned int i;
> +  lambda_vector dist_v;
> +  FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
> +    {
> +      HOST_WIDE_INT dist = dist_v[loop_depth];
> +      if (dist != 0
> +	  && !(dist > 0 && DDR_REVERSED_P (ddr))
> +	  && (unsigned HOST_WIDE_INT) abs_hwi (dist) < vf)
> +	return false;
> +    }
> +
> +  if (dump_enabled_p ())
> +    {
> +      dump_printf_loc (MSG_NOTE, vect_location,
> +		       "dependence distance between ");
> +      dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (DDR_A (ddr)));
> +      dump_printf (MSG_NOTE,  " and ");
> +      dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (DDR_B (ddr)));
> +      dump_printf (MSG_NOTE,  " is >= VF\n");
> +    }
> +
> +  return true;
> +}
> +
>  /* Function vect_prune_runtime_alias_test_list.
>  
>     Prune a list of ddrs to be tested at run-time by versioning for alias.
> @@ -2908,6 +3006,10 @@ vect_prune_runtime_alias_test_list (loop
>  
>    comp_alias_ddrs.create (may_alias_ddrs.length ());
>  
> +  unsigned int loop_depth
> +    = index_in_loop_nest (LOOP_VINFO_LOOP (loop_vinfo)->num,
> +			  LOOP_VINFO_LOOP_NEST (loop_vinfo));
> +
>    /* First, we collect all data ref pairs for aliasing checks.  */
>    FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
>      {
> @@ -2917,6 +3019,11 @@ vect_prune_runtime_alias_test_list (loop
>        tree segment_length_a, segment_length_b;
>        gimple *stmt_a, *stmt_b;
>  
> +      /* Ignore the alias if the VF we chose ended up being no greater
> +	 than the dependence distance.  */
> +      if (dependence_distance_ge_vf (ddr, loop_depth, vect_factor))
> +	continue;
> +
>        dr_a = DDR_A (ddr);
>        stmt_a = DR_STMT (DDR_A (ddr));
>        dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
> @@ -2993,10 +3100,6 @@ vect_prune_runtime_alias_test_list (loop
>        return false;
>      }
>  
> -  /* All alias checks have been resolved at compilation time.  */
> -  if (!comp_alias_ddrs.length ())
> -    LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo).truncate (0);
> -
>    return true;
>  }
>  
> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-3.c
> ===================================================================
> --- /dev/null	2017-07-27 10:25:31.671280760 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-3.c	2017-07-27 13:10:33.022912357 +0100
> @@ -0,0 +1,120 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-additional-options "--param vect-max-version-for-alias-checks=0 -fopenmp-simd" } */
> +
> +/* Intended to be larger than any VF.  */
> +#define GAP 128
> +#define N (GAP * 3)
> +
> +struct s { int x[N + 1]; };
> +struct t { struct s x[N + 1]; };
> +struct u { int x[N + 1]; int y; };
> +struct v { struct s s; };
> +
> +void
> +f1 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a->x[i] += b->x[i];
> +}
> +
> +void
> +f2 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a[1].x[i] += b[2].x[i];
> +}
> +
> +void
> +f3 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a[1].x[i] += b[i].x[i];
> +}
> +
> +void
> +f4 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a[i].x[i] += b[i].x[i];
> +}
> +
> +void
> +f5 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a->x[i] += b->x[i + 1];
> +}
> +
> +void
> +f6 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a[1].x[i] += b[2].x[i + 1];
> +}
> +
> +void
> +f7 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a[1].x[i] += b[i].x[i + 1];
> +}
> +
> +void
> +f8 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a[i].x[i] += b[i].x[i + 1];
> +}
> +
> +void
> +f9 (struct s *a, struct t *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a->x[i] += b->x[1].x[i];
> +}
> +
> +void
> +f10 (struct s *a, struct t *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a->x[i] += b->x[i].x[i];
> +}
> +
> +void
> +f11 (struct u *a, struct u *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a->x[i] += b->x[i] + b[i].y;
> +}
> +
> +void
> +f12 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < GAP; ++i)
> +    a->x[i + GAP] += b->x[i];
> +}
> +
> +void
> +f13 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < GAP * 2; ++i)
> +    a->x[i + GAP] += b->x[i];
> +}
> +
> +void
> +f14 (struct v *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a->s.x[i] = b->x[i];
> +}
> +
> +void
> +f15 (struct s *a, struct s *b)
> +{
> +  #pragma omp simd safelen(N)
> +  for (int i = 0; i < N; ++i)
> +    a->x[i + 1] += b->x[i];
> +}
> +
> +/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 15 "vect" } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-4.c
> ===================================================================
> --- /dev/null	2017-07-27 10:25:31.671280760 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-4.c	2017-07-27 13:10:33.022912357 +0100
> @@ -0,0 +1,35 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-additional-options "--param vect-max-version-for-alias-checks=0" } */
> +
> +#define N 16
> +
> +struct s1 { int a[N]; };
> +struct s2 { struct s1 b; int c; };
> +struct s3 { int d; struct s1 e; };
> +union u { struct s2 f; struct s3 g; };
> +
> +/* We allow a and b to overlap arbitrarily.  */
> +
> +void
> +f1 (int a[][N], int b[][N])
> +{
> +  for (int i = 0; i < N; ++i)
> +    a[0][i] += b[0][i];
> +}
> +
> +void
> +f2 (union u *a, union u *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a->f.b.a[i] += b->g.e.a[i];
> +}
> +
> +void
> +f3 (struct s1 *a, struct s1 *b)
> +{
> +  for (int i = 0; i < N - 1; ++i)
> +    a->a[i + 1] += b->a[i];
> +}
> +
> +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c
> ===================================================================
> --- /dev/null	2017-07-27 10:25:31.671280760 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c	2017-07-27 13:10:33.022912357 +0100
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_int } */
> +
> +/* Intended to be larger than any VF.  */
> +#define GAP 128
> +#define N (GAP * 3)
> +
> +struct s { int x[N]; };
> +
> +void
> +f1 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < GAP * 2; ++i)
> +    a->x[i + GAP] += b->x[i];
> +}
> +
> +/* { dg-final { scan-tree-dump-times "consider run-time aliasing" 1 "vect" } } */
> +/* { dg-final { scan-tree-dump-times "improved number of alias checks from 1 to 0" 1 "vect" } } */
> +/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 1 "vect" } } */

Richard Biener Aug. 4, 2017, 9:06 a.m. UTC | #2

On Thu, Jul 27, 2017 at 2:19 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Sandiford <richard.sandiford@linaro.org> writes:
>> Eric Botcazou <ebotcazou@adacore.com> writes:
>>> [Sorry for missing the previous messages]
>>>
>>>> Thanks.  Just been retesting, and I think I must have forgotten
>>>> to include Ada last time.  It turns out that the patch causes a dg-scan
>>>> regression in gnat.dg/vect17.adb, because we now think that if the
>>>> array RECORD_TYPEs *do* alias in:
>>>>
>>>>    procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>>>>    begin
>>>>       for I in Sarray'Range loop
>>>>          R(I) := X(I) + Y(I);
>>>>       end loop;
>>>>    end;
>>>>
>>>> then the dependence distance must be zero.  Eric, does that hold true
>>>> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
>>>> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?
>>>
>>> Yes, I'd think so (even without the artificial RECORD_TYPE around the arrays).
>>
>> Good!
>>
>>>> 2017-06-07  Richard Sandiford  <richard.sandiford@linaro.org>
>>>>
>>>> gcc/testsuite/
>>>>     * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>>>>     * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>>>>     when X = R or Y = R.
>>>
>>> I think that you need to modify vect15 and vect16 the same way.
>>
>> Ah, yeah.  And doing that shows that I'd not handled safelen for
>> DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.
>>
>> How does this look?  Tested on x86_64-linux-gnu both without the
>> vectoriser changes and with the fixed vectoriser patch.
>
> Here's a version of the patch that handles safelen.  I split the
> handling out into a new function (vect_analyze_possibly_independent_ddr)
> since it was getting too big to do inline.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Ok.

Did you check whether BB vectorization is affected?  See
vect_slp_analyze_instance_dependence
and friends.  It's quite conservative but given the prefetching change
I wonder if we need
to rule out DDR_COULD_BE_INDEPENDENT_P?

Thanks,
Richard.

> Thanks,
> Richard
>
>
> 2017-07-27  Richard Sandiford  <richard.sandiford@linaro.org>
>
> gcc/
>         * tree-data-ref.h (subscript): Add access_fn field.
>         (data_dependence_relation): Add could_be_independent_p.
>         (SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
>         (same_access_functions): Move to tree-data-ref.c.
>         * tree-data-ref.c (ref_contains_union_access_p): New function.
>         (access_fn_component_p): Likewise.
>         (access_fn_components_comparable_p): Likewise.
>         (dr_analyze_indices): Add a reference to access_fn_component_p.
>         (dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
>         DR_ACCESS_FN.
>         (constant_access_functions): Likewise.
>         (add_other_self_distances): Likewise.
>         (same_access_functions): Likewise.  (Moved from tree-data-ref.h.)
>         (initialize_data_dependence_relation): Use XCNEW and remove
>         explicit zeroing of DDR_REVERSED_P.  Look for a subsequence
>         of access functions that have the same type.  Allow the
>         subsequence to end with different bases in some circumstances.
>         Record the chosen access functions in SUB_ACCESS_FN.
>         (build_classic_dist_vector_1): Replace ddr_a and ddr_b with
>         a_index and b_index.  Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
>         (subscript_dependence_tester_1): Likewise dra and drb.
>         (build_classic_dist_vector): Update calls accordingly.
>         (subscript_dependence_tester): Likewise.
>         * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
>         DDR_COULD_BE_INDEPENDENT_P.
>         * tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
>         comp_alias_ddrs instead of may_alias_ddrs.
>         * tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
>         New function.
>         (vect_analyze_data_ref_dependence): Use it if
>         DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
>         distance vectors if that fails.
>         (dependence_distance_ge_vf): New function.
>         (vect_prune_runtime_alias_test_list): Use it.  Don't clear
>         LOOP_VINFO_MAY_ALIAS_DDRS.
>
> gcc/testsuite/
>         * gcc.dg/vect/vect-alias-check-3.c: New test.
>         * gcc.dg/vect/vect-alias-check-4.c: Likewise.
>         * gcc.dg/vect/vect-alias-check-5.c: Likewise.
>
> Index: gcc/tree-data-ref.h
> ===================================================================
> --- gcc/tree-data-ref.h 2017-07-27 13:10:29.620045506 +0100
> +++ gcc/tree-data-ref.h 2017-07-27 13:10:33.023912613 +0100
> @@ -260,6 +260,9 @@ struct conflict_function
>
>  struct subscript
>  {
> +  /* The access functions of the two references.  */
> +  tree access_fn[2];
> +
>    /* A description of the iterations for which the elements are
>       accessed twice.  */
>    conflict_function *conflicting_iterations_in_a;
> @@ -278,6 +281,7 @@ struct subscript
>
>  typedef struct subscript *subscript_p;
>
> +#define SUB_ACCESS_FN(SUB, I) (SUB)->access_fn[I]
>  #define SUB_CONFLICTS_IN_A(SUB) (SUB)->conflicting_iterations_in_a
>  #define SUB_CONFLICTS_IN_B(SUB) (SUB)->conflicting_iterations_in_b
>  #define SUB_LAST_CONFLICT(SUB) (SUB)->last_conflict
> @@ -333,6 +337,33 @@ struct data_dependence_relation
>    /* Set to true when the dependence relation is on the same data
>       access.  */
>    bool self_reference_p;
> +
> +  /* True if the dependence described is conservatively correct rather
> +     than exact, and if it is still possible for the accesses to be
> +     conditionally independent.  For example, the a and b references in:
> +
> +       struct s *a, *b;
> +       for (int i = 0; i < n; ++i)
> +         a->f[i] += b->f[i];
> +
> +     conservatively have a distance vector of (0), for the case in which
> +     a == b, but the accesses are independent if a != b.  Similarly,
> +     the a and b references in:
> +
> +       struct s *a, *b;
> +       for (int i = 0; i < n; ++i)
> +         a[0].f[i] += b[i].f[i];
> +
> +     conservatively have a distance vector of (0), but they are indepenent
> +     when a != b + i.  In contrast, the references in:
> +
> +       struct s *a;
> +       for (int i = 0; i < n; ++i)
> +         a->f[i] += a->f[i];
> +
> +     have the same distance vector of (0), but the accesses can never be
> +     independent.  */
> +  bool could_be_independent_p;
>  };
>
>  typedef struct data_dependence_relation *ddr_p;
> @@ -363,6 +394,7 @@ #define DDR_DIR_VECT(DDR, I) \
>  #define DDR_DIST_VECT(DDR, I) \
>    DDR_DIST_VECTS (DDR)[I]
>  #define DDR_REVERSED_P(DDR) (DDR)->reversed_p
> +#define DDR_COULD_BE_INDEPENDENT_P(DDR) (DDR)->could_be_independent_p
>
>
>  bool dr_analyze_innermost (innermost_loop_behavior *, tree, struct loop *);
> @@ -457,22 +489,6 @@ same_data_refs (data_reference_p a, data
>        return false;
>
>    return true;
> -}
> -
> -/* Return true when the DDR contains two data references that have the
> -   same access functions.  */
> -
> -static inline bool
> -same_access_functions (const struct data_dependence_relation *ddr)
> -{
> -  unsigned i;
> -
> -  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
> -    if (!eq_evolutions_p (DR_ACCESS_FN (DDR_A (ddr), i),
> -                         DR_ACCESS_FN (DDR_B (ddr), i)))
> -      return false;
> -
> -  return true;
>  }
>
>  /* Returns true when all the dependences are computable.  */
> Index: gcc/tree-data-ref.c
> ===================================================================
> --- gcc/tree-data-ref.c 2017-07-27 13:10:29.620045506 +0100
> +++ gcc/tree-data-ref.c 2017-07-27 13:10:33.023912613 +0100
> @@ -124,8 +124,7 @@ Software Foundation; either version 3, o
>  } dependence_stats;
>
>  static bool subscript_dependence_tester_1 (struct data_dependence_relation *,
> -                                          struct data_reference *,
> -                                          struct data_reference *,
> +                                          unsigned int, unsigned int,
>                                            struct loop *);
>  /* Returns true iff A divides B.  */
>
> @@ -145,6 +144,21 @@ int_divides_p (int a, int b)
>    return ((b % a) == 0);
>  }
>
> +/* Return true if reference REF contains a union access.  */
> +
> +static bool
> +ref_contains_union_access_p (tree ref)
> +{
> +  while (handled_component_p (ref))
> +    {
> +      ref = TREE_OPERAND (ref, 0);
> +      if (TREE_CODE (TREE_TYPE (ref)) == UNION_TYPE
> +         || TREE_CODE (TREE_TYPE (ref)) == QUAL_UNION_TYPE)
> +       return true;
> +    }
> +  return false;
> +}
> +
>
>
>  /* Dump into FILE all the data references from DATAREFS.  */
> @@ -434,13 +448,14 @@ dump_data_dependence_relation (FILE *out
>        unsigned int i;
>        struct loop *loopi;
>
> -      for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
> +      subscript *sub;
> +      FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>         {
>           fprintf (outf, "  access_fn_A: ");
> -         print_generic_stmt (outf, DR_ACCESS_FN (dra, i));
> +         print_generic_stmt (outf, SUB_ACCESS_FN (sub, 0));
>           fprintf (outf, "  access_fn_B: ");
> -         print_generic_stmt (outf, DR_ACCESS_FN (drb, i));
> -         dump_subscript (outf, DDR_SUBSCRIPT (ddr, i));
> +         print_generic_stmt (outf, SUB_ACCESS_FN (sub, 1));
> +         dump_subscript (outf, sub);
>         }
>
>        fprintf (outf, "  inner loop index: %d\n", DDR_INNER_LOOP (ddr));
> @@ -920,6 +935,27 @@ dr_analyze_innermost (innermost_loop_beh
>    return true;
>  }
>
> +/* Return true if OP is a valid component reference for a DR access
> +   function.  This accepts a subset of what handled_component_p accepts.  */
> +
> +static bool
> +access_fn_component_p (tree op)
> +{
> +  switch (TREE_CODE (op))
> +    {
> +    case REALPART_EXPR:
> +    case IMAGPART_EXPR:
> +    case ARRAY_REF:
> +      return true;
> +
> +    case COMPONENT_REF:
> +      return TREE_CODE (TREE_TYPE (TREE_OPERAND (op, 0))) == RECORD_TYPE;
> +
> +    default:
> +      return false;
> +    }
> +}
> +
>  /* Determines the base object and the list of indices of memory reference
>     DR, analyzed in LOOP and instantiated in loop nest NEST.  */
>
> @@ -957,7 +993,9 @@ dr_analyze_indices (struct data_referenc
>        access_fns.safe_push (integer_one_node);
>      }
>
> -  /* Analyze access functions of dimensions we know to be independent.  */
> +  /* Analyze access functions of dimensions we know to be independent.
> +     The list of component references handled here should be kept in
> +     sync with access_fn_component_p.  */
>    while (handled_component_p (ref))
>      {
>        if (TREE_CODE (ref) == ARRAY_REF)
> @@ -2148,6 +2186,38 @@ dr_may_alias_p (const struct data_refere
>    return refs_may_alias_p (addr_a, addr_b);
>  }
>
> +/* REF_A and REF_B both satisfy access_fn_component_p.  Return true
> +   if it is meaningful to compare their associated access functions
> +   when checking for dependencies.  */
> +
> +static bool
> +access_fn_components_comparable_p (tree ref_a, tree ref_b)
> +{
> +  /* Allow pairs of component refs from the following sets:
> +
> +       { REALPART_EXPR, IMAGPART_EXPR }
> +       { COMPONENT_REF }
> +       { ARRAY_REF }.  */
> +  tree_code code_a = TREE_CODE (ref_a);
> +  tree_code code_b = TREE_CODE (ref_b);
> +  if (code_a == IMAGPART_EXPR)
> +    code_a = REALPART_EXPR;
> +  if (code_b == IMAGPART_EXPR)
> +    code_b = REALPART_EXPR;
> +  if (code_a != code_b)
> +    return false;
> +
> +  if (TREE_CODE (ref_a) == COMPONENT_REF)
> +    /* ??? We cannot simply use the type of operand #0 of the refs here as
> +       the Fortran compiler smuggles type punning into COMPONENT_REFs.
> +       Use the DECL_CONTEXT of the FIELD_DECLs instead.  */
> +    return (DECL_CONTEXT (TREE_OPERAND (ref_a, 1))
> +           == DECL_CONTEXT (TREE_OPERAND (ref_b, 1)));
> +
> +  return types_compatible_p (TREE_TYPE (TREE_OPERAND (ref_a, 0)),
> +                            TREE_TYPE (TREE_OPERAND (ref_b, 0)));
> +}
> +
>  /* Initialize a data dependence relation between data accesses A and
>     B.  NB_LOOPS is the number of loops surrounding the references: the
>     size of the classic distance/direction vectors.  */
> @@ -2160,11 +2230,10 @@ initialize_data_dependence_relation (str
>    struct data_dependence_relation *res;
>    unsigned int i;
>
> -  res = XNEW (struct data_dependence_relation);
> +  res = XCNEW (struct data_dependence_relation);
>    DDR_A (res) = a;
>    DDR_B (res) = b;
>    DDR_LOOP_NEST (res).create (0);
> -  DDR_REVERSED_P (res) = false;
>    DDR_SUBSCRIPTS (res).create (0);
>    DDR_DIR_VECTS (res).create (0);
>    DDR_DIST_VECTS (res).create (0);
> @@ -2182,82 +2251,277 @@ initialize_data_dependence_relation (str
>        return res;
>      }
>
> -  /* The case where the references are exactly the same.  */
> -  if (operand_equal_p (DR_REF (a), DR_REF (b), 0))
> +  unsigned int num_dimensions_a = DR_NUM_DIMENSIONS (a);
> +  unsigned int num_dimensions_b = DR_NUM_DIMENSIONS (b);
> +  if (num_dimensions_a == 0 || num_dimensions_b == 0)
>      {
> -      if ((loop_nest.exists ()
> -          && !object_address_invariant_in_loop_p (loop_nest[0],
> -                                                  DR_BASE_OBJECT (a)))
> -         || DR_NUM_DIMENSIONS (a) == 0)
> +      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> +      return res;
> +    }
> +
> +  /* For unconstrained bases, the root (highest-indexed) subscript
> +     describes a variation in the base of the original DR_REF rather
> +     than a component access.  We have no type that accurately describes
> +     the new DR_BASE_OBJECT (whose TREE_TYPE describes the type *after*
> +     applying this subscript) so limit the search to the last real
> +     component access.
> +
> +     E.g. for:
> +
> +       void
> +       f (int a[][8], int b[][8])
> +       {
> +         for (int i = 0; i < 8; ++i)
> +           a[i * 2][0] = b[i][0];
> +       }
> +
> +     the a and b accesses have a single ARRAY_REF component reference [0]
> +     but have two subscripts.  */
> +  if (DR_UNCONSTRAINED_BASE (a))
> +    num_dimensions_a -= 1;
> +  if (DR_UNCONSTRAINED_BASE (b))
> +    num_dimensions_b -= 1;
> +
> +  /* These structures describe sequences of component references in
> +     DR_REF (A) and DR_REF (B).  Each component reference is tied to a
> +     specific access function.  */
> +  struct {
> +    /* The sequence starts at DR_ACCESS_FN (A, START_A) of A and
> +       DR_ACCESS_FN (B, START_B) of B (inclusive) and extends to higher
> +       indices.  In C notation, these are the indices of the rightmost
> +       component references; e.g. for a sequence .b.c.d, the start
> +       index is for .d.  */
> +    unsigned int start_a;
> +    unsigned int start_b;
> +
> +    /* The sequence contains LENGTH consecutive access functions from
> +       each DR.  */
> +    unsigned int length;
> +
> +    /* The enclosing objects for the A and B sequences respectively,
> +       i.e. the objects to which DR_ACCESS_FN (A, START_A + LENGTH - 1)
> +       and DR_ACCESS_FN (B, START_B + LENGTH - 1) are applied.  */
> +    tree object_a;
> +    tree object_b;
> +  } full_seq = {}, struct_seq = {};
> +
> +  /* Before each iteration of the loop:
> +
> +     - REF_A is what you get after applying DR_ACCESS_FN (A, INDEX_A) and
> +     - REF_B is what you get after applying DR_ACCESS_FN (B, INDEX_B).  */
> +  unsigned int index_a = 0;
> +  unsigned int index_b = 0;
> +  tree ref_a = DR_REF (a);
> +  tree ref_b = DR_REF (b);
> +
> +  /* Now walk the component references from the final DR_REFs back up to
> +     the enclosing base objects.  Each component reference corresponds
> +     to one access function in the DR, with access function 0 being for
> +     the final DR_REF and the highest-indexed access function being the
> +     one that is applied to the base of the DR.
> +
> +     Look for a sequence of component references whose access functions
> +     are comparable (see access_fn_components_comparable_p).  If more
> +     than one such sequence exists, pick the one nearest the base
> +     (which is the leftmost sequence in C notation).  Store this sequence
> +     in FULL_SEQ.
> +
> +     For example, if we have:
> +
> +       struct foo { struct bar s; ... } (*a)[10], (*b)[10];
> +
> +       A: a[0][i].s.c.d
> +       B: __real b[0][i].s.e[i].f
> +
> +     (where d is the same type as the real component of f) then the access
> +     functions would be:
> +
> +                        0   1   2   3
> +       A:              .d  .c  .s [i]
> +
> +                0   1   2   3   4   5
> +       B:  __real  .f [i]  .e  .s [i]
> +
> +     The A0/B2 column isn't comparable, since .d is a COMPONENT_REF
> +     and [i] is an ARRAY_REF.  However, the A1/B3 column contains two
> +     COMPONENT_REF accesses for struct bar, so is comparable.  Likewise
> +     the A2/B4 column contains two COMPONENT_REF accesses for struct foo,
> +     so is comparable.  The A3/B5 column contains two ARRAY_REFs that
> +     index foo[10] arrays, so is again comparable.  The sequence is
> +     therefore:
> +
> +        A: [1, 3]  (i.e. [i].s.c)
> +        B: [3, 5]  (i.e. [i].s.e)
> +
> +     Also look for sequences of component references whose access
> +     functions are comparable and whose enclosing objects have the same
> +     RECORD_TYPE.  Store this sequence in STRUCT_SEQ.  In the above
> +     example, STRUCT_SEQ would be:
> +
> +        A: [1, 2]  (i.e. s.c)
> +        B: [3, 4]  (i.e. s.e)  */
> +  while (index_a < num_dimensions_a && index_b < num_dimensions_b)
> +    {
> +      /* REF_A and REF_B must be one of the component access types
> +        allowed by dr_analyze_indices.  */
> +      gcc_checking_assert (access_fn_component_p (ref_a));
> +      gcc_checking_assert (access_fn_component_p (ref_b));
> +
> +      /* Get the immediately-enclosing objects for REF_A and REF_B,
> +        i.e. the references *before* applying DR_ACCESS_FN (A, INDEX_A)
> +        and DR_ACCESS_FN (B, INDEX_B).  */
> +      tree object_a = TREE_OPERAND (ref_a, 0);
> +      tree object_b = TREE_OPERAND (ref_b, 0);
> +
> +      tree type_a = TREE_TYPE (object_a);
> +      tree type_b = TREE_TYPE (object_b);
> +      if (access_fn_components_comparable_p (ref_a, ref_b))
> +       {
> +         /* This pair of component accesses is comparable for dependence
> +            analysis, so we can include DR_ACCESS_FN (A, INDEX_A) and
> +            DR_ACCESS_FN (B, INDEX_B) in the sequence.  */
> +         if (full_seq.start_a + full_seq.length != index_a
> +             || full_seq.start_b + full_seq.length != index_b)
> +           {
> +             /* The accesses don't extend the current sequence,
> +                so start a new one here.  */
> +             full_seq.start_a = index_a;
> +             full_seq.start_b = index_b;
> +             full_seq.length = 0;
> +           }
> +
> +         /* Add this pair of references to the sequence.  */
> +         full_seq.length += 1;
> +         full_seq.object_a = object_a;
> +         full_seq.object_b = object_b;
> +
> +         /* If the enclosing objects are structures (and thus have the
> +            same RECORD_TYPE), record the new sequence in STRUCT_SEQ.  */
> +         if (TREE_CODE (type_a) == RECORD_TYPE)
> +           struct_seq = full_seq;
> +
> +         /* Move to the next containing reference for both A and B.  */
> +         ref_a = object_a;
> +         ref_b = object_b;
> +         index_a += 1;
> +         index_b += 1;
> +         continue;
> +       }
> +
> +      /* Try to approach equal type sizes.  */
> +      if (!COMPLETE_TYPE_P (type_a)
> +         || !COMPLETE_TYPE_P (type_b)
> +         || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
> +         || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
> +       break;
> +
> +      unsigned HOST_WIDE_INT size_a = tree_to_uhwi (TYPE_SIZE_UNIT (type_a));
> +      unsigned HOST_WIDE_INT size_b = tree_to_uhwi (TYPE_SIZE_UNIT (type_b));
> +      if (size_a <= size_b)
>         {
> -         DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> -         return res;
> +         index_a += 1;
> +         ref_a = object_a;
> +       }
> +      if (size_b <= size_a)
> +       {
> +         index_b += 1;
> +         ref_b = object_b;
>         }
> -      DDR_AFFINE_P (res) = true;
> -      DDR_ARE_DEPENDENT (res) = NULL_TREE;
> -      DDR_SUBSCRIPTS (res).create (DR_NUM_DIMENSIONS (a));
> -      DDR_LOOP_NEST (res) = loop_nest;
> -      DDR_INNER_LOOP (res) = 0;
> -      DDR_SELF_REFERENCE (res) = true;
> -      for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
> -       {
> -         struct subscript *subscript;
> -
> -         subscript = XNEW (struct subscript);
> -         SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
> -         SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
> -         SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
> -         SUB_DISTANCE (subscript) = chrec_dont_know;
> -         DDR_SUBSCRIPTS (res).safe_push (subscript);
> -       }
> -      return res;
>      }
>
> -  /* If the references do not access the same object, we do not know
> -     whether they alias or not.  We do not care about TBAA or alignment
> -     info so we can use OEP_ADDRESS_OF to avoid false negatives.
> -     But the accesses have to use compatible types as otherwise the
> -     built indices would not match.  */
> -  if (!operand_equal_p (DR_BASE_OBJECT (a), DR_BASE_OBJECT (b), OEP_ADDRESS_OF)
> -      || !types_compatible_p (TREE_TYPE (DR_BASE_OBJECT (a)),
> -                             TREE_TYPE (DR_BASE_OBJECT (b))))
> +  /* See whether FULL_SEQ ends at the base and whether the two bases
> +     are equal.  We do not care about TBAA or alignment info so we can
> +     use OEP_ADDRESS_OF to avoid false negatives.  */
> +  tree base_a = DR_BASE_OBJECT (a);
> +  tree base_b = DR_BASE_OBJECT (b);
> +  bool same_base_p = (full_seq.start_a + full_seq.length == num_dimensions_a
> +                     && full_seq.start_b + full_seq.length == num_dimensions_b
> +                     && DR_UNCONSTRAINED_BASE (a) == DR_UNCONSTRAINED_BASE (b)
> +                     && operand_equal_p (base_a, base_b, OEP_ADDRESS_OF)
> +                     && types_compatible_p (TREE_TYPE (base_a),
> +                                            TREE_TYPE (base_b))
> +                     && (!loop_nest.exists ()
> +                         || (object_address_invariant_in_loop_p
> +                             (loop_nest[0], base_a))));
> +
> +  /* If the bases are the same, we can include the base variation too.
> +     E.g. the b accesses in:
> +
> +       for (int i = 0; i < n; ++i)
> +         b[i + 4][0] = b[i][0];
> +
> +     have a definite dependence distance of 4, while for:
> +
> +       for (int i = 0; i < n; ++i)
> +         a[i + 4][0] = b[i][0];
> +
> +     the dependence distance depends on the gap between a and b.
> +
> +     If the bases are different then we can only rely on the sequence
> +     rooted at a structure access, since arrays are allowed to overlap
> +     arbitrarily and change shape arbitrarily.  E.g. we treat this as
> +     valid code:
> +
> +       int a[256];
> +       ...
> +       ((int (*)[4][3]) &a[1])[i][0] += ((int (*)[4][3]) &a[2])[i][0];
> +
> +     where two lvalues with the same int[4][3] type overlap, and where
> +     both lvalues are distinct from the object's declared type.  */
> +  if (same_base_p)
>      {
> -      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> -      return res;
> +      if (DR_UNCONSTRAINED_BASE (a))
> +       full_seq.length += 1;
>      }
> +  else
> +    full_seq = struct_seq;
>
> -  /* If the base of the object is not invariant in the loop nest, we cannot
> -     analyze it.  TODO -- in fact, it would suffice to record that there may
> -     be arbitrary dependences in the loops where the base object varies.  */
> -  if ((loop_nest.exists ()
> -       && !object_address_invariant_in_loop_p (loop_nest[0], DR_BASE_OBJECT (a)))
> -      || DR_NUM_DIMENSIONS (a) == 0)
> +  /* Punt if we didn't find a suitable sequence.  */
> +  if (full_seq.length == 0)
>      {
>        DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>        return res;
>      }
>
> -  /* If the number of dimensions of the access to not agree we can have
> -     a pointer access to a component of the array element type and an
> -     array access while the base-objects are still the same.  Punt.  */
> -  if (DR_NUM_DIMENSIONS (a) != DR_NUM_DIMENSIONS (b))
> +  if (!same_base_p)
>      {
> -      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> -      return res;
> +      /* Partial overlap is possible for different bases when strict aliasing
> +        is not in effect.  It's also possible if either base involves a union
> +        access; e.g. for:
> +
> +          struct s1 { int a[2]; };
> +          struct s2 { struct s1 b; int c; };
> +          struct s3 { int d; struct s1 e; };
> +          union u { struct s2 f; struct s3 g; } *p, *q;
> +
> +        the s1 at "p->f.b" (base "p->f") partially overlaps the s1 at
> +        "p->g.e" (base "p->g") and might partially overlap the s1 at
> +        "q->g.e" (base "q->g").  */
> +      if (!flag_strict_aliasing
> +         || ref_contains_union_access_p (full_seq.object_a)
> +         || ref_contains_union_access_p (full_seq.object_b))
> +       {
> +         DDR_ARE_DEPENDENT (res) = chrec_dont_know;
> +         return res;
> +       }
> +
> +      DDR_COULD_BE_INDEPENDENT_P (res) = true;
>      }
>
>    DDR_AFFINE_P (res) = true;
>    DDR_ARE_DEPENDENT (res) = NULL_TREE;
> -  DDR_SUBSCRIPTS (res).create (DR_NUM_DIMENSIONS (a));
> +  DDR_SUBSCRIPTS (res).create (full_seq.length);
>    DDR_LOOP_NEST (res) = loop_nest;
>    DDR_INNER_LOOP (res) = 0;
>    DDR_SELF_REFERENCE (res) = false;
>
> -  for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
> +  for (i = 0; i < full_seq.length; ++i)
>      {
>        struct subscript *subscript;
>
>        subscript = XNEW (struct subscript);
> +      SUB_ACCESS_FN (subscript, 0) = DR_ACCESS_FN (a, full_seq.start_a + i);
> +      SUB_ACCESS_FN (subscript, 1) = DR_ACCESS_FN (b, full_seq.start_b + i);
>        SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
>        SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
>        SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
> @@ -3839,14 +4103,15 @@ add_outer_distances (struct data_depende
>  }
>
>  /* Return false when fail to represent the data dependence as a
> -   distance vector.  INIT_B is set to true when a component has been
> +   distance vector.  A_INDEX is the index of the first reference
> +   (0 for DDR_A, 1 for DDR_B) and B_INDEX is the index of the
> +   second reference.  INIT_B is set to true when a component has been
>     added to the distance vector DIST_V.  INDEX_CARRY is then set to
>     the index in DIST_V that carries the dependence.  */
>
>  static bool
>  build_classic_dist_vector_1 (struct data_dependence_relation *ddr,
> -                            struct data_reference *ddr_a,
> -                            struct data_reference *ddr_b,
> +                            unsigned int a_index, unsigned int b_index,
>                              lambda_vector dist_v, bool *init_b,
>                              int *index_carry)
>  {
> @@ -3864,8 +4129,8 @@ build_classic_dist_vector_1 (struct data
>           return false;
>         }
>
> -      access_fn_a = DR_ACCESS_FN (ddr_a, i);
> -      access_fn_b = DR_ACCESS_FN (ddr_b, i);
> +      access_fn_a = SUB_ACCESS_FN (subscript, a_index);
> +      access_fn_b = SUB_ACCESS_FN (subscript, b_index);
>
>        if (TREE_CODE (access_fn_a) == POLYNOMIAL_CHREC
>           && TREE_CODE (access_fn_b) == POLYNOMIAL_CHREC)
> @@ -3925,10 +4190,11 @@ build_classic_dist_vector_1 (struct data
>  constant_access_functions (const struct data_dependence_relation *ddr)
>  {
>    unsigned i;
> +  subscript *sub;
>
> -  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
> -    if (!evolution_function_is_constant_p (DR_ACCESS_FN (DDR_A (ddr), i))
> -       || !evolution_function_is_constant_p (DR_ACCESS_FN (DDR_B (ddr), i)))
> +  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
> +    if (!evolution_function_is_constant_p (SUB_ACCESS_FN (sub, 0))
> +       || !evolution_function_is_constant_p (SUB_ACCESS_FN (sub, 1)))
>        return false;
>
>    return true;
> @@ -3991,10 +4257,11 @@ add_other_self_distances (struct data_de
>    lambda_vector dist_v;
>    unsigned i;
>    int index_carry = DDR_NB_LOOPS (ddr);
> +  subscript *sub;
>
> -  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
> +  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>      {
> -      tree access_fun = DR_ACCESS_FN (DDR_A (ddr), i);
> +      tree access_fun = SUB_ACCESS_FN (sub, 0);
>
>        if (TREE_CODE (access_fun) == POLYNOMIAL_CHREC)
>         {
> @@ -4006,7 +4273,7 @@ add_other_self_distances (struct data_de
>                   return;
>                 }
>
> -             access_fun = DR_ACCESS_FN (DDR_A (ddr), 0);
> +             access_fun = SUB_ACCESS_FN (DDR_SUBSCRIPT (ddr, 0), 0);
>
>               if (TREE_CODE (CHREC_LEFT (access_fun)) == POLYNOMIAL_CHREC)
>                 add_multivariate_self_dist (ddr, access_fun);
> @@ -4077,6 +4344,23 @@ add_distance_for_zero_overlaps (struct d
>      }
>  }
>
> +/* Return true when the DDR contains two data references that have the
> +   same access functions.  */
> +
> +static inline bool
> +same_access_functions (const struct data_dependence_relation *ddr)
> +{
> +  unsigned i;
> +  subscript *sub;
> +
> +  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
> +    if (!eq_evolutions_p (SUB_ACCESS_FN (sub, 0),
> +                         SUB_ACCESS_FN (sub, 1)))
> +      return false;
> +
> +  return true;
> +}
> +
>  /* Compute the classic per loop distance vector.  DDR is the data
>     dependence relation to build a vector from.  Return false when fail
>     to represent the data dependence as a distance vector.  */
> @@ -4108,8 +4392,7 @@ build_classic_dist_vector (struct data_d
>      }
>
>    dist_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
> -  if (!build_classic_dist_vector_1 (ddr, DDR_A (ddr), DDR_B (ddr),
> -                                   dist_v, &init_b, &index_carry))
> +  if (!build_classic_dist_vector_1 (ddr, 0, 1, dist_v, &init_b, &index_carry))
>      return false;
>
>    /* Save the distance vector if we initialized one.  */
> @@ -4142,12 +4425,11 @@ build_classic_dist_vector (struct data_d
>        if (!lambda_vector_lexico_pos (dist_v, DDR_NB_LOOPS (ddr)))
>         {
>           lambda_vector save_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
> -         if (!subscript_dependence_tester_1 (ddr, DDR_B (ddr), DDR_A (ddr),
> -                                             loop_nest))
> +         if (!subscript_dependence_tester_1 (ddr, 1, 0, loop_nest))
>             return false;
>           compute_subscript_distance (ddr);
> -         if (!build_classic_dist_vector_1 (ddr, DDR_B (ddr), DDR_A (ddr),
> -                                           save_v, &init_b, &index_carry))
> +         if (!build_classic_dist_vector_1 (ddr, 1, 0, save_v, &init_b,
> +                                           &index_carry))
>             return false;
>           save_dist_v (ddr, save_v);
>           DDR_REVERSED_P (ddr) = true;
> @@ -4183,12 +4465,10 @@ build_classic_dist_vector (struct data_d
>             {
>               lambda_vector opposite_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
>
> -             if (!subscript_dependence_tester_1 (ddr, DDR_B (ddr),
> -                                                 DDR_A (ddr), loop_nest))
> +             if (!subscript_dependence_tester_1 (ddr, 1, 0, loop_nest))
>                 return false;
>               compute_subscript_distance (ddr);
> -             if (!build_classic_dist_vector_1 (ddr, DDR_B (ddr), DDR_A (ddr),
> -                                               opposite_v, &init_b,
> +             if (!build_classic_dist_vector_1 (ddr, 1, 0, opposite_v, &init_b,
>                                                 &index_carry))
>                 return false;
>
> @@ -4267,13 +4547,13 @@ build_classic_dir_vector (struct data_de
>      }
>  }
>
> -/* Helper function.  Returns true when there is a dependence between
> -   data references DRA and DRB.  */
> +/* Helper function.  Returns true when there is a dependence between the
> +   data references.  A_INDEX is the index of the first reference (0 for
> +   DDR_A, 1 for DDR_B) and B_INDEX is the index of the second reference.  */
>
>  static bool
>  subscript_dependence_tester_1 (struct data_dependence_relation *ddr,
> -                              struct data_reference *dra,
> -                              struct data_reference *drb,
> +                              unsigned int a_index, unsigned int b_index,
>                                struct loop *loop_nest)
>  {
>    unsigned int i;
> @@ -4285,8 +4565,8 @@ subscript_dependence_tester_1 (struct da
>      {
>        conflict_function *overlaps_a, *overlaps_b;
>
> -      analyze_overlapping_iterations (DR_ACCESS_FN (dra, i),
> -                                     DR_ACCESS_FN (drb, i),
> +      analyze_overlapping_iterations (SUB_ACCESS_FN (subscript, a_index),
> +                                     SUB_ACCESS_FN (subscript, b_index),
>                                       &overlaps_a, &overlaps_b,
>                                       &last_conflicts, loop_nest);
>
> @@ -4335,7 +4615,7 @@ subscript_dependence_tester_1 (struct da
>  subscript_dependence_tester (struct data_dependence_relation *ddr,
>                              struct loop *loop_nest)
>  {
> -  if (subscript_dependence_tester_1 (ddr, DDR_A (ddr), DDR_B (ddr), loop_nest))
> +  if (subscript_dependence_tester_1 (ddr, 0, 1, loop_nest))
>      dependence_stats.num_dependence_dependent++;
>
>    compute_subscript_distance (ddr);
> Index: gcc/tree-ssa-loop-prefetch.c
> ===================================================================
> --- gcc/tree-ssa-loop-prefetch.c        2017-07-27 13:10:29.620045506 +0100
> +++ gcc/tree-ssa-loop-prefetch.c        2017-07-27 13:10:33.023912613 +0100
> @@ -1668,6 +1668,7 @@ determine_loop_nest_reuse (struct loop *
>        refb = (struct mem_ref *) DDR_B (dep)->aux;
>
>        if (DDR_ARE_DEPENDENT (dep) == chrec_dont_know
> +         || DDR_COULD_BE_INDEPENDENT_P (dep)
>           || DDR_NUM_DIST_VECTS (dep) == 0)
>         {
>           /* If the dependence cannot be analyzed, assume that there might be
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h       2017-07-27 13:10:29.620045506 +0100
> +++ gcc/tree-vectorizer.h       2017-07-27 13:10:33.024912868 +0100
> @@ -358,7 +358,7 @@ #define LOOP_VINFO_ORIG_LOOP_INFO(L)
>  #define LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT(L)      \
>    ((L)->may_misalign_stmts.length () > 0)
>  #define LOOP_REQUIRES_VERSIONING_FOR_ALIAS(L)          \
> -  ((L)->may_alias_ddrs.length () > 0)
> +  ((L)->comp_alias_ddrs.length () > 0)
>  #define LOOP_REQUIRES_VERSIONING_FOR_NITERS(L)         \
>    (LOOP_VINFO_NITERS_ASSUMPTIONS (L))
>  #define LOOP_REQUIRES_VERSIONING(L)                    \
> Index: gcc/tree-vect-data-refs.c
> ===================================================================
> --- gcc/tree-vect-data-refs.c   2017-07-27 13:10:29.620045506 +0100
> +++ gcc/tree-vect-data-refs.c   2017-07-27 13:10:33.024912868 +0100
> @@ -160,6 +160,60 @@ vect_mark_for_runtime_alias_test (ddr_p
>  }
>
>
> +/* A subroutine of vect_analyze_data_ref_dependence.  Handle
> +   DDR_COULD_BE_INDEPENDENT_P ddr DDR that has a known set of dependence
> +   distances.  These distances are conservatively correct but they don't
> +   reflect a guaranteed dependence.
> +
> +   Return true if this function does all the work necessary to avoid
> +   an alias or false if the caller should use the dependence distances
> +   to limit the vectorization factor in the usual way.  LOOP_DEPTH is
> +   the depth of the loop described by LOOP_VINFO and the other arguments
> +   are as for vect_analyze_data_ref_dependence.  */
> +
> +static bool
> +vect_analyze_possibly_independent_ddr (data_dependence_relation *ddr,
> +                                      loop_vec_info loop_vinfo,
> +                                      int loop_depth, int *max_vf)
> +{
> +  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
> +  lambda_vector dist_v;
> +  unsigned int i;
> +  FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
> +    {
> +      int dist = dist_v[loop_depth];
> +      if (dist != 0 && !(dist > 0 && DDR_REVERSED_P (ddr)))
> +       {
> +         /* If the user asserted safelen >= DIST consecutive iterations
> +            can be executed concurrently, assume independence.
> +
> +            ??? An alternative would be to add the alias check even
> +            in this case, and vectorize the fallback loop with the
> +            maximum VF set to safelen.  However, if the user has
> +            explicitly given a length, it's less likely that that
> +            would be a win.  */
> +         if (loop->safelen >= 2 && abs_hwi (dist) <= loop->safelen)
> +           {
> +             if (loop->safelen < *max_vf)
> +               *max_vf = loop->safelen;
> +             LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo) = false;
> +             continue;
> +           }
> +
> +         /* For dependence distances of 2 or more, we have the option
> +            of limiting VF or checking for an alias at runtime.
> +            Prefer to check at runtime if we can, to avoid limiting
> +            the VF unnecessarily when the bases are in fact independent.
> +
> +            Note that the alias checks will be removed if the VF ends up
> +            being small enough.  */
> +         return vect_mark_for_runtime_alias_test (ddr, loop_vinfo);
> +       }
> +    }
> +  return true;
> +}
> +
> +
>  /* Function vect_analyze_data_ref_dependence.
>
>     Return TRUE if there (might) exist a dependence between a memory-reference
> @@ -305,6 +359,12 @@ vect_analyze_data_ref_dependence (struct
>      }
>
>    loop_depth = index_in_loop_nest (loop->num, DDR_LOOP_NEST (ddr));
> +
> +  if (DDR_COULD_BE_INDEPENDENT_P (ddr)
> +      && vect_analyze_possibly_independent_ddr (ddr, loop_vinfo,
> +                                               loop_depth, max_vf))
> +    return false;
> +
>    FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
>      {
>        int dist = dist_v[loop_depth];
> @@ -2878,6 +2938,44 @@ vect_no_alias_p (struct data_reference *
>    return false;
>  }
>
> +/* Return true if the minimum nonzero dependence distance for loop LOOP_DEPTH
> +   in DDR is >= VF.  */
> +
> +static bool
> +dependence_distance_ge_vf (data_dependence_relation *ddr,
> +                          unsigned int loop_depth, unsigned HOST_WIDE_INT vf)
> +{
> +  if (DDR_ARE_DEPENDENT (ddr) != NULL_TREE
> +      || DDR_NUM_DIST_VECTS (ddr) == 0)
> +    return false;
> +
> +  /* If the dependence is exact, we should have limited the VF instead.  */
> +  gcc_checking_assert (DDR_COULD_BE_INDEPENDENT_P (ddr));
> +
> +  unsigned int i;
> +  lambda_vector dist_v;
> +  FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
> +    {
> +      HOST_WIDE_INT dist = dist_v[loop_depth];
> +      if (dist != 0
> +         && !(dist > 0 && DDR_REVERSED_P (ddr))
> +         && (unsigned HOST_WIDE_INT) abs_hwi (dist) < vf)
> +       return false;
> +    }
> +
> +  if (dump_enabled_p ())
> +    {
> +      dump_printf_loc (MSG_NOTE, vect_location,
> +                      "dependence distance between ");
> +      dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (DDR_A (ddr)));
> +      dump_printf (MSG_NOTE,  " and ");
> +      dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (DDR_B (ddr)));
> +      dump_printf (MSG_NOTE,  " is >= VF\n");
> +    }
> +
> +  return true;
> +}
> +
>  /* Function vect_prune_runtime_alias_test_list.
>
>     Prune a list of ddrs to be tested at run-time by versioning for alias.
> @@ -2908,6 +3006,10 @@ vect_prune_runtime_alias_test_list (loop
>
>    comp_alias_ddrs.create (may_alias_ddrs.length ());
>
> +  unsigned int loop_depth
> +    = index_in_loop_nest (LOOP_VINFO_LOOP (loop_vinfo)->num,
> +                         LOOP_VINFO_LOOP_NEST (loop_vinfo));
> +
>    /* First, we collect all data ref pairs for aliasing checks.  */
>    FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
>      {
> @@ -2917,6 +3019,11 @@ vect_prune_runtime_alias_test_list (loop
>        tree segment_length_a, segment_length_b;
>        gimple *stmt_a, *stmt_b;
>
> +      /* Ignore the alias if the VF we chose ended up being no greater
> +        than the dependence distance.  */
> +      if (dependence_distance_ge_vf (ddr, loop_depth, vect_factor))
> +       continue;
> +
>        dr_a = DDR_A (ddr);
>        stmt_a = DR_STMT (DDR_A (ddr));
>        dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
> @@ -2993,10 +3100,6 @@ vect_prune_runtime_alias_test_list (loop
>        return false;
>      }
>
> -  /* All alias checks have been resolved at compilation time.  */
> -  if (!comp_alias_ddrs.length ())
> -    LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo).truncate (0);
> -
>    return true;
>  }
>
> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-3.c
> ===================================================================
> --- /dev/null   2017-07-27 10:25:31.671280760 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-3.c      2017-07-27 13:10:33.022912357 +0100
> @@ -0,0 +1,120 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-additional-options "--param vect-max-version-for-alias-checks=0 -fopenmp-simd" } */
> +
> +/* Intended to be larger than any VF.  */
> +#define GAP 128
> +#define N (GAP * 3)
> +
> +struct s { int x[N + 1]; };
> +struct t { struct s x[N + 1]; };
> +struct u { int x[N + 1]; int y; };
> +struct v { struct s s; };
> +
> +void
> +f1 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a->x[i] += b->x[i];
> +}
> +
> +void
> +f2 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a[1].x[i] += b[2].x[i];
> +}
> +
> +void
> +f3 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a[1].x[i] += b[i].x[i];
> +}
> +
> +void
> +f4 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a[i].x[i] += b[i].x[i];
> +}
> +
> +void
> +f5 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a->x[i] += b->x[i + 1];
> +}
> +
> +void
> +f6 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a[1].x[i] += b[2].x[i + 1];
> +}
> +
> +void
> +f7 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a[1].x[i] += b[i].x[i + 1];
> +}
> +
> +void
> +f8 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a[i].x[i] += b[i].x[i + 1];
> +}
> +
> +void
> +f9 (struct s *a, struct t *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a->x[i] += b->x[1].x[i];
> +}
> +
> +void
> +f10 (struct s *a, struct t *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a->x[i] += b->x[i].x[i];
> +}
> +
> +void
> +f11 (struct u *a, struct u *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a->x[i] += b->x[i] + b[i].y;
> +}
> +
> +void
> +f12 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < GAP; ++i)
> +    a->x[i + GAP] += b->x[i];
> +}
> +
> +void
> +f13 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < GAP * 2; ++i)
> +    a->x[i + GAP] += b->x[i];
> +}
> +
> +void
> +f14 (struct v *a, struct s *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a->s.x[i] = b->x[i];
> +}
> +
> +void
> +f15 (struct s *a, struct s *b)
> +{
> +  #pragma omp simd safelen(N)
> +  for (int i = 0; i < N; ++i)
> +    a->x[i + 1] += b->x[i];
> +}
> +
> +/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 15 "vect" } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-4.c
> ===================================================================
> --- /dev/null   2017-07-27 10:25:31.671280760 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-4.c      2017-07-27 13:10:33.022912357 +0100
> @@ -0,0 +1,35 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_int } */
> +/* { dg-additional-options "--param vect-max-version-for-alias-checks=0" } */
> +
> +#define N 16
> +
> +struct s1 { int a[N]; };
> +struct s2 { struct s1 b; int c; };
> +struct s3 { int d; struct s1 e; };
> +union u { struct s2 f; struct s3 g; };
> +
> +/* We allow a and b to overlap arbitrarily.  */
> +
> +void
> +f1 (int a[][N], int b[][N])
> +{
> +  for (int i = 0; i < N; ++i)
> +    a[0][i] += b[0][i];
> +}
> +
> +void
> +f2 (union u *a, union u *b)
> +{
> +  for (int i = 0; i < N; ++i)
> +    a->f.b.a[i] += b->g.e.a[i];
> +}
> +
> +void
> +f3 (struct s1 *a, struct s1 *b)
> +{
> +  for (int i = 0; i < N - 1; ++i)
> +    a->a[i + 1] += b->a[i];
> +}
> +
> +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c
> ===================================================================
> --- /dev/null   2017-07-27 10:25:31.671280760 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c      2017-07-27 13:10:33.022912357 +0100
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target vect_int } */
> +
> +/* Intended to be larger than any VF.  */
> +#define GAP 128
> +#define N (GAP * 3)
> +
> +struct s { int x[N]; };
> +
> +void
> +f1 (struct s *a, struct s *b)
> +{
> +  for (int i = 0; i < GAP * 2; ++i)
> +    a->x[i + GAP] += b->x[i];
> +}
> +
> +/* { dg-final { scan-tree-dump-times "consider run-time aliasing" 1 "vect" } } */
> +/* { dg-final { scan-tree-dump-times "improved number of alias checks from 1 to 0" 1 "vect" } } */
> +/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 1 "vect" } } */

Richard Sandiford Aug. 4, 2017, 9:28 a.m. UTC | #3

Richard Biener <richard.guenther@gmail.com> writes:
> On Thu, Jul 27, 2017 at 2:19 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>>> Eric Botcazou <ebotcazou@adacore.com> writes:
>>>> [Sorry for missing the previous messages]
>>>>
>>>>> Thanks.  Just been retesting, and I think I must have forgotten
>>>>> to include Ada last time.  It turns out that the patch causes a dg-scan
>>>>> regression in gnat.dg/vect17.adb, because we now think that if the
>>>>> array RECORD_TYPEs *do* alias in:
>>>>>
>>>>>    procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>>>>>    begin
>>>>>       for I in Sarray'Range loop
>>>>>          R(I) := X(I) + Y(I);
>>>>>       end loop;
>>>>>    end;
>>>>>
>>>>> then the dependence distance must be zero.  Eric, does that hold true
>>>>> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
>>>>> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?
>>>>
>>>> Yes, I'd think so (even without the artificial RECORD_TYPE around
> the arrays).
>>>
>>> Good!
>>>
>>>>> 2017-06-07  Richard Sandiford  <richard.sandiford@linaro.org>
>>>>>
>>>>> gcc/testsuite/
>>>>>     * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>>>>>     * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>>>>>     when X = R or Y = R.
>>>>
>>>> I think that you need to modify vect15 and vect16 the same way.
>>>
>>> Ah, yeah.  And doing that shows that I'd not handled safelen for
>>> DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.
>>>
>>> How does this look?  Tested on x86_64-linux-gnu both without the
>>> vectoriser changes and with the fixed vectoriser patch.
>>
>> Here's a version of the patch that handles safelen.  I split the
>> handling out into a new function (vect_analyze_possibly_independent_ddr)
>> since it was getting too big to do inline.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>
> Ok.

Thanks!

> Did you check whether BB vectorization is affected?  See
> vect_slp_analyze_instance_dependence
> and friends.  It's quite conservative but given the prefetching change
> I wonder if we need
> to rule out DDR_COULD_BE_INDEPENDENT_P?

I think it should be OK.  When DDR_COULD_BE_INDEPENDENT_P is set,
we've effectively changed from DDR_ARE_DEPENDENT == chrec_dont_know
to a conservatively-correct distance vector.  It looks like
vect_slp_analyze_data_ref_dependence handles both cases in the
same way (by returning true).

Thanks,
Richard

>
> Thanks,
> Richard.
>
>> Thanks,
>> Richard
>>
>>
>> 2017-07-27  Richard Sandiford  <richard.sandiford@linaro.org>
>>
>> gcc/
>>         * tree-data-ref.h (subscript): Add access_fn field.
>>         (data_dependence_relation): Add could_be_independent_p.
>>         (SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
>>         (same_access_functions): Move to tree-data-ref.c.
>>         * tree-data-ref.c (ref_contains_union_access_p): New function.
>>         (access_fn_component_p): Likewise.
>>         (access_fn_components_comparable_p): Likewise.
>>         (dr_analyze_indices): Add a reference to access_fn_component_p.
>>         (dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
>>         DR_ACCESS_FN.
>>         (constant_access_functions): Likewise.
>>         (add_other_self_distances): Likewise.
>>         (same_access_functions): Likewise.  (Moved from tree-data-ref.h.)
>>         (initialize_data_dependence_relation): Use XCNEW and remove
>>         explicit zeroing of DDR_REVERSED_P.  Look for a subsequence
>>         of access functions that have the same type.  Allow the
>>         subsequence to end with different bases in some circumstances.
>>         Record the chosen access functions in SUB_ACCESS_FN.
>>         (build_classic_dist_vector_1): Replace ddr_a and ddr_b with
>>         a_index and b_index.  Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
>>         (subscript_dependence_tester_1): Likewise dra and drb.
>>         (build_classic_dist_vector): Update calls accordingly.
>>         (subscript_dependence_tester): Likewise.
>>         * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
>>         DDR_COULD_BE_INDEPENDENT_P.
>>         * tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
>>         comp_alias_ddrs instead of may_alias_ddrs.
>>         * tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
>>         New function.
>>         (vect_analyze_data_ref_dependence): Use it if
>>         DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
>>         distance vectors if that fails.
>>         (dependence_distance_ge_vf): New function.
>>         (vect_prune_runtime_alias_test_list): Use it.  Don't clear
>>         LOOP_VINFO_MAY_ALIAS_DDRS.
>>
>> gcc/testsuite/
>>         * gcc.dg/vect/vect-alias-check-3.c: New test.
>>         * gcc.dg/vect/vect-alias-check-4.c: Likewise.
>>         * gcc.dg/vect/vect-alias-check-5.c: Likewise.
>>
>> Index: gcc/tree-data-ref.h
>> ===================================================================
>> --- gcc/tree-data-ref.h 2017-07-27 13:10:29.620045506 +0100
>> +++ gcc/tree-data-ref.h 2017-07-27 13:10:33.023912613 +0100
>> @@ -260,6 +260,9 @@ struct conflict_function
>>
>>  struct subscript
>>  {
>> +  /* The access functions of the two references.  */
>> +  tree access_fn[2];
>> +
>>    /* A description of the iterations for which the elements are
>>       accessed twice.  */
>>    conflict_function *conflicting_iterations_in_a;
>> @@ -278,6 +281,7 @@ struct subscript
>>
>>  typedef struct subscript *subscript_p;
>>
>> +#define SUB_ACCESS_FN(SUB, I) (SUB)->access_fn[I]
>>  #define SUB_CONFLICTS_IN_A(SUB) (SUB)->conflicting_iterations_in_a
>>  #define SUB_CONFLICTS_IN_B(SUB) (SUB)->conflicting_iterations_in_b
>>  #define SUB_LAST_CONFLICT(SUB) (SUB)->last_conflict
>> @@ -333,6 +337,33 @@ struct data_dependence_relation
>>    /* Set to true when the dependence relation is on the same data
>>       access.  */
>>    bool self_reference_p;
>> +
>> +  /* True if the dependence described is conservatively correct rather
>> +     than exact, and if it is still possible for the accesses to be
>> +     conditionally independent.  For example, the a and b references in:
>> +
>> +       struct s *a, *b;
>> +       for (int i = 0; i < n; ++i)
>> +         a->f[i] += b->f[i];
>> +
>> +     conservatively have a distance vector of (0), for the case in which
>> +     a == b, but the accesses are independent if a != b.  Similarly,
>> +     the a and b references in:
>> +
>> +       struct s *a, *b;
>> +       for (int i = 0; i < n; ++i)
>> +         a[0].f[i] += b[i].f[i];
>> +
>> +     conservatively have a distance vector of (0), but they are indepenent
>> +     when a != b + i.  In contrast, the references in:
>> +
>> +       struct s *a;
>> +       for (int i = 0; i < n; ++i)
>> +         a->f[i] += a->f[i];
>> +
>> +     have the same distance vector of (0), but the accesses can never be
>> +     independent.  */
>> +  bool could_be_independent_p;
>>  };
>>
>>  typedef struct data_dependence_relation *ddr_p;
>> @@ -363,6 +394,7 @@ #define DDR_DIR_VECT(DDR, I) \
>>  #define DDR_DIST_VECT(DDR, I) \
>>    DDR_DIST_VECTS (DDR)[I]
>>  #define DDR_REVERSED_P(DDR) (DDR)->reversed_p
>> +#define DDR_COULD_BE_INDEPENDENT_P(DDR) (DDR)->could_be_independent_p
>>
>>
>>  bool dr_analyze_innermost (innermost_loop_behavior *, tree, struct loop *);
>> @@ -457,22 +489,6 @@ same_data_refs (data_reference_p a, data
>>        return false;
>>
>>    return true;
>> -}
>> -
>> -/* Return true when the DDR contains two data references that have the
>> -   same access functions.  */
>> -
>> -static inline bool
>> -same_access_functions (const struct data_dependence_relation *ddr)
>> -{
>> -  unsigned i;
>> -
>> -  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
>> -    if (!eq_evolutions_p (DR_ACCESS_FN (DDR_A (ddr), i),
>> -                         DR_ACCESS_FN (DDR_B (ddr), i)))
>> -      return false;
>> -
>> -  return true;
>>  }
>>
>>  /* Returns true when all the dependences are computable.  */
>> Index: gcc/tree-data-ref.c
>> ===================================================================
>> --- gcc/tree-data-ref.c 2017-07-27 13:10:29.620045506 +0100
>> +++ gcc/tree-data-ref.c 2017-07-27 13:10:33.023912613 +0100
>> @@ -124,8 +124,7 @@ Software Foundation; either version 3, o
>>  } dependence_stats;
>>
>>  static bool subscript_dependence_tester_1 (struct data_dependence_relation *,
>> -                                          struct data_reference *,
>> -                                          struct data_reference *,
>> +                                          unsigned int, unsigned int,
>>                                            struct loop *);
>>  /* Returns true iff A divides B.  */
>>
>> @@ -145,6 +144,21 @@ int_divides_p (int a, int b)
>>    return ((b % a) == 0);
>>  }
>>
>> +/* Return true if reference REF contains a union access.  */
>> +
>> +static bool
>> +ref_contains_union_access_p (tree ref)
>> +{
>> +  while (handled_component_p (ref))
>> +    {
>> +      ref = TREE_OPERAND (ref, 0);
>> +      if (TREE_CODE (TREE_TYPE (ref)) == UNION_TYPE
>> +         || TREE_CODE (TREE_TYPE (ref)) == QUAL_UNION_TYPE)
>> +       return true;
>> +    }
>> +  return false;
>> +}
>> +
>>
>>
>>  /* Dump into FILE all the data references from DATAREFS.  */
>> @@ -434,13 +448,14 @@ dump_data_dependence_relation (FILE *out
>>        unsigned int i;
>>        struct loop *loopi;
>>
>> -      for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
>> +      subscript *sub;
>> +      FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>>         {
>>           fprintf (outf, "  access_fn_A: ");
>> -         print_generic_stmt (outf, DR_ACCESS_FN (dra, i));
>> +         print_generic_stmt (outf, SUB_ACCESS_FN (sub, 0));
>>           fprintf (outf, "  access_fn_B: ");
>> -         print_generic_stmt (outf, DR_ACCESS_FN (drb, i));
>> -         dump_subscript (outf, DDR_SUBSCRIPT (ddr, i));
>> +         print_generic_stmt (outf, SUB_ACCESS_FN (sub, 1));
>> +         dump_subscript (outf, sub);
>>         }
>>
>>        fprintf (outf, "  inner loop index: %d\n", DDR_INNER_LOOP (ddr));
>> @@ -920,6 +935,27 @@ dr_analyze_innermost (innermost_loop_beh
>>    return true;
>>  }
>>
>> +/* Return true if OP is a valid component reference for a DR access
>> +   function.  This accepts a subset of what handled_component_p accepts.  */
>> +
>> +static bool
>> +access_fn_component_p (tree op)
>> +{
>> +  switch (TREE_CODE (op))
>> +    {
>> +    case REALPART_EXPR:
>> +    case IMAGPART_EXPR:
>> +    case ARRAY_REF:
>> +      return true;
>> +
>> +    case COMPONENT_REF:
>> +      return TREE_CODE (TREE_TYPE (TREE_OPERAND (op, 0))) == RECORD_TYPE;
>> +
>> +    default:
>> +      return false;
>> +    }
>> +}
>> +
>>  /* Determines the base object and the list of indices of memory reference
>>     DR, analyzed in LOOP and instantiated in loop nest NEST.  */
>>
>> @@ -957,7 +993,9 @@ dr_analyze_indices (struct data_referenc
>>        access_fns.safe_push (integer_one_node);
>>      }
>>
>> -  /* Analyze access functions of dimensions we know to be independent.  */
>> +  /* Analyze access functions of dimensions we know to be independent.
>> +     The list of component references handled here should be kept in
>> +     sync with access_fn_component_p.  */
>>    while (handled_component_p (ref))
>>      {
>>        if (TREE_CODE (ref) == ARRAY_REF)
>> @@ -2148,6 +2186,38 @@ dr_may_alias_p (const struct data_refere
>>    return refs_may_alias_p (addr_a, addr_b);
>>  }
>>
>> +/* REF_A and REF_B both satisfy access_fn_component_p.  Return true
>> +   if it is meaningful to compare their associated access functions
>> +   when checking for dependencies.  */
>> +
>> +static bool
>> +access_fn_components_comparable_p (tree ref_a, tree ref_b)
>> +{
>> +  /* Allow pairs of component refs from the following sets:
>> +
>> +       { REALPART_EXPR, IMAGPART_EXPR }
>> +       { COMPONENT_REF }
>> +       { ARRAY_REF }.  */
>> +  tree_code code_a = TREE_CODE (ref_a);
>> +  tree_code code_b = TREE_CODE (ref_b);
>> +  if (code_a == IMAGPART_EXPR)
>> +    code_a = REALPART_EXPR;
>> +  if (code_b == IMAGPART_EXPR)
>> +    code_b = REALPART_EXPR;
>> +  if (code_a != code_b)
>> +    return false;
>> +
>> +  if (TREE_CODE (ref_a) == COMPONENT_REF)
>> +    /* ??? We cannot simply use the type of operand #0 of the refs here as
>> +       the Fortran compiler smuggles type punning into COMPONENT_REFs.
>> +       Use the DECL_CONTEXT of the FIELD_DECLs instead.  */
>> +    return (DECL_CONTEXT (TREE_OPERAND (ref_a, 1))
>> +           == DECL_CONTEXT (TREE_OPERAND (ref_b, 1)));
>> +
>> +  return types_compatible_p (TREE_TYPE (TREE_OPERAND (ref_a, 0)),
>> +                            TREE_TYPE (TREE_OPERAND (ref_b, 0)));
>> +}
>> +
>>  /* Initialize a data dependence relation between data accesses A and
>>     B.  NB_LOOPS is the number of loops surrounding the references: the
>>     size of the classic distance/direction vectors.  */
>> @@ -2160,11 +2230,10 @@ initialize_data_dependence_relation (str
>>    struct data_dependence_relation *res;
>>    unsigned int i;
>>
>> -  res = XNEW (struct data_dependence_relation);
>> +  res = XCNEW (struct data_dependence_relation);
>>    DDR_A (res) = a;
>>    DDR_B (res) = b;
>>    DDR_LOOP_NEST (res).create (0);
>> -  DDR_REVERSED_P (res) = false;
>>    DDR_SUBSCRIPTS (res).create (0);
>>    DDR_DIR_VECTS (res).create (0);
>>    DDR_DIST_VECTS (res).create (0);
>> @@ -2182,82 +2251,277 @@ initialize_data_dependence_relation (str
>>        return res;
>>      }
>>
>> -  /* The case where the references are exactly the same.  */
>> -  if (operand_equal_p (DR_REF (a), DR_REF (b), 0))
>> +  unsigned int num_dimensions_a = DR_NUM_DIMENSIONS (a);
>> +  unsigned int num_dimensions_b = DR_NUM_DIMENSIONS (b);
>> +  if (num_dimensions_a == 0 || num_dimensions_b == 0)
>>      {
>> -      if ((loop_nest.exists ()
>> -          && !object_address_invariant_in_loop_p (loop_nest[0],
>> -                                                  DR_BASE_OBJECT (a)))
>> -         || DR_NUM_DIMENSIONS (a) == 0)
>> +      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>> +      return res;
>> +    }
>> +
>> +  /* For unconstrained bases, the root (highest-indexed) subscript
>> +     describes a variation in the base of the original DR_REF rather
>> +     than a component access.  We have no type that accurately describes
>> +     the new DR_BASE_OBJECT (whose TREE_TYPE describes the type *after*
>> +     applying this subscript) so limit the search to the last real
>> +     component access.
>> +
>> +     E.g. for:
>> +
>> +       void
>> +       f (int a[][8], int b[][8])
>> +       {
>> +         for (int i = 0; i < 8; ++i)
>> +           a[i * 2][0] = b[i][0];
>> +       }
>> +
>> +     the a and b accesses have a single ARRAY_REF component reference [0]
>> +     but have two subscripts.  */
>> +  if (DR_UNCONSTRAINED_BASE (a))
>> +    num_dimensions_a -= 1;
>> +  if (DR_UNCONSTRAINED_BASE (b))
>> +    num_dimensions_b -= 1;
>> +
>> +  /* These structures describe sequences of component references in
>> +     DR_REF (A) and DR_REF (B).  Each component reference is tied to a
>> +     specific access function.  */
>> +  struct {
>> +    /* The sequence starts at DR_ACCESS_FN (A, START_A) of A and
>> +       DR_ACCESS_FN (B, START_B) of B (inclusive) and extends to higher
>> +       indices.  In C notation, these are the indices of the rightmost
>> +       component references; e.g. for a sequence .b.c.d, the start
>> +       index is for .d.  */
>> +    unsigned int start_a;
>> +    unsigned int start_b;
>> +
>> +    /* The sequence contains LENGTH consecutive access functions from
>> +       each DR.  */
>> +    unsigned int length;
>> +
>> +    /* The enclosing objects for the A and B sequences respectively,
>> +       i.e. the objects to which DR_ACCESS_FN (A, START_A + LENGTH - 1)
>> +       and DR_ACCESS_FN (B, START_B + LENGTH - 1) are applied.  */
>> +    tree object_a;
>> +    tree object_b;
>> +  } full_seq = {}, struct_seq = {};
>> +
>> +  /* Before each iteration of the loop:
>> +
>> +     - REF_A is what you get after applying DR_ACCESS_FN (A, INDEX_A) and
>> +     - REF_B is what you get after applying DR_ACCESS_FN (B, INDEX_B).  */
>> +  unsigned int index_a = 0;
>> +  unsigned int index_b = 0;
>> +  tree ref_a = DR_REF (a);
>> +  tree ref_b = DR_REF (b);
>> +
>> +  /* Now walk the component references from the final DR_REFs back up to
>> +     the enclosing base objects.  Each component reference corresponds
>> +     to one access function in the DR, with access function 0 being for
>> +     the final DR_REF and the highest-indexed access function being the
>> +     one that is applied to the base of the DR.
>> +
>> +     Look for a sequence of component references whose access functions
>> +     are comparable (see access_fn_components_comparable_p).  If more
>> +     than one such sequence exists, pick the one nearest the base
>> +     (which is the leftmost sequence in C notation).  Store this sequence
>> +     in FULL_SEQ.
>> +
>> +     For example, if we have:
>> +
>> +       struct foo { struct bar s; ... } (*a)[10], (*b)[10];
>> +
>> +       A: a[0][i].s.c.d
>> +       B: __real b[0][i].s.e[i].f
>> +
>> +     (where d is the same type as the real component of f) then the access
>> +     functions would be:
>> +
>> +                        0   1   2   3
>> +       A:              .d  .c  .s [i]
>> +
>> +                0   1   2   3   4   5
>> +       B:  __real  .f [i]  .e  .s [i]
>> +
>> +     The A0/B2 column isn't comparable, since .d is a COMPONENT_REF
>> +     and [i] is an ARRAY_REF.  However, the A1/B3 column contains two
>> +     COMPONENT_REF accesses for struct bar, so is comparable.  Likewise
>> +     the A2/B4 column contains two COMPONENT_REF accesses for struct foo,
>> +     so is comparable.  The A3/B5 column contains two ARRAY_REFs that
>> +     index foo[10] arrays, so is again comparable.  The sequence is
>> +     therefore:
>> +
>> +        A: [1, 3]  (i.e. [i].s.c)
>> +        B: [3, 5]  (i.e. [i].s.e)
>> +
>> +     Also look for sequences of component references whose access
>> +     functions are comparable and whose enclosing objects have the same
>> +     RECORD_TYPE.  Store this sequence in STRUCT_SEQ.  In the above
>> +     example, STRUCT_SEQ would be:
>> +
>> +        A: [1, 2]  (i.e. s.c)
>> +        B: [3, 4]  (i.e. s.e)  */
>> +  while (index_a < num_dimensions_a && index_b < num_dimensions_b)
>> +    {
>> +      /* REF_A and REF_B must be one of the component access types
>> +        allowed by dr_analyze_indices.  */
>> +      gcc_checking_assert (access_fn_component_p (ref_a));
>> +      gcc_checking_assert (access_fn_component_p (ref_b));
>> +
>> +      /* Get the immediately-enclosing objects for REF_A and REF_B,
>> +        i.e. the references *before* applying DR_ACCESS_FN (A, INDEX_A)
>> +        and DR_ACCESS_FN (B, INDEX_B).  */
>> +      tree object_a = TREE_OPERAND (ref_a, 0);
>> +      tree object_b = TREE_OPERAND (ref_b, 0);
>> +
>> +      tree type_a = TREE_TYPE (object_a);
>> +      tree type_b = TREE_TYPE (object_b);
>> +      if (access_fn_components_comparable_p (ref_a, ref_b))
>> +       {
>> +         /* This pair of component accesses is comparable for dependence
>> +            analysis, so we can include DR_ACCESS_FN (A, INDEX_A) and
>> +            DR_ACCESS_FN (B, INDEX_B) in the sequence.  */
>> +         if (full_seq.start_a + full_seq.length != index_a
>> +             || full_seq.start_b + full_seq.length != index_b)
>> +           {
>> +             /* The accesses don't extend the current sequence,
>> +                so start a new one here.  */
>> +             full_seq.start_a = index_a;
>> +             full_seq.start_b = index_b;
>> +             full_seq.length = 0;
>> +           }
>> +
>> +         /* Add this pair of references to the sequence.  */
>> +         full_seq.length += 1;
>> +         full_seq.object_a = object_a;
>> +         full_seq.object_b = object_b;
>> +
>> +         /* If the enclosing objects are structures (and thus have the
>> +            same RECORD_TYPE), record the new sequence in STRUCT_SEQ.  */
>> +         if (TREE_CODE (type_a) == RECORD_TYPE)
>> +           struct_seq = full_seq;
>> +
>> +         /* Move to the next containing reference for both A and B.  */
>> +         ref_a = object_a;
>> +         ref_b = object_b;
>> +         index_a += 1;
>> +         index_b += 1;
>> +         continue;
>> +       }
>> +
>> +      /* Try to approach equal type sizes.  */
>> +      if (!COMPLETE_TYPE_P (type_a)
>> +         || !COMPLETE_TYPE_P (type_b)
>> +         || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
>> +         || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
>> +       break;
>> +
>> +      unsigned HOST_WIDE_INT size_a = tree_to_uhwi (TYPE_SIZE_UNIT (type_a));
>> +      unsigned HOST_WIDE_INT size_b = tree_to_uhwi (TYPE_SIZE_UNIT (type_b));
>> +      if (size_a <= size_b)
>>         {
>> -         DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>> -         return res;
>> +         index_a += 1;
>> +         ref_a = object_a;
>> +       }
>> +      if (size_b <= size_a)
>> +       {
>> +         index_b += 1;
>> +         ref_b = object_b;
>>         }
>> -      DDR_AFFINE_P (res) = true;
>> -      DDR_ARE_DEPENDENT (res) = NULL_TREE;
>> -      DDR_SUBSCRIPTS (res).create (DR_NUM_DIMENSIONS (a));
>> -      DDR_LOOP_NEST (res) = loop_nest;
>> -      DDR_INNER_LOOP (res) = 0;
>> -      DDR_SELF_REFERENCE (res) = true;
>> -      for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
>> -       {
>> -         struct subscript *subscript;
>> -
>> -         subscript = XNEW (struct subscript);
>> -         SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
>> -         SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
>> -         SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
>> -         SUB_DISTANCE (subscript) = chrec_dont_know;
>> -         DDR_SUBSCRIPTS (res).safe_push (subscript);
>> -       }
>> -      return res;
>>      }
>>
>> -  /* If the references do not access the same object, we do not know
>> -     whether they alias or not.  We do not care about TBAA or alignment
>> -     info so we can use OEP_ADDRESS_OF to avoid false negatives.
>> -     But the accesses have to use compatible types as otherwise the
>> -     built indices would not match.  */
>> - if (!operand_equal_p (DR_BASE_OBJECT (a), DR_BASE_OBJECT (b),
> OEP_ADDRESS_OF)
>> -      || !types_compatible_p (TREE_TYPE (DR_BASE_OBJECT (a)),
>> -                             TREE_TYPE (DR_BASE_OBJECT (b))))
>> +  /* See whether FULL_SEQ ends at the base and whether the two bases
>> +     are equal.  We do not care about TBAA or alignment info so we can
>> +     use OEP_ADDRESS_OF to avoid false negatives.  */
>> +  tree base_a = DR_BASE_OBJECT (a);
>> +  tree base_b = DR_BASE_OBJECT (b);
>> +  bool same_base_p = (full_seq.start_a + full_seq.length == num_dimensions_a
>> + && full_seq.start_b + full_seq.length == num_dimensions_b
>> + && DR_UNCONSTRAINED_BASE (a) == DR_UNCONSTRAINED_BASE (b)
>> +                     && operand_equal_p (base_a, base_b, OEP_ADDRESS_OF)
>> +                     && types_compatible_p (TREE_TYPE (base_a),
>> +                                            TREE_TYPE (base_b))
>> +                     && (!loop_nest.exists ()
>> +                         || (object_address_invariant_in_loop_p
>> +                             (loop_nest[0], base_a))));
>> +
>> +  /* If the bases are the same, we can include the base variation too.
>> +     E.g. the b accesses in:
>> +
>> +       for (int i = 0; i < n; ++i)
>> +         b[i + 4][0] = b[i][0];
>> +
>> +     have a definite dependence distance of 4, while for:
>> +
>> +       for (int i = 0; i < n; ++i)
>> +         a[i + 4][0] = b[i][0];
>> +
>> +     the dependence distance depends on the gap between a and b.
>> +
>> +     If the bases are different then we can only rely on the sequence
>> +     rooted at a structure access, since arrays are allowed to overlap
>> +     arbitrarily and change shape arbitrarily.  E.g. we treat this as
>> +     valid code:
>> +
>> +       int a[256];
>> +       ...
>> +       ((int (*)[4][3]) &a[1])[i][0] += ((int (*)[4][3]) &a[2])[i][0];
>> +
>> +     where two lvalues with the same int[4][3] type overlap, and where
>> +     both lvalues are distinct from the object's declared type.  */
>> +  if (same_base_p)
>>      {
>> -      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>> -      return res;
>> +      if (DR_UNCONSTRAINED_BASE (a))
>> +       full_seq.length += 1;
>>      }
>> +  else
>> +    full_seq = struct_seq;
>>
>> -  /* If the base of the object is not invariant in the loop nest, we cannot
>> -     analyze it.  TODO -- in fact, it would suffice to record that there may
>> -     be arbitrary dependences in the loops where the base object varies.  */
>> -  if ((loop_nest.exists ()
>> - && !object_address_invariant_in_loop_p (loop_nest[0], DR_BASE_OBJECT
> (a)))
>> -      || DR_NUM_DIMENSIONS (a) == 0)
>> +  /* Punt if we didn't find a suitable sequence.  */
>> +  if (full_seq.length == 0)
>>      {
>>        DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>>        return res;
>>      }
>>
>> -  /* If the number of dimensions of the access to not agree we can have
>> -     a pointer access to a component of the array element type and an
>> -     array access while the base-objects are still the same.  Punt.  */
>> -  if (DR_NUM_DIMENSIONS (a) != DR_NUM_DIMENSIONS (b))
>> +  if (!same_base_p)
>>      {
>> -      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>> -      return res;
>> +      /* Partial overlap is possible for different bases when strict aliasing
>> +        is not in effect.  It's also possible if either base involves a union
>> +        access; e.g. for:
>> +
>> +          struct s1 { int a[2]; };
>> +          struct s2 { struct s1 b; int c; };
>> +          struct s3 { int d; struct s1 e; };
>> +          union u { struct s2 f; struct s3 g; } *p, *q;
>> +
>> +        the s1 at "p->f.b" (base "p->f") partially overlaps the s1 at
>> +        "p->g.e" (base "p->g") and might partially overlap the s1 at
>> +        "q->g.e" (base "q->g").  */
>> +      if (!flag_strict_aliasing
>> +         || ref_contains_union_access_p (full_seq.object_a)
>> +         || ref_contains_union_access_p (full_seq.object_b))
>> +       {
>> +         DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>> +         return res;
>> +       }
>> +
>> +      DDR_COULD_BE_INDEPENDENT_P (res) = true;
>>      }
>>
>>    DDR_AFFINE_P (res) = true;
>>    DDR_ARE_DEPENDENT (res) = NULL_TREE;
>> -  DDR_SUBSCRIPTS (res).create (DR_NUM_DIMENSIONS (a));
>> +  DDR_SUBSCRIPTS (res).create (full_seq.length);
>>    DDR_LOOP_NEST (res) = loop_nest;
>>    DDR_INNER_LOOP (res) = 0;
>>    DDR_SELF_REFERENCE (res) = false;
>>
>> -  for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
>> +  for (i = 0; i < full_seq.length; ++i)
>>      {
>>        struct subscript *subscript;
>>
>>        subscript = XNEW (struct subscript);
>> +      SUB_ACCESS_FN (subscript, 0) = DR_ACCESS_FN (a, full_seq.start_a + i);
>> +      SUB_ACCESS_FN (subscript, 1) = DR_ACCESS_FN (b, full_seq.start_b + i);
>>        SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
>>        SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
>>        SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
>> @@ -3839,14 +4103,15 @@ add_outer_distances (struct data_depende
>>  }
>>
>>  /* Return false when fail to represent the data dependence as a
>> -   distance vector.  INIT_B is set to true when a component has been
>> +   distance vector.  A_INDEX is the index of the first reference
>> +   (0 for DDR_A, 1 for DDR_B) and B_INDEX is the index of the
>> +   second reference.  INIT_B is set to true when a component has been
>>     added to the distance vector DIST_V.  INDEX_CARRY is then set to
>>     the index in DIST_V that carries the dependence.  */
>>
>>  static bool
>>  build_classic_dist_vector_1 (struct data_dependence_relation *ddr,
>> -                            struct data_reference *ddr_a,
>> -                            struct data_reference *ddr_b,
>> +                            unsigned int a_index, unsigned int b_index,
>>                              lambda_vector dist_v, bool *init_b,
>>                              int *index_carry)
>>  {
>> @@ -3864,8 +4129,8 @@ build_classic_dist_vector_1 (struct data
>>           return false;
>>         }
>>
>> -      access_fn_a = DR_ACCESS_FN (ddr_a, i);
>> -      access_fn_b = DR_ACCESS_FN (ddr_b, i);
>> +      access_fn_a = SUB_ACCESS_FN (subscript, a_index);
>> +      access_fn_b = SUB_ACCESS_FN (subscript, b_index);
>>
>>        if (TREE_CODE (access_fn_a) == POLYNOMIAL_CHREC
>>           && TREE_CODE (access_fn_b) == POLYNOMIAL_CHREC)
>> @@ -3925,10 +4190,11 @@ build_classic_dist_vector_1 (struct data
>>  constant_access_functions (const struct data_dependence_relation *ddr)
>>  {
>>    unsigned i;
>> +  subscript *sub;
>>
>> -  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
>> -    if (!evolution_function_is_constant_p (DR_ACCESS_FN (DDR_A (ddr), i))
>> -       || !evolution_function_is_constant_p (DR_ACCESS_FN (DDR_B (ddr), i)))
>> +  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>> +    if (!evolution_function_is_constant_p (SUB_ACCESS_FN (sub, 0))
>> +       || !evolution_function_is_constant_p (SUB_ACCESS_FN (sub, 1)))
>>        return false;
>>
>>    return true;
>> @@ -3991,10 +4257,11 @@ add_other_self_distances (struct data_de
>>    lambda_vector dist_v;
>>    unsigned i;
>>    int index_carry = DDR_NB_LOOPS (ddr);
>> +  subscript *sub;
>>
>> -  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
>> +  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>>      {
>> -      tree access_fun = DR_ACCESS_FN (DDR_A (ddr), i);
>> +      tree access_fun = SUB_ACCESS_FN (sub, 0);
>>
>>        if (TREE_CODE (access_fun) == POLYNOMIAL_CHREC)
>>         {
>> @@ -4006,7 +4273,7 @@ add_other_self_distances (struct data_de
>>                   return;
>>                 }
>>
>> -             access_fun = DR_ACCESS_FN (DDR_A (ddr), 0);
>> +             access_fun = SUB_ACCESS_FN (DDR_SUBSCRIPT (ddr, 0), 0);
>>
>>               if (TREE_CODE (CHREC_LEFT (access_fun)) == POLYNOMIAL_CHREC)
>>                 add_multivariate_self_dist (ddr, access_fun);
>> @@ -4077,6 +4344,23 @@ add_distance_for_zero_overlaps (struct d
>>      }
>>  }
>>
>> +/* Return true when the DDR contains two data references that have the
>> +   same access functions.  */
>> +
>> +static inline bool
>> +same_access_functions (const struct data_dependence_relation *ddr)
>> +{
>> +  unsigned i;
>> +  subscript *sub;
>> +
>> +  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>> +    if (!eq_evolutions_p (SUB_ACCESS_FN (sub, 0),
>> +                         SUB_ACCESS_FN (sub, 1)))
>> +      return false;
>> +
>> +  return true;
>> +}
>> +
>>  /* Compute the classic per loop distance vector.  DDR is the data
>>     dependence relation to build a vector from.  Return false when fail
>>     to represent the data dependence as a distance vector.  */
>> @@ -4108,8 +4392,7 @@ build_classic_dist_vector (struct data_d
>>      }
>>
>>    dist_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
>> -  if (!build_classic_dist_vector_1 (ddr, DDR_A (ddr), DDR_B (ddr),
>> -                                   dist_v, &init_b, &index_carry))
>> + if (!build_classic_dist_vector_1 (ddr, 0, 1, dist_v, &init_b,
> &index_carry))
>>      return false;
>>
>>    /* Save the distance vector if we initialized one.  */
>> @@ -4142,12 +4425,11 @@ build_classic_dist_vector (struct data_d
>>        if (!lambda_vector_lexico_pos (dist_v, DDR_NB_LOOPS (ddr)))
>>         {
>>           lambda_vector save_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
>> -         if (!subscript_dependence_tester_1 (ddr, DDR_B (ddr), DDR_A (ddr),
>> -                                             loop_nest))
>> +         if (!subscript_dependence_tester_1 (ddr, 1, 0, loop_nest))
>>             return false;
>>           compute_subscript_distance (ddr);
>> -         if (!build_classic_dist_vector_1 (ddr, DDR_B (ddr), DDR_A (ddr),
>> -                                           save_v, &init_b, &index_carry))
>> +         if (!build_classic_dist_vector_1 (ddr, 1, 0, save_v, &init_b,
>> +                                           &index_carry))
>>             return false;
>>           save_dist_v (ddr, save_v);
>>           DDR_REVERSED_P (ddr) = true;
>> @@ -4183,12 +4465,10 @@ build_classic_dist_vector (struct data_d
>>             {
>> lambda_vector opposite_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
>>
>> -             if (!subscript_dependence_tester_1 (ddr, DDR_B (ddr),
>> -                                                 DDR_A (ddr), loop_nest))
>> +             if (!subscript_dependence_tester_1 (ddr, 1, 0, loop_nest))
>>                 return false;
>>               compute_subscript_distance (ddr);
>> -             if (!build_classic_dist_vector_1 (ddr, DDR_B (ddr), DDR_A (ddr),
>> -                                               opposite_v, &init_b,
>> + if (!build_classic_dist_vector_1 (ddr, 1, 0, opposite_v, &init_b,
>>                                                 &index_carry))
>>                 return false;
>>
>> @@ -4267,13 +4547,13 @@ build_classic_dir_vector (struct data_de
>>      }
>>  }
>>
>> -/* Helper function.  Returns true when there is a dependence between
>> -   data references DRA and DRB.  */
>> +/* Helper function.  Returns true when there is a dependence between the
>> +   data references.  A_INDEX is the index of the first reference (0 for
>> +   DDR_A, 1 for DDR_B) and B_INDEX is the index of the second reference.  */
>>
>>  static bool
>>  subscript_dependence_tester_1 (struct data_dependence_relation *ddr,
>> -                              struct data_reference *dra,
>> -                              struct data_reference *drb,
>> +                              unsigned int a_index, unsigned int b_index,
>>                                struct loop *loop_nest)
>>  {
>>    unsigned int i;
>> @@ -4285,8 +4565,8 @@ subscript_dependence_tester_1 (struct da
>>      {
>>        conflict_function *overlaps_a, *overlaps_b;
>>
>> -      analyze_overlapping_iterations (DR_ACCESS_FN (dra, i),
>> -                                     DR_ACCESS_FN (drb, i),
>> +      analyze_overlapping_iterations (SUB_ACCESS_FN (subscript, a_index),
>> +                                     SUB_ACCESS_FN (subscript, b_index),
>>                                       &overlaps_a, &overlaps_b,
>>                                       &last_conflicts, loop_nest);
>>
>> @@ -4335,7 +4615,7 @@ subscript_dependence_tester_1 (struct da
>>  subscript_dependence_tester (struct data_dependence_relation *ddr,
>>                              struct loop *loop_nest)
>>  {
>> - if (subscript_dependence_tester_1 (ddr, DDR_A (ddr), DDR_B (ddr),
> loop_nest))
>> +  if (subscript_dependence_tester_1 (ddr, 0, 1, loop_nest))
>>      dependence_stats.num_dependence_dependent++;
>>
>>    compute_subscript_distance (ddr);
>> Index: gcc/tree-ssa-loop-prefetch.c
>> ===================================================================
>> --- gcc/tree-ssa-loop-prefetch.c        2017-07-27 13:10:29.620045506 +0100
>> +++ gcc/tree-ssa-loop-prefetch.c        2017-07-27 13:10:33.023912613 +0100
>> @@ -1668,6 +1668,7 @@ determine_loop_nest_reuse (struct loop *
>>        refb = (struct mem_ref *) DDR_B (dep)->aux;
>>
>>        if (DDR_ARE_DEPENDENT (dep) == chrec_dont_know
>> +         || DDR_COULD_BE_INDEPENDENT_P (dep)
>>           || DDR_NUM_DIST_VECTS (dep) == 0)
>>         {
>>           /* If the dependence cannot be analyzed, assume that there might be
>> Index: gcc/tree-vectorizer.h
>> ===================================================================
>> --- gcc/tree-vectorizer.h       2017-07-27 13:10:29.620045506 +0100
>> +++ gcc/tree-vectorizer.h       2017-07-27 13:10:33.024912868 +0100
>> @@ -358,7 +358,7 @@ #define LOOP_VINFO_ORIG_LOOP_INFO(L)
>>  #define LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT(L)      \
>>    ((L)->may_misalign_stmts.length () > 0)
>>  #define LOOP_REQUIRES_VERSIONING_FOR_ALIAS(L)          \
>> -  ((L)->may_alias_ddrs.length () > 0)
>> +  ((L)->comp_alias_ddrs.length () > 0)
>>  #define LOOP_REQUIRES_VERSIONING_FOR_NITERS(L)         \
>>    (LOOP_VINFO_NITERS_ASSUMPTIONS (L))
>>  #define LOOP_REQUIRES_VERSIONING(L)                    \
>> Index: gcc/tree-vect-data-refs.c
>> ===================================================================
>> --- gcc/tree-vect-data-refs.c   2017-07-27 13:10:29.620045506 +0100
>> +++ gcc/tree-vect-data-refs.c   2017-07-27 13:10:33.024912868 +0100
>> @@ -160,6 +160,60 @@ vect_mark_for_runtime_alias_test (ddr_p
>>  }
>>
>>
>> +/* A subroutine of vect_analyze_data_ref_dependence.  Handle
>> +   DDR_COULD_BE_INDEPENDENT_P ddr DDR that has a known set of dependence
>> +   distances.  These distances are conservatively correct but they don't
>> +   reflect a guaranteed dependence.
>> +
>> +   Return true if this function does all the work necessary to avoid
>> +   an alias or false if the caller should use the dependence distances
>> +   to limit the vectorization factor in the usual way.  LOOP_DEPTH is
>> +   the depth of the loop described by LOOP_VINFO and the other arguments
>> +   are as for vect_analyze_data_ref_dependence.  */
>> +
>> +static bool
>> +vect_analyze_possibly_independent_ddr (data_dependence_relation *ddr,
>> +                                      loop_vec_info loop_vinfo,
>> +                                      int loop_depth, int *max_vf)
>> +{
>> +  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>> +  lambda_vector dist_v;
>> +  unsigned int i;
>> +  FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
>> +    {
>> +      int dist = dist_v[loop_depth];
>> +      if (dist != 0 && !(dist > 0 && DDR_REVERSED_P (ddr)))
>> +       {
>> +         /* If the user asserted safelen >= DIST consecutive iterations
>> +            can be executed concurrently, assume independence.
>> +
>> +            ??? An alternative would be to add the alias check even
>> +            in this case, and vectorize the fallback loop with the
>> +            maximum VF set to safelen.  However, if the user has
>> +            explicitly given a length, it's less likely that that
>> +            would be a win.  */
>> +         if (loop->safelen >= 2 && abs_hwi (dist) <= loop->safelen)
>> +           {
>> +             if (loop->safelen < *max_vf)
>> +               *max_vf = loop->safelen;
>> +             LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo) = false;
>> +             continue;
>> +           }
>> +
>> +         /* For dependence distances of 2 or more, we have the option
>> +            of limiting VF or checking for an alias at runtime.
>> +            Prefer to check at runtime if we can, to avoid limiting
>> +            the VF unnecessarily when the bases are in fact independent.
>> +
>> +            Note that the alias checks will be removed if the VF ends up
>> +            being small enough.  */
>> +         return vect_mark_for_runtime_alias_test (ddr, loop_vinfo);
>> +       }
>> +    }
>> +  return true;
>> +}
>> +
>> +
>>  /* Function vect_analyze_data_ref_dependence.
>>
>>     Return TRUE if there (might) exist a dependence between a memory-reference
>> @@ -305,6 +359,12 @@ vect_analyze_data_ref_dependence (struct
>>      }
>>
>>    loop_depth = index_in_loop_nest (loop->num, DDR_LOOP_NEST (ddr));
>> +
>> +  if (DDR_COULD_BE_INDEPENDENT_P (ddr)
>> +      && vect_analyze_possibly_independent_ddr (ddr, loop_vinfo,
>> +                                               loop_depth, max_vf))
>> +    return false;
>> +
>>    FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
>>      {
>>        int dist = dist_v[loop_depth];
>> @@ -2878,6 +2938,44 @@ vect_no_alias_p (struct data_reference *
>>    return false;
>>  }
>>
>> +/* Return true if the minimum nonzero dependence distance for loop LOOP_DEPTH
>> +   in DDR is >= VF.  */
>> +
>> +static bool
>> +dependence_distance_ge_vf (data_dependence_relation *ddr,
>> +                          unsigned int loop_depth, unsigned HOST_WIDE_INT vf)
>> +{
>> +  if (DDR_ARE_DEPENDENT (ddr) != NULL_TREE
>> +      || DDR_NUM_DIST_VECTS (ddr) == 0)
>> +    return false;
>> +
>> +  /* If the dependence is exact, we should have limited the VF instead.  */
>> +  gcc_checking_assert (DDR_COULD_BE_INDEPENDENT_P (ddr));
>> +
>> +  unsigned int i;
>> +  lambda_vector dist_v;
>> +  FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
>> +    {
>> +      HOST_WIDE_INT dist = dist_v[loop_depth];
>> +      if (dist != 0
>> +         && !(dist > 0 && DDR_REVERSED_P (ddr))
>> +         && (unsigned HOST_WIDE_INT) abs_hwi (dist) < vf)
>> +       return false;
>> +    }
>> +
>> +  if (dump_enabled_p ())
>> +    {
>> +      dump_printf_loc (MSG_NOTE, vect_location,
>> +                      "dependence distance between ");
>> +      dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (DDR_A (ddr)));
>> +      dump_printf (MSG_NOTE,  " and ");
>> +      dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (DDR_B (ddr)));
>> +      dump_printf (MSG_NOTE,  " is >= VF\n");
>> +    }
>> +
>> +  return true;
>> +}
>> +
>>  /* Function vect_prune_runtime_alias_test_list.
>>
>>     Prune a list of ddrs to be tested at run-time by versioning for alias.
>> @@ -2908,6 +3006,10 @@ vect_prune_runtime_alias_test_list (loop
>>
>>    comp_alias_ddrs.create (may_alias_ddrs.length ());
>>
>> +  unsigned int loop_depth
>> +    = index_in_loop_nest (LOOP_VINFO_LOOP (loop_vinfo)->num,
>> +                         LOOP_VINFO_LOOP_NEST (loop_vinfo));
>> +
>>    /* First, we collect all data ref pairs for aliasing checks.  */
>>    FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
>>      {
>> @@ -2917,6 +3019,11 @@ vect_prune_runtime_alias_test_list (loop
>>        tree segment_length_a, segment_length_b;
>>        gimple *stmt_a, *stmt_b;
>>
>> +      /* Ignore the alias if the VF we chose ended up being no greater
>> +        than the dependence distance.  */
>> +      if (dependence_distance_ge_vf (ddr, loop_depth, vect_factor))
>> +       continue;
>> +
>>        dr_a = DDR_A (ddr);
>>        stmt_a = DR_STMT (DDR_A (ddr));
>>        dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
>> @@ -2993,10 +3100,6 @@ vect_prune_runtime_alias_test_list (loop
>>        return false;
>>      }
>>
>> -  /* All alias checks have been resolved at compilation time.  */
>> -  if (!comp_alias_ddrs.length ())
>> -    LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo).truncate (0);
>> -
>>    return true;
>>  }
>>
>> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-3.c
>> ===================================================================
>> --- /dev/null   2017-07-27 10:25:31.671280760 +0100
>> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-3.c 2017-07-27
> 13:10:33.022912357 +0100
>> @@ -0,0 +1,120 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target vect_int } */
>> +/* { dg-additional-options "--param
> vect-max-version-for-alias-checks=0 -fopenmp-simd" } */
>> +
>> +/* Intended to be larger than any VF.  */
>> +#define GAP 128
>> +#define N (GAP * 3)
>> +
>> +struct s { int x[N + 1]; };
>> +struct t { struct s x[N + 1]; };
>> +struct u { int x[N + 1]; int y; };
>> +struct v { struct s s; };
>> +
>> +void
>> +f1 (struct s *a, struct s *b)
>> +{
>> +  for (int i = 0; i < N; ++i)
>> +    a->x[i] += b->x[i];
>> +}
>> +
>> +void
>> +f2 (struct s *a, struct s *b)
>> +{
>> +  for (int i = 0; i < N; ++i)
>> +    a[1].x[i] += b[2].x[i];
>> +}
>> +
>> +void
>> +f3 (struct s *a, struct s *b)
>> +{
>> +  for (int i = 0; i < N; ++i)
>> +    a[1].x[i] += b[i].x[i];
>> +}
>> +
>> +void
>> +f4 (struct s *a, struct s *b)
>> +{
>> +  for (int i = 0; i < N; ++i)
>> +    a[i].x[i] += b[i].x[i];
>> +}
>> +
>> +void
>> +f5 (struct s *a, struct s *b)
>> +{
>> +  for (int i = 0; i < N; ++i)
>> +    a->x[i] += b->x[i + 1];
>> +}
>> +
>> +void
>> +f6 (struct s *a, struct s *b)
>> +{
>> +  for (int i = 0; i < N; ++i)
>> +    a[1].x[i] += b[2].x[i + 1];
>> +}
>> +
>> +void
>> +f7 (struct s *a, struct s *b)
>> +{
>> +  for (int i = 0; i < N; ++i)
>> +    a[1].x[i] += b[i].x[i + 1];
>> +}
>> +
>> +void
>> +f8 (struct s *a, struct s *b)
>> +{
>> +  for (int i = 0; i < N; ++i)
>> +    a[i].x[i] += b[i].x[i + 1];
>> +}
>> +
>> +void
>> +f9 (struct s *a, struct t *b)
>> +{
>> +  for (int i = 0; i < N; ++i)
>> +    a->x[i] += b->x[1].x[i];
>> +}
>> +
>> +void
>> +f10 (struct s *a, struct t *b)
>> +{
>> +  for (int i = 0; i < N; ++i)
>> +    a->x[i] += b->x[i].x[i];
>> +}
>> +
>> +void
>> +f11 (struct u *a, struct u *b)
>> +{
>> +  for (int i = 0; i < N; ++i)
>> +    a->x[i] += b->x[i] + b[i].y;
>> +}
>> +
>> +void
>> +f12 (struct s *a, struct s *b)
>> +{
>> +  for (int i = 0; i < GAP; ++i)
>> +    a->x[i + GAP] += b->x[i];
>> +}
>> +
>> +void
>> +f13 (struct s *a, struct s *b)
>> +{
>> +  for (int i = 0; i < GAP * 2; ++i)
>> +    a->x[i + GAP] += b->x[i];
>> +}
>> +
>> +void
>> +f14 (struct v *a, struct s *b)
>> +{
>> +  for (int i = 0; i < N; ++i)
>> +    a->s.x[i] = b->x[i];
>> +}
>> +
>> +void
>> +f15 (struct s *a, struct s *b)
>> +{
>> +  #pragma omp simd safelen(N)
>> +  for (int i = 0; i < N; ++i)
>> +    a->x[i + 1] += b->x[i];
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 15 "vect" } } */
>> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-4.c
>> ===================================================================
>> --- /dev/null   2017-07-27 10:25:31.671280760 +0100
>> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-4.c 2017-07-27
> 13:10:33.022912357 +0100
>> @@ -0,0 +1,35 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target vect_int } */
>> +/* { dg-additional-options "--param vect-max-version-for-alias-checks=0" } */
>> +
>> +#define N 16
>> +
>> +struct s1 { int a[N]; };
>> +struct s2 { struct s1 b; int c; };
>> +struct s3 { int d; struct s1 e; };
>> +union u { struct s2 f; struct s3 g; };
>> +
>> +/* We allow a and b to overlap arbitrarily.  */
>> +
>> +void
>> +f1 (int a[][N], int b[][N])
>> +{
>> +  for (int i = 0; i < N; ++i)
>> +    a[0][i] += b[0][i];
>> +}
>> +
>> +void
>> +f2 (union u *a, union u *b)
>> +{
>> +  for (int i = 0; i < N; ++i)
>> +    a->f.b.a[i] += b->g.e.a[i];
>> +}
>> +
>> +void
>> +f3 (struct s1 *a, struct s1 *b)
>> +{
>> +  for (int i = 0; i < N - 1; ++i)
>> +    a->a[i + 1] += b->a[i];
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
>> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c
>> ===================================================================
>> --- /dev/null   2017-07-27 10:25:31.671280760 +0100
>> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c 2017-07-27
> 13:10:33.022912357 +0100
>> @@ -0,0 +1,19 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target vect_int } */
>> +
>> +/* Intended to be larger than any VF.  */
>> +#define GAP 128
>> +#define N (GAP * 3)
>> +
>> +struct s { int x[N]; };
>> +
>> +void
>> +f1 (struct s *a, struct s *b)
>> +{
>> +  for (int i = 0; i < GAP * 2; ++i)
>> +    a->x[i + GAP] += b->x[i];
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-times "consider run-time aliasing" 1
> "vect" } } */
>> +/* { dg-final { scan-tree-dump-times "improved number of alias checks
> from 1 to 0" 1 "vect" } } */
>> +/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 1 "vect" } } */

Richard Biener Aug. 4, 2017, 9:35 a.m. UTC | #4

On Fri, Aug 4, 2017 at 11:28 AM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Thu, Jul 27, 2017 at 2:19 PM, Richard Sandiford
>> <richard.sandiford@linaro.org> wrote:
>>> Richard Sandiford <richard.sandiford@linaro.org> writes:
>>>> Eric Botcazou <ebotcazou@adacore.com> writes:
>>>>> [Sorry for missing the previous messages]
>>>>>
>>>>>> Thanks.  Just been retesting, and I think I must have forgotten
>>>>>> to include Ada last time.  It turns out that the patch causes a dg-scan
>>>>>> regression in gnat.dg/vect17.adb, because we now think that if the
>>>>>> array RECORD_TYPEs *do* alias in:
>>>>>>
>>>>>>    procedure Add (X, Y : aliased Sarray; R : aliased out Sarray) is
>>>>>>    begin
>>>>>>       for I in Sarray'Range loop
>>>>>>          R(I) := X(I) + Y(I);
>>>>>>       end loop;
>>>>>>    end;
>>>>>>
>>>>>> then the dependence distance must be zero.  Eric, does that hold true
>>>>>> for Ada?  I.e. if X and R (or Y and R) alias, must it be the case that
>>>>>> X(I) can only alias R(I) and not for example R(I-1) or R(I+1)?
>>>>>
>>>>> Yes, I'd think so (even without the artificial RECORD_TYPE around
>> the arrays).
>>>>
>>>> Good!
>>>>
>>>>>> 2017-06-07  Richard Sandiford  <richard.sandiford@linaro.org>
>>>>>>
>>>>>> gcc/testsuite/
>>>>>>     * gnat.dg/vect17.ads (Sarray): Increase range to 1 .. 5.
>>>>>>     * gnat.dg/vect17.adb (Add): Create a dependence distance of 1
>>>>>>     when X = R or Y = R.
>>>>>
>>>>> I think that you need to modify vect15 and vect16 the same way.
>>>>
>>>> Ah, yeah.  And doing that shows that I'd not handled safelen for
>>>> DDR_COULD_BE_INDEPENDENT_P.  I've fixed that locally.
>>>>
>>>> How does this look?  Tested on x86_64-linux-gnu both without the
>>>> vectoriser changes and with the fixed vectoriser patch.
>>>
>>> Here's a version of the patch that handles safelen.  I split the
>>> handling out into a new function (vect_analyze_possibly_independent_ddr)
>>> since it was getting too big to do inline.
>>>
>>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>
>> Ok.
>
> Thanks!
>
>> Did you check whether BB vectorization is affected?  See
>> vect_slp_analyze_instance_dependence
>> and friends.  It's quite conservative but given the prefetching change
>> I wonder if we need
>> to rule out DDR_COULD_BE_INDEPENDENT_P?
>
> I think it should be OK.  When DDR_COULD_BE_INDEPENDENT_P is set,
> we've effectively changed from DDR_ARE_DEPENDENT == chrec_dont_know
> to a conservatively-correct distance vector.  It looks like
> vect_slp_analyze_data_ref_dependence handles both cases in the
> same way (by returning true).

Yes.  Could be improved of course.

Thanks for double-checking.
Richard.

> Thanks,
> Richard
>
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> Richard
>>>
>>>
>>> 2017-07-27  Richard Sandiford  <richard.sandiford@linaro.org>
>>>
>>> gcc/
>>>         * tree-data-ref.h (subscript): Add access_fn field.
>>>         (data_dependence_relation): Add could_be_independent_p.
>>>         (SUB_ACCESS_FN, DDR_COULD_BE_INDEPENDENT_P): New macros.
>>>         (same_access_functions): Move to tree-data-ref.c.
>>>         * tree-data-ref.c (ref_contains_union_access_p): New function.
>>>         (access_fn_component_p): Likewise.
>>>         (access_fn_components_comparable_p): Likewise.
>>>         (dr_analyze_indices): Add a reference to access_fn_component_p.
>>>         (dump_data_dependence_relation): Use SUB_ACCESS_FN instead of
>>>         DR_ACCESS_FN.
>>>         (constant_access_functions): Likewise.
>>>         (add_other_self_distances): Likewise.
>>>         (same_access_functions): Likewise.  (Moved from tree-data-ref.h.)
>>>         (initialize_data_dependence_relation): Use XCNEW and remove
>>>         explicit zeroing of DDR_REVERSED_P.  Look for a subsequence
>>>         of access functions that have the same type.  Allow the
>>>         subsequence to end with different bases in some circumstances.
>>>         Record the chosen access functions in SUB_ACCESS_FN.
>>>         (build_classic_dist_vector_1): Replace ddr_a and ddr_b with
>>>         a_index and b_index.  Use SUB_ACCESS_FN instead of DR_ACCESS_FN.
>>>         (subscript_dependence_tester_1): Likewise dra and drb.
>>>         (build_classic_dist_vector): Update calls accordingly.
>>>         (subscript_dependence_tester): Likewise.
>>>         * tree-ssa-loop-prefetch.c (determine_loop_nest_reuse): Check
>>>         DDR_COULD_BE_INDEPENDENT_P.
>>>         * tree-vectorizer.h (LOOP_REQUIRES_VERSIONING_FOR_ALIAS): Test
>>>         comp_alias_ddrs instead of may_alias_ddrs.
>>>         * tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
>>>         New function.
>>>         (vect_analyze_data_ref_dependence): Use it if
>>>         DDR_COULD_BE_INDEPENDENT_P, but fall back to using the recorded
>>>         distance vectors if that fails.
>>>         (dependence_distance_ge_vf): New function.
>>>         (vect_prune_runtime_alias_test_list): Use it.  Don't clear
>>>         LOOP_VINFO_MAY_ALIAS_DDRS.
>>>
>>> gcc/testsuite/
>>>         * gcc.dg/vect/vect-alias-check-3.c: New test.
>>>         * gcc.dg/vect/vect-alias-check-4.c: Likewise.
>>>         * gcc.dg/vect/vect-alias-check-5.c: Likewise.
>>>
>>> Index: gcc/tree-data-ref.h
>>> ===================================================================
>>> --- gcc/tree-data-ref.h 2017-07-27 13:10:29.620045506 +0100
>>> +++ gcc/tree-data-ref.h 2017-07-27 13:10:33.023912613 +0100
>>> @@ -260,6 +260,9 @@ struct conflict_function
>>>
>>>  struct subscript
>>>  {
>>> +  /* The access functions of the two references.  */
>>> +  tree access_fn[2];
>>> +
>>>    /* A description of the iterations for which the elements are
>>>       accessed twice.  */
>>>    conflict_function *conflicting_iterations_in_a;
>>> @@ -278,6 +281,7 @@ struct subscript
>>>
>>>  typedef struct subscript *subscript_p;
>>>
>>> +#define SUB_ACCESS_FN(SUB, I) (SUB)->access_fn[I]
>>>  #define SUB_CONFLICTS_IN_A(SUB) (SUB)->conflicting_iterations_in_a
>>>  #define SUB_CONFLICTS_IN_B(SUB) (SUB)->conflicting_iterations_in_b
>>>  #define SUB_LAST_CONFLICT(SUB) (SUB)->last_conflict
>>> @@ -333,6 +337,33 @@ struct data_dependence_relation
>>>    /* Set to true when the dependence relation is on the same data
>>>       access.  */
>>>    bool self_reference_p;
>>> +
>>> +  /* True if the dependence described is conservatively correct rather
>>> +     than exact, and if it is still possible for the accesses to be
>>> +     conditionally independent.  For example, the a and b references in:
>>> +
>>> +       struct s *a, *b;
>>> +       for (int i = 0; i < n; ++i)
>>> +         a->f[i] += b->f[i];
>>> +
>>> +     conservatively have a distance vector of (0), for the case in which
>>> +     a == b, but the accesses are independent if a != b.  Similarly,
>>> +     the a and b references in:
>>> +
>>> +       struct s *a, *b;
>>> +       for (int i = 0; i < n; ++i)
>>> +         a[0].f[i] += b[i].f[i];
>>> +
>>> +     conservatively have a distance vector of (0), but they are indepenent
>>> +     when a != b + i.  In contrast, the references in:
>>> +
>>> +       struct s *a;
>>> +       for (int i = 0; i < n; ++i)
>>> +         a->f[i] += a->f[i];
>>> +
>>> +     have the same distance vector of (0), but the accesses can never be
>>> +     independent.  */
>>> +  bool could_be_independent_p;
>>>  };
>>>
>>>  typedef struct data_dependence_relation *ddr_p;
>>> @@ -363,6 +394,7 @@ #define DDR_DIR_VECT(DDR, I) \
>>>  #define DDR_DIST_VECT(DDR, I) \
>>>    DDR_DIST_VECTS (DDR)[I]
>>>  #define DDR_REVERSED_P(DDR) (DDR)->reversed_p
>>> +#define DDR_COULD_BE_INDEPENDENT_P(DDR) (DDR)->could_be_independent_p
>>>
>>>
>>>  bool dr_analyze_innermost (innermost_loop_behavior *, tree, struct loop *);
>>> @@ -457,22 +489,6 @@ same_data_refs (data_reference_p a, data
>>>        return false;
>>>
>>>    return true;
>>> -}
>>> -
>>> -/* Return true when the DDR contains two data references that have the
>>> -   same access functions.  */
>>> -
>>> -static inline bool
>>> -same_access_functions (const struct data_dependence_relation *ddr)
>>> -{
>>> -  unsigned i;
>>> -
>>> -  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
>>> -    if (!eq_evolutions_p (DR_ACCESS_FN (DDR_A (ddr), i),
>>> -                         DR_ACCESS_FN (DDR_B (ddr), i)))
>>> -      return false;
>>> -
>>> -  return true;
>>>  }
>>>
>>>  /* Returns true when all the dependences are computable.  */
>>> Index: gcc/tree-data-ref.c
>>> ===================================================================
>>> --- gcc/tree-data-ref.c 2017-07-27 13:10:29.620045506 +0100
>>> +++ gcc/tree-data-ref.c 2017-07-27 13:10:33.023912613 +0100
>>> @@ -124,8 +124,7 @@ Software Foundation; either version 3, o
>>>  } dependence_stats;
>>>
>>>  static bool subscript_dependence_tester_1 (struct data_dependence_relation *,
>>> -                                          struct data_reference *,
>>> -                                          struct data_reference *,
>>> +                                          unsigned int, unsigned int,
>>>                                            struct loop *);
>>>  /* Returns true iff A divides B.  */
>>>
>>> @@ -145,6 +144,21 @@ int_divides_p (int a, int b)
>>>    return ((b % a) == 0);
>>>  }
>>>
>>> +/* Return true if reference REF contains a union access.  */
>>> +
>>> +static bool
>>> +ref_contains_union_access_p (tree ref)
>>> +{
>>> +  while (handled_component_p (ref))
>>> +    {
>>> +      ref = TREE_OPERAND (ref, 0);
>>> +      if (TREE_CODE (TREE_TYPE (ref)) == UNION_TYPE
>>> +         || TREE_CODE (TREE_TYPE (ref)) == QUAL_UNION_TYPE)
>>> +       return true;
>>> +    }
>>> +  return false;
>>> +}
>>> +
>>>
>>>
>>>  /* Dump into FILE all the data references from DATAREFS.  */
>>> @@ -434,13 +448,14 @@ dump_data_dependence_relation (FILE *out
>>>        unsigned int i;
>>>        struct loop *loopi;
>>>
>>> -      for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
>>> +      subscript *sub;
>>> +      FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>>>         {
>>>           fprintf (outf, "  access_fn_A: ");
>>> -         print_generic_stmt (outf, DR_ACCESS_FN (dra, i));
>>> +         print_generic_stmt (outf, SUB_ACCESS_FN (sub, 0));
>>>           fprintf (outf, "  access_fn_B: ");
>>> -         print_generic_stmt (outf, DR_ACCESS_FN (drb, i));
>>> -         dump_subscript (outf, DDR_SUBSCRIPT (ddr, i));
>>> +         print_generic_stmt (outf, SUB_ACCESS_FN (sub, 1));
>>> +         dump_subscript (outf, sub);
>>>         }
>>>
>>>        fprintf (outf, "  inner loop index: %d\n", DDR_INNER_LOOP (ddr));
>>> @@ -920,6 +935,27 @@ dr_analyze_innermost (innermost_loop_beh
>>>    return true;
>>>  }
>>>
>>> +/* Return true if OP is a valid component reference for a DR access
>>> +   function.  This accepts a subset of what handled_component_p accepts.  */
>>> +
>>> +static bool
>>> +access_fn_component_p (tree op)
>>> +{
>>> +  switch (TREE_CODE (op))
>>> +    {
>>> +    case REALPART_EXPR:
>>> +    case IMAGPART_EXPR:
>>> +    case ARRAY_REF:
>>> +      return true;
>>> +
>>> +    case COMPONENT_REF:
>>> +      return TREE_CODE (TREE_TYPE (TREE_OPERAND (op, 0))) == RECORD_TYPE;
>>> +
>>> +    default:
>>> +      return false;
>>> +    }
>>> +}
>>> +
>>>  /* Determines the base object and the list of indices of memory reference
>>>     DR, analyzed in LOOP and instantiated in loop nest NEST.  */
>>>
>>> @@ -957,7 +993,9 @@ dr_analyze_indices (struct data_referenc
>>>        access_fns.safe_push (integer_one_node);
>>>      }
>>>
>>> -  /* Analyze access functions of dimensions we know to be independent.  */
>>> +  /* Analyze access functions of dimensions we know to be independent.
>>> +     The list of component references handled here should be kept in
>>> +     sync with access_fn_component_p.  */
>>>    while (handled_component_p (ref))
>>>      {
>>>        if (TREE_CODE (ref) == ARRAY_REF)
>>> @@ -2148,6 +2186,38 @@ dr_may_alias_p (const struct data_refere
>>>    return refs_may_alias_p (addr_a, addr_b);
>>>  }
>>>
>>> +/* REF_A and REF_B both satisfy access_fn_component_p.  Return true
>>> +   if it is meaningful to compare their associated access functions
>>> +   when checking for dependencies.  */
>>> +
>>> +static bool
>>> +access_fn_components_comparable_p (tree ref_a, tree ref_b)
>>> +{
>>> +  /* Allow pairs of component refs from the following sets:
>>> +
>>> +       { REALPART_EXPR, IMAGPART_EXPR }
>>> +       { COMPONENT_REF }
>>> +       { ARRAY_REF }.  */
>>> +  tree_code code_a = TREE_CODE (ref_a);
>>> +  tree_code code_b = TREE_CODE (ref_b);
>>> +  if (code_a == IMAGPART_EXPR)
>>> +    code_a = REALPART_EXPR;
>>> +  if (code_b == IMAGPART_EXPR)
>>> +    code_b = REALPART_EXPR;
>>> +  if (code_a != code_b)
>>> +    return false;
>>> +
>>> +  if (TREE_CODE (ref_a) == COMPONENT_REF)
>>> +    /* ??? We cannot simply use the type of operand #0 of the refs here as
>>> +       the Fortran compiler smuggles type punning into COMPONENT_REFs.
>>> +       Use the DECL_CONTEXT of the FIELD_DECLs instead.  */
>>> +    return (DECL_CONTEXT (TREE_OPERAND (ref_a, 1))
>>> +           == DECL_CONTEXT (TREE_OPERAND (ref_b, 1)));
>>> +
>>> +  return types_compatible_p (TREE_TYPE (TREE_OPERAND (ref_a, 0)),
>>> +                            TREE_TYPE (TREE_OPERAND (ref_b, 0)));
>>> +}
>>> +
>>>  /* Initialize a data dependence relation between data accesses A and
>>>     B.  NB_LOOPS is the number of loops surrounding the references: the
>>>     size of the classic distance/direction vectors.  */
>>> @@ -2160,11 +2230,10 @@ initialize_data_dependence_relation (str
>>>    struct data_dependence_relation *res;
>>>    unsigned int i;
>>>
>>> -  res = XNEW (struct data_dependence_relation);
>>> +  res = XCNEW (struct data_dependence_relation);
>>>    DDR_A (res) = a;
>>>    DDR_B (res) = b;
>>>    DDR_LOOP_NEST (res).create (0);
>>> -  DDR_REVERSED_P (res) = false;
>>>    DDR_SUBSCRIPTS (res).create (0);
>>>    DDR_DIR_VECTS (res).create (0);
>>>    DDR_DIST_VECTS (res).create (0);
>>> @@ -2182,82 +2251,277 @@ initialize_data_dependence_relation (str
>>>        return res;
>>>      }
>>>
>>> -  /* The case where the references are exactly the same.  */
>>> -  if (operand_equal_p (DR_REF (a), DR_REF (b), 0))
>>> +  unsigned int num_dimensions_a = DR_NUM_DIMENSIONS (a);
>>> +  unsigned int num_dimensions_b = DR_NUM_DIMENSIONS (b);
>>> +  if (num_dimensions_a == 0 || num_dimensions_b == 0)
>>>      {
>>> -      if ((loop_nest.exists ()
>>> -          && !object_address_invariant_in_loop_p (loop_nest[0],
>>> -                                                  DR_BASE_OBJECT (a)))
>>> -         || DR_NUM_DIMENSIONS (a) == 0)
>>> +      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>>> +      return res;
>>> +    }
>>> +
>>> +  /* For unconstrained bases, the root (highest-indexed) subscript
>>> +     describes a variation in the base of the original DR_REF rather
>>> +     than a component access.  We have no type that accurately describes
>>> +     the new DR_BASE_OBJECT (whose TREE_TYPE describes the type *after*
>>> +     applying this subscript) so limit the search to the last real
>>> +     component access.
>>> +
>>> +     E.g. for:
>>> +
>>> +       void
>>> +       f (int a[][8], int b[][8])
>>> +       {
>>> +         for (int i = 0; i < 8; ++i)
>>> +           a[i * 2][0] = b[i][0];
>>> +       }
>>> +
>>> +     the a and b accesses have a single ARRAY_REF component reference [0]
>>> +     but have two subscripts.  */
>>> +  if (DR_UNCONSTRAINED_BASE (a))
>>> +    num_dimensions_a -= 1;
>>> +  if (DR_UNCONSTRAINED_BASE (b))
>>> +    num_dimensions_b -= 1;
>>> +
>>> +  /* These structures describe sequences of component references in
>>> +     DR_REF (A) and DR_REF (B).  Each component reference is tied to a
>>> +     specific access function.  */
>>> +  struct {
>>> +    /* The sequence starts at DR_ACCESS_FN (A, START_A) of A and
>>> +       DR_ACCESS_FN (B, START_B) of B (inclusive) and extends to higher
>>> +       indices.  In C notation, these are the indices of the rightmost
>>> +       component references; e.g. for a sequence .b.c.d, the start
>>> +       index is for .d.  */
>>> +    unsigned int start_a;
>>> +    unsigned int start_b;
>>> +
>>> +    /* The sequence contains LENGTH consecutive access functions from
>>> +       each DR.  */
>>> +    unsigned int length;
>>> +
>>> +    /* The enclosing objects for the A and B sequences respectively,
>>> +       i.e. the objects to which DR_ACCESS_FN (A, START_A + LENGTH - 1)
>>> +       and DR_ACCESS_FN (B, START_B + LENGTH - 1) are applied.  */
>>> +    tree object_a;
>>> +    tree object_b;
>>> +  } full_seq = {}, struct_seq = {};
>>> +
>>> +  /* Before each iteration of the loop:
>>> +
>>> +     - REF_A is what you get after applying DR_ACCESS_FN (A, INDEX_A) and
>>> +     - REF_B is what you get after applying DR_ACCESS_FN (B, INDEX_B).  */
>>> +  unsigned int index_a = 0;
>>> +  unsigned int index_b = 0;
>>> +  tree ref_a = DR_REF (a);
>>> +  tree ref_b = DR_REF (b);
>>> +
>>> +  /* Now walk the component references from the final DR_REFs back up to
>>> +     the enclosing base objects.  Each component reference corresponds
>>> +     to one access function in the DR, with access function 0 being for
>>> +     the final DR_REF and the highest-indexed access function being the
>>> +     one that is applied to the base of the DR.
>>> +
>>> +     Look for a sequence of component references whose access functions
>>> +     are comparable (see access_fn_components_comparable_p).  If more
>>> +     than one such sequence exists, pick the one nearest the base
>>> +     (which is the leftmost sequence in C notation).  Store this sequence
>>> +     in FULL_SEQ.
>>> +
>>> +     For example, if we have:
>>> +
>>> +       struct foo { struct bar s; ... } (*a)[10], (*b)[10];
>>> +
>>> +       A: a[0][i].s.c.d
>>> +       B: __real b[0][i].s.e[i].f
>>> +
>>> +     (where d is the same type as the real component of f) then the access
>>> +     functions would be:
>>> +
>>> +                        0   1   2   3
>>> +       A:              .d  .c  .s [i]
>>> +
>>> +                0   1   2   3   4   5
>>> +       B:  __real  .f [i]  .e  .s [i]
>>> +
>>> +     The A0/B2 column isn't comparable, since .d is a COMPONENT_REF
>>> +     and [i] is an ARRAY_REF.  However, the A1/B3 column contains two
>>> +     COMPONENT_REF accesses for struct bar, so is comparable.  Likewise
>>> +     the A2/B4 column contains two COMPONENT_REF accesses for struct foo,
>>> +     so is comparable.  The A3/B5 column contains two ARRAY_REFs that
>>> +     index foo[10] arrays, so is again comparable.  The sequence is
>>> +     therefore:
>>> +
>>> +        A: [1, 3]  (i.e. [i].s.c)
>>> +        B: [3, 5]  (i.e. [i].s.e)
>>> +
>>> +     Also look for sequences of component references whose access
>>> +     functions are comparable and whose enclosing objects have the same
>>> +     RECORD_TYPE.  Store this sequence in STRUCT_SEQ.  In the above
>>> +     example, STRUCT_SEQ would be:
>>> +
>>> +        A: [1, 2]  (i.e. s.c)
>>> +        B: [3, 4]  (i.e. s.e)  */
>>> +  while (index_a < num_dimensions_a && index_b < num_dimensions_b)
>>> +    {
>>> +      /* REF_A and REF_B must be one of the component access types
>>> +        allowed by dr_analyze_indices.  */
>>> +      gcc_checking_assert (access_fn_component_p (ref_a));
>>> +      gcc_checking_assert (access_fn_component_p (ref_b));
>>> +
>>> +      /* Get the immediately-enclosing objects for REF_A and REF_B,
>>> +        i.e. the references *before* applying DR_ACCESS_FN (A, INDEX_A)
>>> +        and DR_ACCESS_FN (B, INDEX_B).  */
>>> +      tree object_a = TREE_OPERAND (ref_a, 0);
>>> +      tree object_b = TREE_OPERAND (ref_b, 0);
>>> +
>>> +      tree type_a = TREE_TYPE (object_a);
>>> +      tree type_b = TREE_TYPE (object_b);
>>> +      if (access_fn_components_comparable_p (ref_a, ref_b))
>>> +       {
>>> +         /* This pair of component accesses is comparable for dependence
>>> +            analysis, so we can include DR_ACCESS_FN (A, INDEX_A) and
>>> +            DR_ACCESS_FN (B, INDEX_B) in the sequence.  */
>>> +         if (full_seq.start_a + full_seq.length != index_a
>>> +             || full_seq.start_b + full_seq.length != index_b)
>>> +           {
>>> +             /* The accesses don't extend the current sequence,
>>> +                so start a new one here.  */
>>> +             full_seq.start_a = index_a;
>>> +             full_seq.start_b = index_b;
>>> +             full_seq.length = 0;
>>> +           }
>>> +
>>> +         /* Add this pair of references to the sequence.  */
>>> +         full_seq.length += 1;
>>> +         full_seq.object_a = object_a;
>>> +         full_seq.object_b = object_b;
>>> +
>>> +         /* If the enclosing objects are structures (and thus have the
>>> +            same RECORD_TYPE), record the new sequence in STRUCT_SEQ.  */
>>> +         if (TREE_CODE (type_a) == RECORD_TYPE)
>>> +           struct_seq = full_seq;
>>> +
>>> +         /* Move to the next containing reference for both A and B.  */
>>> +         ref_a = object_a;
>>> +         ref_b = object_b;
>>> +         index_a += 1;
>>> +         index_b += 1;
>>> +         continue;
>>> +       }
>>> +
>>> +      /* Try to approach equal type sizes.  */
>>> +      if (!COMPLETE_TYPE_P (type_a)
>>> +         || !COMPLETE_TYPE_P (type_b)
>>> +         || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
>>> +         || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
>>> +       break;
>>> +
>>> +      unsigned HOST_WIDE_INT size_a = tree_to_uhwi (TYPE_SIZE_UNIT (type_a));
>>> +      unsigned HOST_WIDE_INT size_b = tree_to_uhwi (TYPE_SIZE_UNIT (type_b));
>>> +      if (size_a <= size_b)
>>>         {
>>> -         DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>>> -         return res;
>>> +         index_a += 1;
>>> +         ref_a = object_a;
>>> +       }
>>> +      if (size_b <= size_a)
>>> +       {
>>> +         index_b += 1;
>>> +         ref_b = object_b;
>>>         }
>>> -      DDR_AFFINE_P (res) = true;
>>> -      DDR_ARE_DEPENDENT (res) = NULL_TREE;
>>> -      DDR_SUBSCRIPTS (res).create (DR_NUM_DIMENSIONS (a));
>>> -      DDR_LOOP_NEST (res) = loop_nest;
>>> -      DDR_INNER_LOOP (res) = 0;
>>> -      DDR_SELF_REFERENCE (res) = true;
>>> -      for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
>>> -       {
>>> -         struct subscript *subscript;
>>> -
>>> -         subscript = XNEW (struct subscript);
>>> -         SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
>>> -         SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
>>> -         SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
>>> -         SUB_DISTANCE (subscript) = chrec_dont_know;
>>> -         DDR_SUBSCRIPTS (res).safe_push (subscript);
>>> -       }
>>> -      return res;
>>>      }
>>>
>>> -  /* If the references do not access the same object, we do not know
>>> -     whether they alias or not.  We do not care about TBAA or alignment
>>> -     info so we can use OEP_ADDRESS_OF to avoid false negatives.
>>> -     But the accesses have to use compatible types as otherwise the
>>> -     built indices would not match.  */
>>> - if (!operand_equal_p (DR_BASE_OBJECT (a), DR_BASE_OBJECT (b),
>> OEP_ADDRESS_OF)
>>> -      || !types_compatible_p (TREE_TYPE (DR_BASE_OBJECT (a)),
>>> -                             TREE_TYPE (DR_BASE_OBJECT (b))))
>>> +  /* See whether FULL_SEQ ends at the base and whether the two bases
>>> +     are equal.  We do not care about TBAA or alignment info so we can
>>> +     use OEP_ADDRESS_OF to avoid false negatives.  */
>>> +  tree base_a = DR_BASE_OBJECT (a);
>>> +  tree base_b = DR_BASE_OBJECT (b);
>>> +  bool same_base_p = (full_seq.start_a + full_seq.length == num_dimensions_a
>>> + && full_seq.start_b + full_seq.length == num_dimensions_b
>>> + && DR_UNCONSTRAINED_BASE (a) == DR_UNCONSTRAINED_BASE (b)
>>> +                     && operand_equal_p (base_a, base_b, OEP_ADDRESS_OF)
>>> +                     && types_compatible_p (TREE_TYPE (base_a),
>>> +                                            TREE_TYPE (base_b))
>>> +                     && (!loop_nest.exists ()
>>> +                         || (object_address_invariant_in_loop_p
>>> +                             (loop_nest[0], base_a))));
>>> +
>>> +  /* If the bases are the same, we can include the base variation too.
>>> +     E.g. the b accesses in:
>>> +
>>> +       for (int i = 0; i < n; ++i)
>>> +         b[i + 4][0] = b[i][0];
>>> +
>>> +     have a definite dependence distance of 4, while for:
>>> +
>>> +       for (int i = 0; i < n; ++i)
>>> +         a[i + 4][0] = b[i][0];
>>> +
>>> +     the dependence distance depends on the gap between a and b.
>>> +
>>> +     If the bases are different then we can only rely on the sequence
>>> +     rooted at a structure access, since arrays are allowed to overlap
>>> +     arbitrarily and change shape arbitrarily.  E.g. we treat this as
>>> +     valid code:
>>> +
>>> +       int a[256];
>>> +       ...
>>> +       ((int (*)[4][3]) &a[1])[i][0] += ((int (*)[4][3]) &a[2])[i][0];
>>> +
>>> +     where two lvalues with the same int[4][3] type overlap, and where
>>> +     both lvalues are distinct from the object's declared type.  */
>>> +  if (same_base_p)
>>>      {
>>> -      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>>> -      return res;
>>> +      if (DR_UNCONSTRAINED_BASE (a))
>>> +       full_seq.length += 1;
>>>      }
>>> +  else
>>> +    full_seq = struct_seq;
>>>
>>> -  /* If the base of the object is not invariant in the loop nest, we cannot
>>> -     analyze it.  TODO -- in fact, it would suffice to record that there may
>>> -     be arbitrary dependences in the loops where the base object varies.  */
>>> -  if ((loop_nest.exists ()
>>> - && !object_address_invariant_in_loop_p (loop_nest[0], DR_BASE_OBJECT
>> (a)))
>>> -      || DR_NUM_DIMENSIONS (a) == 0)
>>> +  /* Punt if we didn't find a suitable sequence.  */
>>> +  if (full_seq.length == 0)
>>>      {
>>>        DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>>>        return res;
>>>      }
>>>
>>> -  /* If the number of dimensions of the access to not agree we can have
>>> -     a pointer access to a component of the array element type and an
>>> -     array access while the base-objects are still the same.  Punt.  */
>>> -  if (DR_NUM_DIMENSIONS (a) != DR_NUM_DIMENSIONS (b))
>>> +  if (!same_base_p)
>>>      {
>>> -      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>>> -      return res;
>>> +      /* Partial overlap is possible for different bases when strict aliasing
>>> +        is not in effect.  It's also possible if either base involves a union
>>> +        access; e.g. for:
>>> +
>>> +          struct s1 { int a[2]; };
>>> +          struct s2 { struct s1 b; int c; };
>>> +          struct s3 { int d; struct s1 e; };
>>> +          union u { struct s2 f; struct s3 g; } *p, *q;
>>> +
>>> +        the s1 at "p->f.b" (base "p->f") partially overlaps the s1 at
>>> +        "p->g.e" (base "p->g") and might partially overlap the s1 at
>>> +        "q->g.e" (base "q->g").  */
>>> +      if (!flag_strict_aliasing
>>> +         || ref_contains_union_access_p (full_seq.object_a)
>>> +         || ref_contains_union_access_p (full_seq.object_b))
>>> +       {
>>> +         DDR_ARE_DEPENDENT (res) = chrec_dont_know;
>>> +         return res;
>>> +       }
>>> +
>>> +      DDR_COULD_BE_INDEPENDENT_P (res) = true;
>>>      }
>>>
>>>    DDR_AFFINE_P (res) = true;
>>>    DDR_ARE_DEPENDENT (res) = NULL_TREE;
>>> -  DDR_SUBSCRIPTS (res).create (DR_NUM_DIMENSIONS (a));
>>> +  DDR_SUBSCRIPTS (res).create (full_seq.length);
>>>    DDR_LOOP_NEST (res) = loop_nest;
>>>    DDR_INNER_LOOP (res) = 0;
>>>    DDR_SELF_REFERENCE (res) = false;
>>>
>>> -  for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
>>> +  for (i = 0; i < full_seq.length; ++i)
>>>      {
>>>        struct subscript *subscript;
>>>
>>>        subscript = XNEW (struct subscript);
>>> +      SUB_ACCESS_FN (subscript, 0) = DR_ACCESS_FN (a, full_seq.start_a + i);
>>> +      SUB_ACCESS_FN (subscript, 1) = DR_ACCESS_FN (b, full_seq.start_b + i);
>>>        SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
>>>        SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
>>>        SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
>>> @@ -3839,14 +4103,15 @@ add_outer_distances (struct data_depende
>>>  }
>>>
>>>  /* Return false when fail to represent the data dependence as a
>>> -   distance vector.  INIT_B is set to true when a component has been
>>> +   distance vector.  A_INDEX is the index of the first reference
>>> +   (0 for DDR_A, 1 for DDR_B) and B_INDEX is the index of the
>>> +   second reference.  INIT_B is set to true when a component has been
>>>     added to the distance vector DIST_V.  INDEX_CARRY is then set to
>>>     the index in DIST_V that carries the dependence.  */
>>>
>>>  static bool
>>>  build_classic_dist_vector_1 (struct data_dependence_relation *ddr,
>>> -                            struct data_reference *ddr_a,
>>> -                            struct data_reference *ddr_b,
>>> +                            unsigned int a_index, unsigned int b_index,
>>>                              lambda_vector dist_v, bool *init_b,
>>>                              int *index_carry)
>>>  {
>>> @@ -3864,8 +4129,8 @@ build_classic_dist_vector_1 (struct data
>>>           return false;
>>>         }
>>>
>>> -      access_fn_a = DR_ACCESS_FN (ddr_a, i);
>>> -      access_fn_b = DR_ACCESS_FN (ddr_b, i);
>>> +      access_fn_a = SUB_ACCESS_FN (subscript, a_index);
>>> +      access_fn_b = SUB_ACCESS_FN (subscript, b_index);
>>>
>>>        if (TREE_CODE (access_fn_a) == POLYNOMIAL_CHREC
>>>           && TREE_CODE (access_fn_b) == POLYNOMIAL_CHREC)
>>> @@ -3925,10 +4190,11 @@ build_classic_dist_vector_1 (struct data
>>>  constant_access_functions (const struct data_dependence_relation *ddr)
>>>  {
>>>    unsigned i;
>>> +  subscript *sub;
>>>
>>> -  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
>>> -    if (!evolution_function_is_constant_p (DR_ACCESS_FN (DDR_A (ddr), i))
>>> -       || !evolution_function_is_constant_p (DR_ACCESS_FN (DDR_B (ddr), i)))
>>> +  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>>> +    if (!evolution_function_is_constant_p (SUB_ACCESS_FN (sub, 0))
>>> +       || !evolution_function_is_constant_p (SUB_ACCESS_FN (sub, 1)))
>>>        return false;
>>>
>>>    return true;
>>> @@ -3991,10 +4257,11 @@ add_other_self_distances (struct data_de
>>>    lambda_vector dist_v;
>>>    unsigned i;
>>>    int index_carry = DDR_NB_LOOPS (ddr);
>>> +  subscript *sub;
>>>
>>> -  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
>>> +  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>>>      {
>>> -      tree access_fun = DR_ACCESS_FN (DDR_A (ddr), i);
>>> +      tree access_fun = SUB_ACCESS_FN (sub, 0);
>>>
>>>        if (TREE_CODE (access_fun) == POLYNOMIAL_CHREC)
>>>         {
>>> @@ -4006,7 +4273,7 @@ add_other_self_distances (struct data_de
>>>                   return;
>>>                 }
>>>
>>> -             access_fun = DR_ACCESS_FN (DDR_A (ddr), 0);
>>> +             access_fun = SUB_ACCESS_FN (DDR_SUBSCRIPT (ddr, 0), 0);
>>>
>>>               if (TREE_CODE (CHREC_LEFT (access_fun)) == POLYNOMIAL_CHREC)
>>>                 add_multivariate_self_dist (ddr, access_fun);
>>> @@ -4077,6 +4344,23 @@ add_distance_for_zero_overlaps (struct d
>>>      }
>>>  }
>>>
>>> +/* Return true when the DDR contains two data references that have the
>>> +   same access functions.  */
>>> +
>>> +static inline bool
>>> +same_access_functions (const struct data_dependence_relation *ddr)
>>> +{
>>> +  unsigned i;
>>> +  subscript *sub;
>>> +
>>> +  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
>>> +    if (!eq_evolutions_p (SUB_ACCESS_FN (sub, 0),
>>> +                         SUB_ACCESS_FN (sub, 1)))
>>> +      return false;
>>> +
>>> +  return true;
>>> +}
>>> +
>>>  /* Compute the classic per loop distance vector.  DDR is the data
>>>     dependence relation to build a vector from.  Return false when fail
>>>     to represent the data dependence as a distance vector.  */
>>> @@ -4108,8 +4392,7 @@ build_classic_dist_vector (struct data_d
>>>      }
>>>
>>>    dist_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
>>> -  if (!build_classic_dist_vector_1 (ddr, DDR_A (ddr), DDR_B (ddr),
>>> -                                   dist_v, &init_b, &index_carry))
>>> + if (!build_classic_dist_vector_1 (ddr, 0, 1, dist_v, &init_b,
>> &index_carry))
>>>      return false;
>>>
>>>    /* Save the distance vector if we initialized one.  */
>>> @@ -4142,12 +4425,11 @@ build_classic_dist_vector (struct data_d
>>>        if (!lambda_vector_lexico_pos (dist_v, DDR_NB_LOOPS (ddr)))
>>>         {
>>>           lambda_vector save_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
>>> -         if (!subscript_dependence_tester_1 (ddr, DDR_B (ddr), DDR_A (ddr),
>>> -                                             loop_nest))
>>> +         if (!subscript_dependence_tester_1 (ddr, 1, 0, loop_nest))
>>>             return false;
>>>           compute_subscript_distance (ddr);
>>> -         if (!build_classic_dist_vector_1 (ddr, DDR_B (ddr), DDR_A (ddr),
>>> -                                           save_v, &init_b, &index_carry))
>>> +         if (!build_classic_dist_vector_1 (ddr, 1, 0, save_v, &init_b,
>>> +                                           &index_carry))
>>>             return false;
>>>           save_dist_v (ddr, save_v);
>>>           DDR_REVERSED_P (ddr) = true;
>>> @@ -4183,12 +4465,10 @@ build_classic_dist_vector (struct data_d
>>>             {
>>> lambda_vector opposite_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
>>>
>>> -             if (!subscript_dependence_tester_1 (ddr, DDR_B (ddr),
>>> -                                                 DDR_A (ddr), loop_nest))
>>> +             if (!subscript_dependence_tester_1 (ddr, 1, 0, loop_nest))
>>>                 return false;
>>>               compute_subscript_distance (ddr);
>>> -             if (!build_classic_dist_vector_1 (ddr, DDR_B (ddr), DDR_A (ddr),
>>> -                                               opposite_v, &init_b,
>>> + if (!build_classic_dist_vector_1 (ddr, 1, 0, opposite_v, &init_b,
>>>                                                 &index_carry))
>>>                 return false;
>>>
>>> @@ -4267,13 +4547,13 @@ build_classic_dir_vector (struct data_de
>>>      }
>>>  }
>>>
>>> -/* Helper function.  Returns true when there is a dependence between
>>> -   data references DRA and DRB.  */
>>> +/* Helper function.  Returns true when there is a dependence between the
>>> +   data references.  A_INDEX is the index of the first reference (0 for
>>> +   DDR_A, 1 for DDR_B) and B_INDEX is the index of the second reference.  */
>>>
>>>  static bool
>>>  subscript_dependence_tester_1 (struct data_dependence_relation *ddr,
>>> -                              struct data_reference *dra,
>>> -                              struct data_reference *drb,
>>> +                              unsigned int a_index, unsigned int b_index,
>>>                                struct loop *loop_nest)
>>>  {
>>>    unsigned int i;
>>> @@ -4285,8 +4565,8 @@ subscript_dependence_tester_1 (struct da
>>>      {
>>>        conflict_function *overlaps_a, *overlaps_b;
>>>
>>> -      analyze_overlapping_iterations (DR_ACCESS_FN (dra, i),
>>> -                                     DR_ACCESS_FN (drb, i),
>>> +      analyze_overlapping_iterations (SUB_ACCESS_FN (subscript, a_index),
>>> +                                     SUB_ACCESS_FN (subscript, b_index),
>>>                                       &overlaps_a, &overlaps_b,
>>>                                       &last_conflicts, loop_nest);
>>>
>>> @@ -4335,7 +4615,7 @@ subscript_dependence_tester_1 (struct da
>>>  subscript_dependence_tester (struct data_dependence_relation *ddr,
>>>                              struct loop *loop_nest)
>>>  {
>>> - if (subscript_dependence_tester_1 (ddr, DDR_A (ddr), DDR_B (ddr),
>> loop_nest))
>>> +  if (subscript_dependence_tester_1 (ddr, 0, 1, loop_nest))
>>>      dependence_stats.num_dependence_dependent++;
>>>
>>>    compute_subscript_distance (ddr);
>>> Index: gcc/tree-ssa-loop-prefetch.c
>>> ===================================================================
>>> --- gcc/tree-ssa-loop-prefetch.c        2017-07-27 13:10:29.620045506 +0100
>>> +++ gcc/tree-ssa-loop-prefetch.c        2017-07-27 13:10:33.023912613 +0100
>>> @@ -1668,6 +1668,7 @@ determine_loop_nest_reuse (struct loop *
>>>        refb = (struct mem_ref *) DDR_B (dep)->aux;
>>>
>>>        if (DDR_ARE_DEPENDENT (dep) == chrec_dont_know
>>> +         || DDR_COULD_BE_INDEPENDENT_P (dep)
>>>           || DDR_NUM_DIST_VECTS (dep) == 0)
>>>         {
>>>           /* If the dependence cannot be analyzed, assume that there might be
>>> Index: gcc/tree-vectorizer.h
>>> ===================================================================
>>> --- gcc/tree-vectorizer.h       2017-07-27 13:10:29.620045506 +0100
>>> +++ gcc/tree-vectorizer.h       2017-07-27 13:10:33.024912868 +0100
>>> @@ -358,7 +358,7 @@ #define LOOP_VINFO_ORIG_LOOP_INFO(L)
>>>  #define LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT(L)      \
>>>    ((L)->may_misalign_stmts.length () > 0)
>>>  #define LOOP_REQUIRES_VERSIONING_FOR_ALIAS(L)          \
>>> -  ((L)->may_alias_ddrs.length () > 0)
>>> +  ((L)->comp_alias_ddrs.length () > 0)
>>>  #define LOOP_REQUIRES_VERSIONING_FOR_NITERS(L)         \
>>>    (LOOP_VINFO_NITERS_ASSUMPTIONS (L))
>>>  #define LOOP_REQUIRES_VERSIONING(L)                    \
>>> Index: gcc/tree-vect-data-refs.c
>>> ===================================================================
>>> --- gcc/tree-vect-data-refs.c   2017-07-27 13:10:29.620045506 +0100
>>> +++ gcc/tree-vect-data-refs.c   2017-07-27 13:10:33.024912868 +0100
>>> @@ -160,6 +160,60 @@ vect_mark_for_runtime_alias_test (ddr_p
>>>  }
>>>
>>>
>>> +/* A subroutine of vect_analyze_data_ref_dependence.  Handle
>>> +   DDR_COULD_BE_INDEPENDENT_P ddr DDR that has a known set of dependence
>>> +   distances.  These distances are conservatively correct but they don't
>>> +   reflect a guaranteed dependence.
>>> +
>>> +   Return true if this function does all the work necessary to avoid
>>> +   an alias or false if the caller should use the dependence distances
>>> +   to limit the vectorization factor in the usual way.  LOOP_DEPTH is
>>> +   the depth of the loop described by LOOP_VINFO and the other arguments
>>> +   are as for vect_analyze_data_ref_dependence.  */
>>> +
>>> +static bool
>>> +vect_analyze_possibly_independent_ddr (data_dependence_relation *ddr,
>>> +                                      loop_vec_info loop_vinfo,
>>> +                                      int loop_depth, int *max_vf)
>>> +{
>>> +  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>>> +  lambda_vector dist_v;
>>> +  unsigned int i;
>>> +  FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
>>> +    {
>>> +      int dist = dist_v[loop_depth];
>>> +      if (dist != 0 && !(dist > 0 && DDR_REVERSED_P (ddr)))
>>> +       {
>>> +         /* If the user asserted safelen >= DIST consecutive iterations
>>> +            can be executed concurrently, assume independence.
>>> +
>>> +            ??? An alternative would be to add the alias check even
>>> +            in this case, and vectorize the fallback loop with the
>>> +            maximum VF set to safelen.  However, if the user has
>>> +            explicitly given a length, it's less likely that that
>>> +            would be a win.  */
>>> +         if (loop->safelen >= 2 && abs_hwi (dist) <= loop->safelen)
>>> +           {
>>> +             if (loop->safelen < *max_vf)
>>> +               *max_vf = loop->safelen;
>>> +             LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo) = false;
>>> +             continue;
>>> +           }
>>> +
>>> +         /* For dependence distances of 2 or more, we have the option
>>> +            of limiting VF or checking for an alias at runtime.
>>> +            Prefer to check at runtime if we can, to avoid limiting
>>> +            the VF unnecessarily when the bases are in fact independent.
>>> +
>>> +            Note that the alias checks will be removed if the VF ends up
>>> +            being small enough.  */
>>> +         return vect_mark_for_runtime_alias_test (ddr, loop_vinfo);
>>> +       }
>>> +    }
>>> +  return true;
>>> +}
>>> +
>>> +
>>>  /* Function vect_analyze_data_ref_dependence.
>>>
>>>     Return TRUE if there (might) exist a dependence between a memory-reference
>>> @@ -305,6 +359,12 @@ vect_analyze_data_ref_dependence (struct
>>>      }
>>>
>>>    loop_depth = index_in_loop_nest (loop->num, DDR_LOOP_NEST (ddr));
>>> +
>>> +  if (DDR_COULD_BE_INDEPENDENT_P (ddr)
>>> +      && vect_analyze_possibly_independent_ddr (ddr, loop_vinfo,
>>> +                                               loop_depth, max_vf))
>>> +    return false;
>>> +
>>>    FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
>>>      {
>>>        int dist = dist_v[loop_depth];
>>> @@ -2878,6 +2938,44 @@ vect_no_alias_p (struct data_reference *
>>>    return false;
>>>  }
>>>
>>> +/* Return true if the minimum nonzero dependence distance for loop LOOP_DEPTH
>>> +   in DDR is >= VF.  */
>>> +
>>> +static bool
>>> +dependence_distance_ge_vf (data_dependence_relation *ddr,
>>> +                          unsigned int loop_depth, unsigned HOST_WIDE_INT vf)
>>> +{
>>> +  if (DDR_ARE_DEPENDENT (ddr) != NULL_TREE
>>> +      || DDR_NUM_DIST_VECTS (ddr) == 0)
>>> +    return false;
>>> +
>>> +  /* If the dependence is exact, we should have limited the VF instead.  */
>>> +  gcc_checking_assert (DDR_COULD_BE_INDEPENDENT_P (ddr));
>>> +
>>> +  unsigned int i;
>>> +  lambda_vector dist_v;
>>> +  FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
>>> +    {
>>> +      HOST_WIDE_INT dist = dist_v[loop_depth];
>>> +      if (dist != 0
>>> +         && !(dist > 0 && DDR_REVERSED_P (ddr))
>>> +         && (unsigned HOST_WIDE_INT) abs_hwi (dist) < vf)
>>> +       return false;
>>> +    }
>>> +
>>> +  if (dump_enabled_p ())
>>> +    {
>>> +      dump_printf_loc (MSG_NOTE, vect_location,
>>> +                      "dependence distance between ");
>>> +      dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (DDR_A (ddr)));
>>> +      dump_printf (MSG_NOTE,  " and ");
>>> +      dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (DDR_B (ddr)));
>>> +      dump_printf (MSG_NOTE,  " is >= VF\n");
>>> +    }
>>> +
>>> +  return true;
>>> +}
>>> +
>>>  /* Function vect_prune_runtime_alias_test_list.
>>>
>>>     Prune a list of ddrs to be tested at run-time by versioning for alias.
>>> @@ -2908,6 +3006,10 @@ vect_prune_runtime_alias_test_list (loop
>>>
>>>    comp_alias_ddrs.create (may_alias_ddrs.length ());
>>>
>>> +  unsigned int loop_depth
>>> +    = index_in_loop_nest (LOOP_VINFO_LOOP (loop_vinfo)->num,
>>> +                         LOOP_VINFO_LOOP_NEST (loop_vinfo));
>>> +
>>>    /* First, we collect all data ref pairs for aliasing checks.  */
>>>    FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
>>>      {
>>> @@ -2917,6 +3019,11 @@ vect_prune_runtime_alias_test_list (loop
>>>        tree segment_length_a, segment_length_b;
>>>        gimple *stmt_a, *stmt_b;
>>>
>>> +      /* Ignore the alias if the VF we chose ended up being no greater
>>> +        than the dependence distance.  */
>>> +      if (dependence_distance_ge_vf (ddr, loop_depth, vect_factor))
>>> +       continue;
>>> +
>>>        dr_a = DDR_A (ddr);
>>>        stmt_a = DR_STMT (DDR_A (ddr));
>>>        dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
>>> @@ -2993,10 +3100,6 @@ vect_prune_runtime_alias_test_list (loop
>>>        return false;
>>>      }
>>>
>>> -  /* All alias checks have been resolved at compilation time.  */
>>> -  if (!comp_alias_ddrs.length ())
>>> -    LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo).truncate (0);
>>> -
>>>    return true;
>>>  }
>>>
>>> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-3.c
>>> ===================================================================
>>> --- /dev/null   2017-07-27 10:25:31.671280760 +0100
>>> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-3.c 2017-07-27
>> 13:10:33.022912357 +0100
>>> @@ -0,0 +1,120 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target vect_int } */
>>> +/* { dg-additional-options "--param
>> vect-max-version-for-alias-checks=0 -fopenmp-simd" } */
>>> +
>>> +/* Intended to be larger than any VF.  */
>>> +#define GAP 128
>>> +#define N (GAP * 3)
>>> +
>>> +struct s { int x[N + 1]; };
>>> +struct t { struct s x[N + 1]; };
>>> +struct u { int x[N + 1]; int y; };
>>> +struct v { struct s s; };
>>> +
>>> +void
>>> +f1 (struct s *a, struct s *b)
>>> +{
>>> +  for (int i = 0; i < N; ++i)
>>> +    a->x[i] += b->x[i];
>>> +}
>>> +
>>> +void
>>> +f2 (struct s *a, struct s *b)
>>> +{
>>> +  for (int i = 0; i < N; ++i)
>>> +    a[1].x[i] += b[2].x[i];
>>> +}
>>> +
>>> +void
>>> +f3 (struct s *a, struct s *b)
>>> +{
>>> +  for (int i = 0; i < N; ++i)
>>> +    a[1].x[i] += b[i].x[i];
>>> +}
>>> +
>>> +void
>>> +f4 (struct s *a, struct s *b)
>>> +{
>>> +  for (int i = 0; i < N; ++i)
>>> +    a[i].x[i] += b[i].x[i];
>>> +}
>>> +
>>> +void
>>> +f5 (struct s *a, struct s *b)
>>> +{
>>> +  for (int i = 0; i < N; ++i)
>>> +    a->x[i] += b->x[i + 1];
>>> +}
>>> +
>>> +void
>>> +f6 (struct s *a, struct s *b)
>>> +{
>>> +  for (int i = 0; i < N; ++i)
>>> +    a[1].x[i] += b[2].x[i + 1];
>>> +}
>>> +
>>> +void
>>> +f7 (struct s *a, struct s *b)
>>> +{
>>> +  for (int i = 0; i < N; ++i)
>>> +    a[1].x[i] += b[i].x[i + 1];
>>> +}
>>> +
>>> +void
>>> +f8 (struct s *a, struct s *b)
>>> +{
>>> +  for (int i = 0; i < N; ++i)
>>> +    a[i].x[i] += b[i].x[i + 1];
>>> +}
>>> +
>>> +void
>>> +f9 (struct s *a, struct t *b)
>>> +{
>>> +  for (int i = 0; i < N; ++i)
>>> +    a->x[i] += b->x[1].x[i];
>>> +}
>>> +
>>> +void
>>> +f10 (struct s *a, struct t *b)
>>> +{
>>> +  for (int i = 0; i < N; ++i)
>>> +    a->x[i] += b->x[i].x[i];
>>> +}
>>> +
>>> +void
>>> +f11 (struct u *a, struct u *b)
>>> +{
>>> +  for (int i = 0; i < N; ++i)
>>> +    a->x[i] += b->x[i] + b[i].y;
>>> +}
>>> +
>>> +void
>>> +f12 (struct s *a, struct s *b)
>>> +{
>>> +  for (int i = 0; i < GAP; ++i)
>>> +    a->x[i + GAP] += b->x[i];
>>> +}
>>> +
>>> +void
>>> +f13 (struct s *a, struct s *b)
>>> +{
>>> +  for (int i = 0; i < GAP * 2; ++i)
>>> +    a->x[i + GAP] += b->x[i];
>>> +}
>>> +
>>> +void
>>> +f14 (struct v *a, struct s *b)
>>> +{
>>> +  for (int i = 0; i < N; ++i)
>>> +    a->s.x[i] = b->x[i];
>>> +}
>>> +
>>> +void
>>> +f15 (struct s *a, struct s *b)
>>> +{
>>> +  #pragma omp simd safelen(N)
>>> +  for (int i = 0; i < N; ++i)
>>> +    a->x[i + 1] += b->x[i];
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 15 "vect" } } */
>>> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-4.c
>>> ===================================================================
>>> --- /dev/null   2017-07-27 10:25:31.671280760 +0100
>>> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-4.c 2017-07-27
>> 13:10:33.022912357 +0100
>>> @@ -0,0 +1,35 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target vect_int } */
>>> +/* { dg-additional-options "--param vect-max-version-for-alias-checks=0" } */
>>> +
>>> +#define N 16
>>> +
>>> +struct s1 { int a[N]; };
>>> +struct s2 { struct s1 b; int c; };
>>> +struct s3 { int d; struct s1 e; };
>>> +union u { struct s2 f; struct s3 g; };
>>> +
>>> +/* We allow a and b to overlap arbitrarily.  */
>>> +
>>> +void
>>> +f1 (int a[][N], int b[][N])
>>> +{
>>> +  for (int i = 0; i < N; ++i)
>>> +    a[0][i] += b[0][i];
>>> +}
>>> +
>>> +void
>>> +f2 (union u *a, union u *b)
>>> +{
>>> +  for (int i = 0; i < N; ++i)
>>> +    a->f.b.a[i] += b->g.e.a[i];
>>> +}
>>> +
>>> +void
>>> +f3 (struct s1 *a, struct s1 *b)
>>> +{
>>> +  for (int i = 0; i < N - 1; ++i)
>>> +    a->a[i + 1] += b->a[i];
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
>>> Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c
>>> ===================================================================
>>> --- /dev/null   2017-07-27 10:25:31.671280760 +0100
>>> +++ gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c 2017-07-27
>> 13:10:33.022912357 +0100
>>> @@ -0,0 +1,19 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target vect_int } */
>>> +
>>> +/* Intended to be larger than any VF.  */
>>> +#define GAP 128
>>> +#define N (GAP * 3)
>>> +
>>> +struct s { int x[N]; };
>>> +
>>> +void
>>> +f1 (struct s *a, struct s *b)
>>> +{
>>> +  for (int i = 0; i < GAP * 2; ++i)
>>> +    a->x[i + GAP] += b->x[i];
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump-times "consider run-time aliasing" 1
>> "vect" } } */
>>> +/* { dg-final { scan-tree-dump-times "improved number of alias checks
>> from 1 to 0" 1 "vect" } } */
>>> +/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 1 "vect" } } */

Index: gcc/tree-data-ref.h
===================================================================
--- gcc/tree-data-ref.h	2017-07-27 13:10:29.620045506 +0100
+++ gcc/tree-data-ref.h	2017-07-27 13:10:33.023912613 +0100
@@ -260,6 +260,9 @@  struct conflict_function
 
 struct subscript
 {
+  /* The access functions of the two references.  */
+  tree access_fn[2];
+
   /* A description of the iterations for which the elements are
      accessed twice.  */
   conflict_function *conflicting_iterations_in_a;
@@ -278,6 +281,7 @@  struct subscript
 
 typedef struct subscript *subscript_p;
 
+#define SUB_ACCESS_FN(SUB, I) (SUB)->access_fn[I]
 #define SUB_CONFLICTS_IN_A(SUB) (SUB)->conflicting_iterations_in_a
 #define SUB_CONFLICTS_IN_B(SUB) (SUB)->conflicting_iterations_in_b
 #define SUB_LAST_CONFLICT(SUB) (SUB)->last_conflict
@@ -333,6 +337,33 @@  struct data_dependence_relation
   /* Set to true when the dependence relation is on the same data
      access.  */
   bool self_reference_p;
+
+  /* True if the dependence described is conservatively correct rather
+     than exact, and if it is still possible for the accesses to be
+     conditionally independent.  For example, the a and b references in:
+
+       struct s *a, *b;
+       for (int i = 0; i < n; ++i)
+         a->f[i] += b->f[i];
+
+     conservatively have a distance vector of (0), for the case in which
+     a == b, but the accesses are independent if a != b.  Similarly,
+     the a and b references in:
+
+       struct s *a, *b;
+       for (int i = 0; i < n; ++i)
+         a[0].f[i] += b[i].f[i];
+
+     conservatively have a distance vector of (0), but they are indepenent
+     when a != b + i.  In contrast, the references in:
+
+       struct s *a;
+       for (int i = 0; i < n; ++i)
+         a->f[i] += a->f[i];
+
+     have the same distance vector of (0), but the accesses can never be
+     independent.  */
+  bool could_be_independent_p;
 };
 
 typedef struct data_dependence_relation *ddr_p;
@@ -363,6 +394,7 @@  #define DDR_DIR_VECT(DDR, I) \
 #define DDR_DIST_VECT(DDR, I) \
   DDR_DIST_VECTS (DDR)[I]
 #define DDR_REVERSED_P(DDR) (DDR)->reversed_p
+#define DDR_COULD_BE_INDEPENDENT_P(DDR) (DDR)->could_be_independent_p
 
 
 bool dr_analyze_innermost (innermost_loop_behavior *, tree, struct loop *);
@@ -457,22 +489,6 @@  same_data_refs (data_reference_p a, data
       return false;
 
   return true;
-}
-
-/* Return true when the DDR contains two data references that have the
-   same access functions.  */
-
-static inline bool
-same_access_functions (const struct data_dependence_relation *ddr)
-{
-  unsigned i;
-
-  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
-    if (!eq_evolutions_p (DR_ACCESS_FN (DDR_A (ddr), i),
-			  DR_ACCESS_FN (DDR_B (ddr), i)))
-      return false;
-
-  return true;
 }
 
 /* Returns true when all the dependences are computable.  */
Index: gcc/tree-data-ref.c
===================================================================
--- gcc/tree-data-ref.c	2017-07-27 13:10:29.620045506 +0100
+++ gcc/tree-data-ref.c	2017-07-27 13:10:33.023912613 +0100
@@ -124,8 +124,7 @@  Software Foundation; either version 3, o
 } dependence_stats;
 
 static bool subscript_dependence_tester_1 (struct data_dependence_relation *,
-					   struct data_reference *,
-					   struct data_reference *,
+					   unsigned int, unsigned int,
 					   struct loop *);
 /* Returns true iff A divides B.  */
 
@@ -145,6 +144,21 @@  int_divides_p (int a, int b)
   return ((b % a) == 0);
 }
 
+/* Return true if reference REF contains a union access.  */
+
+static bool
+ref_contains_union_access_p (tree ref)
+{
+  while (handled_component_p (ref))
+    {
+      ref = TREE_OPERAND (ref, 0);
+      if (TREE_CODE (TREE_TYPE (ref)) == UNION_TYPE
+	  || TREE_CODE (TREE_TYPE (ref)) == QUAL_UNION_TYPE)
+	return true;
+    }
+  return false;
+}
+
 
 
 /* Dump into FILE all the data references from DATAREFS.  */
@@ -434,13 +448,14 @@  dump_data_dependence_relation (FILE *out
       unsigned int i;
       struct loop *loopi;
 
-      for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
+      subscript *sub;
+      FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
 	{
 	  fprintf (outf, "  access_fn_A: ");
-	  print_generic_stmt (outf, DR_ACCESS_FN (dra, i));
+	  print_generic_stmt (outf, SUB_ACCESS_FN (sub, 0));
 	  fprintf (outf, "  access_fn_B: ");
-	  print_generic_stmt (outf, DR_ACCESS_FN (drb, i));
-	  dump_subscript (outf, DDR_SUBSCRIPT (ddr, i));
+	  print_generic_stmt (outf, SUB_ACCESS_FN (sub, 1));
+	  dump_subscript (outf, sub);
 	}
 
       fprintf (outf, "  inner loop index: %d\n", DDR_INNER_LOOP (ddr));
@@ -920,6 +935,27 @@  dr_analyze_innermost (innermost_loop_beh
   return true;
 }
 
+/* Return true if OP is a valid component reference for a DR access
+   function.  This accepts a subset of what handled_component_p accepts.  */
+
+static bool
+access_fn_component_p (tree op)
+{
+  switch (TREE_CODE (op))
+    {
+    case REALPART_EXPR:
+    case IMAGPART_EXPR:
+    case ARRAY_REF:
+      return true;
+
+    case COMPONENT_REF:
+      return TREE_CODE (TREE_TYPE (TREE_OPERAND (op, 0))) == RECORD_TYPE;
+
+    default:
+      return false;
+    }
+}
+
 /* Determines the base object and the list of indices of memory reference
    DR, analyzed in LOOP and instantiated in loop nest NEST.  */
 
@@ -957,7 +993,9 @@  dr_analyze_indices (struct data_referenc
       access_fns.safe_push (integer_one_node);
     }
 
-  /* Analyze access functions of dimensions we know to be independent.  */
+  /* Analyze access functions of dimensions we know to be independent.
+     The list of component references handled here should be kept in
+     sync with access_fn_component_p.  */
   while (handled_component_p (ref))
     {
       if (TREE_CODE (ref) == ARRAY_REF)
@@ -2148,6 +2186,38 @@  dr_may_alias_p (const struct data_refere
   return refs_may_alias_p (addr_a, addr_b);
 }
 
+/* REF_A and REF_B both satisfy access_fn_component_p.  Return true
+   if it is meaningful to compare their associated access functions
+   when checking for dependencies.  */
+
+static bool
+access_fn_components_comparable_p (tree ref_a, tree ref_b)
+{
+  /* Allow pairs of component refs from the following sets:
+
+       { REALPART_EXPR, IMAGPART_EXPR }
+       { COMPONENT_REF }
+       { ARRAY_REF }.  */
+  tree_code code_a = TREE_CODE (ref_a);
+  tree_code code_b = TREE_CODE (ref_b);
+  if (code_a == IMAGPART_EXPR)
+    code_a = REALPART_EXPR;
+  if (code_b == IMAGPART_EXPR)
+    code_b = REALPART_EXPR;
+  if (code_a != code_b)
+    return false;
+
+  if (TREE_CODE (ref_a) == COMPONENT_REF)
+    /* ??? We cannot simply use the type of operand #0 of the refs here as
+       the Fortran compiler smuggles type punning into COMPONENT_REFs.
+       Use the DECL_CONTEXT of the FIELD_DECLs instead.  */
+    return (DECL_CONTEXT (TREE_OPERAND (ref_a, 1))
+	    == DECL_CONTEXT (TREE_OPERAND (ref_b, 1)));
+
+  return types_compatible_p (TREE_TYPE (TREE_OPERAND (ref_a, 0)),
+			     TREE_TYPE (TREE_OPERAND (ref_b, 0)));
+}
+
 /* Initialize a data dependence relation between data accesses A and
    B.  NB_LOOPS is the number of loops surrounding the references: the
    size of the classic distance/direction vectors.  */
@@ -2160,11 +2230,10 @@  initialize_data_dependence_relation (str
   struct data_dependence_relation *res;
   unsigned int i;
 
-  res = XNEW (struct data_dependence_relation);
+  res = XCNEW (struct data_dependence_relation);
   DDR_A (res) = a;
   DDR_B (res) = b;
   DDR_LOOP_NEST (res).create (0);
-  DDR_REVERSED_P (res) = false;
   DDR_SUBSCRIPTS (res).create (0);
   DDR_DIR_VECTS (res).create (0);
   DDR_DIST_VECTS (res).create (0);
@@ -2182,82 +2251,277 @@  initialize_data_dependence_relation (str
       return res;
     }
 
-  /* The case where the references are exactly the same.  */
-  if (operand_equal_p (DR_REF (a), DR_REF (b), 0))
+  unsigned int num_dimensions_a = DR_NUM_DIMENSIONS (a);
+  unsigned int num_dimensions_b = DR_NUM_DIMENSIONS (b);
+  if (num_dimensions_a == 0 || num_dimensions_b == 0)
     {
-      if ((loop_nest.exists ()
-	   && !object_address_invariant_in_loop_p (loop_nest[0],
-						   DR_BASE_OBJECT (a)))
-	  || DR_NUM_DIMENSIONS (a) == 0)
+      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
+      return res;
+    }
+
+  /* For unconstrained bases, the root (highest-indexed) subscript
+     describes a variation in the base of the original DR_REF rather
+     than a component access.  We have no type that accurately describes
+     the new DR_BASE_OBJECT (whose TREE_TYPE describes the type *after*
+     applying this subscript) so limit the search to the last real
+     component access.
+
+     E.g. for:
+
+	void
+	f (int a[][8], int b[][8])
+	{
+	  for (int i = 0; i < 8; ++i)
+	    a[i * 2][0] = b[i][0];
+	}
+
+     the a and b accesses have a single ARRAY_REF component reference [0]
+     but have two subscripts.  */
+  if (DR_UNCONSTRAINED_BASE (a))
+    num_dimensions_a -= 1;
+  if (DR_UNCONSTRAINED_BASE (b))
+    num_dimensions_b -= 1;
+
+  /* These structures describe sequences of component references in
+     DR_REF (A) and DR_REF (B).  Each component reference is tied to a
+     specific access function.  */
+  struct {
+    /* The sequence starts at DR_ACCESS_FN (A, START_A) of A and
+       DR_ACCESS_FN (B, START_B) of B (inclusive) and extends to higher
+       indices.  In C notation, these are the indices of the rightmost
+       component references; e.g. for a sequence .b.c.d, the start
+       index is for .d.  */
+    unsigned int start_a;
+    unsigned int start_b;
+
+    /* The sequence contains LENGTH consecutive access functions from
+       each DR.  */
+    unsigned int length;
+
+    /* The enclosing objects for the A and B sequences respectively,
+       i.e. the objects to which DR_ACCESS_FN (A, START_A + LENGTH - 1)
+       and DR_ACCESS_FN (B, START_B + LENGTH - 1) are applied.  */
+    tree object_a;
+    tree object_b;
+  } full_seq = {}, struct_seq = {};
+
+  /* Before each iteration of the loop:
+
+     - REF_A is what you get after applying DR_ACCESS_FN (A, INDEX_A) and
+     - REF_B is what you get after applying DR_ACCESS_FN (B, INDEX_B).  */
+  unsigned int index_a = 0;
+  unsigned int index_b = 0;
+  tree ref_a = DR_REF (a);
+  tree ref_b = DR_REF (b);
+
+  /* Now walk the component references from the final DR_REFs back up to
+     the enclosing base objects.  Each component reference corresponds
+     to one access function in the DR, with access function 0 being for
+     the final DR_REF and the highest-indexed access function being the
+     one that is applied to the base of the DR.
+
+     Look for a sequence of component references whose access functions
+     are comparable (see access_fn_components_comparable_p).  If more
+     than one such sequence exists, pick the one nearest the base
+     (which is the leftmost sequence in C notation).  Store this sequence
+     in FULL_SEQ.
+
+     For example, if we have:
+
+	struct foo { struct bar s; ... } (*a)[10], (*b)[10];
+
+	A: a[0][i].s.c.d
+	B: __real b[0][i].s.e[i].f
+
+     (where d is the same type as the real component of f) then the access
+     functions would be:
+
+			 0   1   2   3
+	A:              .d  .c  .s [i]
+
+		 0   1   2   3   4   5
+	B:  __real  .f [i]  .e  .s [i]
+
+     The A0/B2 column isn't comparable, since .d is a COMPONENT_REF
+     and [i] is an ARRAY_REF.  However, the A1/B3 column contains two
+     COMPONENT_REF accesses for struct bar, so is comparable.  Likewise
+     the A2/B4 column contains two COMPONENT_REF accesses for struct foo,
+     so is comparable.  The A3/B5 column contains two ARRAY_REFs that
+     index foo[10] arrays, so is again comparable.  The sequence is
+     therefore:
+
+        A: [1, 3]  (i.e. [i].s.c)
+        B: [3, 5]  (i.e. [i].s.e)
+
+     Also look for sequences of component references whose access
+     functions are comparable and whose enclosing objects have the same
+     RECORD_TYPE.  Store this sequence in STRUCT_SEQ.  In the above
+     example, STRUCT_SEQ would be:
+
+        A: [1, 2]  (i.e. s.c)
+        B: [3, 4]  (i.e. s.e)  */
+  while (index_a < num_dimensions_a && index_b < num_dimensions_b)
+    {
+      /* REF_A and REF_B must be one of the component access types
+	 allowed by dr_analyze_indices.  */
+      gcc_checking_assert (access_fn_component_p (ref_a));
+      gcc_checking_assert (access_fn_component_p (ref_b));
+
+      /* Get the immediately-enclosing objects for REF_A and REF_B,
+	 i.e. the references *before* applying DR_ACCESS_FN (A, INDEX_A)
+	 and DR_ACCESS_FN (B, INDEX_B).  */
+      tree object_a = TREE_OPERAND (ref_a, 0);
+      tree object_b = TREE_OPERAND (ref_b, 0);
+
+      tree type_a = TREE_TYPE (object_a);
+      tree type_b = TREE_TYPE (object_b);
+      if (access_fn_components_comparable_p (ref_a, ref_b))
+	{
+	  /* This pair of component accesses is comparable for dependence
+	     analysis, so we can include DR_ACCESS_FN (A, INDEX_A) and
+	     DR_ACCESS_FN (B, INDEX_B) in the sequence.  */
+	  if (full_seq.start_a + full_seq.length != index_a
+	      || full_seq.start_b + full_seq.length != index_b)
+	    {
+	      /* The accesses don't extend the current sequence,
+		 so start a new one here.  */
+	      full_seq.start_a = index_a;
+	      full_seq.start_b = index_b;
+	      full_seq.length = 0;
+	    }
+
+	  /* Add this pair of references to the sequence.  */
+	  full_seq.length += 1;
+	  full_seq.object_a = object_a;
+	  full_seq.object_b = object_b;
+
+	  /* If the enclosing objects are structures (and thus have the
+	     same RECORD_TYPE), record the new sequence in STRUCT_SEQ.  */
+	  if (TREE_CODE (type_a) == RECORD_TYPE)
+	    struct_seq = full_seq;
+
+	  /* Move to the next containing reference for both A and B.  */
+	  ref_a = object_a;
+	  ref_b = object_b;
+	  index_a += 1;
+	  index_b += 1;
+	  continue;
+	}
+
+      /* Try to approach equal type sizes.  */
+      if (!COMPLETE_TYPE_P (type_a)
+	  || !COMPLETE_TYPE_P (type_b)
+	  || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_a))
+	  || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type_b)))
+	break;
+
+      unsigned HOST_WIDE_INT size_a = tree_to_uhwi (TYPE_SIZE_UNIT (type_a));
+      unsigned HOST_WIDE_INT size_b = tree_to_uhwi (TYPE_SIZE_UNIT (type_b));
+      if (size_a <= size_b)
 	{
-	  DDR_ARE_DEPENDENT (res) = chrec_dont_know;
-	  return res;
+	  index_a += 1;
+	  ref_a = object_a;
+	}
+      if (size_b <= size_a)
+	{
+	  index_b += 1;
+	  ref_b = object_b;
 	}
-      DDR_AFFINE_P (res) = true;
-      DDR_ARE_DEPENDENT (res) = NULL_TREE;
-      DDR_SUBSCRIPTS (res).create (DR_NUM_DIMENSIONS (a));
-      DDR_LOOP_NEST (res) = loop_nest;
-      DDR_INNER_LOOP (res) = 0;
-      DDR_SELF_REFERENCE (res) = true;
-      for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
-       {
-         struct subscript *subscript;
-
-         subscript = XNEW (struct subscript);
-         SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
-         SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
-         SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
-         SUB_DISTANCE (subscript) = chrec_dont_know;
-         DDR_SUBSCRIPTS (res).safe_push (subscript);
-       }
-      return res;
     }
 
-  /* If the references do not access the same object, we do not know
-     whether they alias or not.  We do not care about TBAA or alignment
-     info so we can use OEP_ADDRESS_OF to avoid false negatives.
-     But the accesses have to use compatible types as otherwise the
-     built indices would not match.  */
-  if (!operand_equal_p (DR_BASE_OBJECT (a), DR_BASE_OBJECT (b), OEP_ADDRESS_OF)
-      || !types_compatible_p (TREE_TYPE (DR_BASE_OBJECT (a)),
-			      TREE_TYPE (DR_BASE_OBJECT (b))))
+  /* See whether FULL_SEQ ends at the base and whether the two bases
+     are equal.  We do not care about TBAA or alignment info so we can
+     use OEP_ADDRESS_OF to avoid false negatives.  */
+  tree base_a = DR_BASE_OBJECT (a);
+  tree base_b = DR_BASE_OBJECT (b);
+  bool same_base_p = (full_seq.start_a + full_seq.length == num_dimensions_a
+		      && full_seq.start_b + full_seq.length == num_dimensions_b
+		      && DR_UNCONSTRAINED_BASE (a) == DR_UNCONSTRAINED_BASE (b)
+		      && operand_equal_p (base_a, base_b, OEP_ADDRESS_OF)
+		      && types_compatible_p (TREE_TYPE (base_a),
+					     TREE_TYPE (base_b))
+		      && (!loop_nest.exists ()
+			  || (object_address_invariant_in_loop_p
+			      (loop_nest[0], base_a))));
+
+  /* If the bases are the same, we can include the base variation too.
+     E.g. the b accesses in:
+
+       for (int i = 0; i < n; ++i)
+         b[i + 4][0] = b[i][0];
+
+     have a definite dependence distance of 4, while for:
+
+       for (int i = 0; i < n; ++i)
+         a[i + 4][0] = b[i][0];
+
+     the dependence distance depends on the gap between a and b.
+
+     If the bases are different then we can only rely on the sequence
+     rooted at a structure access, since arrays are allowed to overlap
+     arbitrarily and change shape arbitrarily.  E.g. we treat this as
+     valid code:
+
+       int a[256];
+       ...
+       ((int (*)[4][3]) &a[1])[i][0] += ((int (*)[4][3]) &a[2])[i][0];
+
+     where two lvalues with the same int[4][3] type overlap, and where
+     both lvalues are distinct from the object's declared type.  */
+  if (same_base_p)
     {
-      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
-      return res;
+      if (DR_UNCONSTRAINED_BASE (a))
+	full_seq.length += 1;
     }
+  else
+    full_seq = struct_seq;
 
-  /* If the base of the object is not invariant in the loop nest, we cannot
-     analyze it.  TODO -- in fact, it would suffice to record that there may
-     be arbitrary dependences in the loops where the base object varies.  */
-  if ((loop_nest.exists ()
-       && !object_address_invariant_in_loop_p (loop_nest[0], DR_BASE_OBJECT (a)))
-      || DR_NUM_DIMENSIONS (a) == 0)
+  /* Punt if we didn't find a suitable sequence.  */
+  if (full_seq.length == 0)
     {
       DDR_ARE_DEPENDENT (res) = chrec_dont_know;
       return res;
     }
 
-  /* If the number of dimensions of the access to not agree we can have
-     a pointer access to a component of the array element type and an
-     array access while the base-objects are still the same.  Punt.  */
-  if (DR_NUM_DIMENSIONS (a) != DR_NUM_DIMENSIONS (b))
+  if (!same_base_p)
     {
-      DDR_ARE_DEPENDENT (res) = chrec_dont_know;
-      return res;
+      /* Partial overlap is possible for different bases when strict aliasing
+	 is not in effect.  It's also possible if either base involves a union
+	 access; e.g. for:
+
+	   struct s1 { int a[2]; };
+	   struct s2 { struct s1 b; int c; };
+	   struct s3 { int d; struct s1 e; };
+	   union u { struct s2 f; struct s3 g; } *p, *q;
+
+	 the s1 at "p->f.b" (base "p->f") partially overlaps the s1 at
+	 "p->g.e" (base "p->g") and might partially overlap the s1 at
+	 "q->g.e" (base "q->g").  */
+      if (!flag_strict_aliasing
+	  || ref_contains_union_access_p (full_seq.object_a)
+	  || ref_contains_union_access_p (full_seq.object_b))
+	{
+	  DDR_ARE_DEPENDENT (res) = chrec_dont_know;
+	  return res;
+	}
+
+      DDR_COULD_BE_INDEPENDENT_P (res) = true;
     }
 
   DDR_AFFINE_P (res) = true;
   DDR_ARE_DEPENDENT (res) = NULL_TREE;
-  DDR_SUBSCRIPTS (res).create (DR_NUM_DIMENSIONS (a));
+  DDR_SUBSCRIPTS (res).create (full_seq.length);
   DDR_LOOP_NEST (res) = loop_nest;
   DDR_INNER_LOOP (res) = 0;
   DDR_SELF_REFERENCE (res) = false;
 
-  for (i = 0; i < DR_NUM_DIMENSIONS (a); i++)
+  for (i = 0; i < full_seq.length; ++i)
     {
       struct subscript *subscript;
 
       subscript = XNEW (struct subscript);
+      SUB_ACCESS_FN (subscript, 0) = DR_ACCESS_FN (a, full_seq.start_a + i);
+      SUB_ACCESS_FN (subscript, 1) = DR_ACCESS_FN (b, full_seq.start_b + i);
       SUB_CONFLICTS_IN_A (subscript) = conflict_fn_not_known ();
       SUB_CONFLICTS_IN_B (subscript) = conflict_fn_not_known ();
       SUB_LAST_CONFLICT (subscript) = chrec_dont_know;
@@ -3839,14 +4103,15 @@  add_outer_distances (struct data_depende
 }
 
 /* Return false when fail to represent the data dependence as a
-   distance vector.  INIT_B is set to true when a component has been
+   distance vector.  A_INDEX is the index of the first reference
+   (0 for DDR_A, 1 for DDR_B) and B_INDEX is the index of the
+   second reference.  INIT_B is set to true when a component has been
    added to the distance vector DIST_V.  INDEX_CARRY is then set to
    the index in DIST_V that carries the dependence.  */
 
 static bool
 build_classic_dist_vector_1 (struct data_dependence_relation *ddr,
-			     struct data_reference *ddr_a,
-			     struct data_reference *ddr_b,
+			     unsigned int a_index, unsigned int b_index,
 			     lambda_vector dist_v, bool *init_b,
 			     int *index_carry)
 {
@@ -3864,8 +4129,8 @@  build_classic_dist_vector_1 (struct data
 	  return false;
 	}
 
-      access_fn_a = DR_ACCESS_FN (ddr_a, i);
-      access_fn_b = DR_ACCESS_FN (ddr_b, i);
+      access_fn_a = SUB_ACCESS_FN (subscript, a_index);
+      access_fn_b = SUB_ACCESS_FN (subscript, b_index);
 
       if (TREE_CODE (access_fn_a) == POLYNOMIAL_CHREC
 	  && TREE_CODE (access_fn_b) == POLYNOMIAL_CHREC)
@@ -3925,10 +4190,11 @@  build_classic_dist_vector_1 (struct data
 constant_access_functions (const struct data_dependence_relation *ddr)
 {
   unsigned i;
+  subscript *sub;
 
-  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
-    if (!evolution_function_is_constant_p (DR_ACCESS_FN (DDR_A (ddr), i))
-	|| !evolution_function_is_constant_p (DR_ACCESS_FN (DDR_B (ddr), i)))
+  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
+    if (!evolution_function_is_constant_p (SUB_ACCESS_FN (sub, 0))
+	|| !evolution_function_is_constant_p (SUB_ACCESS_FN (sub, 1)))
       return false;
 
   return true;
@@ -3991,10 +4257,11 @@  add_other_self_distances (struct data_de
   lambda_vector dist_v;
   unsigned i;
   int index_carry = DDR_NB_LOOPS (ddr);
+  subscript *sub;
 
-  for (i = 0; i < DDR_NUM_SUBSCRIPTS (ddr); i++)
+  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
     {
-      tree access_fun = DR_ACCESS_FN (DDR_A (ddr), i);
+      tree access_fun = SUB_ACCESS_FN (sub, 0);
 
       if (TREE_CODE (access_fun) == POLYNOMIAL_CHREC)
 	{
@@ -4006,7 +4273,7 @@  add_other_self_distances (struct data_de
 		  return;
 		}
 
-	      access_fun = DR_ACCESS_FN (DDR_A (ddr), 0);
+	      access_fun = SUB_ACCESS_FN (DDR_SUBSCRIPT (ddr, 0), 0);
 
 	      if (TREE_CODE (CHREC_LEFT (access_fun)) == POLYNOMIAL_CHREC)
 		add_multivariate_self_dist (ddr, access_fun);
@@ -4077,6 +4344,23 @@  add_distance_for_zero_overlaps (struct d
     }
 }
 
+/* Return true when the DDR contains two data references that have the
+   same access functions.  */
+
+static inline bool
+same_access_functions (const struct data_dependence_relation *ddr)
+{
+  unsigned i;
+  subscript *sub;
+
+  FOR_EACH_VEC_ELT (DDR_SUBSCRIPTS (ddr), i, sub)
+    if (!eq_evolutions_p (SUB_ACCESS_FN (sub, 0),
+			  SUB_ACCESS_FN (sub, 1)))
+      return false;
+
+  return true;
+}
+
 /* Compute the classic per loop distance vector.  DDR is the data
    dependence relation to build a vector from.  Return false when fail
    to represent the data dependence as a distance vector.  */
@@ -4108,8 +4392,7 @@  build_classic_dist_vector (struct data_d
     }
 
   dist_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
-  if (!build_classic_dist_vector_1 (ddr, DDR_A (ddr), DDR_B (ddr),
-				    dist_v, &init_b, &index_carry))
+  if (!build_classic_dist_vector_1 (ddr, 0, 1, dist_v, &init_b, &index_carry))
     return false;
 
   /* Save the distance vector if we initialized one.  */
@@ -4142,12 +4425,11 @@  build_classic_dist_vector (struct data_d
       if (!lambda_vector_lexico_pos (dist_v, DDR_NB_LOOPS (ddr)))
 	{
 	  lambda_vector save_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
-	  if (!subscript_dependence_tester_1 (ddr, DDR_B (ddr), DDR_A (ddr),
-					      loop_nest))
+	  if (!subscript_dependence_tester_1 (ddr, 1, 0, loop_nest))
 	    return false;
 	  compute_subscript_distance (ddr);
-	  if (!build_classic_dist_vector_1 (ddr, DDR_B (ddr), DDR_A (ddr),
-					    save_v, &init_b, &index_carry))
+	  if (!build_classic_dist_vector_1 (ddr, 1, 0, save_v, &init_b,
+					    &index_carry))
 	    return false;
 	  save_dist_v (ddr, save_v);
 	  DDR_REVERSED_P (ddr) = true;
@@ -4183,12 +4465,10 @@  build_classic_dist_vector (struct data_d
 	    {
 	      lambda_vector opposite_v = lambda_vector_new (DDR_NB_LOOPS (ddr));
 
-	      if (!subscript_dependence_tester_1 (ddr, DDR_B (ddr),
-						  DDR_A (ddr), loop_nest))
+	      if (!subscript_dependence_tester_1 (ddr, 1, 0, loop_nest))
 		return false;
 	      compute_subscript_distance (ddr);
-	      if (!build_classic_dist_vector_1 (ddr, DDR_B (ddr), DDR_A (ddr),
-						opposite_v, &init_b,
+	      if (!build_classic_dist_vector_1 (ddr, 1, 0, opposite_v, &init_b,
 						&index_carry))
 		return false;
 
@@ -4267,13 +4547,13 @@  build_classic_dir_vector (struct data_de
     }
 }
 
-/* Helper function.  Returns true when there is a dependence between
-   data references DRA and DRB.  */
+/* Helper function.  Returns true when there is a dependence between the
+   data references.  A_INDEX is the index of the first reference (0 for
+   DDR_A, 1 for DDR_B) and B_INDEX is the index of the second reference.  */
 
 static bool
 subscript_dependence_tester_1 (struct data_dependence_relation *ddr,
-			       struct data_reference *dra,
-			       struct data_reference *drb,
+			       unsigned int a_index, unsigned int b_index,
 			       struct loop *loop_nest)
 {
   unsigned int i;
@@ -4285,8 +4565,8 @@  subscript_dependence_tester_1 (struct da
     {
       conflict_function *overlaps_a, *overlaps_b;
 
-      analyze_overlapping_iterations (DR_ACCESS_FN (dra, i),
-				      DR_ACCESS_FN (drb, i),
+      analyze_overlapping_iterations (SUB_ACCESS_FN (subscript, a_index),
+				      SUB_ACCESS_FN (subscript, b_index),
 				      &overlaps_a, &overlaps_b,
 				      &last_conflicts, loop_nest);
 
@@ -4335,7 +4615,7 @@  subscript_dependence_tester_1 (struct da
 subscript_dependence_tester (struct data_dependence_relation *ddr,
 			     struct loop *loop_nest)
 {
-  if (subscript_dependence_tester_1 (ddr, DDR_A (ddr), DDR_B (ddr), loop_nest))
+  if (subscript_dependence_tester_1 (ddr, 0, 1, loop_nest))
     dependence_stats.num_dependence_dependent++;
 
   compute_subscript_distance (ddr);
Index: gcc/tree-ssa-loop-prefetch.c
===================================================================
--- gcc/tree-ssa-loop-prefetch.c	2017-07-27 13:10:29.620045506 +0100
+++ gcc/tree-ssa-loop-prefetch.c	2017-07-27 13:10:33.023912613 +0100
@@ -1668,6 +1668,7 @@  determine_loop_nest_reuse (struct loop *
       refb = (struct mem_ref *) DDR_B (dep)->aux;
 
       if (DDR_ARE_DEPENDENT (dep) == chrec_dont_know
+	  || DDR_COULD_BE_INDEPENDENT_P (dep)
 	  || DDR_NUM_DIST_VECTS (dep) == 0)
 	{
 	  /* If the dependence cannot be analyzed, assume that there might be
Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2017-07-27 13:10:29.620045506 +0100
+++ gcc/tree-vectorizer.h	2017-07-27 13:10:33.024912868 +0100
@@ -358,7 +358,7 @@  #define LOOP_VINFO_ORIG_LOOP_INFO(L)
 #define LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT(L)	\
   ((L)->may_misalign_stmts.length () > 0)
 #define LOOP_REQUIRES_VERSIONING_FOR_ALIAS(L)		\
-  ((L)->may_alias_ddrs.length () > 0)
+  ((L)->comp_alias_ddrs.length () > 0)
 #define LOOP_REQUIRES_VERSIONING_FOR_NITERS(L)		\
   (LOOP_VINFO_NITERS_ASSUMPTIONS (L))
 #define LOOP_REQUIRES_VERSIONING(L)			\
Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c	2017-07-27 13:10:29.620045506 +0100
+++ gcc/tree-vect-data-refs.c	2017-07-27 13:10:33.024912868 +0100
@@ -160,6 +160,60 @@  vect_mark_for_runtime_alias_test (ddr_p
 }
 
 
+/* A subroutine of vect_analyze_data_ref_dependence.  Handle
+   DDR_COULD_BE_INDEPENDENT_P ddr DDR that has a known set of dependence
+   distances.  These distances are conservatively correct but they don't
+   reflect a guaranteed dependence.
+
+   Return true if this function does all the work necessary to avoid
+   an alias or false if the caller should use the dependence distances
+   to limit the vectorization factor in the usual way.  LOOP_DEPTH is
+   the depth of the loop described by LOOP_VINFO and the other arguments
+   are as for vect_analyze_data_ref_dependence.  */
+
+static bool
+vect_analyze_possibly_independent_ddr (data_dependence_relation *ddr,
+				       loop_vec_info loop_vinfo,
+				       int loop_depth, int *max_vf)
+{
+  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
+  lambda_vector dist_v;
+  unsigned int i;
+  FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
+    {
+      int dist = dist_v[loop_depth];
+      if (dist != 0 && !(dist > 0 && DDR_REVERSED_P (ddr)))
+	{
+	  /* If the user asserted safelen >= DIST consecutive iterations
+	     can be executed concurrently, assume independence.
+
+	     ??? An alternative would be to add the alias check even
+	     in this case, and vectorize the fallback loop with the
+	     maximum VF set to safelen.  However, if the user has
+	     explicitly given a length, it's less likely that that
+	     would be a win.  */
+	  if (loop->safelen >= 2 && abs_hwi (dist) <= loop->safelen)
+	    {
+	      if (loop->safelen < *max_vf)
+		*max_vf = loop->safelen;
+	      LOOP_VINFO_NO_DATA_DEPENDENCIES (loop_vinfo) = false;
+	      continue;
+	    }
+
+	  /* For dependence distances of 2 or more, we have the option
+	     of limiting VF or checking for an alias at runtime.
+	     Prefer to check at runtime if we can, to avoid limiting
+	     the VF unnecessarily when the bases are in fact independent.
+
+	     Note that the alias checks will be removed if the VF ends up
+	     being small enough.  */
+	  return vect_mark_for_runtime_alias_test (ddr, loop_vinfo);
+	}
+    }
+  return true;
+}
+
+
 /* Function vect_analyze_data_ref_dependence.
 
    Return TRUE if there (might) exist a dependence between a memory-reference
@@ -305,6 +359,12 @@  vect_analyze_data_ref_dependence (struct
     }
 
   loop_depth = index_in_loop_nest (loop->num, DDR_LOOP_NEST (ddr));
+
+  if (DDR_COULD_BE_INDEPENDENT_P (ddr)
+      && vect_analyze_possibly_independent_ddr (ddr, loop_vinfo,
+						loop_depth, max_vf))
+    return false;
+
   FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
     {
       int dist = dist_v[loop_depth];
@@ -2878,6 +2938,44 @@  vect_no_alias_p (struct data_reference *
   return false;
 }
 
+/* Return true if the minimum nonzero dependence distance for loop LOOP_DEPTH
+   in DDR is >= VF.  */
+
+static bool
+dependence_distance_ge_vf (data_dependence_relation *ddr,
+			   unsigned int loop_depth, unsigned HOST_WIDE_INT vf)
+{
+  if (DDR_ARE_DEPENDENT (ddr) != NULL_TREE
+      || DDR_NUM_DIST_VECTS (ddr) == 0)
+    return false;
+
+  /* If the dependence is exact, we should have limited the VF instead.  */
+  gcc_checking_assert (DDR_COULD_BE_INDEPENDENT_P (ddr));
+
+  unsigned int i;
+  lambda_vector dist_v;
+  FOR_EACH_VEC_ELT (DDR_DIST_VECTS (ddr), i, dist_v)
+    {
+      HOST_WIDE_INT dist = dist_v[loop_depth];
+      if (dist != 0
+	  && !(dist > 0 && DDR_REVERSED_P (ddr))
+	  && (unsigned HOST_WIDE_INT) abs_hwi (dist) < vf)
+	return false;
+    }
+
+  if (dump_enabled_p ())
+    {
+      dump_printf_loc (MSG_NOTE, vect_location,
+		       "dependence distance between ");
+      dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (DDR_A (ddr)));
+      dump_printf (MSG_NOTE,  " and ");
+      dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (DDR_B (ddr)));
+      dump_printf (MSG_NOTE,  " is >= VF\n");
+    }
+
+  return true;
+}
+
 /* Function vect_prune_runtime_alias_test_list.
 
    Prune a list of ddrs to be tested at run-time by versioning for alias.
@@ -2908,6 +3006,10 @@  vect_prune_runtime_alias_test_list (loop
 
   comp_alias_ddrs.create (may_alias_ddrs.length ());
 
+  unsigned int loop_depth
+    = index_in_loop_nest (LOOP_VINFO_LOOP (loop_vinfo)->num,
+			  LOOP_VINFO_LOOP_NEST (loop_vinfo));
+
   /* First, we collect all data ref pairs for aliasing checks.  */
   FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
     {
@@ -2917,6 +3019,11 @@  vect_prune_runtime_alias_test_list (loop
       tree segment_length_a, segment_length_b;
       gimple *stmt_a, *stmt_b;
 
+      /* Ignore the alias if the VF we chose ended up being no greater
+	 than the dependence distance.  */
+      if (dependence_distance_ge_vf (ddr, loop_depth, vect_factor))
+	continue;
+
       dr_a = DDR_A (ddr);
       stmt_a = DR_STMT (DDR_A (ddr));
       dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
@@ -2993,10 +3100,6 @@  vect_prune_runtime_alias_test_list (loop
       return false;
     }
 
-  /* All alias checks have been resolved at compilation time.  */
-  if (!comp_alias_ddrs.length ())
-    LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo).truncate (0);
-
   return true;
 }
 
Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-3.c
===================================================================
--- /dev/null	2017-07-27 10:25:31.671280760 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-alias-check-3.c	2017-07-27 13:10:33.022912357 +0100
@@ -0,0 +1,120 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "--param vect-max-version-for-alias-checks=0 -fopenmp-simd" } */
+
+/* Intended to be larger than any VF.  */
+#define GAP 128
+#define N (GAP * 3)
+
+struct s { int x[N + 1]; };
+struct t { struct s x[N + 1]; };
+struct u { int x[N + 1]; int y; };
+struct v { struct s s; };
+
+void
+f1 (struct s *a, struct s *b)
+{
+  for (int i = 0; i < N; ++i)
+    a->x[i] += b->x[i];
+}
+
+void
+f2 (struct s *a, struct s *b)
+{
+  for (int i = 0; i < N; ++i)
+    a[1].x[i] += b[2].x[i];
+}
+
+void
+f3 (struct s *a, struct s *b)
+{
+  for (int i = 0; i < N; ++i)
+    a[1].x[i] += b[i].x[i];
+}
+
+void
+f4 (struct s *a, struct s *b)
+{
+  for (int i = 0; i < N; ++i)
+    a[i].x[i] += b[i].x[i];
+}
+
+void
+f5 (struct s *a, struct s *b)
+{
+  for (int i = 0; i < N; ++i)
+    a->x[i] += b->x[i + 1];
+}
+
+void
+f6 (struct s *a, struct s *b)
+{
+  for (int i = 0; i < N; ++i)
+    a[1].x[i] += b[2].x[i + 1];
+}
+
+void
+f7 (struct s *a, struct s *b)
+{
+  for (int i = 0; i < N; ++i)
+    a[1].x[i] += b[i].x[i + 1];
+}
+
+void
+f8 (struct s *a, struct s *b)
+{
+  for (int i = 0; i < N; ++i)
+    a[i].x[i] += b[i].x[i + 1];
+}
+
+void
+f9 (struct s *a, struct t *b)
+{
+  for (int i = 0; i < N; ++i)
+    a->x[i] += b->x[1].x[i];
+}
+
+void
+f10 (struct s *a, struct t *b)
+{
+  for (int i = 0; i < N; ++i)
+    a->x[i] += b->x[i].x[i];
+}
+
+void
+f11 (struct u *a, struct u *b)
+{
+  for (int i = 0; i < N; ++i)
+    a->x[i] += b->x[i] + b[i].y;
+}
+
+void
+f12 (struct s *a, struct s *b)
+{
+  for (int i = 0; i < GAP; ++i)
+    a->x[i + GAP] += b->x[i];
+}
+
+void
+f13 (struct s *a, struct s *b)
+{
+  for (int i = 0; i < GAP * 2; ++i)
+    a->x[i + GAP] += b->x[i];
+}
+
+void
+f14 (struct v *a, struct s *b)
+{
+  for (int i = 0; i < N; ++i)
+    a->s.x[i] = b->x[i];
+}
+
+void
+f15 (struct s *a, struct s *b)
+{
+  #pragma omp simd safelen(N)
+  for (int i = 0; i < N; ++i)
+    a->x[i + 1] += b->x[i];
+}
+
+/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 15 "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-4.c
===================================================================
--- /dev/null	2017-07-27 10:25:31.671280760 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-alias-check-4.c	2017-07-27 13:10:33.022912357 +0100
@@ -0,0 +1,35 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "--param vect-max-version-for-alias-checks=0" } */
+
+#define N 16
+
+struct s1 { int a[N]; };
+struct s2 { struct s1 b; int c; };
+struct s3 { int d; struct s1 e; };
+union u { struct s2 f; struct s3 g; };
+
+/* We allow a and b to overlap arbitrarily.  */
+
+void
+f1 (int a[][N], int b[][N])
+{
+  for (int i = 0; i < N; ++i)
+    a[0][i] += b[0][i];
+}
+
+void
+f2 (union u *a, union u *b)
+{
+  for (int i = 0; i < N; ++i)
+    a->f.b.a[i] += b->g.e.a[i];
+}
+
+void
+f3 (struct s1 *a, struct s1 *b)
+{
+  for (int i = 0; i < N - 1; ++i)
+    a->a[i + 1] += b->a[i];
+}
+
+/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
Index: gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c
===================================================================
--- /dev/null	2017-07-27 10:25:31.671280760 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-alias-check-5.c	2017-07-27 13:10:33.022912357 +0100
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+/* Intended to be larger than any VF.  */
+#define GAP 128
+#define N (GAP * 3)
+
+struct s { int x[N]; };
+
+void
+f1 (struct s *a, struct s *b)
+{
+  for (int i = 0; i < GAP * 2; ++i)
+    a->x[i + GAP] += b->x[i];
+}
+
+/* { dg-final { scan-tree-dump-times "consider run-time aliasing" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "improved number of alias checks from 1 to 0" 1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "LOOP VECTORIZED" 1 "vect" } } */

Handle data dependence relations with different bases

Commit Message

Comments

Patch