diff mbox

Reducing number of alias checks in vectorization.

Message ID CAK=A3=0dZGhTt_amFBCPZB0_yVCUVnPG0Zh+wf5vVhnsteiW8A@mail.gmail.com
State New
Headers show

Commit Message

Cong Hou Oct. 22, 2013, 9:12 p.m. UTC
On Fri, Oct 18, 2013 at 6:09 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Oct 15, 2013 at 12:43 AM, Cong Hou <congh@google.com> wrote:
>> Sorry for forgetting using plain-text mode. Resend it.
>>
>>
>> ---------- Forwarded message ----------
>> From: Cong Hou <congh@google.com>
>> Date: Mon, Oct 14, 2013 at 3:29 PM
>> Subject: Re: [PATCH] Reducing number of alias checks in vectorization.
>> To: Richard Biener <rguenther@suse.de>, GCC Patches <gcc-patches@gcc.gnu.org>
>> Cc: Jakub Jelinek <jakub@redhat.com>
>>
>>
>> I have made a new patch for this issue according to your comments.
>>
>> There are several modifications to my previous patch:
>>
>>
>> 1. Remove the use of STL features such as vector and sort. Use GCC's
>> vec and qsort instead.
>
> I think that using <utility> and thus std::pair and std::swap is ok.  Including
> of <utility> should be done via system.h though.
>


Unfortunately, std::swap is defined in <algorithm>. Can I still use it?


>> 2. Comparisons between tree nodes are not based on their addresses any
>> more. Use compare_tree() function instead.
>
> Ok.
>
>> 3. The function vect_create_cond_for_alias_checks() now returns the
>> number of alias checks. If its second parameter cond_expr is NULL,
>> then this function only calculate the number of alias checks after the
>> merging and won't generate comparison expressions.
>
> We now perform the merging twice - that's easily avoided by recording
> the merge result in loop_vinfo alongside may_alias_ddrs (which you
> should be able to drop as they are already contained in the data
> structures you build).
>
> Which means you can split the function and move the merging
> part to vect_prune_runtime_alias_test_list.


I have added one more field to loop_vec_info to store the dr pairs
from which alias checks are built. And I also split the functionality
of vect_create_cond_for_alias_checks() into two functions, so that we
now perform the merging once instead of twice.


>
>> 4. The function vect_prune_runtime_alias_test_list() now uses
>> vect_create_cond_for_alias_checks() to get the number of alias checks.
>>
>>
>> The patch is attached as a text file.
>>
>> Please give me your comment on this patch. Thank you!
>
> +  if (!operand_equal_p (dr_a1->basic_addr, dr_a2->basic_addr, 0)
> +      || TREE_CODE (dr_a1->offset) != INTEGER_CST
> +      || TREE_CODE (dr_a2->offset) != INTEGER_CST)
> +    continue;
> +
> +  HOST_WIDEST_INT diff = widest_int_cst_value (dr_a2->offset) -
> + widest_int_cst_value (dr_a1->offset);
>
> use !host_integerp (dr_a1->offset) as check and then int_cst_value.


Done.


>
> +  if (diff <= vect_factor
> +      || (TREE_CODE (dr_a1->seg_len) == INTEGER_CST
> +  && diff - widest_int_cst_value (dr_a1->seg_len) < vect_factor)
> +      || (TREE_CODE (dr_a1->seg_len) == INTEGER_CST
> +  && TREE_CODE (dr_b1->seg_len) == INTEGER_CST
> +  && diff - widest_int_cst_value (dr_a1->seg_len) <
> +     widest_int_cst_value (dr_b1->seg_len)))
>
> can you add a comment above this why it is safe to combine under
> this condition?  You seem to combine offsets but not data-ref steps
> for example.  The segment length is computed quite optimistically
> and you don't seem to adjust that.  That is, generally the alias
> check, if implemented as simple overlap, would need to intersect
> [init, init + niter * step]  intervals, losing some non-aliasing cases.
> The current segment length is optimistically computed and thus
> can recover some more cases, but you have to be careful with
> merging to not break its implicit assumptions.


I have changed the code and made it more clear as shown below. I still
use the condition


DR_A2->OFFSET - DR_A1->OFFSET - SEGMENT_LENGTH_A < SEGMENT_LENGTH_B


But in case that SEGMENT_LENGTH_A or SEGMENT_LENGTH_B is not constant,
we can still check the following two conditions:


DR_A2->OFFSET - DR_A1->OFFSET  < SEGMENT_LENGTH_B
(if SEGMENT_LENGTH_A is not a constant).


DR_A2->OFFSET - DR_A1->OFFSET - SEGMENT_LENGTH_A < VECT_FACTOR
(if SEGMENT_LENGTH_B is not a constant, then its minimum value should
be VECT_FACTOR).



The corresponding code is:


 /* Now we check if the following condition is satisfied:

    DIFF - SEGMENT_LENGTH_A < SEGMENT_LENGTH_B

    where DIFF = DR_A2->OFFSET - DR_A1->OFFSET.  However,
    SEGMENT_LENGTH_A or SEGMENT_LENGTH_B may not be constant so we
    have to make a best estimation.  We can get the minimum value
    of SEGMENT_LENGTH_B as a constant, represented by MIN_SEG_LEN_B,
    then either of the following two conditions can guarantee the
    one above:

    1: DIFF <= MIN_SEG_LEN_B
    2: DIFF - SEGMENT_LENGTH_A < MIN_SEG_LEN_B

    */

 HOST_WIDEST_INT
 min_seg_len_b = (TREE_CODE (dr_b1->seg_len) == INTEGER_CST) ?
     widest_int_cst_value (dr_b1->seg_len) :
     vect_factor;

 if (diff <= min_seg_len_b
     || (TREE_CODE (dr_a1->seg_len) == INTEGER_CST
 && diff - widest_int_cst_value (dr_a1->seg_len) <
    min_seg_len_b))
   { ... }



The new patch is pasted here again. Bootstrapping and regression tests
are passed.


thanks,
Cong








>
> Thanks,
> Richard.
>
>>
>> Cong
>>
>>
>> On Thu, Oct 3, 2013 at 2:35 PM, Cong Hou <congh@google.com> wrote:
>>>
>>> Forget about this "aux" idea as the segment length for one data ref
>>> can be different in different dr pairs.
>>>
>>> In my patch I created a struct as shown below:
>>>
>>> struct dr_addr_with_seg_len
>>> {
>>>   data_reference *dr;
>>>   tree basic_addr;
>>>   tree offset;
>>>   tree seg_len;
>>> };
>>>
>>>
>>> Note that basic_addr and offset can always obtained from dr, but we
>>> need to store two segment lengths for each dr pair. It is improper to
>>> add a field to data_dependence_relation as it is defined outside of
>>> vectorizer. We can change the type (a new one combining
>>> data_dependence_relation and segment length) of may_alias_ddrs in
>>> loop_vec_info to include such information, but we have to add a new
>>> type to tree-vectorizer.h which is only used in two places - still too
>>> much.
>>>
>>> One possible solution is that we create a local struct as shown above
>>> and a new function which returns the merged alias check information.
>>> This function will be called twice: once during analysis phase and
>>> once in transformation phase. Then we don't have to store the merged
>>> alias check information during those two phases. The additional time
>>> cost is minimal as there will not be too many data dependent dr pairs
>>> in a loop.
>>>
>>> Any comment?
>>>
>>>
>>> thanks,
>>> Cong
>>>
>>>
>>> On Thu, Oct 3, 2013 at 10:57 AM, Cong Hou <congh@google.com> wrote:
>>> > I noticed that there is a "struct dataref_aux" defined in
>>> > tree-vectorizer.h which is specific to the vectorizer pass and is
>>> > stored in (void*)aux in "struct data_reference". Can we add one more
>>> > field "segment_length" to dataref_aux so that we can pass this
>>> > information for merging alias checks? Then we can avoid to modify or
>>> > create other structures.
>>> >
>>> >
>>> > thanks,
>>> > Cong
>>> >
>>> >
>>> > On Wed, Oct 2, 2013 at 2:34 PM, Cong Hou <congh@google.com> wrote:
>>> >> On Wed, Oct 2, 2013 at 4:24 AM, Richard Biener <rguenther@suse.de> wrote:
>>> >>> On Tue, 1 Oct 2013, Cong Hou wrote:
>>> >>>
>>> >>>> When alias exists between data refs in a loop, to vectorize it GCC
>>> >>>> does loop versioning and adds runtime alias checks. Basically for each
>>> >>>> pair of data refs with possible data dependence, there will be two
>>> >>>> comparisons generated to make sure there is no aliasing between them
>>> >>>> in each iteration of the vectorized loop. If there are many such data
>>> >>>> refs pairs, the number of comparisons can be very large, which is a
>>> >>>> big overhead.
>>> >>>>
>>> >>>> However, in some cases it is possible to reduce the number of those
>>> >>>> comparisons. For example, for the following loop, we can detect that
>>> >>>> b[0] and b[1] are two consecutive member accesses so that we can
>>> >>>> combine the alias check between a[0:100]&b[0] and a[0:100]&b[1] into
>>> >>>> checking a[0:100]&b[0:2]:
>>> >>>>
>>> >>>> void foo(int*a, int* b)
>>> >>>> {
>>> >>>>    for (int i = 0; i < 100; ++i)
>>> >>>>     a[i] = b[0] + b[1];
>>> >>>> }
>>> >>>>
>>> >>>> Actually, the requirement of consecutive memory accesses is too
>>> >>>> strict. For the following loop, we can still combine the alias checks
>>> >>>> between a[0:100]&b[0] and a[0:100]&b[100]:
>>> >>>>
>>> >>>> void foo(int*a, int* b)
>>> >>>> {
>>> >>>>    for (int i = 0; i < 100; ++i)
>>> >>>>     a[i] = b[0] + b[100];
>>> >>>> }
>>> >>>>
>>> >>>> This is because if b[0] is not in a[0:100] and b[100] is not in
>>> >>>> a[0:100] then a[0:100] cannot be between b[0] and b[100]. We only need
>>> >>>> to check a[0:100] and b[0:101] don't overlap.
>>> >>>>
>>> >>>> More generally, consider two pairs of data refs (a, b1) and (a, b2).
>>> >>>> Suppose addr_b1 and addr_b2 are basic addresses of data ref b1 and b2;
>>> >>>> offset_b1 and offset_b2 (offset_b1 < offset_b2) are offsets of b1 and
>>> >>>> b2, and segment_length_a, segment_length_b1, and segment_length_b2 are
>>> >>>> segment length of a, b1, and b2. Then we can combine the two
>>> >>>> comparisons into one if the following condition is satisfied:
>>> >>>>
>>> >>>> offset_b2- offset_b1 - segment_length_b1 < segment_length_a
>>> >>>>
>>> >>>>
>>> >>>> This patch detects those combination opportunities to reduce the
>>> >>>> number of alias checks. It is tested on an x86-64 machine.
>>> >>>
>>> >>> Apart from the other comments you got (to which I agree) the patch
>>> >>> seems to do two things, namely also:
>>> >>>
>>> >>> +  /* Extract load and store statements on pointers with zero-stride
>>> >>> +     accesses.  */
>>> >>> +  if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
>>> >>> +    {
>>> >>>
>>> >>> which I'd rather see in a separate patch (and done also when
>>> >>> the loop doesn't require versioning for alias).
>>> >>>
>>> >>
>>> >>
>>> >> My mistake.. I am working on those two patches at the same time and
>>> >> pasted that one also here by mistake. I will send another patch about
>>> >> the "hoist" topic.
>>> >>
>>> >>
>>> >>> Also combining the alias checks in vect_create_cond_for_alias_checks
>>> >>> is nice but doesn't properly fix the use of the
>>> >>> vect-max-version-for-alias-checks param which currently inhibits
>>> >>> vectorization of the HIMENO benchmark by default (and make us look bad
>>> >>> compared to LLVM).
>>> >>>
>>> >>> So I believe this merging should be done incrementally when
>>> >>> we collect the DDRs we need to test in vect_mark_for_runtime_alias_test.
>>> >>>
>>> >>
>>> >>
>>> >> I agree that vect-max-version-for-alias-checks param should count the
>>> >> number of checks after the merge. However, the struct
>>> >> data_dependence_relation could not record the new information produced
>>> >> by the merge. The new information I mentioned contains the new segment
>>> >> length for comparisons. This length is calculated right in
>>> >> vect_create_cond_for_alias_checks() function. Since
>>> >> vect-max-version-for-alias-checks is used during analysis phase, shall
>>> >> we move all those (get segment length for each data ref and merge
>>> >> alias checks) from transformation to analysis phase? If we cannot
>>> >> store the result properly (data_dependence_relation is not enough),
>>> >> shall we do it twice in both phases?
>>> >>
>>> >> I also noticed a possible bug in the function vect_same_range_drs()
>>> >> called by vect_prune_runtime_alias_test_list(). For the following code
>>> >> I get two pairs of data refs after
>>> >> vect_prune_runtime_alias_test_list(), but in
>>> >> vect_create_cond_for_alias_checks() after detecting grouped accesses I
>>> >> got two identical pairs of data refs. The consequence is two identical
>>> >> alias checks are produced.
>>> >>
>>> >>
>>> >> void yuv2yuyv_ref (int *d, int *src, int n)
>>> >> {
>>> >>   char *dest = (char *)d;
>>> >>   int i;
>>> >>
>>> >>   for(i=0;i<n/2;i++){
>>> >>     dest[i*4 + 0] = (src[i*2 + 0])>>16;
>>> >>     dest[i*4 + 1] = (src[i*2 + 1])>>8;
>>> >>     dest[i*4 + 2] = (src[i*2 + 0])>>16;
>>> >>     dest[i*4 + 3] = (src[i*2 + 0])>>0;
>>> >>   }
>>> >> }
>>> >>
>>> >>
>>> >> I think the solution to this problem is changing
>>> >>
>>> >> GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_i))
>>> >> == GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_j)
>>> >>
>>> >> into
>>> >>
>>> >> STMT_VINFO_DATA_REF (vinfo_for_stmt (GROUP_FIRST_ELEMENT
>>> >> (vinfo_for_stmt (stmt_i))))
>>> >> == STMT_VINFO_DATA_REF (vinfo_for_stmt (GROUP_FIRST_ELEMENT
>>> >> (vinfo_for_stmt (stmt_j)))
>>> >>
>>> >>
>>> >> in function vect_same_range_drs(). What do you think about it?
>>> >>
>>> >>
>>> >> thanks,
>>> >> Cong
>>> >>
>>> >>
>>> >>
>>> >>> Thanks for working on this,
>>> >>> Richard.
>>> >>>
>>> >>>>
>>> >>>> thanks,
>>> >>>> Cong
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> Index: gcc/tree-vect-loop-manip.c
>>> >>>> ===================================================================
>>> >>>> --- gcc/tree-vect-loop-manip.c (revision 202662)
>>> >>>> +++ gcc/tree-vect-loop-manip.c (working copy)
>>> >>>> @@ -19,6 +19,10 @@ You should have received a copy of the G
>>> >>>>  along with GCC; see the file COPYING3.  If not see
>>> >>>>  <http://www.gnu.org/licenses/>.  */
>>> >>>>
>>> >>>> +#include <vector>
>>> >>>> +#include <utility>
>>> >>>> +#include <algorithm>
>>> >>>> +
>>> >>>>  #include "config.h"
>>> >>>>  #include "system.h"
>>> >>>>  #include "coretypes.h"
>>> >>>> @@ -2248,6 +2252,74 @@ vect_vfa_segment_size (struct data_refer
>>> >>>>    return segment_length;
>>> >>>>  }
>>> >>>>
>>> >>>> +namespace
>>> >>>> +{
>>> >>>> +
>>> >>>> +/* struct dr_addr_with_seg_len
>>> >>>> +
>>> >>>> +   A struct storing information of a data reference, including the data
>>> >>>> +   ref itself, its basic address, the access offset and the segment length
>>> >>>> +   for aliasing checks.  */
>>> >>>> +
>>> >>>> +struct dr_addr_with_seg_len
>>> >>>> +{
>>> >>>> +  dr_addr_with_seg_len (data_reference* d, tree addr, tree off, tree len)
>>> >>>> +    : dr (d), basic_addr (addr), offset (off), seg_len (len) {}
>>> >>>> +
>>> >>>> +  data_reference* dr;
>>> >>>> +  tree basic_addr;
>>> >>>> +  tree offset;
>>> >>>> +  tree seg_len;
>>> >>>> +};
>>> >>>> +
>>> >>>> +/* Operator == between two dr_addr_with_seg_len objects.
>>> >>>> +
>>> >>>> +   This equality operator is used to make sure two data refs
>>> >>>> +   are the same one so that we will consider to combine the
>>> >>>> +   aliasing checks of those two pairs of data dependent data
>>> >>>> +   refs.  */
>>> >>>> +
>>> >>>> +bool operator == (const dr_addr_with_seg_len& d1,
>>> >>>> +  const dr_addr_with_seg_len& d2)
>>> >>>> +{
>>> >>>> +  return operand_equal_p (d1.basic_addr, d2.basic_addr, 0)
>>> >>>> + && operand_equal_p (d1.offset, d2.offset, 0)
>>> >>>> + && operand_equal_p (d1.seg_len, d2.seg_len, 0);
>>> >>>> +}
>>> >>>> +
>>> >>>> +typedef std::pair <dr_addr_with_seg_len, dr_addr_with_seg_len>
>>> >>>> + dr_addr_with_seg_len_pair_t;
>>> >>>> +
>>> >>>> +
>>> >>>> +/* Operator < between two dr_addr_with_seg_len_pair_t objects.
>>> >>>> +
>>> >>>> +   This operator is used to sort objects of dr_addr_with_seg_len_pair_t
>>> >>>> +   so that we can combine aliasing checks during one scan.  */
>>> >>>> +
>>> >>>> +bool operator < (const dr_addr_with_seg_len_pair_t& p1,
>>> >>>> + const dr_addr_with_seg_len_pair_t& p2)
>>> >>>> +{
>>> >>>> +  const dr_addr_with_seg_len& p11 = p1.first;
>>> >>>> +  const dr_addr_with_seg_len& p12 = p1.second;
>>> >>>> +  const dr_addr_with_seg_len& p21 = p2.first;
>>> >>>> +  const dr_addr_with_seg_len& p22 = p2.second;
>>> >>>> +
>>> >>>> +  if (p11.basic_addr != p21.basic_addr)
>>> >>>> +    return p11.basic_addr < p21.basic_addr;
>>> >>>> +  if (p12.basic_addr != p22.basic_addr)
>>> >>>> +    return p12.basic_addr < p22.basic_addr;
>>> >>>> +  if (TREE_CODE (p11.offset) != INTEGER_CST
>>> >>>> +      || TREE_CODE (p21.offset) != INTEGER_CST)
>>> >>>> +    return p11.offset < p21.offset;
>>> >>>> +  if (int_cst_value (p11.offset) != int_cst_value (p21.offset))
>>> >>>> +    return int_cst_value (p11.offset) < int_cst_value (p21.offset);
>>> >>>> +  if (TREE_CODE (p12.offset) != INTEGER_CST
>>> >>>> +      || TREE_CODE (p22.offset) != INTEGER_CST)
>>> >>>> +    return p12.offset < p22.offset;
>>> >>>> +  return int_cst_value (p12.offset) < int_cst_value (p22.offset);
>>> >>>> +}
>>> >>>> +
>>> >>>> +}
>>> >>>>
>>> >>>>  /* Function vect_create_cond_for_alias_checks.
>>> >>>>
>>> >>>> @@ -2292,20 +2364,51 @@ vect_create_cond_for_alias_checks (loop_
>>> >>>>    if (may_alias_ddrs.is_empty ())
>>> >>>>      return;
>>> >>>>
>>> >>>> +
>>> >>>> +  /* Basically, for each pair of dependent data refs store_ptr_0
>>> >>>> +     and load_ptr_0, we create an expression:
>>> >>>> +
>>> >>>> +     ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
>>> >>>> +     || (load_ptr_0 + load_segment_length_0) <= store_ptr_0))
>>> >>>> +
>>> >>>> +     for aliasing checks. However, in some cases we can decrease
>>> >>>> +     the number of checks by combining two checks into one. For
>>> >>>> +     example, suppose we have another pair of data refs store_ptr_0
>>> >>>> +     and load_ptr_1, and if the following condition is satisfied:
>>> >>>> +
>>> >>>> +     load_ptr_0 < load_ptr_1  &&
>>> >>>> +     load_ptr_1 - load_ptr_0 - load_segment_length_0 < store_segment_length_0
>>> >>>> +
>>> >>>> +     (this condition means, in each iteration of vectorized loop,
>>> >>>> +     the accessed memory of store_ptr_0 cannot be between the memory
>>> >>>> +     of load_ptr_0 and load_ptr_1.)
>>> >>>> +
>>> >>>> +     we then can use only the following expression to finish the
>>> >>>> +     alising checks between store_ptr_0 & load_ptr_0 and
>>> >>>> +     store_ptr_0 & load_ptr_1:
>>> >>>> +
>>> >>>> +     ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
>>> >>>> +     || (load_ptr_1 + load_segment_length_1 <= store_ptr_0))
>>> >>>> +
>>> >>>> +     Note that we only consider that load_ptr_0 and load_ptr_1 have the
>>> >>>> +     same basic address.  */
>>> >>>> +
>>> >>>> +  std::vector<dr_addr_with_seg_len_pair_t> ddrs_with_seg_len;
>>> >>>> +
>>> >>>> +  /* First, we collect all data ref pairs for aliasing checks.  */
>>> >>>> +
>>> >>>>    FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
>>> >>>>      {
>>> >>>>        struct data_reference *dr_a, *dr_b;
>>> >>>>        gimple dr_group_first_a, dr_group_first_b;
>>> >>>> -      tree addr_base_a, addr_base_b;
>>> >>>>        tree segment_length_a, segment_length_b;
>>> >>>>        gimple stmt_a, stmt_b;
>>> >>>> -      tree seg_a_min, seg_a_max, seg_b_min, seg_b_max;
>>> >>>>
>>> >>>>        dr_a = DDR_A (ddr);
>>> >>>>        stmt_a = DR_STMT (DDR_A (ddr));
>>> >>>>        dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
>>> >>>>        if (dr_group_first_a)
>>> >>>> -        {
>>> >>>> + {
>>> >>>>    stmt_a = dr_group_first_a;
>>> >>>>    dr_a = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_a));
>>> >>>>   }
>>> >>>> @@ -2314,20 +2417,11 @@ vect_create_cond_for_alias_checks (loop_
>>> >>>>        stmt_b = DR_STMT (DDR_B (ddr));
>>> >>>>        dr_group_first_b = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_b));
>>> >>>>        if (dr_group_first_b)
>>> >>>> -        {
>>> >>>> + {
>>> >>>>    stmt_b = dr_group_first_b;
>>> >>>>    dr_b = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_b));
>>> >>>>   }
>>> >>>>
>>> >>>> -      addr_base_a
>>> >>>> - = fold_build_pointer_plus (DR_BASE_ADDRESS (dr_a),
>>> >>>> -   size_binop (PLUS_EXPR, DR_OFFSET (dr_a),
>>> >>>> -       DR_INIT (dr_a)));
>>> >>>> -      addr_base_b
>>> >>>> - = fold_build_pointer_plus (DR_BASE_ADDRESS (dr_b),
>>> >>>> -   size_binop (PLUS_EXPR, DR_OFFSET (dr_b),
>>> >>>> -       DR_INIT (dr_b)));
>>> >>>> -
>>> >>>>        if (!operand_equal_p (DR_STEP (dr_a), DR_STEP (dr_b), 0))
>>> >>>>   length_factor = scalar_loop_iters;
>>> >>>>        else
>>> >>>> @@ -2335,24 +2429,149 @@ vect_create_cond_for_alias_checks (loop_
>>> >>>>        segment_length_a = vect_vfa_segment_size (dr_a, length_factor);
>>> >>>>        segment_length_b = vect_vfa_segment_size (dr_b, length_factor);
>>> >>>>
>>> >>>> +      dr_addr_with_seg_len_pair_t dr_with_seg_len_pair
>>> >>>> +  (dr_addr_with_seg_len
>>> >>>> +       (dr_a, DR_BASE_ADDRESS (dr_a),
>>> >>>> + size_binop (PLUS_EXPR, DR_OFFSET (dr_a), DR_INIT (dr_a)),
>>> >>>> + segment_length_a),
>>> >>>> +   dr_addr_with_seg_len
>>> >>>> +       (dr_b, DR_BASE_ADDRESS (dr_b),
>>> >>>> + size_binop (PLUS_EXPR, DR_OFFSET (dr_b), DR_INIT (dr_b)),
>>> >>>> + segment_length_b));
>>> >>>> +
>>> >>>> +      if (dr_with_seg_len_pair.first.basic_addr >
>>> >>>> +  dr_with_seg_len_pair.second.basic_addr)
>>> >>>> + std::swap (dr_with_seg_len_pair.first, dr_with_seg_len_pair.second);
>>> >>>> +
>>> >>>> +      ddrs_with_seg_len.push_back (dr_with_seg_len_pair);
>>> >>>> +    }
>>> >>>> +
>>> >>>> +  /* Second, we sort the collected data ref pairs so that we can scan
>>> >>>> +     them once to combine all possible aliasing checks.  */
>>> >>>> +
>>> >>>> +  std::sort (ddrs_with_seg_len.begin(), ddrs_with_seg_len.end());
>>> >>>> +
>>> >>>> +  /* Remove duplicate data ref pairs.  */
>>> >>>> +  ddrs_with_seg_len.erase (std::unique (ddrs_with_seg_len.begin(),
>>> >>>> + ddrs_with_seg_len.end()),
>>> >>>> +   ddrs_with_seg_len.end());
>>> >>>> +
>>> >>>> +  /* We then scan the sorted dr pairs and check if we can combine
>>> >>>> +     alias checks of two neighbouring dr pairs.  */
>>> >>>> +
>>> >>>> +  for (size_t i = 1; i < ddrs_with_seg_len.size (); ++i)
>>> >>>> +    {
>>> >>>> +      dr_addr_with_seg_len& dr_a1 = ddrs_with_seg_len[i-1].first;
>>> >>>> +      dr_addr_with_seg_len& dr_b1 = ddrs_with_seg_len[i-1].second;
>>> >>>> +      dr_addr_with_seg_len& dr_a2 = ddrs_with_seg_len[i].first;
>>> >>>> +      dr_addr_with_seg_len& dr_b2 = ddrs_with_seg_len[i].second;
>>> >>>> +
>>> >>>> +      if (dr_a1 == dr_a2)
>>> >>>> + {
>>> >>>> +  if (dr_b1.basic_addr != dr_b2.basic_addr
>>> >>>> +      || TREE_CODE (dr_b1.offset) != INTEGER_CST
>>> >>>> +      || TREE_CODE (dr_b2.offset) != INTEGER_CST)
>>> >>>> +    continue;
>>> >>>> +
>>> >>>> +  int diff = int_cst_value (dr_b2.offset) -
>>> >>>> +     int_cst_value (dr_b1.offset);
>>> >>>> +
>>> >>>> +  gcc_assert (diff > 0);
>>> >>>> +
>>> >>>> +  if (diff <= vect_factor
>>> >>>> +      || (TREE_CODE (dr_b1.seg_len) == INTEGER_CST
>>> >>>> +  && diff - int_cst_value (dr_b1.seg_len) < vect_factor)
>>> >>>> +      || (TREE_CODE (dr_b1.seg_len) == INTEGER_CST
>>> >>>> +  && TREE_CODE (dr_a1.seg_len) == INTEGER_CST
>>> >>>> +  && diff - int_cst_value (dr_b1.seg_len) <
>>> >>>> +     int_cst_value (dr_a1.seg_len)))
>>> >>>> +    {
>>> >>>> +      if (dump_enabled_p ())
>>> >>>> + {
>>> >>>> +  dump_printf_loc
>>> >>>> +      (MSG_NOTE, vect_location,
>>> >>>> +       "combining two runtime checks for data references ");
>>> >>>> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b1.dr));
>>> >>>> +  dump_printf (MSG_NOTE, " and ");
>>> >>>> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b2.dr));
>>> >>>> +  dump_printf (MSG_NOTE, "\n");
>>> >>>> + }
>>> >>>> +
>>> >>>> +      dr_b1.seg_len = size_binop (PLUS_EXPR,
>>> >>>> +  dr_b2.seg_len, size_int (diff));
>>> >>>> +      ddrs_with_seg_len.erase (ddrs_with_seg_len.begin () + i);
>>> >>>> +      --i;
>>> >>>> +    }
>>> >>>> + }
>>> >>>> +      else if (dr_b1 == dr_b2)
>>> >>>> + {
>>> >>>> +  if (dr_a1.basic_addr != dr_a2.basic_addr
>>> >>>> +      || TREE_CODE (dr_a1.offset) != INTEGER_CST
>>> >>>> +      || TREE_CODE (dr_a2.offset) != INTEGER_CST)
>>> >>>> +    continue;
>>> >>>> +
>>> >>>> +  int diff = int_cst_value (dr_a2.offset) -
>>> >>>> +     int_cst_value (dr_a1.offset);
>>> >>>> +
>>> >>>> +  gcc_assert (diff > 0);
>>> >>>> +
>>> >>>> +  if (diff <= vect_factor
>>> >>>> +      || (TREE_CODE (dr_a1.seg_len) == INTEGER_CST
>>> >>>> +  && diff - int_cst_value (dr_a1.seg_len) < vect_factor)
>>> >>>> +      || (TREE_CODE (dr_a1.seg_len) == INTEGER_CST
>>> >>>> +  && TREE_CODE (dr_b1.seg_len) == INTEGER_CST
>>> >>>> +  && diff - int_cst_value (dr_a1.seg_len) <
>>> >>>> +     int_cst_value (dr_b1.seg_len)))
>>> >>>> +    {
>>> >>>> +      if (dump_enabled_p ())
>>> >>>> + {
>>> >>>> +  dump_printf_loc
>>> >>>> +      (MSG_NOTE, vect_location,
>>> >>>> +       "combining two runtime checks for data references ");
>>> >>>> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a1.dr));
>>> >>>> +  dump_printf (MSG_NOTE, " and ");
>>> >>>> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a2.dr));
>>> >>>> +  dump_printf (MSG_NOTE, "\n");
>>> >>>> + }
>>> >>>> +
>>> >>>> +      dr_a1.seg_len = size_binop (PLUS_EXPR,
>>> >>>> +  dr_a2.seg_len, size_int (diff));
>>> >>>> +      ddrs_with_seg_len.erase (ddrs_with_seg_len.begin () + i);
>>> >>>> +      --i;
>>> >>>> +    }
>>> >>>> + }
>>> >>>> +    }
>>> >>>> +
>>> >>>> +  for (size_t i = 0, s = ddrs_with_seg_len.size (); i < s; ++i)
>>> >>>> +    {
>>> >>>> +      const dr_addr_with_seg_len& dr_a = ddrs_with_seg_len[i].first;
>>> >>>> +      const dr_addr_with_seg_len& dr_b = ddrs_with_seg_len[i].second;
>>> >>>> +      tree segment_length_a = dr_a.seg_len;
>>> >>>> +      tree segment_length_b = dr_b.seg_len;
>>> >>>> +
>>> >>>> +      tree addr_base_a
>>> >>>> + = fold_build_pointer_plus (dr_a.basic_addr, dr_a.offset);
>>> >>>> +      tree addr_base_b
>>> >>>> + = fold_build_pointer_plus (dr_b.basic_addr, dr_b.offset);
>>> >>>> +
>>> >>>>        if (dump_enabled_p ())
>>> >>>>   {
>>> >>>>    dump_printf_loc (MSG_NOTE, vect_location,
>>> >>>> -                           "create runtime check for data references ");
>>> >>>> -  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a));
>>> >>>> +   "create runtime check for data references ");
>>> >>>> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a.dr));
>>> >>>>    dump_printf (MSG_NOTE, " and ");
>>> >>>> -  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b));
>>> >>>> -          dump_printf (MSG_NOTE, "\n");
>>> >>>> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b.dr));
>>> >>>> +  dump_printf (MSG_NOTE, "\n");
>>> >>>>   }
>>> >>>>
>>> >>>> -      seg_a_min = addr_base_a;
>>> >>>> -      seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
>>> >>>> -      if (tree_int_cst_compare (DR_STEP (dr_a), size_zero_node) < 0)
>>> >>>> +      tree seg_a_min = addr_base_a;
>>> >>>> +      tree seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
>>> >>>> +      if (tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0)
>>> >>>>   seg_a_min = seg_a_max, seg_a_max = addr_base_a;
>>> >>>>
>>> >>>> -      seg_b_min = addr_base_b;
>>> >>>> -      seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
>>> >>>> -      if (tree_int_cst_compare (DR_STEP (dr_b), size_zero_node) < 0)
>>> >>>> +      tree seg_b_min = addr_base_b;
>>> >>>> +      tree seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
>>> >>>> +      if (tree_int_cst_compare (DR_STEP (dr_b.dr), size_zero_node) < 0)
>>> >>>>   seg_b_min = seg_b_max, seg_b_max = addr_base_b;
>>> >>>>
>>> >>>>        part_cond_expr =
>>> >>>> @@ -2477,6 +2696,81 @@ vect_loop_versioning (loop_vec_info loop
>>> >>>>        adjust_phi_and_debug_stmts (orig_phi, e, PHI_RESULT (new_phi));
>>> >>>>      }
>>> >>>>
>>> >>>> +  /* Extract load and store statements on pointers with zero-stride
>>> >>>> +     accesses.  */
>>> >>>> +  if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
>>> >>>> +    {
>>> >>>> +
>>> >>>> +      /* In the loop body, we iterate each statement to check if it is a load
>>> >>>> + or store. Then we check the DR_STEP of the data reference.  If
>>> >>>> + DR_STEP is zero, then we will hoist the load statement to the loop
>>> >>>> + preheader, and move the store statement to the loop exit.  */
>>> >>>> +
>>> >>>> +      for (gimple_stmt_iterator si = gsi_start_bb (loop->header);
>>> >>>> +   !gsi_end_p (si); )
>>> >>>> + {
>>> >>>> +  gimple stmt = gsi_stmt (si);
>>> >>>> +  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
>>> >>>> +  struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
>>> >>>> +
>>> >>>> +
>>> >>>> +  if (dr && integer_zerop (DR_STEP (dr)))
>>> >>>> +    {
>>> >>>> +      if (DR_IS_READ (dr))
>>> >>>> + {
>>> >>>> +  if (dump_file)
>>> >>>> +    {
>>> >>>> +      fprintf (dump_file,
>>> >>>> +       "Hoist the load to outside of the loop:\n");
>>> >>>> +      print_gimple_stmt (dump_file, stmt, 0,
>>> >>>> + TDF_VOPS|TDF_MEMSYMS);
>>> >>>> +    }
>>> >>>> +
>>> >>>> +  basic_block preheader = loop_preheader_edge (loop)->src;
>>> >>>> +  gimple_stmt_iterator si_dst = gsi_last_bb (preheader);
>>> >>>> +  gsi_move_after (&si, &si_dst);
>>> >>>> + }
>>> >>>> +      else
>>> >>>> + {
>>> >>>> +  gimple_stmt_iterator si_dst =
>>> >>>> +      gsi_last_bb (single_exit (loop)->dest);
>>> >>>> +  gsi_move_after (&si, &si_dst);
>>> >>>> + }
>>> >>>> +              continue;
>>> >>>> +    }
>>> >>>> +  else if (!dr)
>>> >>>> +          {
>>> >>>> +            bool hoist = true;
>>> >>>> +            for (size_t i = 0; i < gimple_num_ops (stmt); i++)
>>> >>>> +            {
>>> >>>> +              tree op = gimple_op (stmt, i);
>>> >>>> +              if (TREE_CODE (op) == INTEGER_CST
>>> >>>> +                  || TREE_CODE (op) == REAL_CST)
>>> >>>> +                continue;
>>> >>>> +              if (TREE_CODE (op) == SSA_NAME)
>>> >>>> +              {
>>> >>>> +                gimple def = SSA_NAME_DEF_STMT (op);
>>> >>>> +                if (def == stmt
>>> >>>> +                    || gimple_nop_p (def)
>>> >>>> +                    || !flow_bb_inside_loop_p (loop, gimple_bb (def)))
>>> >>>> +                  continue;
>>> >>>> +              }
>>> >>>> +              hoist = false;
>>> >>>> +              break;
>>> >>>> +            }
>>> >>>> +
>>> >>>> +            if (hoist)
>>> >>>> +            {
>>> >>>> +              basic_block preheader = loop_preheader_edge (loop)->src;
>>> >>>> +              gimple_stmt_iterator si_dst = gsi_last_bb (preheader);
>>> >>>> +              gsi_move_after (&si, &si_dst);
>>> >>>> +              continue;
>>> >>>> +            }
>>> >>>> +          }
>>> >>>> +          gsi_next (&si);
>>> >>>> + }
>>> >>>> +    }
>>> >>>> +
>>> >>>>    /* End loop-exit-fixes after versioning.  */
>>> >>>>
>>> >>>>    if (cond_expr_stmt_list)
>>> >>>> Index: gcc/ChangeLog
>>> >>>> ===================================================================
>>> >>>> --- gcc/ChangeLog (revision 202663)
>>> >>>> +++ gcc/ChangeLog (working copy)
>>> >>>> @@ -1,3 +1,8 @@
>>> >>>> +2013-10-01  Cong Hou  <congh@google.com>
>>> >>>> +
>>> >>>> + * tree-vect-loop-manip.c (vect_create_cond_for_alias_checks): Combine
>>> >>>> + alias checks if it is possible to amortize the runtime overhead.
>>> >>>> +
>>> >>>>
>>> >>>>
>>> >>>
>>> >>> --
>>> >>> Richard Biener <rguenther@suse.de>
>>> >>> SUSE / SUSE Labs
>>> >>> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>>> >>> GF: Jeff Hawn, Jennifer Guild, Felix Imend
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8a38316..7333f6a 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2013-10-14  Cong Hou  <congh@google.com>
+
+	* tree-vect-loop-manip.c (vect_create_cond_for_alias_checks):
+	Combine alias checks if it is possible to amortize the runtime
+	overhead.  Return the number of alias checks after merging.
+	* tree-vect-data-refs.c (vect_prune_runtime_alias_test_list):
+	Use the function vect_create_cond_for_alias_checks () to check
+	the number of alias checks.
+
 2013-10-14  David Malcolm  <dmalcolm@redhat.com>
 
 	* dumpfile.h (gcc::dump_manager): New class, to hold state
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 075d071..ea2782f 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@
+2013-10-14  Cong Hou  <congh@google.com>
+
+	* gcc.dg/vect/vect-alias-check.c: New.
+
 2013-10-14  Tobias Burnus  <burnus@net-b.de>
 
 	PR fortran/58658
diff --git a/gcc/testsuite/gcc.dg/vect/vect-alias-check.c b/gcc/testsuite/gcc.dg/vect/vect-alias-check.c
new file mode 100644
index 0000000..bf9eb6b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-alias-check.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize
+     --param=vect-max-version-for-alias-checks=2
+     -fdump-tree-vect-details" } */
+
+/* A test case showing three potential alias checks between
+   a[i] and b[i], b[i+7], b[i+14]. With alias checks merging
+   enabled, those tree checks can be merged into one, and the
+   loop will be vectorized with vect-max-version-for-alias-checks=2.  */
+
+void foo (int *a, int *b)
+{
+  int i;
+  for (i = 0; i < 1000; ++i)
+    a[i] = b[i] + b[i+7] + b[i+14];
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index e7b2f33..5f70d58 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -128,41 +128,6 @@ vect_get_smallest_scalar_type (gimple stmt, HOST_WIDE_INT *lhs_size_unit,
 }
 
 
-/* Check if data references pointed by DR_I and DR_J are same or
-   belong to same interleaving group.  Return FALSE if drs are
-   different, otherwise return TRUE.  */
-
-static bool
-vect_same_range_drs (data_reference_p dr_i, data_reference_p dr_j)
-{
-  gimple stmt_i = DR_STMT (dr_i);
-  gimple stmt_j = DR_STMT (dr_j);
-
-  if (operand_equal_p (DR_REF (dr_i), DR_REF (dr_j), 0)
-      || (GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_i))
-	    && GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_j))
-	    && (GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_i))
-		== GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_j)))))
-    return true;
-  else
-    return false;
-}
-
-/* If address ranges represented by DDR_I and DDR_J are equal,
-   return TRUE, otherwise return FALSE.  */
-
-static bool
-vect_vfa_range_equal (ddr_p ddr_i, ddr_p ddr_j)
-{
-  if ((vect_same_range_drs (DDR_A (ddr_i), DDR_A (ddr_j))
-       && vect_same_range_drs (DDR_B (ddr_i), DDR_B (ddr_j)))
-      || (vect_same_range_drs (DDR_A (ddr_i), DDR_B (ddr_j))
-	  && vect_same_range_drs (DDR_B (ddr_i), DDR_A (ddr_j))))
-    return true;
-  else
-    return false;
-}
-
 /* Insert DDR into LOOP_VINFO list of ddrs that may alias and need to be
    tested at run-time.  Return TRUE if DDR was successfully inserted.
    Return false if versioning is not supported.  */
@@ -2647,69 +2612,331 @@ vect_analyze_data_ref_accesses (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo)
   return true;
 }
 
+
+/* Operator == between two dr_addr_with_seg_len objects.
+
+   This equality operator is used to make sure two data refs
+   are the same one so that we will consider to combine the
+   aliasing checks of those two pairs of data dependent data
+   refs.  */
+
+static bool
+operator == (const dr_addr_with_seg_len& d1,
+	     const dr_addr_with_seg_len& d2)
+{
+  return operand_equal_p (d1.basic_addr, d2.basic_addr, 0)
+	 && compare_tree (d1.offset, d2.offset) == 0
+	 && compare_tree (d1.seg_len, d2.seg_len) == 0;
+}
+
+/* Function comp_dr_addr_with_seg_len_pair.
+
+   Comparison function for sorting objects of dr_addr_with_seg_len_pair_t
+   so that we can combine aliasing checks in one scan.  */
+
+static int
+comp_dr_addr_with_seg_len_pair (const void *p1_, const void *p2_)
+{
+  const dr_addr_with_seg_len_pair_t* p1 =
+    (const dr_addr_with_seg_len_pair_t *) p1_;
+  const dr_addr_with_seg_len_pair_t* p2 =
+    (const dr_addr_with_seg_len_pair_t *) p2_;
+
+  const dr_addr_with_seg_len &p11 = p1->first,
+			     &p12 = p1->second,
+			     &p21 = p2->first,
+			     &p22 = p2->second;
+
+  int comp_res = compare_tree (p11.basic_addr, p21.basic_addr);
+  if (comp_res != 0)
+    return comp_res;
+
+  comp_res = compare_tree (p12.basic_addr, p22.basic_addr);
+  if (comp_res != 0)
+    return comp_res;
+
+  if (TREE_CODE (p11.offset) != INTEGER_CST
+      || TREE_CODE (p21.offset) != INTEGER_CST)
+    {
+      comp_res = compare_tree (p11.offset, p21.offset);
+      if (comp_res != 0)
+	return comp_res;
+    }
+  if (tree_int_cst_compare (p11.offset, p21.offset) < 0)
+    return -1;
+  if (tree_int_cst_compare (p11.offset, p21.offset) > 0)
+    return 1;
+  if (TREE_CODE (p12.offset) != INTEGER_CST
+      || TREE_CODE (p22.offset) != INTEGER_CST)
+    {
+      comp_res = compare_tree (p12.offset, p22.offset);
+      if (comp_res != 0)
+	return comp_res;
+    }
+  if (tree_int_cst_compare (p12.offset, p22.offset) < 0)
+    return -1;
+  if (tree_int_cst_compare (p12.offset, p22.offset) > 0)
+    return 1;
+
+  return 0;
+}
+
+template <class T> static void
+swap (T& a, T& b)
+{
+  T c (a);
+  a = b;
+  b = c;
+}
+
+/* Function vect_vfa_segment_size.
+
+   Create an expression that computes the size of segment
+   that will be accessed for a data reference.  The functions takes into
+   account that realignment loads may access one more vector.
+
+   Input:
+     DR: The data reference.
+     LENGTH_FACTOR: segment length to consider.
+
+   Return an expression whose value is the size of segment which will be
+   accessed by DR.  */
+
+static tree
+vect_vfa_segment_size (struct data_reference *dr, tree length_factor)
+{
+  tree segment_length;
+
+  if (integer_zerop (DR_STEP (dr)))
+    segment_length = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr)));
+  else
+    segment_length = size_binop (MULT_EXPR,
+                                 fold_convert (sizetype, DR_STEP (dr)),
+                                 fold_convert (sizetype, length_factor));
+
+  if (vect_supportable_dr_alignment (dr, false)
+        == dr_explicit_realign_optimized)
+    {
+      tree vector_size = TYPE_SIZE_UNIT
+			  (STMT_VINFO_VECTYPE (vinfo_for_stmt (DR_STMT (dr))));
+
+      segment_length = size_binop (PLUS_EXPR, segment_length, vector_size);
+    }
+  return segment_length;
+}
+
 /* Function vect_prune_runtime_alias_test_list.
 
    Prune a list of ddrs to be tested at run-time by versioning for alias.
+   Merge several alias checks into one if possible.
    Return FALSE if resulting list of ddrs is longer then allowed by
    PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS, otherwise return TRUE.  */
 
 bool
 vect_prune_runtime_alias_test_list (loop_vec_info loop_vinfo)
 {
-  vec<ddr_p>  ddrs =
+  vec<ddr_p> may_alias_ddrs =
     LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo);
-  unsigned i, j;
+  vec<dr_addr_with_seg_len_pair_t>& comp_alias_ddrs =
+    LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo);
+  int vect_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
+  tree scalar_loop_iters = LOOP_VINFO_NITERS (loop_vinfo);
+
+  ddr_p ddr;
+  unsigned int i;
+  tree length_factor;
 
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location,
                      "=== vect_prune_runtime_alias_test_list ===\n");
 
-  for (i = 0; i < ddrs.length (); )
+  if (may_alias_ddrs.is_empty ())
+    return true;
+
+  /* Basically, for each pair of dependent data refs store_ptr_0
+     and load_ptr_0, we create an expression:
+
+     ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
+     || (load_ptr_0 + load_segment_length_0) <= store_ptr_0))
+
+     for aliasing checks.  However, in some cases we can decrease
+     the number of checks by combining two checks into one.  For
+     example, suppose we have another pair of data refs store_ptr_0
+     and load_ptr_1, and if the following condition is satisfied:
+
+     load_ptr_0 < load_ptr_1  &&
+     load_ptr_1 - load_ptr_0 - load_segment_length_0 < store_segment_length_0
+
+     (this condition means, in each iteration of vectorized loop,
+     the accessed memory of store_ptr_0 cannot be between the memory
+     of load_ptr_0 and load_ptr_1.)
+
+     we then can use only the following expression to finish the
+     alising checks between store_ptr_0 & load_ptr_0 and
+     store_ptr_0 & load_ptr_1:
+
+     ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
+     || (load_ptr_1 + load_segment_length_1 <= store_ptr_0))
+
+     Note that we only consider that load_ptr_0 and load_ptr_1 have the
+     same basic address.  */
+
+  comp_alias_ddrs.create (may_alias_ddrs.length ());
+
+  /* First, we collect all data ref pairs for aliasing checks.  */
+  FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
     {
-      bool found;
-      ddr_p ddr_i;
+      struct data_reference *dr_a, *dr_b;
+      gimple dr_group_first_a, dr_group_first_b;
+      tree segment_length_a, segment_length_b;
+      gimple stmt_a, stmt_b;
+
+      dr_a = DDR_A (ddr);
+      stmt_a = DR_STMT (DDR_A (ddr));
+      dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
+      if (dr_group_first_a)
+	{
+	  stmt_a = dr_group_first_a;
+	  dr_a = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_a));
+	}
 
-      ddr_i = ddrs[i];
-      found = false;
+      dr_b = DDR_B (ddr);
+      stmt_b = DR_STMT (DDR_B (ddr));
+      dr_group_first_b = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_b));
+      if (dr_group_first_b)
+	{
+	  stmt_b = dr_group_first_b;
+	  dr_b = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_b));
+	}
 
-      for (j = 0; j < i; j++)
-        {
-	  ddr_p ddr_j = ddrs[j];
+      if (!operand_equal_p (DR_STEP (dr_a), DR_STEP (dr_b), 0))
+	length_factor = scalar_loop_iters;
+      else
+	length_factor = size_int (vect_factor);
+      segment_length_a = vect_vfa_segment_size (dr_a, length_factor);
+      segment_length_b = vect_vfa_segment_size (dr_b, length_factor);
+
+      dr_addr_with_seg_len_pair_t dr_with_seg_len_pair
+	  (dr_addr_with_seg_len
+	       (dr_a, DR_BASE_ADDRESS (dr_a),
+		size_binop (PLUS_EXPR, DR_OFFSET (dr_a), DR_INIT (dr_a)),
+		segment_length_a),
+	   dr_addr_with_seg_len
+	       (dr_b, DR_BASE_ADDRESS (dr_b),
+		size_binop (PLUS_EXPR, DR_OFFSET (dr_b), DR_INIT (dr_b)),
+		segment_length_b));
+
+      if (compare_tree (dr_with_seg_len_pair.first.basic_addr,
+			dr_with_seg_len_pair.second.basic_addr) > 0)
+	swap (dr_with_seg_len_pair.first, dr_with_seg_len_pair.second);
+
+      comp_alias_ddrs.safe_push (dr_with_seg_len_pair);
+    }
 
-	  if (vect_vfa_range_equal (ddr_i, ddr_j))
+  /* Second, we sort the collected data ref pairs so that we can scan
+     them once to combine all possible aliasing checks.  */
+  comp_alias_ddrs.qsort (comp_dr_addr_with_seg_len_pair);
+
+  /* Third, we scan the sorted dr pairs and check if we can combine
+     alias checks of two neighbouring dr pairs.  */
+  for (size_t i = 1; i < comp_alias_ddrs.length (); ++i)
+    {
+      /* Deal with two ddrs (dr_a1, dr_b1) and (dr_a2, dr_b2).  */
+      dr_addr_with_seg_len *dr_a1 = &comp_alias_ddrs[i-1].first,
+			   *dr_b1 = &comp_alias_ddrs[i-1].second,
+			   *dr_a2 = &comp_alias_ddrs[i].first,
+			   *dr_b2 = &comp_alias_ddrs[i].second;
+
+      /* Remove duplicate data ref pairs.  */
+      if (*dr_a1 == *dr_a2 && *dr_b1 == *dr_b2)
+	{
+	  if (dump_enabled_p ())
+	    {
+	      dump_printf_loc (MSG_NOTE, vect_location,
+			       "found equal ranges ");
+	      dump_generic_expr (MSG_NOTE, TDF_SLIM,
+				 DR_REF (dr_a1->dr));
+	      dump_printf (MSG_NOTE,  ", ");
+	      dump_generic_expr (MSG_NOTE, TDF_SLIM,
+				 DR_REF (dr_b1->dr));
+	      dump_printf (MSG_NOTE,  " and ");
+	      dump_generic_expr (MSG_NOTE, TDF_SLIM,
+				 DR_REF (dr_a2->dr));
+	      dump_printf (MSG_NOTE,  ", ");
+	      dump_generic_expr (MSG_NOTE, TDF_SLIM,
+				 DR_REF (dr_b2->dr));
+	      dump_printf (MSG_NOTE, "\n");
+	    }
+
+	  comp_alias_ddrs.ordered_remove (i--);
+	  continue;
+	}
+
+      if (*dr_a1 == *dr_a2 || *dr_b1 == *dr_b2)
+	{
+	  /* We consider the case that DR_B1 and DR_B2 are same memrefs,
+	     and DR_A1 and DR_A2 are two consecutive memrefs.  */
+	  if (*dr_a1 == *dr_a2)
+	    {
+	      swap (dr_a1, dr_b1);
+	      swap (dr_a2, dr_b2);
+	    }
+
+	  if (!operand_equal_p (dr_a1->basic_addr, dr_a2->basic_addr, 0)
+	      || !host_integerp (dr_a1->offset, 0)
+	      || !host_integerp (dr_a2->offset, 0))
+	    continue;
+
+	  HOST_WIDEST_INT diff = widest_int_cst_value (dr_a2->offset) -
+				 widest_int_cst_value (dr_a1->offset);
+
+
+	  /* Now we check if the following condition is satisfied:
+
+	     DIFF - SEGMENT_LENGTH_A < SEGMENT_LENGTH_B
+
+	     where DIFF = DR_A2->OFFSET - DR_A1->OFFSET.  However,
+	     SEGMENT_LENGTH_A or SEGMENT_LENGTH_B may not be constant so we
+	     have to make a best estimation.  We can get the minimum value
+	     of SEGMENT_LENGTH_B as a constant, represented by MIN_SEG_LEN_B,
+	     then either of the following two conditions can guarantee the
+	     one above:
+
+	     1: DIFF <= MIN_SEG_LEN_B
+	     2: DIFF - SEGMENT_LENGTH_A < MIN_SEG_LEN_B
+
+	     */
+
+	  HOST_WIDEST_INT
+	  min_seg_len_b = (TREE_CODE (dr_b1->seg_len) == INTEGER_CST) ?
+			      widest_int_cst_value (dr_b1->seg_len) :
+			      vect_factor;
+
+	  if (diff <= min_seg_len_b
+	      || (TREE_CODE (dr_a1->seg_len) == INTEGER_CST
+		  && diff - widest_int_cst_value (dr_a1->seg_len) <
+		     min_seg_len_b))
 	    {
 	      if (dump_enabled_p ())
 		{
-		  dump_printf_loc (MSG_NOTE, vect_location,
-		                   "found equal ranges ");
-		  dump_generic_expr (MSG_NOTE, TDF_SLIM,
-		                     DR_REF (DDR_A (ddr_i)));
-		  dump_printf (MSG_NOTE,  ", ");
-		  dump_generic_expr (MSG_NOTE, TDF_SLIM,
-		                     DR_REF (DDR_B (ddr_i)));
-		  dump_printf (MSG_NOTE,  " and ");
-		  dump_generic_expr (MSG_NOTE, TDF_SLIM,
-		                     DR_REF (DDR_A (ddr_j)));
-		  dump_printf (MSG_NOTE,  ", ");
-		  dump_generic_expr (MSG_NOTE, TDF_SLIM,
-		                     DR_REF (DDR_B (ddr_j)));
+		  dump_printf_loc
+		      (MSG_NOTE, vect_location,
+		       "merging two runtime checks for data references ");
+		  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a1->dr));
+		  dump_printf (MSG_NOTE, " and ");
+		  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a2->dr));
 		  dump_printf (MSG_NOTE, "\n");
 		}
-	      found = true;
-	      break;
+
+	      dr_a1->seg_len = size_binop (PLUS_EXPR,
+					   dr_a2->seg_len, size_int (diff));
+	      comp_alias_ddrs.ordered_remove (i--);
 	    }
 	}
-
-      if (found)
-      {
-	ddrs.ordered_remove (i);
-	continue;
-      }
-      i++;
     }
 
-  if (ddrs.length () >
-       (unsigned) PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS))
+  if ((int) comp_alias_ddrs.length () >
+      PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS))
     {
       if (dump_enabled_p ())
 	{
@@ -2718,8 +2945,6 @@ vect_prune_runtime_alias_test_list (loop_vec_info loop_vinfo)
 	                   "generated checks exceeded.\n");
 	}
 
-      LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo).truncate (0);
-
       return false;
     }
 
diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 574446a..1e03853 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -2211,44 +2211,6 @@ vect_create_cond_for_align_checks (loop_vec_info loop_vinfo,
     *cond_expr = part_cond_expr;
 }
 
-
-/* Function vect_vfa_segment_size.
-
-   Create an expression that computes the size of segment
-   that will be accessed for a data reference.  The functions takes into
-   account that realignment loads may access one more vector.
-
-   Input:
-     DR: The data reference.
-     LENGTH_FACTOR: segment length to consider.
-
-   Return an expression whose value is the size of segment which will be
-   accessed by DR.  */
-
-static tree
-vect_vfa_segment_size (struct data_reference *dr, tree length_factor)
-{
-  tree segment_length;
-
-  if (integer_zerop (DR_STEP (dr)))
-    segment_length = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr)));
-  else
-    segment_length = size_binop (MULT_EXPR,
-                                 fold_convert (sizetype, DR_STEP (dr)),
-                                 fold_convert (sizetype, length_factor));
-
-  if (vect_supportable_dr_alignment (dr, false)
-        == dr_explicit_realign_optimized)
-    {
-      tree vector_size = TYPE_SIZE_UNIT
-			  (STMT_VINFO_VECTYPE (vinfo_for_stmt (DR_STMT (dr))));
-
-      segment_length = size_binop (PLUS_EXPR, segment_length, vector_size);
-    }
-  return segment_length;
-}
-
-
 /* Function vect_create_cond_for_alias_checks.
 
    Create a conditional expression that represents the run-time checks for
@@ -2257,28 +2219,24 @@ vect_vfa_segment_size (struct data_reference *dr, tree length_factor)
 
    Input:
    COND_EXPR  - input conditional expression.  New conditions will be chained
-                with logical AND operation.
+                with logical AND operation.  If it is NULL, then the function
+                is used to return the number of alias checks.
    LOOP_VINFO - field LOOP_VINFO_MAY_ALIAS_STMTS contains the list of ddrs
 	        to be checked.
 
    Output:
    COND_EXPR - conditional expression.
 
-   The returned value is the conditional expression to be used in the if
+   The returned COND_EXPR is the conditional expression to be used in the if
    statement that controls which version of the loop gets executed at runtime.
 */
 
-static void
+void
 vect_create_cond_for_alias_checks (loop_vec_info loop_vinfo, tree * cond_expr)
 {
-  vec<ddr_p>  may_alias_ddrs =
-    LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo);
-  int vect_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
-  tree scalar_loop_iters = LOOP_VINFO_NITERS (loop_vinfo);
-
-  ddr_p ddr;
-  unsigned int i;
-  tree part_cond_expr, length_factor;
+  vec<dr_addr_with_seg_len_pair_t> comp_alias_ddrs =
+    LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo);
+  tree part_cond_expr;
 
   /* Create expression
      ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
@@ -2289,70 +2247,39 @@ vect_create_cond_for_alias_checks (loop_vec_info loop_vinfo, tree * cond_expr)
      ((store_ptr_n + store_segment_length_n) <= load_ptr_n)
      || (load_ptr_n + load_segment_length_n) <= store_ptr_n))  */
 
-  if (may_alias_ddrs.is_empty ())
+  if (comp_alias_ddrs.is_empty ())
     return;
 
-  FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
+  for (size_t i = 0, s = comp_alias_ddrs.length (); i < s; ++i)
     {
-      struct data_reference *dr_a, *dr_b;
-      gimple dr_group_first_a, dr_group_first_b;
-      tree addr_base_a, addr_base_b;
-      tree segment_length_a, segment_length_b;
-      gimple stmt_a, stmt_b;
-      tree seg_a_min, seg_a_max, seg_b_min, seg_b_max;
-
-      dr_a = DDR_A (ddr);
-      stmt_a = DR_STMT (DDR_A (ddr));
-      dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
-      if (dr_group_first_a)
-        {
-	  stmt_a = dr_group_first_a;
-	  dr_a = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_a));
-	}
-
-      dr_b = DDR_B (ddr);
-      stmt_b = DR_STMT (DDR_B (ddr));
-      dr_group_first_b = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_b));
-      if (dr_group_first_b)
-        {
-	  stmt_b = dr_group_first_b;
-	  dr_b = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_b));
-	}
+      const dr_addr_with_seg_len& dr_a = comp_alias_ddrs[i].first;
+      const dr_addr_with_seg_len& dr_b = comp_alias_ddrs[i].second;
+      tree segment_length_a = dr_a.seg_len;
+      tree segment_length_b = dr_b.seg_len;
 
-      addr_base_a
-	= fold_build_pointer_plus (DR_BASE_ADDRESS (dr_a),
-				   size_binop (PLUS_EXPR, DR_OFFSET (dr_a),
-					       DR_INIT (dr_a)));
-      addr_base_b
-	= fold_build_pointer_plus (DR_BASE_ADDRESS (dr_b),
-				   size_binop (PLUS_EXPR, DR_OFFSET (dr_b),
-					       DR_INIT (dr_b)));
-
-      if (!operand_equal_p (DR_STEP (dr_a), DR_STEP (dr_b), 0))
-	length_factor = scalar_loop_iters;
-      else
-	length_factor = size_int (vect_factor);
-      segment_length_a = vect_vfa_segment_size (dr_a, length_factor);
-      segment_length_b = vect_vfa_segment_size (dr_b, length_factor);
+      tree addr_base_a
+	= fold_build_pointer_plus (dr_a.basic_addr, dr_a.offset);
+      tree addr_base_b
+	= fold_build_pointer_plus (dr_b.basic_addr, dr_b.offset);
 
       if (dump_enabled_p ())
 	{
 	  dump_printf_loc (MSG_NOTE, vect_location,
-                           "create runtime check for data references ");
-	  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a));
+			   "create runtime check for data references ");
+	  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a.dr));
 	  dump_printf (MSG_NOTE, " and ");
-	  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b));
-          dump_printf (MSG_NOTE, "\n");
+	  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b.dr));
+	  dump_printf (MSG_NOTE, "\n");
 	}
 
-      seg_a_min = addr_base_a;
-      seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
-      if (tree_int_cst_compare (DR_STEP (dr_a), size_zero_node) < 0)
+      tree seg_a_min = addr_base_a;
+      tree seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
+      if (tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0)
 	seg_a_min = seg_a_max, seg_a_max = addr_base_a;
 
-      seg_b_min = addr_base_b;
-      seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
-      if (tree_int_cst_compare (DR_STEP (dr_b), size_zero_node) < 0)
+      tree seg_b_min = addr_base_b;
+      tree seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
+      if (tree_int_cst_compare (DR_STEP (dr_b.dr), size_zero_node) < 0)
 	seg_b_min = seg_b_max, seg_b_max = addr_base_b;
 
       part_cond_expr =
@@ -2370,7 +2297,9 @@ vect_create_cond_for_alias_checks (loop_vec_info loop_vinfo, tree * cond_expr)
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location,
 		     "created %u versioning for alias checks.\n",
-		     may_alias_ddrs.length ());
+		     comp_alias_ddrs.length ());
+
+  comp_alias_ddrs.release ();
 }
 
 
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 8b7b345..f231b62 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -175,6 +175,36 @@ typedef struct _slp_oprnd_info
 
 
 
+/* This struct is used to store the information of a data reference,
+   including the data ref itself, its basic address, the access offset
+   and the segment length for aliasing checks.  This is used to generate
+   alias checks.  */
+
+struct dr_addr_with_seg_len
+{
+  dr_addr_with_seg_len (data_reference* d, tree addr, tree off, tree len)
+    : dr (d), basic_addr (addr), offset (off), seg_len (len) {}
+
+  data_reference *dr;
+  tree basic_addr;
+  tree offset;
+  tree seg_len;
+};
+
+/* This struct contains two dr_addr_with_seg_len objects with aliasing data
+   refs.  Two comparisons are generated from them.  */
+
+struct dr_addr_with_seg_len_pair_t
+{
+  dr_addr_with_seg_len_pair_t (const dr_addr_with_seg_len& d1,
+			       const dr_addr_with_seg_len& d2)
+    : first (d1), second (d2) {}
+
+  dr_addr_with_seg_len first;
+  dr_addr_with_seg_len second;
+};
+
+
 typedef struct _vect_peel_info
 {
   int npeel;
@@ -274,6 +304,10 @@ typedef struct _loop_vec_info {
      for a run-time aliasing check.  */
   vec<ddr_p> may_alias_ddrs;
 
+  /* Data Dependence Relations defining address ranges together with segment
+     lengths from which the run-time aliasing check is built.  */
+  vec<dr_addr_with_seg_len_pair_t> comp_alias_ddrs;
+
   /* Statements in the loop that have data references that are candidates for a
      runtime (loop versioning) misalignment check.  */
   vec<gimple> may_misalign_stmts;
@@ -336,6 +370,7 @@ typedef struct _loop_vec_info {
 #define LOOP_VINFO_MAY_MISALIGN_STMTS(L)   (L)->may_misalign_stmts
 #define LOOP_VINFO_LOC(L)                  (L)->loop_line_number
 #define LOOP_VINFO_MAY_ALIAS_DDRS(L)       (L)->may_alias_ddrs
+#define LOOP_VINFO_COMP_ALIAS_DDRS(L)      (L)->comp_alias_ddrs
 #define LOOP_VINFO_GROUPED_STORES(L)       (L)->grouped_stores
 #define LOOP_VINFO_SLP_INSTANCES(L)        (L)->slp_instances
 #define LOOP_VINFO_SLP_UNROLLING_FACTOR(L) (L)->slp_unrolling_factor

Comments

Richard Biener Oct. 23, 2013, 12:53 p.m. UTC | #1
On Tue, Oct 22, 2013 at 11:12 PM, Cong Hou <congh@google.com> wrote:
> On Fri, Oct 18, 2013 at 6:09 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Tue, Oct 15, 2013 at 12:43 AM, Cong Hou <congh@google.com> wrote:
>>> Sorry for forgetting using plain-text mode. Resend it.
>>>
>>>
>>> ---------- Forwarded message ----------
>>> From: Cong Hou <congh@google.com>
>>> Date: Mon, Oct 14, 2013 at 3:29 PM
>>> Subject: Re: [PATCH] Reducing number of alias checks in vectorization.
>>> To: Richard Biener <rguenther@suse.de>, GCC Patches <gcc-patches@gcc.gnu.org>
>>> Cc: Jakub Jelinek <jakub@redhat.com>
>>>
>>>
>>> I have made a new patch for this issue according to your comments.
>>>
>>> There are several modifications to my previous patch:
>>>
>>>
>>> 1. Remove the use of STL features such as vector and sort. Use GCC's
>>> vec and qsort instead.
>>
>> I think that using <utility> and thus std::pair and std::swap is ok.  Including
>> of <utility> should be done via system.h though.
>>
>
>
> Unfortunately, std::swap is defined in <algorithm>. Can I still use it?

Hmm, let's defer this issue for now, adding swap and pair templates
locally in GCC is better for now.  That is, I don't want a discussion of
what parts of the standard library to use inside this thread.  You may
want to rise the this issue in a new thread on gcc@gcc.gnu.org.

>>> 2. Comparisons between tree nodes are not based on their addresses any
>>> more. Use compare_tree() function instead.
>>
>> Ok.
>>
>>> 3. The function vect_create_cond_for_alias_checks() now returns the
>>> number of alias checks. If its second parameter cond_expr is NULL,
>>> then this function only calculate the number of alias checks after the
>>> merging and won't generate comparison expressions.
>>
>> We now perform the merging twice - that's easily avoided by recording
>> the merge result in loop_vinfo alongside may_alias_ddrs (which you
>> should be able to drop as they are already contained in the data
>> structures you build).
>>
>> Which means you can split the function and move the merging
>> part to vect_prune_runtime_alias_test_list.
>
>
> I have added one more field to loop_vec_info to store the dr pairs
> from which alias checks are built. And I also split the functionality
> of vect_create_cond_for_alias_checks() into two functions, so that we
> now perform the merging once instead of twice.
>
>
>>
>>> 4. The function vect_prune_runtime_alias_test_list() now uses
>>> vect_create_cond_for_alias_checks() to get the number of alias checks.
>>>
>>>
>>> The patch is attached as a text file.
>>>
>>> Please give me your comment on this patch. Thank you!
>>
>> +  if (!operand_equal_p (dr_a1->basic_addr, dr_a2->basic_addr, 0)
>> +      || TREE_CODE (dr_a1->offset) != INTEGER_CST
>> +      || TREE_CODE (dr_a2->offset) != INTEGER_CST)
>> +    continue;
>> +
>> +  HOST_WIDEST_INT diff = widest_int_cst_value (dr_a2->offset) -
>> + widest_int_cst_value (dr_a1->offset);
>>
>> use !host_integerp (dr_a1->offset) as check and then int_cst_value.
>
>
> Done.
>
>
>>
>> +  if (diff <= vect_factor
>> +      || (TREE_CODE (dr_a1->seg_len) == INTEGER_CST
>> +  && diff - widest_int_cst_value (dr_a1->seg_len) < vect_factor)
>> +      || (TREE_CODE (dr_a1->seg_len) == INTEGER_CST
>> +  && TREE_CODE (dr_b1->seg_len) == INTEGER_CST
>> +  && diff - widest_int_cst_value (dr_a1->seg_len) <
>> +     widest_int_cst_value (dr_b1->seg_len)))
>>
>> can you add a comment above this why it is safe to combine under
>> this condition?  You seem to combine offsets but not data-ref steps
>> for example.  The segment length is computed quite optimistically
>> and you don't seem to adjust that.  That is, generally the alias
>> check, if implemented as simple overlap, would need to intersect
>> [init, init + niter * step]  intervals, losing some non-aliasing cases.
>> The current segment length is optimistically computed and thus
>> can recover some more cases, but you have to be careful with
>> merging to not break its implicit assumptions.
>
>
> I have changed the code and made it more clear as shown below. I still
> use the condition
>
>
> DR_A2->OFFSET - DR_A1->OFFSET - SEGMENT_LENGTH_A < SEGMENT_LENGTH_B
>
>
> But in case that SEGMENT_LENGTH_A or SEGMENT_LENGTH_B is not constant,
> we can still check the following two conditions:
>
>
> DR_A2->OFFSET - DR_A1->OFFSET  < SEGMENT_LENGTH_B
> (if SEGMENT_LENGTH_A is not a constant).
>
>
> DR_A2->OFFSET - DR_A1->OFFSET - SEGMENT_LENGTH_A < VECT_FACTOR
> (if SEGMENT_LENGTH_B is not a constant, then its minimum value should
> be VECT_FACTOR).
>
>
>
> The corresponding code is:
>
>
>  /* Now we check if the following condition is satisfied:
>
>     DIFF - SEGMENT_LENGTH_A < SEGMENT_LENGTH_B
>
>     where DIFF = DR_A2->OFFSET - DR_A1->OFFSET.  However,
>     SEGMENT_LENGTH_A or SEGMENT_LENGTH_B may not be constant so we
>     have to make a best estimation.  We can get the minimum value
>     of SEGMENT_LENGTH_B as a constant, represented by MIN_SEG_LEN_B,
>     then either of the following two conditions can guarantee the
>     one above:
>
>     1: DIFF <= MIN_SEG_LEN_B
>     2: DIFF - SEGMENT_LENGTH_A < MIN_SEG_LEN_B
>
>     */
>
>  HOST_WIDEST_INT
>  min_seg_len_b = (TREE_CODE (dr_b1->seg_len) == INTEGER_CST) ?
>      widest_int_cst_value (dr_b1->seg_len) :
>      vect_factor;
>
>  if (diff <= min_seg_len_b
>      || (TREE_CODE (dr_a1->seg_len) == INTEGER_CST
>  && diff - widest_int_cst_value (dr_a1->seg_len) <
>     min_seg_len_b))
>    { ... }
>
>
>
> The new patch is pasted here again. Bootstrapping and regression tests
> are passed.
>
>
> thanks,
> Cong
>
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 8a38316..7333f6a 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,12 @@
> +2013-10-14  Cong Hou  <congh@google.com>
> +
> + * tree-vect-loop-manip.c (vect_create_cond_for_alias_checks):
> + Combine alias checks if it is possible to amortize the runtime
> + overhead.  Return the number of alias checks after merging.
> + * tree-vect-data-refs.c (vect_prune_runtime_alias_test_list):
> + Use the function vect_create_cond_for_alias_checks () to check
> + the number of alias checks.
> +

Your MUA eats tabs ... ;)

>  2013-10-14  David Malcolm  <dmalcolm@redhat.com>
>
>   * dumpfile.h (gcc::dump_manager): New class, to hold state
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 075d071..ea2782f 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,7 @@
> +2013-10-14  Cong Hou  <congh@google.com>
> +
> + * gcc.dg/vect/vect-alias-check.c: New.
> +
>  2013-10-14  Tobias Burnus  <burnus@net-b.de>
>
>   PR fortran/58658
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-alias-check.c
> b/gcc/testsuite/gcc.dg/vect/vect-alias-check.c
> new file mode 100644
> index 0000000..bf9eb6b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-alias-check.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -ftree-vectorize
> +     --param=vect-max-version-for-alias-checks=2
> +     -fdump-tree-vect-details" } */
> +
> +/* A test case showing three potential alias checks between
> +   a[i] and b[i], b[i+7], b[i+14]. With alias checks merging
> +   enabled, those tree checks can be merged into one, and the
> +   loop will be vectorized with vect-max-version-for-alias-checks=2.  */
> +
> +void foo (int *a, int *b)
> +{
> +  int i;
> +  for (i = 0; i < 1000; ++i)
> +    a[i] = b[i] + b[i+7] + b[i+14];
> +}
> +
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
> +/* { dg-final { cleanup-tree-dump "vect" } } */
> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index e7b2f33..5f70d58 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -128,41 +128,6 @@ vect_get_smallest_scalar_type (gimple stmt,
> HOST_WIDE_INT *lhs_size_unit,
>  }
>
>
> -/* Check if data references pointed by DR_I and DR_J are same or
> -   belong to same interleaving group.  Return FALSE if drs are
> -   different, otherwise return TRUE.  */
> -
> -static bool
> -vect_same_range_drs (data_reference_p dr_i, data_reference_p dr_j)
> -{
> -  gimple stmt_i = DR_STMT (dr_i);
> -  gimple stmt_j = DR_STMT (dr_j);
> -
> -  if (operand_equal_p (DR_REF (dr_i), DR_REF (dr_j), 0)
> -      || (GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_i))
> -    && GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_j))
> -    && (GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_i))
> - == GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_j)))))
> -    return true;
> -  else
> -    return false;
> -}
> -
> -/* If address ranges represented by DDR_I and DDR_J are equal,
> -   return TRUE, otherwise return FALSE.  */
> -
> -static bool
> -vect_vfa_range_equal (ddr_p ddr_i, ddr_p ddr_j)
> -{
> -  if ((vect_same_range_drs (DDR_A (ddr_i), DDR_A (ddr_j))
> -       && vect_same_range_drs (DDR_B (ddr_i), DDR_B (ddr_j)))
> -      || (vect_same_range_drs (DDR_A (ddr_i), DDR_B (ddr_j))
> -  && vect_same_range_drs (DDR_B (ddr_i), DDR_A (ddr_j))))
> -    return true;
> -  else
> -    return false;
> -}
> -
>  /* Insert DDR into LOOP_VINFO list of ddrs that may alias and need to be
>     tested at run-time.  Return TRUE if DDR was successfully inserted.
>     Return false if versioning is not supported.  */
> @@ -2647,69 +2612,331 @@ vect_analyze_data_ref_accesses (loop_vec_info
> loop_vinfo, bb_vec_info bb_vinfo)
>    return true;
>  }
>
> +
> +/* Operator == between two dr_addr_with_seg_len objects.
> +
> +   This equality operator is used to make sure two data refs
> +   are the same one so that we will consider to combine the
> +   aliasing checks of those two pairs of data dependent data
> +   refs.  */
> +
> +static bool
> +operator == (const dr_addr_with_seg_len& d1,
> +     const dr_addr_with_seg_len& d2)
> +{
> +  return operand_equal_p (d1.basic_addr, d2.basic_addr, 0)
> + && compare_tree (d1.offset, d2.offset) == 0
> + && compare_tree (d1.seg_len, d2.seg_len) == 0;
> +}
> +
> +/* Function comp_dr_addr_with_seg_len_pair.
> +
> +   Comparison function for sorting objects of dr_addr_with_seg_len_pair_t
> +   so that we can combine aliasing checks in one scan.  */
> +
> +static int
> +comp_dr_addr_with_seg_len_pair (const void *p1_, const void *p2_)
> +{
> +  const dr_addr_with_seg_len_pair_t* p1 =
> +    (const dr_addr_with_seg_len_pair_t *) p1_;
> +  const dr_addr_with_seg_len_pair_t* p2 =
> +    (const dr_addr_with_seg_len_pair_t *) p2_;
> +
> +  const dr_addr_with_seg_len &p11 = p1->first,
> +     &p12 = p1->second,
> +     &p21 = p2->first,
> +     &p22 = p2->second;
> +
> +  int comp_res = compare_tree (p11.basic_addr, p21.basic_addr);
> +  if (comp_res != 0)
> +    return comp_res;
> +
> +  comp_res = compare_tree (p12.basic_addr, p22.basic_addr);
> +  if (comp_res != 0)
> +    return comp_res;
> +
> +  if (TREE_CODE (p11.offset) != INTEGER_CST
> +      || TREE_CODE (p21.offset) != INTEGER_CST)
> +    {
> +      comp_res = compare_tree (p11.offset, p21.offset);
> +      if (comp_res != 0)
> + return comp_res;
> +    }
> +  if (tree_int_cst_compare (p11.offset, p21.offset) < 0)
> +    return -1;
> +  if (tree_int_cst_compare (p11.offset, p21.offset) > 0)
> +    return 1;
> +  if (TREE_CODE (p12.offset) != INTEGER_CST
> +      || TREE_CODE (p22.offset) != INTEGER_CST)
> +    {
> +      comp_res = compare_tree (p12.offset, p22.offset);
> +      if (comp_res != 0)
> + return comp_res;
> +    }
> +  if (tree_int_cst_compare (p12.offset, p22.offset) < 0)
> +    return -1;
> +  if (tree_int_cst_compare (p12.offset, p22.offset) > 0)
> +    return 1;
> +
> +  return 0;
> +}
> +
> +template <class T> static void
> +swap (T& a, T& b)
> +{
> +  T c (a);
> +  a = b;
> +  b = c;
> +}
> +
> +/* Function vect_vfa_segment_size.
> +
> +   Create an expression that computes the size of segment
> +   that will be accessed for a data reference.  The functions takes into
> +   account that realignment loads may access one more vector.
> +
> +   Input:
> +     DR: The data reference.
> +     LENGTH_FACTOR: segment length to consider.
> +
> +   Return an expression whose value is the size of segment which will be
> +   accessed by DR.  */
> +
> +static tree
> +vect_vfa_segment_size (struct data_reference *dr, tree length_factor)
> +{
> +  tree segment_length;
> +
> +  if (integer_zerop (DR_STEP (dr)))
> +    segment_length = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr)));
> +  else
> +    segment_length = size_binop (MULT_EXPR,
> +                                 fold_convert (sizetype, DR_STEP (dr)),
> +                                 fold_convert (sizetype, length_factor));
> +
> +  if (vect_supportable_dr_alignment (dr, false)
> +        == dr_explicit_realign_optimized)
> +    {
> +      tree vector_size = TYPE_SIZE_UNIT
> +  (STMT_VINFO_VECTYPE (vinfo_for_stmt (DR_STMT (dr))));
> +
> +      segment_length = size_binop (PLUS_EXPR, segment_length, vector_size);
> +    }
> +  return segment_length;
> +}
> +
>  /* Function vect_prune_runtime_alias_test_list.
>
>     Prune a list of ddrs to be tested at run-time by versioning for alias.
> +   Merge several alias checks into one if possible.
>     Return FALSE if resulting list of ddrs is longer then allowed by
>     PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS, otherwise return TRUE.  */
>
>  bool
>  vect_prune_runtime_alias_test_list (loop_vec_info loop_vinfo)
>  {
> -  vec<ddr_p>  ddrs =
> +  vec<ddr_p> may_alias_ddrs =
>      LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo);
> -  unsigned i, j;
> +  vec<dr_addr_with_seg_len_pair_t>& comp_alias_ddrs =
> +    LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo);
> +  int vect_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
> +  tree scalar_loop_iters = LOOP_VINFO_NITERS (loop_vinfo);
> +
> +  ddr_p ddr;
> +  unsigned int i;
> +  tree length_factor;
>
>    if (dump_enabled_p ())
>      dump_printf_loc (MSG_NOTE, vect_location,
>                       "=== vect_prune_runtime_alias_test_list ===\n");
>
> -  for (i = 0; i < ddrs.length (); )
> +  if (may_alias_ddrs.is_empty ())
> +    return true;
> +
> +  /* Basically, for each pair of dependent data refs store_ptr_0
> +     and load_ptr_0, we create an expression:
> +
> +     ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
> +     || (load_ptr_0 + load_segment_length_0) <= store_ptr_0))
> +
> +     for aliasing checks.  However, in some cases we can decrease
> +     the number of checks by combining two checks into one.  For
> +     example, suppose we have another pair of data refs store_ptr_0
> +     and load_ptr_1, and if the following condition is satisfied:
> +
> +     load_ptr_0 < load_ptr_1  &&
> +     load_ptr_1 - load_ptr_0 - load_segment_length_0 < store_segment_length_0
> +
> +     (this condition means, in each iteration of vectorized loop,
> +     the accessed memory of store_ptr_0 cannot be between the memory
> +     of load_ptr_0 and load_ptr_1.)
> +
> +     we then can use only the following expression to finish the
> +     alising checks between store_ptr_0 & load_ptr_0 and
> +     store_ptr_0 & load_ptr_1:
> +
> +     ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
> +     || (load_ptr_1 + load_segment_length_1 <= store_ptr_0))
> +
> +     Note that we only consider that load_ptr_0 and load_ptr_1 have the
> +     same basic address.  */
> +
> +  comp_alias_ddrs.create (may_alias_ddrs.length ());
> +
> +  /* First, we collect all data ref pairs for aliasing checks.  */
> +  FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
>      {
> -      bool found;
> -      ddr_p ddr_i;
> +      struct data_reference *dr_a, *dr_b;
> +      gimple dr_group_first_a, dr_group_first_b;
> +      tree segment_length_a, segment_length_b;
> +      gimple stmt_a, stmt_b;
> +
> +      dr_a = DDR_A (ddr);
> +      stmt_a = DR_STMT (DDR_A (ddr));
> +      dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
> +      if (dr_group_first_a)
> + {
> +  stmt_a = dr_group_first_a;
> +  dr_a = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_a));
> + }
>
> -      ddr_i = ddrs[i];
> -      found = false;
> +      dr_b = DDR_B (ddr);
> +      stmt_b = DR_STMT (DDR_B (ddr));
> +      dr_group_first_b = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_b));
> +      if (dr_group_first_b)
> + {
> +  stmt_b = dr_group_first_b;
> +  dr_b = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_b));
> + }
>
> -      for (j = 0; j < i; j++)
> -        {
> -  ddr_p ddr_j = ddrs[j];
> +      if (!operand_equal_p (DR_STEP (dr_a), DR_STEP (dr_b), 0))
> + length_factor = scalar_loop_iters;
> +      else
> + length_factor = size_int (vect_factor);

^^^^

this is the thing I am worried about.  When the steps are equal we
build reduced size segments to handle overlaps with a dependence
distance bigger than vect_factor.

> +      segment_length_a = vect_vfa_segment_size (dr_a, length_factor);
> +      segment_length_b = vect_vfa_segment_size (dr_b, length_factor);
> +
> +      dr_addr_with_seg_len_pair_t dr_with_seg_len_pair
> +  (dr_addr_with_seg_len
> +       (dr_a, DR_BASE_ADDRESS (dr_a),
> + size_binop (PLUS_EXPR, DR_OFFSET (dr_a), DR_INIT (dr_a)),
> + segment_length_a),
> +   dr_addr_with_seg_len
> +       (dr_b, DR_BASE_ADDRESS (dr_b),
> + size_binop (PLUS_EXPR, DR_OFFSET (dr_b), DR_INIT (dr_b)),
> + segment_length_b));
> +
> +      if (compare_tree (dr_with_seg_len_pair.first.basic_addr,
> + dr_with_seg_len_pair.second.basic_addr) > 0)
> + swap (dr_with_seg_len_pair.first, dr_with_seg_len_pair.second);
> +
> +      comp_alias_ddrs.safe_push (dr_with_seg_len_pair);
> +    }
>
> -  if (vect_vfa_range_equal (ddr_i, ddr_j))
> +  /* Second, we sort the collected data ref pairs so that we can scan
> +     them once to combine all possible aliasing checks.  */
> +  comp_alias_ddrs.qsort (comp_dr_addr_with_seg_len_pair);
> +
> +  /* Third, we scan the sorted dr pairs and check if we can combine
> +     alias checks of two neighbouring dr pairs.  */
> +  for (size_t i = 1; i < comp_alias_ddrs.length (); ++i)
> +    {
> +      /* Deal with two ddrs (dr_a1, dr_b1) and (dr_a2, dr_b2).  */
> +      dr_addr_with_seg_len *dr_a1 = &comp_alias_ddrs[i-1].first,
> +   *dr_b1 = &comp_alias_ddrs[i-1].second,
> +   *dr_a2 = &comp_alias_ddrs[i].first,
> +   *dr_b2 = &comp_alias_ddrs[i].second;
> +
> +      /* Remove duplicate data ref pairs.  */
> +      if (*dr_a1 == *dr_a2 && *dr_b1 == *dr_b2)
> + {
> +  if (dump_enabled_p ())
> +    {
> +      dump_printf_loc (MSG_NOTE, vect_location,
> +       "found equal ranges ");
> +      dump_generic_expr (MSG_NOTE, TDF_SLIM,
> + DR_REF (dr_a1->dr));
> +      dump_printf (MSG_NOTE,  ", ");
> +      dump_generic_expr (MSG_NOTE, TDF_SLIM,
> + DR_REF (dr_b1->dr));
> +      dump_printf (MSG_NOTE,  " and ");
> +      dump_generic_expr (MSG_NOTE, TDF_SLIM,
> + DR_REF (dr_a2->dr));
> +      dump_printf (MSG_NOTE,  ", ");
> +      dump_generic_expr (MSG_NOTE, TDF_SLIM,
> + DR_REF (dr_b2->dr));
> +      dump_printf (MSG_NOTE, "\n");
> +    }
> +
> +  comp_alias_ddrs.ordered_remove (i--);
> +  continue;
> + }
> +
> +      if (*dr_a1 == *dr_a2 || *dr_b1 == *dr_b2)
> + {
> +  /* We consider the case that DR_B1 and DR_B2 are same memrefs,
> +     and DR_A1 and DR_A2 are two consecutive memrefs.  */
> +  if (*dr_a1 == *dr_a2)
> +    {
> +      swap (dr_a1, dr_b1);
> +      swap (dr_a2, dr_b2);
> +    }
> +
> +  if (!operand_equal_p (dr_a1->basic_addr, dr_a2->basic_addr, 0)
> +      || !host_integerp (dr_a1->offset, 0)
> +      || !host_integerp (dr_a2->offset, 0))
> +    continue;
> +
> +  HOST_WIDEST_INT diff = widest_int_cst_value (dr_a2->offset) -
> + widest_int_cst_value (dr_a1->offset);
> +
> +
> +  /* Now we check if the following condition is satisfied:
> +
> +     DIFF - SEGMENT_LENGTH_A < SEGMENT_LENGTH_B
> +
> +     where DIFF = DR_A2->OFFSET - DR_A1->OFFSET.  However,
> +     SEGMENT_LENGTH_A or SEGMENT_LENGTH_B may not be constant so we
> +     have to make a best estimation.  We can get the minimum value
> +     of SEGMENT_LENGTH_B as a constant, represented by MIN_SEG_LEN_B,
> +     then either of the following two conditions can guarantee the
> +     one above:
> +
> +     1: DIFF <= MIN_SEG_LEN_B
> +     2: DIFF - SEGMENT_LENGTH_A < MIN_SEG_LEN_B
> +
> +     */
> +
> +  HOST_WIDEST_INT
> +  min_seg_len_b = (TREE_CODE (dr_b1->seg_len) == INTEGER_CST) ?
> +      widest_int_cst_value (dr_b1->seg_len) :

don't use widest_int_cst_value but tree_int_cst_low () if you previously
checked host_integerp.

> +      vect_factor;
> +
> +  if (diff <= min_seg_len_b
> +      || (TREE_CODE (dr_a1->seg_len) == INTEGER_CST
> +  && diff - widest_int_cst_value (dr_a1->seg_len) <
> +     min_seg_len_b))
>      {
>        if (dump_enabled_p ())
>   {
> -  dump_printf_loc (MSG_NOTE, vect_location,
> -                   "found equal ranges ");
> -  dump_generic_expr (MSG_NOTE, TDF_SLIM,
> -                     DR_REF (DDR_A (ddr_i)));
> -  dump_printf (MSG_NOTE,  ", ");
> -  dump_generic_expr (MSG_NOTE, TDF_SLIM,
> -                     DR_REF (DDR_B (ddr_i)));
> -  dump_printf (MSG_NOTE,  " and ");
> -  dump_generic_expr (MSG_NOTE, TDF_SLIM,
> -                     DR_REF (DDR_A (ddr_j)));
> -  dump_printf (MSG_NOTE,  ", ");
> -  dump_generic_expr (MSG_NOTE, TDF_SLIM,
> -                     DR_REF (DDR_B (ddr_j)));
> +  dump_printf_loc
> +      (MSG_NOTE, vect_location,
> +       "merging two runtime checks for data references ");
> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a1->dr));
> +  dump_printf (MSG_NOTE, " and ");
> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a2->dr));
>    dump_printf (MSG_NOTE, "\n");
>   }
> -      found = true;
> -      break;
> +
> +      dr_a1->seg_len = size_binop (PLUS_EXPR,
> +   dr_a2->seg_len, size_int (diff));
> +      comp_alias_ddrs.ordered_remove (i--);
>      }
>   }
> -
> -      if (found)
> -      {
> - ddrs.ordered_remove (i);
> - continue;
> -      }
> -      i++;
>      }
>
> -  if (ddrs.length () >
> -       (unsigned) PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS))
> +  if ((int) comp_alias_ddrs.length () >
> +      PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS))
>      {
>        if (dump_enabled_p ())
>   {
> @@ -2718,8 +2945,6 @@ vect_prune_runtime_alias_test_list
> (loop_vec_info loop_vinfo)
>                     "generated checks exceeded.\n");
>   }
>
> -      LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo).truncate (0);
> -
>        return false;
>      }
>
> diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
> index 574446a..1e03853 100644
> --- a/gcc/tree-vect-loop-manip.c
> +++ b/gcc/tree-vect-loop-manip.c
> @@ -2211,44 +2211,6 @@ vect_create_cond_for_align_checks
> (loop_vec_info loop_vinfo,
>      *cond_expr = part_cond_expr;
>  }
>
> -
> -/* Function vect_vfa_segment_size.
> -
> -   Create an expression that computes the size of segment
> -   that will be accessed for a data reference.  The functions takes into
> -   account that realignment loads may access one more vector.
> -
> -   Input:
> -     DR: The data reference.
> -     LENGTH_FACTOR: segment length to consider.
> -
> -   Return an expression whose value is the size of segment which will be
> -   accessed by DR.  */
> -
> -static tree
> -vect_vfa_segment_size (struct data_reference *dr, tree length_factor)
> -{
> -  tree segment_length;
> -
> -  if (integer_zerop (DR_STEP (dr)))
> -    segment_length = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr)));
> -  else
> -    segment_length = size_binop (MULT_EXPR,
> -                                 fold_convert (sizetype, DR_STEP (dr)),
> -                                 fold_convert (sizetype, length_factor));
> -
> -  if (vect_supportable_dr_alignment (dr, false)
> -        == dr_explicit_realign_optimized)
> -    {
> -      tree vector_size = TYPE_SIZE_UNIT
> -  (STMT_VINFO_VECTYPE (vinfo_for_stmt (DR_STMT (dr))));
> -
> -      segment_length = size_binop (PLUS_EXPR, segment_length, vector_size);
> -    }
> -  return segment_length;
> -}
> -
> -
>  /* Function vect_create_cond_for_alias_checks.
>
>     Create a conditional expression that represents the run-time checks for
> @@ -2257,28 +2219,24 @@ vect_vfa_segment_size (struct data_reference
> *dr, tree length_factor)
>
>     Input:
>     COND_EXPR  - input conditional expression.  New conditions will be chained
> -                with logical AND operation.
> +                with logical AND operation.  If it is NULL, then the function
> +                is used to return the number of alias checks.
>     LOOP_VINFO - field LOOP_VINFO_MAY_ALIAS_STMTS contains the list of ddrs
>          to be checked.
>
>     Output:
>     COND_EXPR - conditional expression.
>
> -   The returned value is the conditional expression to be used in the if
> +   The returned COND_EXPR is the conditional expression to be used in the if
>     statement that controls which version of the loop gets executed at runtime.
>  */
>
> -static void
> +void
>  vect_create_cond_for_alias_checks (loop_vec_info loop_vinfo, tree * cond_expr)
>  {
> -  vec<ddr_p>  may_alias_ddrs =
> -    LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo);
> -  int vect_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
> -  tree scalar_loop_iters = LOOP_VINFO_NITERS (loop_vinfo);
> -
> -  ddr_p ddr;
> -  unsigned int i;
> -  tree part_cond_expr, length_factor;
> +  vec<dr_addr_with_seg_len_pair_t> comp_alias_ddrs =
> +    LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo);
> +  tree part_cond_expr;
>
>    /* Create expression
>       ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
> @@ -2289,70 +2247,39 @@ vect_create_cond_for_alias_checks
> (loop_vec_info loop_vinfo, tree * cond_expr)
>       ((store_ptr_n + store_segment_length_n) <= load_ptr_n)
>       || (load_ptr_n + load_segment_length_n) <= store_ptr_n))  */
>
> -  if (may_alias_ddrs.is_empty ())
> +  if (comp_alias_ddrs.is_empty ())
>      return;
>
> -  FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
> +  for (size_t i = 0, s = comp_alias_ddrs.length (); i < s; ++i)
>      {
> -      struct data_reference *dr_a, *dr_b;
> -      gimple dr_group_first_a, dr_group_first_b;
> -      tree addr_base_a, addr_base_b;
> -      tree segment_length_a, segment_length_b;
> -      gimple stmt_a, stmt_b;
> -      tree seg_a_min, seg_a_max, seg_b_min, seg_b_max;
> -
> -      dr_a = DDR_A (ddr);
> -      stmt_a = DR_STMT (DDR_A (ddr));
> -      dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
> -      if (dr_group_first_a)
> -        {
> -  stmt_a = dr_group_first_a;
> -  dr_a = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_a));
> - }
> -
> -      dr_b = DDR_B (ddr);
> -      stmt_b = DR_STMT (DDR_B (ddr));
> -      dr_group_first_b = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_b));
> -      if (dr_group_first_b)
> -        {
> -  stmt_b = dr_group_first_b;
> -  dr_b = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_b));
> - }
> +      const dr_addr_with_seg_len& dr_a = comp_alias_ddrs[i].first;
> +      const dr_addr_with_seg_len& dr_b = comp_alias_ddrs[i].second;
> +      tree segment_length_a = dr_a.seg_len;
> +      tree segment_length_b = dr_b.seg_len;
>
> -      addr_base_a
> - = fold_build_pointer_plus (DR_BASE_ADDRESS (dr_a),
> -   size_binop (PLUS_EXPR, DR_OFFSET (dr_a),
> -       DR_INIT (dr_a)));
> -      addr_base_b
> - = fold_build_pointer_plus (DR_BASE_ADDRESS (dr_b),
> -   size_binop (PLUS_EXPR, DR_OFFSET (dr_b),
> -       DR_INIT (dr_b)));
> -
> -      if (!operand_equal_p (DR_STEP (dr_a), DR_STEP (dr_b), 0))
> - length_factor = scalar_loop_iters;
> -      else
> - length_factor = size_int (vect_factor);
> -      segment_length_a = vect_vfa_segment_size (dr_a, length_factor);
> -      segment_length_b = vect_vfa_segment_size (dr_b, length_factor);
> +      tree addr_base_a
> + = fold_build_pointer_plus (dr_a.basic_addr, dr_a.offset);
> +      tree addr_base_b
> + = fold_build_pointer_plus (dr_b.basic_addr, dr_b.offset);
>
>        if (dump_enabled_p ())
>   {
>    dump_printf_loc (MSG_NOTE, vect_location,
> -                           "create runtime check for data references ");
> -  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a));
> +   "create runtime check for data references ");
> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a.dr));
>    dump_printf (MSG_NOTE, " and ");
> -  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b));
> -          dump_printf (MSG_NOTE, "\n");
> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b.dr));
> +  dump_printf (MSG_NOTE, "\n");
>   }
>
> -      seg_a_min = addr_base_a;
> -      seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
> -      if (tree_int_cst_compare (DR_STEP (dr_a), size_zero_node) < 0)
> +      tree seg_a_min = addr_base_a;
> +      tree seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
> +      if (tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0)
>   seg_a_min = seg_a_max, seg_a_max = addr_base_a;
>
> -      seg_b_min = addr_base_b;
> -      seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
> -      if (tree_int_cst_compare (DR_STEP (dr_b), size_zero_node) < 0)
> +      tree seg_b_min = addr_base_b;
> +      tree seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
> +      if (tree_int_cst_compare (DR_STEP (dr_b.dr), size_zero_node) < 0)
>   seg_b_min = seg_b_max, seg_b_max = addr_base_b;
>
>        part_cond_expr =
> @@ -2370,7 +2297,9 @@ vect_create_cond_for_alias_checks (loop_vec_info
> loop_vinfo, tree * cond_expr)
>    if (dump_enabled_p ())
>      dump_printf_loc (MSG_NOTE, vect_location,
>       "created %u versioning for alias checks.\n",
> -     may_alias_ddrs.length ());
> +     comp_alias_ddrs.length ());
> +
> +  comp_alias_ddrs.release ();
>  }
>
>
> diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
> index 8b7b345..f231b62 100644
> --- a/gcc/tree-vectorizer.h
> +++ b/gcc/tree-vectorizer.h
> @@ -175,6 +175,36 @@ typedef struct _slp_oprnd_info
>
>
>
> +/* This struct is used to store the information of a data reference,
> +   including the data ref itself, its basic address, the access offset
> +   and the segment length for aliasing checks.  This is used to generate
> +   alias checks.  */
> +
> +struct dr_addr_with_seg_len
> +{
> +  dr_addr_with_seg_len (data_reference* d, tree addr, tree off, tree len)
> +    : dr (d), basic_addr (addr), offset (off), seg_len (len) {}
> +
> +  data_reference *dr;
> +  tree basic_addr;
> +  tree offset;
> +  tree seg_len;
> +};
> +
> +/* This struct contains two dr_addr_with_seg_len objects with aliasing data
> +   refs.  Two comparisons are generated from them.  */
> +
> +struct dr_addr_with_seg_len_pair_t
> +{
> +  dr_addr_with_seg_len_pair_t (const dr_addr_with_seg_len& d1,
> +       const dr_addr_with_seg_len& d2)
> +    : first (d1), second (d2) {}
> +
> +  dr_addr_with_seg_len first;
> +  dr_addr_with_seg_len second;
> +};
> +
> +
>  typedef struct _vect_peel_info
>  {
>    int npeel;
> @@ -274,6 +304,10 @@ typedef struct _loop_vec_info {
>       for a run-time aliasing check.  */
>    vec<ddr_p> may_alias_ddrs;
>
> +  /* Data Dependence Relations defining address ranges together with segment
> +     lengths from which the run-time aliasing check is built.  */
> +  vec<dr_addr_with_seg_len_pair_t> comp_alias_ddrs;
> +

Are you sure you release those on all paths that exit the vectorizer?

Thanks,
Richard.

>    /* Statements in the loop that have data references that are candidates for a
>       runtime (loop versioning) misalignment check.  */
>    vec<gimple> may_misalign_stmts;
> @@ -336,6 +370,7 @@ typedef struct _loop_vec_info {
>  #define LOOP_VINFO_MAY_MISALIGN_STMTS(L)   (L)->may_misalign_stmts
>  #define LOOP_VINFO_LOC(L)                  (L)->loop_line_number
>  #define LOOP_VINFO_MAY_ALIAS_DDRS(L)       (L)->may_alias_ddrs
> +#define LOOP_VINFO_COMP_ALIAS_DDRS(L)      (L)->comp_alias_ddrs
>  #define LOOP_VINFO_GROUPED_STORES(L)       (L)->grouped_stores
>  #define LOOP_VINFO_SLP_INSTANCES(L)        (L)->slp_instances
>  #define LOOP_VINFO_SLP_UNROLLING_FACTOR(L) (L)->slp_unrolling_factor
>
>
>
>
>
>>
>> Thanks,
>> Richard.
>>
>>>
>>> Cong
>>>
>>>
>>> On Thu, Oct 3, 2013 at 2:35 PM, Cong Hou <congh@google.com> wrote:
>>>>
>>>> Forget about this "aux" idea as the segment length for one data ref
>>>> can be different in different dr pairs.
>>>>
>>>> In my patch I created a struct as shown below:
>>>>
>>>> struct dr_addr_with_seg_len
>>>> {
>>>>   data_reference *dr;
>>>>   tree basic_addr;
>>>>   tree offset;
>>>>   tree seg_len;
>>>> };
>>>>
>>>>
>>>> Note that basic_addr and offset can always obtained from dr, but we
>>>> need to store two segment lengths for each dr pair. It is improper to
>>>> add a field to data_dependence_relation as it is defined outside of
>>>> vectorizer. We can change the type (a new one combining
>>>> data_dependence_relation and segment length) of may_alias_ddrs in
>>>> loop_vec_info to include such information, but we have to add a new
>>>> type to tree-vectorizer.h which is only used in two places - still too
>>>> much.
>>>>
>>>> One possible solution is that we create a local struct as shown above
>>>> and a new function which returns the merged alias check information.
>>>> This function will be called twice: once during analysis phase and
>>>> once in transformation phase. Then we don't have to store the merged
>>>> alias check information during those two phases. The additional time
>>>> cost is minimal as there will not be too many data dependent dr pairs
>>>> in a loop.
>>>>
>>>> Any comment?
>>>>
>>>>
>>>> thanks,
>>>> Cong
>>>>
>>>>
>>>> On Thu, Oct 3, 2013 at 10:57 AM, Cong Hou <congh@google.com> wrote:
>>>> > I noticed that there is a "struct dataref_aux" defined in
>>>> > tree-vectorizer.h which is specific to the vectorizer pass and is
>>>> > stored in (void*)aux in "struct data_reference". Can we add one more
>>>> > field "segment_length" to dataref_aux so that we can pass this
>>>> > information for merging alias checks? Then we can avoid to modify or
>>>> > create other structures.
>>>> >
>>>> >
>>>> > thanks,
>>>> > Cong
>>>> >
>>>> >
>>>> > On Wed, Oct 2, 2013 at 2:34 PM, Cong Hou <congh@google.com> wrote:
>>>> >> On Wed, Oct 2, 2013 at 4:24 AM, Richard Biener <rguenther@suse.de> wrote:
>>>> >>> On Tue, 1 Oct 2013, Cong Hou wrote:
>>>> >>>
>>>> >>>> When alias exists between data refs in a loop, to vectorize it GCC
>>>> >>>> does loop versioning and adds runtime alias checks. Basically for each
>>>> >>>> pair of data refs with possible data dependence, there will be two
>>>> >>>> comparisons generated to make sure there is no aliasing between them
>>>> >>>> in each iteration of the vectorized loop. If there are many such data
>>>> >>>> refs pairs, the number of comparisons can be very large, which is a
>>>> >>>> big overhead.
>>>> >>>>
>>>> >>>> However, in some cases it is possible to reduce the number of those
>>>> >>>> comparisons. For example, for the following loop, we can detect that
>>>> >>>> b[0] and b[1] are two consecutive member accesses so that we can
>>>> >>>> combine the alias check between a[0:100]&b[0] and a[0:100]&b[1] into
>>>> >>>> checking a[0:100]&b[0:2]:
>>>> >>>>
>>>> >>>> void foo(int*a, int* b)
>>>> >>>> {
>>>> >>>>    for (int i = 0; i < 100; ++i)
>>>> >>>>     a[i] = b[0] + b[1];
>>>> >>>> }
>>>> >>>>
>>>> >>>> Actually, the requirement of consecutive memory accesses is too
>>>> >>>> strict. For the following loop, we can still combine the alias checks
>>>> >>>> between a[0:100]&b[0] and a[0:100]&b[100]:
>>>> >>>>
>>>> >>>> void foo(int*a, int* b)
>>>> >>>> {
>>>> >>>>    for (int i = 0; i < 100; ++i)
>>>> >>>>     a[i] = b[0] + b[100];
>>>> >>>> }
>>>> >>>>
>>>> >>>> This is because if b[0] is not in a[0:100] and b[100] is not in
>>>> >>>> a[0:100] then a[0:100] cannot be between b[0] and b[100]. We only need
>>>> >>>> to check a[0:100] and b[0:101] don't overlap.
>>>> >>>>
>>>> >>>> More generally, consider two pairs of data refs (a, b1) and (a, b2).
>>>> >>>> Suppose addr_b1 and addr_b2 are basic addresses of data ref b1 and b2;
>>>> >>>> offset_b1 and offset_b2 (offset_b1 < offset_b2) are offsets of b1 and
>>>> >>>> b2, and segment_length_a, segment_length_b1, and segment_length_b2 are
>>>> >>>> segment length of a, b1, and b2. Then we can combine the two
>>>> >>>> comparisons into one if the following condition is satisfied:
>>>> >>>>
>>>> >>>> offset_b2- offset_b1 - segment_length_b1 < segment_length_a
>>>> >>>>
>>>> >>>>
>>>> >>>> This patch detects those combination opportunities to reduce the
>>>> >>>> number of alias checks. It is tested on an x86-64 machine.
>>>> >>>
>>>> >>> Apart from the other comments you got (to which I agree) the patch
>>>> >>> seems to do two things, namely also:
>>>> >>>
>>>> >>> +  /* Extract load and store statements on pointers with zero-stride
>>>> >>> +     accesses.  */
>>>> >>> +  if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
>>>> >>> +    {
>>>> >>>
>>>> >>> which I'd rather see in a separate patch (and done also when
>>>> >>> the loop doesn't require versioning for alias).
>>>> >>>
>>>> >>
>>>> >>
>>>> >> My mistake.. I am working on those two patches at the same time and
>>>> >> pasted that one also here by mistake. I will send another patch about
>>>> >> the "hoist" topic.
>>>> >>
>>>> >>
>>>> >>> Also combining the alias checks in vect_create_cond_for_alias_checks
>>>> >>> is nice but doesn't properly fix the use of the
>>>> >>> vect-max-version-for-alias-checks param which currently inhibits
>>>> >>> vectorization of the HIMENO benchmark by default (and make us look bad
>>>> >>> compared to LLVM).
>>>> >>>
>>>> >>> So I believe this merging should be done incrementally when
>>>> >>> we collect the DDRs we need to test in vect_mark_for_runtime_alias_test.
>>>> >>>
>>>> >>
>>>> >>
>>>> >> I agree that vect-max-version-for-alias-checks param should count the
>>>> >> number of checks after the merge. However, the struct
>>>> >> data_dependence_relation could not record the new information produced
>>>> >> by the merge. The new information I mentioned contains the new segment
>>>> >> length for comparisons. This length is calculated right in
>>>> >> vect_create_cond_for_alias_checks() function. Since
>>>> >> vect-max-version-for-alias-checks is used during analysis phase, shall
>>>> >> we move all those (get segment length for each data ref and merge
>>>> >> alias checks) from transformation to analysis phase? If we cannot
>>>> >> store the result properly (data_dependence_relation is not enough),
>>>> >> shall we do it twice in both phases?
>>>> >>
>>>> >> I also noticed a possible bug in the function vect_same_range_drs()
>>>> >> called by vect_prune_runtime_alias_test_list(). For the following code
>>>> >> I get two pairs of data refs after
>>>> >> vect_prune_runtime_alias_test_list(), but in
>>>> >> vect_create_cond_for_alias_checks() after detecting grouped accesses I
>>>> >> got two identical pairs of data refs. The consequence is two identical
>>>> >> alias checks are produced.
>>>> >>
>>>> >>
>>>> >> void yuv2yuyv_ref (int *d, int *src, int n)
>>>> >> {
>>>> >>   char *dest = (char *)d;
>>>> >>   int i;
>>>> >>
>>>> >>   for(i=0;i<n/2;i++){
>>>> >>     dest[i*4 + 0] = (src[i*2 + 0])>>16;
>>>> >>     dest[i*4 + 1] = (src[i*2 + 1])>>8;
>>>> >>     dest[i*4 + 2] = (src[i*2 + 0])>>16;
>>>> >>     dest[i*4 + 3] = (src[i*2 + 0])>>0;
>>>> >>   }
>>>> >> }
>>>> >>
>>>> >>
>>>> >> I think the solution to this problem is changing
>>>> >>
>>>> >> GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_i))
>>>> >> == GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_j)
>>>> >>
>>>> >> into
>>>> >>
>>>> >> STMT_VINFO_DATA_REF (vinfo_for_stmt (GROUP_FIRST_ELEMENT
>>>> >> (vinfo_for_stmt (stmt_i))))
>>>> >> == STMT_VINFO_DATA_REF (vinfo_for_stmt (GROUP_FIRST_ELEMENT
>>>> >> (vinfo_for_stmt (stmt_j)))
>>>> >>
>>>> >>
>>>> >> in function vect_same_range_drs(). What do you think about it?
>>>> >>
>>>> >>
>>>> >> thanks,
>>>> >> Cong
>>>> >>
>>>> >>
>>>> >>
>>>> >>> Thanks for working on this,
>>>> >>> Richard.
>>>> >>>
>>>> >>>>
>>>> >>>> thanks,
>>>> >>>> Cong
>>>> >>>>
>>>> >>>>
>>>> >>>>
>>>> >>>> Index: gcc/tree-vect-loop-manip.c
>>>> >>>> ===================================================================
>>>> >>>> --- gcc/tree-vect-loop-manip.c (revision 202662)
>>>> >>>> +++ gcc/tree-vect-loop-manip.c (working copy)
>>>> >>>> @@ -19,6 +19,10 @@ You should have received a copy of the G
>>>> >>>>  along with GCC; see the file COPYING3.  If not see
>>>> >>>>  <http://www.gnu.org/licenses/>.  */
>>>> >>>>
>>>> >>>> +#include <vector>
>>>> >>>> +#include <utility>
>>>> >>>> +#include <algorithm>
>>>> >>>> +
>>>> >>>>  #include "config.h"
>>>> >>>>  #include "system.h"
>>>> >>>>  #include "coretypes.h"
>>>> >>>> @@ -2248,6 +2252,74 @@ vect_vfa_segment_size (struct data_refer
>>>> >>>>    return segment_length;
>>>> >>>>  }
>>>> >>>>
>>>> >>>> +namespace
>>>> >>>> +{
>>>> >>>> +
>>>> >>>> +/* struct dr_addr_with_seg_len
>>>> >>>> +
>>>> >>>> +   A struct storing information of a data reference, including the data
>>>> >>>> +   ref itself, its basic address, the access offset and the segment length
>>>> >>>> +   for aliasing checks.  */
>>>> >>>> +
>>>> >>>> +struct dr_addr_with_seg_len
>>>> >>>> +{
>>>> >>>> +  dr_addr_with_seg_len (data_reference* d, tree addr, tree off, tree len)
>>>> >>>> +    : dr (d), basic_addr (addr), offset (off), seg_len (len) {}
>>>> >>>> +
>>>> >>>> +  data_reference* dr;
>>>> >>>> +  tree basic_addr;
>>>> >>>> +  tree offset;
>>>> >>>> +  tree seg_len;
>>>> >>>> +};
>>>> >>>> +
>>>> >>>> +/* Operator == between two dr_addr_with_seg_len objects.
>>>> >>>> +
>>>> >>>> +   This equality operator is used to make sure two data refs
>>>> >>>> +   are the same one so that we will consider to combine the
>>>> >>>> +   aliasing checks of those two pairs of data dependent data
>>>> >>>> +   refs.  */
>>>> >>>> +
>>>> >>>> +bool operator == (const dr_addr_with_seg_len& d1,
>>>> >>>> +  const dr_addr_with_seg_len& d2)
>>>> >>>> +{
>>>> >>>> +  return operand_equal_p (d1.basic_addr, d2.basic_addr, 0)
>>>> >>>> + && operand_equal_p (d1.offset, d2.offset, 0)
>>>> >>>> + && operand_equal_p (d1.seg_len, d2.seg_len, 0);
>>>> >>>> +}
>>>> >>>> +
>>>> >>>> +typedef std::pair <dr_addr_with_seg_len, dr_addr_with_seg_len>
>>>> >>>> + dr_addr_with_seg_len_pair_t;
>>>> >>>> +
>>>> >>>> +
>>>> >>>> +/* Operator < between two dr_addr_with_seg_len_pair_t objects.
>>>> >>>> +
>>>> >>>> +   This operator is used to sort objects of dr_addr_with_seg_len_pair_t
>>>> >>>> +   so that we can combine aliasing checks during one scan.  */
>>>> >>>> +
>>>> >>>> +bool operator < (const dr_addr_with_seg_len_pair_t& p1,
>>>> >>>> + const dr_addr_with_seg_len_pair_t& p2)
>>>> >>>> +{
>>>> >>>> +  const dr_addr_with_seg_len& p11 = p1.first;
>>>> >>>> +  const dr_addr_with_seg_len& p12 = p1.second;
>>>> >>>> +  const dr_addr_with_seg_len& p21 = p2.first;
>>>> >>>> +  const dr_addr_with_seg_len& p22 = p2.second;
>>>> >>>> +
>>>> >>>> +  if (p11.basic_addr != p21.basic_addr)
>>>> >>>> +    return p11.basic_addr < p21.basic_addr;
>>>> >>>> +  if (p12.basic_addr != p22.basic_addr)
>>>> >>>> +    return p12.basic_addr < p22.basic_addr;
>>>> >>>> +  if (TREE_CODE (p11.offset) != INTEGER_CST
>>>> >>>> +      || TREE_CODE (p21.offset) != INTEGER_CST)
>>>> >>>> +    return p11.offset < p21.offset;
>>>> >>>> +  if (int_cst_value (p11.offset) != int_cst_value (p21.offset))
>>>> >>>> +    return int_cst_value (p11.offset) < int_cst_value (p21.offset);
>>>> >>>> +  if (TREE_CODE (p12.offset) != INTEGER_CST
>>>> >>>> +      || TREE_CODE (p22.offset) != INTEGER_CST)
>>>> >>>> +    return p12.offset < p22.offset;
>>>> >>>> +  return int_cst_value (p12.offset) < int_cst_value (p22.offset);
>>>> >>>> +}
>>>> >>>> +
>>>> >>>> +}
>>>> >>>>
>>>> >>>>  /* Function vect_create_cond_for_alias_checks.
>>>> >>>>
>>>> >>>> @@ -2292,20 +2364,51 @@ vect_create_cond_for_alias_checks (loop_
>>>> >>>>    if (may_alias_ddrs.is_empty ())
>>>> >>>>      return;
>>>> >>>>
>>>> >>>> +
>>>> >>>> +  /* Basically, for each pair of dependent data refs store_ptr_0
>>>> >>>> +     and load_ptr_0, we create an expression:
>>>> >>>> +
>>>> >>>> +     ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
>>>> >>>> +     || (load_ptr_0 + load_segment_length_0) <= store_ptr_0))
>>>> >>>> +
>>>> >>>> +     for aliasing checks. However, in some cases we can decrease
>>>> >>>> +     the number of checks by combining two checks into one. For
>>>> >>>> +     example, suppose we have another pair of data refs store_ptr_0
>>>> >>>> +     and load_ptr_1, and if the following condition is satisfied:
>>>> >>>> +
>>>> >>>> +     load_ptr_0 < load_ptr_1  &&
>>>> >>>> +     load_ptr_1 - load_ptr_0 - load_segment_length_0 < store_segment_length_0
>>>> >>>> +
>>>> >>>> +     (this condition means, in each iteration of vectorized loop,
>>>> >>>> +     the accessed memory of store_ptr_0 cannot be between the memory
>>>> >>>> +     of load_ptr_0 and load_ptr_1.)
>>>> >>>> +
>>>> >>>> +     we then can use only the following expression to finish the
>>>> >>>> +     alising checks between store_ptr_0 & load_ptr_0 and
>>>> >>>> +     store_ptr_0 & load_ptr_1:
>>>> >>>> +
>>>> >>>> +     ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
>>>> >>>> +     || (load_ptr_1 + load_segment_length_1 <= store_ptr_0))
>>>> >>>> +
>>>> >>>> +     Note that we only consider that load_ptr_0 and load_ptr_1 have the
>>>> >>>> +     same basic address.  */
>>>> >>>> +
>>>> >>>> +  std::vector<dr_addr_with_seg_len_pair_t> ddrs_with_seg_len;
>>>> >>>> +
>>>> >>>> +  /* First, we collect all data ref pairs for aliasing checks.  */
>>>> >>>> +
>>>> >>>>    FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
>>>> >>>>      {
>>>> >>>>        struct data_reference *dr_a, *dr_b;
>>>> >>>>        gimple dr_group_first_a, dr_group_first_b;
>>>> >>>> -      tree addr_base_a, addr_base_b;
>>>> >>>>        tree segment_length_a, segment_length_b;
>>>> >>>>        gimple stmt_a, stmt_b;
>>>> >>>> -      tree seg_a_min, seg_a_max, seg_b_min, seg_b_max;
>>>> >>>>
>>>> >>>>        dr_a = DDR_A (ddr);
>>>> >>>>        stmt_a = DR_STMT (DDR_A (ddr));
>>>> >>>>        dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
>>>> >>>>        if (dr_group_first_a)
>>>> >>>> -        {
>>>> >>>> + {
>>>> >>>>    stmt_a = dr_group_first_a;
>>>> >>>>    dr_a = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_a));
>>>> >>>>   }
>>>> >>>> @@ -2314,20 +2417,11 @@ vect_create_cond_for_alias_checks (loop_
>>>> >>>>        stmt_b = DR_STMT (DDR_B (ddr));
>>>> >>>>        dr_group_first_b = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_b));
>>>> >>>>        if (dr_group_first_b)
>>>> >>>> -        {
>>>> >>>> + {
>>>> >>>>    stmt_b = dr_group_first_b;
>>>> >>>>    dr_b = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_b));
>>>> >>>>   }
>>>> >>>>
>>>> >>>> -      addr_base_a
>>>> >>>> - = fold_build_pointer_plus (DR_BASE_ADDRESS (dr_a),
>>>> >>>> -   size_binop (PLUS_EXPR, DR_OFFSET (dr_a),
>>>> >>>> -       DR_INIT (dr_a)));
>>>> >>>> -      addr_base_b
>>>> >>>> - = fold_build_pointer_plus (DR_BASE_ADDRESS (dr_b),
>>>> >>>> -   size_binop (PLUS_EXPR, DR_OFFSET (dr_b),
>>>> >>>> -       DR_INIT (dr_b)));
>>>> >>>> -
>>>> >>>>        if (!operand_equal_p (DR_STEP (dr_a), DR_STEP (dr_b), 0))
>>>> >>>>   length_factor = scalar_loop_iters;
>>>> >>>>        else
>>>> >>>> @@ -2335,24 +2429,149 @@ vect_create_cond_for_alias_checks (loop_
>>>> >>>>        segment_length_a = vect_vfa_segment_size (dr_a, length_factor);
>>>> >>>>        segment_length_b = vect_vfa_segment_size (dr_b, length_factor);
>>>> >>>>
>>>> >>>> +      dr_addr_with_seg_len_pair_t dr_with_seg_len_pair
>>>> >>>> +  (dr_addr_with_seg_len
>>>> >>>> +       (dr_a, DR_BASE_ADDRESS (dr_a),
>>>> >>>> + size_binop (PLUS_EXPR, DR_OFFSET (dr_a), DR_INIT (dr_a)),
>>>> >>>> + segment_length_a),
>>>> >>>> +   dr_addr_with_seg_len
>>>> >>>> +       (dr_b, DR_BASE_ADDRESS (dr_b),
>>>> >>>> + size_binop (PLUS_EXPR, DR_OFFSET (dr_b), DR_INIT (dr_b)),
>>>> >>>> + segment_length_b));
>>>> >>>> +
>>>> >>>> +      if (dr_with_seg_len_pair.first.basic_addr >
>>>> >>>> +  dr_with_seg_len_pair.second.basic_addr)
>>>> >>>> + std::swap (dr_with_seg_len_pair.first, dr_with_seg_len_pair.second);
>>>> >>>> +
>>>> >>>> +      ddrs_with_seg_len.push_back (dr_with_seg_len_pair);
>>>> >>>> +    }
>>>> >>>> +
>>>> >>>> +  /* Second, we sort the collected data ref pairs so that we can scan
>>>> >>>> +     them once to combine all possible aliasing checks.  */
>>>> >>>> +
>>>> >>>> +  std::sort (ddrs_with_seg_len.begin(), ddrs_with_seg_len.end());
>>>> >>>> +
>>>> >>>> +  /* Remove duplicate data ref pairs.  */
>>>> >>>> +  ddrs_with_seg_len.erase (std::unique (ddrs_with_seg_len.begin(),
>>>> >>>> + ddrs_with_seg_len.end()),
>>>> >>>> +   ddrs_with_seg_len.end());
>>>> >>>> +
>>>> >>>> +  /* We then scan the sorted dr pairs and check if we can combine
>>>> >>>> +     alias checks of two neighbouring dr pairs.  */
>>>> >>>> +
>>>> >>>> +  for (size_t i = 1; i < ddrs_with_seg_len.size (); ++i)
>>>> >>>> +    {
>>>> >>>> +      dr_addr_with_seg_len& dr_a1 = ddrs_with_seg_len[i-1].first;
>>>> >>>> +      dr_addr_with_seg_len& dr_b1 = ddrs_with_seg_len[i-1].second;
>>>> >>>> +      dr_addr_with_seg_len& dr_a2 = ddrs_with_seg_len[i].first;
>>>> >>>> +      dr_addr_with_seg_len& dr_b2 = ddrs_with_seg_len[i].second;
>>>> >>>> +
>>>> >>>> +      if (dr_a1 == dr_a2)
>>>> >>>> + {
>>>> >>>> +  if (dr_b1.basic_addr != dr_b2.basic_addr
>>>> >>>> +      || TREE_CODE (dr_b1.offset) != INTEGER_CST
>>>> >>>> +      || TREE_CODE (dr_b2.offset) != INTEGER_CST)
>>>> >>>> +    continue;
>>>> >>>> +
>>>> >>>> +  int diff = int_cst_value (dr_b2.offset) -
>>>> >>>> +     int_cst_value (dr_b1.offset);
>>>> >>>> +
>>>> >>>> +  gcc_assert (diff > 0);
>>>> >>>> +
>>>> >>>> +  if (diff <= vect_factor
>>>> >>>> +      || (TREE_CODE (dr_b1.seg_len) == INTEGER_CST
>>>> >>>> +  && diff - int_cst_value (dr_b1.seg_len) < vect_factor)
>>>> >>>> +      || (TREE_CODE (dr_b1.seg_len) == INTEGER_CST
>>>> >>>> +  && TREE_CODE (dr_a1.seg_len) == INTEGER_CST
>>>> >>>> +  && diff - int_cst_value (dr_b1.seg_len) <
>>>> >>>> +     int_cst_value (dr_a1.seg_len)))
>>>> >>>> +    {
>>>> >>>> +      if (dump_enabled_p ())
>>>> >>>> + {
>>>> >>>> +  dump_printf_loc
>>>> >>>> +      (MSG_NOTE, vect_location,
>>>> >>>> +       "combining two runtime checks for data references ");
>>>> >>>> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b1.dr));
>>>> >>>> +  dump_printf (MSG_NOTE, " and ");
>>>> >>>> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b2.dr));
>>>> >>>> +  dump_printf (MSG_NOTE, "\n");
>>>> >>>> + }
>>>> >>>> +
>>>> >>>> +      dr_b1.seg_len = size_binop (PLUS_EXPR,
>>>> >>>> +  dr_b2.seg_len, size_int (diff));
>>>> >>>> +      ddrs_with_seg_len.erase (ddrs_with_seg_len.begin () + i);
>>>> >>>> +      --i;
>>>> >>>> +    }
>>>> >>>> + }
>>>> >>>> +      else if (dr_b1 == dr_b2)
>>>> >>>> + {
>>>> >>>> +  if (dr_a1.basic_addr != dr_a2.basic_addr
>>>> >>>> +      || TREE_CODE (dr_a1.offset) != INTEGER_CST
>>>> >>>> +      || TREE_CODE (dr_a2.offset) != INTEGER_CST)
>>>> >>>> +    continue;
>>>> >>>> +
>>>> >>>> +  int diff = int_cst_value (dr_a2.offset) -
>>>> >>>> +     int_cst_value (dr_a1.offset);
>>>> >>>> +
>>>> >>>> +  gcc_assert (diff > 0);
>>>> >>>> +
>>>> >>>> +  if (diff <= vect_factor
>>>> >>>> +      || (TREE_CODE (dr_a1.seg_len) == INTEGER_CST
>>>> >>>> +  && diff - int_cst_value (dr_a1.seg_len) < vect_factor)
>>>> >>>> +      || (TREE_CODE (dr_a1.seg_len) == INTEGER_CST
>>>> >>>> +  && TREE_CODE (dr_b1.seg_len) == INTEGER_CST
>>>> >>>> +  && diff - int_cst_value (dr_a1.seg_len) <
>>>> >>>> +     int_cst_value (dr_b1.seg_len)))
>>>> >>>> +    {
>>>> >>>> +      if (dump_enabled_p ())
>>>> >>>> + {
>>>> >>>> +  dump_printf_loc
>>>> >>>> +      (MSG_NOTE, vect_location,
>>>> >>>> +       "combining two runtime checks for data references ");
>>>> >>>> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a1.dr));
>>>> >>>> +  dump_printf (MSG_NOTE, " and ");
>>>> >>>> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a2.dr));
>>>> >>>> +  dump_printf (MSG_NOTE, "\n");
>>>> >>>> + }
>>>> >>>> +
>>>> >>>> +      dr_a1.seg_len = size_binop (PLUS_EXPR,
>>>> >>>> +  dr_a2.seg_len, size_int (diff));
>>>> >>>> +      ddrs_with_seg_len.erase (ddrs_with_seg_len.begin () + i);
>>>> >>>> +      --i;
>>>> >>>> +    }
>>>> >>>> + }
>>>> >>>> +    }
>>>> >>>> +
>>>> >>>> +  for (size_t i = 0, s = ddrs_with_seg_len.size (); i < s; ++i)
>>>> >>>> +    {
>>>> >>>> +      const dr_addr_with_seg_len& dr_a = ddrs_with_seg_len[i].first;
>>>> >>>> +      const dr_addr_with_seg_len& dr_b = ddrs_with_seg_len[i].second;
>>>> >>>> +      tree segment_length_a = dr_a.seg_len;
>>>> >>>> +      tree segment_length_b = dr_b.seg_len;
>>>> >>>> +
>>>> >>>> +      tree addr_base_a
>>>> >>>> + = fold_build_pointer_plus (dr_a.basic_addr, dr_a.offset);
>>>> >>>> +      tree addr_base_b
>>>> >>>> + = fold_build_pointer_plus (dr_b.basic_addr, dr_b.offset);
>>>> >>>> +
>>>> >>>>        if (dump_enabled_p ())
>>>> >>>>   {
>>>> >>>>    dump_printf_loc (MSG_NOTE, vect_location,
>>>> >>>> -                           "create runtime check for data references ");
>>>> >>>> -  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a));
>>>> >>>> +   "create runtime check for data references ");
>>>> >>>> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a.dr));
>>>> >>>>    dump_printf (MSG_NOTE, " and ");
>>>> >>>> -  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b));
>>>> >>>> -          dump_printf (MSG_NOTE, "\n");
>>>> >>>> +  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b.dr));
>>>> >>>> +  dump_printf (MSG_NOTE, "\n");
>>>> >>>>   }
>>>> >>>>
>>>> >>>> -      seg_a_min = addr_base_a;
>>>> >>>> -      seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
>>>> >>>> -      if (tree_int_cst_compare (DR_STEP (dr_a), size_zero_node) < 0)
>>>> >>>> +      tree seg_a_min = addr_base_a;
>>>> >>>> +      tree seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
>>>> >>>> +      if (tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0)
>>>> >>>>   seg_a_min = seg_a_max, seg_a_max = addr_base_a;
>>>> >>>>
>>>> >>>> -      seg_b_min = addr_base_b;
>>>> >>>> -      seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
>>>> >>>> -      if (tree_int_cst_compare (DR_STEP (dr_b), size_zero_node) < 0)
>>>> >>>> +      tree seg_b_min = addr_base_b;
>>>> >>>> +      tree seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
>>>> >>>> +      if (tree_int_cst_compare (DR_STEP (dr_b.dr), size_zero_node) < 0)
>>>> >>>>   seg_b_min = seg_b_max, seg_b_max = addr_base_b;
>>>> >>>>
>>>> >>>>        part_cond_expr =
>>>> >>>> @@ -2477,6 +2696,81 @@ vect_loop_versioning (loop_vec_info loop
>>>> >>>>        adjust_phi_and_debug_stmts (orig_phi, e, PHI_RESULT (new_phi));
>>>> >>>>      }
>>>> >>>>
>>>> >>>> +  /* Extract load and store statements on pointers with zero-stride
>>>> >>>> +     accesses.  */
>>>> >>>> +  if (LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo))
>>>> >>>> +    {
>>>> >>>> +
>>>> >>>> +      /* In the loop body, we iterate each statement to check if it is a load
>>>> >>>> + or store. Then we check the DR_STEP of the data reference.  If
>>>> >>>> + DR_STEP is zero, then we will hoist the load statement to the loop
>>>> >>>> + preheader, and move the store statement to the loop exit.  */
>>>> >>>> +
>>>> >>>> +      for (gimple_stmt_iterator si = gsi_start_bb (loop->header);
>>>> >>>> +   !gsi_end_p (si); )
>>>> >>>> + {
>>>> >>>> +  gimple stmt = gsi_stmt (si);
>>>> >>>> +  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
>>>> >>>> +  struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
>>>> >>>> +
>>>> >>>> +
>>>> >>>> +  if (dr && integer_zerop (DR_STEP (dr)))
>>>> >>>> +    {
>>>> >>>> +      if (DR_IS_READ (dr))
>>>> >>>> + {
>>>> >>>> +  if (dump_file)
>>>> >>>> +    {
>>>> >>>> +      fprintf (dump_file,
>>>> >>>> +       "Hoist the load to outside of the loop:\n");
>>>> >>>> +      print_gimple_stmt (dump_file, stmt, 0,
>>>> >>>> + TDF_VOPS|TDF_MEMSYMS);
>>>> >>>> +    }
>>>> >>>> +
>>>> >>>> +  basic_block preheader = loop_preheader_edge (loop)->src;
>>>> >>>> +  gimple_stmt_iterator si_dst = gsi_last_bb (preheader);
>>>> >>>> +  gsi_move_after (&si, &si_dst);
>>>> >>>> + }
>>>> >>>> +      else
>>>> >>>> + {
>>>> >>>> +  gimple_stmt_iterator si_dst =
>>>> >>>> +      gsi_last_bb (single_exit (loop)->dest);
>>>> >>>> +  gsi_move_after (&si, &si_dst);
>>>> >>>> + }
>>>> >>>> +              continue;
>>>> >>>> +    }
>>>> >>>> +  else if (!dr)
>>>> >>>> +          {
>>>> >>>> +            bool hoist = true;
>>>> >>>> +            for (size_t i = 0; i < gimple_num_ops (stmt); i++)
>>>> >>>> +            {
>>>> >>>> +              tree op = gimple_op (stmt, i);
>>>> >>>> +              if (TREE_CODE (op) == INTEGER_CST
>>>> >>>> +                  || TREE_CODE (op) == REAL_CST)
>>>> >>>> +                continue;
>>>> >>>> +              if (TREE_CODE (op) == SSA_NAME)
>>>> >>>> +              {
>>>> >>>> +                gimple def = SSA_NAME_DEF_STMT (op);
>>>> >>>> +                if (def == stmt
>>>> >>>> +                    || gimple_nop_p (def)
>>>> >>>> +                    || !flow_bb_inside_loop_p (loop, gimple_bb (def)))
>>>> >>>> +                  continue;
>>>> >>>> +              }
>>>> >>>> +              hoist = false;
>>>> >>>> +              break;
>>>> >>>> +            }
>>>> >>>> +
>>>> >>>> +            if (hoist)
>>>> >>>> +            {
>>>> >>>> +              basic_block preheader = loop_preheader_edge (loop)->src;
>>>> >>>> +              gimple_stmt_iterator si_dst = gsi_last_bb (preheader);
>>>> >>>> +              gsi_move_after (&si, &si_dst);
>>>> >>>> +              continue;
>>>> >>>> +            }
>>>> >>>> +          }
>>>> >>>> +          gsi_next (&si);
>>>> >>>> + }
>>>> >>>> +    }
>>>> >>>> +
>>>> >>>>    /* End loop-exit-fixes after versioning.  */
>>>> >>>>
>>>> >>>>    if (cond_expr_stmt_list)
>>>> >>>> Index: gcc/ChangeLog
>>>> >>>> ===================================================================
>>>> >>>> --- gcc/ChangeLog (revision 202663)
>>>> >>>> +++ gcc/ChangeLog (working copy)
>>>> >>>> @@ -1,3 +1,8 @@
>>>> >>>> +2013-10-01  Cong Hou  <congh@google.com>
>>>> >>>> +
>>>> >>>> + * tree-vect-loop-manip.c (vect_create_cond_for_alias_checks): Combine
>>>> >>>> + alias checks if it is possible to amortize the runtime overhead.
>>>> >>>> +
>>>> >>>>
>>>> >>>>
>>>> >>>
>>>> >>> --
>>>> >>> Richard Biener <rguenther@suse.de>
>>>> >>> SUSE / SUSE Labs
>>>> >>> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
>>>> >>> GF: Jeff Hawn, Jennifer Guild, Felix Imend
diff mbox

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 8a38316..7333f6a 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@ 
+2013-10-14  Cong Hou  <congh@google.com>
+
+ * tree-vect-loop-manip.c (vect_create_cond_for_alias_checks):
+ Combine alias checks if it is possible to amortize the runtime
+ overhead.  Return the number of alias checks after merging.
+ * tree-vect-data-refs.c (vect_prune_runtime_alias_test_list):
+ Use the function vect_create_cond_for_alias_checks () to check
+ the number of alias checks.
+
 2013-10-14  David Malcolm  <dmalcolm@redhat.com>

  * dumpfile.h (gcc::dump_manager): New class, to hold state
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 075d071..ea2782f 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,7 @@ 
+2013-10-14  Cong Hou  <congh@google.com>
+
+ * gcc.dg/vect/vect-alias-check.c: New.
+
 2013-10-14  Tobias Burnus  <burnus@net-b.de>

  PR fortran/58658
diff --git a/gcc/testsuite/gcc.dg/vect/vect-alias-check.c
b/gcc/testsuite/gcc.dg/vect/vect-alias-check.c
new file mode 100644
index 0000000..bf9eb6b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-alias-check.c
@@ -0,0 +1,19 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize
+     --param=vect-max-version-for-alias-checks=2
+     -fdump-tree-vect-details" } */
+
+/* A test case showing three potential alias checks between
+   a[i] and b[i], b[i+7], b[i+14]. With alias checks merging
+   enabled, those tree checks can be merged into one, and the
+   loop will be vectorized with vect-max-version-for-alias-checks=2.  */
+
+void foo (int *a, int *b)
+{
+  int i;
+  for (i = 0; i < 1000; ++i)
+    a[i] = b[i] + b[i+7] + b[i+14];
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index e7b2f33..5f70d58 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -128,41 +128,6 @@  vect_get_smallest_scalar_type (gimple stmt,
HOST_WIDE_INT *lhs_size_unit,
 }


-/* Check if data references pointed by DR_I and DR_J are same or
-   belong to same interleaving group.  Return FALSE if drs are
-   different, otherwise return TRUE.  */
-
-static bool
-vect_same_range_drs (data_reference_p dr_i, data_reference_p dr_j)
-{
-  gimple stmt_i = DR_STMT (dr_i);
-  gimple stmt_j = DR_STMT (dr_j);
-
-  if (operand_equal_p (DR_REF (dr_i), DR_REF (dr_j), 0)
-      || (GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_i))
-    && GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_j))
-    && (GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_i))
- == GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_j)))))
-    return true;
-  else
-    return false;
-}
-
-/* If address ranges represented by DDR_I and DDR_J are equal,
-   return TRUE, otherwise return FALSE.  */
-
-static bool
-vect_vfa_range_equal (ddr_p ddr_i, ddr_p ddr_j)
-{
-  if ((vect_same_range_drs (DDR_A (ddr_i), DDR_A (ddr_j))
-       && vect_same_range_drs (DDR_B (ddr_i), DDR_B (ddr_j)))
-      || (vect_same_range_drs (DDR_A (ddr_i), DDR_B (ddr_j))
-  && vect_same_range_drs (DDR_B (ddr_i), DDR_A (ddr_j))))
-    return true;
-  else
-    return false;
-}
-
 /* Insert DDR into LOOP_VINFO list of ddrs that may alias and need to be
    tested at run-time.  Return TRUE if DDR was successfully inserted.
    Return false if versioning is not supported.  */
@@ -2647,69 +2612,331 @@  vect_analyze_data_ref_accesses (loop_vec_info
loop_vinfo, bb_vec_info bb_vinfo)
   return true;
 }

+
+/* Operator == between two dr_addr_with_seg_len objects.
+
+   This equality operator is used to make sure two data refs
+   are the same one so that we will consider to combine the
+   aliasing checks of those two pairs of data dependent data
+   refs.  */
+
+static bool
+operator == (const dr_addr_with_seg_len& d1,
+     const dr_addr_with_seg_len& d2)
+{
+  return operand_equal_p (d1.basic_addr, d2.basic_addr, 0)
+ && compare_tree (d1.offset, d2.offset) == 0
+ && compare_tree (d1.seg_len, d2.seg_len) == 0;
+}
+
+/* Function comp_dr_addr_with_seg_len_pair.
+
+   Comparison function for sorting objects of dr_addr_with_seg_len_pair_t
+   so that we can combine aliasing checks in one scan.  */
+
+static int
+comp_dr_addr_with_seg_len_pair (const void *p1_, const void *p2_)
+{
+  const dr_addr_with_seg_len_pair_t* p1 =
+    (const dr_addr_with_seg_len_pair_t *) p1_;
+  const dr_addr_with_seg_len_pair_t* p2 =
+    (const dr_addr_with_seg_len_pair_t *) p2_;
+
+  const dr_addr_with_seg_len &p11 = p1->first,
+     &p12 = p1->second,
+     &p21 = p2->first,
+     &p22 = p2->second;
+
+  int comp_res = compare_tree (p11.basic_addr, p21.basic_addr);
+  if (comp_res != 0)
+    return comp_res;
+
+  comp_res = compare_tree (p12.basic_addr, p22.basic_addr);
+  if (comp_res != 0)
+    return comp_res;
+
+  if (TREE_CODE (p11.offset) != INTEGER_CST
+      || TREE_CODE (p21.offset) != INTEGER_CST)
+    {
+      comp_res = compare_tree (p11.offset, p21.offset);
+      if (comp_res != 0)
+ return comp_res;
+    }
+  if (tree_int_cst_compare (p11.offset, p21.offset) < 0)
+    return -1;
+  if (tree_int_cst_compare (p11.offset, p21.offset) > 0)
+    return 1;
+  if (TREE_CODE (p12.offset) != INTEGER_CST
+      || TREE_CODE (p22.offset) != INTEGER_CST)
+    {
+      comp_res = compare_tree (p12.offset, p22.offset);
+      if (comp_res != 0)
+ return comp_res;
+    }
+  if (tree_int_cst_compare (p12.offset, p22.offset) < 0)
+    return -1;
+  if (tree_int_cst_compare (p12.offset, p22.offset) > 0)
+    return 1;
+
+  return 0;
+}
+
+template <class T> static void
+swap (T& a, T& b)
+{
+  T c (a);
+  a = b;
+  b = c;
+}
+
+/* Function vect_vfa_segment_size.
+
+   Create an expression that computes the size of segment
+   that will be accessed for a data reference.  The functions takes into
+   account that realignment loads may access one more vector.
+
+   Input:
+     DR: The data reference.
+     LENGTH_FACTOR: segment length to consider.
+
+   Return an expression whose value is the size of segment which will be
+   accessed by DR.  */
+
+static tree
+vect_vfa_segment_size (struct data_reference *dr, tree length_factor)
+{
+  tree segment_length;
+
+  if (integer_zerop (DR_STEP (dr)))
+    segment_length = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr)));
+  else
+    segment_length = size_binop (MULT_EXPR,
+                                 fold_convert (sizetype, DR_STEP (dr)),
+                                 fold_convert (sizetype, length_factor));
+
+  if (vect_supportable_dr_alignment (dr, false)
+        == dr_explicit_realign_optimized)
+    {
+      tree vector_size = TYPE_SIZE_UNIT
+  (STMT_VINFO_VECTYPE (vinfo_for_stmt (DR_STMT (dr))));
+
+      segment_length = size_binop (PLUS_EXPR, segment_length, vector_size);
+    }
+  return segment_length;
+}
+
 /* Function vect_prune_runtime_alias_test_list.

    Prune a list of ddrs to be tested at run-time by versioning for alias.
+   Merge several alias checks into one if possible.
    Return FALSE if resulting list of ddrs is longer then allowed by
    PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS, otherwise return TRUE.  */

 bool
 vect_prune_runtime_alias_test_list (loop_vec_info loop_vinfo)
 {
-  vec<ddr_p>  ddrs =
+  vec<ddr_p> may_alias_ddrs =
     LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo);
-  unsigned i, j;
+  vec<dr_addr_with_seg_len_pair_t>& comp_alias_ddrs =
+    LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo);
+  int vect_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
+  tree scalar_loop_iters = LOOP_VINFO_NITERS (loop_vinfo);
+
+  ddr_p ddr;
+  unsigned int i;
+  tree length_factor;

   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location,
                      "=== vect_prune_runtime_alias_test_list ===\n");

-  for (i = 0; i < ddrs.length (); )
+  if (may_alias_ddrs.is_empty ())
+    return true;
+
+  /* Basically, for each pair of dependent data refs store_ptr_0
+     and load_ptr_0, we create an expression:
+
+     ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
+     || (load_ptr_0 + load_segment_length_0) <= store_ptr_0))
+
+     for aliasing checks.  However, in some cases we can decrease
+     the number of checks by combining two checks into one.  For
+     example, suppose we have another pair of data refs store_ptr_0
+     and load_ptr_1, and if the following condition is satisfied:
+
+     load_ptr_0 < load_ptr_1  &&
+     load_ptr_1 - load_ptr_0 - load_segment_length_0 < store_segment_length_0
+
+     (this condition means, in each iteration of vectorized loop,
+     the accessed memory of store_ptr_0 cannot be between the memory
+     of load_ptr_0 and load_ptr_1.)
+
+     we then can use only the following expression to finish the
+     alising checks between store_ptr_0 & load_ptr_0 and
+     store_ptr_0 & load_ptr_1:
+
+     ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
+     || (load_ptr_1 + load_segment_length_1 <= store_ptr_0))
+
+     Note that we only consider that load_ptr_0 and load_ptr_1 have the
+     same basic address.  */
+
+  comp_alias_ddrs.create (may_alias_ddrs.length ());
+
+  /* First, we collect all data ref pairs for aliasing checks.  */
+  FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
     {
-      bool found;
-      ddr_p ddr_i;
+      struct data_reference *dr_a, *dr_b;
+      gimple dr_group_first_a, dr_group_first_b;
+      tree segment_length_a, segment_length_b;
+      gimple stmt_a, stmt_b;
+
+      dr_a = DDR_A (ddr);
+      stmt_a = DR_STMT (DDR_A (ddr));
+      dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
+      if (dr_group_first_a)
+ {
+  stmt_a = dr_group_first_a;
+  dr_a = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_a));
+ }

-      ddr_i = ddrs[i];
-      found = false;
+      dr_b = DDR_B (ddr);
+      stmt_b = DR_STMT (DDR_B (ddr));
+      dr_group_first_b = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_b));
+      if (dr_group_first_b)
+ {
+  stmt_b = dr_group_first_b;
+  dr_b = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_b));
+ }

-      for (j = 0; j < i; j++)
-        {
-  ddr_p ddr_j = ddrs[j];
+      if (!operand_equal_p (DR_STEP (dr_a), DR_STEP (dr_b), 0))
+ length_factor = scalar_loop_iters;
+      else
+ length_factor = size_int (vect_factor);
+      segment_length_a = vect_vfa_segment_size (dr_a, length_factor);
+      segment_length_b = vect_vfa_segment_size (dr_b, length_factor);
+
+      dr_addr_with_seg_len_pair_t dr_with_seg_len_pair
+  (dr_addr_with_seg_len
+       (dr_a, DR_BASE_ADDRESS (dr_a),
+ size_binop (PLUS_EXPR, DR_OFFSET (dr_a), DR_INIT (dr_a)),
+ segment_length_a),
+   dr_addr_with_seg_len
+       (dr_b, DR_BASE_ADDRESS (dr_b),
+ size_binop (PLUS_EXPR, DR_OFFSET (dr_b), DR_INIT (dr_b)),
+ segment_length_b));
+
+      if (compare_tree (dr_with_seg_len_pair.first.basic_addr,
+ dr_with_seg_len_pair.second.basic_addr) > 0)
+ swap (dr_with_seg_len_pair.first, dr_with_seg_len_pair.second);
+
+      comp_alias_ddrs.safe_push (dr_with_seg_len_pair);
+    }

-  if (vect_vfa_range_equal (ddr_i, ddr_j))
+  /* Second, we sort the collected data ref pairs so that we can scan
+     them once to combine all possible aliasing checks.  */
+  comp_alias_ddrs.qsort (comp_dr_addr_with_seg_len_pair);
+
+  /* Third, we scan the sorted dr pairs and check if we can combine
+     alias checks of two neighbouring dr pairs.  */
+  for (size_t i = 1; i < comp_alias_ddrs.length (); ++i)
+    {
+      /* Deal with two ddrs (dr_a1, dr_b1) and (dr_a2, dr_b2).  */
+      dr_addr_with_seg_len *dr_a1 = &comp_alias_ddrs[i-1].first,
+   *dr_b1 = &comp_alias_ddrs[i-1].second,
+   *dr_a2 = &comp_alias_ddrs[i].first,
+   *dr_b2 = &comp_alias_ddrs[i].second;
+
+      /* Remove duplicate data ref pairs.  */
+      if (*dr_a1 == *dr_a2 && *dr_b1 == *dr_b2)
+ {
+  if (dump_enabled_p ())
+    {
+      dump_printf_loc (MSG_NOTE, vect_location,
+       "found equal ranges ");
+      dump_generic_expr (MSG_NOTE, TDF_SLIM,
+ DR_REF (dr_a1->dr));
+      dump_printf (MSG_NOTE,  ", ");
+      dump_generic_expr (MSG_NOTE, TDF_SLIM,
+ DR_REF (dr_b1->dr));
+      dump_printf (MSG_NOTE,  " and ");
+      dump_generic_expr (MSG_NOTE, TDF_SLIM,
+ DR_REF (dr_a2->dr));
+      dump_printf (MSG_NOTE,  ", ");
+      dump_generic_expr (MSG_NOTE, TDF_SLIM,
+ DR_REF (dr_b2->dr));
+      dump_printf (MSG_NOTE, "\n");
+    }
+
+  comp_alias_ddrs.ordered_remove (i--);
+  continue;
+ }
+
+      if (*dr_a1 == *dr_a2 || *dr_b1 == *dr_b2)
+ {
+  /* We consider the case that DR_B1 and DR_B2 are same memrefs,
+     and DR_A1 and DR_A2 are two consecutive memrefs.  */
+  if (*dr_a1 == *dr_a2)
+    {
+      swap (dr_a1, dr_b1);
+      swap (dr_a2, dr_b2);
+    }
+
+  if (!operand_equal_p (dr_a1->basic_addr, dr_a2->basic_addr, 0)
+      || !host_integerp (dr_a1->offset, 0)
+      || !host_integerp (dr_a2->offset, 0))
+    continue;
+
+  HOST_WIDEST_INT diff = widest_int_cst_value (dr_a2->offset) -
+ widest_int_cst_value (dr_a1->offset);
+
+
+  /* Now we check if the following condition is satisfied:
+
+     DIFF - SEGMENT_LENGTH_A < SEGMENT_LENGTH_B
+
+     where DIFF = DR_A2->OFFSET - DR_A1->OFFSET.  However,
+     SEGMENT_LENGTH_A or SEGMENT_LENGTH_B may not be constant so we
+     have to make a best estimation.  We can get the minimum value
+     of SEGMENT_LENGTH_B as a constant, represented by MIN_SEG_LEN_B,
+     then either of the following two conditions can guarantee the
+     one above:
+
+     1: DIFF <= MIN_SEG_LEN_B
+     2: DIFF - SEGMENT_LENGTH_A < MIN_SEG_LEN_B
+
+     */
+
+  HOST_WIDEST_INT
+  min_seg_len_b = (TREE_CODE (dr_b1->seg_len) == INTEGER_CST) ?
+      widest_int_cst_value (dr_b1->seg_len) :
+      vect_factor;
+
+  if (diff <= min_seg_len_b
+      || (TREE_CODE (dr_a1->seg_len) == INTEGER_CST
+  && diff - widest_int_cst_value (dr_a1->seg_len) <
+     min_seg_len_b))
     {
       if (dump_enabled_p ())
  {
-  dump_printf_loc (MSG_NOTE, vect_location,
-                   "found equal ranges ");
-  dump_generic_expr (MSG_NOTE, TDF_SLIM,
-                     DR_REF (DDR_A (ddr_i)));
-  dump_printf (MSG_NOTE,  ", ");
-  dump_generic_expr (MSG_NOTE, TDF_SLIM,
-                     DR_REF (DDR_B (ddr_i)));
-  dump_printf (MSG_NOTE,  " and ");
-  dump_generic_expr (MSG_NOTE, TDF_SLIM,
-                     DR_REF (DDR_A (ddr_j)));
-  dump_printf (MSG_NOTE,  ", ");
-  dump_generic_expr (MSG_NOTE, TDF_SLIM,
-                     DR_REF (DDR_B (ddr_j)));
+  dump_printf_loc
+      (MSG_NOTE, vect_location,
+       "merging two runtime checks for data references ");
+  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a1->dr));
+  dump_printf (MSG_NOTE, " and ");
+  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a2->dr));
   dump_printf (MSG_NOTE, "\n");
  }
-      found = true;
-      break;
+
+      dr_a1->seg_len = size_binop (PLUS_EXPR,
+   dr_a2->seg_len, size_int (diff));
+      comp_alias_ddrs.ordered_remove (i--);
     }
  }
-
-      if (found)
-      {
- ddrs.ordered_remove (i);
- continue;
-      }
-      i++;
     }

-  if (ddrs.length () >
-       (unsigned) PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS))
+  if ((int) comp_alias_ddrs.length () >
+      PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS))
     {
       if (dump_enabled_p ())
  {
@@ -2718,8 +2945,6 @@  vect_prune_runtime_alias_test_list
(loop_vec_info loop_vinfo)
                    "generated checks exceeded.\n");
  }

-      LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo).truncate (0);
-
       return false;
     }

diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 574446a..1e03853 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -2211,44 +2211,6 @@  vect_create_cond_for_align_checks
(loop_vec_info loop_vinfo,
     *cond_expr = part_cond_expr;
 }

-
-/* Function vect_vfa_segment_size.
-
-   Create an expression that computes the size of segment
-   that will be accessed for a data reference.  The functions takes into
-   account that realignment loads may access one more vector.
-
-   Input:
-     DR: The data reference.
-     LENGTH_FACTOR: segment length to consider.
-
-   Return an expression whose value is the size of segment which will be
-   accessed by DR.  */
-
-static tree
-vect_vfa_segment_size (struct data_reference *dr, tree length_factor)
-{
-  tree segment_length;
-
-  if (integer_zerop (DR_STEP (dr)))
-    segment_length = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr)));
-  else
-    segment_length = size_binop (MULT_EXPR,
-                                 fold_convert (sizetype, DR_STEP (dr)),
-                                 fold_convert (sizetype, length_factor));
-
-  if (vect_supportable_dr_alignment (dr, false)
-        == dr_explicit_realign_optimized)
-    {
-      tree vector_size = TYPE_SIZE_UNIT
-  (STMT_VINFO_VECTYPE (vinfo_for_stmt (DR_STMT (dr))));
-
-      segment_length = size_binop (PLUS_EXPR, segment_length, vector_size);
-    }
-  return segment_length;
-}
-
-
 /* Function vect_create_cond_for_alias_checks.

    Create a conditional expression that represents the run-time checks for
@@ -2257,28 +2219,24 @@  vect_vfa_segment_size (struct data_reference
*dr, tree length_factor)

    Input:
    COND_EXPR  - input conditional expression.  New conditions will be chained
-                with logical AND operation.
+                with logical AND operation.  If it is NULL, then the function
+                is used to return the number of alias checks.
    LOOP_VINFO - field LOOP_VINFO_MAY_ALIAS_STMTS contains the list of ddrs
         to be checked.

    Output:
    COND_EXPR - conditional expression.

-   The returned value is the conditional expression to be used in the if
+   The returned COND_EXPR is the conditional expression to be used in the if
    statement that controls which version of the loop gets executed at runtime.
 */

-static void
+void
 vect_create_cond_for_alias_checks (loop_vec_info loop_vinfo, tree * cond_expr)
 {
-  vec<ddr_p>  may_alias_ddrs =
-    LOOP_VINFO_MAY_ALIAS_DDRS (loop_vinfo);
-  int vect_factor = LOOP_VINFO_VECT_FACTOR (loop_vinfo);
-  tree scalar_loop_iters = LOOP_VINFO_NITERS (loop_vinfo);
-
-  ddr_p ddr;
-  unsigned int i;
-  tree part_cond_expr, length_factor;
+  vec<dr_addr_with_seg_len_pair_t> comp_alias_ddrs =
+    LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo);
+  tree part_cond_expr;

   /* Create expression
      ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
@@ -2289,70 +2247,39 @@  vect_create_cond_for_alias_checks
(loop_vec_info loop_vinfo, tree * cond_expr)
      ((store_ptr_n + store_segment_length_n) <= load_ptr_n)
      || (load_ptr_n + load_segment_length_n) <= store_ptr_n))  */

-  if (may_alias_ddrs.is_empty ())
+  if (comp_alias_ddrs.is_empty ())
     return;

-  FOR_EACH_VEC_ELT (may_alias_ddrs, i, ddr)
+  for (size_t i = 0, s = comp_alias_ddrs.length (); i < s; ++i)
     {
-      struct data_reference *dr_a, *dr_b;
-      gimple dr_group_first_a, dr_group_first_b;
-      tree addr_base_a, addr_base_b;
-      tree segment_length_a, segment_length_b;
-      gimple stmt_a, stmt_b;
-      tree seg_a_min, seg_a_max, seg_b_min, seg_b_max;
-
-      dr_a = DDR_A (ddr);
-      stmt_a = DR_STMT (DDR_A (ddr));
-      dr_group_first_a = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_a));
-      if (dr_group_first_a)
-        {
-  stmt_a = dr_group_first_a;
-  dr_a = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_a));
- }
-
-      dr_b = DDR_B (ddr);
-      stmt_b = DR_STMT (DDR_B (ddr));
-      dr_group_first_b = GROUP_FIRST_ELEMENT (vinfo_for_stmt (stmt_b));
-      if (dr_group_first_b)
-        {
-  stmt_b = dr_group_first_b;
-  dr_b = STMT_VINFO_DATA_REF (vinfo_for_stmt (stmt_b));
- }
+      const dr_addr_with_seg_len& dr_a = comp_alias_ddrs[i].first;
+      const dr_addr_with_seg_len& dr_b = comp_alias_ddrs[i].second;
+      tree segment_length_a = dr_a.seg_len;
+      tree segment_length_b = dr_b.seg_len;

-      addr_base_a
- = fold_build_pointer_plus (DR_BASE_ADDRESS (dr_a),
-   size_binop (PLUS_EXPR, DR_OFFSET (dr_a),
-       DR_INIT (dr_a)));
-      addr_base_b
- = fold_build_pointer_plus (DR_BASE_ADDRESS (dr_b),
-   size_binop (PLUS_EXPR, DR_OFFSET (dr_b),
-       DR_INIT (dr_b)));
-
-      if (!operand_equal_p (DR_STEP (dr_a), DR_STEP (dr_b), 0))
- length_factor = scalar_loop_iters;
-      else
- length_factor = size_int (vect_factor);
-      segment_length_a = vect_vfa_segment_size (dr_a, length_factor);
-      segment_length_b = vect_vfa_segment_size (dr_b, length_factor);
+      tree addr_base_a
+ = fold_build_pointer_plus (dr_a.basic_addr, dr_a.offset);
+      tree addr_base_b
+ = fold_build_pointer_plus (dr_b.basic_addr, dr_b.offset);

       if (dump_enabled_p ())
  {
   dump_printf_loc (MSG_NOTE, vect_location,
-                           "create runtime check for data references ");
-  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a));
+   "create runtime check for data references ");
+  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_a.dr));
   dump_printf (MSG_NOTE, " and ");
-  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b));
-          dump_printf (MSG_NOTE, "\n");
+  dump_generic_expr (MSG_NOTE, TDF_SLIM, DR_REF (dr_b.dr));
+  dump_printf (MSG_NOTE, "\n");
  }

-      seg_a_min = addr_base_a;
-      seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
-      if (tree_int_cst_compare (DR_STEP (dr_a), size_zero_node) < 0)
+      tree seg_a_min = addr_base_a;
+      tree seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
+      if (tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0)
  seg_a_min = seg_a_max, seg_a_max = addr_base_a;

-      seg_b_min = addr_base_b;
-      seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
-      if (tree_int_cst_compare (DR_STEP (dr_b), size_zero_node) < 0)
+      tree seg_b_min = addr_base_b;
+      tree seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
+      if (tree_int_cst_compare (DR_STEP (dr_b.dr), size_zero_node) < 0)
  seg_b_min = seg_b_max, seg_b_max = addr_base_b;

       part_cond_expr =
@@ -2370,7 +2297,9 @@  vect_create_cond_for_alias_checks (loop_vec_info
loop_vinfo, tree * cond_expr)
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location,
      "created %u versioning for alias checks.\n",
-     may_alias_ddrs.length ());
+     comp_alias_ddrs.length ());
+
+  comp_alias_ddrs.release ();
 }


diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 8b7b345..f231b62 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -175,6 +175,36 @@  typedef struct _slp_oprnd_info



+/* This struct is used to store the information of a data reference,
+   including the data ref itself, its basic address, the access offset
+   and the segment length for aliasing checks.  This is used to generate
+   alias checks.  */
+
+struct dr_addr_with_seg_len
+{
+  dr_addr_with_seg_len (data_reference* d, tree addr, tree off, tree len)
+    : dr (d), basic_addr (addr), offset (off), seg_len (len) {}
+
+  data_reference *dr;
+  tree basic_addr;
+  tree offset;
+  tree seg_len;
+};
+
+/* This struct contains two dr_addr_with_seg_len objects with aliasing data
+   refs.  Two comparisons are generated from them.  */
+
+struct dr_addr_with_seg_len_pair_t
+{
+  dr_addr_with_seg_len_pair_t (const dr_addr_with_seg_len& d1,
+       const dr_addr_with_seg_len& d2)
+    : first (d1), second (d2) {}
+
+  dr_addr_with_seg_len first;
+  dr_addr_with_seg_len second;
+};
+
+
 typedef struct _vect_peel_info
 {
   int npeel;
@@ -274,6 +304,10 @@  typedef struct _loop_vec_info {
      for a run-time aliasing check.  */
   vec<ddr_p> may_alias_ddrs;

+  /* Data Dependence Relations defining address ranges together with segment
+     lengths from which the run-time aliasing check is built.  */
+  vec<dr_addr_with_seg_len_pair_t> comp_alias_ddrs;
+
   /* Statements in the loop that have data references that are candidates for a
      runtime (loop versioning) misalignment check.  */
   vec<gimple> may_misalign_stmts;
@@ -336,6 +370,7 @@  typedef struct _loop_vec_info {
 #define LOOP_VINFO_MAY_MISALIGN_STMTS(L)   (L)->may_misalign_stmts
 #define LOOP_VINFO_LOC(L)                  (L)->loop_line_number
 #define LOOP_VINFO_MAY_ALIAS_DDRS(L)       (L)->may_alias_ddrs
+#define LOOP_VINFO_COMP_ALIAS_DDRS(L)      (L)->comp_alias_ddrs
 #define LOOP_VINFO_GROUPED_STORES(L)       (L)->grouped_stores
 #define LOOP_VINFO_SLP_INSTANCES(L)        (L)->slp_instances
 #define LOOP_VINFO_SLP_UNROLLING_FACTOR(L) (L)->slp_unrolling_factor