[PR85190] Adjust pointer for aligned access

Message ID DB6PR0802MB2504056A67C88492F350E9A1E7BE0@DB6PR0802MB2504.eurprd08.prod.outlook.com
State New

Commit Message

Bin Cheng April 10, 2018, 9:55 a.m.
Hi,
Pointer q in gcc.dg/vect/pr81196.c is not aligned after vectorization, resulting in test failures on some targets.
This simple patch adjusts it so that it's aligned.

Is it OK?

Hi Rainer, could you please help me double check that this solves the issue?

Thanks,
bin

gcc/testsuite
2018-04-10  Bin Cheng  <bin.cheng@arm.com>

	PR testsuite/85190
	* gcc.dg/vect/pr81196.c: Adjust pointer for aligned access.

Comments

Richard Biener April 10, 2018, 12:49 p.m. | #1
On Tue, Apr 10, 2018 at 11:55 AM, Bin Cheng <Bin.Cheng@arm.com> wrote:
> Hi,
> Pointer q in gcc.dg/vect/pr81196.c is not aligned after vectorization, resulting test failure for some targets.
> This simple patch adjust it so that it's aligned.
>
> Is it OK?

Yes, looks quite obvious.

Richard.

Jakub Jelinek April 10, 2018, 1:26 p.m. | #2
On Tue, Apr 10, 2018 at 09:55:35AM +0000, Bin Cheng wrote:
> Hi Rainer, could you please help me double check that this solves the issue?
> 
> Thanks,
> bin
> 
> gcc/testsuite
> 2018-04-10  Bin Cheng  <bin.cheng@arm.com>
> 
> 	PR testsuite/85190
> 	* gcc.dg/vect/pr81196.c: Adjust pointer for aligned access.

> diff --git a/gcc/testsuite/gcc.dg/vect/pr81196.c b/gcc/testsuite/gcc.dg/vect/pr81196.c
> index 46d7a9e..15320ae 100644
> --- a/gcc/testsuite/gcc.dg/vect/pr81196.c
> +++ b/gcc/testsuite/gcc.dg/vect/pr81196.c
> @@ -4,14 +4,14 @@
>  
>  void f(short*p){
>    p=(short*)__builtin_assume_aligned(p,64);
> -  short*q=p+256;
> +  short*q=p+255;
>    for(;p!=q;++p,--q){
>      short t=*p;*p=*q;*q=t;

This is UB then though, because p will never be equal to q.
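A minimal sketch of why the distance matters, modelling the two loops on array indices rather than pointers so the model itself stays well defined. The function names are hypothetical and exist only for this illustration; nothing here is taken from the testcase or from GCC. With an even distance such as 256 the two indices meet, while with an odd distance such as 255 they step past each other, so the `!=` exit test never fires in bounds.

```c
#include <stdbool.h>
#include <stddef.h>

/* Models `for (; p != q; ++p, --q)` with indices lo and hi.
   Stops as soon as the indices meet or cross, and returns true
   only if they met exactly, i.e. the `!=` test could become false.  */
bool ne_loop_meets (size_t distance)
{
  size_t lo = 0, hi = distance;
  while (lo < hi)
    {
      ++lo;
      --hi;
    }
  return lo == hi;
}

/* Models `for (; p < q; ++p, --q)` and counts its iterations.
   This form terminates for even and odd distances alike.  */
size_t lt_loop_iterations (size_t distance)
{
  size_t lo = 0, hi = distance, n = 0;
  for (; lo < hi; ++lo, --hi)
    ++n;
  return n;
}
```

Under this model, ne_loop_meets (256) is true and ne_loop_meets (255) is false, while lt_loop_iterations returns 128 for both distances.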

>    }
>  }
>  void b(short*p){
>    p=(short*)__builtin_assume_aligned(p,64);
> -  short*q=p+256;
> +  short*q=p+255;
>    for(;p<q;++p,--q){
>      short t=*p;*p=*q;*q=t;

This one is fine, sure.

	Jakub
Bin.Cheng April 10, 2018, 2:58 p.m. | #3
On Tue, Apr 10, 2018 at 2:26 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Apr 10, 2018 at 09:55:35AM +0000, Bin Cheng wrote:
>> Hi Rainer, could you please help me double check that this solves the issue?
>>
>> Thanks,
>> bin
>>
>> gcc/testsuite
>> 2018-04-10  Bin Cheng  <bin.cheng@arm.com>
>>
>>       PR testsuite/85190
>>       * gcc.dg/vect/pr81196.c: Adjust pointer for aligned access.
>
>> diff --git a/gcc/testsuite/gcc.dg/vect/pr81196.c b/gcc/testsuite/gcc.dg/vect/pr81196.c
>> index 46d7a9e..15320ae 100644
>> --- a/gcc/testsuite/gcc.dg/vect/pr81196.c
>> +++ b/gcc/testsuite/gcc.dg/vect/pr81196.c
>> @@ -4,14 +4,14 @@
>>
>>  void f(short*p){
>>    p=(short*)__builtin_assume_aligned(p,64);
>> -  short*q=p+256;
>> +  short*q=p+255;
>>    for(;p!=q;++p,--q){
>>      short t=*p;*p=*q;*q=t;
>
> This is UB then though, because p will never be equal to q.
Sorry I already checked in, will try to correct it in another patch.

Thanks,
bin
Bin.Cheng April 10, 2018, 4:28 p.m. | #4
On Tue, Apr 10, 2018 at 3:58 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Tue, Apr 10, 2018 at 2:26 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Tue, Apr 10, 2018 at 09:55:35AM +0000, Bin Cheng wrote:
>>> Hi Rainer, could you please help me double check that this solves the issue?
>>>
>>> Thanks,
>>> bin
>>>
>>> gcc/testsuite
>>> 2018-04-10  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>       PR testsuite/85190
>>>       * gcc.dg/vect/pr81196.c: Adjust pointer for aligned access.
>>
>>> diff --git a/gcc/testsuite/gcc.dg/vect/pr81196.c b/gcc/testsuite/gcc.dg/vect/pr81196.c
>>> index 46d7a9e..15320ae 100644
>>> --- a/gcc/testsuite/gcc.dg/vect/pr81196.c
>>> +++ b/gcc/testsuite/gcc.dg/vect/pr81196.c
>>> @@ -4,14 +4,14 @@
>>>
>>>  void f(short*p){
>>>    p=(short*)__builtin_assume_aligned(p,64);
>>> -  short*q=p+256;
>>> +  short*q=p+255;
>>>    for(;p!=q;++p,--q){
>>>      short t=*p;*p=*q;*q=t;
>>
>> This is UB then though, because p will never be equal to q.

Hmm, though it's UB in this case, is it OK for niter analysis to give the
results below?

Analyzing # of iterations of loop 1
  exit condition [126, + , 18446744073709551615] != 0
  bounds on difference of bases: -126 ... -126
  result:
    # of iterations 126, bounded by 126

I don't really follow the last piece of code in number_of_iterations_ne:

  /* Let nsd (step, size of mode) = d.  If d does not divide c, the loop
     is infinite.  Otherwise, the number of iterations is
     (inverse(s/d) * (c/d)) mod (size of mode/d).  */
  bits = num_ending_zeros (s);
  bound = build_low_bits_mask (niter_type,
                               (TYPE_PRECISION (niter_type)
                                - tree_to_uhwi (bits)));

  d = fold_binary_to_constant (LSHIFT_EXPR, niter_type,
                               build_int_cst (niter_type, 1), bits);
  s = fold_binary_to_constant (RSHIFT_EXPR, niter_type, s, bits);

  if (!exit_must_be_taken)
    {
      /* If we cannot assume that the exit is taken eventually, record the
         assumptions for divisibility of c.  */
      assumption = fold_build2 (FLOOR_MOD_EXPR, niter_type, c, d);
      assumption = fold_build2 (EQ_EXPR, boolean_type_node,
                                assumption, build_int_cst (niter_type, 0));
      if (!integer_nonzerop (assumption))
        niter->assumptions = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
                                          niter->assumptions, assumption);
    }

  c = fold_build2 (EXACT_DIV_EXPR, niter_type, c, d);
  tmp = fold_build2 (MULT_EXPR, niter_type, c, inverse (s, bound));
  niter->niter = fold_build2 (BIT_AND_EXPR, niter_type, tmp, bound);
  return true;

Though an infinite number of iterations is mentioned, I don't see where it's handled?

Thanks,
bin
Bin.Cheng April 10, 2018, 5:27 p.m. | #5
On Tue, Apr 10, 2018 at 5:28 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Tue, Apr 10, 2018 at 3:58 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Tue, Apr 10, 2018 at 2:26 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>>> On Tue, Apr 10, 2018 at 09:55:35AM +0000, Bin Cheng wrote:
>>>> Hi Rainer, could you please help me double check that this solves the issue?
>>>>
>>>> Thanks,
>>>> bin
>>>>
>>>> gcc/testsuite
>>>> 2018-04-10  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>       PR testsuite/85190
>>>>       * gcc.dg/vect/pr81196.c: Adjust pointer for aligned access.
>>>
>>>> diff --git a/gcc/testsuite/gcc.dg/vect/pr81196.c b/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>> index 46d7a9e..15320ae 100644
>>>> --- a/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>> +++ b/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>> @@ -4,14 +4,14 @@
>>>>
>>>>  void f(short*p){
>>>>    p=(short*)__builtin_assume_aligned(p,64);
>>>> -  short*q=p+256;
>>>> +  short*q=p+255;
>>>>    for(;p!=q;++p,--q){
>>>>      short t=*p;*p=*q;*q=t;
>>>
>>> This is UB then though, because p will never be equal to q.
>
> Hmm, though it's UB in this case, is it OK for niter analysis gives
> below results?
>
> Analyzing # of iterations of loop 1
>   exit condition [126, + , 18446744073709551615] != 0
>   bounds on difference of bases: -126 ... -126
>   result:
>     # of iterations 126, bounded by 126
>
> I don't really follow last piece of code in number_of_iterations_ne:
>
>   /* Let nsd (step, size of mode) = d.  If d does not divide c, the loop
>      is infinite.  Otherwise, the number of iterations is
>      (inverse(s/d) * (c/d)) mod (size of mode/d).  */
>   bits = num_ending_zeros (s);
>   bound = build_low_bits_mask (niter_type,
>                    (TYPE_PRECISION (niter_type)
>                 - tree_to_uhwi (bits)));
>
>   d = fold_binary_to_constant (LSHIFT_EXPR, niter_type,
>                    build_int_cst (niter_type, 1), bits);
>   s = fold_binary_to_constant (RSHIFT_EXPR, niter_type, s, bits);
>
>   if (!exit_must_be_taken)
>     {
>       /* If we cannot assume that the exit is taken eventually, record the
>      assumptions for divisibility of c.  */
>       assumption = fold_build2 (FLOOR_MOD_EXPR, niter_type, c, d);
>       assumption = fold_build2 (EQ_EXPR, boolean_type_node,
>                 assumption, build_int_cst (niter_type, 0));
>       if (!integer_nonzerop (assumption))
>     niter->assumptions = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
>                       niter->assumptions, assumption);
>     }
>
>   c = fold_build2 (EXACT_DIV_EXPR, niter_type, c, d);
>   tmp = fold_build2 (MULT_EXPR, niter_type, c, inverse (s, bound));
>   niter->niter = fold_build2 (BIT_AND_EXPR, niter_type, tmp, bound);
>   return true;
>
> Though infinite niters is mentioned, I don't see it's handled?

So the default behavior we have had for a long time is to completely unroll
the loop below:

void f(short*p){
  p=(short*)__builtin_assume_aligned(p,64);
  short*q=p+5;
  for(;p!=q;++p,--q){
    short t=*p;*p=*q;*q=t;
  }
}

because:

Analyzing # of iterations of loop 1
  exit condition [p_6, + , 4](no_overflow) != p_6 + 10
  bounds on difference of bases: 10 ... 10
  result:
    # of iterations 2, bounded by 2
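The "2" in this dump is consistent with the closed form in the comment quoted from number_of_iterations_ne. Below is a rough standalone model of that arithmetic on 64-bit unsigned values. It is a sketch, not GCC code: the struct and function names are hypothetical, and it assumes the exact division behaves like ordinary truncating unsigned division when d does not divide c. For c = 10 (byte difference) and s = 4 (bytes the gap shrinks per iteration), d = 4 does not divide c, yet the formula still yields 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not a GCC structure.  */
struct ne_niter
{
  uint64_t niter;   /* value produced by the closed-form formula */
  bool divisible;   /* true if d divides c, so the exit is reachable */
};

/* Multiplicative inverse of an odd x modulo 2^64, by Newton iteration.
   x itself is correct to 3 bits because x * x == 1 (mod 8); each step
   doubles the number of correct bits, so 5 steps cover 64 bits.  */
uint64_t inverse_mod_2_64 (uint64_t x)
{
  uint64_t inv = x;
  for (int i = 0; i < 5; ++i)
    inv *= 2 - x * inv;
  return inv;
}

/* Models (inverse(s/d) * (c/d)) mod (2^64 / d), where d is the largest
   power of two dividing the step s.  */
struct ne_niter model_ne_niter (uint64_t c, uint64_t s)
{
  assert (s != 0);

  unsigned bits = 0;                      /* trailing zero bits of s */
  while (((s >> bits) & 1) == 0)
    ++bits;

  uint64_t d = (uint64_t) 1 << bits;
  uint64_t bound = bits == 0
                   ? UINT64_MAX
                   : (((uint64_t) 1 << (64 - bits)) - 1);

  struct ne_niter r;
  r.divisible = (c % d) == 0;             /* the recorded assumption */
  r.niter = ((c / d) * inverse_mod_2_64 (s >> bits)) & bound;
  return r;
}
```

In this model, model_ne_niter (10, 4) gives niter == 2 with divisible == false, whereas model_ne_niter (12, 4) gives niter == 3 with divisible == true. The divisibility check only becomes a recorded assumption in the quoted code when exit_must_be_taken is false.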

Thanks,
bin
Richard Biener April 11, 2018, 9:46 a.m. | #6
On Tue, Apr 10, 2018 at 6:28 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Tue, Apr 10, 2018 at 3:58 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Tue, Apr 10, 2018 at 2:26 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>>> On Tue, Apr 10, 2018 at 09:55:35AM +0000, Bin Cheng wrote:
>>>> Hi Rainer, could you please help me double check that this solves the issue?
>>>>
>>>> Thanks,
>>>> bin
>>>>
>>>> gcc/testsuite
>>>> 2018-04-10  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>       PR testsuite/85190
>>>>       * gcc.dg/vect/pr81196.c: Adjust pointer for aligned access.
>>>
>>>> diff --git a/gcc/testsuite/gcc.dg/vect/pr81196.c b/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>> index 46d7a9e..15320ae 100644
>>>> --- a/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>> +++ b/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>> @@ -4,14 +4,14 @@
>>>>
>>>>  void f(short*p){
>>>>    p=(short*)__builtin_assume_aligned(p,64);
>>>> -  short*q=p+256;
>>>> +  short*q=p+255;
>>>>    for(;p!=q;++p,--q){
>>>>      short t=*p;*p=*q;*q=t;
>>>
>>> This is UB then though, because p will never be equal to q.
>
> Hmm, though it's UB in this case, is it OK for niter analysis gives
> below results?
>
> Analyzing # of iterations of loop 1
>   exit condition [126, + , 18446744073709551615] != 0
>   bounds on difference of bases: -126 ... -126
>   result:
>     # of iterations 126, bounded by 126
>
> I don't really follow last piece of code in number_of_iterations_ne:
>
>   /* Let nsd (step, size of mode) = d.  If d does not divide c, the loop
>      is infinite.  Otherwise, the number of iterations is
>      (inverse(s/d) * (c/d)) mod (size of mode/d).  */
>   bits = num_ending_zeros (s);
>   bound = build_low_bits_mask (niter_type,
>                    (TYPE_PRECISION (niter_type)
>                 - tree_to_uhwi (bits)));
>
>   d = fold_binary_to_constant (LSHIFT_EXPR, niter_type,
>                    build_int_cst (niter_type, 1), bits);
>   s = fold_binary_to_constant (RSHIFT_EXPR, niter_type, s, bits);
>
>   if (!exit_must_be_taken)
>     {
>       /* If we cannot assume that the exit is taken eventually, record the
>      assumptions for divisibility of c.  */
>       assumption = fold_build2 (FLOOR_MOD_EXPR, niter_type, c, d);
>       assumption = fold_build2 (EQ_EXPR, boolean_type_node,
>                 assumption, build_int_cst (niter_type, 0));
>       if (!integer_nonzerop (assumption))
>     niter->assumptions = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
>                       niter->assumptions, assumption);
>     }
>
>   c = fold_build2 (EXACT_DIV_EXPR, niter_type, c, d);
>   tmp = fold_build2 (MULT_EXPR, niter_type, c, inverse (s, bound));
>   niter->niter = fold_build2 (BIT_AND_EXPR, niter_type, tmp, bound);
>   return true;
>
> Though infinite niters is mentioned, I don't see it's handled?

It seems number_of_iterations_ne_max computes this based on the
fact that pointer overflow is undefined.  This means that 126 is
as good as any other number given that the testcase is undefined...

Richard.

Bin.Cheng April 11, 2018, 1:15 p.m. | #7
On Wed, Apr 11, 2018 at 10:46 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Apr 10, 2018 at 6:28 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Tue, Apr 10, 2018 at 3:58 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> On Tue, Apr 10, 2018 at 2:26 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>>>> On Tue, Apr 10, 2018 at 09:55:35AM +0000, Bin Cheng wrote:
>>>>> Hi Rainer, could you please help me double check that this solves the issue?
>>>>>
>>>>> Thanks,
>>>>> bin
>>>>>
>>>>> gcc/testsuite
>>>>> 2018-04-10  Bin Cheng  <bin.cheng@arm.com>
>>>>>
>>>>>       PR testsuite/85190
>>>>>       * gcc.dg/vect/pr81196.c: Adjust pointer for aligned access.
>>>>
>>>>> diff --git a/gcc/testsuite/gcc.dg/vect/pr81196.c b/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>>> index 46d7a9e..15320ae 100644
>>>>> --- a/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>>> +++ b/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>>> @@ -4,14 +4,14 @@
>>>>>
>>>>>  void f(short*p){
>>>>>    p=(short*)__builtin_assume_aligned(p,64);
>>>>> -  short*q=p+256;
>>>>> +  short*q=p+255;
>>>>>    for(;p!=q;++p,--q){
>>>>>      short t=*p;*p=*q;*q=t;
>>>>
>>>> This is UB then though, because p will never be equal to q.
>>
>> Hmm, though it's UB in this case, is it OK for niter analysis gives
>> below results?
>>
>> Analyzing # of iterations of loop 1
>>   exit condition [126, + , 18446744073709551615] != 0
>>   bounds on difference of bases: -126 ... -126
>>   result:
>>     # of iterations 126, bounded by 126
>>
>> I don't really follow last piece of code in number_of_iterations_ne:
>>
>>   /* Let nsd (step, size of mode) = d.  If d does not divide c, the loop
>>      is infinite.  Otherwise, the number of iterations is
>>      (inverse(s/d) * (c/d)) mod (size of mode/d).  */
>>   bits = num_ending_zeros (s);
>>   bound = build_low_bits_mask (niter_type,
>>                    (TYPE_PRECISION (niter_type)
>>                 - tree_to_uhwi (bits)));
>>
>>   d = fold_binary_to_constant (LSHIFT_EXPR, niter_type,
>>                    build_int_cst (niter_type, 1), bits);
>>   s = fold_binary_to_constant (RSHIFT_EXPR, niter_type, s, bits);
>>
>>   if (!exit_must_be_taken)
>>     {
>>       /* If we cannot assume that the exit is taken eventually, record the
>>      assumptions for divisibility of c.  */
>>       assumption = fold_build2 (FLOOR_MOD_EXPR, niter_type, c, d);
>>       assumption = fold_build2 (EQ_EXPR, boolean_type_node,
>>                 assumption, build_int_cst (niter_type, 0));
>>       if (!integer_nonzerop (assumption))
>>     niter->assumptions = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
>>                       niter->assumptions, assumption);
>>     }
>>
>>   c = fold_build2 (EXACT_DIV_EXPR, niter_type, c, d);
>>   tmp = fold_build2 (MULT_EXPR, niter_type, c, inverse (s, bound));
>>   niter->niter = fold_build2 (BIT_AND_EXPR, niter_type, tmp, bound);
>>   return true;
>>
>> Though infinite niters is mentioned, I don't see it's handled?
>
> number_of_iterations_ne_max computes this it seems based on the
> fact that pointer overflow is undefined.  This means that 126 is
> as good as any other number given the testcase is undefined...

Okay, in this case, I simply removed the function with UB from the test case.
Is it OK?

Thanks,
bin

gcc/testsuite
2018-04-11  Bin Cheng  <bin.cheng@arm.com>

    PR testsuite/85190
    * gcc.dg/vect/pr81196.c: Remove function with undefined behavior.


diff --git a/gcc/testsuite/gcc.dg/vect/pr81196.c b/gcc/testsuite/gcc.dg/vect/pr81196.c
index 15320ae..97d40a0 100644
--- a/gcc/testsuite/gcc.dg/vect/pr81196.c
+++ b/gcc/testsuite/gcc.dg/vect/pr81196.c
@@ -2,13 +2,6 @@
 /* { dg-require-effective-target vect_int } */
 /* { dg-require-effective-target vect_perm_short } */
 
-void f(short*p){
-  p=(short*)__builtin_assume_aligned(p,64);
-  short*q=p+255;
-  for(;p!=q;++p,--q){
-    short t=*p;*p=*q;*q=t;
-  }
-}
 void b(short*p){
   p=(short*)__builtin_assume_aligned(p,64);
   short*q=p+255;
@@ -16,4 +9,4 @@ void b(short*p){
     short t=*p;*p=*q;*q=t;
   }
 }
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
Richard Biener April 11, 2018, 1:34 p.m. | #8
On Wed, Apr 11, 2018 at 3:15 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Wed, Apr 11, 2018 at 10:46 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Tue, Apr 10, 2018 at 6:28 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> On Tue, Apr 10, 2018 at 3:58 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>> On Tue, Apr 10, 2018 at 2:26 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>>>>> On Tue, Apr 10, 2018 at 09:55:35AM +0000, Bin Cheng wrote:
>>>>>> Hi Rainer, could you please help me double check that this solves the issue?
>>>>>>
>>>>>> Thanks,
>>>>>> bin
>>>>>>
>>>>>> gcc/testsuite
>>>>>> 2018-04-10  Bin Cheng  <bin.cheng@arm.com>
>>>>>>
>>>>>>       PR testsuite/85190
>>>>>>       * gcc.dg/vect/pr81196.c: Adjust pointer for aligned access.
>>>>>
>>>>>> diff --git a/gcc/testsuite/gcc.dg/vect/pr81196.c b/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>>>> index 46d7a9e..15320ae 100644
>>>>>> --- a/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>>>> +++ b/gcc/testsuite/gcc.dg/vect/pr81196.c
>>>>>> @@ -4,14 +4,14 @@
>>>>>>
>>>>>>  void f(short*p){
>>>>>>    p=(short*)__builtin_assume_aligned(p,64);
>>>>>> -  short*q=p+256;
>>>>>> +  short*q=p+255;
>>>>>>    for(;p!=q;++p,--q){
>>>>>>      short t=*p;*p=*q;*q=t;
>>>>>
>>>>> This is UB then though, because p will never be equal to q.
>>>
>>> Hmm, though it's UB in this case, is it OK for niter analysis gives
>>> below results?
>>>
>>> Analyzing # of iterations of loop 1
>>>   exit condition [126, + , 18446744073709551615] != 0
>>>   bounds on difference of bases: -126 ... -126
>>>   result:
>>>     # of iterations 126, bounded by 126
>>>
>>> I don't really follow last piece of code in number_of_iterations_ne:
>>>
>>>   /* Let nsd (step, size of mode) = d.  If d does not divide c, the loop
>>>      is infinite.  Otherwise, the number of iterations is
>>>      (inverse(s/d) * (c/d)) mod (size of mode/d).  */
>>>   bits = num_ending_zeros (s);
>>>   bound = build_low_bits_mask (niter_type,
>>>                    (TYPE_PRECISION (niter_type)
>>>                 - tree_to_uhwi (bits)));
>>>
>>>   d = fold_binary_to_constant (LSHIFT_EXPR, niter_type,
>>>                    build_int_cst (niter_type, 1), bits);
>>>   s = fold_binary_to_constant (RSHIFT_EXPR, niter_type, s, bits);
>>>
>>>   if (!exit_must_be_taken)
>>>     {
>>>       /* If we cannot assume that the exit is taken eventually, record the
>>>      assumptions for divisibility of c.  */
>>>       assumption = fold_build2 (FLOOR_MOD_EXPR, niter_type, c, d);
>>>       assumption = fold_build2 (EQ_EXPR, boolean_type_node,
>>>                 assumption, build_int_cst (niter_type, 0));
>>>       if (!integer_nonzerop (assumption))
>>>     niter->assumptions = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
>>>                       niter->assumptions, assumption);
>>>     }
>>>
>>>   c = fold_build2 (EXACT_DIV_EXPR, niter_type, c, d);
>>>   tmp = fold_build2 (MULT_EXPR, niter_type, c, inverse (s, bound));
>>>   niter->niter = fold_build2 (BIT_AND_EXPR, niter_type, tmp, bound);
>>>   return true;
>>>
>>> Though infinite niters is mentioned, I don't see it's handled?
>>
>> number_of_iterations_ne_max computes this it seems based on the
>> fact that pointer overflow is undefined.  This means that 126 is
>> as good as any other number given the testcase is undefined...
>
> Okay, in this case, I simply removed the function with UB in the case.
> Is it OK?

OK.

Richard.


Patch

diff --git a/gcc/testsuite/gcc.dg/vect/pr81196.c b/gcc/testsuite/gcc.dg/vect/pr81196.c
index 46d7a9e..15320ae 100644
--- a/gcc/testsuite/gcc.dg/vect/pr81196.c
+++ b/gcc/testsuite/gcc.dg/vect/pr81196.c
@@ -4,14 +4,14 @@ 
 
 void f(short*p){
   p=(short*)__builtin_assume_aligned(p,64);
-  short*q=p+256;
+  short*q=p+255;
   for(;p!=q;++p,--q){
     short t=*p;*p=*q;*q=t;
   }
 }
 void b(short*p){
   p=(short*)__builtin_assume_aligned(p,64);
-  short*q=p+256;
+  short*q=p+255;
   for(;p<q;++p,--q){
     short t=*p;*p=*q;*q=t;
   }