diff mbox

Proving no-trappness for array ref in tree if-conv using loop niter information.

Message ID DB5PR08MB1144D2055F66F5D924D5B6E9E7650@DB5PR08MB1144.eurprd08.prod.outlook.com
State New
Headers show

Commit Message

Bin Cheng April 28, 2016, 12:56 p.m. UTC
Hi,
Tree if-conversion sometimes cannot convert conditional array reference into unconditional one.  Root cause is GCC conservatively assumes newly introduced array reference could be out of array bound and thus trapping.  This patch improves the situation by proving the converted unconditional array reference is within array bound using loop niter information.  To be specific, it checks every index of array reference to see if it's within bound in ifcvt_memrefs_wont_trap.  This patch also factors out base_object_writable checking if the base object is writable or not.
Bootstrap and test on x86_64 and aarch64, is it OK?

Thanks,
bin

2016-04-28  Bin Cheng  <bin.cheng@arm.com>

	* tree-if-conv.c (tree-ssa-loop.h): Include header file.
	(tree-ssa-loop-niter.h): Ditto.
	(idx_within_array_bound, ref_within_array_bound): New functions.
	(ifcvt_memrefs_wont_trap): Check if array ref is within bound.
	Factor out check on writable base object to ...
	(base_object_writable): ... here.

Comments

Richard Biener April 29, 2016, 11:16 a.m. UTC | #1
On Thu, Apr 28, 2016 at 2:56 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
> Hi,
> Tree if-conversion sometimes cannot convert conditional array reference into unconditional one.  Root cause is GCC conservatively assumes newly introduced array reference could be out of array bound and thus trapping.  This patch improves the situation by proving the converted unconditional array reference is within array bound using loop niter information.  To be specific, it checks every index of array reference to see if it's within bound in ifcvt_memrefs_wont_trap.  This patch also factors out base_object_writable checking if the base object is writable or not.
> Bootstrap and test on x86_64 and aarch64, is it OK?

I think you miss to handle the case optimally where the only
non-ARRAY_REF idx is the dereference of the
base-pointer for, say, p->a[i].  In this case we can use
base_master_dr to see if p is unconditionally dereferenced
in the loop.  You also fail to handle the case where we have
MEM_REF[&x].a[i] that is, you see a decl base.
I suppose for_each_index should be fixed for this particular case (to
return true), same for TARGET_MEM_REF TMR_BASE.

+  /* The case of nonconstant bounds could be handled, but it would be
+     complicated.  */
+  if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low)
+      || !high || TREE_CODE (high) != INTEGER_CST)
+    return false;
+

handling of a non-zero but constant low bound is important - otherwise
all this is a no-op for Fortran.  It
shouldn't be too difficult to handle after all.  In fact I think your
code does handle it correctly already.

+  if (!init || TREE_CODE (init) != INTEGER_CST
+      || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
+    return false;

step == 0 should be easy to handle as well, no?  The index will simply
always be 'init' ...

+  /* In case the relevant bound of the array does not fit in type, or
+     it does, but bound + step (in type) still belongs into the range of the
+     array, the index may wrap and still stay within the range of the array
+     (consider e.g. if the array is indexed by the full range of
+     unsigned char).
+
+     To make things simpler, we require both bounds to fit into type, although
+     there are cases where this would not be strictly necessary.  */
+  if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type))
+    return false;
+
+  low = fold_convert (type, low);

please use wide_int for all of this.

I wonder if we can do sth for wrapping IVs like

int a[2048];

for (int i = 0; i < 4096; ++i)
  ... a[(unsigned char)i];

as well.  Like if the IVs type max and min value are within the array bounds
simply return true?

Thanks,
Richard.

> Thanks,
> bin
>
> 2016-04-28  Bin Cheng  <bin.cheng@arm.com>
>
>         * tree-if-conv.c (tree-ssa-loop.h): Include header file.
>         (tree-ssa-loop-niter.h): Ditto.
>         (idx_within_array_bound, ref_within_array_bound): New functions.
>         (ifcvt_memrefs_wont_trap): Check if array ref is within bound.
>         Factor out check on writable base object to ...
>         (base_object_writable): ... here.
Bin.Cheng April 29, 2016, 3:05 p.m. UTC | #2
On Fri, Apr 29, 2016 at 12:16 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Thu, Apr 28, 2016 at 2:56 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>> Hi,
>> Tree if-conversion sometimes cannot convert conditional array reference into unconditional one.  Root cause is GCC conservatively assumes newly introduced array reference could be out of array bound and thus trapping.  This patch improves the situation by proving the converted unconditional array reference is within array bound using loop niter information.  To be specific, it checks every index of array reference to see if it's within bound in ifcvt_memrefs_wont_trap.  This patch also factors out base_object_writable checking if the base object is writable or not.
>> Bootstrap and test on x86_64 and aarch64, is it OK?
>
> I think you miss to handle the case optimally where the only
> non-ARRAY_REF idx is the dereference of the
> base-pointer for, say, p->a[i].  In this case we can use
> base_master_dr to see if p is unconditionally dereferenced
Yes, will pick up this case.

> in the loop.  You also fail to handle the case where we have
> MEM_REF[&x].a[i] that is, you see a decl base.
I am having difficulty in creating this case for ifcvt, any advices?  Thanks.

> I suppose for_each_index should be fixed for this particular case (to
> return true), same for TARGET_MEM_REF TMR_BASE.
>
> +  /* The case of nonconstant bounds could be handled, but it would be
> +     complicated.  */
> +  if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low)
> +      || !high || TREE_CODE (high) != INTEGER_CST)
> +    return false;
> +
>
> handling of a non-zero but constant low bound is important - otherwise
> all this is a no-op for Fortran.  It
> shouldn't be too difficult to handle after all.  In fact I think your
> code does handle it correctly already.
>
> +  if (!init || TREE_CODE (init) != INTEGER_CST
> +      || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
> +    return false;
>
> step == 0 should be easy to handle as well, no?  The index will simply
> always be 'init' ...
>
> +  /* In case the relevant bound of the array does not fit in type, or
> +     it does, but bound + step (in type) still belongs into the range of the
> +     array, the index may wrap and still stay within the range of the array
> +     (consider e.g. if the array is indexed by the full range of
> +     unsigned char).
> +
> +     To make things simpler, we require both bounds to fit into type, although
> +     there are cases where this would not be strictly necessary.  */
> +  if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type))
> +    return false;
> +
> +  low = fold_convert (type, low);
>
> please use wide_int for all of this.
Now I use wi:fits_to_tree_p instead of int_fits_type_p. But I am not
sure what's the meaning by "handle "low = fold_convert (type, low);"
related code in wide_int".   Do you mean to use tree_int_cst_compare
instead of tree_int_cst_compare in the following code?

>
> I wonder if we can do sth for wrapping IVs like
>
> int a[2048];
>
> for (int i = 0; i < 4096; ++i)
>   ... a[(unsigned char)i];
>
> as well.  Like if the IVs type max and min value are within the array bounds
> simply return true?
I think we can only do this for read.  For write this is not safe.
From vectorizer's point of view, is this worth handling?  Could
vectorizer handles wrapping IV in a smaller range than loop IV?

Thanks,
bin
>
> Thanks,
> Richard.
>
>> Thanks,
>> bin
>>
>> 2016-04-28  Bin Cheng  <bin.cheng@arm.com>
>>
>>         * tree-if-conv.c (tree-ssa-loop.h): Include header file.
>>         (tree-ssa-loop-niter.h): Ditto.
>>         (idx_within_array_bound, ref_within_array_bound): New functions.
>>         (ifcvt_memrefs_wont_trap): Check if array ref is within bound.
>>         Factor out check on writable base object to ...
>>         (base_object_writable): ... here.
Richard Biener May 2, 2016, 9 a.m. UTC | #3
On Fri, Apr 29, 2016 at 5:05 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Fri, Apr 29, 2016 at 12:16 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Thu, Apr 28, 2016 at 2:56 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>> Hi,
>>> Tree if-conversion sometimes cannot convert conditional array reference into unconditional one.  Root cause is GCC conservatively assumes newly introduced array reference could be out of array bound and thus trapping.  This patch improves the situation by proving the converted unconditional array reference is within array bound using loop niter information.  To be specific, it checks every index of array reference to see if it's within bound in ifcvt_memrefs_wont_trap.  This patch also factors out base_object_writable checking if the base object is writable or not.
>>> Bootstrap and test on x86_64 and aarch64, is it OK?
>>
>> I think you miss to handle the case optimally where the only
>> non-ARRAY_REF idx is the dereference of the
>> base-pointer for, say, p->a[i].  In this case we can use
>> base_master_dr to see if p is unconditionally dereferenced
> Yes, will pick up this case.
>
>> in the loop.  You also fail to handle the case where we have
>> MEM_REF[&x].a[i] that is, you see a decl base.
> I am having difficulty in creating this case for ifcvt, any advices?  Thanks.

Sth like

float a[128];
float foo (int n, int i)
{
  return (*((float(*)[n])a))[i];
}

should do the trick (w/o the component-ref).  Any other type-punning
would do it, too.

>> I suppose for_each_index should be fixed for this particular case (to
>> return true), same for TARGET_MEM_REF TMR_BASE.
>>
>> +  /* The case of nonconstant bounds could be handled, but it would be
>> +     complicated.  */
>> +  if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low)
>> +      || !high || TREE_CODE (high) != INTEGER_CST)
>> +    return false;
>> +
>>
>> handling of a non-zero but constant low bound is important - otherwise
>> all this is a no-op for Fortran.  It
>> shouldn't be too difficult to handle after all.  In fact I think your
>> code does handle it correctly already.
>>
>> +  if (!init || TREE_CODE (init) != INTEGER_CST
>> +      || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
>> +    return false;
>>
>> step == 0 should be easy to handle as well, no?  The index will simply
>> always be 'init' ...
>>
>> +  /* In case the relevant bound of the array does not fit in type, or
>> +     it does, but bound + step (in type) still belongs into the range of the
>> +     array, the index may wrap and still stay within the range of the array
>> +     (consider e.g. if the array is indexed by the full range of
>> +     unsigned char).
>> +
>> +     To make things simpler, we require both bounds to fit into type, although
>> +     there are cases where this would not be strictly necessary.  */
>> +  if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type))
>> +    return false;
>> +
>> +  low = fold_convert (type, low);
>>
>> please use wide_int for all of this.
> Now I use wi:fits_to_tree_p instead of int_fits_type_p. But I am not
> sure what's the meaning by "handle "low = fold_convert (type, low);"
> related code in wide_int".   Do you mean to use tree_int_cst_compare
> instead of tree_int_cst_compare in the following code?

I don't think you need any kind of fits-to-type check here.  You'd simply
use to_widest () when operating on / comparing with high/low.

And no, I mean to do it all with widest_ints.

>>
>> I wonder if we can do sth for wrapping IVs like
>>
>> int a[2048];
>>
>> for (int i = 0; i < 4096; ++i)
>>   ... a[(unsigned char)i];
>>
>> as well.  Like if the IVs type max and min value are within the array bounds
>> simply return true?
> I think we can only do this for read.  For write this is not safe.
> From vectorizer's point of view, is this worth handling?  Could
> vectorizer handles wrapping IV in a smaller range than loop IV?

Possibly, but dependence analysis might get confused.

Richard.

> Thanks,
> bin
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> bin
>>>
>>> 2016-04-28  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>         * tree-if-conv.c (tree-ssa-loop.h): Include header file.
>>>         (tree-ssa-loop-niter.h): Ditto.
>>>         (idx_within_array_bound, ref_within_array_bound): New functions.
>>>         (ifcvt_memrefs_wont_trap): Check if array ref is within bound.
>>>         Factor out check on writable base object to ...
>>>         (base_object_writable): ... here.
Bin.Cheng May 3, 2016, 10:01 a.m. UTC | #4
On Mon, May 2, 2016 at 10:00 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Fri, Apr 29, 2016 at 5:05 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Fri, Apr 29, 2016 at 12:16 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Thu, Apr 28, 2016 at 2:56 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>>> Hi,
>>>> Tree if-conversion sometimes cannot convert conditional array reference into unconditional one.  Root cause is GCC conservatively assumes newly introduced array reference could be out of array bound and thus trapping.  This patch improves the situation by proving the converted unconditional array reference is within array bound using loop niter information.  To be specific, it checks every index of array reference to see if it's within bound in ifcvt_memrefs_wont_trap.  This patch also factors out base_object_writable checking if the base object is writable or not.
>>>> Bootstrap and test on x86_64 and aarch64, is it OK?
>>>
>>> I think you miss to handle the case optimally where the only
>>> non-ARRAY_REF idx is the dereference of the
>>> base-pointer for, say, p->a[i].  In this case we can use
>>> base_master_dr to see if p is unconditionally dereferenced
>> Yes, will pick up this case.
>>
>>> in the loop.  You also fail to handle the case where we have
>>> MEM_REF[&x].a[i] that is, you see a decl base.
>> I am having difficulty in creating this case for ifcvt, any advices?  Thanks.
>
> Sth like
>
> float a[128];
> float foo (int n, int i)
> {
>   return (*((float(*)[n])a))[i];
> }
>
> should do the trick (w/o the component-ref).  Any other type-punning
> would do it, too.
>
>>> I suppose for_each_index should be fixed for this particular case (to
>>> return true), same for TARGET_MEM_REF TMR_BASE.
>>>
>>> +  /* The case of nonconstant bounds could be handled, but it would be
>>> +     complicated.  */
>>> +  if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low)
>>> +      || !high || TREE_CODE (high) != INTEGER_CST)
>>> +    return false;
>>> +
>>>
>>> handling of a non-zero but constant low bound is important - otherwise
>>> all this is a no-op for Fortran.  It
>>> shouldn't be too difficult to handle after all.  In fact I think your
>>> code does handle it correctly already.
>>>
>>> +  if (!init || TREE_CODE (init) != INTEGER_CST
>>> +      || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
>>> +    return false;
>>>
>>> step == 0 should be easy to handle as well, no?  The index will simply
>>> always be 'init' ...
>>>
>>> +  /* In case the relevant bound of the array does not fit in type, or
>>> +     it does, but bound + step (in type) still belongs into the range of the
>>> +     array, the index may wrap and still stay within the range of the array
>>> +     (consider e.g. if the array is indexed by the full range of
>>> +     unsigned char).
>>> +
>>> +     To make things simpler, we require both bounds to fit into type, although
>>> +     there are cases where this would not be strictly necessary.  */
>>> +  if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type))
>>> +    return false;
>>> +
>>> +  low = fold_convert (type, low);
>>>
>>> please use wide_int for all of this.
>> Now I use wi:fits_to_tree_p instead of int_fits_type_p. But I am not
>> sure what's the meaning by "handle "low = fold_convert (type, low);"
>> related code in wide_int".   Do you mean to use tree_int_cst_compare
>> instead of tree_int_cst_compare in the following code?
>
> I don't think you need any kind of fits-to-type check here.  You'd simply
> use to_widest () when operating on / comparing with high/low.
But what would happen if low/high and init/step are different in type
sign-ness?  Anything special I need to do before using wi::ltu_p or
wi::lts_p directly?

Thanks,
bin
>
> And no, I mean to do it all with widest_ints.
>
>>>
>>> I wonder if we can do sth for wrapping IVs like
>>>
>>> int a[2048];
>>>
>>> for (int i = 0; i < 4096; ++i)
>>>   ... a[(unsigned char)i];
>>>
>>> as well.  Like if the IVs type max and min value are within the array bounds
>>> simply return true?
>> I think we can only do this for read.  For write this is not safe.
>> From vectorizer's point of view, is this worth handling?  Could
>> vectorizer handles wrapping IV in a smaller range than loop IV?
>
> Possibly, but dependence analysis might get confused.
>
> Richard.
>
>> Thanks,
>> bin
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Thanks,
>>>> bin
>>>>
>>>> 2016-04-28  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>         * tree-if-conv.c (tree-ssa-loop.h): Include header file.
>>>>         (tree-ssa-loop-niter.h): Ditto.
>>>>         (idx_within_array_bound, ref_within_array_bound): New functions.
>>>>         (ifcvt_memrefs_wont_trap): Check if array ref is within bound.
>>>>         Factor out check on writable base object to ...
>>>>         (base_object_writable): ... here.
Richard Biener May 3, 2016, 10:08 a.m. UTC | #5
On Tue, May 3, 2016 at 12:01 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Mon, May 2, 2016 at 10:00 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Fri, Apr 29, 2016 at 5:05 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> On Fri, Apr 29, 2016 at 12:16 PM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On Thu, Apr 28, 2016 at 2:56 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>>>> Hi,
>>>>> Tree if-conversion sometimes cannot convert conditional array reference into unconditional one.  Root cause is GCC conservatively assumes newly introduced array reference could be out of array bound and thus trapping.  This patch improves the situation by proving the converted unconditional array reference is within array bound using loop niter information.  To be specific, it checks every index of array reference to see if it's within bound in ifcvt_memrefs_wont_trap.  This patch also factors out base_object_writable checking if the base object is writable or not.
>>>>> Bootstrap and test on x86_64 and aarch64, is it OK?
>>>>
>>>> I think you miss to handle the case optimally where the only
>>>> non-ARRAY_REF idx is the dereference of the
>>>> base-pointer for, say, p->a[i].  In this case we can use
>>>> base_master_dr to see if p is unconditionally dereferenced
>>> Yes, will pick up this case.
>>>
>>>> in the loop.  You also fail to handle the case where we have
>>>> MEM_REF[&x].a[i] that is, you see a decl base.
>>> I am having difficulty in creating this case for ifcvt, any advices?  Thanks.
>>
>> Sth like
>>
>> float a[128];
>> float foo (int n, int i)
>> {
>>   return (*((float(*)[n])a))[i];
>> }
>>
>> should do the trick (w/o the component-ref).  Any other type-punning
>> would do it, too.
>>
>>>> I suppose for_each_index should be fixed for this particular case (to
>>>> return true), same for TARGET_MEM_REF TMR_BASE.
>>>>
>>>> +  /* The case of nonconstant bounds could be handled, but it would be
>>>> +     complicated.  */
>>>> +  if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low)
>>>> +      || !high || TREE_CODE (high) != INTEGER_CST)
>>>> +    return false;
>>>> +
>>>>
>>>> handling of a non-zero but constant low bound is important - otherwise
>>>> all this is a no-op for Fortran.  It
>>>> shouldn't be too difficult to handle after all.  In fact I think your
>>>> code does handle it correctly already.
>>>>
>>>> +  if (!init || TREE_CODE (init) != INTEGER_CST
>>>> +      || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
>>>> +    return false;
>>>>
>>>> step == 0 should be easy to handle as well, no?  The index will simply
>>>> always be 'init' ...
>>>>
>>>> +  /* In case the relevant bound of the array does not fit in type, or
>>>> +     it does, but bound + step (in type) still belongs into the range of the
>>>> +     array, the index may wrap and still stay within the range of the array
>>>> +     (consider e.g. if the array is indexed by the full range of
>>>> +     unsigned char).
>>>> +
>>>> +     To make things simpler, we require both bounds to fit into type, although
>>>> +     there are cases where this would not be strictly necessary.  */
>>>> +  if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type))
>>>> +    return false;
>>>> +
>>>> +  low = fold_convert (type, low);
>>>>
>>>> please use wide_int for all of this.
>>> Now I use wi:fits_to_tree_p instead of int_fits_type_p. But I am not
>>> sure what's the meaning by "handle "low = fold_convert (type, low);"
>>> related code in wide_int".   Do you mean to use tree_int_cst_compare
>>> instead of tree_int_cst_compare in the following code?
>>
>> I don't think you need any kind of fits-to-type check here.  You'd simply
>> use to_widest () when operating on / comparing with high/low.
> But what would happen if low/high and init/step are different in type
> sign-ness?  Anything special I need to do before using wi::ltu_p or
> wi::lts_p directly?

You want to use to_widest (min) which extends according to sign to
an "infinite" precision signed integer.  So you can then use the new
operator< overloads as well.

Richard.

> Thanks,
> bin
>>
>> And no, I mean to do it all with widest_ints.
>>
>>>>
>>>> I wonder if we can do sth for wrapping IVs like
>>>>
>>>> int a[2048];
>>>>
>>>> for (int i = 0; i < 4096; ++i)
>>>>   ... a[(unsigned char)i];
>>>>
>>>> as well.  Like if the IVs type max and min value are within the array bounds
>>>> simply return true?
>>> I think we can only do this for read.  For write this is not safe.
>>> From vectorizer's point of view, is this worth handling?  Could
>>> vectorizer handles wrapping IV in a smaller range than loop IV?
>>
>> Possibly, but dependence analysis might get confused.
>>
>> Richard.
>>
>>> Thanks,
>>> bin
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>>> Thanks,
>>>>> bin
>>>>>
>>>>> 2016-04-28  Bin Cheng  <bin.cheng@arm.com>
>>>>>
>>>>>         * tree-if-conv.c (tree-ssa-loop.h): Include header file.
>>>>>         (tree-ssa-loop-niter.h): Ditto.
>>>>>         (idx_within_array_bound, ref_within_array_bound): New functions.
>>>>>         (ifcvt_memrefs_wont_trap): Check if array ref is within bound.
>>>>>         Factor out check on writable base object to ...
>>>>>         (base_object_writable): ... here.
diff mbox

Patch

diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 2d14901..170e644 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -106,6 +106,8 @@  along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "tree-data-ref.h"
 #include "tree-scalar-evolution.h"
+#include "tree-ssa-loop.h"
+#include "tree-ssa-loop-niter.h"
 #include "tree-ssa-loop-ivopts.h"
 #include "tree-ssa-address.h"
 #include "dbgcnt.h"
@@ -771,6 +773,132 @@  hash_memrefs_baserefs_and_store_DRs_read_written_info (data_reference_p a)
     }
 }
 
+/* Return TRUE if can prove the index IDX of an array reference REF is
+   within array bound.  Return false otherwise.  */
+
+static bool
+idx_within_array_bound (tree ref, tree *idx, void *dta)
+{
+  widest_int niter;
+  tree ev, init, step;
+  tree low, high, type, unsigned_type, delta, valid_niter, step_abs, e;
+  bool sign;
+  struct loop *loop = (struct loop*) dta;
+
+  /* Only support within-bound access for array references.  */
+  if (TREE_CODE (ref) != ARRAY_REF)
+    return false;
+
+  /* For arrays at the end of the structure, we are not guaranteed that they
+     do not really extend over their declared size.  However, for arrays of
+     size greater than one, this is unlikely to be intended.  */
+  if (array_at_struct_end_p (ref))
+    return false;
+
+  ev = analyze_scalar_evolution (loop, *idx);
+  ev = instantiate_parameters (loop, ev);
+  init = initial_condition (ev);
+  step = evolution_part_in_loop_num (ev, loop->num);
+
+  if (!init || TREE_CODE (init) != INTEGER_CST
+      || !step || TREE_CODE (step) != INTEGER_CST || integer_zerop (step))
+    return false;
+
+  low = array_ref_low_bound (ref);
+  high = array_ref_up_bound (ref);
+
+  /* The case of nonconstant bounds could be handled, but it would be
+     complicated.  */
+  if (TREE_CODE (low) != INTEGER_CST || !integer_zerop (low)
+      || !high || TREE_CODE (high) != INTEGER_CST)
+    return false;
+
+  sign = tree_int_cst_sign_bit (step);
+  type = TREE_TYPE (step);
+
+  /* In case the relevant bound of the array does not fit in type, or
+     it does, but bound + step (in type) still belongs into the range of the
+     array, the index may wrap and still stay within the range of the array
+     (consider e.g. if the array is indexed by the full range of
+     unsigned char).
+
+     To make things simpler, we require both bounds to fit into type, although
+     there are cases where this would not be strictly necessary.  */
+  if (!int_fits_type_p (high, type) || !int_fits_type_p (low, type))
+    return false;
+
+  low = fold_convert (type, low);
+  high = fold_convert (type, high);
+
+  /* Check if the first idx is within bound.  */
+  if (tree_int_cst_compare (init, low) < 0
+      || tree_int_cst_compare (init, high) > 0)
+    return false;
+
+  /* Don't issue signed overflow warnings.  */
+  fold_defer_overflow_warnings ();
+
+  unsigned_type = unsigned_type_for (type);
+  init = fold_convert (unsigned_type, init);
+  if (sign)
+    {
+      tree extreme = fold_convert (unsigned_type, low);
+      delta = fold_build2 (MINUS_EXPR, unsigned_type, init, extreme);
+      step_abs = fold_build1 (NEGATE_EXPR, unsigned_type,
+			      fold_convert (unsigned_type, step));
+    }
+  else
+    {
+      tree extreme = fold_convert (unsigned_type, high);
+      delta = fold_build2 (MINUS_EXPR, unsigned_type, extreme, init);
+      step_abs = fold_convert (unsigned_type, step);
+    }
+  valid_niter = fold_build2 (FLOOR_DIV_EXPR, unsigned_type, delta, step_abs);
+
+  /* Check if idx is within bound through all niter of loop.  */
+  if (max_loop_iterations (loop, &niter)
+      && wi::fits_to_tree_p (niter, TREE_TYPE (valid_niter))
+      && (e = fold_binary (GT_EXPR, boolean_type_node, valid_niter,
+			   wide_int_to_tree (TREE_TYPE (valid_niter),
+					     niter))) != NULL
+      && integer_nonzerop (e))
+    {
+      fold_undefer_and_ignore_overflow_warnings ();
+      return true;
+    }
+  fold_undefer_and_ignore_overflow_warnings ();
+
+  return false;
+}
+
+/* Return TRUE if ref is a within bound array reference.  */
+
+static bool
+ref_within_array_bound (gimple *stmt, tree ref)
+{
+  struct loop *loop = loop_containing_stmt (stmt);
+
+  gcc_assert (loop != NULL);
+  return for_each_index (&ref, idx_within_array_bound, loop);
+}
+
+
+/* Given a memory reference expression T, return TRUE if base object
+   it refers to is writable.  The base object of a memory reference
+   is the main object being referenced, which is returned by function
+   get_base_address.  */
+
+static bool
+base_object_writable (tree ref)
+{
+  tree base_tree = get_base_address (ref);
+
+  return (base_tree
+	  && DECL_P (base_tree)
+	  && decl_binds_to_current_def_p (base_tree)
+	  && !TREE_READONLY (base_tree));
+}
+
 /* Return true when the memory references of STMT won't trap in the
    if-converted code.  There are two things that we have to check for:
 
@@ -830,16 +958,20 @@  ifcvt_memrefs_wont_trap (gimple *stmt, vec<data_reference_p> drs)
       if (base_master_dr
 	  && DR_BASE_W_UNCONDITIONALLY (*base_master_dr))
 	return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
-      else
-	{
-	  /* or the base is know to be not readonly.  */
-	  tree base_tree = get_base_address (DR_REF (a));
-	  if (DECL_P (base_tree)
-	      && decl_binds_to_current_def_p (base_tree)
-	      && ! TREE_READONLY (base_tree))
-	    return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
-	}
+      /* or the base is known to be not readonly.  */
+      else if (base_object_writable (DR_REF (a)))
+	return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
     }
+
+  /* If a is within-bound array references then ...  */
+  if (ref_within_array_bound (stmt, DR_REF (a)))
+    {
+      if (DR_IS_READ (a))
+	return true;
+      else if (base_object_writable (DR_REF (a)))
+	return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
+    }
+
   return false;
 }
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-10.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-10.c
new file mode 100644
index 0000000..d35b6d4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-10.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-ifcvt-details" } */
+/* { dg-require-visibility "" } */
+
+int b[256] = {0}, y;
+void bar (int *);
+int foo (int x, int n)
+{
+  int i;
+  int a[128];
+
+  for (i = 0; i < n; i++)
+    {
+      a[i] = i;
+      if (x > i)
+	b[i] = y;
+    }
+  bar (a);
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-9.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-9.c
new file mode 100644
index 0000000..25a28db
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-9.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-options "-Ofast -fdump-tree-ifcvt-details" } */
+/* { dg-require-visibility "" } */
+
+extern int b[256], y;
+void bar (int *, int);
+int foo (int x, int n)
+{
+  int i;
+  int a[128];
+
+  for (i = 0; i < n; i++)
+    {
+      a[i] = i;
+      if (x > i)
+	y = b[i];
+    }
+  bar (a, y);
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */