Patchwork [PR45098] Disallow NULL pointer in pointer arithmetic

login
register
mail settings
Submitter Tom de Vries
Date June 16, 2011, 6:39 a.m.
Message ID <4DF9A526.9060906@codesourcery.com>
Download mbox | patch
Permalink /patch/100595/
State New
Headers show

Comments

Tom de Vries - June 16, 2011, 6:39 a.m.
Hi,

Consider the following example.

extern unsigned int foo (int*) __attribute__((pure));
unsigned int
tr (int array[], int n)
{
  unsigned int i;
  unsigned int sum = 0;
  for (i = 0; i < n; i++)
    sum += foo (&array[i]);
  return sum;
}

For 32-bit pointers, the analysis in infer_loop_bounds_from_pointer_arith
currently concludes that the range of valid &array[i] is &array[0x0] to
&array[0x3fffffff], meaning 0x40000000 distinct values.
This implies that i < n is executed at most 0x40000001 times, and i < n
cannot be eliminated by an 32-bit iterator with step 4, since that one has
only 0x40000000 distinct values.

The patch reasons that NULL cannot be used or produced by pointer
arithmetic, and that we can exclude the possibility of the NULL pointer in the
range. So the range of valid &array[i] is &array[0] to &array[0x3ffffffe],
meaning 0x3fffffff distinct values.
This implies that i < n is executed at most 0x40000000 times and i < n can be
eliminated.

The patch implements this new limitation by changing the (low, high, step)
triplet in infer_loop_bounds_from_pointer_arith from (0x0, 0xffffffff, 0x4)
to (0x4, 0xffffffff, 0x4).

I'm not too happy about the test for C-like language: ptrdiff_type_node !=
NULL_TREE, but I'm not sure how else to test for this.

Bootstrapped and reg-tested on x86_64.

I will sent the adapted test cases in a separate email.

OK for trunk?

Thanks,
- Tom

2011-06-15  Tom de Vries  <tom@codesourcery.com>

	PR target/45098
	* tree-ssa-loop-niter.c (infer_loop_bounds_from_pointer_arith): Disallow
	NULL pointer for pointer arithmetic.
Zdenek Dvorak - June 16, 2011, 7:24 a.m.
Hi,

> Consider the following example.
> 
> extern unsigned int foo (int*) __attribute__((pure));
> unsigned int
> tr (int array[], int n)
> {
>   unsigned int i;
>   unsigned int sum = 0;
>   for (i = 0; i < n; i++)
>     sum += foo (&array[i]);
>   return sum;
> }
> 
> For 32-bit pointers, the analysis in infer_loop_bounds_from_pointer_arith
> currently concludes that the range of valid &array[i] is &array[0x0] to
> &array[0x3fffffff], meaning 0x40000000 distinct values.
> This implies that i < n is executed at most 0x40000001 times, and i < n
> cannot be eliminated by an 32-bit iterator with step 4, since that one has
> only 0x40000000 distinct values.

this reasoning seems slightly inaccurate to me: the loop is typically translated
as

if (n > 0)
  {
    i = 0;
    start:
      sum += foo (&array[i]);
      i++;
      if (i < n)
        goto start;
  }

This means that if the array access is performed <= x times, then the exit condition
of the loop is tested <= x times (not counting the copied header).  This could be
exploited to fix the problem as well.

> The patch reasons that NULL cannot be used or produced by pointer
> arithmetic, and that we can exclude the possibility of the NULL pointer in the
> range. So the range of valid &array[i] is &array[0] to &array[0x3ffffffe],
> meaning 0x3fffffff distinct values.
> This implies that i < n is executed at most 0x40000000 times and i < n can be
> eliminated.

Sounds reasonable to me.

> The patch implements this new limitation by changing the (low, high, step)
> triplet in infer_loop_bounds_from_pointer_arith from (0x0, 0xffffffff, 0x4)
> to (0x4, 0xffffffff, 0x4).

At least in principle, the variable could have initial value 0x1, so this is a bit
incorrect.  (alignment of the memory reference, 0xffffffff, 0x4) should work, though.

> I'm not too happy about the test for C-like language: ptrdiff_type_node !=
> NULL_TREE, but I'm not sure how else to test for this.

The middle-end operations in general follow the C semantics, unless explicitly
specified otherwise (e.g. language-specific alias analysis rules, ...).  So, I think
you can drop this test.

Zdenek
Jeff Law - June 16, 2011, 10:01 p.m.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 06/16/11 00:39, Tom de Vries wrote:
> Hi,
> 
> Consider the following example.
> 
> extern unsigned int foo (int*) __attribute__((pure));
> unsigned int
> tr (int array[], int n)
> {
>   unsigned int i;
>   unsigned int sum = 0;
>   for (i = 0; i < n; i++)
>     sum += foo (&array[i]);
>   return sum;
> }
> 
> For 32-bit pointers, the analysis in infer_loop_bounds_from_pointer_arith
> currently concludes that the range of valid &array[i] is &array[0x0] to
> &array[0x3fffffff], meaning 0x40000000 distinct values.
> This implies that i < n is executed at most 0x40000001 times, and i < n
> cannot be eliminated by an 32-bit iterator with step 4, since that one has
> only 0x40000000 distinct values.
> 
> The patch reasons that NULL cannot be used or produced by pointer
> arithmetic, and that we can exclude the possibility of the NULL pointer in the
> range. So the range of valid &array[i] is &array[0] to &array[0x3ffffffe],
> meaning 0x3fffffff distinct values.
> This implies that i < n is executed at most 0x40000000 times and i < n can be
> eliminated.
> 
> The patch implements this new limitation by changing the (low, high, step)
> triplet in infer_loop_bounds_from_pointer_arith from (0x0, 0xffffffff, 0x4)
> to (0x4, 0xffffffff, 0x4).
> 
> I'm not too happy about the test for C-like language: ptrdiff_type_node !=
> NULL_TREE, but I'm not sure how else to test for this.
> 
> Bootstrapped and reg-tested on x86_64.
> 
> I will sent the adapted test cases in a separate email.
Interesting.  I'd never thought about the generation/use angle to prove
a pointer was non-null.  ISTM we could use that same logic to infer that
more pointers are non-null in extract_range_from_binary_expr.

Interested in tackling that improvement, obviously as an independent patch?

jeff
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJN+n0cAAoJEBRtltQi2kC7aRYH/1jyc0xmWEnzkxaMxdn9t5+p
asGN79nl8BSPifZapn2R7brEt9uQQNT6oAe/4wlCr0qf5f0FwMUV8U2QH8uMuez3
gqO+PuqcF6dSxR5+qskgljSjjLndxdFuaiN1Lb95jR9Wg3l/Nv6NGpjdgAaWHiVk
cmiuwAkVGSB46TGMMVnumFWTbXbXAK7udSk1PBDUZlY8Da+B9M2eGX9MuaPBNWvd
YSHRpkVVFAlyJIpwdtAojE6T2korZQyHAmYqiuArBPYxAN7cLuV8Gl4AagzyHcVz
Epkg7e0ayS1PnnQuH1JpAKGKH1DSlOmqo69JJpuL/kyaBh5lo4wu32RWHm/aGkY=
=fESM
-----END PGP SIGNATURE-----

Patch

diff -u gcc/tree-ssa-loop-niter.c (working copy) gcc/tree-ssa-loop-niter.c (working copy)
--- gcc/tree-ssa-loop-niter.c (working copy)
+++ gcc/tree-ssa-loop-niter.c (working copy)
@@ -2875,6 +2875,16 @@ 
   low = lower_bound_in_type (type, type);
   high = upper_bound_in_type (type, type);
 
+  /* In C, pointer arithmetic p + 1 cannot use a NULL pointer, and p - 1 cannot
+     produce a NULL pointer.  The contrary would mean NULL points to an object,
+     while NULL is supposed to compare unequal with the address of all objects.
+     Furthermore, p + 1 cannot produce a NULL pointer and p - 1 cannot use a
+     NULL pointer since that would mean wrapping, which we assume here not to
+     happen.  So, we can exclude NULL from the valid range of pointer
+     arithmetic.  */
+  if (ptrdiff_type_node != NULL_TREE && int_cst_value (low) == 0)
+    low = fold_build2 (PLUS_EXPR, TREE_TYPE (low), low, step);
+
   record_nonwrapping_iv (loop, base, step, stmt, low, high, false, true);
 }