diff mbox

have __builtin_object_size handle POINTER_PLUS with non-const offset

Message ID b60bf590-8063-5986-5cec-227f6a273017@gmail.com
State New
Headers show

Commit Message

Martin Sebor Sept. 16, 2016, 3:29 a.m. UTC
__builtin_object_size fails for POINTER_PLUS expressions involving
non-constant offsets into objects of known size, causing GCC to fail
to detect (and add instrumentation to prevent) buffer overflow in
some basic cases such as the following:

   void f (unsigned i)
   {
     char d [3];
     memcpy (d + i, "abcdef", 5);
   }

Since the size of the destination object is known, the call to memcpy
is guaranteed to write past the end of it regardless of the value of
the offset.

The attached patch enhances __builtin_object_size to handle this case
by returning the size of the whole object as the maximum and the size
of the object minus T_MAX for the type of the offset T as the minimum.

The patch also adds handling of ranges even though only very few cases
benefit from it because the VRP pass runs after the object size pass.
The one case that does appear to profit is when the value of the offset
is constrained by its type, as in

   char a [1000];

   unsigned g (unsigned char i)
   {
     char *p = a + i;
     return __builtin_object_size (p, 2);
   }

Here get_range_info () lets __builtin_object_size determine that
the minimum number of bytes between (a + i) and (a + sizeof s) is
sizeof a - UCHAR_MAX.

The patch results in 64 more checking calls in a Binutils build than
before.

Martin

PS What would be a good way to arrange for the VRP pass to run before
the object size pass so that the latter can benefit more from range
information?  As an experiment I added another instance of the VRP
pass before the object size pass in passes.def and that worked, but
I suspect that running VRP a third time isn't optimal.

Comments

Kugan Vivekanandarajah Sept. 16, 2016, 6:19 a.m. UTC | #1
Hi,

On 16/09/16 13:29, Martin Sebor wrote:
> __builtin_object_size fails for POINTER_PLUS expressions involving
> non-constant offsets into objects of known size, causing GCC to fail
> to detect (and add instrumentation to prevent) buffer overflow in
> some basic cases such as the following:
>
>    void f (unsigned i)
>    {
>      char d [3];
>      memcpy (d + i, "abcdef", 5);
>    }
>
> Since the size of the destination object is known, the call to memcpy
> is guaranteed to write past the end of it regardless of the value of
> the offset.
>
> The attached patch enhances __builtin_object_size to handle this case
> by returning the size of the whole object as the maximum and the size
> of the object minus T_MAX for the type of the offset T as the minimum.
>
> The patch also adds handling of ranges even though only very few cases
> benefit from it because the VRP pass runs after the object size pass.
> The one case that does appear to profit is when the value of the offset
> is constrained by its type, as in
>
>    char a [1000];
>
>    unsigned g (unsigned char i)
>    {
>      char *p = a + i;
>      return __builtin_object_size (p, 2);
>    }
>
> Here get_range_info () lets __builtin_object_size determine that
> the minimum number of bytes between (a + i) and (a + sizeof s) is
> sizeof a - UCHAR_MAX.
>
> The patch results in 64 more checking calls in a Binutils build than
> before.
>
> Martin
>
> PS What would be a good way to arrange for the VRP pass to run before
> the object size pass so that the latter can benefit more from range
> information?  As an experiment I added another instance of the VRP
> pass before the object size pass in passes.def and that worked, but
> I suspect that running VRP a third time isn't optimal.

I am working on early-vrp and that might be of interest to you.
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00993.html

Thanks,
Kugan
>
Richard Biener Sept. 16, 2016, 10:29 a.m. UTC | #2
On Fri, Sep 16, 2016 at 5:29 AM, Martin Sebor <msebor@gmail.com> wrote:
> __builtin_object_size fails for POINTER_PLUS expressions involving
> non-constant offsets into objects of known size, causing GCC to fail
> to detect (and add instrumentation to prevent) buffer overflow in
> some basic cases such as the following:
>
>   void f (unsigned i)
>   {
>     char d [3];
>     memcpy (d + i, "abcdef", 5);
>   }
>
> Since the size of the destination object is known, the call to memcpy
> is guaranteed to write past the end of it regardless of the value of
> the offset.
>
> The attached patch enhances __builtin_object_size to handle this case
> by returning the size of the whole object as the maximum and the size
> of the object minus T_MAX for the type of the offset T as the minimum.
>
> The patch also adds handling of ranges even though only very few cases
> benefit from it because the VRP pass runs after the object size pass.
> The one case that does appear to profit is when the value of the offset
> is constrained by its type, as in
>
>   char a [1000];
>
>   unsigned g (unsigned char i)
>   {
>     char *p = a + i;
>     return __builtin_object_size (p, 2);
>   }
>
> Here get_range_info () lets __builtin_object_size determine that
> the minimum number of bytes between (a + i) and (a + sizeof s) is
> sizeof a - UCHAR_MAX.
>
> The patch results in 64 more checking calls in a Binutils build than
> before.

I believe that you can't simply use the min/max of the ranges the way you
do nor can you assume zero for the minimum size as op1 might be
negative (for POINTER_PLUS_EXPR you need to interpret the offset
as signed).

Unless I completely misremember how tree-object-size.c works, of course.

Say,

char a[1000];

unsigned g (signed char i)
{
  char *p = &a[10] + i;
  return __builtin_object_size (p, 1);
}

range for i will be [-128, 127] but we'd like to return 1000 here I think.

Richard.

> Martin
>
> PS What would be a good way to arrange for the VRP pass to run before
> the object size pass so that the latter can benefit more from range
> information?  As an experiment I added another instance of the VRP
> pass before the object size pass in passes.def and that worked, but
> I suspect that running VRP a third time isn't optimal.
Jakub Jelinek Sept. 16, 2016, 10:39 a.m. UTC | #3
On Fri, Sep 16, 2016 at 12:29:49PM +0200, Richard Biener wrote:
> > PS What would be a good way to arrange for the VRP pass to run before
> > the object size pass so that the latter can benefit more from range
> > information?  As an experiment I added another instance of the VRP
> > pass before the object size pass in passes.def and that worked, but
> > I suspect that running VRP a third time isn't optimal.

As I said in PR77606, I have strong doubts about desirability of using VRP
info for __builtin_object_size computation.  VRP is an optimization that
assumes undefined behavior does not happen, __builtin_object_size is
typically used to catch undefined behavior, so pretty much assumes undefined
behavior can happen.  So, the __builtin_object_size computations should
prove no other value can appear in any program invocation, rather than
just any valid program invocation.  Sure, it is a best effort, with
possibility to return "unknown" or conservatively correct answers, but using
VRP info is IMHO already too much.

	Jakub
Martin Sebor Sept. 16, 2016, 11:55 p.m. UTC | #4
On 09/16/2016 04:29 AM, Richard Biener wrote:
> On Fri, Sep 16, 2016 at 5:29 AM, Martin Sebor <msebor@gmail.com> wrote:
>> __builtin_object_size fails for POINTER_PLUS expressions involving
>> non-constant offsets into objects of known size, causing GCC to fail
>> to detect (and add instrumentation to prevent) buffer overflow in
>> some basic cases such as the following:
>>
>>   void f (unsigned i)
>>   {
>>     char d [3];
>>     memcpy (d + i, "abcdef", 5);
>>   }
>>
>> Since the size of the destination object is known, the call to memcpy
>> is guaranteed to write past the end of it regardless of the value of
>> the offset.
>>
>> The attached patch enhances __builtin_object_size to handle this case
>> by returning the size of the whole object as the maximum and the size
>> of the object minus T_MAX for the type of the offset T as the minimum.
>>
>> The patch also adds handling of ranges even though only very few cases
>> benefit from it because the VRP pass runs after the object size pass.
>> The one case that does appear to profit is when the value of the offset
>> is constrained by its type, as in
>>
>>   char a [1000];
>>
>>   unsigned g (unsigned char i)
>>   {
>>     char *p = a + i;
>>     return __builtin_object_size (p, 2);
>>   }
>>
>> Here get_range_info () lets __builtin_object_size determine that
>> the minimum number of bytes between (a + i) and (a + sizeof s) is
>> sizeof a - UCHAR_MAX.
>>
>> The patch results in 64 more checking calls in a Binutils build than
>> before.
>
> I believe that you can't simply use the min/max of the ranges the way you
> do nor can you assume zero for the minimum size as op1 might be
> negative (for POINTER_PLUS_EXPR you need to interpret the offset
> as signed).
>
> Unless I completely misremember how tree-object-size.c works, of course.
>
> Say,
>
> char a[1000];
>
> unsigned g (signed char i)
> {
>   char *p = &a[10] + i;
>   return __builtin_object_size (p, 1);
> }
>
> range for i will be [-128, 127] but we'd like to return 1000 here I think.

Thanks for the example.  I agree that 1000 is the correct result
in type < 2.  Curiously, the range reported for this case is
actually the anti-range ~[128, -129], and because the patch
didn't handle those the result was 990 (i.e., the same as
__builtin_object_size(&a[10], type) for type < 2).

I think the way to handle cases like this in (type < 2) might be
by computing the size of the whole object first (i.e., ignoring
the array subscript) and then adjusting the result down according
to the bounds of the range.  Let me work on that.

Thanks
Martin
diff mbox

Patch

PR middle-end/77608 - missing protection on trivially detectable runtime
	buffer overflow

gcc/testsuite/ChangeLog:
	PR middle-end/77608
	* gcc.dg/builtin-object-size-18.c: New test.

gcc/ChangeLog:
	PR middle-end/77608
	* tree-object-size.c (plus_stmt_object_size): Handle non-constant
	offsets.

diff --git a/gcc/testsuite/gcc.dg/builtin-object-size-18.c b/gcc/testsuite/gcc.dg/builtin-object-size-18.c
new file mode 100644
index 0000000..e8e2269
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/builtin-object-size-18.c
@@ -0,0 +1,76 @@ 
+/* PR 77608 - missing protection on trivially detectable runtime buffer
+   overflow */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#define concat(a, b)   a ## b
+#define CAT(a, b)      concat (a, b)
+
+/* Create a symbol name unique to each tes and object size type.  */
+#define SYM(type)  CAT (CAT (CAT (failure_on_line_, __LINE__), _type_), type)
+
+/* References to the following undefined symbol which is unique for each
+   test case are expected to be eliminated.  */
+#define TEST_FAILURE(type)			\
+  do {						\
+    extern void SYM (type)(void);		\
+    SYM (type)();				\
+  } while (0)
+
+#define bos(obj, type) __builtin_object_size (obj, type)
+#define size(obj, n) ((size_t)n == X ? sizeof *obj : (size_t)n)
+
+/* For convenience.  Substitute for 'sizeof object' in test cases where
+   the size can vary from target to target.  */
+#define X  (size_t)0xdeadbeef
+
+#define test(expect, type, obj)			\
+  do {						\
+    if (bos (obj, type)	!= size (obj, expect))	\
+      TEST_FAILURE (type);			\
+  } while (0)
+
+/* Verify that each of the expressions __builtin_object_size(OBJ, n)
+   is folded into Rn.  */
+#define fold(r0, r1, r2, r3, obj)		\
+  do {						\
+    test (r0, 0, (obj));			\
+    test (r1, 1, (obj));			\
+    test (r2, 2, (obj));			\
+    test (r3, 3, (obj));			\
+  } while (0)
+
+/* Verify that __builtin_object_size(xBUF + OFF, n) is folded into Rn
+   for the specified OFF and each of the Rn values.  */
+#define FOLD(off, r0, r1, r2, r3)		\
+  do {						\
+    fold (r0, r1, r2, r3, gbuf + off);		\
+    sink (gbuf);				\
+    fold (r0, r1, r2, r3, lbuf + off);		\
+    sink (lbuf);				\
+    fold (r0, r1, r2, r3, abuf + off);		\
+    sink (abuf);				\
+  } while (0)
+
+#define UCHAR_MAX (__SCHAR_MAX__  * 2 + 1)
+#define BUFSIZE   (UCHAR_MAX * 2)
+
+typedef __SIZE_TYPE__ size_t;
+
+void sink (void*, ...);
+
+char gbuf[BUFSIZE];
+
+void test_pointer_plus (char ci, short si, int i, long li)
+{
+  char lbuf[BUFSIZE];
+
+  char *abuf = __builtin_malloc (BUFSIZE);
+
+  FOLD (ci, BUFSIZE, BUFSIZE, BUFSIZE - UCHAR_MAX, BUFSIZE - UCHAR_MAX);
+  FOLD (si, BUFSIZE, BUFSIZE, 0, 0);
+  FOLD ( i, BUFSIZE, BUFSIZE, 0, 0);
+  FOLD (li, BUFSIZE, BUFSIZE, 0, 0);
+}
+
+/* { dg-final { scan-tree-dump-not "failue_on_line" "optimized" } } */
diff --git a/gcc/tree-object-size.c b/gcc/tree-object-size.c
index 1317ad7..9e84156 100644
--- a/gcc/tree-object-size.c
+++ b/gcc/tree-object-size.c
@@ -837,21 +839,49 @@  plus_stmt_object_size (struct object_size_info *osi, tree var, gimple *stmt)
   if (object_sizes[object_size_type][varno] == unknown[object_size_type])
     return false;
 
+  bool estimated = false;
+
+  /* If the OFFSET isn't constant try to determine its range and use
+     either the lower or the upper bound of the range in its place.  */
+  if (TREE_CODE (op1) == SSA_NAME)
+    {
+      wide_int min, max;
+      enum value_range_type range_type = get_range_info (op1, &min, &max);
+      if (range_type == VR_RANGE)
+	{
+	  if (object_size_type < 2)
+	    op1 = build_int_cst (TREE_TYPE (op1), wi::fits_uhwi_p (min)
+				 ? min.to_uhwi () : min.to_shwi ());
+	  else
+	    op1 = build_int_cst (TREE_TYPE (op1), wi::fits_uhwi_p (max)
+				 ? max.to_uhwi () : max.to_shwi ());
+	}
+    }
+
+  /* If the OFFSET is not constant, assume it's either zero for the maximum
+     object size or SIZE_MAX for the minimum size of zero.  */
+  if (TREE_CODE (op1) != INTEGER_CST || !tree_fits_uhwi_p (op1))
+    {
+      op1 = (object_size_type < 2
+	     ? integer_zero_node : build_all_ones_cst (size_type_node));
+      estimated = true;
+    }
+
   /* Handle PTR + OFFSET here.  */
-  if (TREE_CODE (op1) == INTEGER_CST
-      && (TREE_CODE (op0) == SSA_NAME
-	  || TREE_CODE (op0) == ADDR_EXPR))
+  if (TREE_CODE (op0) == SSA_NAME
+      || TREE_CODE (op0) == ADDR_EXPR)
     {
-      if (! tree_fits_uhwi_p (op1))
-	bytes = unknown[object_size_type];
-      else if (TREE_CODE (op0) == SSA_NAME)
-	return merge_object_sizes (osi, var, op0, tree_to_uhwi (op1));
+      unsigned HOST_WIDE_INT off = tree_to_uhwi (op1);
+
+      if (TREE_CODE (op0) == SSA_NAME)
+	return merge_object_sizes (osi, var, op0, off);
       else
 	{
-	  unsigned HOST_WIDE_INT off = tree_to_uhwi (op1);
-
-          /* op0 will be ADDR_EXPR here.  */
+	  /* op0 will be ADDR_EXPR here.  */
 	  addr_object_size (osi, op0, object_size_type, &bytes);
+	  if (estimated && off == HOST_WIDE_INT_M1U)
+	    off = bytes;
+
 	  if (bytes == unknown[object_size_type])
 	    ;
 	  else if (off > offset_limit)