diff mbox

Don't necessarily emit object size checks for ARRAY_REFs

Message ID 20141106221908.GE2342@redhat.com
State New
Headers show

Commit Message

Marek Polacek Nov. 6, 2014, 10:19 p.m. UTC
First part of this patch is about removing the useless check that
we talked about earlier today.

The rest is about not emitting UBSAN_OBJECT_SIZE checks (those often
come with multiple statements to compute a pointer difference) for
ARRAY_REFs that are already instrumented by UBSAN_BOUNDS.

I do this by moving the UBSAN_OBJECT_SIZE instrumentation so that
it is done first in the ubsan pass - then I can just check whether
the statement before that ARRAY_REF is a UBSAN_BOUND check.  If it
is, it should be clear that it is checking the ARRAY_REF, and I can
drop the UBSAN_OBJECT_SIZE check.  (Moving the UBSAN_OBJECT_SIZE
instrumentation means that there won't be e.g. UBSAN_NULL check in
between the ARRAY_REF and UBSAN_BOUND.)

Earlier, I thought I should check that both UBSAN_OBJECT_SIZE and
UBSAN_BOUND checks are checking the same array index, but that
wouldn't work for multidimensional arrays, and just should not be
needed.

The new test shows where we were bogusly emitting both diagnostics
and ensures that we only emit the -fsanitize=bounds one.
Of course, if the test is run with -fsanitize=object-size or with
-fsanitize=undefined -fno-sanitize=bounds, we emit the object size
diagnostics and not the bounds diagnostics.

Bootstrap-ubsan/regtest passed on x86_64-linux, ok for trunk?

2014-11-06  Marek Polacek  <polacek@redhat.com>

	* sanopt.c (sanopt_optimize_walker): Don't check alignment when
	popping a statement.
	* ubsan.c (instrument_object_size): Don't instrument already
	instrumented ARRAY_REFs.
	(pass_ubsan::execute): Perform object size instrumentation
	first.
testsuite/
	* c-c++-common/ubsan/object-size-9.c: Use -fsanitize=object-size.
	Add dg-output.
	* c-c++-common/ubsan/object-size-10.c: Likewise.
	* c-c++-common/ubsan/undefined-3.c: New test.


	Marek

Comments

Jakub Jelinek Nov. 11, 2014, 8:04 a.m. UTC | #1
On Thu, Nov 06, 2014 at 11:19:08PM +0100, Marek Polacek wrote:
> First part of this patch is about removing the useless check that
> we talked about earlier today.
> 
> The rest is about not emitting UBSAN_OBJECT_SIZE checks (those often
> come with multiple statements to compute a pointer difference) for
> ARRAY_REFs that are already instrumented by UBSAN_BOUNDS.
> 
> I do this by moving the UBSAN_OBJECT_SIZE instrumentation so that
> it is done first in the ubsan pass - then I can just check whether
> the statement before that ARRAY_REF is a UBSAN_BOUND check.  If it
> is, it should be clear that it is checking the ARRAY_REF, and I can
> drop the UBSAN_OBJECT_SIZE check.  (Moving the UBSAN_OBJECT_SIZE
> instrumentation means that there won't be e.g. UBSAN_NULL check in
> between the ARRAY_REF and UBSAN_BOUND.)
> 
> Earlier, I thought I should check that both UBSAN_OBJECT_SIZE and
> UBSAN_BOUND checks are checking the same array index, but that
> wouldn't work for multidimensional arrays, and just should not be
> needed.

IMHO it is needed and is highly desirable, otherwise you risk missed
diagnostics from -fsanitize=object-size when it is needed.

Consider e.g.:

extern int a[][10][10];

int
foo (int x, int y, int z)
{
  return a[x][y][z];
}

int a[10][10][10] = {};

testcase, here only the y and z indexes are bounds checked, but
the x index is not (UBSAN_BOUNDS is added early, before the a var
definition is parsed, while ubsan pass runs afterwards, so can know the
object size.

If you have a multi-dimensional array, you can just walk backwards
within the same bb, looking for UBSAN_BOUNDS calls that verify
the indexes where needed.

Say on:
struct S { int a:3; };
extern struct S a[][10][10];

int
foo (int x, int y, int z)
{
  return a[5][11][z].a;
}

struct S a[10][10][10] = {};

you have:
  UBSAN_BOUNDS (0B, 11, 9);
  z.0_4 = z_3(D);
  UBSAN_BOUNDS (0B, z.0_4, 9);
  _6 = a[5][11][z.0_4].a;

and you walk the handled components:
1) COMPONENT_REF - ok
2) ARRAY_REF with index z.0_4 and array index maximum is 9, there is
   UBSAN_BOUNDS right above it checking that
3) ARRAY_REF with index 11; 11 is bigger than index maximum 9,
   there is UBSAN_BOUNDS call for it in the same bb
4) ARRAY_REF with index 5; 5 is smaller or equal than index maximum 9,
   no UBSAN_BOUNDS is needed
5) decl inside of the innermost handled component, we can avoid
   the object-size instrumentation; if the base is not a decl,
   never omit object-size instrumentation.

	Jakub
diff mbox

Patch

diff --git gcc/sanopt.c gcc/sanopt.c
index 0fc032a..815d976 100644
--- gcc/sanopt.c
+++ gcc/sanopt.c
@@ -151,8 +151,7 @@  sanopt_optimize_walker (basic_block bb, struct sanopt_ctx *ctx)
 				     || flag_sanitize_undefined_trap_on_error
 				     || gimple_location (g)
 					== gimple_location (stmt);
-			  if (!remove && gimple_bb (g) == gimple_bb (stmt)
-			      && tree_int_cst_compare (cur_align, align) == 0)
+			  if (!remove && gimple_bb (g) == gimple_bb (stmt))
 			    v.pop ();
 			  break;
 			}
diff --git gcc/testsuite/c-c++-common/ubsan/object-size-10.c gcc/testsuite/c-c++-common/ubsan/object-size-10.c
index ebc8582..3e8608a 100644
--- gcc/testsuite/c-c++-common/ubsan/object-size-10.c
+++ gcc/testsuite/c-c++-common/ubsan/object-size-10.c
@@ -1,6 +1,6 @@ 
 /* { dg-do run } */
 /* { dg-skip-if "" { *-*-* } { "*" } { "-O2" } } */
-/* { dg-options "-fsanitize=undefined" } */
+/* { dg-options "-fsanitize=object-size" } */
 
 static char a[128];
 static int b[128];
@@ -19,8 +19,7 @@  fn2 (int i)
   return a[i & 128];
 }
 
-/* { dg-output "index 128 out of bounds for type 'char \\\[128\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "\[^\n\r]*load of address \[^\n\r]* with insufficient space for an object of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "load of address \[^\n\r]* with insufficient space for an object of type 'char'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*note: pointer points here\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*\\^\[^\n\r]*(\n|\r\n|\r)" } */
@@ -39,7 +38,6 @@  fn4 (int i)
   return b[i & 128];
 }
 
-/* { dg-output "\[^\n\r]*index 128 out of bounds for type 'int \\\[128\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*load of address \[^\n\r]* with insufficient space for an object of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*note: pointer points here\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*\[^\n\r]*(\n|\r\n|\r)" } */
@@ -52,7 +50,6 @@  fn5 (int i, int j)
   return b[i & j];
 }
 
-/* { dg-output "\[^\n\r]*index 128 out of bounds for type 'int \\\[128\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*load of address \[^\n\r]* with insufficient space for an object of type 'int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*note: pointer points here\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*\[^\n\r]*(\n|\r\n|\r)" } */
diff --git gcc/testsuite/c-c++-common/ubsan/object-size-9.c gcc/testsuite/c-c++-common/ubsan/object-size-9.c
index 829c822..842ad15 100644
--- gcc/testsuite/c-c++-common/ubsan/object-size-9.c
+++ gcc/testsuite/c-c++-common/ubsan/object-size-9.c
@@ -1,6 +1,6 @@ 
 /* { dg-do run } */
 /* { dg-skip-if "" { *-*-* } { "*" } { "-O2" } } */
-/* { dg-options "-fsanitize=undefined" } */
+/* { dg-options "-fsanitize=object-size" } */
 
 /* Test PARM_DECLs and RESULT_DECLs.  */
 
@@ -35,7 +35,6 @@  f2 (int i)
   return x;
 }
 
-/* { dg-output "\[^\n\r]*index 12 out of bounds for type 'char \\\[8\\\]'\[^\n\r]*(\n|\r\n|\r)" { target { c++ } } } */
 /* { dg-output "\[^\n\r]*store to address \[^\n\r]* with insufficient space for an object of type 'char'\[^\n\r]*(\n|\r\n|\r)" { target { c++ } } } */
 /* { dg-output "\[^\n\r]*note: pointer points here\[^\n\r]*(\n|\r\n|\r)" { target { c++ } } } */
 /* { dg-output "\[^\n\r]*\[^\n\r]*(\n|\r\n|\r)" { target { c++ } } } */
@@ -64,7 +63,6 @@  f4 (int i)
   return s.d[i].b;
 }
 
-/* { dg-output "\[^\n\r]*index 12 out of bounds for type 'U \\\[10\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*load of address \[^\n\r]* with insufficient space for an object of type 'unsigned int'\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*note: pointer points here\[^\n\r]*(\n|\r\n|\r)" } */
 /* { dg-output "\[^\n\r]*\[^\n\r]*(\n|\r\n|\r)" } */
diff --git gcc/testsuite/c-c++-common/ubsan/undefined-3.c gcc/testsuite/c-c++-common/ubsan/undefined-3.c
index e69de29..fc8900a 100644
--- gcc/testsuite/c-c++-common/ubsan/undefined-3.c
+++ gcc/testsuite/c-c++-common/ubsan/undefined-3.c
@@ -0,0 +1,67 @@ 
+/* { dg-do run } */
+/* { dg-options "-O2 -fsanitize=undefined -fno-sanitize-recover=object-size" } */
+/* Test that we only emit UBSAN_BOUNDS for the following ARRAY_REFs.  */
+
+static int g;
+struct S { int a[10]; };
+
+__attribute__((noinline, noclone)) void
+fn1 (void)
+{
+  int a[10];
+  asm ("" : : "r" (&a) : "memory");
+  g = a[10];
+}
+
+/* { dg-output "index 10 out of bounds for type 'int \\\[10\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+__attribute__((noinline, noclone)) void
+fn2 (int i)
+{
+  int a[10];
+  asm ("" : : "r" (&a) : "memory");
+  g = a[i];
+}
+
+/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int \\\[10\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+__attribute__((noinline, noclone)) void
+fn3 (int i)
+{
+  int a[10][10][10];
+  asm ("" : : "r" (&a) : "memory");
+  g += a[i][5][5];
+}
+
+/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int \\\[10\\\]\\\[10\\\]\\\[10\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+__attribute__((noinline, noclone)) void
+fn4 (int i)
+{
+  struct S s;
+  asm ("" : : "r" (&s.a) : "memory");
+  g = s.a[i];
+}
+
+/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int \\\[10\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+__attribute__((noinline, noclone)) void
+fn5 (int i)
+{
+  int a[10];
+  a[1] = 10;
+  asm ("" : : "r" (&a) : "memory");
+  g = a[a[1]];
+}
+
+/* { dg-output "\[^\n\r]*index 10 out of bounds for type 'int \\\[10\\\]'\[^\n\r]*(\n|\r\n|\r)" } */
+
+int
+main ()
+{
+  fn1 ();
+  fn2 (10);
+  fn3 (10);
+  fn4 (10);
+  fn5 (10);
+}
diff --git gcc/ubsan.c gcc/ubsan.c
index 41cf546..4a67801 100644
--- gcc/ubsan.c
+++ gcc/ubsan.c
@@ -1484,6 +1484,22 @@  instrument_object_size (gimple_stmt_iterator *gsi, bool is_lhs)
       || bitsize != size_in_bytes * BITS_PER_UNIT)
     return;
 
+  /* Don't instrument ARRAY_REFs if -fsanitize=bounds already
+     took care of that.  */
+  gimple_stmt_iterator gsi2 = *gsi;
+  if ((flag_sanitize & SANITIZE_BOUNDS) && !gsi_end_p (gsi2))
+    {
+      /* Look at the previous statement.  */
+      gsi_prev (&gsi2);
+      gimple prev_stmt = gsi_stmt (gsi2);
+      if (prev_stmt && is_gimple_call (prev_stmt)
+	  && gimple_call_internal_p (prev_stmt)
+	  && gimple_call_internal_fn (prev_stmt) == IFN_UBSAN_BOUNDS)
+	/* Ok, we have at least one UBSAN_BOUNDS call that is checking
+	   this very ARRAY_REF.  No need to emit objsize checking.  */
+	return;
+    }
+
   bool decl_p = DECL_P (inner);
   tree base = decl_p ? inner : TREE_OPERAND (inner, 0);
   tree ptr = build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (t)), t);
@@ -1630,6 +1646,16 @@  pass_ubsan::execute (function *fun)
 	      continue;
 	    }
 
+	  /* Perform this instrumentation first, so we can easily
+	     check for UBSAN_BOUNDS before this statement.  */
+	  if (flag_sanitize & SANITIZE_OBJECT_SIZE)
+	    {
+	      if (gimple_store_p (stmt))
+		instrument_object_size (&gsi, true);
+	      if (gimple_assign_load_p (stmt))
+		instrument_object_size (&gsi, false);
+	    }
+
 	  if ((flag_sanitize & SANITIZE_SI_OVERFLOW)
 	      && is_gimple_assign (stmt))
 	    instrument_si_overflow (gsi);
@@ -1664,14 +1690,6 @@  pass_ubsan::execute (function *fun)
 	      bb = gimple_bb (stmt);
 	    }
 
-	  if (flag_sanitize & SANITIZE_OBJECT_SIZE)
-	    {
-	      if (gimple_store_p (stmt))
-		instrument_object_size (&gsi, true);
-	      if (gimple_assign_load_p (stmt))
-		instrument_object_size (&gsi, false);
-	    }
-
 	  gsi_next (&gsi);
 	}
     }