diff mbox

c/67882 - improve -Warray-bounds for invalid offsetof

Message ID 56172C8C.2070202@gmail.com
State New
Headers show

Commit Message

Martin Sebor Oct. 9, 2015, 2:55 a.m. UTC
Gcc attempts to diagnose invalid offsetof expressions whose member
designator is an array element with an out-of-bounds index. The
logic in the function that does this detection is incomplete, leading
to false negatives. Since the result of the expression in these cases
can be surprising, this patch tightens up the logic to diagnose more
such cases.

Tested by boostrapping and running c/c++ tests on x86_64 with no
regressions.

Martin

Comments

Martin Sebor Oct. 15, 2015, 9:59 p.m. UTC | #1
The patch is at the following link:

   https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00915.html

Martin

On 10/08/2015 08:55 PM, Martin Sebor wrote:
> Gcc attempts to diagnose invalid offsetof expressions whose member
> designator is an array element with an out-of-bounds index. The
> logic in the function that does this detection is incomplete, leading
> to false negatives. Since the result of the expression in these cases
> can be surprising, this patch tightens up the logic to diagnose more
> such cases.
>
> Tested by boostrapping and running c/c++ tests on x86_64 with no
> regressions.
>
> Martin
Bernd Schmidt Oct. 16, 2015, 12:27 p.m. UTC | #2
On 10/09/2015 04:55 AM, Martin Sebor wrote:
> Gcc attempts to diagnose invalid offsetof expressions whose member
> designator is an array element with an out-of-bounds index. The
> logic in the function that does this detection is incomplete, leading
> to false negatives. Since the result of the expression in these cases
> can be surprising, this patch tightens up the logic to diagnose more
> such cases.

In the future, please explain more clearly in the patch submission what 
the false negatives are. That'll make the reviewer's job easier.

> Tested by boostrapping and running c/c++ tests on x86_64 with no
> regressions.

Should run the full testsuite (standard practice, and library tests 
might have occurrences of offsetof).

A ChangeLog is missing. (Not that I personally care about ChangeLogs, 
but apparently others do.)

> +struct offsetof_ctx_t
> +{
> +  tree inx;     /* The invalid array index or NULL_TREE.  */
> +  int maxinx;   /* All indices to the array have the highest valid value. */
> +};

I think "idx" is commonly used, I've never seen the spelling "inx". 
Also, typically comments go on their own lines before the field.

> +
> +	      if (tree_int_cst_lt (upbound, t)) {
> +		pctx->inx = t;

None-standard formatting. Elsewhere too.

> +		  /* Index is considered valid when it's either less than
> +		     the upper bound or equal to it and refers to the lowest
> +		     rank.  Since in the latter case it may not at this point
> +		     in the recursive call to the function be known whether
> +		     the element at the index is used to do more than to
> +		     compute its offset (e.g., it can be used to designate
> +		     a member of the type to which the just past-the-end
> +		     element refers), point the INX variable at the index
> +		     and leave it up to the caller to decide whether or not
> +		     to diagnose it.  Special handling is required for minor
> +		     index values referring to the element just past the end
> +		     of the array object.  */

I admit to having trouble parsing this comment. Can you write that in a 
clearer way somehow? I'm still trying to make my mind up whether the 
logic in this patch could be simplified.

>         t = convert (sizetype, t);
>         off = size_binop (MULT_EXPR, TYPE_SIZE_UNIT (TREE_TYPE (expr)), t);
> +
>         break;

Spurious change, please remove.

> +extern tree fold_offsetof_1 (tree, offsetof_ctx_t* = NULL);

Space before *.

> +// treatment since the offsetof exression yields the same result for

"expression".

> +    // The following expression is silently accepted as an extension
> +    // because it simply forms the equivalent of a just-past-the-end
> +    // address.
> +    __builtin_offsetof (A, a1_1 [0][1]),    // extension

Hmm, do we really want to support any kind of multidimensional array for 
this extension? My guess would have been to warn here.

So I checked and it looks like we accept flexible array member syntax 
like "int a[][2];", which suggests that the test might have the right 
idea, but has the indices swapped (the first one is the flexible one)? 
Ccing Joseph for a ruling.


Bernd
Joseph Myers Oct. 16, 2015, 4:41 p.m. UTC | #3
On Fri, 16 Oct 2015, Bernd Schmidt wrote:

> > +    // The following expression is silently accepted as an extension
> > +    // because it simply forms the equivalent of a just-past-the-end
> > +    // address.
> > +    __builtin_offsetof (A, a1_1 [0][1]),    // extension
> 
> Hmm, do we really want to support any kind of multidimensional array for this
> extension? My guess would have been to warn here.

We do deliberately want to support sequences of array indexing and 
structure / union member reference inside __builtin_offsetof.

> So I checked and it looks like we accept flexible array member syntax like
> "int a[][2];", which suggests that the test might have the right idea, but has
> the indices swapped (the first one is the flexible one)? Ccing Joseph for a
> ruling.

"int a[][2];" is a fully valid flexible array member (whose elements have 
type int[2]) - ISO C, not an extension.

Given such a flexible array member, [anything][0] and [anything][1] are 
logically valid if [anything] is nonnegative (of course, the array might 
or might not have enough element at runtime).  [anything][2] is a 
reference to just past the end of an element of the flexible array member; 
[anything][3] is clearly invalid.

Regarding the actual testcase, accepting a1_1[0][1] seems fine (it's 
referencing the just-past-end element of the valid array a1_1[0]).  The 
test unfortunately doesn't show what happens in cases with fewer indices - 
it should be expanded that way.  I'd say a1_1[1] should be accepted as 
just past end, even if a1_1[1][0] gets a warning as shown in the test.
diff mbox

Patch

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 4b922bf..fc7c991d 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -10536,12 +10536,31 @@  c_common_to_target_charset (HOST_WIDE_INT c)
 
 /* Fold an offsetof-like expression.  EXPR is a nested sequence of component
    references with an INDIRECT_REF of a constant at the bottom; much like the
-   traditional rendering of offsetof as a macro.  Return the folded result.  */
+   traditional rendering of offsetof as a macro.  Return the folded result.
+   PCTX, which is initially null, is set by the first recursive call of
+   the function to refer to a local object describing the potentially out-
+   of-bounds index of the array member whose offset is being computed, and
+   to indicate whether all indices to the same array object have the highest
+   valid value.  The function issues a warning for out-of-bounds array indices
+   that either refer to elements past the one just past end of the array object
+   or that exceed any of the major bounds.  */
+
+struct offsetof_ctx_t
+{
+  tree inx;     /* The invalid array index or NULL_TREE.  */
+  int maxinx;   /* All indices to the array have the highest valid value. */
+};
 
 tree
-fold_offsetof_1 (tree expr)
+fold_offsetof_1 (tree expr, offsetof_ctx_t *pctx /* = 0 */)
 {
   tree base, off, t;
+  offsetof_ctx_t ctx = { NULL_TREE, -1 };
+
+  /* Set the context pointer to point to the local context object
+     to use by subsequent recursive calls.  */
+  if (!pctx)
+    pctx = &ctx;
 
   switch (TREE_CODE (expr))
     {
@@ -10567,10 +10586,19 @@  fold_offsetof_1 (tree expr)
       return TREE_OPERAND (expr, 0);
 
     case COMPONENT_REF:
-      base = fold_offsetof_1 (TREE_OPERAND (expr, 0));
+      base = fold_offsetof_1 (TREE_OPERAND (expr, 0), pctx);
       if (base == error_mark_node)
 	return base;
 
+      if (ctx.inx != NULL_TREE) {
+	warning (OPT_Warray_bounds,
+		 "index %E denotes an offset "
+		 "greater than size of %qT",
+		 ctx.inx, TREE_TYPE (TREE_OPERAND (expr, 0)));
+	/* Reset to avoid diagnosing the same expression multiple times.  */
+	pctx->inx = NULL_TREE;
+      }
+
       t = TREE_OPERAND (expr, 1);
       if (DECL_C_BIT_FIELD (t))
 	{
@@ -10581,10 +10609,11 @@  fold_offsetof_1 (tree expr)
       off = size_binop_loc (input_location, PLUS_EXPR, DECL_FIELD_OFFSET (t),
 			    size_int (tree_to_uhwi (DECL_FIELD_BIT_OFFSET (t))
 				      / BITS_PER_UNIT));
+      pctx->maxinx = -1;
       break;
 
     case ARRAY_REF:
-      base = fold_offsetof_1 (TREE_OPERAND (expr, 0));
+      base = fold_offsetof_1 (TREE_OPERAND (expr, 0), pctx);
       if (base == error_mark_node)
 	return base;
 
@@ -10601,8 +10630,10 @@  fold_offsetof_1 (tree expr)
 	    {
 	      upbound = size_binop (PLUS_EXPR, upbound,
 				    build_int_cst (TREE_TYPE (upbound), 1));
-	      if (tree_int_cst_lt (upbound, t))
-		{
+
+	      if (tree_int_cst_lt (upbound, t)) {
+		pctx->inx = t;
+
 		tree v;
 
 		for (v = TREE_OPERAND (expr, 0);
@@ -10622,25 +10653,61 @@  fold_offsetof_1 (tree expr)
 		/* Don't warn if the array might be considered a poor
 		   man's flexible array member with a very permissive
 		   definition thereof.  */
-		  if (TREE_CODE (v) == ARRAY_REF
-		      || TREE_CODE (v) == COMPONENT_REF)
+		if (TREE_CODE (v) != ARRAY_REF
+		    && TREE_CODE (v) != COMPONENT_REF)
+		  pctx = NULL;
+	      }
+	      else if (pctx->inx == NULL_TREE && tree_int_cst_equal (upbound, t))
+		{
+		  /* Index is considered valid when it's either less than
+		     the upper bound or equal to it and refers to the lowest
+		     rank.  Since in the latter case it may not at this point
+		     in the recursive call to the function be known whether
+		     the element at the index is used to do more than to
+		     compute its offset (e.g., it can be used to designate
+		     a member of the type to which the just past-the-end
+		     element refers), point the INX variable at the index
+		     and leave it up to the caller to decide whether or not
+		     to diagnose it.  Special handling is required for minor
+		     index values referring to the element just past the end
+		     of the array object.  */
+		  pctx->inx = t;
+		  tree_code lastcode = TREE_CODE (TREE_OPERAND (expr, 0));
+		  if ((lastcode != ARRAY_REF && pctx != &ctx)
+		      || (pctx == &ctx && pctx->maxinx))
+		    pctx = NULL;
+		}
+	      else
+		{
+		  HOST_WIDE_INT ubi = tree_to_shwi (upbound);
+		  HOST_WIDE_INT inx = tree_to_shwi (t);
+
+		  if (pctx->maxinx)
+		    pctx->maxinx = inx + 1 == ubi;
+		}
+	    }
+	}
+
+      if (pctx && pctx->inx)
+	{
 	  warning (OPT_Warray_bounds,
 		   "index %E denotes an offset "
 		   "greater than size of %qT",
-			     t, TREE_TYPE (TREE_OPERAND (expr, 0)));
-		}
-	    }
+		   pctx->inx, TREE_TYPE (TREE_OPERAND (expr, 0)));
+
+	  pctx->inx = NULL_TREE;
 	}
 
       t = convert (sizetype, t);
       off = size_binop (MULT_EXPR, TYPE_SIZE_UNIT (TREE_TYPE (expr)), t);
+
       break;
 
     case COMPOUND_EXPR:
       /* Handle static members of volatile structs.  */
       t = TREE_OPERAND (expr, 1);
       gcc_assert (VAR_P (t));
-      return fold_offsetof_1 (t);
+      return fold_offsetof_1 (t, pctx);
 
     default:
       gcc_unreachable ();
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 74d1bc1..72e2f95 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1015,7 +1015,8 @@  extern bool c_dump_tree (void *, tree);
 
 extern void verify_sequence_points (tree);
 
-extern tree fold_offsetof_1 (tree);
+struct offsetof_ctx_t;
+extern tree fold_offsetof_1 (tree, offsetof_ctx_t* = NULL);
 extern tree fold_offsetof (tree);
 
 /* Places where an lvalue, or modifiable lvalue, may be required.
diff --git a/gcc/testsuite/c-c++-common/builtin-offsetof-2.c b/gcc/testsuite/c-c++-common/builtin-offsetof-2.c
new file mode 100644
index 0000000..09979fd
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/builtin-offsetof-2.c
@@ -0,0 +1,157 @@ 
+// { dg-options "-Warray-bounds" }
+// { dg-do compile }
+
+// Test case exercising pr c/67882 - surprising offsetof result
+//   on an invalid array member without diagnostic.
+
+typedef struct A {
+  char a1_1[1][1];
+  char a[];
+} A;
+
+typedef struct A2 {
+  char a1_1[1][1];
+  char c;
+} A2;
+
+typedef struct B {
+  A2 a2_3[2][3];
+  char a1_1[3][5];
+  char a[];
+} B;
+
+// Structures with members that contain flexible array members are
+// an extension accepted by GCC.
+typedef struct C {
+  A a5_7 [5][7];
+  int a;
+} C;
+
+// Structs with a "fake" flexible array member (a GCC extension).
+typedef struct FA0 {
+  int i;
+  char a0 [0];
+} FA0;
+
+typedef struct FA1 {
+  int i;
+  char a1 [1];
+} FA1;
+
+typedef struct FA3 {
+  int i;
+  char a3 [3];
+} FA3;
+
+// A "fake" multidimensional flexible array member deserves special
+// treatment since the offsetof exression yields the same result for
+// references to apparently distinct members (e.g., a5_7[0][7] is
+// at the same offset as a5_7[1][0] which may be surprising).
+typedef struct FA5_7 {
+  int i;
+  char a5_7 [5][7];
+} FA5_7;
+
+static void test (void)
+{
+  // Verify that offsetof references to array elements past the end of
+  // the array member are diagnosed.  As an extension, permit references
+  // to the element just past-the-end of the array.
+
+  int a[] = {
+    __builtin_offsetof (A, a),              // valid
+    __builtin_offsetof (A, a [0]),          // valid
+    __builtin_offsetof (A, a [1]),          // valid
+    __builtin_offsetof (A, a [99]),         // valid
+
+    __builtin_offsetof (A, a1_1 [0][0]),    // valid
+
+    // The following expression is silently accepted as an extension
+    // because it simply forms the equivalent of a just-past-the-end
+    // address.
+    __builtin_offsetof (A, a1_1 [0][1]),    // extension
+
+    __builtin_offsetof (A, a1_1 [1][0]),    // { dg-warning "index" }
+    __builtin_offsetof (A, a1_1 [1][1]),    // { dg-warning "index" }
+
+    __builtin_offsetof (B, a2_3 [0][0].c),           // valid
+    __builtin_offsetof (B, a2_3 [0][0].a1_1 [0][0]), // valid
+
+    // The following expressions are silently accepted as an extension
+    // because they simply form the equivalent of a just-past-the-end
+    // address.
+    __builtin_offsetof (B, a2_3 [1][3]),             // extension
+    __builtin_offsetof (B, a2_3 [0][0].a1_1 [0][1]), // extension
+
+    __builtin_offsetof (B, a2_3 [0][0].a1_1 [1][0]), // { dg-warning "index" }
+    __builtin_offsetof (B, a2_3 [0][0].a1_1 [1][1]), // { dg-warning "index" }
+
+    __builtin_offsetof (B, a2_3 [1][2].a1_1 [0][0]), // valid
+
+    // Forming an offset to the just-past-end element is accepted.
+    __builtin_offsetof (B, a2_3 [1][2].a1_1 [0][1]), // extension
+    __builtin_offsetof (B, a2_3 [1][2].a1_1 [1][0]), // { dg-warning "index" }
+    __builtin_offsetof (B, a2_3 [1][2].a1_1 [1][1]), // { dg-warning "index" }
+
+    // Forming an offset to the just-past-end element is accepted.
+    __builtin_offsetof (B, a2_3 [1][3]),             // exension
+    // ...but these are diagnosed because they dereference a just-path-the-end
+    // element.
+    __builtin_offsetof (B, a2_3 [1][3].a1_1 [0][0]), // { dg-warning "index" }
+    __builtin_offsetof (B, a2_3 [1][3].a1_1 [0][0]), // { dg-warning "index" }
+    __builtin_offsetof (B, a2_3 [1][3].a1_1 [0][1]), // { dg-warning "index" }
+    __builtin_offsetof (B, a2_3 [1][3].a1_1 [1][0]), // { dg-warning "index" }
+    __builtin_offsetof (B, a2_3 [1][3].a1_1 [1][1]), // { dg-warning "index" }
+
+    // Analogous to the case above, these are both diagnosed because they
+    // dereference just-path-the-end elements of the a2_3 array.
+    __builtin_offsetof (B, a2_3 [1][3].c),       // { dg-warning "index" }
+    __builtin_offsetof (B, a2_3 [1][3].c),       // { dg-warning "index" }
+
+    // The following are all invalid because of the reference to a2_3[2].
+    __builtin_offsetof (B, a2_3 [2][0].a1_1 [0][0]), // { dg-warning "index" }
+    __builtin_offsetof (B, a2_3 [2][0].a1_1 [0][1]), // { dg-warning "index" }
+    __builtin_offsetof (B, a2_3 [2][0].a1_1 [1][0]), // { dg-warning "index" }
+    __builtin_offsetof (B, a2_3 [2][0].a1_1 [1][1]), // { dg-warning "index" }
+    __builtin_offsetof (B, a2_3 [2][0].c),           // { dg-warning "index" }
+
+    __builtin_offsetof (C, a5_7 [4][6]),
+    __builtin_offsetof (C, a5_7 [4][6].a),
+    __builtin_offsetof (C, a5_7 [4][6].a [0]),
+    __builtin_offsetof (C, a5_7 [4][6].a [99]),
+
+    __builtin_offsetof (C, a5_7 [4][7]),             // extension
+    // Diagnose the following even though the object whose offset is
+    // computed is a flexible array member.
+    __builtin_offsetof (C, a5_7 [4][7].a),           // { dg-warning "index" }
+    __builtin_offsetof (C, a5_7 [4][7].a [0]),       // { dg-warning "index" }
+    __builtin_offsetof (C, a5_7 [4][7].a [99]),      // { dg-warning "index" }
+
+    // Verify that no diagnostic is issued for offsetof expressions
+    // involving structs where the array has a rank of 1 and is the last
+    // member (e.g., those are treated as flexible array members).
+    __builtin_offsetof (FA0, a0 [0]),
+    __builtin_offsetof (FA0, a0 [1]),
+    __builtin_offsetof (FA0, a0 [99]),
+
+    __builtin_offsetof (FA1, a1 [0]),
+    __builtin_offsetof (FA1, a1 [1]),
+    __builtin_offsetof (FA1, a1 [99]),
+
+    __builtin_offsetof (FA3, a3 [0]),
+    __builtin_offsetof (FA3, a3 [3]),
+    __builtin_offsetof (FA3, a3 [99]),
+
+    __builtin_offsetof (FA5_7, a5_7 [0][0]),
+
+    // Unlike one-dimensional arrays, verify that out-of-bounds references
+    // to arrays with rank of 2 and greater are diagnosed.
+    __builtin_offsetof (FA5_7, a5_7 [0][7]),         // { dg-warning "index" }
+    __builtin_offsetof (FA5_7, a5_7 [1][7]),         // { dg-warning "index" }
+    __builtin_offsetof (FA5_7, a5_7 [5][0]),         // { dg-warning "index" }
+    __builtin_offsetof (FA5_7, a5_7 [5][7]),         // { dg-warning "index" }
+    __builtin_offsetof (FA5_7, a5_7 [99][99])        // { dg-warning "index" }
+  };
+
+  (void)&a;
+}