diff mbox

[ubsan] Add VLA bound instrumentation

Message ID 20131030145253.GB31396@redhat.com
State New
Headers show

Commit Message

Marek Polacek Oct. 30, 2013, 2:52 p.m. UTC
On Fri, Oct 25, 2013 at 03:04:41PM -0400, Jason Merrill wrote:
> >I'm sorry, you want me to move the c++1y VLA check into
> >compute_array_index_type, or just do the ubsan instrumentation in
> >there?  Thanks,
> 
> Both.

Unfortunately, I'm having quite a lot of trouble with side-effects. :(
For e.g.
int x = 1;
int a[++x];

with the following hunk


we generate

  int x = 1;
  int a[0:(sizetype) SAVE_EXPR <D.2143>];

  <<cleanup_point   int x = 1;>>;
  <<cleanup_point <<< Unknown tree: expr_stmt
  if (SAVE_EXPR < ++x> <= 0)
    {   
      __builtin___ubsan_handle_vla_bound_not_positive (&*.Lubsan_data0, (unsigned long) SAVE_EXPR < ++x>);
    }   
  else
    {   
      0   
    }, (void) SAVE_EXPR < ++x>; >>>>>;
    ssizetype D.2143;
  <<cleanup_point <<< Unknown tree: expr_stmt
  (void) (D.2143 = (ssizetype)  ++x + -1) >>>>>;
  <<cleanup_point   int a[0:(sizetype) SAVE_EXPR <D.2143>];>>;

that is, x is incremented twice and that is wrong.

Is it possible to tell "x has already been evaluated, don't evaluate
it again" so that the x isn't incremented in the cleanup_point?

Or, would you, please, have some other advice?  I've been looking into this
for quite some time now, but haven't been able to come up with anything
better than moving the checks back to create_array_type_for_decl, where it
all started ;).

	Marek

Comments

Jason Merrill Oct. 30, 2013, 3:56 p.m. UTC | #1
On 10/30/2013 10:52 AM, Marek Polacek wrote:
> +         if ((flag_sanitize & SANITIZE_VLA)
> +             && !processing_template_decl

You don't need to check processing_template_decl; the template case was 
already handled above.

> +             tree x = cp_save_expr (size);
> +             x = build2 (COMPOUND_EXPR, TREE_TYPE (x),
> +                         ubsan_instrument_vla (input_location, x), x);
> +             finish_expr_stmt (x);

Saving 'size' here doesn't help since it's already been used above. 
Could you use itype instead of size here?

Jason
Marek Polacek Oct. 30, 2013, 4:15 p.m. UTC | #2
On Wed, Oct 30, 2013 at 11:56:25AM -0400, Jason Merrill wrote:
> On 10/30/2013 10:52 AM, Marek Polacek wrote:
> >+         if ((flag_sanitize & SANITIZE_VLA)
> >+             && !processing_template_decl
> 
> You don't need to check processing_template_decl; the template case
> was already handled above.

Right, removed.
 
> >+             tree x = cp_save_expr (size);
> >+             x = build2 (COMPOUND_EXPR, TREE_TYPE (x),
> >+                         ubsan_instrument_vla (input_location, x), x);
> >+             finish_expr_stmt (x);
> 
> Saving 'size' here doesn't help since it's already been used above.
> Could you use itype instead of size here?

I already experimented with that and I think I can't, since we call
the finish_expr_stmt too soon, which results in:

    int x = 1;
    int a[0:(sizetype) SAVE_EXPR <D.2143>];
  
    <<cleanup_point   int x = 1;>>;
    <<cleanup_point <<< Unknown tree: expr_stmt
    if (SAVE_EXPR <D.2143> <= 0)
      {   
        __builtin___ubsan_handle_vla_bound_not_positive (&*.Lubsan_data0, (unsigned long) SAVE_EXPR <D.2143>);
      }   
    else
      {   
        0   
      }, (void) SAVE_EXPR <D.2143>; >>>>>;
      ssizetype D.2143;
    <<cleanup_point <<< Unknown tree: expr_stmt
    (void) (D.2143 = (ssizetype)  ++x + -1) >>>>>;

and that ICEs in gimplify_var_or_parm_decl, presumably because the
if (SAVE_EXPR <D.2143> <= 0) { ... } should be emitted *after* that
cleanup_point.  When we generated the C++1y check in cp_finish_decl,
we emitted the check after the cleanup_point, and everything was OK.
I admit I don't understand the cleanup_points very much and I don't
know exactly where they are coming from, because normally I don't see
them coming out of C FE. :)  Thanks.

	Marek
Mike Stump Oct. 30, 2013, 8:24 p.m. UTC | #3
On Oct 30, 2013, at 9:15 AM, Marek Polacek <polacek@redhat.com> wrote:
> I admit I don't understand the cleanup_points very much and I don't
> know exactly where they are coming from

So, here is the mental model…  and how it is related to the standard.  C++ mandates that destructors for objects and temporary objects run no sooner than a certain place, and no later than another place.  In the implementation, we choose a single point to run them, and use a cleanup point as the embodiment of when destructors run.  For example:

cleanup (a + cleanup (b - c))

means generate this:

a
b
c
-
dtors for things related to b-c
+
dtors for things related to a+ (b-c)

that's it.  Pretty simple.  Now, cute little details, once you get past the simplicity, would be things like, if you run the cleanups for b-c, at the first dtor line above, do you also run those same things at the lower point?  That answer is no, they only run once.  If one takes an exception out of that region, does the cleanup action run?  That answer is yes.  Lots of other possible questions like this, all with fairly simple, easy to understand answers.  Just ask.

Now, some advanced topics…  So, one thing you discover, if you _add_ a cleanup point into an expression, it will run those actions sooner that they would have run, if you had not.  One cannot meet the requirements of the language standard and just arbitrarily add cleanup points.  However, constructs beyond the language standard, say ({ s1; s2; s3; }) + b;, one discovers that the implementation is free to decide if there is a cleanup point for ({ }) or not.  The language standard places no requirements on such code, and this is why we can decide.

decl cleanups are strongly related to these sorts of cleanups, but lie just outside (enclosing).  I'll note their existence for completeness.  See CLEANUP_STMT for these.
Marek Polacek Oct. 30, 2013, 10:15 p.m. UTC | #4
Thanks Mike.

I had a quick look at the CLEANUP_STMT and cp-tree.def says
"A CLEANUP_STMT marks the point at which a declaration is fully
constructed.", while doc says
"Used to represent an action that should take place upon exit from the
enclosing scope.  Typically, these actions are calls to destructors for
local objects."  Huh?  So, how come it e.g. initializes variables, and on
the other hand it should run dtors?  I'm baffled (but it's too late for me
to think clearly ;)).

	Marek
Mike Stump Oct. 30, 2013, 10:41 p.m. UTC | #5
On Oct 30, 2013, at 3:15 PM, Marek Polacek <polacek@redhat.com> wrote:
> I had a quick look at the CLEANUP_STMT and cp-tree.def says
> "A CLEANUP_STMT marks the point at which a declaration is fully
> constructed.", while doc says
> "Used to represent an action that should take place upon exit from the
> enclosing scope.  Typically, these actions are calls to destructors for
> local objects."  Huh?  So, how come it e.g. initializes variables, and on
> the other hand it should run dtors?  I'm baffled (but it's too late for me
> to think clearly ;)).

The dtors only run, after the ctors run.  We mark where the ctors finish spot, as the _start_ of the region for which we have to clean up.  Really, the cleanup has nothing to do with ctors.  You can have dtors, without any ctors, or ctors, without any dtors.

{
  decl d;
  s;
}

transforms into:

<-----  start of lifetime of the storage for d
ctor(d)
<-----  start of lifetime of the fully constructed object d
s;
<-----  end of lifetime of fully constructed object d
dtor(d)
<-----  end of the storage of d

CLEANUP_STMT documents when the region protected by the cleanup starts.  One want to describe that region is, the end of the ctors, if any, else after the storage is allocated.  In the above, that is the second <---- spot.

Now, in the trees, the above is decl d; ctors; CLEANUP_STMT (s, dtors, d).

s is the region for which the cleanups are active for.  dtors is the cleanup to perform on transfer out of that region, and d is the decl related to the actions in dtors.
Jason Merrill Oct. 31, 2013, 1:10 a.m. UTC | #6
On 10/30/2013 12:15 PM, Marek Polacek wrote:
> On Wed, Oct 30, 2013 at 11:56:25AM -0400, Jason Merrill wrote:
>> Saving 'size' here doesn't help since it's already been used above.
>> Could you use itype instead of size here?
>
> I already experimented with that and I think I can't, since we call
> the finish_expr_stmt too soon, which results in:
>
>      int x = 1;
>      int a[0:(sizetype) SAVE_EXPR <D.2143>];
>
>      <<cleanup_point   int x = 1;>>;
>      <<cleanup_point <<< Unknown tree: expr_stmt
>      if (SAVE_EXPR <D.2143> <= 0)
>        {
>          __builtin___ubsan_handle_vla_bound_not_positive (&*.Lubsan_data0, (unsigned long) SAVE_EXPR <D.2143>);
>        }
>      else
>        {
>          0
>        }, (void) SAVE_EXPR <D.2143>; >>>>>;
>        ssizetype D.2143;
>      <<cleanup_point <<< Unknown tree: expr_stmt
>      (void) (D.2143 = (ssizetype)  ++x + -1) >>>>>;

Ah, looks like you're getting an unfortunate interaction with 
stabilize_vla_size, which is replacing the contents of the SAVE_EXPR 
with a reference to a variable that isn't initialized yet.  Perhaps we 
should move the stabilize_vla_size call into compute_array_index_type, too.

> and that ICEs in gimplify_var_or_parm_decl, presumably because the
> if (SAVE_EXPR <D.2143> <= 0) { ... } should be emitted *after* that
> cleanup_point.  When we generated the C++1y check in cp_finish_decl,
> we emitted the check after the cleanup_point, and everything was OK.
> I admit I don't understand the cleanup_points very much and I don't
> know exactly where they are coming from, because normally I don't see
> them coming out of C FE. :)

You can ignore the cleanup_points; they just wrap every full-expression.

Jason
Marek Polacek Oct. 31, 2013, 9:40 a.m. UTC | #7
On Wed, Oct 30, 2013 at 03:41:53PM -0700, Mike Stump wrote:
> The dtors only run, after the ctors run.  We mark where the ctors finish spot, as the _start_ of the region for which we have to clean up.  Really, the cleanup has nothing to do with ctors.  You can have dtors, without any ctors, or ctors, without any dtors.
> 
> {
>   decl d;
>   s;
> }
> 
> transforms into:
> 
> <-----  start of lifetime of the storage for d
> ctor(d)
> <-----  start of lifetime of the fully constructed object d
> s;
> <-----  end of lifetime of fully constructed object d
> dtor(d)
> <-----  end of the storage of d
> 
> CLEANUP_STMT documents when the region protected by the cleanup starts.  One want to describe that region is, the end of the ctors, if any, else after the storage is allocated.  In the above, that is the second <---- spot.
> 
> Now, in the trees, the above is decl d; ctors; CLEANUP_STMT (s, dtors, d).
> 
> s is the region for which the cleanups are active for.  dtors is the cleanup to perform on transfer out of that region, and d is the decl related to the actions in dtors.

I see now.  Thanks very much, Mike.

	Marek
diff mbox

Patch

--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -8394,6 +8382,18 @@  compute_array_index_type (tree name, tree size, tsubst_flags_t com
              if (found)
                itype = variable_size (fold (newitype));
            }
+
+         if ((flag_sanitize & SANITIZE_VLA)
+             && !processing_template_decl
+             /* From C++1y onwards, we throw an exception on a negative
+                length size of an array; see above  */
+             && cxx_dialect < cxx1y)
+           {
+             tree x = cp_save_expr (size);
+             x = build2 (COMPOUND_EXPR, TREE_TYPE (x),
+                         ubsan_instrument_vla (input_location, x), x);
+             finish_expr_stmt (x);
+           }
        }
       /* Make sure that there was no overflow when creating to a signed
         index type.  (For example, on a 32-bit machine, an array with