diff mbox

Asan constructor init order checking

Message ID 20131115135845.GE892@tucnak.redhat.com
State New
Headers show

Commit Message

Jakub Jelinek Nov. 15, 2013, 1:58 p.m. UTC
Hi!

Here is an implementation of init-order checking (at least how I understand
it from looking at libsanitizer and eyeballing what clang emits) plus some
performance improvements.

Previously instrument_derefs ignored stores and loads from VAR_DECLs, that
is of course now not possible generally, because it could be dynamic init
protected globals.  But, as for the improvements, if get_inner_reference
tells us that the access is known to be within bounds of some VAR_DECL,
we can check if we know it must be fine.  Automatic vars in current function
within that function are (at least for now) all known to be accessible,
similarly non-external vars that don't have dynamic initialization.

I had to tweak no-redundant-instrumentation-1.c test a little bit, because
for the foo var accesses instrument_derefs can now find out the
insturmentation is unecessary, similarly for the static tab var that didn't
have dynamic initialization.  Note, some comments in that file seems to be
completely wrong and also the fact that it doesn't instrument high bound
of the memset 4/5 looks like a severe bug (though, of course if memset
isn't expanded inline, but as a library call, it will still be caught by
the runtime library).

2013-11-15  Jakub Jelinek  <jakub@redhat.com>

	* sanitizer.def (BUILT_IN_ASAN_BEFORE_DYNAMIC_INIT,
	BUILT_IN_ASAN_AFTER_DYNAMIC_INIT): New.
	* asan.c (instrument_derefs): Handle also VAR_DECL loads/stores.
	Don't instrument accesses to VAR_DECLs which are known to fit
	into their bounds and the vars are known to have shadow bytes
	indicating allowed access.
	(asan_dynamic_init_call): New function.
	(asan_add_global): If vnode->asan_dynamically_initialized,
	set __has_dynamic_init to 1 instead of 0.
	(initialize_sanitizer_builtins): Add BT_FN_VOID_CONST_PTR var.
	* asan.h (asan_dynamic_init_call): New prototype.
	* cgraph.h (varpool_node): Add asan_dynamically_initialized bitfield.
cp/
	* decl2.c: Include asan.h.
	(one_static_initialization_or_destruction): If -fsanitize=address,
	init is non-NULL and guard is NULL, set
	vnode->asan_dynamically_initialized.
	(do_static_initialization_or_destruction): Call
	__asan_{before,after}_dynamic_init around the static initialization.
testsuite/
	* c-c++-common/asan/no-redundant-instrumentation-1.c: Tweak to avoid
	optimizing away some __asan_report* calls.


	Jakub

Comments

Konstantin Serebryany Nov. 15, 2013, 6:34 p.m. UTC | #1
+samsonov, who wrote the clang part

Do you plan to add tests?
We have four lit-style tests for this (Alexey, that's all, right?):
init-order-atexit.cc
init-order-dlopen.cc
init-order-pthread-create.cc
Linux/initialization-bug-any-order.cc

I think we need at least the basic one in gcc
(Linux/initialization-bug-any-order.cc)

As a routine reminder: I am worried about the maintenance cost of two
different test and build systems,
even if I don't seem to pay for the second one. We need to find a way
to share tests and preferably the build infrastructure.
Ideally we would share the repository as well, but let's start with
something small :)

--kcc


On Fri, Nov 15, 2013 at 5:58 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> Here is an implementation of init-order checking (at least how I understand
> it from looking at libsanitizer and eyeballing what clang emits) plus some
> performance improvements.
>
> Previously instrument_derefs ignored stores and loads from VAR_DECLs, that
> is of course now not possible generally, because it could be dynamic init
> protected globals.  But, as for the improvements, if get_inner_reference
> tells us that the access is known to be within bounds of some VAR_DECL,
> we can check if we know it must be fine.  Automatic vars in current function
> within that function are (at least for now) all known to be accessible,
> similarly non-external vars that don't have dynamic initialization.
>
> I had to tweak no-redundant-instrumentation-1.c test a little bit, because
> for the foo var accesses instrument_derefs can now find out the
> insturmentation is unecessary, similarly for the static tab var that didn't
> have dynamic initialization.  Note, some comments in that file seems to be
> completely wrong and also the fact that it doesn't instrument high bound
> of the memset 4/5 looks like a severe bug (though, of course if memset
> isn't expanded inline, but as a library call, it will still be caught by
> the runtime library).
>
> 2013-11-15  Jakub Jelinek  <jakub@redhat.com>
>
>         * sanitizer.def (BUILT_IN_ASAN_BEFORE_DYNAMIC_INIT,
>         BUILT_IN_ASAN_AFTER_DYNAMIC_INIT): New.
>         * asan.c (instrument_derefs): Handle also VAR_DECL loads/stores.
>         Don't instrument accesses to VAR_DECLs which are known to fit
>         into their bounds and the vars are known to have shadow bytes
>         indicating allowed access.
>         (asan_dynamic_init_call): New function.
>         (asan_add_global): If vnode->asan_dynamically_initialized,
>         set __has_dynamic_init to 1 instead of 0.
>         (initialize_sanitizer_builtins): Add BT_FN_VOID_CONST_PTR var.
>         * asan.h (asan_dynamic_init_call): New prototype.
>         * cgraph.h (varpool_node): Add asan_dynamically_initialized bitfield.
> cp/
>         * decl2.c: Include asan.h.
>         (one_static_initialization_or_destruction): If -fsanitize=address,
>         init is non-NULL and guard is NULL, set
>         vnode->asan_dynamically_initialized.
>         (do_static_initialization_or_destruction): Call
>         __asan_{before,after}_dynamic_init around the static initialization.
> testsuite/
>         * c-c++-common/asan/no-redundant-instrumentation-1.c: Tweak to avoid
>         optimizing away some __asan_report* calls.
>
> --- gcc/sanitizer.def.jj        2013-11-12 11:31:12.000000000 +0100
> +++ gcc/sanitizer.def   2013-11-15 12:25:06.160748091 +0100
> @@ -60,6 +60,12 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_UNRE
>  DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_HANDLE_NO_RETURN,
>                       "__asan_handle_no_return",
>                       BT_FN_VOID, ATTR_TMPURE_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_BEFORE_DYNAMIC_INIT,
> +                     "__asan_before_dynamic_init",
> +                     BT_FN_VOID_CONST_PTR, ATTR_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_AFTER_DYNAMIC_INIT,
> +                     "__asan_after_dynamic_init",
> +                     BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
>
>  /* Thread Sanitizer */
>  DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_INIT, "__tsan_init",
> --- gcc/cp/decl2.c.jj   2013-11-12 23:10:07.000000000 +0100
> +++ gcc/cp/decl2.c      2013-11-15 12:16:24.195466188 +0100
> @@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.
>  #include "splay-tree.h"
>  #include "langhooks.h"
>  #include "c-family/c-ada-spec.h"
> +#include "asan.h"
>
>  extern cpp_reader *parse_in;
>
> @@ -3456,7 +3457,15 @@ one_static_initialization_or_destruction
>    if (initp)
>      {
>        if (init)
> -       finish_expr_stmt (init);
> +       {
> +         finish_expr_stmt (init);
> +         if ((flag_sanitize & SANITIZE_ADDRESS) && guard == NULL_TREE)
> +           {
> +             struct varpool_node *vnode = varpool_get_node (decl);
> +             if (vnode)
> +               vnode->asan_dynamically_initialized = 1;
> +           }
> +       }
>
>        /* If we're using __cxa_atexit, register a function that calls the
>          destructor for the object.  */
> @@ -3498,6 +3507,9 @@ do_static_initialization_or_destruction
>                              tf_warning_or_error);
>    finish_if_stmt_cond (cond, init_if_stmt);
>
> +  if (flag_sanitize & SANITIZE_ADDRESS)
> +    finish_expr_stmt (asan_dynamic_init_call (/*after_p=*/false));
> +
>    node = vars;
>    do {
>      tree decl = TREE_VALUE (node);
> @@ -3546,6 +3558,9 @@ do_static_initialization_or_destruction
>
>    } while (node);
>
> +  if (flag_sanitize & SANITIZE_ADDRESS)
> +    finish_expr_stmt (asan_dynamic_init_call (/*after_p=*/true));
> +
>    /* Finish up the init/destruct if-stmt body.  */
>    finish_then_clause (init_if_stmt);
>    finish_if_stmt (init_if_stmt);
> --- gcc/asan.c.jj       2013-11-15 12:37:07.000000000 +0100
> +++ gcc/asan.c  2013-11-15 13:26:23.648429812 +0100
> @@ -214,7 +214,7 @@ along with GCC; see the file COPYING3.
>         // Name of the module where the global variable is declared.
>         const void *__module_name;
>
> -       // This is always set to NULL for now.
> +       // 1 if it has dynamic initialization, 0 otherwise.
>         uptr __has_dynamic_init;
>       }
>
> @@ -1571,7 +1571,9 @@ instrument_derefs (gimple_stmt_iterator
>      case COMPONENT_REF:
>      case INDIRECT_REF:
>      case MEM_REF:
> +    case VAR_DECL:
>        break;
> +      /* FALLTHRU */
>      default:
>        return;
>      }
> @@ -1585,8 +1587,8 @@ instrument_derefs (gimple_stmt_iterator
>    tree offset;
>    enum machine_mode mode;
>    int volatilep = 0, unsignedp = 0;
> -  get_inner_reference (t, &bitsize, &bitpos, &offset,
> -                      &mode, &unsignedp, &volatilep, false);
> +  tree inner = get_inner_reference (t, &bitsize, &bitpos, &offset,
> +                                   &mode, &unsignedp, &volatilep, false);
>    if (bitpos % (size_in_bytes * BITS_PER_UNIT)
>        || bitsize != size_in_bytes * BITS_PER_UNIT)
>      {
> @@ -1601,6 +1603,34 @@ instrument_derefs (gimple_stmt_iterator
>        return;
>      }
>
> +  if (TREE_CODE (inner) == VAR_DECL
> +      && offset == NULL_TREE
> +      && bitpos >= 0
> +      && DECL_SIZE (inner)
> +      && host_integerp (DECL_SIZE (inner), 0)
> +      && bitpos + bitsize <= tree_low_cst (DECL_SIZE (inner), 0))
> +    {
> +      if (DECL_THREAD_LOCAL_P (inner))
> +       return;
> +      if (!TREE_STATIC (inner))
> +       {
> +         /* Automatic vars in the current function will be always
> +            accessible.  */
> +         if (decl_function_context (inner) == current_function_decl)
> +           return;
> +       }
> +      /* Always instrument external vars, they might be dynamically
> +        initialized.  */
> +      else if (!DECL_EXTERNAL (inner))
> +       {
> +         /* For static vars if they are known not to be dynamically
> +            initialized, they will be always accessible.  */
> +         struct varpool_node *vnode = varpool_get_node (inner);
> +         if (vnode && !vnode->asan_dynamically_initialized)
> +           return;
> +       }
> +    }
> +
>    base = build_fold_addr_expr (t);
>    if (!has_mem_ref_been_instrumented (base, size_in_bytes))
>      {
> @@ -2059,6 +2089,34 @@ transform_statements (void)
>  }
>
>  /* Build
> +   __asan_before_dynamic_init (module_name)
> +   or
> +   __asan_after_dynamic_init ()
> +   call.  */
> +
> +tree
> +asan_dynamic_init_call (bool after_p)
> +{
> +  tree fn = builtin_decl_implicit (after_p
> +                                  ? BUILT_IN_ASAN_AFTER_DYNAMIC_INIT
> +                                  : BUILT_IN_ASAN_BEFORE_DYNAMIC_INIT);
> +  tree module_name_cst = NULL_TREE;
> +  if (!after_p)
> +    {
> +      pretty_printer module_name_pp;
> +      pp_string (&module_name_pp, main_input_filename);
> +
> +      if (shadow_ptr_types[0] == NULL_TREE)
> +       asan_init_shadow_ptr_types ();
> +      module_name_cst = asan_pp_string (&module_name_pp);
> +      module_name_cst = fold_convert (const_ptr_type_node,
> +                                     module_name_cst);
> +    }
> +
> +  return build_call_expr (fn, after_p ? 0 : 1, module_name_cst);
> +}
> +
> +/* Build
>     struct __asan_global
>     {
>       const void *__beg;
> @@ -2147,7 +2205,10 @@ asan_add_global (tree decl, tree type, v
>                           fold_convert (const_ptr_type_node, str_cst));
>    CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE,
>                           fold_convert (const_ptr_type_node, module_name_cst));
> -  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE, build_int_cst (uptr, 0));
> +  struct varpool_node *vnode = varpool_get_node (decl);
> +  int has_dynamic_init = vnode ? vnode->asan_dynamically_initialized : 0;
> +  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE,
> +                         build_int_cst (uptr, has_dynamic_init));
>    init = build_constructor (type, vinner);
>    CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, init);
>  }
> @@ -2164,6 +2225,8 @@ initialize_sanitizer_builtins (void)
>    tree BT_FN_VOID = build_function_type_list (void_type_node, NULL_TREE);
>    tree BT_FN_VOID_PTR
>      = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
> +  tree BT_FN_VOID_CONST_PTR
> +    = build_function_type_list (void_type_node, const_ptr_type_node, NULL_TREE);
>    tree BT_FN_VOID_PTR_PTR
>      = build_function_type_list (void_type_node, ptr_type_node,
>                                 ptr_type_node, NULL_TREE);
> --- gcc/asan.h.jj       2013-11-15 12:37:07.000000000 +0100
> +++ gcc/asan.h  2013-11-15 12:37:18.991783166 +0100
> @@ -27,6 +27,7 @@ extern rtx asan_emit_stack_protection (r
>                                        tree *, int);
>  extern bool asan_protect_global (tree);
>  extern void initialize_sanitizer_builtins (void);
> +extern tree asan_dynamic_init_call (bool);
>
>  /* Alias set for accessing the shadow memory.  */
>  extern alias_set_type asan_shadow_set;
> --- gcc/cgraph.h.jj     2013-11-13 18:32:52.000000000 +0100
> +++ gcc/cgraph.h        2013-11-15 12:05:25.950985500 +0100
> @@ -520,6 +520,11 @@ class GTY((tag ("SYMTAB_VARIABLE"))) var
>  public:
>    /* Set when variable is scheduled to be assembled.  */
>    unsigned output : 1;
> +  /* Set if the variable is dynamically initialized.  Not set for
> +     function local statics or variables that can be initialized in
> +     multiple compilation units (such as template static data members
> +     that need construction).  */
> +  unsigned asan_dynamically_initialized : 1;
>  };
>
>  /* Every top level asm statement is put into a asm_node.  */
> --- gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-1.c.jj 2013-02-18 16:39:54.000000000 +0100
> +++ gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-1.c    2013-11-15 14:16:22.727049197 +0100
> @@ -6,7 +6,7 @@
>  /* { dg-do compile } */
>  /* { dg-skip-if "" { *-*-* } { "*" } { "-O0" } } */
>
> -static char tab[4] = {0};
> +extern char tab[4];
>
>  static int
>  test0 ()
> @@ -27,12 +27,14 @@ test0 ()
>    return t0 + t1;
>  }
>
> -static int
> -test1 ()
> +__attribute__((noinline, noclone)) static int
> +test1 (int i)
>  {
> +  char foo[4] = {};
> +
>    /*__builtin___asan_report_store1 called 1 time here to instrument
>      the initialization.  */
> -  char foo[4] = {1};
> +  foo[i] = 1;
>
>    /*__builtin___asan_report_store1 called 2 times here to instrument
>      the store to the memory region of tab.  */
> @@ -45,7 +47,7 @@ test1 ()
>    /* There are 2 calls to __builtin___asan_report_store1 and 2 calls
>       to __builtin___asan_report_load1 to instrument the store to
>       (subset of) the memory region of tab.  */
> -  __builtin_memcpy (&tab[1], foo, 3);
> +  __builtin_memcpy (&tab[1], foo + i, 3);
>
>    /* This should not generate a __builtin___asan_report_load1 because
>       the reference to tab[1] has been already instrumented above.  */
> @@ -58,7 +60,7 @@ test1 ()
>  int
>  main ()
>  {
> -  return test0 () && test1 ();
> +  return test0 () && test1 (0);
>  }
>
>  /* { dg-final { scan-tree-dump-times "__builtin___asan_report_store1" 7 "asan0" } } */
>
>         Jakub
Jakub Jelinek Nov. 15, 2013, 6:40 p.m. UTC | #2
On Fri, Nov 15, 2013 at 10:34:28PM +0400, Konstantin Serebryany wrote:
> +samsonov, who wrote the clang part
> 
> Do you plan to add tests?

Eventually yes, but likely only in stage3, there is only limited time
left during stage1 and I need to still work on OpenMP elementals next week.

> We have four lit-style tests for this (Alexey, that's all, right?):
> init-order-atexit.cc
> init-order-dlopen.cc
> init-order-pthread-create.cc
> Linux/initialization-bug-any-order.cc
> 
> I think we need at least the basic one in gcc
> (Linux/initialization-bug-any-order.cc)

What should be done is just diff the compiler-rt tests in between the date
where we mass ported most of the tests last time and now, and just port
(where appropriate) the tests to DejaGNU.  It is at most a few hours of
work.

	Jakub
Jakub Jelinek Nov. 15, 2013, 6:46 p.m. UTC | #3
On Fri, Nov 15, 2013 at 10:34:28PM +0400, Konstantin Serebryany wrote:
> +samsonov, who wrote the clang part
> 
> Do you plan to add tests?

OT, what is the -fsanitize=address,use-after-scope doing?  Tried that
and it didn't seem to do anything at all, besides adding some extra
start/end scope markers to the IL, but not actually adding any
instrumentation.  Is that just not done yet?  What are you planning for
that?  For a subset of vars poison them upon entry to the function and
then when entering their scope unpoison them and when leaving scope poison
them again?  GCC right now has just var ={v} {CLOBBER}; statements that
mark vars going out of scope, we could add for asan only add also into scope
markers, make sure they are never optimized away (current CLOBBERs are just
optimization hints and in some cases it is better to optimize them away
rather than e.g. get worse EH code), but because they are used for
optimizations, programs with use-after-scope bugs often get miscompiled by
GCC (well, they have undefined behavior, so miscompiled is a weird term),
it would be nice to have asan complain about those.

	Jakub
Konstantin Serebryany Nov. 15, 2013, 6:49 p.m. UTC | #4
On Fri, Nov 15, 2013 at 10:40 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, Nov 15, 2013 at 10:34:28PM +0400, Konstantin Serebryany wrote:
>> +samsonov, who wrote the clang part
>>
>> Do you plan to add tests?
>
> Eventually yes, but likely only in stage3, there is only limited time
> left during stage1 and I need to still work on OpenMP elementals next week.

sure.

>
>> We have four lit-style tests for this (Alexey, that's all, right?):
>> init-order-atexit.cc
>> init-order-dlopen.cc
>> init-order-pthread-create.cc
>> Linux/initialization-bug-any-order.cc
>>
>> I think we need at least the basic one in gcc
>> (Linux/initialization-bug-any-order.cc)
>
> What should be done is just diff the compiler-rt tests in between the date
> where we mass ported most of the tests last time and now, and just port
> (where appropriate) the tests to DejaGNU.  It is at most a few hours of
> work.

That's never like this.
Half a year later we'll change something in the run-time and will
update tests in compiler-rt accordingly.
Then two more months later I'll try to make a merge to gcc and will
find that I have to fix the tests there too.
At that point I'll be in different context (and, besides, I am not
that familiar with dejagnu).
That's what I call doubled maintenance cost.

The more tests we have in gcc (and we do want to have more tests) the
higher is the cost.

>
>         Jakub
Konstantin Serebryany Nov. 15, 2013, 6:55 p.m. UTC | #5
On Fri, Nov 15, 2013 at 10:46 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Fri, Nov 15, 2013 at 10:34:28PM +0400, Konstantin Serebryany wrote:
>> +samsonov, who wrote the clang part
>>
>> Do you plan to add tests?
>
> OT, what is the -fsanitize=address,use-after-scope doing?  Tried that
> and it didn't seem to do anything at all, besides adding some extra
> start/end scope markers to the IL, but not actually adding any
> instrumentation.  Is that just not done yet?  What are you planning for

Yes, this one is not ready yet. It handles only the simplest cases and
is actually unstable yet.
The largest piece of work is in the compiler:  make the frontend and
inliner emit these CLOBBER
 statements and teach other optimizations not to mess around with them.
The run-time support is more or less straightforward, although it has
to be tuned for performance.

> that?  For a subset of vars poison them upon entry to the function and
> then when entering their scope unpoison them and when leaving scope poison
> them again?

right.

>  GCC right now has just var ={v} {CLOBBER}; statements that
> mark vars going out of scope, we could add for asan only add also into scope
> markers, make sure they are never optimized away (current CLOBBERs are just
> optimization hints and in some cases it is better to optimize them away
> rather than e.g. get worse EH code), but because they are used for
> optimizations, programs with use-after-scope bugs often get miscompiled by
> GCC (well, they have undefined behavior, so miscompiled is a weird term),
> it would be nice to have asan complain about those.

We've been bitten by these "legal miscompiles" quite a few times.
that's exactly why we started doing this thing. Just haven't finished yet.


>
>         Jakub
Dodji Seketeli Nov. 22, 2013, 3:38 p.m. UTC | #6
Hello,

Jakub Jelinek <jakub@redhat.com> writes:

> --- gcc/cgraph.h.jj	2013-11-13 18:32:52.000000000 +0100
> +++ gcc/cgraph.h	2013-11-15 12:05:25.950985500 +0100
> @@ -520,6 +520,11 @@ class GTY((tag ("SYMTAB_VARIABLE"))) var
>  public:
>    /* Set when variable is scheduled to be assembled.  */
>    unsigned output : 1;
> +  /* Set if the variable is dynamically initialized.  Not set for
> +     function local statics or variables that can be initialized in
> +     multiple compilation units (such as template static data members
> +     that need construction).  */
> +  unsigned asan_dynamically_initialized : 1;
>  };

Maybe this could just be called dynamically_initialized?  It's just used
by asan today, but it looks like an information that could be used more
generally, independently from asan.

>  
>        /* If we're using __cxa_atexit, register a function that calls the
>  	 destructor for the object.  */
> @@ -3498,6 +3507,9 @@ do_static_initialization_or_destruction
>  			     tf_warning_or_error);
>    finish_if_stmt_cond (cond, init_if_stmt);
>  
> +  if (flag_sanitize & SANITIZE_ADDRESS)
> +    finish_expr_stmt (asan_dynamic_init_call (/*after_p=*/false));
> +

I guess this spot could use some comment referring to the comment of
asan_globals.cc:__asan_before_dynamic_init from libsanitizer.  Basically
saying that we are emitting a call to __asan_before_dynamic_init to
poison all dynamically initialized global variables not defined in this
TU, so that a dynamic initializer for a global variable is only allowed
to touch the global variables from this current TU.  This comment could
be valuable when chasing a bug about this a couple of months from now
when we forget about how this works again.

And then, similarly ...

> @@ -3546,6 +3558,9 @@ do_static_initialization_or_destruction
>  
>    } while (node);
>  
> +  if (flag_sanitize & SANITIZE_ADDRESS)
> +    finish_expr_stmt (asan_dynamic_init_call (/*after_p=*/true));
> +

... this spot could also use some comment referring to the comment of
asan_global.cc:__asan_after_dynamic_init, saying that because the
initializers of globals must have run by now (they are emitted by
one_static_initialization_or_destruction that has been invoked before
this point and after the point above) we are un-poisoning all
dynamically initialized global variables.

Also, do we have some tests for this?  I am not sure how I'd write
multi-tu dejagnu tests for this myself though ;-)

Other than that, LGTM.

Thanks.
Dodji Seketeli Nov. 22, 2013, 3:42 p.m. UTC | #7
Dodji Seketeli <dodji@redhat.com> writes:

> Also, do we have some tests for this?  I am not sure how I'd write
> multi-tu dejagnu tests for this myself though ;-)

Woops, I have just seen the sub-thread about the tests with Konstantin,
you and Alexey.  Sorry.

Cheers.
Jakub Jelinek Nov. 22, 2013, 3:53 p.m. UTC | #8
On Fri, Nov 22, 2013 at 04:38:58PM +0100, Dodji Seketeli wrote:
> Jakub Jelinek <jakub@redhat.com> writes:
> 
> > --- gcc/cgraph.h.jj	2013-11-13 18:32:52.000000000 +0100
> > +++ gcc/cgraph.h	2013-11-15 12:05:25.950985500 +0100
> > @@ -520,6 +520,11 @@ class GTY((tag ("SYMTAB_VARIABLE"))) var
> >  public:
> >    /* Set when variable is scheduled to be assembled.  */
> >    unsigned output : 1;
> > +  /* Set if the variable is dynamically initialized.  Not set for
> > +     function local statics or variables that can be initialized in
> > +     multiple compilation units (such as template static data members
> > +     that need construction).  */
> > +  unsigned asan_dynamically_initialized : 1;
> >  };
> 
> Maybe this could just be called dynamically_initialized?  It's just used
> by asan today, but it looks like an information that could be used more
> generally, independently from asan.

I used that name initially, but then changed it, because it actually is
quite asan specific.  E.g. template static data members are dynamically
initialized, but we intentionally don't set asan_dynamically_initialized
on those, because their initializer can be called from multiple CUs, there
is no CU that owns the variable.

> >        /* If we're using __cxa_atexit, register a function that calls the
> >  	 destructor for the object.  */
> > @@ -3498,6 +3507,9 @@ do_static_initialization_or_destruction
> >  			     tf_warning_or_error);
> >    finish_if_stmt_cond (cond, init_if_stmt);
> >  
> > +  if (flag_sanitize & SANITIZE_ADDRESS)
> > +    finish_expr_stmt (asan_dynamic_init_call (/*after_p=*/false));
> > +

Will add the comments.

> Also, do we have some tests for this?  I am not sure how I'd write
> multi-tu dejagnu tests for this myself though ;-)

I've postponed tests for stage3, when I find spare time I'll try to
port libsanitizer tests that are applicable to gcc over to dejagnu,
plus perhaps add new tests as needed.

	Jakub
Dodji Seketeli Nov. 22, 2013, 4:15 p.m. UTC | #9
Jakub Jelinek <jakub@redhat.com> writes:

> On Fri, Nov 22, 2013 at 04:38:58PM +0100, Dodji Seketeli wrote:
>> Jakub Jelinek <jakub@redhat.com> writes:
>> 
>> > --- gcc/cgraph.h.jj	2013-11-13 18:32:52.000000000 +0100
>> > +++ gcc/cgraph.h	2013-11-15 12:05:25.950985500 +0100
>> > @@ -520,6 +520,11 @@ class GTY((tag ("SYMTAB_VARIABLE"))) var
>> >  public:
>> >    /* Set when variable is scheduled to be assembled.  */
>> >    unsigned output : 1;
>> > +  /* Set if the variable is dynamically initialized.  Not set for
>> > +     function local statics or variables that can be initialized in
>> > +     multiple compilation units (such as template static data members
>> > +     that need construction).  */
>> > +  unsigned asan_dynamically_initialized : 1;
>> >  };
>> 
>> Maybe this could just be called dynamically_initialized?  It's just used
>> by asan today, but it looks like an information that could be used more
>> generally, independently from asan.
>
> I used that name initially, but then changed it, because it actually is
> quite asan specific.  E.g. template static data members are dynamically
> initialized, but we intentionally don't set asan_dynamically_initialized
> on those, because their initializer can be called from multiple CUs, there
> is no CU that owns the variable.

I see.  OK, I don't feel strongly about this anyway.  Thanks for the
clarification.

>
>> >        /* If we're using __cxa_atexit, register a function that calls the
>> >  	 destructor for the object.  */
>> > @@ -3498,6 +3507,9 @@ do_static_initialization_or_destruction
>> >  			     tf_warning_or_error);
>> >    finish_if_stmt_cond (cond, init_if_stmt);
>> >  
>> > +  if (flag_sanitize & SANITIZE_ADDRESS)
>> > +    finish_expr_stmt (asan_dynamic_init_call (/*after_p=*/false));
>> > +
>
> Will add the comments.
>
>> Also, do we have some tests for this?  I am not sure how I'd write
>> multi-tu dejagnu tests for this myself though ;-)
>
> I've postponed tests for stage3, when I find spare time I'll try to
> port libsanitizer tests that are applicable to gcc over to dejagnu,
> plus perhaps add new tests as needed.

OK, thanks.

The patch is OK to commit, as far as I am concerned.

Thanks.
Jakub Jelinek Nov. 22, 2013, 5:54 p.m. UTC | #10
On Fri, Nov 22, 2013 at 09:46:47PM +0400, Alexander Potapenko wrote:
> Side note: init order checking isn't supposed to work on Darwin, at least
> for now.

I'll let Darwin maintainers/users if any to worry about it, that said,
does it have any compiler side effects (like, that dynamic_init shouldn't
be ever set in the array registered by __asan_register_globals on Darwin,
or __asan_{before,after}_dynamic_init shouldn't be called, or is that purely
a library thing (it will just ignore it or something similar)?
I'd hope the latter, so then it would just be a matter of conditionalizing
tests for it on !*-*-darwin* if/when they are added.

	Jakub
diff mbox

Patch

--- gcc/sanitizer.def.jj	2013-11-12 11:31:12.000000000 +0100
+++ gcc/sanitizer.def	2013-11-15 12:25:06.160748091 +0100
@@ -60,6 +60,12 @@  DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_UNRE
 DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_HANDLE_NO_RETURN,
 		      "__asan_handle_no_return",
 		      BT_FN_VOID, ATTR_TMPURE_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_BEFORE_DYNAMIC_INIT,
+		      "__asan_before_dynamic_init",
+		      BT_FN_VOID_CONST_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_ASAN_AFTER_DYNAMIC_INIT,
+		      "__asan_after_dynamic_init",
+		      BT_FN_VOID, ATTR_NOTHROW_LEAF_LIST)
 
 /* Thread Sanitizer */
 DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_INIT, "__tsan_init", 
--- gcc/cp/decl2.c.jj	2013-11-12 23:10:07.000000000 +0100
+++ gcc/cp/decl2.c	2013-11-15 12:16:24.195466188 +0100
@@ -49,6 +49,7 @@  along with GCC; see the file COPYING3.
 #include "splay-tree.h"
 #include "langhooks.h"
 #include "c-family/c-ada-spec.h"
+#include "asan.h"
 
 extern cpp_reader *parse_in;
 
@@ -3456,7 +3457,15 @@  one_static_initialization_or_destruction
   if (initp)
     {
       if (init)
-	finish_expr_stmt (init);
+	{
+	  finish_expr_stmt (init);
+	  if ((flag_sanitize & SANITIZE_ADDRESS) && guard == NULL_TREE)
+	    {
+	      struct varpool_node *vnode = varpool_get_node (decl);
+	      if (vnode)
+		vnode->asan_dynamically_initialized = 1;
+	    }
+	}
 
       /* If we're using __cxa_atexit, register a function that calls the
 	 destructor for the object.  */
@@ -3498,6 +3507,9 @@  do_static_initialization_or_destruction
 			     tf_warning_or_error);
   finish_if_stmt_cond (cond, init_if_stmt);
 
+  if (flag_sanitize & SANITIZE_ADDRESS)
+    finish_expr_stmt (asan_dynamic_init_call (/*after_p=*/false));
+
   node = vars;
   do {
     tree decl = TREE_VALUE (node);
@@ -3546,6 +3558,9 @@  do_static_initialization_or_destruction
 
   } while (node);
 
+  if (flag_sanitize & SANITIZE_ADDRESS)
+    finish_expr_stmt (asan_dynamic_init_call (/*after_p=*/true));
+
   /* Finish up the init/destruct if-stmt body.  */
   finish_then_clause (init_if_stmt);
   finish_if_stmt (init_if_stmt);
--- gcc/asan.c.jj	2013-11-15 12:37:07.000000000 +0100
+++ gcc/asan.c	2013-11-15 13:26:23.648429812 +0100
@@ -214,7 +214,7 @@  along with GCC; see the file COPYING3.
        // Name of the module where the global variable is declared.
        const void *__module_name;
 
-       // This is always set to NULL for now.
+       // 1 if it has dynamic initialization, 0 otherwise.
        uptr __has_dynamic_init;
      }
 
@@ -1571,7 +1571,9 @@  instrument_derefs (gimple_stmt_iterator
     case COMPONENT_REF:
     case INDIRECT_REF:
     case MEM_REF:
+    case VAR_DECL:
       break;
+      /* FALLTHRU */
     default:
       return;
     }
@@ -1585,8 +1587,8 @@  instrument_derefs (gimple_stmt_iterator
   tree offset;
   enum machine_mode mode;
   int volatilep = 0, unsignedp = 0;
-  get_inner_reference (t, &bitsize, &bitpos, &offset,
-		       &mode, &unsignedp, &volatilep, false);
+  tree inner = get_inner_reference (t, &bitsize, &bitpos, &offset,
+				    &mode, &unsignedp, &volatilep, false);
   if (bitpos % (size_in_bytes * BITS_PER_UNIT)
       || bitsize != size_in_bytes * BITS_PER_UNIT)
     {
@@ -1601,6 +1603,34 @@  instrument_derefs (gimple_stmt_iterator
       return;
     }
 
+  if (TREE_CODE (inner) == VAR_DECL
+      && offset == NULL_TREE
+      && bitpos >= 0
+      && DECL_SIZE (inner)
+      && host_integerp (DECL_SIZE (inner), 0)
+      && bitpos + bitsize <= tree_low_cst (DECL_SIZE (inner), 0))
+    {
+      if (DECL_THREAD_LOCAL_P (inner))
+	return;
+      if (!TREE_STATIC (inner))
+	{
+	  /* Automatic vars in the current function will be always
+	     accessible.  */
+	  if (decl_function_context (inner) == current_function_decl)
+	    return;
+	}
+      /* Always instrument external vars, they might be dynamically
+	 initialized.  */
+      else if (!DECL_EXTERNAL (inner))
+	{
+	  /* For static vars if they are known not to be dynamically
+	     initialized, they will be always accessible.  */
+	  struct varpool_node *vnode = varpool_get_node (inner);
+	  if (vnode && !vnode->asan_dynamically_initialized)
+	    return;
+	}
+    }
+
   base = build_fold_addr_expr (t);
   if (!has_mem_ref_been_instrumented (base, size_in_bytes))
     {
@@ -2059,6 +2089,34 @@  transform_statements (void)
 }
 
 /* Build
+   __asan_before_dynamic_init (module_name)
+   or
+   __asan_after_dynamic_init ()
+   call.  */
+
+tree
+asan_dynamic_init_call (bool after_p)
+{
+  tree fn = builtin_decl_implicit (after_p
+				   ? BUILT_IN_ASAN_AFTER_DYNAMIC_INIT
+				   : BUILT_IN_ASAN_BEFORE_DYNAMIC_INIT);
+  tree module_name_cst = NULL_TREE;
+  if (!after_p)
+    {
+      pretty_printer module_name_pp;
+      pp_string (&module_name_pp, main_input_filename);
+
+      if (shadow_ptr_types[0] == NULL_TREE)
+	asan_init_shadow_ptr_types ();
+      module_name_cst = asan_pp_string (&module_name_pp);
+      module_name_cst = fold_convert (const_ptr_type_node,
+				      module_name_cst);
+    }
+
+  return build_call_expr (fn, after_p ? 0 : 1, module_name_cst);
+}
+
+/* Build
    struct __asan_global
    {
      const void *__beg;
@@ -2147,7 +2205,10 @@  asan_add_global (tree decl, tree type, v
 			  fold_convert (const_ptr_type_node, str_cst));
   CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE,
 			  fold_convert (const_ptr_type_node, module_name_cst));
-  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE, build_int_cst (uptr, 0));
+  struct varpool_node *vnode = varpool_get_node (decl);
+  int has_dynamic_init = vnode ? vnode->asan_dynamically_initialized : 0;
+  CONSTRUCTOR_APPEND_ELT (vinner, NULL_TREE,
+			  build_int_cst (uptr, has_dynamic_init));
   init = build_constructor (type, vinner);
   CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, init);
 }
@@ -2164,6 +2225,8 @@  initialize_sanitizer_builtins (void)
   tree BT_FN_VOID = build_function_type_list (void_type_node, NULL_TREE);
   tree BT_FN_VOID_PTR
     = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE);
+  tree BT_FN_VOID_CONST_PTR
+    = build_function_type_list (void_type_node, const_ptr_type_node, NULL_TREE);
   tree BT_FN_VOID_PTR_PTR
     = build_function_type_list (void_type_node, ptr_type_node,
 				ptr_type_node, NULL_TREE);
--- gcc/asan.h.jj	2013-11-15 12:37:07.000000000 +0100
+++ gcc/asan.h	2013-11-15 12:37:18.991783166 +0100
@@ -27,6 +27,7 @@  extern rtx asan_emit_stack_protection (r
 				       tree *, int);
 extern bool asan_protect_global (tree);
 extern void initialize_sanitizer_builtins (void);
+extern tree asan_dynamic_init_call (bool);
 
 /* Alias set for accessing the shadow memory.  */
 extern alias_set_type asan_shadow_set;
--- gcc/cgraph.h.jj	2013-11-13 18:32:52.000000000 +0100
+++ gcc/cgraph.h	2013-11-15 12:05:25.950985500 +0100
@@ -520,6 +520,11 @@  class GTY((tag ("SYMTAB_VARIABLE"))) var
 public:
   /* Set when variable is scheduled to be assembled.  */
   unsigned output : 1;
+  /* Set if the variable is dynamically initialized.  Not set for
+     function local statics or variables that can be initialized in
+     multiple compilation units (such as template static data members
+     that need construction).  */
+  unsigned asan_dynamically_initialized : 1;
 };
 
 /* Every top level asm statement is put into a asm_node.  */
--- gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-1.c.jj	2013-02-18 16:39:54.000000000 +0100
+++ gcc/testsuite/c-c++-common/asan/no-redundant-instrumentation-1.c	2013-11-15 14:16:22.727049197 +0100
@@ -6,7 +6,7 @@ 
 /* { dg-do compile } */
 /* { dg-skip-if "" { *-*-* } { "*" } { "-O0" } } */
 
-static char tab[4] = {0};
+extern char tab[4];
 
 static int
 test0 ()
@@ -27,12 +27,14 @@  test0 ()
   return t0 + t1;
 }
 
-static int
-test1 ()
+__attribute__((noinline, noclone)) static int
+test1 (int i)
 {
+  char foo[4] = {};
+  
   /*__builtin___asan_report_store1 called 1 time here to instrument
     the initialization.  */
-  char foo[4] = {1}; 
+  foo[i] = 1;
 
   /*__builtin___asan_report_store1 called 2 times here to instrument
     the store to the memory region of tab.  */
@@ -45,7 +47,7 @@  test1 ()
   /* There are 2 calls to __builtin___asan_report_store1 and 2 calls
      to __builtin___asan_report_load1 to instrument the store to
      (subset of) the memory region of tab.  */
-  __builtin_memcpy (&tab[1], foo, 3);
+  __builtin_memcpy (&tab[1], foo + i, 3);
 
   /* This should not generate a __builtin___asan_report_load1 because
      the reference to tab[1] has been already instrumented above.  */
@@ -58,7 +60,7 @@  test1 ()
 int
 main ()
 {
-  return test0 () && test1 ();
+  return test0 () && test1 (0);
 }
 
 /* { dg-final { scan-tree-dump-times "__builtin___asan_report_store1" 7 "asan0" } } */