diff mbox series

asan: Mark instrumented vars addressable [PR102656]

Message ID Yg9gJavTtpJo7NxH@tucnak
State New
Headers show
Series asan: Mark instrumented vars addressable [PR102656] | expand

Commit Message

Jakub Jelinek Feb. 18, 2022, 9 a.m. UTC
Hi!

We ICE on the following testcase, because the asan1 pass decides to
instrument
  <retval>.x = 0;
and does that by
  _13 = &<retval>.x;
  .ASAN_CHECK (7, _13, 4, 4);
  <retval>.x = 0;
and later sanopt pass turns that into:
  _39 = (unsigned long) &<retval>.x;
  _40 = _39 >> 3;
  _41 = _40 + 2147450880;
  _42 = (signed char *) _41;
  _43 = *_42;
  _44 = _43 != 0;
  _45 = _39 & 7;
  _46 = (signed char) _45;
  _47 = _46 + 3;
  _48 = _47 >= _43;
  _49 = _44 & _48;
  if (_49 != 0)
    goto <bb 10>; [0.05%]
  else
    goto <bb 9>; [99.95%]

  <bb 10> [local count: 536864]:
  __builtin___asan_report_store4 (_39);
  
  <bb 9> [local count: 1073741824]:
  <retval>.x = 0;
The problem is during expansion, <retval> isn't marked TREE_ADDRESSABLE,
even when we take its address in (unsigned long) &<retval>.x.

Now, instrument_derefs has code to avoid the instrumentation altogether
if we can prove the access is within bounds of an automatic variable in the
current function and the var isn't TREE_ADDRESSABLE (or we don't instrument
use after scope), but we do it solely for VAR_DECLs.

I think we should treat RESULT_DECLs exactly like that too, which is what
the following patch does.  I must say I'm unsure about PARM_DECLs, those can
have different cases, either they are fully or partially passed in
registers, then if we take parameter's address, they are in a local copy
inside of a function and so work like those automatic vars.  But if they
are fully passed in memory, we typically just take address of the slot
and in that case they live in the caller's frame.  It is true we don't
(can't) put any asan padding in between the arguments, so all asan could
detect in that case is if caller passes fewer on stack arguments or smaller
arguments than callee accepts.  Anyway, as I'm unsure, I haven't added
PARM_DECLs to that case.

And another thing is, when we actually build_fold_addr_expr, we need to
mark_addressable the inner if it isn't addressable already.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-02-18  Jakub Jelinek  <jakub@redhat.com>

	PR sanitizer/102656
	* asan.cc (instrument_derefs): If inner is a RESULT_DECL and access is
	known to be within bounds, treat it like automatic variables.
	If instrumenting access and inner is {VAR,PARM,RESULT}_DECL from
	current function and !TREE_STATIC which is not TREE_ADDRESSABLE, mark
	it addressable.

	* g++.dg/asan/pr102656.C: New test.


	Jakub

Comments

Richard Biener Feb. 18, 2022, 9:19 a.m. UTC | #1
On Fri, 18 Feb 2022, Jakub Jelinek wrote:

> Hi!
> 
> We ICE on the following testcase, because the asan1 pass decides to
> instrument
>   <retval>.x = 0;
> and does that by
>   _13 = &<retval>.x;
>   .ASAN_CHECK (7, _13, 4, 4);
>   <retval>.x = 0;
> and later sanopt pass turns that into:
>   _39 = (unsigned long) &<retval>.x;
>   _40 = _39 >> 3;
>   _41 = _40 + 2147450880;
>   _42 = (signed char *) _41;
>   _43 = *_42;
>   _44 = _43 != 0;
>   _45 = _39 & 7;
>   _46 = (signed char) _45;
>   _47 = _46 + 3;
>   _48 = _47 >= _43;
>   _49 = _44 & _48;
>   if (_49 != 0)
>     goto <bb 10>; [0.05%]
>   else
>     goto <bb 9>; [99.95%]
> 
>   <bb 10> [local count: 536864]:
>   __builtin___asan_report_store4 (_39);
>   
>   <bb 9> [local count: 1073741824]:
>   <retval>.x = 0;
> The problem is during expansion, <retval> isn't marked TREE_ADDRESSABLE,
> even when we take its address in (unsigned long) &<retval>.x.
> 
> Now, instrument_derefs has code to avoid the instrumentation altogether
> if we can prove the access is within bounds of an automatic variable in the
> current function and the var isn't TREE_ADDRESSABLE (or we don't instrument
> use after scope), but we do it solely for VAR_DECLs.
> 
> I think we should treat RESULT_DECLs exactly like that too, which is what
> the following patch does.  I must say I'm unsure about PARM_DECLs, those can
> have different cases, either they are fully or partially passed in
> registers, then if we take parameter's address, they are in a local copy
> inside of a function and so work like those automatic vars.  But if they
> are fully passed in memory, we typically just take address of the slot
> and in that case they live in the caller's frame.  It is true we don't
> (can't) put any asan padding in between the arguments, so all asan could
> detect in that case is if caller passes fewer on stack arguments or smaller
> arguments than callee accepts.  Anyway, as I'm unsure, I haven't added
> PARM_DECLs to that case.
> 
> And another thing is, when we actually build_fold_addr_expr, we need to
> mark_addressable the inner if it isn't addressable already.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2022-02-18  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR sanitizer/102656
> 	* asan.cc (instrument_derefs): If inner is a RESULT_DECL and access is
> 	known to be within bounds, treat it like automatic variables.
> 	If instrumenting access and inner is {VAR,PARM,RESULT}_DECL from
> 	current function and !TREE_STATIC which is not TREE_ADDRESSABLE, mark
> 	it addressable.
> 
> 	* g++.dg/asan/pr102656.C: New test.
> 
> --- gcc/asan.cc.jj	2022-02-12 19:14:45.915168286 +0100
> +++ gcc/asan.cc	2022-02-17 17:13:09.346987047 +0100
> @@ -2688,13 +2688,13 @@ instrument_derefs (gimple_stmt_iterator
>      return;
>  
>    poly_int64 decl_size;
> -  if (VAR_P (inner)
> +  if ((VAR_P (inner) || TREE_CODE (inner) == RESULT_DECL)
>        && offset == NULL_TREE
>        && DECL_SIZE (inner)
>        && poly_int_tree_p (DECL_SIZE (inner), &decl_size)
>        && known_subrange_p (bitpos, bitsize, 0, decl_size))
>      {
> -      if (DECL_THREAD_LOCAL_P (inner))
> +      if (VAR_P (inner) && DECL_THREAD_LOCAL_P (inner))
>  	return;
>        /* If we're not sanitizing globals and we can tell statically that this
>  	 access is inside a global variable, then there's no point adding
> @@ -2724,6 +2726,14 @@ instrument_derefs (gimple_stmt_iterator
>  	}
>      }
>  
> +  if ((VAR_P (inner)
> +       || TREE_CODE (inner) == PARM_DECL
> +       || TREE_CODE (inner) == RESULT_DECL)
> +      && !TREE_STATIC (inner)
> +      && decl_function_context (inner) == current_function_decl
> +      && !TREE_ADDRESSABLE (inner))

There's no good reason to exclute TREE_STATIC or global vars.  If
we take the address of something we should mark it addressable.
I think you can just use DECL_P (inner) as well here for simplicity.

OK with that change.

Thanks,
Richard.

> +    mark_addressable (inner);
> +
>    base = build_fold_addr_expr (t);
>    if (!has_mem_ref_been_instrumented (base, size_in_bytes))
>      {
> --- gcc/testsuite/g++.dg/asan/pr102656.C.jj	2022-02-17 17:09:42.758866731 +0100
> +++ gcc/testsuite/g++.dg/asan/pr102656.C	2022-02-17 17:10:46.212981870 +0100
> @@ -0,0 +1,27 @@
> +// PR sanitizer/102656
> +// { dg-do compile }
> +// { dg-options "-std=c++20 -fsanitize=address" }
> +
> +#include <coroutine>
> +
> +class promise;
> +
> +struct future {
> +  using promise_type = promise;
> +  future() = default;
> +  int x = 0;
> +};
> +
> +struct promise {
> +  future get_return_object() noexcept { return {}; }
> +  auto initial_suspend() noexcept { return std::suspend_never{}; }
> +  auto final_suspend() noexcept { return std::suspend_never{}; }
> +  void return_void() noexcept {}
> +  void unhandled_exception() {}
> +};
> +
> +future
> +func ()
> +{
> +  co_return;
> +}
> 
> 	Jakub
> 
>
diff mbox series

Patch

--- gcc/asan.cc.jj	2022-02-12 19:14:45.915168286 +0100
+++ gcc/asan.cc	2022-02-17 17:13:09.346987047 +0100
@@ -2688,13 +2688,13 @@  instrument_derefs (gimple_stmt_iterator
     return;
 
   poly_int64 decl_size;
-  if (VAR_P (inner)
+  if ((VAR_P (inner) || TREE_CODE (inner) == RESULT_DECL)
       && offset == NULL_TREE
       && DECL_SIZE (inner)
       && poly_int_tree_p (DECL_SIZE (inner), &decl_size)
       && known_subrange_p (bitpos, bitsize, 0, decl_size))
     {
-      if (DECL_THREAD_LOCAL_P (inner))
+      if (VAR_P (inner) && DECL_THREAD_LOCAL_P (inner))
 	return;
       /* If we're not sanitizing globals and we can tell statically that this
 	 access is inside a global variable, then there's no point adding
@@ -2724,6 +2726,14 @@  instrument_derefs (gimple_stmt_iterator
 	}
     }
 
+  if ((VAR_P (inner)
+       || TREE_CODE (inner) == PARM_DECL
+       || TREE_CODE (inner) == RESULT_DECL)
+      && !TREE_STATIC (inner)
+      && decl_function_context (inner) == current_function_decl
+      && !TREE_ADDRESSABLE (inner))
+    mark_addressable (inner);
+
   base = build_fold_addr_expr (t);
   if (!has_mem_ref_been_instrumented (base, size_in_bytes))
     {
--- gcc/testsuite/g++.dg/asan/pr102656.C.jj	2022-02-17 17:09:42.758866731 +0100
+++ gcc/testsuite/g++.dg/asan/pr102656.C	2022-02-17 17:10:46.212981870 +0100
@@ -0,0 +1,27 @@ 
+// PR sanitizer/102656
+// { dg-do compile }
+// { dg-options "-std=c++20 -fsanitize=address" }
+
+#include <coroutine>
+
+class promise;
+
+struct future {
+  using promise_type = promise;
+  future() = default;
+  int x = 0;
+};
+
+struct promise {
+  future get_return_object() noexcept { return {}; }
+  auto initial_suspend() noexcept { return std::suspend_never{}; }
+  auto final_suspend() noexcept { return std::suspend_never{}; }
+  void return_void() noexcept {}
+  void unhandled_exception() {}
+};
+
+future
+func ()
+{
+  co_return;
+}