diff mbox

ASAN: handle addressable params (PR sanitize/81040).

Message ID ebab2c43-4fb6-0137-e292-4b6633e74b37@suse.cz
State New
Headers show

Commit Message

Martin Liška June 20, 2017, 1:06 p.m. UTC
On 06/20/2017 11:32 AM, Jakub Jelinek wrote:
> On Tue, Jun 20, 2017 at 11:23:36AM +0200, Martin Liška wrote:
>>> Then something needs to be done for debugging too.  If it is without VTA,
>>> then probably just having DECL_VALUE_EXPR is good enough, otherwise
>>> (VTA) you probably don't want that (or can reset it at that point), but
>>> instead emit after the initialization stmt a debug stmt that the variable
>>> value now lives in a different var.  Though ideally we want the debugger
>>> to be able to also change the value of the var, that might be harder.
>>> With DECL_VALUE_EXPR on the other side the debug info will be incorrect in
>>> the prologue until it is assigned to the slot.
>>
>> Here I'm not sure about how to distinguish whether to build or not to build
>> the debug statement. According to flag_var_tracking?
> 
> More like if (target_for_debug_bind (arg))
> And if that is false, just make sure DECL_VALUE_EXPR is set to var.
> 
>> You mean something like:
>> g = gimple_build_debug_bind (arg, var, g);
>> ?
> 
> Well, there is no stmt, so the last argument would be just NULL.
> 
>>> I don't understand the distinction.  If you turn the original parm
>>> for complex/vector DECL_GIMPLE_REG_P, you should need the exact same code
>>> (but I think it would be better to use the default SSA_NAME of the PARM_DECL
>>> if it is a gimple reg type, rather than use the PARM_DECL itself
>>> and wait for update_ssa).
>>
>> Yes, the test-case /gcc/testsuite/g++.dg/asan/function-argument-3.C fails for me
>> as one needs to have a temporary SSA name, otherwise:
>>
>> /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/asan/function-argument-3.C:13:1: error: invalid rhs for gimple memory store
>>  foo (v4si arg)
>>  ^~~
>> arg
>>
>> arg
>>
>> # .MEM_4 = VDEF <.MEM_1(D)>
>> arg = arg;
>> during GIMPLE pass: sanopt
>>
>> If I see correctly the function in my test-case does not have default def SSA name for the parameter.
>> Thus I guess I need to create a SSA name?
> 
> I'd expect if you have DECL_GIMPLE_REG_P set on the PARM_DECL and
> use the default def, you shouldn't run into this.
> 
> 	Jakub
> 

Good I fixed that in v2, that passes regression tests.
Ale objections should be resolved in the version.

Ready for trunk?
Martin

Comments

Martin Liška June 28, 2017, 1:16 p.m. UTC | #1
PING^1

On 06/20/2017 03:06 PM, Martin Liška wrote:
> On 06/20/2017 11:32 AM, Jakub Jelinek wrote:
>> On Tue, Jun 20, 2017 at 11:23:36AM +0200, Martin Liška wrote:
>>>> Then something needs to be done for debugging too.  If it is without VTA,
>>>> then probably just having DECL_VALUE_EXPR is good enough, otherwise
>>>> (VTA) you probably don't want that (or can reset it at that point), but
>>>> instead emit after the initialization stmt a debug stmt that the variable
>>>> value now lives in a different var.  Though ideally we want the debugger
>>>> to be able to also change the value of the var, that might be harder.
>>>> With DECL_VALUE_EXPR on the other side the debug info will be incorrect in
>>>> the prologue until it is assigned to the slot.
>>>
>>> Here I'm not sure about how to distinguish whether to build or not to build
>>> the debug statement. According to flag_var_tracking?
>>
>> More like if (target_for_debug_bind (arg))
>> And if that is false, just make sure DECL_VALUE_EXPR is set to var.
>>
>>> You mean something like:
>>> g = gimple_build_debug_bind (arg, var, g);
>>> ?
>>
>> Well, there is no stmt, so the last argument would be just NULL.
>>
>>>> I don't understand the distinction.  If you turn the original parm
>>>> for complex/vector DECL_GIMPLE_REG_P, you should need the exact same code
>>>> (but I think it would be better to use the default SSA_NAME of the PARM_DECL
>>>> if it is a gimple reg type, rather than use the PARM_DECL itself
>>>> and wait for update_ssa).
>>>
>>> Yes, the test-case /gcc/testsuite/g++.dg/asan/function-argument-3.C fails for me
>>> as one needs to have a temporary SSA name, otherwise:
>>>
>>> /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/asan/function-argument-3.C:13:1: error: invalid rhs for gimple memory store
>>>  foo (v4si arg)
>>>  ^~~
>>> arg
>>>
>>> arg
>>>
>>> # .MEM_4 = VDEF <.MEM_1(D)>
>>> arg = arg;
>>> during GIMPLE pass: sanopt
>>>
>>> If I see correctly the function in my test-case does not have default def SSA name for the parameter.
>>> Thus I guess I need to create a SSA name?
>>
>> I'd expect if you have DECL_GIMPLE_REG_P set on the PARM_DECL and
>> use the default def, you shouldn't run into this.
>>
>> 	Jakub
>>
> 
> Good I fixed that in v2, that passes regression tests.
> Ale objections should be resolved in the version.
> 
> Ready for trunk?
> Martin
>
Jakub Jelinek June 29, 2017, 11:17 a.m. UTC | #2
On Tue, Jun 20, 2017 at 03:06:56PM +0200, Martin Liška wrote:
> +/* Rewrite all usages of tree OP which is a PARM_DECL with a VAR_DECL
> +   that is it's DECL_VALUE_EXPR.  */
> +
> +static tree
> +rewrite_usage_of_param (tree *op, int *walk_subtrees, void *)
> +{
> +  if (TREE_CODE (*op) == PARM_DECL && DECL_VALUE_EXPR (*op) != NULL_TREE)

DECL_VALUE_EXPR testing is costly (it is a hash table lookup).
Therefore you should test DECL_HAS_VALUE_EXPR_P (*op) after checking
== PARM_DECL.  And DECL_HAS_VALUE_EXPR_P should apply non-NULL
DECL_VALUE_EXPR.
That said, I wonder if we don't create DECL_VALUE_EXPR for PARM_DECLs in
other parts of the compiler, whether it wouldn't be safer to also test here
after == PARM_DECL and DECL_HAS_VALUE_EXPR_P check whether *op is in
addressable_params hash table.

> +    {
> +      *op = DECL_VALUE_EXPR (*op);
> +      *walk_subtrees = 0;
> +    }
> +
> +  return NULL;
> +}
> +
> +/* For a given function FUN, rewrite all addressable parameters so that
> +   a new automatic variable is introduced.  Right after function entry
> +   a parameter is assigned to the variable.  */
> +
> +static void
> +sanitize_rewrite_addressable_params (function *fun)
> +{
> +  gimple *g;
> +  gimple_seq stmts = NULL;
> +  auto_vec<tree> addressable_params;

You don't really use the addressable_params vector anywhere, right?
Except for:

> +
> +  for (tree arg = DECL_ARGUMENTS (current_function_decl);
> +       arg; arg = DECL_CHAIN (arg))
> +    {
> +      if (TREE_ADDRESSABLE (arg) && !TREE_ADDRESSABLE (TREE_TYPE (arg)))
> +	{
> +	  TREE_ADDRESSABLE (arg) = 0;
> +	  /* The parameter is no longer addressable.  */
> +	  tree type = TREE_TYPE (arg);
> +	  addressable_params.safe_push (arg);

pushing stuff into it and later

> +  if (addressable_params.is_empty ())
> +    return;

If you only need that, a bool flag if any params have been changed is
enough.  But see above whether it wouldn't be safer to use a hash table
to verify it.  Plus, I think it would be desirable to clear
DECL_HAS_VALUE_EXPR_P and SET_DECL_VALUE_EXPR to NULL afterwards
if (target_for_debug_bind (arg)) - whch can be done either the with vec
or with a hash table traversal, for that we don't care about the ordering.

> +
> +	  /* Create a new automatic variable.  */
> +	  tree var = build_decl (DECL_SOURCE_LOCATION (arg),
> +				 VAR_DECL, DECL_NAME (arg), type);
> +	  TREE_ADDRESSABLE (var) = 1;
> +	  DECL_ARTIFICIAL (var) = 1;
> +	  DECL_SEEN_IN_BIND_EXPR_P (var) = 0;

This is 0 already from build_decl, IMHO no need to set it.

> +	  gimple_add_tmp_var (var);
> +
> +	  if (dump_file)
> +	    fprintf (dump_file,
> +		     "Rewriting parameter whose address is taken: %s\n",
> +		     IDENTIFIER_POINTER (DECL_NAME (arg)));
> +
> +	  SET_DECL_VALUE_EXPR (arg, var);

But obviously you miss setting DECL_HAS_VALUE_EXPR_P here.

> +	  /* Assign value of parameter to newly created variable.  */
> +	  if ((TREE_CODE (type) == COMPLEX_TYPE
> +	       || TREE_CODE (type) == VECTOR_TYPE))
> +	    {
> +	      /* We need to create a SSA name that will be used for the
> +		 assignment.  */

Why don't you just set DECL_GIMPLE_REG_P (arg) = 1; for
COMPLEX_TYPE/VECTOR_TYPE?  The arg is going to be only used to copy it into
the new var.  And then just use get_or_create_ssa_default_def,
regardless of whether if is complex/vector or other.

> +  /* Replace all usages of PARM_DECLs with the newly
> +     created variable VAR.  */
> +  basic_block bb;
> +  FOR_EACH_BB_FN (bb, fun)
> +    {
> +      gimple_stmt_iterator gsi;
> +      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> +	{
> +	  gimple *stmt = gsi_stmt (gsi);
> +	  gimple_stmt_iterator it = gsi_for_stmt (stmt);
> +	  walk_gimple_stmt (&it, NULL, rewrite_usage_of_param, NULL);
> +	}
> +      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> +	{
> +	  gphi *phi = dyn_cast<gphi *> (gsi_stmt (gsi));
> +	  for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
> +	    {
> +	      hash_set<tree> visited_nodes;
> +	      walk_tree (gimple_phi_arg_def_ptr (phi, i),
> +			 rewrite_usage_of_param, NULL, &visited_nodes);
> +	    }

Doesn't walk_gimple_stmt on the PHI handle this?

	Jakub
diff mbox

Patch

From ed5da705250c3015e964de8d23d1aa3d0056012a Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Wed, 14 Jun 2017 11:40:01 +0200
Subject: [PATCH] ASAN: handle addressable params (PR sanitize/81040).

gcc/testsuite/ChangeLog:

2017-06-19  Martin Liska  <mliska@suse.cz>

	PR sanitize/81040
	* g++.dg/asan/function-argument-1.C: New test.
	* g++.dg/asan/function-argument-2.C: New test.
	* g++.dg/asan/function-argument-3.C: New test.

gcc/ChangeLog:

2017-06-19  Martin Liska  <mliska@suse.cz>

	PR sanitize/81040
	* sanopt.c (rewrite_usage_of_param): New function.
	(sanitize_rewrite_addressable_params): Likewise.
	(pass_sanopt::execute): Call rewrite_usage_of_param.
---
 gcc/sanopt.c                                    | 132 ++++++++++++++++++++++++
 gcc/testsuite/g++.dg/asan/function-argument-1.C |  30 ++++++
 gcc/testsuite/g++.dg/asan/function-argument-2.C |  24 +++++
 gcc/testsuite/g++.dg/asan/function-argument-3.C |  27 +++++
 4 files changed, 213 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/asan/function-argument-1.C
 create mode 100644 gcc/testsuite/g++.dg/asan/function-argument-2.C
 create mode 100644 gcc/testsuite/g++.dg/asan/function-argument-3.C

diff --git a/gcc/sanopt.c b/gcc/sanopt.c
index 16bdba76042..077811b5b93 100644
--- a/gcc/sanopt.c
+++ b/gcc/sanopt.c
@@ -37,6 +37,12 @@  along with GCC; see the file COPYING3.  If not see
 #include "gimple-ssa.h"
 #include "tree-phinodes.h"
 #include "ssa-iterators.h"
+#include "gimplify.h"
+#include "gimple-iterator.h"
+#include "gimple-walk.h"
+#include "cfghooks.h"
+#include "tree-dfa.h"
+#include "tree-ssa.h"
 
 /* This is used to carry information about basic blocks.  It is
    attached to the AUX field of the standard CFG block.  */
@@ -858,6 +864,129 @@  sanitize_asan_mark_poison (void)
     }
 }
 
+/* Rewrite all usages of tree OP which is a PARM_DECL with a VAR_DECL
+   that is it's DECL_VALUE_EXPR.  */
+
+static tree
+rewrite_usage_of_param (tree *op, int *walk_subtrees, void *)
+{
+  if (TREE_CODE (*op) == PARM_DECL && DECL_VALUE_EXPR (*op) != NULL_TREE)
+    {
+      *op = DECL_VALUE_EXPR (*op);
+      *walk_subtrees = 0;
+    }
+
+  return NULL;
+}
+
+/* For a given function FUN, rewrite all addressable parameters so that
+   a new automatic variable is introduced.  Right after function entry
+   a parameter is assigned to the variable.  */
+
+static void
+sanitize_rewrite_addressable_params (function *fun)
+{
+  gimple *g;
+  gimple_seq stmts = NULL;
+  auto_vec<tree> addressable_params;
+
+  for (tree arg = DECL_ARGUMENTS (current_function_decl);
+       arg; arg = DECL_CHAIN (arg))
+    {
+      if (TREE_ADDRESSABLE (arg) && !TREE_ADDRESSABLE (TREE_TYPE (arg)))
+	{
+	  TREE_ADDRESSABLE (arg) = 0;
+	  /* The parameter is no longer addressable.  */
+	  tree type = TREE_TYPE (arg);
+	  addressable_params.safe_push (arg);
+
+	  /* Create a new automatic variable.  */
+	  tree var = build_decl (DECL_SOURCE_LOCATION (arg),
+				 VAR_DECL, DECL_NAME (arg), type);
+	  TREE_ADDRESSABLE (var) = 1;
+	  DECL_ARTIFICIAL (var) = 1;
+	  DECL_SEEN_IN_BIND_EXPR_P (var) = 0;
+
+	  gimple_add_tmp_var (var);
+
+	  if (dump_file)
+	    fprintf (dump_file,
+		     "Rewriting parameter whose address is taken: %s\n",
+		     IDENTIFIER_POINTER (DECL_NAME (arg)));
+
+	  SET_DECL_VALUE_EXPR (arg, var);
+
+	  /* Assign value of parameter to newly created variable.  */
+	  if ((TREE_CODE (type) == COMPLEX_TYPE
+	       || TREE_CODE (type) == VECTOR_TYPE))
+	    {
+	      /* We need to create a SSA name that will be used for the
+		 assignment.  */
+	      tree tmp;
+	      if (DECL_GIMPLE_REG_P (arg))
+		tmp = ssa_default_def (cfun, arg);
+	      else
+		{
+		  tmp = make_ssa_name (type);
+		  g = gimple_build_assign (tmp, arg);
+		  gimple_set_location (g, DECL_SOURCE_LOCATION (arg));
+		  gimple_seq_add_stmt (&stmts, g);
+		}
+	      g = gimple_build_assign (var, tmp);
+	      gimple_set_location (g, DECL_SOURCE_LOCATION (arg));
+	      gimple_seq_add_stmt (&stmts, g);
+	    }
+	  else
+	    {
+	      dump_function (0, cfun->decl);
+	      g = gimple_build_assign (var, arg);
+	      gimple_set_location (g, DECL_SOURCE_LOCATION (arg));
+	      gimple_seq_add_stmt (&stmts, g);
+	    }
+
+	  if (target_for_debug_bind (arg))
+	    {
+	      g = gimple_build_debug_bind (arg, var, NULL);
+	      gimple_seq_add_stmt (&stmts, g);
+	    }
+	}
+    }
+
+  if (addressable_params.is_empty ())
+    return;
+
+  /* Replace all usages of PARM_DECLs with the newly
+     created variable VAR.  */
+  basic_block bb;
+  FOR_EACH_BB_FN (bb, fun)
+    {
+      gimple_stmt_iterator gsi;
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gimple *stmt = gsi_stmt (gsi);
+	  gimple_stmt_iterator it = gsi_for_stmt (stmt);
+	  walk_gimple_stmt (&it, NULL, rewrite_usage_of_param, NULL);
+	}
+      for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+	{
+	  gphi *phi = dyn_cast<gphi *> (gsi_stmt (gsi));
+	  for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
+	    {
+	      hash_set<tree> visited_nodes;
+	      walk_tree (gimple_phi_arg_def_ptr (phi, i),
+			 rewrite_usage_of_param, NULL, &visited_nodes);
+	    }
+	}
+    }
+
+  /* Insert default assignments at the beginning of a function.  */
+  basic_block entry_bb = ENTRY_BLOCK_PTR_FOR_FN (fun);
+  entry_bb = split_edge (single_succ_edge (entry_bb));
+
+  gimple_stmt_iterator gsi = gsi_start_bb (entry_bb);
+  gsi_insert_seq_before (&gsi, stmts, GSI_NEW_STMT);
+}
+
 unsigned int
 pass_sanopt::execute (function *fun)
 {
@@ -891,6 +1020,9 @@  pass_sanopt::execute (function *fun)
       sanitize_asan_mark_poison ();
     }
 
+  if (asan_sanitize_stack_p ())
+    sanitize_rewrite_addressable_params (fun);
+
   bool use_calls = ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD < INT_MAX
     && asan_num_accesses >= ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD;
 
diff --git a/gcc/testsuite/g++.dg/asan/function-argument-1.C b/gcc/testsuite/g++.dg/asan/function-argument-1.C
new file mode 100644
index 00000000000..148c4628316
--- /dev/null
+++ b/gcc/testsuite/g++.dg/asan/function-argument-1.C
@@ -0,0 +1,30 @@ 
+// { dg-do run }
+// { dg-shouldfail "asan" }
+
+struct A
+{
+  int a[5];
+};
+
+static __attribute__ ((noinline)) int
+goo (A *a)
+{
+  int *ptr = &a->a[0];
+  return *(volatile int *) (ptr - 1);
+}
+
+__attribute__ ((noinline)) int
+foo (A arg)
+{
+  return goo (&arg);
+}
+
+int
+main ()
+{
+  return foo (A ());
+}
+
+// { dg-output "ERROR: AddressSanitizer: stack-buffer-underflow on address.*(\n|\r\n|\r)" }
+// { dg-output "READ of size . at.*" }
+// { dg-output ".*'arg' <== Memory access at offset \[0-9\]* underflows this variable.*" }
diff --git a/gcc/testsuite/g++.dg/asan/function-argument-2.C b/gcc/testsuite/g++.dg/asan/function-argument-2.C
new file mode 100644
index 00000000000..3a7c33bdaaa
--- /dev/null
+++ b/gcc/testsuite/g++.dg/asan/function-argument-2.C
@@ -0,0 +1,24 @@ 
+// { dg-do run }
+// { dg-shouldfail "asan" }
+
+static __attribute__ ((noinline)) int
+goo (int *a)
+{
+  return *(volatile int *)a;
+}
+
+__attribute__ ((noinline)) int
+foo (char arg)
+{
+  return goo ((int *)&arg);
+}
+
+int
+main ()
+{
+  return foo (12);
+}
+
+// { dg-output "ERROR: AddressSanitizer: stack-buffer-overflow on address.*(\n|\r\n|\r)" }
+// { dg-output "READ of size . at.*" }
+// { dg-output ".*'arg' <== Memory access at offset \[0-9\]* partially overflows this variable.*" }
diff --git a/gcc/testsuite/g++.dg/asan/function-argument-3.C b/gcc/testsuite/g++.dg/asan/function-argument-3.C
new file mode 100644
index 00000000000..14617ba8425
--- /dev/null
+++ b/gcc/testsuite/g++.dg/asan/function-argument-3.C
@@ -0,0 +1,27 @@ 
+// { dg-do run }
+// { dg-shouldfail "asan" }
+
+typedef int v4si __attribute__ ((vector_size (16)));
+
+static __attribute__ ((noinline)) int
+goo (v4si *a)
+{
+  return (*(volatile v4si *) (a + 1))[2];
+}
+
+__attribute__ ((noinline)) int
+foo (v4si arg)
+{
+  return goo (&arg);
+}
+
+int
+main ()
+{
+  v4si v = {1,2,3,4};
+  return foo (v);
+}
+
+// { dg-output "ERROR: AddressSanitizer: stack-buffer-overflow on address.*(\n|\r\n|\r)" }
+// { dg-output "READ of size . at.*" }
+// { dg-output ".*'arg' <== Memory access at offset \[0-9\]* overflows this variable.*" }
-- 
2.13.1