diff mbox

Speed-up use-after-scope (re-writing to SSA) (version 2)

Message ID c02f7364-44ed-406f-cee4-7b1fc8899f75@suse.cz
State New
Headers show

Commit Message

Martin Liška Dec. 20, 2016, 11:26 a.m. UTC
On 11/16/2016 05:28 PM, Jakub Jelinek wrote:
> Otherwise LGTM, but please post the asan patch to llvm-commits
> or through their web review interface.
> 
> 	Jakub

Ok, llvm folks are unwilling to accept the new API function, thus I've decided to come up
with approach suggested by Jakub. Briefly, when expanding ASAN_POISON internal function,
we create a new variable (with the same name as the original one). The variable is poisoned
at the location of the ASAN_POISON and all usages just call ASAN_CHECK that would trigger
use-after-scope run-time error. Situation where ASAN_POISON has a LHS is very rare and
is very likely to be a bug. Thus suggested not super-optimized approach should not be
problematic.

I'm not sure about the introduction of 'create_var' function, maybe we would need some
refactoring. Thoughts?

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Martin

Comments

Jakub Jelinek Dec. 21, 2016, 8:52 a.m. UTC | #1
On Tue, Dec 20, 2016 at 12:26:41PM +0100, Martin Liška wrote:
> Ok, llvm folks are unwilling to accept the new API function, thus I've decided to come up
> with approach suggested by Jakub. Briefly, when expanding ASAN_POISON internal function,
> we create a new variable (with the same name as the original one). The variable is poisoned
> at the location of the ASAN_POISON and all usages just call ASAN_CHECK that would trigger
> use-after-scope run-time error. Situation where ASAN_POISON has a LHS is very rare and
> is very likely to be a bug. Thus suggested not super-optimized approach should not be
> problematic.

Do you have a testcase for the case where there is a write to the var after
poison that is then made non-addressable?  use-after-scope-9.c only covers
the read.

> I'm not sure about the introduction of 'create_var' function, maybe we would need some
> refactoring. Thoughts?

It doesn't belong to gimple-expr.c and the name is way too generic, we have
many create var functions already.  And this one is very specialized.

> 2016-12-19  Martin Liska  <mliska@suse.cz>
> 
> 	* asan.c (asan_expand_poison_ifn): New function.
> 	* asan.h (asan_expand_poison_ifn):  Likewise.

Too many spaces.

> +  tree shadow_var = create_var (TREE_TYPE (poisoned_var),
> +				IDENTIFIER_POINTER (DECL_NAME (var_decl)));

For the shadow var creation, IMHO you should
1) use a hash table, once you add a shadow variable for a certain variable
   for the first time, reuse it for all the other cases; you can have many
   ASAN_POISON () calls for the same underlying variable
2) as I said, use just a function in sanopt.c for this,
   create_asan_shadow_var or whatever
3) I think you just want to do copy_node, plus roughly what
   copy_decl_for_dup_finish does (and set DECL_ARTIFICIAL and
   DECL_IGNORED_P) - except that you don't have copy_body_data
   so you can't use it directly (well, you could create copy_body_data
   just for that purpose and set src_fn and dst_fn to current_function_decl
   and the rest to NULL)

I'd really like to see the storing to poisoned var becoming non-addressable
in action (if it can ever happen, so it isn't just theoretical) to look at
what it does.

	Jakub
diff mbox

Patch

From 03c21ac90503aea7af4d74ea8c34f94efde782b6 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Mon, 19 Dec 2016 15:36:11 +0100
Subject: [PATCH] Speed up use-after-scope (v2): rewrite into SSA

gcc/ChangeLog:

2016-12-19  Martin Liska  <mliska@suse.cz>

	* asan.c (asan_expand_poison_ifn): New function.
	* asan.h (asan_expand_poison_ifn):  Likewise.
	* gimple-expr.c (create_var): Likewise.
	* gimple-expr.h (create_var): Likewise.
	* internal-fn.c (expand_ASAN_POISON): Likewise.
	* internal-fn.def (ASAN_POISON): New builtin.
	* sanopt.c (pass_sanopt::execute): Expand
	asan_expand_poison_ifn.
	* tree-ssa.c (is_asan_mark_p): New function.
	(execute_update_addresses_taken): Rewrite local variables
	(identified just by use-after-scope as addressable) into SSA.

gcc/testsuite/ChangeLog:

2016-12-19  Martin Liska  <mliska@suse.cz>

	* gcc.dg/asan/use-after-scope-3.c: Add additional flags.
	* gcc.dg/asan/use-after-scope-9.c: Likewise and grep for
	sanopt optimization for ASAN_POISON.
---
 gcc/asan.c                                    | 84 ++++++++++++++++++++++++++-
 gcc/asan.h                                    |  1 +
 gcc/gimple-expr.c                             | 27 +++++++++
 gcc/gimple-expr.h                             |  1 +
 gcc/internal-fn.c                             |  7 +++
 gcc/internal-fn.def                           |  1 +
 gcc/sanopt.c                                  |  9 +++
 gcc/testsuite/gcc.dg/asan/use-after-scope-3.c |  1 +
 gcc/testsuite/gcc.dg/asan/use-after-scope-9.c |  2 +
 gcc/tree-ssa.c                                | 69 ++++++++++++++++++----
 10 files changed, 191 insertions(+), 11 deletions(-)

diff --git a/gcc/asan.c b/gcc/asan.c
index 53acff0a2fb..de8ce12f818 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -32,8 +32,8 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-pass.h"
 #include "memmodel.h"
 #include "tm_p.h"
+#include "ssa.h"
 #include "stringpool.h"
-#include "tree-vrp.h"
 #include "tree-ssanames.h"
 #include "optabs.h"
 #include "emit-rtl.h"
@@ -3055,6 +3055,88 @@  asan_expand_check_ifn (gimple_stmt_iterator *iter, bool use_calls)
   return true;
 }
 
+
+/* Expand the ASAN_POISON builtins.  */
+
+bool
+asan_expand_poison_ifn (gimple_stmt_iterator *iter,
+			bool *need_commit_edge_insert)
+{
+  gimple *g = gsi_stmt (*iter);
+  tree poisoned_var = gimple_call_lhs (g);
+  if (!poisoned_var)
+    {
+      gsi_remove (iter, true);
+      return true;
+    }
+
+  tree var_decl = SSA_NAME_VAR (poisoned_var);
+
+  bool recover_p;
+  if (flag_sanitize & SANITIZE_USER_ADDRESS)
+    recover_p = (flag_sanitize_recover & SANITIZE_USER_ADDRESS) != 0;
+  else
+    recover_p = (flag_sanitize_recover & SANITIZE_KERNEL_ADDRESS) != 0;
+
+  tree shadow_var = create_var (TREE_TYPE (poisoned_var),
+				IDENTIFIER_POINTER (DECL_NAME (var_decl)));
+
+  tree size = DECL_SIZE_UNIT (shadow_var);
+  gimple *poison_call
+    = gimple_build_call_internal (IFN_ASAN_MARK, 3,
+				  build_int_cst (integer_type_node,
+						 ASAN_MARK_POISON),
+				  build_fold_addr_expr (shadow_var), size);
+
+  use_operand_p use_p;
+  imm_use_iterator imm_iter;
+  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, poisoned_var)
+    {
+      gimple *use = USE_STMT (use_p);
+      if (is_gimple_debug (use))
+	continue;
+
+      int nargs;
+      tree fun = report_error_func (false, recover_p, tree_to_uhwi (size),
+				    &nargs);
+
+      gcall *call = gimple_build_call (fun, 1,
+				       build_fold_addr_expr (shadow_var));
+      gimple_set_location (call, gimple_location (use));
+      gimple *call_to_insert = call;
+
+      /* The USE can be a gimple PHI node.  If so, insert the call on
+	 all edges leading to the PHI node.  */
+      if (is_a <gphi *> (use))
+	{
+	  gphi *phi = dyn_cast<gphi *> (use);
+	  for (unsigned i = 0; i < gimple_phi_num_args (phi); ++i)
+	    if (gimple_phi_arg_def (phi, i) == poisoned_var)
+	      {
+		edge e = gimple_phi_arg_edge (phi, i);
+
+		if (call_to_insert == NULL)
+		  call_to_insert = gimple_copy (call);
+
+		gsi_insert_seq_on_edge (e, call_to_insert);
+		*need_commit_edge_insert = true;
+		call_to_insert = NULL;
+	      }
+	}
+      else
+	{
+	  gimple_stmt_iterator gsi = gsi_for_stmt (use);
+	  gsi_insert_before (&gsi, call, GSI_NEW_STMT);
+	}
+    }
+
+  SSA_NAME_IS_DEFAULT_DEF (poisoned_var) = true;
+  SSA_NAME_DEF_STMT (poisoned_var) = gimple_build_nop ();
+  gsi_replace (iter, poison_call, false);
+
+  return true;
+}
+
 /* Instrument the current function.  */
 
 static unsigned int
diff --git a/gcc/asan.h b/gcc/asan.h
index 355a350bfeb..e176e47f4eb 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -30,6 +30,7 @@  extern void initialize_sanitizer_builtins (void);
 extern tree asan_dynamic_init_call (bool);
 extern bool asan_expand_check_ifn (gimple_stmt_iterator *, bool);
 extern bool asan_expand_mark_ifn (gimple_stmt_iterator *);
+extern bool asan_expand_poison_ifn (gimple_stmt_iterator *, bool *);
 
 extern gimple_stmt_iterator create_cond_insert_point
      (gimple_stmt_iterator *, bool, bool, bool, basic_block *, basic_block *);
diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c
index de5cce1f7cc..968e19727f6 100644
--- a/gcc/gimple-expr.c
+++ b/gcc/gimple-expr.c
@@ -477,6 +477,33 @@  create_tmp_var (tree type, const char *prefix)
   return tmp_var;
 }
 
+/* Create a new variable declaration of type TYPE with specified NAME.  */
+
+tree
+create_var (tree type, const char *name)
+{
+  tree tmp_var;
+
+  gcc_assert (!TREE_ADDRESSABLE (type) && COMPLETE_TYPE_P (type));
+
+  tmp_var = build_decl (input_location, VAR_DECL, get_identifier (name), type);
+
+  /* The variable was declared by the compiler.  */
+  DECL_ARTIFICIAL (tmp_var) = 1;
+  /* And we don't want debug info for it.  */
+  DECL_IGNORED_P (tmp_var) = 1;
+
+  /* Make the variable writable.  */
+  TREE_READONLY (tmp_var) = 0;
+
+  DECL_EXTERNAL (tmp_var) = 0;
+  TREE_STATIC (tmp_var) = 0;
+  TREE_USED (tmp_var) = 1;
+
+  gimple_add_tmp_var (tmp_var);
+  return tmp_var;
+}
+
 /* Create a new temporary variable declaration of type TYPE by calling
    create_tmp_var and if TYPE is a vector or a complex number, mark the new
    temporary as gimple register.  */
diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h
index f2ccd29a94a..d8b393103fd 100644
--- a/gcc/gimple-expr.h
+++ b/gcc/gimple-expr.h
@@ -29,6 +29,7 @@  extern bool gimple_has_body_p (tree);
 extern const char *gimple_decl_printable_name (tree, int);
 extern tree copy_var_decl (tree, tree, tree);
 extern tree create_tmp_var_name (const char *);
+extern tree create_var (tree, const char *);
 extern tree create_tmp_var_raw (tree, const char * = NULL);
 extern tree create_tmp_var (tree, const char * = NULL);
 extern tree create_tmp_reg (tree, const char * = NULL);
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index b1dbc988b9c..40f5fe6c69c 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -380,6 +380,13 @@  expand_ASAN_MARK (internal_fn, gcall *)
   gcc_unreachable ();
 }
 
+/* This should get expanded in the sanopt pass.  */
+
+static void
+expand_ASAN_POISON (internal_fn, gcall *)
+{
+  gcc_unreachable ();
+}
 
 /* This should get expanded in the tsan pass.  */
 
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 9a03e17be23..81a6bbdd15c 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -167,6 +167,7 @@  DEF_INTERNAL_FN (ABNORMAL_DISPATCHER, ECF_NORETURN, NULL)
 DEF_INTERNAL_FN (BUILTIN_EXPECT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (ASAN_CHECK, ECF_TM_PURE | ECF_LEAF | ECF_NOTHROW, ".R...")
 DEF_INTERNAL_FN (ASAN_MARK, ECF_LEAF | ECF_NOTHROW, ".R..")
+DEF_INTERNAL_FN (ASAN_POISON, ECF_LEAF | ECF_NOTHROW | ECF_NOVOPS, NULL)
 DEF_INTERNAL_FN (ADD_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
diff --git a/gcc/sanopt.c b/gcc/sanopt.c
index ae716cffcf4..204aa09d787 100644
--- a/gcc/sanopt.c
+++ b/gcc/sanopt.c
@@ -894,6 +894,7 @@  pass_sanopt::execute (function *fun)
   bool use_calls = ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD < INT_MAX
     && asan_num_accesses >= ASAN_INSTRUMENTATION_WITH_CALL_THRESHOLD;
 
+  bool need_commit_edge_insert = false;
   FOR_EACH_BB_FN (bb, fun)
     {
       gimple_stmt_iterator gsi;
@@ -931,6 +932,10 @@  pass_sanopt::execute (function *fun)
 		case IFN_ASAN_MARK:
 		  no_next = asan_expand_mark_ifn (&gsi);
 		  break;
+		case IFN_ASAN_POISON:
+		  no_next = asan_expand_poison_ifn (&gsi,
+						    &need_commit_edge_insert);
+		  break;
 		default:
 		  break;
 		}
@@ -962,6 +967,10 @@  pass_sanopt::execute (function *fun)
 	    gsi_next (&gsi);
 	}
     }
+
+  if (need_commit_edge_insert)
+    gsi_commit_edge_inserts ();
+
   return 0;
 }
 
diff --git a/gcc/testsuite/gcc.dg/asan/use-after-scope-3.c b/gcc/testsuite/gcc.dg/asan/use-after-scope-3.c
index 9aeed51a770..8b11bea9940 100644
--- a/gcc/testsuite/gcc.dg/asan/use-after-scope-3.c
+++ b/gcc/testsuite/gcc.dg/asan/use-after-scope-3.c
@@ -1,5 +1,6 @@ 
 // { dg-do run }
 // { dg-shouldfail "asan" }
+// { dg-additional-options "-O0" }
 
 int
 main (void)
diff --git a/gcc/testsuite/gcc.dg/asan/use-after-scope-9.c b/gcc/testsuite/gcc.dg/asan/use-after-scope-9.c
index 2e30deffa18..5d069dd18ea 100644
--- a/gcc/testsuite/gcc.dg/asan/use-after-scope-9.c
+++ b/gcc/testsuite/gcc.dg/asan/use-after-scope-9.c
@@ -1,5 +1,6 @@ 
 // { dg-do run }
 // { dg-shouldfail "asan" }
+// { dg-additional-options "-O2 -fdump-tree-asan1" }
 
 int
 main (int argc, char **argv)
@@ -15,6 +16,7 @@  main (int argc, char **argv)
   return *ptr;
 }
 
+// { dg-final { scan-tree-dump-times "= ASAN_POISON \\(\\)" 1 "asan1" } }
 // { dg-output "ERROR: AddressSanitizer: stack-use-after-scope on address.*(\n|\r\n|\r)" }
 // { dg-output "READ of size .*" }
 // { dg-output ".*'a' <== Memory access at offset \[0-9\]* is inside this variable.*" }
diff --git a/gcc/tree-ssa.c b/gcc/tree-ssa.c
index 62eea8bb8a4..3adbef48037 100644
--- a/gcc/tree-ssa.c
+++ b/gcc/tree-ssa.c
@@ -41,6 +41,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "cfgexpand.h"
 #include "tree-cfg.h"
 #include "tree-dfa.h"
+#include "asan.h"
 
 /* Pointer map of variable mappings, keyed by edge.  */
 static hash_map<edge, auto_vec<edge_var_map> > *edge_var_maps;
@@ -1550,6 +1551,30 @@  maybe_optimize_var (tree var, bitmap addresses_taken, bitmap not_reg_needs,
     }
 }
 
+/* Return true when STMT is ASAN mark where second argument is an address
+   of a local variable.  */
+
+static bool
+is_asan_mark_p (gimple *stmt)
+{
+  if (!gimple_call_internal_p (stmt, IFN_ASAN_MARK))
+    return false;
+
+  tree addr = get_base_address (gimple_call_arg (stmt, 1));
+  if (TREE_CODE (addr) == ADDR_EXPR
+      && VAR_P (TREE_OPERAND (addr, 0)))
+    {
+      tree var = TREE_OPERAND (addr, 0);
+      unsigned addressable = TREE_ADDRESSABLE (var);
+      TREE_ADDRESSABLE (var) = 0;
+      bool r = is_gimple_reg (var);
+      TREE_ADDRESSABLE (var) = addressable;
+      return r;
+    }
+
+  return false;
+}
+
 /* Compute TREE_ADDRESSABLE and DECL_GIMPLE_REG_P for local variables.  */
 
 void
@@ -1575,17 +1600,23 @@  execute_update_addresses_taken (void)
 	  enum gimple_code code = gimple_code (stmt);
 	  tree decl;
 
-	  if (code == GIMPLE_CALL
-	      && optimize_atomic_compare_exchange_p (stmt))
+	  if (code == GIMPLE_CALL)
 	    {
-	      /* For __atomic_compare_exchange_N if the second argument
-		 is &var, don't mark var addressable;
-		 if it becomes non-addressable, we'll rewrite it into
-		 ATOMIC_COMPARE_EXCHANGE call.  */
-	      tree arg = gimple_call_arg (stmt, 1);
-	      gimple_call_set_arg (stmt, 1, null_pointer_node);
-	      gimple_ior_addresses_taken (addresses_taken, stmt);
-	      gimple_call_set_arg (stmt, 1, arg);
+	      if (optimize_atomic_compare_exchange_p (stmt))
+		{
+		  /* For __atomic_compare_exchange_N if the second argument
+		     is &var, don't mark var addressable;
+		     if it becomes non-addressable, we'll rewrite it into
+		     ATOMIC_COMPARE_EXCHANGE call.  */
+		  tree arg = gimple_call_arg (stmt, 1);
+		  gimple_call_set_arg (stmt, 1, null_pointer_node);
+		  gimple_ior_addresses_taken (addresses_taken, stmt);
+		  gimple_call_set_arg (stmt, 1, arg);
+		}
+	      else if (is_asan_mark_p (stmt))
+		;
+	      else
+		gimple_ior_addresses_taken (addresses_taken, stmt);
 	    }
 	  else
 	    /* Note all addresses taken by the stmt.  */
@@ -1841,6 +1872,24 @@  execute_update_addresses_taken (void)
 			continue;
 		      }
 		  }
+		else if (is_asan_mark_p (stmt))
+		  {
+		    tree var = TREE_OPERAND (gimple_call_arg (stmt, 1), 0);
+		    if (bitmap_bit_p (suitable_for_renaming, DECL_UID (var)))
+		      {
+			unlink_stmt_vdef (stmt);
+			if (asan_mark_p (stmt, ASAN_MARK_POISON))
+			  {
+			    gcall *call
+			      = gimple_build_call_internal (IFN_ASAN_POISON, 0);
+			    gimple_call_set_lhs (call, var);
+			    gsi_replace (&gsi, call, GSI_SAME_STMT);
+			  }
+			else
+			  gsi_remove (&gsi, true);
+			continue;
+		      }
+		  }
 		for (i = 0; i < gimple_call_num_args (stmt); ++i)
 		  {
 		    tree *argp = gimple_call_arg_ptr (stmt, i);
-- 
2.11.0