diff mbox

[Pointer,Bounds,Checker,14/x] Passes [12/n] Optimize string functions

Message ID 20141008191815.GL13454@msticlxl57.ims.intel.com
State New
Headers show

Commit Message

Ilya Enkovich Oct. 8, 2014, 7:18 p.m. UTC
Hi,

This patch introduces simple optimization of string function calls using variants with no checks and/or bounds copy when possible.

Thanks,
Ilya
--
2014-10-08  Ilya Enkovich  <ilya.enkovich@intel.com>

	* tree-chkp.c (check_infos): New.
	(chkp_get_nobnd_fndecl): New.
	(chkp_get_nochk_fndecl): New.
	(chkp_optimize_string_function_calls): New.
	(chkp_opt_init): New.
	(chkp_opt_fini): New.
	(chkp_opt_execute): New.
	(chkp_opt_gate): New.
	(pass_data_chkp_opt): New.
	(pass_chkp_opt): New.
	(make_pass_chkp_opt): New.

Comments

Jeff Law Oct. 9, 2014, 5:02 p.m. UTC | #1
On 10/08/14 13:18, Ilya Enkovich wrote:
> Hi,
>
> This patch introduces simple optimization of string function calls using variants with no checks and/or bounds copy when possible.
>
> Thanks,
> Ilya
> --
> 2014-10-08  Ilya Enkovich  <ilya.enkovich@intel.com>
>
> 	* tree-chkp.c (check_infos): New.
> 	(chkp_get_nobnd_fndecl): New.
> 	(chkp_get_nochk_fndecl): New.
> 	(chkp_optimize_string_function_calls): New.
> 	(chkp_opt_init): New.
> 	(chkp_opt_fini): New.
> 	(chkp_opt_execute): New.
> 	(chkp_opt_gate): New.
> 	(pass_data_chkp_opt): New.
> 	(pass_chkp_opt): New.
> 	(make_pass_chkp_opt): New.
Just a note, these reviews are going to come out-of-order.

I thought I mentioned it in the prior message.  Can you pull the 
optimization stuff into its own file.  It seems to me it ought to be 
conceptually separate from expansion and the other miscellaneous stuff 
in tree-chkp.c.


> +
> +/* Find memcpy, mempcpy, memmove and memset calls, perform
> +   checks before call and then call no_chk version of
> +   functions.  We do it on O2 to enable inlining of these
> +   functions during expand.
So is the purpose here to expose the checks that would normally be done 
in the mem* routines to their caller in the hopes that doing so will 
expose redundant checks?  Or is there some other reason?


> +
> +   Also try to find memcpy, mempcpy, memmove and memset calls
> +   which are known to not write pointers to memory and use
> +   faster function versions for them.  */
Can you please include tests for both classes of transformations?  It 
doesn't have to be exhaustive, just a few simple tests to both verify 
the transformation is working today and to help ensure nobody breaks it 
in the future?


> +void
> +chkp_optimize_string_function_calls (void)
> +{
> +  basic_block bb;
> +
> +  if (dump_file && (dump_flags & TDF_DETAILS))
> +    fprintf (dump_file, "Searching for replacable string function calls...\n");
s/replacable/replaceable/

Seems like if you're going to emit a message into the debug file that 
the compiler is searching for replaceable calls that you ought to emit a 
message when a replaceable call is found too.

And doing so ought to make writing those testcases easier too since you 
can scan the appropriate dump file for the message :-)


> +
> +	      if (fndecl_nochk)
> +		fndecl = fndecl_nochk;
> +
> +	      gimple_call_set_fndecl (stmt, fndecl);
> +
> +	      /* If there is no nochk version of function then
> +		 do nothing.  Otherwise insert checks before
> +		 the call.  */
> +	      if (!fndecl_nochk)
> +		continue;
It's a nit, but I'd tend to write that as:

if (!fndecl_nochk)
   continue;

fndecl = fndecl_nochk
gimple_call_set_fndecl (stmt, fndecl);



> +		 to perform runtiome check for size and perform
s/runtiome/runtime/

> +		 checks only when size is not zero.  */
> +	      if (!known)
> +		{
> +		  gimple check = gimple_build_cond (NE_EXPR,
> +						    size,
> +						    size_zero_node,
> +						    NULL_TREE,
> +						    NULL_TREE);
> +
> +		  /* Split block before string function call.  */
> +		  j = i;
> +		  gsi_prev (&j);
> +		  fall = split_block (bb, gsi_stmt (j));
> +		  bb = fall->src;
> +
> +		  /* Add size check.  */
> +		  j = gsi_last_bb (bb);
> +		  if (gsi_end_p (j))
> +		    gsi_insert_before (&j, check, GSI_SAME_STMT);
> +		  else
> +		    gsi_insert_after (&j, check, GSI_SAME_STMT);
> +
> +		  /* Create basic block for checks.  */
> +		  check_bb = create_empty_bb (bb);
> +		  make_edge (bb, check_bb, EDGE_TRUE_VALUE);
> +		  make_single_succ_edge (check_bb, fall->dest, EDGE_FALLTHRU);
> +
> +		  /* Fix edge for splitted bb.  */
> +		  fall->flags = EDGE_FALSE_VALUE;
> +		  fall->count = bb->count;
> +		  fall->probability = REG_BR_PROB_BASE;
> +
> +		  /* Update dominance info.  */
> +		  if (dom_info_available_p (CDI_DOMINATORS))
> +		    {
> +		      set_immediate_dominator (CDI_DOMINATORS, check_bb, bb);
> +		      set_immediate_dominator (CDI_DOMINATORS, fall->dest, bb);
> +		    }
> +
> +		  /* Update loop info.  */
> +		  if (current_loops)
> +		    add_bb_to_loop (check_bb, bb->loop_father);
> +
> +		  /* Set position for checks.  */
> +		  j = gsi_last_bb (check_bb);
> +
> +		  /* The block was splitted and therefore we
> +		     need to set iterator to its end.  */
> +		  i = gsi_last_bb (bb);
> +		}
Basically we have a point (denoted by an iterator) and we want to split 
the block at that point and insert a conditional around some new blob of 
code before the original split point.

I'm a bit surprised we don't have this kind of capability already broken 
out.  But assuming that's the case, can you go ahead and break that out 
into its own little helper function?    You don't need to find all the 
cases where we're doing this kind of thing today, just create the helper 
function and use it in your new code.

> +
> +/* Finalise checker optimization  pass.  */
> +void
> +chkp_opt_fini (void)
> +{
> +  chkp_fix_cfg ();
> +
> +  free_dominance_info (CDI_DOMINATORS);
> +  free_dominance_info (CDI_POST_DOMINATORS);
> +}
You incrementally update the dominance information, so is there really a 
need to free it here?  ie, the data is still valid, so I think we can 
just avoid the free_dominance_info calls.

Passes after this which use dominator information then won't have to 
rebuild it from scratch.

Generally this looks good.  With the light changes above, this will be OK.

Jeff
diff mbox

Patch

diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
index aae9681..5230d14 100644
--- a/gcc/tree-chkp.c
+++ b/gcc/tree-chkp.c
@@ -399,6 +399,8 @@  static void chkp_parse_array_and_component_ref (tree node, tree *ptr,
 #define chkp_extract_upper_fndecl \
   (targetm.builtin_chkp_function (BUILT_IN_CHKP_EXTRACT_UPPER))
 
+static vec<struct bb_checks, va_heap, vl_ptr> check_infos = vNULL;
+
 static GTY (()) tree chkp_uintptr_type;
 
 static GTY (()) tree chkp_zero_bounds_var;
@@ -4567,8 +4569,315 @@  chkp_fill_check_info (gimple stmt, struct check_info *ci)
   ci->stmt = stmt;
 }
 
+/* Return fast version of string function FNCODE.  */
+tree
+chkp_get_nobnd_fndecl (enum built_in_function fncode)
+{
+  /* Check if we are allowed to use fast string functions.  */
+  if (!flag_chkp_use_fast_string_functions)
+    return NULL_TREE;
+
+  switch (fncode)
+    {
+    case BUILT_IN_MEMCPY:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMCPY_NOBND);
+
+    case BUILT_IN_MEMPCPY:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMPCPY_NOBND);
+
+    case BUILT_IN_MEMMOVE:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMMOVE_NOBND);
+
+    case BUILT_IN_MEMSET:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMSET_NOBND);
+
+    case BUILT_IN_CHKP_MEMCPY_NOCHK:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMCPY_NOBND_NOCHK);
+
+    case BUILT_IN_CHKP_MEMPCPY_NOCHK:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMPCPY_NOBND_NOCHK);
+
+    case BUILT_IN_CHKP_MEMMOVE_NOCHK:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMMOVE_NOBND_NOCHK);
+
+    case BUILT_IN_CHKP_MEMSET_NOCHK:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMSET_NOBND_NOCHK);
+
+    default:
+      return NULL_TREE;
+    }
+}
+
+
+/* Return no-check version of string function FNCODE.  */
+tree
+chkp_get_nochk_fndecl (enum built_in_function fncode)
+{
+  /* Check if we are allowed to use fast string functions.  */
+  if (!flag_chkp_use_nochk_string_functions)
+    return NULL_TREE;
+
+  switch (fncode)
+    {
+    case BUILT_IN_MEMCPY:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMCPY_NOCHK);
+
+    case BUILT_IN_MEMPCPY:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMPCPY_NOCHK);
+
+    case BUILT_IN_MEMMOVE:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMMOVE_NOCHK);
+
+    case BUILT_IN_MEMSET:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMSET_NOCHK);
+
+    case BUILT_IN_CHKP_MEMCPY_NOBND:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMCPY_NOBND_NOCHK);
+
+    case BUILT_IN_CHKP_MEMPCPY_NOBND:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMPCPY_NOBND_NOCHK);
+
+    case BUILT_IN_CHKP_MEMMOVE_NOBND:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMMOVE_NOBND_NOCHK);
+
+    case BUILT_IN_CHKP_MEMSET_NOBND:
+      return builtin_decl_implicit (BUILT_IN_CHKP_MEMSET_NOBND_NOCHK);
+
+    default:
+      return NULL_TREE;
+    }
+}
+
+/* Find memcpy, mempcpy, memmove and memset calls, perform
+   checks before call and then call no_chk version of
+   functions.  We do it on O2 to enable inlining of these
+   functions during expand.
+
+   Also try to find memcpy, mempcpy, memmove and memset calls
+   which are known to not write pointers to memory and use
+   faster function versions for them.  */
+void
+chkp_optimize_string_function_calls (void)
+{
+  basic_block bb;
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file, "Searching for replacable string function calls...\n");
+
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      gimple_stmt_iterator i;
+
+      for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+        {
+	  gimple stmt = gsi_stmt (i);
+	  tree fndecl;
+
+	  if (gimple_code (stmt) != GIMPLE_CALL
+	      || !gimple_call_with_bounds_p (stmt))
+	    continue;
+
+	  fndecl = gimple_call_fndecl (stmt);
+
+	  if (!fndecl || DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
+	    continue;
+
+	  if (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_MEMCPY
+	      || DECL_FUNCTION_CODE (fndecl) == BUILT_IN_MEMPCPY
+	      || DECL_FUNCTION_CODE (fndecl) == BUILT_IN_MEMMOVE
+	      || DECL_FUNCTION_CODE (fndecl) == BUILT_IN_MEMSET)
+	    {
+	      tree dst = gimple_call_arg (stmt, 0);
+	      tree dst_bnd = gimple_call_arg (stmt, 1);
+	      bool is_memset = DECL_FUNCTION_CODE (fndecl) == BUILT_IN_MEMSET;
+	      tree size = gimple_call_arg (stmt, is_memset ? 3 : 4);
+	      tree fndecl_nochk;
+	      gimple_stmt_iterator j;
+	      basic_block check_bb;
+	      edge fall;
+	      address_t size_val;
+	      int sign;
+	      bool known;
+
+	      /* We may replace call with corresponding __chkp_*_nobnd
+		 call in case destination pointer base type is not
+		 void or pointer.  */
+	      if (POINTER_TYPE_P (TREE_TYPE (dst))
+		  && !VOID_TYPE_P (TREE_TYPE (TREE_TYPE (dst)))
+		  && !chkp_type_has_pointer (TREE_TYPE (TREE_TYPE (dst))))
+		{
+		  tree fndecl_nobnd
+		    = chkp_get_nobnd_fndecl (DECL_FUNCTION_CODE (fndecl));
+
+		  if (fndecl_nobnd)
+		    fndecl = fndecl_nobnd;
+		}
+
+	      fndecl_nochk = chkp_get_nochk_fndecl (DECL_FUNCTION_CODE (fndecl));
+
+	      if (fndecl_nochk)
+		fndecl = fndecl_nochk;
+
+	      gimple_call_set_fndecl (stmt, fndecl);
+
+	      /* If there is no nochk version of function then
+		 do nothing.  Otherwise insert checks before
+		 the call.  */
+	      if (!fndecl_nochk)
+		continue;
+
+	      /* If size passed to call is known and > 0
+		 then we may insert checks unconditionally.  */
+	      size_val.pol.create (0);
+	      chkp_collect_value (size, size_val);
+	      known = chkp_is_constant_addr (size_val, &sign);
+	      size_val.pol.release ();
+
+	      /* If we are not sure size is not zero then we have
+		 to perform runtiome check for size and perform
+		 checks only when size is not zero.  */
+	      if (!known)
+		{
+		  gimple check = gimple_build_cond (NE_EXPR,
+						    size,
+						    size_zero_node,
+						    NULL_TREE,
+						    NULL_TREE);
+
+		  /* Split block before string function call.  */
+		  j = i;
+		  gsi_prev (&j);
+		  fall = split_block (bb, gsi_stmt (j));
+		  bb = fall->src;
+
+		  /* Add size check.  */
+		  j = gsi_last_bb (bb);
+		  if (gsi_end_p (j))
+		    gsi_insert_before (&j, check, GSI_SAME_STMT);
+		  else
+		    gsi_insert_after (&j, check, GSI_SAME_STMT);
+
+		  /* Create basic block for checks.  */
+		  check_bb = create_empty_bb (bb);
+		  make_edge (bb, check_bb, EDGE_TRUE_VALUE);
+		  make_single_succ_edge (check_bb, fall->dest, EDGE_FALLTHRU);
+
+		  /* Fix edge for splitted bb.  */
+		  fall->flags = EDGE_FALSE_VALUE;
+		  fall->count = bb->count;
+		  fall->probability = REG_BR_PROB_BASE;
+
+		  /* Update dominance info.  */
+		  if (dom_info_available_p (CDI_DOMINATORS))
+		    {
+		      set_immediate_dominator (CDI_DOMINATORS, check_bb, bb);
+		      set_immediate_dominator (CDI_DOMINATORS, fall->dest, bb);
+		    }
+
+		  /* Update loop info.  */
+		  if (current_loops)
+		    add_bb_to_loop (check_bb, bb->loop_father);
+
+		  /* Set position for checks.  */
+		  j = gsi_last_bb (check_bb);
+
+		  /* The block was splitted and therefore we
+		     need to set iterator to its end.  */
+		  i = gsi_last_bb (bb);
+		}
+	      /* If size is known to be zero then no checks
+		 should be performed.  */
+	      else if (!sign)
+		continue;
+	      else
+		j = i;
+
+	      size = size_binop (MINUS_EXPR, size, size_one_node);
+	      if (!is_memset)
+		{
+		  tree src = gimple_call_arg (stmt, 2);
+		  tree src_bnd = gimple_call_arg (stmt, 3);
+
+		  chkp_check_mem_access (src, fold_build_pointer_plus (src, size),
+					 src_bnd, j, gimple_location (stmt),
+					 integer_zero_node);
+		}
+
+	      chkp_check_mem_access (dst, fold_build_pointer_plus (dst, size),
+				     dst_bnd, j, gimple_location (stmt),
+				     integer_one_node);
+
+	    }
+	}
+    }
+}
+
+/* Initilize checker optimization pass.  */
+void
+chkp_opt_init (void)
+{
+  check_infos.create (0);
+
+  calculate_dominance_info (CDI_DOMINATORS);
+  calculate_dominance_info (CDI_POST_DOMINATORS);
+
+  /* With LTO constant bounds vars may be not initialized by now.
+     Get constant bounds vars to handle their assignments during
+     optimizations.  */
+  chkp_get_zero_bounds_var ();
+  chkp_get_none_bounds_var ();
+}
+
+/* Finalise checker optimization  pass.  */
+void
+chkp_opt_fini (void)
+{
+  chkp_fix_cfg ();
+
+  free_dominance_info (CDI_DOMINATORS);
+  free_dominance_info (CDI_POST_DOMINATORS);
+}
+
+/* Checker optimization pass function.  */
+unsigned int
+chkp_opt_execute (void)
+{
+  chkp_opt_init();
+
+  /* This optimization may introduce new checks
+     and thus we put it before checks search.  */
+  chkp_optimize_string_function_calls ();
+
+  chkp_opt_fini ();
+
+  return 0;
+}
+
+/* Pass gate.  */
+bool
+chkp_opt_gate (void)
+{
+  return chkp_function_instrumented_p (cfun->decl)
+    && (flag_chkp_optimize > 0
+	|| (flag_chkp_optimize == -1 && optimize > 0));
+}
+
 namespace {
 
+const pass_data pass_data_chkp_opt =
+{
+  GIMPLE_PASS, /* type */
+  "chkpopt", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_NONE, /* tv_id */
+  PROP_ssa | PROP_cfg, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  TODO_verify_il
+  | TODO_update_ssa /* todo_flags_finish */
+};
+
 const pass_data pass_data_chkp =
 {
   GIMPLE_PASS, /* type */
@@ -4608,6 +4917,31 @@  public:
 
 }; // class pass_chkp
 
+class pass_chkp_opt : public gimple_opt_pass
+{
+public:
+  pass_chkp_opt (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_chkp_opt, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual opt_pass * clone ()
+    {
+      return new pass_chkp_opt (m_ctxt);
+    }
+
+  virtual bool gate (function *)
+    {
+      return chkp_opt_gate ();
+    }
+
+  virtual unsigned int execute (function *)
+    {
+      return chkp_opt_execute ();
+    }
+
+}; // class pass_chkp_opt
+
 } // anon namespace
 
 gimple_opt_pass *
@@ -4616,4 +4950,10 @@  make_pass_chkp (gcc::context *ctxt)
   return new pass_chkp (ctxt);
 }
 
+gimple_opt_pass *
+make_pass_chkp_opt (gcc::context *ctxt)
+{
+  return new pass_chkp_opt (ctxt);
+}
+
 #include "gt-tree-chkp.h"